Underscores & Hyphens

Hello Experts,

Some history: I have downloadable files on my website that originally had spaces in their titles. For example, Gone with the wind would read, after uploading, as Gone%20with%20the%20wind, so I changed the titles to the form Gone_with_the_wind. But Google reads this not as four separate words but as Gonewiththewind.

Questions: Is it just Google, or do various/all search engines work this way? Would it be best practice to change the format to Gone-with-the-wind, since Google does treat each word separately in that case?

Morrile

Placing the dashes will give you a better SEO return than leaving the title all smashed up as it will allow for a more exacting match. It’s a negligible return, but everything helps.

I do not think Google would treat gone-with-the-wind and gone_with_the_wind differently. Even in a domain name, it is the same.

However, as “_” is more difficult to type (a two-key combination) than “-”, the “-” is the better recommendation.

Here’s an interesting read - it’s a bit outdated, but I wouldn’t be surprised if the tenets are still true: http://www.mattcutts.com/blog/dashes-vs-underscores/

Edit:

Here’s a 2011 video update which says the same thing, so I’d guess it’s still valid.

I googled around, and it would seem that Bing behaves differently than Google in this regard. Bing treats both hyphens and underscores as word separators, whereas Google treats only hyphens as separators.
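To make the difference concrete, here is a small Python sketch of the two behaviours described above. It is only a toy model: the function names are made up for illustration, and real search engines do far more than split on punctuation.

```python
import re

def split_hyphens_only(slug: str) -> list[str]:
    """Toy model: only hyphens count as word separators."""
    return [word for word in slug.lower().split("-") if word]

def split_hyphens_and_underscores(slug: str) -> list[str]:
    """Toy model: hyphens and underscores both count as word separators."""
    return [word for word in re.split(r"[-_]", slug.lower()) if word]

for slug in ("Gone-with-the-wind", "Gone_with_the_wind"):
    print(slug)
    print("  hyphens only:          ", split_hyphens_only(slug))
    print("  hyphens and underscores:", split_hyphens_and_underscores(slug))
```

Under the hyphens-only model the underscore version stays one long token, which matches what the original poster saw.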

I would say yes. It may help – in some small way – with your Google ranking.

EDIT: Dave beat me to it. :slight_smile:

Although, as Google is perfectly capable of figuring out that gonewiththewind is “gone with the wind” and not “go new ITHT hew in D”, I’m not so sure it’s a big deal. Unless you’re in danger of running into the ol’ Pen Island problem, I wouldn’t worry too much … and even then, there should be enough other clues in your site content as to how the concatenated string should be broken down that Google shouldn’t have any difficulty deciphering it. The bigger issue is how people type the words in … and there, hyphens are better than underscores - first, they are easier to type, and second, links are usually underlined and underlined-dashes-are-easy-to-spot, but underlined_underscores_can_be_missed.

Google’s latest video explains why they treat the two differently, and it seems only Google does this. So it looks like I will have to rename all the files and update the links. Keeps me out of trouble.

Many thanks for everyone’s input. Much appreciated :slight_smile:
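If the renaming has to be done in bulk, something along these lines might save some typing. It is a rough Python sketch, not a tested tool: `hyphenate` and `plan_renames` are names invented here, and the folder you pass in is whatever directory holds the downloads. It only builds a list of proposed renames, so nothing changes on disk until you act on the plan, and the links on the pages still need updating separately.

```python
import re
from pathlib import Path

def hyphenate(name: str) -> str:
    """Replace each run of spaces and/or underscores with a single hyphen."""
    return re.sub(r"[ _]+", "-", name)

def plan_renames(folder: Path) -> dict[Path, Path]:
    """Map each file whose name would change to its proposed new path.

    Nothing is renamed here; files whose new name already exists are skipped.
    """
    plan = {}
    for path in sorted(folder.iterdir()):
        if not path.is_file():
            continue
        target = path.with_name(hyphenate(path.name))
        if target != path and not target.exists():
            plan[path] = target
    return plan

# The name conversion on its own:
print(hyphenate("Gone_with_the_wind.pdf"))
```

Once the plan looks right, each entry can be applied with `old.rename(new)`.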

(the previous post should have been a reply with quotes but getting interrupted means I failed to notice, mea culpa)

I agree that it’s not worth worrying about this (although I also agree that hyphens are preferable because they are easier for users to read).

And there’s another point. Morrile, you say that these are downloadable files. In that case, does it really matter how Google treats it? If someone is searching for a download of Gone with the Wind, you want Google to show them the page on your site from which they can download it - not the file itself.

I don’t know about you, but I would hesitate to click on a Google search result that caused an immediate download. I’d at least want to know where I was downloading it from. For that reason, I’d prefer Google to direct me to the hosting page, not the actual file. In fact, I’m not even sure Google would show the file itself in its search results. If that’s right, there’s even less reason to worry about hyphens vs underscores.

(By the way, I’m sure you’re just citing Gone with the Wind as an example, and you’re not really offering it for download, given that it’s still in copyright.)

Mike

You could use URL rewriting if you don’t want to rename all the files.
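On most servers this would be a rewrite rule in the server configuration, but the idea can be sketched in a few lines of Python. This assumes every stored file uses underscores and no stored file name contains a genuine hyphen; the function name and the example path are hypothetical.

```python
from pathlib import PurePosixPath

def rewrite_to_stored_name(request_path: str) -> str:
    """Map a public, hyphenated URL path to the underscore name kept on disk.

    Only the final path segment (the file name) is changed; directories are
    left alone. Assumes no stored file name contains a real hyphen.
    """
    path = PurePosixPath(request_path)
    if not path.name:
        # Nothing to rewrite for "/" or an empty path.
        return request_path
    return str(path.with_name(path.name.replace("-", "_")))

print(rewrite_to_stored_name("/ebooks/Gone-with-the-wind.pdf"))
```

Visitors and search engines then see the hyphenated address while the files keep their existing names.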

Google certainly does offer direct downloads, and will index and rank PDF, DOC/DOCX, XLS/XLSX, PPT/PPTX files just as readily as HTML pages. Because these files are content-rich and low on cruft, they often do well in the search results … although that doesn’t necessarily help you as a site owner, because there’s often no way to get back to the site itself.

In the old days, it was easy enough for a savvy webby to get into the site – they could just right-click on the link in Google and copy it, then paste it into the address bar, remove the file name and keep trying each directory level until they struck gold. Easy as pie. Of course, with Google now screwing up the URL as soon as you left- or right-click on it*, you can’t do that – so if your landing page is a non-HTML page, the chances are that your visitors will just bounce, and you’ll never get them to a page where they can interact, buy anything or even click on adverts.

For that reason, I would always put non-HTML files that are likely to be indexed into a folder and mark it as noindex, with a gateway page for each file that has enough content on it to act as link bait. While some people might be slightly annoyed at having to go through the extra click, it will work out better in a lot more cases.
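A PDF or DOC file has nowhere to put a robots meta tag, so one way to mark such a folder as noindex is the `X-Robots-Tag` HTTP response header. Here is a minimal sketch, assuming the site happens to be served by a Python WSGI application; the `/files/` prefix and all function names are placeholders, and on Apache or nginx the same header would be set in the server configuration instead.

```python
def noindex_downloads(app, prefix="/files/"):
    """WSGI middleware: add 'X-Robots-Tag: noindex' to responses under prefix."""
    def wrapped(environ, start_response):
        def patched_start(status, headers, exc_info=None):
            if environ.get("PATH_INFO", "").startswith(prefix):
                headers = list(headers) + [("X-Robots-Tag", "noindex")]
            return start_response(status, headers, exc_info)
        return app(environ, patched_start)
    return wrapped

def demo_app(environ, start_response):
    """Stand-in application that answers every request with plain text."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]

def response_headers(app, path):
    """Call a WSGI app with a minimal environ and return the headers it sends."""
    seen = []
    def start_response(status, headers, exc_info=None):
        seen.extend(headers)
    # Consume the body so the app runs to completion.
    list(app({"PATH_INFO": path, "REQUEST_METHOD": "GET"}, start_response))
    return seen

site = noindex_downloads(demo_app)
print(response_headers(site, "/files/book.pdf"))
print(response_headers(site, "/index.html"))
```

The gateway pages live outside the prefixed folder, so they stay indexable while the files themselves do not.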

I was under the impression that files (in my case it’s eBooks & PDF files - but not ‘gone with the wind’) were listed by the sitemap generators in the various SEO tools. Therefore a hyphen would be more prudent than an underscore when they are listed. As I am designing a new FGL website I will rename them all. My personal website is more about information and not about selling. I do know that I made various mistakes on the older websites and wish to reduce that to zero this time.

I have noticed the Google redirect link, for example http://www.google.com/url?sa=t&rct=j&q=sitepoint&source=web&cd=1&cad=rja&ved=0CCwQFjAA&url=http%3A%2F%2Fwww.sitepoint.com%2F&ei=xYb5UY7yKsWp0QXPs4CgCw&usg=AFQjCNEeuIPTt5iaDfru5m-xvex9Ug_YJg&bvm=bv.49967636,d.d2k but is this just to maintain their dominance in the marketplace, or a new method of keeping track?

[ot]

I’ve always assumed it’s their tracking method, because there are several other referral networks that use the same obnoxious trick. It’s hard to imagine there isn’t a more user-friendly way of achieving the same result.[/ot]
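For what it’s worth, the real destination is still sitting inside that redirect link as the percent-encoded `url` parameter, so it can be pulled back out. A small Python sketch using only the standard library follows; the function name is made up, and the example is a shortened copy of the link quoted above.

```python
from urllib.parse import parse_qs, urlsplit

def redirect_target(tracking_url: str):
    """Return the decoded 'url' query parameter, or None if there is none."""
    params = parse_qs(urlsplit(tracking_url).query)
    values = params.get("url")
    return values[0] if values else None

# Shortened copy of the redirect link from the post above.
example = (
    "http://www.google.com/url?sa=t&rct=j&q=sitepoint&source=web"
    "&url=http%3A%2F%2Fwww.sitepoint.com%2F&ei=xYb5UY7yKsWp0QXPs4CgCw"
)
print(redirect_target(example))  # http://www.sitepoint.com/
```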

Stevie D,

Many thanks. Sorry about the Off Topic, email habit.

Morrile