A Pound Sign in a URL for passing data?

JustWondering · January 18, 2011, 12:49am

I remember seeing this in a couple of places, the last place was Google Translate, take a look:

http://translate.google.com/#auto|en|Одиночество

What language is that? Is it some mod_rewrite magic? I have seen it in a couple of other places but didn’t think much of it. Is it a new standard? Something that has to do with Ruby on Rails? Or is it some sort of witchcraft?

Thanks!

felgall · January 18, 2011, 1:19am

They may be doing a mod_rewrite with that particular one, since the characters following the # are not a valid id. Normally the # indicates that the element with the id following it should be located within the page and scrolled as close as possible to the top of the viewport.

For example, nextpage.html#content should go to nextpage.html, then find the <div id="content"> tag and scroll the page down so that tag is at the top edge of the browser viewport.

rpkamp · January 18, 2011, 1:59am

It isn’t mod_rewrite, since browsers don’t send the #whatever part of the URL to the server in the HTTP request when requesting the page.
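To make that concrete, here is a minimal sketch using the standard URL class (available in browsers and in Node.js). The function name splitForRequest and the ?hl=en query string are made up for illustration; the point is only that the fragment stays on the client while the path and query go into the request.

```javascript
// Split a URL into the part a browser puts in the HTTP request
// and the fragment that never leaves the client.
// splitForRequest is a hypothetical helper name, not a browser API.
function splitForRequest(href) {
  const url = new URL(href);
  return {
    host: url.host,                           // goes into the Host header
    requestTarget: url.pathname + url.search, // goes into the request line
    fragment: url.hash,                       // kept by the browser, not transmitted
  };
}

// The ?hl=en query string is an invented example value.
const parts = splitForRequest("http://translate.google.com/?hl=en#auto|en|hello");
console.log(parts.requestTarget); // "/?hl=en"
console.log(parts.fragment);      // "#auto|en|hello"
```

Since the server never sees the fragment, a server-side rewrite rule has nothing to match against.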

I’ve seen it used in AJAX-heavy applications. Take Google Translate: when I change the text and want to send you the URL of what I’m seeing, I can’t, because the URL of the page doesn’t change; everything happens through AJAX. I’d have to tell you to go to Google Translate and what text to type in. JavaScript can change the URL in the address bar, but when it does, the browser loads the page at that new URL, which makes using AJAX pointless.

What JavaScript can change without any consequences is the # part of the URL. So they just do that, and when I send you my URL with the # in it, your browser opens the page, JavaScript checks the part after the # (window.location.hash, if I’m not mistaken) and sends an AJAX request to the server with the information found in that hash. Your browser then loads that information into the page and you see what I saw when I sent you that link.
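A rough sketch of that idea follows. It assumes the hash has the form #source|target|text, as in the Google Translate link above; the names parseHash and buildHash are made up, and this is not Google’s actual code. Only the two helpers run outside a browser; the listener at the end is guarded so the snippet also loads in Node.js.

```javascript
// Parse a fragment of the assumed form "#source|target|text" into a state object.
// parseHash and buildHash are hypothetical helper names.
function parseHash(hash) {
  const raw = hash.startsWith("#") ? hash.slice(1) : hash;
  const [source = "", target = "", ...rest] = raw.split("|");
  // The text itself may contain "|", so rejoin whatever is left.
  let text = rest.join("|");
  try {
    text = decodeURIComponent(text);
  } catch (e) {
    // Malformed percent-escapes: keep the raw text rather than fail.
  }
  return { source, target, text };
}

// Build the fragment that represents a given state.
function buildHash(state) {
  return "#" + state.source + "|" + state.target + "|" + encodeURIComponent(state.text);
}

// Browser-only part: react when the fragment changes.
if (typeof window !== "undefined") {
  window.addEventListener("hashchange", () => {
    const state = parseHash(window.location.hash);
    // Here the page would send an AJAX request with `state`
    // and render the response; that part is left out of this sketch.
    console.log(state);
  });
}
```

Assigning the result of buildHash to window.location.hash updates the address bar without reloading the page, which is what makes the link shareable.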

JustWondering · January 18, 2011, 2:06am

Yes, I know about that function of it, and how it can be used in links to link to certain divs with a certain id or even used for JavaScript links.

It’s just that I felt it wasn’t a coincidence that I’d seen it used a couple of times on different websites, especially when Google is involved. What’s wrong with plain old GET variables? I don’t know what the advantage of this may be; maybe it’s more about SEO (search engines ignore the pound sign).

JustWondering · January 18, 2011, 2:13am

rpkamp:

It isn’t mod_rewrite, since browsers don’t send the #whatever part of the URL to the server in the HTTP request when requesting the page.

I’ve seen it used in AJAX-heavy applications. Take Google Translate: when I change the text and want to send you the URL of what I’m seeing, I can’t, because the URL of the page doesn’t change; everything happens through AJAX. I’d have to tell you to go to Google Translate and what text to type in. JavaScript can change the URL in the address bar, but when it does, the browser loads the page at that new URL, which makes using AJAX pointless.

What JavaScript can change without any consequences is the # part of the URL. So they just do that, and when I send you my URL with the # in it, your browser opens the page, JavaScript checks the part after the # (window.location.hash, if I’m not mistaken) and sends an AJAX request to the server with the information found in that hash. Your browser then loads that information into the page and you see what I saw when I sent you that link.

That makes more sense.

What’s wrong with http.open? Isn’t it more adequate? They could store the data in variables.

Eureka! That way they know that the variables won’t be lost (sticky variables?) if the page is refreshed. That’s the sole reason why you could open that page and see the definition, which wasn’t possible with Google Translate in the past.

BTW, if it can’t be mod_rewrite, then how can they make it unobtrusive?

Now that makes a lot more sense. Thanks!

dklynn · January 18, 2011, 3:33am

JW,

Those look like letters from the Cyrillic alphabet, i.e., Russian.

When the query string or page anchor follows the /, it’s a lazy way to specify that the first DirectoryIndex file found will be served (with the query string or, in this case, with the page anchor).

Regards,

DK

JustWondering · January 18, 2011, 4:11am

David,

That indeed is Russian. The word means loneliness, pronounced adinotchistva, a very common word in Russian songs. I was just wondering why they were using it like that.

I usually do the same thing, only with GET variables, or with slashes and mod_rewrite. I do my fair share of reading. To me, if Google is using something, it has to be cutting-edge. That’s what they did with Ajax (though they were not the first). So, I was wondering whether there was a new trend that I missed.

I don’t know Ruby on Rails, so whenever I see something weird I usually attribute it to it. I just was wondering about the reason why they may include their variables like that, and the advantages of such a strange approach.

The answer probably is that they’re trying to maintain the data between refreshes, make sure that they don’t fill up search engines with junk pages (duplicate content issues), and at the same time keep their URLs short.

rpkamp · January 18, 2011, 9:41am

Well, they don’t.
If you don’t have JavaScript enabled, the whole # functionality doesn’t work anymore: it won’t change the URL as you type, nor will it react to the # part of the URL if someone sends you a link with that in it.
The app will simply fall back to using POST variables and will still work that way.

This principle is known as progressive enhancement: those who have JavaScript get a little bit extra, but that little bit extra is not crucial to the working of the site, and the app still works perfectly fine without it.

As a side note, this technique is also being used for full-screen Flash applications, where the URL also doesn’t change as you browse through the Flash content. Same reason: if you send the link to a friend, they will still see the page you’re currently on, so you don’t have to tell them “go to this page, then click here, then there” (etc.).

BTW, even though you can indeed store variables this way as you suggested, it would be a bad idea to store sensitive information this way, since anyone who obtains a link containing that information can see it. For sensitive information it’s still better to use the session management of whatever back-end language you’re writing in (PHP, ASP, etc.).

EnderMB · January 18, 2011, 11:34am

I know this doesn’t necessarily answer the question, but the hash symbol is used quite a lot in URLs nowadays, with jQuery and recently for SEO.

A lot of Google, Facebook and Twitter links now use the “shebang” (#!), a name carried over from Unix. It is used as a way for AJAX applications to be crawled by Google, as detailed here: http://code.google.com/web/ajaxcrawling/docs/faq.html#whentousewhich

A good guide to explain this is The Single Page Interface Manifesto.
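As a rough illustration of how the shebang URLs get crawled: as I understand Google’s AJAX crawling scheme, the crawler rewrites the part after #! into an _escaped_fragment_ query parameter and requests that URL from the server instead. The sketch below shows that mapping. The function name toCrawlerUrl and the example.com URLs are made up, and the escaping is a simplified assumption that covers only a few characters that would otherwise break the query string.

```javascript
// Map a "#!" URL to the form a crawler would request under the
// _escaped_fragment_ convention. Returns null if the URL has no "#!".
// toCrawlerUrl is a hypothetical helper name.
function toCrawlerUrl(href) {
  const i = href.indexOf("#!");
  if (i === -1) return null;
  const base = href.slice(0, i);
  const fragment = href.slice(i + 2);
  // Simplified escaping (assumption): percent-encode only %, #, &, + and space.
  const escaped = fragment.replace(/[%#&+ ]/g, (c) =>
    "%" + c.charCodeAt(0).toString(16).toUpperCase().padStart(2, "0"));
  // Append to an existing query string if there is one.
  const sep = base.includes("?") ? "&" : "?";
  return base + sep + "_escaped_fragment_=" + escaped;
}

console.log(toCrawlerUrl("http://example.com/page#!key=value"));
// "http://example.com/page?_escaped_fragment_=key=value"
```

Unlike a plain # fragment, the rewritten URL does reach the server, so the server can return a static snapshot of the AJAX state for indexing.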

JustWondering · January 18, 2011, 7:04pm

That’s the exact answer I have been looking for. Thanks!

EnderMB · January 18, 2011, 9:13pm

Ha! I was thinking the exact same thing a few days ago, and after a bit of Googling during work I discovered the whole shebang thing. It’s pretty clever, and definitely information that isn’t easy to come across.

Anyway, you’re welcome.