How does this work? (URL question)

I’m confused by this URL:

http://us.battle.net/sc2/en/forum/topic/375110697

The numbers at the end represent an actual file, right? Because it’s definitely not a directory (375110697 is not a directory, or else visiting http://us.battle.net/sc2/en/forum/topic/375110697/ with a slash on the end would bring you to the same page).

So I tried visiting:
http://us.battle.net/sc2/en/forum/topic/375110697.html
http://us.battle.net/sc2/en/forum/topic/375110697.php
http://us.battle.net/sc2/en/forum/topic/375110697.aspx

but no luck with any of them. What type of file is it, and how can I find out? From a coding standpoint (I use PHP), how can I reliably check whether a page URL ends in a file and not a directory?

Something like mod_rewrite will be occurring, where all the topics use the same page to load. That loading page then parses from the URL the number that’s there.
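To make that concrete, here is a rough sketch of what the single "loading page" might do once the server has handed it the request. It is written in Python only for illustration, and the path pattern and the `route` function name are assumptions of mine, not anything known about how battle.net is actually built.

```python
import re
from urllib.parse import urlsplit

# Hypothetical pattern: a forum topic path that ends in a numeric ID.
TOPIC_PATTERN = re.compile(r"/sc2/en/forum/topic/(\d+)")


def route(url):
    """Return the topic ID as an int if the path matches, else None."""
    path = urlsplit(url).path
    match = TOPIC_PATTERN.fullmatch(path)
    if match is None:
        return None
    return int(match.group(1))


print(route("http://us.battle.net/sc2/en/forum/topic/375110697"))
print(route("http://us.battle.net/sc2/en/forum/topic/375110697.php"))
```

The first call yields the number 375110697 and the second yields None, which would be one explanation for why adding .html, .php or .aspx gets you nowhere: under a rule like this, no file with that name exists and the pattern simply does not match.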

I see. So it’s kind of like using a “.php?post=375110697” at the end of a URL?

This is a lot to take in, but thank you very much. I am going to read it.
In the meantime, do you know of any way, off the top of your head, to detect whether a page does this? My main goal is to separate this kind of page from a page whose URL ends in a directory.

e.g.:
separate
http://example.com/375110697 (an absolute page)
from
http://example.com/directory (a directory)

All you need to do is to create a directory called 375110697 in the http://us.battle.net/sc2/en/forum/topic directory and then put an index.htm page into it. When you use the full URL the index page will load. I tried it in another context and it worked fine.

I know that will work, but that’s not what I’m trying to accomplish. Thanks for trying though. What I need to do is be able to differentiate between a page using this method of absolute pages (in this case threads) and directories. Because usually absolute pages have a file extension at the end (e.g. example.com/about.html), but in this case, that is NOT the case.

As far as I know, it’s not possible to determine that kind of thing unless you have intimate (non-public) access to the server itself.

Sorry, I’m not 100% sure I understood what you meant there. However, thank you for your feedback.

If anybody does know the answer, please post it; otherwise I will always be wondering. (I’ll be sure to post the answer if I figure it out.)

What you are no doubt dealing with here are URLs output by a content management system (CMS), which is very common. That often means that none of those segments in the URL are really ‘folders’ in the traditional, ‘static’ website sense. Each CMS has its own way of delivering up content and writing URLs.

Just to add that there are pros and cons to this kind of URL rewriting.

The advantage is that it is easier to type in, and looks more user-friendly. The disadvantage is that if it masks the actual structure, that can cause problems for search engines and for users.

The problems are only really likely to come about when you get into multivariate queries. For example, if you have a URL with a query structure like
website.com/products.php?type=printer&make=canon&use=office
the server will usually parse that the same way no matter what order you put the parameters in, and search engines know that and can treat the URLs as equivalent. Whereas if you make it look like a static URL, by doing something like
website.com/products/printer/canon/office
Google will assume that’s a different page to website.com/products/office/canon/printer, even though it will work out as the same.
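A small sketch of why the two cases differ, using Python's standard library. The helper name `canonical_query_url` and the http:// scheme are my own additions for illustration. Sorting the query parameters gives one canonical form for every ordering, whereas path segments carry no names, so nothing generic can tell that two orderings mean the same thing.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode


def canonical_query_url(url):
    """Rebuild the URL with its query parameters sorted by name, then value."""
    parts = urlsplit(url)
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return parts._replace(query=urlencode(params)).geturl()


a = "http://website.com/products.php?type=printer&make=canon&use=office"
b = "http://website.com/products.php?use=office&type=printer&make=canon"

# Both orderings collapse to the same canonical URL.
print(canonical_query_url(a) == canonical_query_url(b))

# Path-style URLs give no generic way to see that these are equivalent.
p1 = urlsplit("http://website.com/products/printer/canon/office").path
p2 = urlsplit("http://website.com/products/office/canon/printer").path
print(p1 == p2)
```

The first comparison is true and the second is false. With path-style URLs, the site itself has to pick one order and stick to it.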

Equally, you should always be able to remove the last part of the URL and get a meaningful result, and not a “directory doesn’t exist” type error.

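If you don’t have access to the server, the only way I can think of is to try the URL with and without a trailing slash, to see if it works both ways. If it does, it’s probably a directory. If it doesn’t, it’s probably a rewrite. Treat this as a heuristic rather than proof, since a rewrite rule can be written to accept a trailing slash as well.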
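Here is a minimal sketch of that trailing-slash check, in Python for illustration (the same idea carries over to PHP). The function names `fetch_status` and `guess_kind` and the returned labels are my own, and the result is only a guess, since the server can be configured to behave either way.

```python
import urllib.error
import urllib.request


def fetch_status(url):
    """Return the final HTTP status code for url (redirects are followed)."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def guess_kind(url, fetch=fetch_status):
    """Guess whether url is a directory or a rewritten page.

    fetch is any callable that maps a URL to a status code; passing it in
    makes the logic easy to try without touching the network.
    """
    base = url.rstrip("/")
    without_slash = fetch(base)
    with_slash = fetch(base + "/")
    if without_slash == 200 and with_slash == 200:
        return "probably a directory"
    if without_slash == 200:
        return "probably a rewrite"
    return "unknown"
```

Calling `guess_kind("http://example.com/375110697")` makes two requests and compares the outcomes. Anything other than a clean 200 on the slash-less URL is reported as "unknown" rather than forced into one of the two buckets.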