Mod Rewrite...on IIS, but uses Apache-style

The IIS Mod Rewrite I’m using CLAIMS 100% cut-n-paste support, ie, if Apache does it this way, do it here and it’ll work. So here’s my issue:

Here is my home computer directory structure:

inetpub
__wwwroot
____projects
______projectA
________files
________www
______projectB
________files
________www

Basically, under the root is a projects folder, each containing a project, which contains subfolders.

For example:
I request: http://www.projectA.localhost/this/that
It converts to: /projects/projectA/www/index.cfm?/this/that

I request: http://files.projectA.localhost/this/that
It converts to: /projects/projectA/files/index.cfm?/this/that

So here’s what I’ve done so far:

==========================================

RewriteCond %{HTTP_HOST} ^((acp|clients|files|images|secure|www)\.){0,1}([a-z0-9\-]+)\.(com|net|org|localhost)(.*) [NC]

# Subdomain Provided
RewriteCond $2 (.*)+ [NC,OR]
RewriteRule ^(.*) \/projects\/$3\/$2\/index.cfm?\$ses=$5 [L]

# No subdomain, use www as default
RewriteCond $2 !(.*)+ [NC]
RewriteRule ^(.*) \/projects\/$3\/www\/index.cfm?\$ses=$5 [L]

==========================================

But what’s happening is that it seems that it is converting the request to: /projects/index.cfm

You can see that I have to provide 2 conditions/rules: the first assumes the user provided a subdomain (like files.projectB.localhost), and the “else” says “if nothing is provided, use the www folder” (if they call projectB.localhost).

Any help?

Also:

“Apache’s backreferences work within a RewriteRule and its associated RewriteConditions ONLY! Therefore, there is only $1 for this RewriteRule and RewriteCond to work with. What are the $2 and $3 and $5 values supposed to represent”

That makes sense as to why the rewrite wasn’t finding anything and only saw “projects/index.cfm”. Originally, those references were to the groupings in my first RewriteCond. I had no idea that it wouldn’t backreference, but the error I got hints that this was the issue.

aaron,

Since M$ isn’t compatible with M$, I’m shocked that you’d believe any claim like that!

Okay, lacking a “specification,” here’s what your “mod_rewrite” code is telling me:

# needs RewriteEngine on
RewriteEngine on

# optionally match acp or clients or files or images or secure or www followed by a dot
# rather than {0,1}, optional is normally just a ?
# followed by one or more lowercase letters, digits or "escaped hyphens"?
# if you want to include a hyphen (dash), it must be the first character
# or the last so it's not confused with a contiguous range definition, i.e., a-z
# followed by .com, .net, .org, .localhost then a garbage collector
RewriteCond %{HTTP_HOST} ^((acp|clients|files|images|secure|www)\.){0,1}([a-z0-9\-]+)\.(com|net|org|localhost)(.*) [NC]
    Aside from my comments above, this is ALL WRONG! Use a script and print the value of the HTTP_HOST and you'll see it's either localhost or subdomain.domain.tld. It will NOT have any "garbage" after it. Was that a list of subdomains? REALLY?

# Subdomain Provided
RewriteCond $2 (.*)+ [NC,OR]
RewriteRule ^(.*) \/projects\/$3\/$2\/index.cfm?\$ses=$5 [L]
    Apache's backreferences work within a RewriteRule and its associated RewriteConditions ONLY! Therefore, there is only $1 for this RewriteRule and RewriteCond to work with. What are the $2 and $3 and $5 values supposed to represent?

# No subdomain, use www as default
RewriteCond $2 !(.*)+ [NC]
RewriteRule ^(.*) \/projects\/$3\/www\/index.cfm?\$ses=$5 [L]
    Ditto

Please provide a “specification” so we know what you’re trying to do.

Regards,

DK

Hey dklynn, thanks for the response, let me see if I can clarify a little more.

I need this script to support running on my local copy of IIS7/Win7x64 as well as the host (who is running IIS7/Server2008). This is the URL Rewrite mod they support. So hopefully I can get a rewrite written that’ll work in both locations, what I call development and production.

The structure I’m looking for on production is:

(a).(b).(c)(/d) ie www.aaronmartone.com/this/that

a = “www”, but the script needs to support someone specifying NO subdomain either.
b = “aaronmartone”, the domain name
c = “com” the tld
d = everything after the tld, including the “/”.

On localhost, the server is localhost, but I use DNS mappings so that I can create entries like:

aaronmartonecom.localhost 127.0.0.1
acp.aaronmartonecom.localhost 127.0.0.1
www.aaronmartonecom.localhost 127.0.0.1

Basically it’s so my computer doesn’t go out to the net and try to lookup these addresses; they stay and reside on localhost.

OK. Now, the first part is the subdomain. The acceptable values I want are: acp, clients, files, images, secure and www. BUT they can also provide NOTHING (remember this for later). If they DO provide a subdomain, I only want the group to catch the characters, not the following “.” separating it from the domain name.

Then, the domain. I was using [a-z0-9\-] because the only acceptable characters I want in my domains are a-z, 0-9 and hyphens. I also need this value to be captured, not so important on production environments, but in development, it lets me know which project I’m working on.

Next, the tld (on production, acceptable ones: com, net, org) or “localhost” if on development.

Lastly, anything that comes after this. Am I right in saying that www.domain.com is the value of the [HTTP_HOST] variable?

So for production it would take:

files.aaronmartone.com/this/that and convert it to: /files/index.cfm?$ses=/this/that

On development:

clients.aaronmartonecom.localhost/this/that converts to: /projects/aaronmartonecom/clients/index.cfm?$ses=/this/that
project-name.localhost/this/that converts to: /projects/project-name/www/index.cfm?$ses=/this/that (see how no subdomain points to www by default?)

See, this is why I needed to catch “aaronmartonecom”, because my directory structure has the project names as folders in it, so for things to work locally, from the root of the server it has to get to:

c:\inetpub\wwwroot\projects\aaronmartonecom\www\index.cfm
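To make the development mapping above concrete, here is a minimal Python sketch of the intended behaviour. It is not mod_rewrite code and says nothing about how the IIS module evaluates rules; the function name `rewrite_dev` and the fully anchored host pattern are assumptions made for illustration only.

```python
import re
from typing import Optional

# Optional known subdomain, then the project name, then "localhost".
# The subdomain list is the one given in the thread.
HOST_RE = re.compile(
    r"^(?:(acp|clients|files|images|secure|www)\.)?([-a-z0-9]+)\.localhost$",
    re.IGNORECASE,
)

def rewrite_dev(host: str, path: str) -> Optional[str]:
    """Return the rewritten development URL, or None if the host does not match."""
    m = HOST_RE.match(host)
    if m is None:
        return None
    subdomain = (m.group(1) or "www").lower()  # no subdomain falls back to www
    project = m.group(2).lower()
    return f"/projects/{project}/{subdomain}/index.cfm?$ses={path}"

print(rewrite_dev("clients.aaronmartonecom.localhost", "/this/that"))
print(rewrite_dev("project-name.localhost", "/this/that"))
```

The fallback to `www` happens in one place here, which is the behaviour the two separate "subdomain" and "no subdomain" rules are trying to reproduce.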

Has this cleared up anything? (Hope so, I’m new to RegEx, and it’s really Greek to me.) I had a friend helping me at work, but the day’s over and I won’t see him til next Monday! Thanks for the assistance.

Hi aaron!

For all the “newbie” questions I’ve seen here, I’ll tell you that you’re far more advanced than you think you are. The simple fact that you’re also struggling to do this on an M$ box says a lot! :tup:

Okay, I think that my best procedure is to go through your post again and insert comments.

I’m concerned with the DocumentRoot of the subdomains you have and the key name “$ses”.

Regards,

DK

I popped in your code and I’m getting a 404 from IIS (trying to determine where it’s looking for a file and not finding it).

As for the $ses URL variable, I think after the URL rewriting is done, since I am running ColdFusion, IIS will then pass off that URL to ColdFusion (hoping it will), and CF will use this to determine what file they WANTED to load, and load it up internally.

But a 404 wouldn’t make sense even if it sent a re-request to itself with the new rewritten URL, because the index.cfm file specified in the rewrite DOES exist.

Maybe I should enable RewriteLog and take a look at what it’s doing internally.

As for the DocumentRoots of the subdomains, when I’m in IIS, I can “CREATE NEW SITE” and in there, you specify the filepath to the root. Here I specify:

c:\inetpub\wwwroot\projects\projectname

So if I wrote code in there prefaced with “/” like “/www/includes/file.cfm”, it would look for it in:

c:\inetpub\wwwroot\projects\projectname\www\includes\file.cfm.

and not:

c:\inetpub\wwwroot\www\includes\file.cfm

DNS mappings? Okay, I’ll treat that like a VirtualHost in Apache and ask whether each subdomain’s DocumentRoot is shared with the main domain or points to a specific location in your file structure (development should match production if at all possible else your rewrite code will necessarily be different which defeats the reason to use a development server).
By DNS mappings, suppose you want to call localhost “myserver”. I can go to a file on my computer, make an entry where “myserver” is equal to “127.0.0.1” (same IP as localhost) and when I type in http://myserver, it will resolve to a 127.0.0.1 address. The server’s root is c:\inetpub\wwwroot. All projects running on localhost reside in a folder below wwwroot called “projects”. In there is a separate folder per project which, when included with the previous paths, becomes the root address of that project. So I think the “points to a specific location in your file structure” is what I’m doing.

From my production server’s root down it will match my aaronmartonecom project folder down. (Makes it easier to move FTP stuff over as well)

I make entries such as “www.aaronmartonecom.localhost” = 127.0.0.1, because if I didn’t, typing that address into my browser would have it go out to the internet looking for an address rather than staying locally. Now, I’m not sure if this is the right way to do things or not. It let me know what (optional) subdomain and project I was working on locally, and I was hoping the rewrite could deal with it on development.

IIS does support virtual directories, but I am not using those (should I be?) Instead, in IIS, I can create a “New Site” running on the same IP/Port that localhost is, but I give it a host header of “aaronmartonecom.localhost”. I also give it a home directory inside my webserver’s root. For example, I put this at c:\inetpub\wwwroot\projects\aaronmartonecom. All absolute path requests I make from that folder down will only go up to the aaronmartonecom folder (seeing it as that site’s “root”)

Preferably, use [-a-z0-9]+ rather than incorrectly trying to escape the hyphen in the character range definition.
OK, cool. I had always thought that since hyphen was a special character it needed a \ in front of it. Sure enough when I parsed your suggestion, it went through fine and regex testing it with input strings had it catch everything perfectly.
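A quick way to check both spellings of the character class is a small Python test. This is only a sketch using Python's `re` module, which is an assumption of convenience and not the engine the IIS module uses, so results there may differ; the sample names are taken from the thread except `bad_name`, which is made up.

```python
import re

hyphen_first = re.compile(r"^[-a-z0-9]+$")    # hyphen placed first: taken literally
hyphen_escaped = re.compile(r"^[a-z0-9\-]+$")  # escaped hyphen: also literal in Python's re

for name in ("project-name", "aaronmartonecom", "bad_name"):
    print(name, bool(hyphen_first.match(name)), bool(hyphen_escaped.match(name)))
```

In Python both forms accept the same names; putting the hyphen first simply avoids any doubt about how a given engine treats the escape.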

I can change the URL variable to “ses” instead of “$ses”. I think someone said that it was more secure with ColdFusion, since people cannot type a $ into the URL without CF seeing it in its URL-encoded format, preventing them from messing with your internal “$ses” variable. But I have no issue temporarily taking off the “$” until things are working, and then adding it back to see if that breaks anything.

Check out the 404 though…

Error Summary HTTP Error 404.0 - Not Found

[B]The resource you are looking for has been removed, had its name changed, or is temporarily unavailable.[/B]

 Detailed Error Information 

Module: IIS Web Core
Notification: MapRequest
Handler: HandlerStaticFile
Error Code: 0x80070002
Requested URL: http://www.aaronmartonecom.localhost:80/this/that
Physical Path: C:\inetpub\wwwroot\this\that
Logon Method: Anonymous
Logon User: Anonymous

Whoa. The PHYSICAL PATH seemed to traverse all the way up the server outside of that project’s root path… Maybe I should be using virtual directories? (never used them, little experience)

** UPDATE **

Oh man, I DID have the IIS project sites set up improperly in IIS. I didn’t need to make a new site for every project. I just needed to make the DNS entries (which I did) to point to 127.0.0.1. I’ve fixed this since and now we’ve made SIGNIFICANT progress.

I goto: http://acp.aaronmartonecom.localhost/hello
It rewrites to: /projects/aaronmartonecom/acp/index.cfm?$ses=/hello (Perfect!)

I goto: http://www.aaronmartonecom.localhost/this/that
It rewrites to: /projects/aaronmartonecom/www/index.cfm?$ses=/this/that (PERFECT!)

I goto: http://aaronmartonecom.localhost/this/that
It rewrites to: /projects/aaronmartonecom/index.cfm?$ses=/this/that (Uh-oh)

It seems the “no subdomain specified” rewrite code is having an issue. Here’s what I have:

#Handles Development calls without Subdomain
RewriteCond %{HTTP_HOST} ^((acp|clients|files|images|secure|www)\.)?([-a-z0-9]+)\.(localhost) [NC]
RewriteCond %1 =""
RewriteCond %{REQUEST_URI} !^(acp|clients|files|images|secure|www)/
RewriteRule .? /projects/%2/www/index.cfm?\$ses=%{REQUEST_URI} [L]

That seems to be the last issue to resolve before this is working perfectly.

While I’m at this, I think I’m having an issue where files that don’t require URL rewriting (CSS, JS, images) are being rewritten because they’re seen as unique HTTP gets.

Would it be advisable to add the following rule before the others:

# Prevent rewriting on direct file calls
RewriteCond %{REQUEST_URI} (.css|.js|.gif|.jpg|.png|.zip) [L]

Would this prevent URL rewriting being done on the aforementioned file extensions?

Hi aaron!

Wow! I can’t keep up with you!

Remember about relative links, too, when you’re changing “directory level” from the original request.

Regards,

DK

For the “no subdomain specified” rule not working,

I ran that Regex through a tester for the value: aaronmartonecom.localhost
and it reports back the Back References:

$1:
$2:
$3: aaronmartonecom
$4: localhost

So yeah, the 3rd back reference is the project domain name, but it shows $3 and not %3. I’m using %3 in the rewrite now, but it’s not working, I just get a blank white page (makes it harder to troubleshoot). Is there a difference between $3 and %3?
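The group numbering can be reproduced outside the tester. The sketch below runs the HTTP_HOST pattern from the rule above through Python's `re` module, which is an assumption for illustration and not the IIS module's engine. Groups are numbered by their opening parenthesis, so group 1 is the subdomain with its dot, group 2 the bare subdomain, group 3 the project, and group 4 `localhost`.

```python
import re

pattern = re.compile(
    r"^((acp|clients|files|images|secure|www)\.)?([-a-z0-9]+)\.(localhost)",
    re.IGNORECASE,
)

no_sub = pattern.match("aaronmartonecom.localhost").groups()
with_sub = pattern.match("acp.aaronmartonecom.localhost").groups()

print(no_sub)    # groups 1 and 2 did not take part in the match
print(with_sub)
```

A regex tester labels these $1 to $4 because it only has one pattern to work with; the same positions captured by a RewriteCond are the ones written %1 to %4.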

Also, for clarification, is %{REQUEST_URI} everything that’s after %{HTTP_HOST}?

aaron,

Testers have to work in a peculiar way. The way that mod_rewrite works is to create $n (0 is the entire string, 1 <= n <= 9 in order of creation) from atoms in the RewriteRULE and %n (1 <= n <= 9 in order of creation) from RewriteCOND directives.
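As a rough illustration of the two kinds of backreference, here is a toy Python model: `$n` is filled from the pattern matched against the path (the RewriteRule side) and `%n` from the pattern matched against the host (the RewriteCond side). The function `apply_rule`, the single-condition limit, and the plain `ses` key are simplifying assumptions; this is not how mod_rewrite or the IIS module is implemented.

```python
import re

def apply_rule(host, path, cond_pattern, rule_pattern, substitution):
    """Toy model: %n comes from the condition match, $n from the rule match."""
    rule_m = re.match(rule_pattern, path)
    if rule_m is None:
        return None
    cond_m = re.match(cond_pattern, host, re.IGNORECASE)
    if cond_m is None:
        return None

    def expand(ref):
        # ref.group(1) is "$" or "%", ref.group(2) is the digit.
        source = rule_m if ref.group(1) == "$" else cond_m
        return source.group(int(ref.group(2))) or ""

    return re.sub(r"([$%])(\d)", expand, substitution)

result = apply_rule(
    host="acp.aaronmartonecom.localhost",
    path="this/that",
    cond_pattern=r"^(acp|clients|files|images|secure|www)\.([-a-z0-9]+)\.localhost$",
    rule_pattern=r"^(.*)$",
    substitution="/projects/%2/%1/index.cfm?ses=/$1",
)
print(result)
```

Using `$2` or `$3` in the substitution here would fail, because the rule pattern only has one group; that mirrors the point about the original rules having only `$1` to work with.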

Check the redirection (use R=301 flag) as I suspect that the earlier version would contain // in the redirection (because %1 or %2 is empty). For this special case, change the redirection to eliminate that reference (change the RewriteCond’s, too, for that matter).

http://sub.domain.tld/path/to/file.name?key=value

http is the protocol, :// are delimiters, sub.domain.tld is the {HTTP_HOST}, i.e., fully qualified domain, sub is the subdomain, domain is the name and tld is the top level domain. / is another delimiter which has some weird handling which is different between Apache 1 and Apache 2. The {REQUEST_URI} is /path/to/file.name - everything from that / up to, but not including, the ?. The ? is a reserved character which is a delimiter to set the {QUERY_STRING} apart from the {REQUEST_URI}. key=value is obviously the {QUERY_STRING}.
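The same breakdown can be checked with Python's standard `urllib.parse`, used here purely as an illustration of the URL parts and not as a statement about what the IIS module exposes. Note that the path component as parsed here keeps its leading slash, which matches the `$ses=/this/that` values seen in the rewritten URLs earlier in the thread.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://sub.domain.tld/path/to/file.name?key=value")

print(parts.scheme)    # protocol
print(parts.hostname)  # what HTTP_HOST would carry
print(parts.path)      # the path, without the query string
print(parts.query)     # the query string, without the ?
```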

Regards,

DK

OK, we’ve gotten a lot closer. After messing with some IIS setups, here’s what’s happening:

Now, here’s what I’ve modified in your rewrite code (note that I didn’t remove everything else you wrote, I just temporarily commented it out and am not showing it here)

WITH REWRITING MODIFIED FROM WHAT YOU PROVIDED (SEE BELOW):

RewriteCond %{HTTP_HOST} ^((acp|clients|files|images|ssl|www)\.)?([-a-z0-9]+)\.(localhost) [NC]
RewriteRule .? /%2/index.cfm?\$ses=%{REQUEST_URI} [L]

=============================

What I’m looking to do is change the re-write depending on if it finds a file at the specified location

Does this make sense? If the REQUEST_URI ends up being an actual FILE, then it needs to serve the file like:

Get: http://[COLOR=Red]files[/COLOR].aaronmartonecom.localhost/ui/css/file.css
(A file is found at /files/ui/css/file.css so we do the following rewrite: )
Rewrite As: /files/ui/css/file.css

But if the REQUEST_URI is not a file, then it needs to assume it is an address I will be passing off to CF with the URL.$ses variable.

Get: http://[COLOR=Red]acp[/COLOR].aaronmartonecom.localhost/this/that
(No file is found at /acp/this/that, so we pass it off to index.cfm and $ses:)
Rewrite As: /acp/index.cfm?$ses=/this/that

Is this doable?

p.s. Forgive me because I know this is different than what I was asking earlier. I think because I didn’t have my IIS set up the right way, we were going down the wrong path (again, my fault). I feel I have a clearer understanding of subdomain setup in IIS, and now need to cater my rewrite code to this.

aaron,

Because you are redirecting the “no subdomain” version differently than the “subdomain” versions of the code, I believe you must eliminate the ? (make the subdomain required for the subdomain version) from

# the ? right after the subdomain group is the one to remove
RewriteCond %{HTTP_HOST} ^((acp|clients|files|images|ssl|www)\.)?([-a-z0-9]+)\.(localhost) [NC]
RewriteRule .? /%2/index.cfm?\$ses=%{REQUEST_URI} [L]

Doing that will leave the “no subdomain” version easier to handle (it won’t have been co-opted by the “subdomain” code).

Of course that makes sense - and is the SMART thing to do! :tup: If you take a look at WordPress’s mod_rewrite code, they will check whether a request exists as a file or directory before redirecting EVERYTHING to their handler file (index.php). Failure to make those checks would mean that their handler would have to include css, js, jpg, gif, png, pdf, etc. files in their handler (a DUMB thing to do). Just remember that these checks require the PHYSICAL path, not just the URI so {REQUEST_FILENAME} is the Apache variable to use (or {DOCUMENT_ROOT} with path/file added).
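The "serve it if it exists, otherwise hand it to the handler" decision can be sketched in Python as follows. The function name `route` and the temporary directory standing in for the subdomain's root folder are assumptions for illustration; the point is only that the existence check needs the physical path, built from the root plus the URI.

```python
import os
import tempfile

def route(document_root: str, uri: str) -> str:
    """Serve existing files as-is; send everything else to index.cfm."""
    physical = os.path.join(document_root, *uri.strip("/").split("/"))
    if os.path.isfile(physical):
        return uri
    return f"/index.cfm?$ses={uri}"

with tempfile.TemporaryDirectory() as root:
    # Create one real file so both branches can be exercised.
    os.makedirs(os.path.join(root, "ui", "css"))
    open(os.path.join(root, "ui", "css", "file.css"), "w").close()

    served = route(root, "/ui/css/file.css")
    handled = route(root, "/this/that")

print(served)
print(handled)
```

Checking only the URI string, without joining it to the root folder, would never find the file on disk.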

Does that help?

Regards,

DK

I agree with you about requiring the subdomain, so I am now utilizing that code.

I have a couple questions which I think are preventing me from getting the big picture.

To my understanding, RewriteCond is a CONDITIONAL check (checking the first argument against the second). If there are multiple RewriteCond’s, all have to prove true for the next RewriteRule to run, so in essence I can do:

RewriteCond
RewriteCond
RewriteRule [L]
RewriteRule [L]

And RewriteCond 1 and 2, if true, will run the first RewriteRule (and stop, due to [L]). If not, it will run the second RewriteRule (and again, stop due to [L]). Correct?

Is there a way I can just dump all the variables like {REQUEST_URI}, etc? That way I can actively see what they are and better understand them.

http://www.am.localhost/ui => /projects/am/www/ui

BUT, the am project’s root in IIS is /projects/am, so is:
http://www.am.localhost/ui => /www/ui instead?

I would think it is. I know that “www.am.localhost” and “files.am.localhost” etc. all basically equal 127.0.0.1 or just “localhost”, the DNS entries are kinda like “aliases” and point to the same location.

I have confirmed this because I made an index.cfm in each subdomain folder, when I called:

files.am.localhost/index.cfm => It output “This is the files folder root”
acp.am.localhost/index.cfm => It output “This is the acp folder root”

So I knew that it was calling the right locations. My concern is that, in the end, the <base> tag’s HREF value will be: <base href=“http://www.am.localhost/” /> so that all relative links begin from THAT URL. For example, if I have an image whose SRC = “ui/images/file.jpg” then its full URL should be “http://www.am.localhost/ui/images/file.jpg” which SHOULD be “/www/ui/images/file.jpg”

Oh and here’s the rewrite’s compatibility:

http://www.micronovae.com/ModRewrite/ref/Compatibility.html

Seems there’s only 1 thing it doesn’t support and that’s due to IIS limitations, the use of the “-F” tag.

Here’s what I have:

#Ensure localhost is calling valid subdomain and project
RewriteCond %{HTTP_HOST} ^((acp|clients|files|images|ssl|www)\.)([-a-z0-9]+)\.(localhost) [NC]
#Don’t allow the user to directly request into the root subdomain folders
RewriteCond %{REQUEST_URI} !^(acp|clients|files|images|ssl|www)/
RewriteCond %{REQUEST_URI} -f
RewriteRule .? /index.cfm?\$isFile=%{REQUEST_URI} [L,NC]

RewriteRule .? /index.cfm?\$notFile=%{REQUEST_URI} [L,NC]

…Needless to say it is redirecting to the $notFile. Don’t mind the variable names, I just put them there to determine which rule was firing off…

I’m calling: http://www.aaronmartonecom.localhost/ui/css/am.reset.css
Which SHOULD be calling: /www/ui/css/am.reset.css
Which DOES exist…

I think I’ve found a new problem! (After MANY test-based changes to the rewrite file)

I don’t understand back references! :smiley:

The first RewriteCond can be referenced with %n variables, but after the second RewriteCond, those variables are now empty… This is why my current -f check is failing.

I need to find a reference for %n and $n variables, but putting those terms into Google; well, Google pretty much ignores those strings.

Hi aaron,

Okay, good thinking about the subdomain.

Yes, RewriteCond is a conditional check which will enable the ASSOCIATED RewriteRule. mod_rewrite’s processing (dual - it goes through the server and virtual host config files then converts to URI before addressing .htaccess code) is to loop through the RewriteRULEs looking for a match. When a match is found, it goes through the associated RewriteCOND statements (the order is actually backward, something explained as an historic anomaly, so the RewriteRule’s RewriteCond statements are BEFORE the rule). If all RewriteCond statements evaluate as true (or there are NO associated RewriteCond statements), then the RewriteRule’s redirection is made and RewriteRule processing continues (if allowed by the flags). Looping is continued until there are no further matches.

The Last flags in your example are the flags which dictate whether the loop continues. It tells mod_rewrite to cease processing and recalculate the URI. If this results in access to the same directory, then the mod_rewrite looping will start again. Therefore, you are correct IF the first RewriteRule is not redirected (and again if the first IS redirected the first time but not the second time - assuming that the redirection is back to this directory).
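The cond/rule association can be pictured with a small Python model. The `Rule` class and `run` function are hypothetical names, and the model makes a single pass only: it does not reproduce the restart after [L] described above, nor anything specific to the IIS module.

```python
import re
from dataclasses import dataclass, field

@dataclass
class Rule:
    pattern: str                                # matched against the request path
    target: str                                 # substitution if the rule fires
    conds: list = field(default_factory=list)   # (variable name, regex) pairs
    last: bool = True                           # models the [L] flag

def run(rules, env):
    """Single pass over the rules; does not model the restart after [L]."""
    path = env["REQUEST_URI"]
    for rule in rules:
        if re.search(rule.pattern, path) is None:
            continue                            # the rule pattern is tested first
        if all(re.search(rx, env[name]) for name, rx in rule.conds):
            path = rule.target
            if rule.last:
                break                           # [L]: stop processing this pass
    return path

rules = [
    Rule(r".?", "/first", conds=[("HTTP_HOST", r"^www\."), ("REQUEST_URI", r"\.css$")]),
    Rule(r".?", "/second"),                     # no conditions of its own
]

css = run(rules, {"HTTP_HOST": "www.am.localhost", "REQUEST_URI": "/ui/am.reset.css"})
page = run(rules, {"HTTP_HOST": "www.am.localhost", "REQUEST_URI": "/this/that"})
print(css)
print(page)
```

The two conditions belong to the first rule only. The second rule has none, so it fires whenever processing reaches it.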

Yes, dump $_SERVER.

The -F test checks for the file via an internal subrequest; the -f test is what is typically used, and that’s supported.

#Ensure localhost is calling valid subdomain and project
RewriteCond %{HTTP_HOST} ^((acp|clients|files|images|ssl|www)\.)([-a-z0-9]+)\.(localhost) [NC]
#Don't allow the user to directly request into the root subdomain folders
RewriteCond %{REQUEST_URI} !^(acp|clients|files|images|ssl|www)/
#Changed from your version: REQUEST_FILENAME instead of REQUEST_URI, and ! added before -f
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule .? /index.cfm?\$isFile=%{REQUEST_URI} [L,NC]

RewriteRule .? /index.cfm?\$notFile=%{REQUEST_URI} [L,NC]

What about when using {REQUEST_FILENAME} - when it’s NOT a file?

Regards,

DK