mod_rewrite: A Beginner’s Guide to URL Rewriting

What is the mod_rewrite Solution?

But what exactly does it do? Hey, that's the whole point of this article!

mod_rewrite catches URLs that meet specific conditions, and rewrites them as it was told to.

For example, you can have a non-existent

http://www.mysite.com/anything

URL that is rewritten to:

http://www.mysite.com/deep/stuff/very_complicated_url?text=having_lots_of_extra_characters

Did you expect something more? Be patient…

<IfModule mod_rewrite.c>  
RewriteEngine on  
RewriteRule ^/shortcut$ /complicated/and/way/too/long/url/here  
</IfModule>

This, too, goes into the httpd.conf file (you can also put it inside a virtual host context).

After you restart Apache (you’ll get used to it soon!) you can type this into your browser:

http://localhost/shortcut

If there’s a directory structure /complicated/and/way/too/long/url/here existing in your document root, you’re going to be “redirected” there, where you’ll see the contents of this directory (eg, the directory listing, index.html, whatever there is).
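If you can't edit httpd.conf, roughly the same rule can live in a .htaccess file in the document root. This is only a minimal sketch, assuming the server's AllowOverride setting permits rewrite directives there and that Options FollowSymLinks is enabled for the directory. The main difference is that in a per-directory context the leading slash is stripped before matching, so the pattern starts with shortcut rather than /shortcut. No restart is needed, because .htaccess files are read on each request.

```apache
# .htaccess in the document root
# (assumes AllowOverride FileInfo or All, and Options FollowSymLinks)
<IfModule mod_rewrite.c>
RewriteEngine on
# per-directory context: the pattern sees "shortcut", not "/shortcut"
RewriteRule ^shortcut$ /complicated/and/way/too/long/url/here
</IfModule>
```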

To understand mod_rewrite better, it’s important to know that this is not true redirection. “Classic” redirection is done with the Location: header of the HTTP protocol, and tells the browser itself to go to another URL. There are numerous ways to do this, for example, in PHP you could write:

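<?php
// this PHP file is located at http://localhost/shortcut/index.php
// send a Location: header so the browser requests the new URL itself
header("Location: /complicated/and/way/too/long/url/here");
exit; // stop here so nothing else is sent after the redirect
?>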

This code shows the same page by sending an HTTP header back to the browser. That header tells the browser to request another URL immediately, so the address bar changes and a second request is made. What mod_rewrite does is totally different: it maps the requested URL to the new one inside the server, and serves the page as if it were really at the address the browser asked for. The browser never learns about the longer URL. That’s why this is a URL rewriter and not a simple redirector (you can verify the difference by inspecting the HTTP headers sent and received).
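To see the difference for yourself, you could compare the internal rewrite with an external redirect produced by mod_rewrite itself. This is a sketch, assuming a second, hypothetical path /shortcut-redirect that exists only for the comparison. The R flag makes Apache answer with a Location: header, much like the PHP snippet, while the rule without flags stays invisible to the browser.

```apache
RewriteEngine on
# internal rewrite: the browser keeps showing /shortcut
RewriteRule ^/shortcut$ /complicated/and/way/too/long/url/here
# external redirect (hypothetical path): Apache replies 302 with a Location: header
RewriteRule ^/shortcut-redirect$ /complicated/and/way/too/long/url/here [R=302,L]
```

Requesting both URLs and watching the address bar, or the response headers, should make the distinction visible.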

But it’s not just shortening paths that makes mod_rewrite the “Swiss Army Knife of URL manipulation”…

Rules

You’ve just seen how to specify a really simple RewriteRule. Now let’s take a closer look…

RewriteRule Pattern Substitution [Flag(s)]

RewriteRule is a simple instruction that tells mod_rewrite what to do. The magic is that you can use regular expressions in the Pattern and references in the Substitution strings. What do you think of the following rule?

RewriteRule ^/products/([0-9]+)$ /siteengine/products.php?id=$1

Now you can use the following syntax in your URLs:

http://localhost/products/123

After restarting Apache, you’ll find this is translated as:

http://localhost/siteengine/products.php?id=123

If you use only ‘fancy’ URLs in your scripts, there will be no way for your visitor to find out where your script resides (/siteengine in the example), what its name is (products.php), or what the parameter it expects is called (id)! Do you like it? We’ve just completed two of our tasks, look!

  • Search-engine compatibility: there are no fancy characters in the URL, so the engines will explore your whole site
  • Security: ScriptKiddie(tm)-modified URLs can’t harm the script, because the regular expression first verifies that the ID is a number; URLs without the proper syntax never even reach the script.

Of course, you can create more complex RewriteRules. For example, here’s a set of rules I’m using on a site:

  RewriteRule ^/products$ /content.php
  RewriteRule ^/products/([0-9]+)$ /content.php?id=$1
  RewriteRule ^/products/([0-9]+),([ad]*),([0-9]{0,3}),([0-9]*),([0-9]*)$ /marso/content.php?id=$1&sort=$2&order=$3&start=$4

Thanks to these rules I can use the following links in the application:

  • Show an opening page that contains product categories: http://somesite.hu/products
  • Product listing, categoryid is 123, page 1 (as default), default order: http://somesite.hu/products/123
    or http://somesite.hu/products/123,,,,
  • Product listing, categoryid is 123, page 2, descending order by third field (d for descending, 3 for third field): http://somesite.hu/products/123,d,3,2, (note the trailing comma: the third rule expects all four commas)

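This is also an example of the use of multiple RewriteRules. The rules are tried in order, and when a pattern matches, the substitution is applied. By default mod_rewrite then carries on through the remaining rules using the rewritten URL; add the [L] (‘last’) flag to a rule if processing should stop there. Here none of the later patterns can match a rewritten URL, so Apache simply serves the page at the substituted URL. Should no rule match, the URL is left untouched and Apache handles it as usual, which normally means a 404 page if nothing exists at that address. And of course you can also define one or more catch-all rules (eg, ^.*$ as the last pattern) to specify which script(s) to run for a mistaken URL.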
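As a rough illustration of rule ordering, the first two rules could be written with the [L] flag and followed by a catch-all. This is a sketch under assumptions: /notfound.php is a hypothetical script name, and the catch-all is limited to URLs under /products/ so that it does not swallow every other request on the site.

```apache
RewriteEngine on
# [L] stops rule processing as soon as one of these matches
RewriteRule ^/products$ /content.php [L]
RewriteRule ^/products/([0-9]+)$ /content.php?id=$1 [L]
# catch-all for any other /products/... URL; /notfound.php is hypothetical
RewriteRule ^/products/ /notfound.php [L]
```

Because the specific rules come first and end with [L], the catch-all only ever sees URLs that none of them accepted.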

The third, optional part of RewriteRule is:

RewriteRule Pattern Substitution [Flag(s)]

Flags go inside square brackets, separated by commas. With flags, you can send specific responses to the browser when the URL matches the pattern, such as:

  • ‘forbidden’ or ‘F’ for 403 Forbidden,
  • ‘gone’ or ‘G’ for 410 Gone,
  • you may also force redirection (‘redirect’ or ‘R’), or force a MIME-type (‘type’ or ‘T’).

You can even use the:

  • ‘nocase’ or ‘NC’ flag to make the pattern case-insensitive
  • ‘next’ or ‘N’ to loop back to the first rule (the ‘next round’; this may result in an endless loop, so be careful with it!)
  • ‘skip=N’ or ‘S=N’ to skip the following N rules

…and so on.
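Here is how a few of these flags might look in practice. Treat it as a sketch: the /private/ and /old-catalog paths are hypothetical, invented only to have something to match. A substitution of - (a single dash) tells mod_rewrite to leave the URL alone and only apply the flags.

```apache
RewriteEngine on
# 403 Forbidden for anything under the hypothetical /private/ path
RewriteRule ^/private/ - [F]
# 410 Gone for a hypothetical retired page
RewriteRule ^/old-catalog$ - [G]
# case-insensitive match, so /Products/123 works too; L stops rule processing here
RewriteRule ^/products/([0-9]+)$ /siteengine/products.php?id=$1 [NC,L]
```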

I hope you feel like I felt while playing around with this module for the first time!

Tamas Turcsanyi

Tamas is the founder of Demoscene, and has created dozens of PHP-based sites. Now he's doing ebusiness work for IFS Ltd. in Hungary, and composing jazzy drumnbass and bigbeat tunes, which he hopes to have released.
