mod_rewrite: A Beginner’s Guide to URL Rewriting

This article was written in 2002 and remains one of our most popular posts. If you’re keen to learn more about URLs, you may find this recent article on the “www” prefix, by Craig Buckler, of great interest.

So you’re a web developer whose site has all the bells and whistles, and who builds Web-based applications that are both beautiful and work well. But what about these issues?

Applications Must Be Safe

A user must not be able to harm your site in any way by modifying a URL that points to your applications. To keep your site safe, check every GET variable coming from your visitors (it almost goes without saying that POST variables must be examined as well).

For example, imagine we have a simple script that shows all the products in a category.

Generally, it’s called like this:

myapp.php?target=showproducts&categoryid=123

But what will this application do if ScriptKiddie(tm) comes and types this in his browser:

myapp.php?target=showproducts&categoryid=youarebeinghacked

Well, many of the sites I’ve seen will display an error message complaining about a malformed SQL query, an invalid MySQL resource ID, and so on. These sites are not secure. And can anyone guarantee that a site-to-be-finished-yesterday will verify every parameter, even in a team of only two or three programmers?
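One way to close this hole is to refuse any categoryid that is not a plain positive integer before it gets anywhere near a SQL query. The sketch below is written in Python rather than the PHP of myapp.php, purely for illustration; the function names parse_category_id and category_from_query are hypothetical and not part of any real application.

```python
from urllib.parse import parse_qs


def parse_category_id(raw):
    """Return the category id as a positive int, or None if the input is invalid."""
    if raw is None:
        return None
    # str.isdigit() also accepts some non-ASCII digit characters,
    # so require plain ASCII digits only.
    if not (raw.isascii() and raw.isdigit()):
        return None
    value = int(raw)
    return value if value > 0 else None


def category_from_query(query):
    """Extract and validate categoryid from a raw query string."""
    values = parse_qs(query).get("categoryid", [])
    # Reject a missing parameter as well as a repeated one.
    if len(values) != 1:
        return None
    return parse_category_id(values[0])
```

With these helpers, the query string of the first URL above yields 123, while the ScriptKiddie version yields None, which the application can turn into a clean "category not found" page instead of a database error.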

Applications Must Be Search-Engine Friendly

It’s not generally known, but many of the search engines will not index your site in depth if it contains links to dynamic pages like the one mentioned above. They simply take the “name” part of the URL (that’s everything before the question mark; what follows it holds the parameters that most scripts need in order to run correctly), and then try to fetch the contents of the page. To make it clear, here are some links from our fictitious page:

myapp.php?target=showproducts&categoryid=123
myapp.php?target=showproducts&categoryid=124
myapp.php?target=showproducts&categoryid=125

Unfortunately, there’s a big chance that some of the search engines will try to download the following page:

myapp.php

In most cases, calling the script without its parameters causes an error; even if it doesn’t, it will not show the content the link was pointing to. Just try this search at google.com:

“you have an error in your sql syntax” .php -forum

The scripts listed in the results contain both huge bugs and security threats, and, again, they are not search-engine friendly.
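The crawler behaviour described above can be mimicked in a few lines. This is only an illustrative sketch in Python; name_part is a hypothetical helper, and no real search engine’s code is implied.

```python
from urllib.parse import urlsplit

# The three links from the fictitious page.
LINKS = [
    "myapp.php?target=showproducts&categoryid=123",
    "myapp.php?target=showproducts&categoryid=124",
    "myapp.php?target=showproducts&categoryid=125",
]


def name_part(url):
    """Return the URL without its query string, as a query-dropping crawler sees it."""
    return urlsplit(url).path


# Collect the distinct pages such a crawler would try to fetch.
distinct_pages = {name_part(link) for link in LINKS}
```

All three links collapse into the single page myapp.php, so the three categories are indistinguishable to a crawler that ignores the query string.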

Applications Must Be User-Friendly

If your application uses links like:

http://www.downloadsite.com?category=34769845698752354

then most of your visitors will find it difficult to get back to their favourite category (e.g. Nettools/Messengers) every time they start from the main page of your site. Instead, they’d like to see URLs like this:

http://www.downloadsite.com/Nettools/Messengers

It’s also easier for the user to pick such a URL from the browser’s drop-down list as they type into the Location field (though of course this only works if the user has visited that page previously).

And what about you?

Now you have everything you need to answer the following questions:

  • Is your site really safe enough?
  • Can you protect your site from hackers?
  • Are your Websites search-engine compatible?
  • Are the URLs on your site ‘user friendly’ — are they easy to remember? Would you like them to be?

(everyone who answered ‘yes’ to all five questions: have a beer!)

An elegant solution

Okay, okay, I think you want to know the solution. Well, let’s get started. You’ll need:

  • everyone’s favourite Apache Webserver installed (v1.2 or later)
  • optionally, your favourite CGI scripts configured for Apache. Yes, I’ve said optionally, since what we’re going to do will happen right inside Apache and not PHP, or Perl, etc.
  • since (nearly) everything in Apache is controlled through its configuration files (httpd.conf, .htaccess, etc.), being familiar with these files might help you. You’ll also need write access to these files, and permission to restart Apache. I’d strongly recommend you do everything on a private test server first, rather than on your own, or your company’s, production server!

Most of you will have read and/or heard about mod_rewrite. Yes, it’s an Apache module, and it’s even installed by default! Go and check your modules directory (note that under *nix operating systems there’s a chance that your Apache was compiled without mod_rewrite, in which case, consult your sysadmin).

We’re going to use this tiny module to achieve everything mentioned above. To use this module, first we have to enable it, since it’s initially disabled in the configuration file. Open the httpd.conf file and uncomment the following lines (remove the leading #s):

#LoadModule rewrite_module modules/mod_rewrite.so
#AddModule mod_rewrite.c

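The first line tells Apache to load the mod_rewrite module, while the second one enables the use of it. (The AddModule directive exists only in Apache 1.3; on Apache 2.x the LoadModule line alone is enough.) After you restart Apache, mod_rewrite should be enabled, but not yet running.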
