How to Avoid 404s and Redirect Old URLs in PHP
Nothing can be said to be certain, except death and taxes. And URL changes.
It’s often necessary to reorganize your site and change the URL structure but, assuming you have similar content, users should rarely encounter a “page not found” error. Producing unnecessary 404 pages is one of my top 10 development mistakes.
In this article, we’ll create an automated PHP redirection system that converts old URLs to a new address. It’s not production code, but it will illustrate the basics so that you can adapt it for your own website.
1. Create a 404 error-handling file
If you don’t yet have a “not found” page, create a basic one named 404.php in the root of your website:
<?php
// basic 404 error page
header('HTTP/1.1 404 Not Found');
header('Status: 404 Not Found');
?>
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Page not found</title>
</head>
<body>
<h1>Page not found</h1>
<p>Sorry, we cannot find that page.</p>
<p><a href="/">Please return to the home page…</a></p>
</body>
</html>
404 is the HTTP status code returned when a resource cannot be located on the server. The PHP code at the top of the above file returns this code to ensure systems such as search engines don’t mistake the page for real content.
2. Configure your server
You now need to tell your server that all 404 errors should be handled by the 404.php file. If you’re using Apache, add the following line to an .htaccess file in the root of your website:
ErrorDocument 404 /404.php
For IIS, open the Internet Information Services Manager. In IIS7, double-click the “Error Pages” icon. (Users of previous versions must select the “Custom Errors” tab of the website properties.) Edit the 404 error code, choose a type of “URL”, and enter “/404.php” as the address.
If you now visit a nonexistent page, such as http://yoursite.com/non-existent.url, you should see the error page we created above.
3. Create the redirection system
We’ll place our redirection code in another file named redirect.php, to keep the functionality separate from the 404 content.
Add the following code at the top of your 404.php file just after the <?php declaration:
include('redirect.php');
Now create redirect.php in the website root and add the following code:
<?php
// current address
$oldurl = strtolower($_SERVER['REQUEST_URI']);

// new redirect address
$newurl = '';
The current page address is stored in $oldurl, e.g. /non-existent.url.
We now need to examine that address and, if possible, translate it to a new URL (stored in the $newurl variable). How this is achieved will depend on the structure of your old and new URLs. For example, if the only change is that your ‘blog’ folder has been renamed ‘blogs,’ the following code could be sufficient:
$newurl = str_replace('/blog/', '/blogs/', $oldurl);
if ($newurl === $oldurl) $newurl = ''; // nothing changed: show the 404 page instead
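Plain string substitution is easy to get subtly wrong: a short pattern also matches inside longer words such as ‘blogger’, while a pattern padded with slashes misses an address that ends at the folder name. As a rough sketch only, the same rename could be limited to a whole path segment with a regular expression. The function name rewriteBlogFolder is hypothetical and not part of the original code; it returns an empty string when nothing changes, so the 404 page is shown rather than redirecting back to the same address.

```php
<?php
// Hypothetical helper: rename the "blog" folder to "blogs" only when
// "blog" is a complete path segment. Returns '' if the URL is unchanged.
function rewriteBlogFolder($oldurl) {
    // (^|/)      start of string or a slash before "blog"
    // (?=[/?]|$) a slash, a query string, or the end of the URL after it
    $newurl = preg_replace('#(^|/)blog(?=[/?]|$)#', '${1}blogs', $oldurl);
    return ($newurl === null || $newurl === $oldurl) ? '' : $newurl;
}
```

In redirect.php this could be called as $newurl = rewriteBlogFolder($oldurl); an address such as /bloggers/list is left alone and falls through to the error page.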
You may be able to use a series of substitutions or regular expressions. Alternatively, a mapping of every old URL to its new address could be defined in an array or database table.
If you have fairly simple redirect requirements, the following code could be used:
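$redir = array(
  'blog' => '/blogs/',
  'video' => '/videos/',
  'demo' => '/demonstrations/main/'
);

// use the first entry whose key appears anywhere in the old URL
// (foreach replaces each(), which was removed in PHP 8)
foreach ($redir as $old => $new) {
  if (strpos($oldurl, $old) !== false) {
    $newurl = $new;
    break;
  }
}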
The $redir array defines a number of key/value pairs that can be configured accordingly. If the first string can be found anywhere within the old URL, the redirection address is set to the second string. In the example above, if the word ‘blog’ is found in the missing page URL, the user will be redirected to ‘/blogs/’. If the old URL contains two or more of those words, the one listed first in the array takes precedence; for example, /video/demonstrations would match ‘video’ before ‘demo’ and redirect to ‘/videos/’.
Once you have a $newurl value, you could optionally double-check that it exists before redirecting. Unfortunately, that poses a few risks:
- The new URL may lack a physical file, for example if you’re using WordPress permalinks. PHP’s file_exists() function would fail to find anything.
- You could use a function such as file() or http_get() to check that the URL exists; however, if it doesn’t, you’ll end up back at your redirect.php file, which could trigger recursive redirect attempts.
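If you do decide to verify the target, one way to limit the recursion risk is to mark the verification request so that redirect.php can skip its own logic when it sees the marker. The following is a minimal sketch under several assumptions: allow_url_fopen is enabled so that get_headers() can fetch a URL, the URL passed in is absolute, and the ‘redircheck’ argument and both function names are hypothetical.

```php
<?php
// Hypothetical helper: extract the numeric status from the first response
// header line, e.g. "HTTP/1.1 404 Not Found" gives 404. Returns 0 if absent.
function statusFromHeaders(array $headers) {
    if (isset($headers[0]) && preg_match('#^HTTP/\S+\s+(\d{3})#', $headers[0], $m)) {
        return (int) $m[1];
    }
    return 0;
}

// Hypothetical helper: true if the first response for $url has a 2xx status.
// $url must be absolute (scheme and host included). The "redircheck"
// argument is an assumed marker: redirect.php would need to do nothing
// when $_GET['redircheck'] is set, otherwise a missing target would
// trigger another check of itself.
function urlExists($url) {
    $sep = (strpos($url, '?') === false) ? '?' : '&';
    $headers = @get_headers($url . $sep . 'redircheck=1');
    if ($headers === false) {
        return false; // request failed entirely
    }
    $status = statusFromHeaders($headers);
    return $status >= 200 && $status < 300;
}
```

Each check costs an extra HTTP request on every redirected 404, which is one more reason to treat it as optional.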
Personally, I’d avoid double-checking the URL unless it’s easy to do so. If your redirect URLs are incorrect, it’ll soon become apparent during testing or by examining your server log files.
Finally, if we have a $newurl, we’ll redirect to that page. Otherwise, we’ll show the 404.php error:
// redirect
if ($newurl != '') {
  header('HTTP/1.1 301 Moved Permanently');
  header("Location: $newurl");
  exit();
}
?>
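Where individual pages have moved rather than whole folders, the exact-mapping approach mentioned earlier may suit better than a substring search. Here is a minimal sketch: the function name lookupRedirect is hypothetical and the example paths are invented for illustration.

```php
<?php
// Hypothetical helper: look up the path part of the old URL in an
// exact old => new map. Returns '' when there is no entry.
function lookupRedirect($oldurl, array $map) {
    // ignore any query string, letter case, and a trailing slash when matching
    $path = (string) parse_url($oldurl, PHP_URL_PATH);
    $path = rtrim(strtolower($path), '/');
    return isset($map[$path]) ? $map[$path] : '';
}

// Invented example entries: keys are lowercase and have no trailing slash.
$map = array(
    '/about-us.html'   => '/about/',
    '/blog/first-post' => '/blogs/2010/first-post/',
);
```

Calling $newurl = lookupRedirect($oldurl, $map); before the redirect step would leave $newurl empty for any address not listed, so unlisted pages still reach the error page.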
This code is a simple example, but you should be able to adapt it for any situation. You could also consider further options such as:
- logging all unmapped URLs to a file for later inspection
- preventing multiple redirection mistakes by storing a cookie, using a session value, or passing an argument to the new URL
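Both options can be sketched in a few lines. This is an illustration rather than tested production code: the function names, the log file path you pass in, and the ‘redirected’ argument name are all assumptions of mine.

```php
<?php
// Hypothetical helper: append one line per unmapped URL to a log file.
// $logFile is an assumed path; it must be writable by the web server.
function logUnmappedUrl($oldurl, $logFile) {
    // strip control characters so a crafted URL cannot forge log lines
    $clean = preg_replace('/[\x00-\x1F\x7F]/', '', $oldurl);
    $line = date('c') . "\t" . $clean . "\n";
    return file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX) !== false;
}

// Hypothetical helper: add a marker argument to the new URL so that a
// second 404 on the target can be detected instead of redirected again.
function addRedirectMarker($newurl, $marker = 'redirected') {
    $sep = (strpos($newurl, '?') === false) ? '?' : '&';
    return $newurl . $sep . $marker . '=1';
}

// Hypothetical check for redirect.php: was this request already redirected?
function alreadyRedirected($requestUri, $marker = 'redirected') {
    $query = (string) parse_url($requestUri, PHP_URL_QUERY);
    parse_str($query, $args);
    return isset($args[$marker]);
}
```

In redirect.php, you might skip the mapping when alreadyRedirected($oldurl) is true, call logUnmappedUrl() whenever $newurl is still empty, and pass $newurl through addRedirectMarker() before sending the Location header.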
I hope you find it useful. I’ll show you how to modify the code for a WordPress installation in a new SitePoint article coming soon …