How to Avoid 404s and Redirect Old URLs in PHP

Contributing Editor

Nothing can be said to be certain, except death and taxes. And URL changes.

It’s often necessary to reorganize your site and change the URL structure but, assuming you have similar content, users should rarely encounter a “page not found” error. Producing unnecessary 404 pages is one of my top 10 development mistakes.

In this article, we’ll create an automated PHP redirection system that converts old URLs to a new address. It’s not production code, but it will illustrate the basics so that you can adapt it for your own website.

1. Create a 404 error-handling file

If you don’t yet have a “not found” page, create a basic one named 404.php in the root of your website:

<?php
// basic 404 error page
header('HTTP/1.1 404 Not Found');
header('Status: 404 Not Found');
?>
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Page not found</title>
</head>
<body>
<h1>Page not found</h1>
<p>Sorry, we cannot find that page.</p>
<p><a href="/">Please return to the home page&hellip;</a></p>
</body>
</html>

Note: What’s a 404?

404 is the HTTP status code returned when a resource cannot be found on the server. The PHP code at the top of the above file returns this code to ensure systems such as search engines don’t mistake the page for real content.

2. Configure your server

You now need to tell your server that all 404 errors should be handled by the 404.php file. If you’re using Apache, add the following line to an .htaccess file in the root of your website:

ErrorDocument 404 /404.php

For IIS, open the Internet Information Services Manager. In IIS7, double-click the “Error Pages” icon. (Users of previous versions must select the “Custom Errors” tab of the website properties.) Edit the 404 error code, choose a type of “URL”, and enter “/404.php” as the address.

If you now visit a nonexistent page, such as http://yoursite.com/non-existent.url, you should see the error page we created above.

3. Create the redirection system

We’ll place our redirection code in another file named redirect.php, to keep the functionality separate from the 404 content.

Add the following code at the top of your 404.php file, just after the opening <?php tag and before the header() calls:

include('redirect.php');

Now create redirect.php in the website root and add the following code:

<?php
// current address
$oldurl = strtolower($_SERVER['REQUEST_URI']);
// new redirect address
$newurl = '';

The current page address is stored in $oldurl, e.g. /non-existent.url.

We now need to examine that address and, if possible, translate it to a new URL (stored in the $newurl variable). How this is achieved will depend on the structure of your old and new URLs. For example, if the only change is that your ‘blog’ folder has been renamed ‘blogs,’ the following code could be sufficient:

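// only set $newurl when the old folder is present; otherwise every 404 would redirect to itself
if (strpos($oldurl, '/blog/') !== false) $newurl = str_replace('/blog/', '/blogs/', $oldurl);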

You may be able to use a series of substitutions or regular expressions. Alternatively, a mapping of every old URL to its new address could be defined in an array or database table.

If you have fairly simple redirect requirements, the following code could be used:

$redir = array(
	'blog' => '/blogs/',
	'video' => '/videos/',
	'demo' => '/demonstrations/main/'
);

// each() was removed in PHP 8, so loop with foreach and stop at the first match
foreach ($redir as $old => $new) {
	if (strpos($oldurl, $old) !== false) {
		$newurl = $new;
		break;
	}
}

The $redir array defines a number of value pairs that can be configured accordingly. If the first string can be found anywhere within the old URL, the redirection address is set to the second string. In the example above, if the word ‘blog’ is found in the missing page URL, the user will be redirected to ‘/blogs/’. If the old URL contains two or more of those words, the first one takes precedence; for example, /video/demonstrations would find ‘video’ first and redirect to ‘/videos/’.

Once you have a $newurl value, you could optionally double-check that it exists before redirecting. Unfortunately, that poses a few risks:

  • The new URL may lack a physical file; for example, if you’re using WordPress permalinks. PHP’s file_exists() function would fail to find anything.
  • You could use a function such as file() or http_get() (from the PECL HTTP extension) to check that the URL exists; however, if it doesn’t, you’ll end up back at your redirect.php file, which could trigger recursive redirect attempts.
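If your new URLs do map to physical files, a check along the following lines might be enough. This is only a sketch: target_exists() is a hypothetical helper, and it assumes the document root is passed in (for example, $_SERVER['DOCUMENT_ROOT']). It reads the file system rather than requesting the URL, so it cannot trigger the recursive redirects described above, but it will report false for rewritten addresses such as WordPress permalinks.

```php
<?php
// Hypothetical helper: true if $newurl maps to a real file or folder under $docroot.
// Only meaningful for sites that serve physical files.
function target_exists($newurl, $docroot)
{
	// ignore any query string
	$path = explode('?', $newurl, 2)[0];

	// realpath() returns false for missing paths and resolves any ../ segments
	$root = realpath($docroot);
	$full = realpath($docroot . '/' . ltrim($path, '/'));
	if ($root === false || $full === false) {
		return false;
	}

	// refuse anything that resolves outside the document root
	return $full === $root || strpos($full, $root . DIRECTORY_SEPARATOR) === 0;
}
```

In redirect.php, you could then clear $newurl whenever target_exists($newurl, $_SERVER['DOCUMENT_ROOT']) returns false, so the visitor falls through to the normal 404 page.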

Personally, I’d avoid double-checking the URL unless it’s easy to do so. If your redirect URLs are incorrect, it’ll soon become apparent during testing or by examining your server log files.

Finally, if we have a $newurl, we’ll redirect to that page. Otherwise, we’ll show the 404.php error:

// redirect
if ($newurl != '') {
	header('HTTP/1.1 301 Moved Permanently');
	header("Location: $newurl");
	exit();
}
?>
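For sites with many moved pages, the array mapping and regular expressions mentioned earlier could be combined roughly as follows. Treat this as a sketch under assumptions: map_old_url() is a hypothetical helper, and every path and pattern in it is a made-up sample to be replaced with your own old and new structures.

```php
<?php
// Hypothetical helper: returns the new address for an old URL, or '' if none is known.
function map_old_url($oldurl)
{
	// drop any query string and normalise case
	$path = strtolower(explode('?', $oldurl, 2)[0]);

	// 1. exact matches: one entry per moved page (sample values only)
	$exact = array(
		'/about.html'   => '/about/',
		'/contact.html' => '/contact-us/',
	);
	if (isset($exact[$path])) {
		return $exact[$path];
	}

	// 2. patterns: regular expression => replacement (sample values only)
	$patterns = array(
		'#^/blog/(.*)$#'                => '/blogs/$1',
		'#^/video/([a-z0-9-]+)\.html$#' => '/videos/$1/',
	);
	foreach ($patterns as $regex => $replacement) {
		if (preg_match($regex, $path)) {
			return preg_replace($regex, $replacement, $path);
		}
	}

	// nothing matched, so the caller should show the 404 page
	return '';
}
```

With this in place, redirect.php could set $newurl = map_old_url($_SERVER['REQUEST_URI']); and leave the final redirect step unchanged. Exact matches are checked first so that a specific page can override a broader pattern.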

This code is a simple example, but you should be able to adapt it for any situation. You could also consider further options such as:

  • logging all unmapped URLs to a file for later inspection
  • preventing multiple redirection mistakes by storing a cookie, using a session value, or passing an argument to the new URL
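Both options could be sketched along these lines. The helper names (log_unmapped_url(), add_redirect_flag(), was_redirected()), the "redirected" argument and the log file location are all assumptions made for illustration, not part of the code above.

```php
<?php
// Hypothetical helper: append an unmapped URL to a log file, one entry per line.
function log_unmapped_url($oldurl, $logfile)
{
	// strip line breaks so a crafted URL cannot forge extra log entries
	$clean = str_replace(array("\r", "\n"), '', $oldurl);
	$line = date('c') . "\t" . $clean . "\n";
	// FILE_APPEND adds to the file; LOCK_EX avoids interleaved writes
	return file_put_contents($logfile, $line, FILE_APPEND | LOCK_EX) !== false;
}

// Hypothetical helper: mark a redirect target so a second miss is not redirected again.
function add_redirect_flag($newurl)
{
	$sep = (strpos($newurl, '?') === false) ? '?' : '&';
	return $newurl . $sep . 'redirected=1';
}

// Hypothetical helper: true if this request already came from one of our redirects.
function was_redirected($requesturi)
{
	$query = (string) parse_url($requesturi, PHP_URL_QUERY);
	parse_str($query, $args);
	return isset($args['redirected']);
}
```

In redirect.php, you might skip the redirect entirely when was_redirected($_SERVER['REQUEST_URI']) is true, pass $newurl through add_redirect_flag() before sending the Location header, and call log_unmapped_url() whenever $newurl is still empty. The trade-off of the argument approach is that the extra query string is visible in the visitor’s address bar.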

I hope you find it useful. I’ll show you how to modify the code for a WordPress installation in a new SitePoint article coming soon …

  • Herbalite

    One area I’d like to see addressed in this article is why certain redirection hurts search engine rankings, even when using a 301 permanent redirection.

    • http://www.optimalworks.net/ Craig Buckler

      Really? Why would that happen … surely a redirect would be better than showing a “page not found”?
      The only reason I could imagine it happening is if:
      (a) you redirect to a new page containing totally different content, or
      (b) you have a chain of 301s.

      I assume you’re moving to a more friendly URL structure too, e.g. /page1.html to /seo-friendly-terms? If so, any negative impact on the 301 will be more than outweighed by the better URL.

      • Herbalite

        Do you expect a search engine to keep proper stats showing that the old page old.html is in fact the same location as seo-friendly-page.html? In short, this is about canonical issues.

        There is also a good blog post by a Google engineer who admits that Google messed up in the past with 301 redirects (I don’t know about the latest situation) and explains what else can and should be done.

        http://gregable.com/2009/02/relcanonical.html

      • http://www.optimalworks.net/ Craig Buckler

        A 301 states that the page has been moved permanently. In effect, old.html should be removed from the search engine’s index and replaced with seo-friendly-page.html. If it’s the same content, there should be a negligible effect on the rank.

        Actually, this isn’t about canonical issues. That’s when you have two or more URLs with the same content which exist at the same time, e.g. mysite.com and mysite.com/index.html. In that situation, you could use a redirect or the canonical link tag.

  • Speekenbrink

    Clean post… but why doesn’t anyone ever advocate the other header() parameters? They cut the code for the permanent redirect down to:
    header('Location: /newUrl', false, 301);

  • just a small tip (IIS only)

    If you’re running IIS 7.5 (and possibly earlier versions too) and you want to try out your custom 404 page on your own machine you might need to add the following to the web.config of your site under the <system.webServer> section:
    <httpErrors errorMode="Custom" />

    Alternatively, in IIS Manager in the Error Pages section on the right-hand side under Actions you can click ‘Edit Feature Settings…’ and then choose ‘Custom error pages’ to get the same effect.

    If you don’t do this there’s a good chance you’ll see the detailed error page that IIS returns when you’re running locally.

  • Anonymous

    Why not just use rewrite rules? I’d understand the need for PHP if you were creating a web based interface for clients to add redirects but if not, just create a .htaccess file in the root and add this to the top.

    RewriteRule ^/my-old-url?$ /my-new-url? [R=301,L]

    This would also be faster than a PHP solution.

    • http://www.optimalworks.net/ Craig Buckler

      You can — if you’re using Apache.

      That solution’s absolutely fine if you only have a few redirects or can handle every one with a simple regex. Unfortunately, there are occasions when you have hundreds of pages being redirected. Under those circumstances, I’d rather have an easy-to-debug PHP file/database than declare every URL in .htaccess.

  • Meketrefe

    What a piece of crap Craig! Why do you always have to write this kind of misleading articles? Please go pay yourself a proper PHP course: http://www.zend.com/en/services/training/

    • http://www.optimalworks.net/ Craig Buckler

      Thanks for your feedback Meketrefe.

      Perhaps you could provide more specific reasons why this is misleading and how you would approach the problem?

  • Peter in Barrow

    I don’t really agree with the premise that a 404 is a bad thing for a user. When pages move, if a redirect is handled automatically, through PHP or other software, users tend not to update their bookmarks. As a result, “bad links” are promulgated. A well-crafted 404 page can tell users that changes have taken place on their target website and put the responsibility back on them to fix bookmarks. If possible, it would be helpful if the 404 showed the user where to find the needed page, but leave the responsibility for navigating to the new location in the hands of the user.

    • http://www.optimalworks.net/ Craig Buckler

      It’s an interesting thought, although I’d rather search engines could find the new URL automatically.

      You could detect search engine robots and send them a 301 whereas standard users get a 404 page. It’s a slightly dubious method since most search engines dictate that users and robots should ‘see’ the same content.

      That said, how many users bother updating their bookmarks anyway? Old URLs never die!

  • Jeremy Cook

    Wouldn’t it be better to do this sort of thing in .htaccess files using the Apache Redirect command and possibly mod_rewrite? I imagine that would be faster and would save the need for Apache to invoke the PHP engine at all for the redirects.

  • PHP Development

    The post is really good, and it is true that 404 errors can drop the PR and authority of a website. This article helps in such situations, so thank you for the information.

  • Sphamandla

    Finally getting a sense of how the 404 page works with PHP to redirect pages. It makes sense and isn’t that hard, but I’m digging deeper into PHP and its operations now…

  • markus

    Hi Craig, my problem is that I would like to catch all deleted pages on http://www.domain.tld/postnumber1234/post-title/ and redirect to the new location http://SUBDOMAIN.domain.tld/postnumber1234/post-title/ where they can now be found. How can I do that?