HTTP 1.0 RESTful coding -- or $_REQUEST is evil, don't use it

The moment you decide to dive into SEO and use the webserver’s rewrite module you open a critical can of worms – you have to negotiate at least some of the HTTP response protocol for your pages yourself. Understanding and taking advantage of the power that protocol provides can make your code easier to maintain and follow, and more secure.

First, a little 101.

HTTP, the HyperText Transfer Protocol, and its SSL-secured sister HTTPS, is the protocol of the web, at least for the moment. The emergence of web sockets may or may not change that. An HTTP response consists of a plain-text header followed by a MIME-typed body. Regardless of what the file actually is, the browser and the webserver handle this business transparently, and it isn’t until we get PHP involved in the middle of things with mod_rewrite that we really need to bother ourselves with the headers.
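To make the header-then-body layout concrete, here is a minimal sketch that splits a raw response into its parts. The function name parseHttpResponse and the sample response are my own inventions for illustration, and the parsing is deliberately naive (no folded headers, no repeated header names).

```php
<?php
// Hypothetical helper (not part of PHP): split a raw HTTP response
// into status line, headers, and body.
function parseHttpResponse(string $raw): array
{
    // The header block ends at the first blank line (CRLF CRLF).
    $parts = explode("\r\n\r\n", $raw, 2);
    $body = $parts[1] ?? '';

    $lines = explode("\r\n", $parts[0]);
    $statusLine = array_shift($lines); // e.g. "HTTP/1.0 200 OK"

    // Status line: protocol version, numeric code, reason phrase.
    $status = explode(' ', $statusLine, 3);

    $headers = [];
    foreach ($lines as $line) {
        // Each header line has the form "Name: value".
        $pair = explode(':', $line, 2);
        if (count($pair) === 2) {
            // Header names are case-insensitive, so normalise them.
            $headers[strtolower(trim($pair[0]))] = trim($pair[1]);
        }
    }

    return [
        'protocol' => $status[0],
        'code'     => (int) ($status[1] ?? 0),
        'reason'   => $status[2] ?? '',
        'headers'  => $headers,
        'body'     => $body,
    ];
}

// Made-up sample response for illustration.
$raw = "HTTP/1.0 200 OK\r\n"
     . "Content-Type: text/html\r\n"
     . "Content-Length: 13\r\n"
     . "\r\n"
     . "<p>Hello</p>\n";

$response = parseHttpResponse($raw);
```

Here $response['code'] holds the status code, $response['headers']['content-type'] the MIME type, and $response['body'] whatever followed the blank line.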

An overview of the protocol

By and large, PHP scripts handle two types of requests: GET and POST. The PHP SAPI maps the parameters of these to, not surprisingly, $_GET and $_POST.

$_REQUEST is an ill-conceived shortcut that merges these two variables and, depending on the request_order setting in php.ini, merges in the $_COOKIE superglobal as well. ($_SESSION is not merged in; it is populated separately by the session extension.) Using $_REQUEST strips you of knowing the context of the request by the user.
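The loss of context is easiest to see with a collision. The sketch below imitates the merge with a hypothetical function, mergeLikeRequest (my name, not a PHP built-in), on the assumption that sources are applied left to right and later ones overwrite earlier ones, which is how the request_order directive is documented to behave.

```php
<?php
// Hypothetical stand-in for the $_REQUEST merge, for illustration only.
// $order mimics the request_order ini setting: G = GET, P = POST, C = cookie.
function mergeLikeRequest(array $get, array $post, array $cookie, string $order = 'GP'): array
{
    $sources = ['G' => $get, 'P' => $post, 'C' => $cookie];
    $merged = [];
    foreach (str_split(strtoupper($order)) as $letter) {
        if (isset($sources[$letter])) {
            // Later sources silently overwrite earlier ones on a key collision.
            $merged = array_replace($merged, $sources[$letter]);
        }
    }
    return $merged;
}

// The query string asks to edit record 7, but the form body says delete.
$get  = ['action' => 'edit', 'id' => '7'];
$post = ['action' => 'delete'];

$merged = mergeLikeRequest($get, $post, [], 'GP');
// $merged['action'] is now 'delete', and nothing in $merged says where it came from.
```

Reading $_GET and $_POST directly keeps both values, along with the knowledge of which one arrived by which route.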

The reason it is important to distinguish the two has to do with their intent and how the browser handles them. POST is implicitly a write operation; GET is implicitly a read operation. Because of this, a page generated by GET is more likely to be held in cache, not only by the browser but also by any intervening proxies, and the user can bookmark the results of a GET request (if they are net savvy). Critically, the browser doesn’t prompt the user for confirmation if they refresh a page received via GET. The same is not true of POST.

Since POST is a write operation (or is expected to be a write operation) the user will be nagged by the browser if they try to refresh. Some browsers even (justly) warn about credit cards being resubmitted and payments being repeated.

A useful side effect of going to the trouble of distinguishing the types is that you don’t have to come up with as many operation names. For example, $_GET['action'] == 'edit' can be the call to retrieve the form to edit a record, while $_POST['action'] == 'edit' would be the call to commit the changes.
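One way this could look in code is a small dispatcher that picks a handler from the request method plus the action name. Everything here is a sketch: resolveHandler and the handler names it returns (showEdit, commitEdit, notFound, methodNotAllowed) are hypothetical, and a real script would go on to call the chosen handler.

```php
<?php
// Hypothetical dispatcher: the same action name maps to a different
// handler depending on whether the request was a read or a write.
function resolveHandler(string $method, array $get, array $post): string
{
    $method = strtoupper($method);

    if ($method === 'GET' || $method === 'HEAD') {
        $action = $get['action'] ?? '';
        $prefix = 'show';      // read: fetch the form or the record
    } elseif ($method === 'POST') {
        $action = $post['action'] ?? '';
        $prefix = 'commit';    // write: save the submitted changes
    } else {
        return 'methodNotAllowed';
    }

    // Only accept action names we know about.
    $allowed = ['edit'];
    if (!in_array($action, $allowed, true)) {
        return 'notFound';
    }

    return $prefix . ucfirst($action);
}

$read  = resolveHandler('GET', ['action' => 'edit'], []);   // 'showEdit'
$write = resolveHandler('POST', [], ['action' => 'edit']);  // 'commitEdit'
```

In a live script the call would be resolveHandler($_SERVER['REQUEST_METHOD'], $_GET, $_POST).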

All fine and well, but what about status codes: 404, 200, et al.? These are the second part of the HTTP protocol you need to know something about, especially when you are using mod_rewrite. There are five major classes.

  • 1XX: The informational group. I’ve yet to write a PHP script that needed to raise one of these myself.
  • 2XX: Success. 200 is the most frequently used.
  • 3XX: Redirection. If you write a CMS that lets users change the URIs of pages, you’ll be mucking with these soon enough.
  • 4XX: Client Error. 404 is the most famous, but 401 and 403 will see use by PHP scripts which use authentication. 410 should also be used in a CMS where resources get deleted.
  • 5XX: Server Error. Uncaught PHP script exceptions should raise 500 unless some other code is appropriate (in my own framework the 404 condition bubbles up as an exception).
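Since the class is just the leading digit, it can be worked out with integer division. The function name statusClass below is my own, used only to illustrate the grouping in the list above.

```php
<?php
// Hypothetical helper: map a status code to the name of its class,
// using the leading digit of the code.
function statusClass(int $code): string
{
    $classes = [
        1 => 'Informational',
        2 => 'Success',
        3 => 'Redirection',
        4 => 'Client Error',
        5 => 'Server Error',
    ];

    // 404 -> 4, 200 -> 2, and so on.
    $leading = intdiv($code, 100);

    return $classes[$leading] ?? 'Unknown';
}

$gone = statusClass(410); // 'Client Error'

// In a web request, PHP's built-in http_response_code(410) would set
// the status before any output is sent.
```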

A full list is here.

You can come up with your own. When you do, your new code will be treated as a member of its class. I use one custom code: 520, Programmer Debug Trace. I started from 520 in case more official codes are added someday (though none have been in 15 or so years, so it’s unlikely to happen). That code is treated by browsers as a 500 error, meaning it won’t be cached (desirable with a debug trace call).
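Sending a custom code from PHP comes down to emitting the status line yourself. A minimal sketch, assuming a helper I have called statusLine (hypothetical) that builds the string, which is then handed to PHP’s header() function:

```php
<?php
// Hypothetical helper: build a status line for any three-digit code,
// including ones with no official reason phrase.
function statusLine(int $code, string $reason, string $protocol = 'HTTP/1.0'): string
{
    if ($code < 100 || $code > 599) {
        throw new InvalidArgumentException("Status code out of range: $code");
    }
    return sprintf('%s %d %s', $protocol, $code, $reason);
}

$line = statusLine(520, 'Programmer Debug Trace');

// In a web request this would be sent with header($line), which has to
// happen before any body output. header() treats a string starting with
// "HTTP/" as the status line rather than as an ordinary header.
```

The protocol defaults to HTTP/1.0 here only to match the title of this piece; passing $_SERVER['SERVER_PROTOCOL'] as the third argument would echo whatever the client used.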