Saving Resources with phpCache

Databases are spectacular tools — using one at the back of your Website can give you tremendous power in terms of flexibility and maintenance. A database effectively separates your content from the design, allowing you to edit one without affecting the other. This, coupled with the development of newer scripting languages like PHP and free databases like MySQL, has seen the popularity of database-driven sites increase.

However, even though database-driven Websites offer many benefits over their traditional, static counterparts, there are two major downsides to running a dynamic site. The first is that people tend to use query strings (anything following a "?" in the URL) to pass the correct information to their scripts. This approach can create problems with search engines, although there are a few solutions to this issue.

The second problem with a dynamic site is that it can create a high server load, and as your site grows in popularity, it may become slower due to an increase in the number of database queries being performed. In this article we’ll look at a solution to this problem – the implementation of a caching system for your site.

There are different ways to decrease the server load your site generates. Some recommend the use of static rather than dynamic pages, and the periodic running of a script or program that generates the new static page. This approach works — but what if you still want some elements served dynamically on the page in question?

Well, you could write a script that periodically generates another script. For instance, instead of using a PHP script to generate an HTML file, you could use a PHP script to generate another PHP script. In such a case, the generated script would cause less server overhead than its parent. While this approach might succeed, it may also create more work: you’ll need to write a different script to generate each page that you run using this technique. Also, if you choose to generate HTML pages and then change your mind later, you might have to wait a month or more to have all your new pages indexed by search engines. In that time, you could easily lose your search rankings.

Instead, why not use a caching system? This tool allows you to cache parts of your script, such as the database queries, while it keeps other elements completely dynamic. So how do you write a caching system? Well, you don’t need to, because Nathan at 0x00.org has written a great system and released it under the General Public License. This article will teach you how to install and use this script on your own site.

Installing phpCache

First, download the script and unzip it. The only file you need is phpCache.inc. View the other included files at your leisure: they may give you ideas on different ways to use the script, but for the purposes of this article they aren’t needed.

Once you’ve extracted the file, consider renaming it. It’s a bad idea to put any script that doesn’t end with the proper file extension in a public area on your server. If the file doesn’t have an extension that the server hands to PHP, anyone who requests it directly will see its source code in the browser, including any sensitive information it contains. So it’s a good idea to either rename the file using a .php extension, or store it in a directory above your root public directory.

There are a few things you’ll need to edit inside the file itself. Open the file in your favorite text editor and look for this line:

define(CACHE_DIR, "/tmp/phpCache/");

Unless you have your own server you’ll need to edit that code. You should make a tmp directory above your root html directory, and inside it place a phpCache directory. Then enter the path to your new directory like so:

define("CACHE_DIR", "/home/username/tmp/phpCache/"); // quoting the constant name keeps this valid on newer PHP versions

If you don’t know the path to your directory, ask your server administrator or log into a shell session and use the pwd command.
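Before pointing CACHE_DIR at a new directory, it can help to confirm that PHP is actually able to write there. The following is a minimal sketch, not part of phpCache: the function name ensure_cache_dir() is hypothetical, and the path uses the system temporary directory only so the snippet runs anywhere. Substitute the directory you created above.

```php
<?php
// Hypothetical helper (not part of phpCache): make sure the cache
// directory exists and is writable before pointing CACHE_DIR at it.
function ensure_cache_dir($dir)
{
    // Create the directory (and any missing parents) if needed.
    if (!is_dir($dir) && !mkdir($dir, 0700, true) && !is_dir($dir)) {
        return false;
    }
    return is_writable($dir);
}

// Example path only; substitute your own directory.
$dir = sys_get_temp_dir() . "/phpCache-example/";
echo ensure_cache_dir($dir) ? "cache directory ready\n" : "cannot use $dir\n";
```

If the check fails, the usual culprits are a mistyped path or permissions that don't let the Web server's user write to the directory.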

The other thing you may need to edit is the key function. phpCache uses a key to store the cached data. The key is generated by default using GET and POST variables as well as your query string. If you’re using another method to pass information to your dynamic pages, then you’ll need to edit this function in order to take those variables into account.

Two variables you might be using are $PATH_INFO and $REQUEST_URI. If you are, change your cache_default_key() function (found in the phpCache.inc file) to this:

function cache_default_key() {  
 global $HTTP_GET_VARS, $QUERY_STRING, $PATH_INFO,  
 $REQUEST_URI;  
 return md5("GET=" . serialize($HTTP_GET_VARS) . "QS=" .  
 $QUERY_STRING . "PATH_INFO=" . $PATH_INFO . "REQUEST_URI=" .  
 $REQUEST_URI);  
}

Alternatively, you can supply the key yourself by using the cache() function, rather than cache_all(), when doing your caching. With cache() you can key the cache on whatever you like, such as your database primary key. More about this later.
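To see why the key matters, here is a small sketch of the same md5-over-request-data idea. The function example_key() is a hypothetical stand-in that takes its inputs as arguments rather than reading globals, so it can be tried outside a Web request. The paths are made up.

```php
<?php
// Hypothetical stand-in for a key function: the inputs are passed in
// explicitly so the behaviour is easy to see outside a web request.
function example_key(array $get, $queryString, $pathInfo, $requestUri)
{
    return md5(
        "GET=" . serialize($get) .
        "QS=" . $queryString .
        "PATH_INFO=" . $pathInfo .
        "REQUEST_URI=" . $requestUri
    );
}

// Same script, same (empty) query string, different PATH_INFO.
$a = example_key(array(), "", "/articles/12", "/index.php/articles/12");
$b = example_key(array(), "", "/articles/34", "/index.php/articles/34");

echo $a === $b ? "keys collide\n" : "keys differ\n";
```

If the path information were left out of the key, both requests would map to the same cache entry and the second visitor would be served the first article.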

Once you’ve uploaded the file, you’ll need to include it at the start of the PHP scripts on which you want to use it, like this:

include("../phpCache.inc"); 

Now that you’ve installed the script, and have it in place in the scripts you wish to use it with, let’s learn how to use it to cache your data.

Setting up the Caching Function

There is one main function you need to be familiar with, and that is cache(). This function takes three parameters: time, object, and key, and it’s used like this:

if (!($et=cache(time, object, key))){  
.  
.  
.  
endcache();  
}

The first line above assigns the return value of cache() to the $et variable and uses it to check whether a cache is available. If there isn’t one, the block of code that follows will be executed. It is in that block of code that you need to include the information you want cached (such as your database queries). Finally the endcache() function, which saves and closes the cache, is executed. Your data won’t be saved without the endcache() function, so be sure to include it.

There is one important detail you need to know before you set this up on your page. If you’ll be running any queries outside the cache, in addition to the ones you’ll run inside the cache, don’t establish your database connection in the cached block of code. Place it above the code block instead — this way your non-cached queries will still run.

Now we need to fill in the arguments for the cache() function. The first argument, time, is pretty straightforward: it’s the duration of the cache in seconds. You can set this argument to any number, from 60 seconds to 6 million seconds. If you set the time argument to 0, the cache will never time out.

The next two arguments really work together, as they’re both used to store and identify the cached data. The object argument usually holds the name or URL of the script. The key argument usually holds an identifier that’s not related to the script name, such as the query string, form variables, or the primary key from your database, as discussed above. To store and retrieve your cached data, phpCache will cross-reference the object and the key.
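If it helps to picture how a lifetime, an object and a key fit together, the following toy file cache illustrates the idea. It is not phpCache's actual code: the toy_cache_* function names, the file naming scheme and the sample data are all assumptions made for this sketch.

```php
<?php
// Toy file cache, NOT phpCache's real code: it only illustrates how a
// lifetime, an object and a key can combine to store and find data.
function toy_cache_file($dir, $object, $key)
{
    // Cross-reference object and key into a single file name.
    return rtrim($dir, "/") . "/" . md5($object) . "_" . md5($key);
}

function toy_cache_put($dir, $object, $key, $data)
{
    $file = toy_cache_file($dir, $object, $key);
    return file_put_contents($file, serialize($data)) !== false;
}

function toy_cache_get($dir, $object, $key, $time)
{
    $file = toy_cache_file($dir, $object, $key);
    clearstatcache(); // make sure we read the file's current timestamp
    if (!is_file($file)) {
        return null; // nothing cached yet
    }
    // A lifetime of 0 means the entry never times out.
    if ($time > 0 && (time() - filemtime($file)) > $time) {
        return null; // cache has expired
    }
    return unserialize(file_get_contents($file));
}

$dir = sys_get_temp_dir();
toy_cache_put($dir, "authors.php", "id=7", array("last_name" => "Doe"));
$hit  = toy_cache_get($dir, "authors.php", "id=7", 600); // the cached array
$miss = toy_cache_get($dir, "authors.php", "id=8", 600); // null: other key
```

The same object with a different key is a separate entry, which is what lets one script cache many variations of its output.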

There is also another function for caching, called cache_all(), which can be used like this:

if (!($et=cache_all(time))){  
.  
.  
.  
endcache();  
}

The obvious difference with this function is that it only takes one argument, time. The reason it doesn’t have an object or key argument is that it uses defaults for those fields. The defaults are generated by two functions you’ll find in your phpCache.inc file. One, cache_default_key(), we mentioned above. This function generates a key based on form variables and your query string. The object default is generated by the cache_default_object() function, and simply returns the host and filename of the script.

In many cases, you can use these defaults. The exceptions are the ones discussed above: you may want to edit, or simply not use, the default key, and when you want to share the same cached data across multiple pages, you won’t want to use the default object. Either attribute can easily be replaced with any identifier you choose.

Caching your Database Queries

You must do more than simply put your database queries in the caching code block for them to be cached. You must still specifically identify the information you want cached, and to do that we use the cache_variable() function.

The cache_variable() function takes one argument: the name of the variable you want cached. It can be used like this:

cache_variable("variablename"); // the variable $variablename has been cached

This, of course, needs to be placed within the caching code block.

Technically, this is all the information you need for this script to work. However, as some of you may have problems making the jump from caching a single variable to caching multiple query results, I’d like to share my particular method for doing so.

Caching a database query that returns one row is very easy, because you can simply cache each field independently. It gets more complicated when your query returns multiple rows. Surely you don’t want to cache every variable in a 100-row result set independently? To cache results that consist of multiple rows, we make use of arrays, and in particular two wonderful functions called array_push() and array_walk().

Let’s say you have an article-driven site and want to cache a list of all the different authors, along with their emails and whatever you use as a primary key (we’ll just say id). To cache such a query, you might do something like this:

$result_authors = mysql_query("SELECT first_name, last_name, email, id    
FROM authors ORDER BY last_name", $db);  
if (!$result_authors) {  
 echo("<p>Error performing author query: " . mysql_error() . "</p>");    
 exit();  
} // run the query and do error checking  
$authors = array(); // create the array  
// extract the query data into array $main  
while ($main = mysql_fetch_array($result_authors)){    
 // use array push to insert a result set row into our array  
 array_push($authors, $main);    
}  
cache_variable("authors"); // cache your array  
mysql_free_result($result_authors);

Most of that should be familiar to you. If you need some help with connecting to a database and running queries with PHP, I suggest you read Kevin Yank’s article series, which covers the subject in ample detail.

What I am going to explain is the array_push() function. This function takes two arguments, the first being the array you wish to use, and the second being the data you wish to put in the array. All the function really does is take that data and place it in the next available position at the end of the array. What I do is push each result row into our $authors array, and, as all the database information is then contained in one variable (our array), we can easily cache it using the cache_variable() function.
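The mysql_* functions in the listing above were removed in PHP 7.0, so here is a sketch of just the array_push() step that runs without a database. The rows are made-up sample data standing in for what the fetch loop would return, and the serialize() round trip is only an illustration of storing one variable whole. How phpCache itself stores data isn't covered in this article.

```php
<?php
// Stand-in rows: in the article these come from the database, one per
// loop iteration. The values here are made up for illustration.
$rows = array(
    array("first_name" => "Ada",   "last_name" => "Lovelace", "email" => "ada@example.com",   "id" => 1),
    array("first_name" => "Grace", "last_name" => "Hopper",   "email" => "grace@example.com", "id" => 2),
);

$authors = array();
foreach ($rows as $main) {
    // Each push appends one whole row to the end of $authors.
    array_push($authors, $main);
}

// One variable now holds the whole result set, so it can be stored
// and restored in one piece (serialize() is used only as an example).
$restored = unserialize(serialize($authors));

echo count($restored), "\n";           // number of rows collected
echo $restored[1]["last_name"], "\n";  // a field from the second row
```

Run as is, this prints 2 and then Hopper: both rows survive the round trip with their field names intact.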

Now we come to the second array function I mentioned, array_walk(). While array_push() is great for inserting one row into an array at a time, array_walk() is perfect for running a function on one row in the array at a time.

To extract the data from your cached array, you’ll need to use array_walk() where you want the data to appear on your page:

if ($authors) { // if the array exists and is not empty
    print("<h2>Authors</h2>");
    print("<ul>");
    // walk the array $authors through the function print_authors()
    array_walk($authors, 'print_authors');
    print("</ul>");
}

The array_walk() function also takes two arguments, the first being the array you wish to use, and the second being the name of the function you wish to pass each element of the array to. The "if statement" here is entirely optional if you know that $authors will always exist and contain an array. It’s included in this case because the "Authors" heading might look silly if indeed there were no authors (either because there were no authors in the database, or because the cache was lost or corrupted).

In the above example I use the function print_authors(). This is a user-defined function that you should define somewhere above in your PHP script. Some people like to put all of their functions in a separate file and then include it, which is a good solution if you’ll be using the same function across multiple pages. Otherwise, the top of your PHP script is a good place to define functions.

In this specific example the function might look like this:

function print_authors($author) {
    // $author is one row of the $authors array; extract its fields
    $first_name = $author["first_name"];
    $last_name = $author["last_name"];
    $email = $author["email"];
    $id = $author["id"];
    // print the author's information (inner quotes must be escaped)
    print("<li><a href=\"mailto:$email\">$id: $last_name, $first_name</a></li>");
}

Everything in the above function should be familiar. Here, we simply pull the database information out of the array we made above.
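If any of the author fields can contain characters such as < or &, it's worth escaping them before they go into the page. The sketch below shows one way that might look. The function name print_author_safe() and the sample data are hypothetical, and the htmlspecialchars() escaping is an addition that isn't part of the function above.

```php
<?php
// Hypothetical variant of print_authors(): same fields, but values are
// escaped with htmlspecialchars() before being placed in the markup.
function print_author_safe($author, $index)
{
    $name  = htmlspecialchars($author["last_name"] . ", " . $author["first_name"]);
    $email = htmlspecialchars($author["email"]);
    $id    = (int) $author["id"];
    print("<li><a href=\"mailto:$email\">$id: $name</a></li>\n");
}

// Made-up sample data standing in for the cached $authors array.
$authors = array(
    array("first_name" => "Ada", "last_name" => "Lovelace", "email" => "ada@example.com", "id" => 1),
);

if ($authors) {
    print("<h2>Authors</h2>\n<ul>\n");
    // array_walk() calls the function once per element, passing the
    // element first and its key second.
    array_walk($authors, 'print_author_safe');
    print("</ul>\n");
}
```

The second parameter, $index, is unused here. It's listed only to show what array_walk() passes to the callback.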

Implement it!

You should be able to implement this script on your site right now, if you haven’t started already. If you refer to the example I outlined above, it should show you all the code needed to get this system up and running. If you happen to have any problems, though, drop in over at the SitePoint Forums and don’t be afraid to ask for help.
