PHP Master | Practical Code Refactoring, Part 4 – Efficiency

In part three of this series we dealt with refactoring code for extensibility and discussed logical extensibility, modular design, decoupling, and encapsulation. In this final part of the series, we discuss what the main concerns of efficiency are for your web application and how to refactor for better efficiency.

Most web applications are light by nature, so efficiency usually revolves around keeping bottlenecks to a minimum (none in a perfect world), whether they be memory consumption, processor availability, or network traffic. We all want our applications to respond and render content “reasonably fast”.

Before I mention some sources of inefficiency and how to deal with them, you should know that you have two “best friends” when it comes to efficiency: buffering and caching. Buffering allows efficient utilization of memory and network resources, and caching allows efficient utilization of memory and processor. These are two important concepts which you can always use for more performant web apps.

You may use these ideas to refactor your code at a fine-grained level, not just as a big solution for a big problem. A good example of granular caching is an image gallery with thumbnails; instead of generating thumbnails dynamically each time you load the gallery, generate them once and cache them. They can be loaded from cache, and the cache can be updated whenever a new image is uploaded to the server. An example of fine-grained buffering is reading a big file from the disk or network in chunks (into a buffer) rather than all at once. Reading the entire file in one go can take an unpredictable amount of time, holds the whole file in memory, and can block your application’s response.
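Both ideas can be sketched in a few lines of PHP. The helper names readInChunks() and cachedOrGenerate() are hypothetical and exist only for this illustration; the cache here is a plain file with no expiry, which is an assumption rather than a recommendation.

```php
<?php
/**
 * Buffering: yield a file piece by piece instead of loading it whole.
 * readInChunks() is a hypothetical helper, not a PHP built-in.
 */
function readInChunks(string $path, int $chunkSize = 8192): Generator
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        while (!feof($handle)) {
            $chunk = fread($handle, $chunkSize);
            if ($chunk === false) {
                throw new RuntimeException("Read failed for $path");
            }
            if ($chunk !== '') {
                yield $chunk; // only one chunk is held in memory at a time
            }
        }
    } finally {
        fclose($handle);
    }
}

/**
 * Caching: reuse a stored result if it exists, otherwise build and store it.
 * cachedOrGenerate() is a hypothetical helper; $generate must return a string.
 */
function cachedOrGenerate(string $cacheFile, callable $generate): string
{
    if (is_file($cacheFile)) {
        $cached = file_get_contents($cacheFile);
        if ($cached !== false) {
            return $cached;
        }
    }
    $content = $generate();
    file_put_contents($cacheFile, $content, LOCK_EX);
    return $content;
}

// Possible usage (paths and the generator callback are made up):
// foreach (readInChunks('/path/to/big.log') as $chunk) { echo $chunk; }
// $thumb = cachedOrGenerate('/cache/thumb_42.jpg', fn () => buildThumbnail(42));
```

In the gallery scenario, the upload handler would delete or rewrite the cached file so the next request regenerates it.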

As we go through common problems and their refactoring strategies, keep in mind that since we are talking about efficiency, some of what follows is optimization rather than pure refactoring.

Network Bandwidth Inefficiencies

1. Are resource requests from the server kept to a minimum?

A web app is some business logic (written in PHP for example) that generates HTML, CSS, and JavaScript to be downloaded by a client (usually a browser). Keep in mind that you always have someone “waiting” for your page to load, and they are not willing to wait for very long. Generally speaking, your page should load in around 2.5 seconds on a 512 Kbps connection. For heavy pages, this shouldn’t be more than 5-10 seconds or the user will start to feel impatient. To achieve this, you need to cut down on the number of resources you request from the server. Instead of loading many CSS files, merge them into a single file. The same goes for JavaScript. Then, optimize these aggregate files further.
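As a minimal sketch of the merging step, the following build-time helper concatenates several stylesheets into one bundle. The function name mergeStylesheets() and the file names in the usage comment are hypothetical, and the sketch assumes the files contain no relative paths that would break when moved into the bundle.

```php
<?php
/**
 * Concatenate stylesheets into a single bundle file.
 * mergeStylesheets() is a hypothetical helper; returns the bytes written.
 */
function mergeStylesheets(array $sourceFiles, string $bundlePath): int
{
    $bundle = '';
    foreach ($sourceFiles as $file) {
        $content = file_get_contents($file);
        if ($content === false) {
            throw new RuntimeException("Cannot read $file");
        }
        // Label each part so the origin of a rule is still traceable.
        $bundle .= '/* ' . basename($file) . " */\n" . rtrim($content) . "\n";
    }
    $written = file_put_contents($bundlePath, $bundle, LOCK_EX);
    if ($written === false) {
        throw new RuntimeException("Cannot write $bundlePath");
    }
    return $written;
}

// Possible usage (file names are made up):
// mergeStylesheets(['css/reset.css', 'css/layout.css'], 'css/bundle.css');
```

The page then references only the bundle, so the browser makes one request instead of several. The order of the source files matters, since later rules override earlier ones.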

You can easily cut the file size of CSS and JavaScript files by at least 30% if you use compressors/minifiers such as YUI Compressor and CSSO. Some might argue that these are optimizations and not code refactoring, but at this point there is some overlap, since you are minimizing your code to minimize file size.

For images you should use CSS sprites, a technique by which you can group images together in a single file to minimize the number of requests. It’s more network efficient to download a single larger file than a large number of smaller images. You can learn more about creating sprites here.

2. Do you use server callbacks wisely?

Most modern web-based applications are Ajax-based and routinely call the server in the background. This is fine so long as your server callbacks don’t have much latency. Don’t overuse Ajax for simple tasks that don’t require the server. Only use it when you REALLY need to fetch something from the server, and not for true real-time data. If you require real-time data, consider using techniques for data-push such as Comet and WebSockets.

Memory Inefficiencies

1. Are you using recursion?

Recursion is a powerful programming feature that can save you a lot of hassle when solving some complex problems, like tree traversal and searching. But recursion comes at a price: in languages like PHP it can consume all of your memory if you don’t know what you are doing. Most problems which are solved with recursion can be solved with iteration, so keep recursion for those which specifically deserve it.
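To make the trade-off concrete, here is a sketch of the same tree traversal written both ways. The tree shape (arrays with 'name' and 'children' keys) and the function names are assumptions made up for this example.

```php
<?php
// Example tree; the 'name'/'children' structure is an assumption.
$tree = [
    'name' => 'root',
    'children' => [
        ['name' => 'a', 'children' => [
            ['name' => 'a1', 'children' => []],
        ]],
        ['name' => 'b', 'children' => []],
    ],
];

// Recursive version: one call frame per level of depth.
function collectNamesRecursive(array $node): array
{
    $names = [$node['name']];
    foreach ($node['children'] as $child) {
        $names = array_merge($names, collectNamesRecursive($child));
    }
    return $names;
}

// Iterative version: an explicit stack replaces the call stack.
function collectNamesIterative(array $root): array
{
    $names = [];
    $stack = [$root];
    while ($stack) {
        $node = array_pop($stack);
        $names[] = $node['name'];
        // Push children in reverse so they are visited left to right.
        foreach (array_reverse($node['children']) as $child) {
            $stack[] = $child;
        }
    }
    return $names;
}

$recursive = collectNamesRecursive($tree); // ['root', 'a', 'a1', 'b']
$iterative = collectNamesIterative($tree); // same order
```

Both functions visit the nodes in the same depth-first order. The iterative one keeps its bookkeeping in an ordinary array, so its depth is limited by available memory rather than by the call stack.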

2. Do you iterate too much?

Using iteration instead of recursion doesn’t give you the green light to do whatever you like; each technique has its limits. When it comes to efficiency, nested loops get expensive quickly because the work multiplies with each level of nesting. Use nested loops only for matrix-like problems, those which require iterating over two or more dimensions and which can’t be solved by other means.
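One common case where a nested loop can be avoided is checking which values of one list also appear in another. The sketch below contrasts the nested version with a lookup-table version; the function names are hypothetical, and the lookup variant assumes the values are integers or strings (a requirement of array_flip()).

```php
<?php
// Nested loops: up to count($a) * count($b) comparisons.
function commonIdsNested(array $a, array $b): array
{
    $common = [];
    foreach ($a as $x) {
        foreach ($b as $y) {
            if ($x === $y) {
                $common[] = $x;
                break;
            }
        }
    }
    return $common;
}

// Lookup table: one pass to build it, one pass to check against it.
function commonIdsLookup(array $a, array $b): array
{
    $seen = array_flip($b); // values of $b become array keys
    $common = [];
    foreach ($a as $x) {
        if (isset($seen[$x])) {
            $common[] = $x;
        }
    }
    return $common;
}

$nested = commonIdsNested([1, 2, 3, 4], [3, 4, 5]); // [3, 4]
$lookup = commonIdsLookup([1, 2, 3, 4], [3, 4, 5]); // [3, 4]
```

For simple cases like this, PHP's built-in array_intersect() may be all you need; the point of the sketch is only the shape of the refactoring.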

You should also give most of your loops an early exit condition to avoid page load latency. For example, you may want to iterate over the latest 20 news feeds, but why not only the latest 5 and then provide a nice “more” link? Think in that manner and you can increase your page speed.
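The “latest 5 plus a more link” idea might look like the sketch below. The function name firstPage() and the array keys are hypothetical.

```php
<?php
/**
 * Return only the first $limit items, plus a flag telling the template
 * whether a "more" link is needed. firstPage() is a hypothetical helper.
 */
function firstPage(array $items, int $limit = 5): array
{
    return [
        'items'   => array_slice($items, 0, $limit),
        'hasMore' => count($items) > $limit,
    ];
}

$feeds = range(1, 20);      // stand-in for 20 news feed entries
$page  = firstPage($feeds); // 5 items, hasMore is true
```

The loop that renders the page then runs over $page['items'] only, so it stops after 5 entries no matter how many feeds exist.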

3. Are your queries planned?

Database latency is a very common bottleneck in PHP applications, so you should always plan your queries. How many of us have written something like this?

select * from users;

The problem becomes relevant when the table grows over 10,000 users and has many columns. You’re selecting all columns of all rows in a single query! This is a typical unplanned query.

If you’re not using an ORM, you should dedicate some time to planning the queries you need. Decide exactly what data you need to retrieve, or else you can easily render your web app useless after your first 1,000 users.

Some tables can grow very quickly, so you should account for this in advance. Don’t take the tables for granted. Always select particular columns and rows, and if you’re not sure, use a limit clause to limit the generated result set.
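A planned version of the users query could look like the following sketch. It uses PDO with an in-memory SQLite database purely so that it runs without a database server (this assumes the pdo_sqlite extension is available). The table layout and the function name fetchUserPage() are made up for the example.

```php
<?php
// In-memory SQLite database, used only to keep the sketch self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT, bio TEXT)');

$insert = $pdo->prepare('INSERT INTO users (name, email, bio) VALUES (?, ?, ?)');
for ($i = 1; $i <= 50; $i++) {
    $insert->execute(["user$i", "user$i@example.com", str_repeat('x', 1000)]);
}

/**
 * Fetch only the columns the page needs, and only one page of rows.
 * fetchUserPage() is a hypothetical helper.
 */
function fetchUserPage(PDO $pdo, int $limit, int $offset = 0): array
{
    $stmt = $pdo->prepare(
        'SELECT id, name FROM users ORDER BY id LIMIT :limit OFFSET :offset'
    );
    $stmt->bindValue(':limit', $limit, PDO::PARAM_INT);
    $stmt->bindValue(':offset', $offset, PDO::PARAM_INT);
    $stmt->execute();
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$page = fetchUserPage($pdo, 10); // 10 rows, each with only id and name
```

Compared with select * from users, this never transfers the large bio column and never returns more than the requested number of rows, however big the table gets.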

Processing Inefficiencies

1. Do you need dynamic language features?

PHP has a lot of dynamic features, like overloading getters and setters, dynamic instantiation and method/function calling, and even the most dangerous of all, the eval() construct. Use all of them wisely. These may be magical elements of some creative solutions to non-standard problems, but generally speaking you don’t need them in a standard web app, where using any of them extensively is unnecessary added complexity.

2. Do you employ good coding habits?

If you need to loop based on the count of an array, count it outside the loop so it’s counted only once. The same applies to any other function call whose result you compare against in the loop condition.
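The sketch below makes the difference visible by counting how often the limit is computed. loopLimit() is a hypothetical stand-in for any function whose result does not change while the loop runs.

```php
<?php
// Hypothetical stand-in: returns the loop limit and records each call.
function loopLimit(array $items, int &$calls): int
{
    $calls++;
    return count($items);
}

$items = ['a', 'b', 'c'];

// Limit computed in the condition: evaluated before every iteration.
$calls = 0;
for ($i = 0; $i < loopLimit($items, $calls); $i++) {
    // work on $items[$i]
}
$callsInCondition = $calls; // 4 (three passes plus the final failed check)

// Limit computed once in the initializer and reused.
$calls = 0;
for ($i = 0, $n = loopLimit($items, $calls); $i < $n; $i++) {
    // work on $items[$i]
}
$callsHoisted = $calls; // 1
```

PHP's count() on an array is itself cheap, so the saving is small in that specific case; the habit pays off more when the call in the condition does real work.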

Summary

The efficiency of your application depends on several factors. The guidelines suggested here are not a replacement for post-coding optimizations, which require you to measure for inefficiencies using profiling tools. They are pre-refactor steps you can take to help reduce bottlenecks before they start. Refactoring code is meant to be an incremental process; do it in simple small steps, a bit at a time.

Refactoring is an important technique in software development that helps keep your code healthy. Throughout this series I’ve introduced a set of practical lists you can use when refactoring. I’ve talked about good code, and how you can achieve good code in terms of Readability, Extensibility, and Efficiency. I hope you enjoyed reading the series as much as I enjoyed writing it! :)

Image via Fotolia
