Some thoughts on performance and optimisation

[RIGHT]Premature optimization is the root of all evil.
Donald E. Knuth
[/RIGHT]

This little article has grown from a note I posted in some discussion thread, which read:

The only proper way of handling performance issues is profiling.
Once you get familiar with it, you forget about such trifles as “concatenation vs. expanding”, “single vs. double” or the slowdown caused by @. Moreover, your eyes open to things you never knew before: the real performance issues that actually affect timing and resource consumption.

Now, encouraged by some kind words, I decided to expand it a bit.

In fact, the internet is stuffed with “best practice” tips on how to make your app faster, teaching you the right way to compose a string or even which quotes are better to use. With proof tests, of course.
Why are these tests bad? Because they test nothing. In the real world, your app does many other things besides composing strings. Moreover, with each request your server has to run a whole PHP interpreter, which is far bigger than your script.
On top of that, there are usually network costs that flatten everything like a steamroller*.

If you want to test something for real, use ApacheBench (ab), a very simple utility supplied with the Apache web server.
Type something like this at the server’s command prompt:

ab -n 10 http://example.com/

It will send 10 requests to your site and measure how long they take.
You will notice that the numbers differ, and the difference is far bigger than anything you can gain with your super-micro-optimization. That’s the way real things are. There is always an observational error present, and lots of circumstances that affect the result.
Therefore, we can (and ought to) measure only the real things that can actually be measured.
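
The same scatter shows up without any web server. Here is a minimal sketch, assuming nothing beyond core PHP: it times an identical piece of work several times and reports the spread. The function names (time_runs, spread) and the workload are made up for illustration.

```php
<?php
// Time the same callable $runs times; returns the durations in seconds.
function time_runs(callable $work, int $runs): array
{
    $times = [];
    for ($i = 0; $i < $runs; $i++) {
        $start = microtime(true);
        $work();
        $times[] = microtime(true) - $start;
    }
    return $times;
}

// Summarise a list of durations: fastest, slowest and mean.
function spread(array $times): array
{
    return [
        'min'  => min($times),
        'max'  => max($times),
        'mean' => array_sum($times) / count($times),
    ];
}

// Hypothetical workload: build a string by concatenation.
$work = function () {
    $s = '';
    for ($i = 0; $i < 100000; $i++) {
        $s .= 'item ' . $i . "\n";
    }
    return strlen($s);
};

$stats = spread(time_runs($work, 10));
printf("min %.4f  max %.4f  mean %.4f\n", $stats['min'], $stats['max'], $stats['mean']);
```

If the gap between min and max is bigger than the gain a rewrite promises, that gain cannot be told apart from noise this way.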

Nearly everything said here applies to a dedicated server. Shared hosting, which is used for most of your sites, is a completely different world, with its own way of leveling (or producing) differences.

So, are there no performance issues at all, you might ask? Of course there are.
They are just different from all these syntax issues.
To find them we use a thing called profiling.
As a side note, I have to say that PHP’s fate is to have many developers who aren’t actually programmers: artists, gamers, housewives. That is both good and bad. But as a matter of fact, many PHP users lack programming skills, and the most important among those are debugging and profiling. Yet both are as simple as an egg.

Profiling simply means measuring the runtime of different parts of a script.
It can be done either manually or with software such as Xdebug, http://xdebug.org/.
Manual profiling is very simple.
You record the current time at various places in your code, using the microtime() function,
and then do some calculations at the end, like this:

<?php
// Record a named timestamp at each point of interest.
$TIMER['start'] = microtime(true);
// some code
$query = "SELECT ...";
$TIMER['before q'] = microtime(true);
// $db is assumed to be an open mysqli connection
$res = mysqli_query($db, $query);
$TIMER['after q'] = microtime(true);
while ($row = mysqli_fetch_array($res)) {
    // some code
}
$TIMER['array filled'] = microtime(true);
// some code
$TIMER['pagination'] = microtime(true);

// Show the report only to requests coming from the local machine.
if ('127.0.0.1' === $_SERVER['REMOTE_ADDR']) {
    echo "<table border=1><tr><td>name</td><td>so far</td><td>delta</td><td>per cent</td></tr>";
    reset($TIMER);
    $start = $prev = current($TIMER);
    $total = end($TIMER) - $start;
    foreach ($TIMER as $name => $value) {
        $sofar = round($value - $start, 3);
        $delta = round($value - $prev, 3);
        // Avoid division by zero if all timestamps are equal.
        $percent = $total > 0 ? round($delta / $total * 100) : 0;
        echo "<tr><td>$name</td><td>$sofar</td><td>$delta</td><td>$percent</td></tr>";
        $prev = $value;
    }
    echo "</table>";
}
?>

It will print out something like this:


name             so far  delta   per cent
start            0       0       0
before q         0.004   0.004   9
after q          0.039   0.035   82
array filled     0.042   0.003   7
pagination       0.042   0       0

So, we can see that a single line of our code took about 80% of the time. A wise one would optimize the SQL query, not the code that is represented by 0 0 0 in this table.
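
The arithmetic behind that table can also be pulled out into a function, so that the measuring is kept apart from the HTML output. This is only a sketch; the name timer_report and the sample timestamps are made up here.

```php
<?php
// Turn named timestamps (as collected with microtime(true)) into rows of:
// name, seconds so far, seconds since the previous mark, per cent of total.
function timer_report(array $timer): array
{
    $rows = [];
    if (!$timer) {
        return $rows;
    }
    $start = $prev = reset($timer);
    $total = end($timer) - $start;
    foreach ($timer as $name => $value) {
        $delta = $value - $prev;
        $rows[] = [
            'name'    => $name,
            'sofar'   => round($value - $start, 3),
            'delta'   => round($delta, 3),
            // Guard against a zero total when all marks share one timestamp.
            'percent' => $total > 0 ? round($delta / $total * 100) : 0.0,
        ];
        $prev = $value;
    }
    return $rows;
}

// Usage with made-up timestamps.
$rows = timer_report(['start' => 100.0, 'before q' => 100.25, 'after q' => 101.0]);
foreach ($rows as $r) {
    printf("%-12s %6.3f %6.3f %3d\n", $r['name'], $r['sofar'], $r['delta'], $r['percent']);
}
```

The rows can then go to an HTML table, a log file, or anywhere else.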
SQL profiling is quite different, and should be discussed in another article.
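But EXPLAIN and BENCHMARK() are a good place to start. Put EXPLAIN before your query to see how MySQL plans to execute it. BENCHMARK() is not a prefix but a function that evaluates a scalar expression a given number of times, so you can time it. Run something like this

EXPLAIN SELECT * FROM some_table
SELECT BENCHMARK(1000000, MD5('some string'))

and study the output.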

Xdebug produces similar, but more detailed logs.
For the next level of profiling, I’d recommend the PINBA project, http://pinba.org

What to do when you’ve found code that is terribly slow?
There is no single answer. There are many ways to improve performance: hardware upgrades, database tuning, algorithm rewriting, opcode caching, output caching. Better to ask on the SitePoint forums which is best suited to your case.

When to start profiling?
When something goes wrong. When your site gets noticeably slow.


*There is, however, a way to eliminate all network costs by making all requests local, using some “proxy web-server”.

Probably not a 100% reliable way of testing as the MySQL server may not necessarily be on the same server box. The time taken for the query will depend on a number of factors:

  • What the query itself is
  • How big the tables are
  • How busy the server is
  • How much network traffic there is between the PHP and MySQL servers
  • Whether the evil SELECT * is being used

Bumping for the ’ is faster than " crowd

@Space
A bad query or two will put the rest of those to shame though
/reminisces about crashing the uni server

Worse: a bad query tested on an empty database.
I had a superb query that fetched the things contained in some area. It was fine with 100 records, but on a table with 10k records it needed 30s.
As I had some algorithm in PHP running in the page around this query, only profiling pointed me to the culprit.
An EXPLAIN then showed that, yeah, MySQL does not like to use indexes if you apply sqrt to their columns’ values. I simplified the query, moved the work it did before into some PHP code, and it was back down to less than 10ms.

Lesson learned: don’t check your performance on a brand new installation of your app.
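
The index effect can be reproduced in a few lines. This sketch swaps in an in-memory SQLite database (through PDO, so it assumes the pdo_sqlite extension) for MySQL, and a multiplication for sqrt. The table, the index and the query_plan helper are all made up, and MySQL's own EXPLAIN output looks different.

```php
<?php
// Assumption: pdo_sqlite is available. SQLite stands in for MySQL here.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE points (id INTEGER PRIMARY KEY, x REAL, y REAL)');
$pdo->exec('CREATE INDEX idx_points_x ON points (x)');

// Return the planner's description of how it would run $sql.
function query_plan(PDO $pdo, string $sql): string
{
    $details = [];
    foreach ($pdo->query('EXPLAIN QUERY PLAN ' . $sql) as $row) {
        $details[] = $row['detail'];
    }
    return implode('; ', $details);
}

// Bare column in the condition: the index on x can be used.
$plain = query_plan($pdo, 'SELECT * FROM points WHERE x < 10');
// Expression on the column: the index on x no longer applies.
$wrapped = query_plan($pdo, 'SELECT * FROM points WHERE x * x < 100');

echo $plain, "\n", $wrapped, "\n";
```

The first plan should mention idx_points_x and the second should not, even though both conditions select much the same rows when x is not negative.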

A worthy bump from me.

Thanks Shrapnel_N5!