Live PHP benchmarks, demystifying “best practices”

By Akash Mehta

As a new contributor to the SitePoint blogs, I’ll be covering PHP web development, JavaScript and general web tech.

When it comes to optimizing PHP for performance, there’s no end of resources available, and no end of conflicting opinions either. Everyone seems to have their own approach to writing “fast” PHP code; using single quotes, avoiding require_once() and calling isset() before is_array() are some of the most common. But with reliable benchmarks thin on the ground, how do we know whether any of these techniques – often touted as “performance best practices” – actually deliver benefits? Chris Vincent’s new PHP benchmark suite, PHP Bench, aims to “set the record straight” on PHP performance techniques, with a simple, comprehensive view of how the various approaches actually stack up.

The benchmark suite covers all the usual bases, taking a simple task — like iterating over an array — and speed-testing almost every possible way to achieve it. Most importantly, however, Chris takes raw numbers out of the spotlight and instead focuses on how the options compare with each other. Each test takes the fastest technique as the base value for execution time; all the other options are measured as percentages relative to it. For example, foreach($aHash as $val) has a script execution time of 558% compared to while(list($key) = each($aHash)) $tmp[] = $aHash[$key] at 100%.
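As a rough sketch of how such relative percentages can be computed (this is my own minimal harness, not Chris’s actual code, and it compares just two of the many techniques the site covers):

```php
<?php
// A sample hash with string keys to iterate over.
$aHash = array();
for ($i = 0; $i < 1000; $i++) {
    $aHash['key' . $i] = 'value' . $i;
}

$reps = 200;
$results = array();

// Technique 1: a plain foreach over the values.
$start = microtime(true);
for ($r = 0; $r < $reps; $r++) {
    foreach ($aHash as $val) { $tmp = $val; }
}
$results['foreach'] = microtime(true) - $start;

// Technique 2: iterating over array_keys() with an index lookup.
$start = microtime(true);
for ($r = 0; $r < $reps; $r++) {
    foreach (array_keys($aHash) as $key) { $tmp = $aHash[$key]; }
}
$results['array_keys'] = microtime(true) - $start;

// Pin the fastest technique at 100% and express the rest
// relative to it, as the site does.
$fastest = min($results);
foreach ($results as $name => $seconds) {
    printf("%-10s %4d%%\n", $name, round($seconds / $fastest * 100));
}
```

The absolute timings will vary wildly between machines and runs, which is exactly why the site reports ratios rather than raw numbers.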

The test scores are also generated live; refreshing the page will produce a slightly different set of results. Chris uses the header of the relatively uncomplicated results page to recommend refreshing a few times, just to ensure consistency and to avoid a one-off impact on any particular test. Personally, I’d have preferred the tests be carried out in a controlled environment and regenerated every time a new test is added; anomalies don’t help anyone and live benchmarking offers little real benefit besides operational simplicity. The code for each test is supplied for transparency.

His current set of tests is very comprehensive; no fewer than 13 tests compare the performance of various echo and print calls, with enough variety to cover any usage scenario. He’s also accepting suggestions for new tests. Most importantly, however, many of his tests help identify the value of supposed best practices. For example, PHP is often criticised for the vague results of == comparisons; the control structures tests show us that not only is === safer, it also takes about two-thirds as long to execute.
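Speed aside, a quick illustration of why === is considered the safer comparison (these are my own examples, not the suite’s test cases):

```php
<?php
// Loose comparison (==) coerces types before comparing,
// which can produce surprising matches:
var_dump('1' == '01');   // bool(true)  – both parsed as the number 1
var_dump(0 == false);    // bool(true)  – false coerces to 0
var_dump('' == null);    // bool(true)

// Strict comparison (===) requires the types to match as well:
var_dump('1' === '01');  // bool(false)
var_dump(0 === false);   // bool(false)
var_dump('' === null);   // bool(false)
```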

While most of the tests show trivial performance differences, there are some interesting results. For example, iterating over an array and modifying each element is 100 times as fast using while(list($key) = each($aHash)) as with a simple foreach ($aHash as $key=>$val), while array_keys() is about 20 times slower than while(list($key,$val) = each($aHash));.
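Concretely, the iterate-and-modify variants being compared look roughly like this (my own reconstruction; the suite’s exact code may differ, and note that each() was deprecated in PHP 7.2 and removed in PHP 8):

```php
<?php
$aHash = array('a' => 1, 'b' => 2, 'c' => 3);

// Variant 1: foreach with a key, writing back through the key.
$v1 = $aHash;
foreach ($v1 as $key => $val) {
    $v1[$key] = $val * 2;
}

// Variant 2: the while/each() form the benchmark favoured.
// Shown as a comment since each() no longer exists in PHP 8:
//   reset($v2);
//   while (list($key, $val) = each($v2)) { $v2[$key] = $val * 2; }

// Variant 3: iterating over array_keys().
$v3 = $aHash;
foreach (array_keys($v3) as $key) {
    $v3[$key] = $v3[$key] * 2;
}

// $v1 and $v3 both now hold a => 2, b => 4, c => 6.
```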

The next step, of course, would be to download the test suite to your local machine or server and run the tests yourself, to see how the factors of your production environment affect the results; Chris aims to make this possible in the near future. Many of the techniques listed are clearly going to be either memory-efficient or CPU-cycle-efficient, and the limits of your infrastructure will determine which way you want to go before scaling out. Nevertheless, managing bottlenecks in your code can really help you get the most out of your servers — after all, if you can achieve something 100 times as fast, why not? — and a site like this could become the definitive reference on PHP performance.

Update: Many commenters have noted issues with the usefulness and reliability of the tests. It’s important to consider Chris’s intentions in creating the site, as he’s described in the comments. With all the “use this syntax because it’s slightly faster” posts around, the site was built to see whether there is any real, material benefit to using a particular syntax; most of the tests show that there clearly isn’t. I’ve done a few tests of my own and seen similar results, but if something looks amiss, feel free to point it out to Chris. The site is not trying to identify immaterial performance benefits from, e.g., single quotes vs. double quotes. Going further, PHP Bench could be used to measure algorithm performance and other more intensive procedures; for now, the basic syntax tests simply reflect the potential of the system. Hopefully this clears things up.

  • Sceptic

    Simple testing here reveals that running foreach($aHash as $key => $val); is way, way faster than while(list($key,$val) = each($aHash));. The numbers I’m getting, with $aHash being an array of 10,000 (instead of 100) unique elements with 24-byte keys and 10,240-byte values, are:

    – foreach: 0.006 ~ 0.007
    – while: 0.02 ~ 0.03

  • Transition

    I’m not sure why the tests are done real-time. The disclaimer at the top of the page says it all:

    You must keep in mind to refresh this page a few times to “catch” the right result.

  • Sceptic

    Very curious about how he got his results. A quick, similar test (10,000 unique items, 10KB each, in $aHash using 24-byte keys) running foreach ($aHash as $key => $value); vs. while(list($key,$val) = each($aHash)); yields very different results:

    foreach loop: 0.006 ~ 0.007
    while loop: 0.02 ~ 0.03

    Quite the difference. And if you think about it, it makes sense. foreach may do more at the init stage (reset the array and all that) but the suggested while loop has to call a function on each iteration, one of the slowest things PHP can do.

  • Erik Bauffman

    I have run every single test involving foreach & while loops on my local PHP 5.2.1, with every form of server-side caching disabled, and each time the foreach scores up to 50% better than the while…

    What configuration is he running, what PHP version, and where can I download phpbench?

  • From all the emails I’ve been receiving, it looks as if the results on my website are being quite heavily contested. Because this website is hosted on a shared server, I don’t currently know the exact server specs, but I’m going to find out. All I do know is that 1) the PHP version is 5.2.6, because the admins are vigilant about updating it, and 2) the whole concept of this website was to open up your mind and to think about and test this stuff a bit more, rather than just reading about it in all these articles that give you the answers without explaining why.

    @Transition – I made this website live because otherwise I’d get twice the arguments about whether the tests are valid or not. This way it shows more representative results, and it’s also easier for me to move it to another server if needed.

    @Erik – I may think about releasing the test files (I keep all the major sections in separate files) in the future, but I’d first rather see how these results differ from the results that a lot of other people are receiving. At the moment, all the code in the “view code” column is stripped straight from the test files themselves, so in fact you already have everything you need.

  • Lachlan

    The question you’ve got to ask yourself is, does it matter whether a less readable syntax is faster? Show me a PHP application where your choice of loop method will make any sort of significant difference and I will show you an application that probably shouldn’t be written in PHP.

    Touting a slightly faster syntax as “Best Practice” is rubbish. Write code that is clear, maintainable and understandable and concentrate on optimizing actual bottlenecks like database access.

    I will defer to Knuth in this case:

    “We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.”

  • Anonymous

    I hate these kinds of benchmarks, and they come up so often in the PHP community that it is really starting to make me sick. Any seasoned programmer knows the dangers of premature optimization. You need to write code that is easily readable, extensible/flexible and maintainable. Don’t worry about performance until you actually need to. And when you do, I can guarantee the largest performance gains won’t come from “change all foreach loops to while loops” or “use single quotes instead of double quotes”. They will come from designing your application better, using fewer database queries and less hard drive activity. The key to optimizing in any language is to profile your code, find where the ACTUAL bottlenecks are, and optimize in those specific places.

  • Kailash Badu

    I concur with everyone who opined that there is no point nitpicking over issues that simply aren’t important in the big picture. The real bottlenecks lie somewhere else.

  • turb

    I think these kinds of benchmarks are still good for learning PHP properly. Someone said, “Don’t worry about performance until you actually need to…”; I’d say good practice at the beginning will save you a lot of time in the future.

    Even if a tiny detail like foreach instead of a while loop doesn’t change a page’s speed by much, after a month it adds up. So imagine all those tiny details; in the end, they change a lot in terms of optimisation.

    I am doing some MySQL tests these days, and I see things with a table of 2 million rows that I do not see with a smaller table. So even if it seems like a waste of time, you end up learning new techniques that you’ll reuse in the future.

    Why do some people hate Flash websites so much? Exactly because “seasoned programmers” can code very badly! So saying this stuff is not important says something about the level of professionalism of some people!

  • Mike Roetgers

    You don’t optimize your page by changing all foreach loops into while loops. What matters is the readability of your code, and while(list(,$val) = each($aHash)) is no improvement.
    BTW: the benchmark says nothing. I did some test runs on my machine with the provided source code snippets; in my case, foreach wins every time against while.

  • Trent Reimer

    The benchmark code is flawed and is giving surprising results accordingly.

    But the idea is neat and hopefully the code will be updated.

    In one example, a global array variable is set with the name $aHash. The following functions, however, operate on a nonexistent global variable named $x instead. Needless to say the resulting numbers seem quite surprising if you are not aware of the error.
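    In other words, something like this (my reconstruction of the kind of bug described, not the site’s actual code):

```php
<?php
// The data is set up in $aHash, but the test body reads a
// different, nonexistent global variable instead.
$aHash = array('a' => 1, 'b' => 2, 'c' => 3);

function broken_test() {
    global $x;                       // wrong name: $x was never set
    $count = 0;
    foreach ((array) $x as $val) {   // casts null to an empty array
        $count++;
    }
    return $count;                   // 0 – the loop body never runs
}

echo broken_test(), "\n"; // prints 0: the "benchmark" times an empty loop
```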

    In some cases, results appear so far off that it leads me to wonder whether the site may be mishandling exponents by stripping off the “E” at the end, considerably multiplying the actual value – e.g. treating 5.004E-5 as 5.004.

    As the previous posters who also took the time to run some of these tests have noted, the site’s results do not square with reality.

  • metapundit

    I’m so happy to see that there is some sanity in the PHP community. @Lachlan and the following posts: you are exactly right! I’m past being sick of these kinds of benchmarks – I’m pissed that semi-authoritative places like SitePoint link to crap like this, and I really believe that such benchmarks are counterproductive to the PHP community as a whole.

    That’s a little inflammatory, so let me get a few things out of the way. I’m not knocking Chris Vincent – I’ve never heard of him and I’m willing to stipulate, sight unseen, that he is a better programmer than I am. The same goes for Akash Mehta – I’ve got no grudge against personalities and I’ll happily say he probably has better programming chops than I do… Lots of people do!

    That out of the way: stop mentioning this crap!

    Look – 99% of PHP programmers will never be in a situation where the CPU performance of their code is important. “Slow” applications are network-bound, and database-bound most of the time, and reducing the runtime of your PHP code does not matter! If your web app is running slowly, you can make it run faster by learning to generate fewer HTTP requests and make fewer data-storage calls (database or filesystem). Seriously – this is standard wisdom, and it cuts across web development platforms (the same advice applies to Ruby, Perl, Python, etc.). If you are in the 1% – say you’re writing a pure-PHP bignum library – performance comes from data structures and algorithms. Don’t make microbenchmark changes unless you’ve profiled and proven that they are necessary!

    Think about it – why would we care about these benchmarks? Because we want our code to run faster, right? Well, no – I only want my code to run faster if it’s not running fast enough right now! If it takes .005 seconds to generate a page and the average user takes 1–5 seconds to load it across the network, going through my code to gain even a dramatic 100% speedup (more on that in a second) won’t change the user experience at all!

    So you say – I’m in the 1%! I need to speed up my app! I have a server load in double digits, pages take 30 seconds to render due to CPU load… I’ve got to have faster-running code! Again – refer to the above. It is the accepted wisdom of web devs across languages that the fault is probably not down to micro-benchmarks (whether you used while loops or for loops). Check your database indexes and profile all the SQL calls you’re making; having the right indexes can change the big-O quality of your queries. And lastly, check out caching. Do you really have to calculate every piece of data every time? Memcache is lightning fast, and even going to a filesystem cache like PEAR’s Cache_Lite can dramatically affect performance. Which is better: making your 1,000 lines of code run 25% faster at the expense of less readable code, or not running the code at all 9 times out of 10 and simply serving out of the cache? (Hint: 1000% > 25%.)
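    The cache-first pattern described here can be sketched like so (a deliberately naive filesystem cache of my own devising, using PHP 5.3+ closure syntax – not Cache_Lite’s or Memcache’s real API):

```php
<?php
// Recompute only on a cache miss or after expiry; otherwise serve
// the stored result and skip the expensive work entirely.
function cached($key, $ttl, $compute) {
    $file = sys_get_temp_dir() . '/cache_' . md5($key);
    if (is_file($file) && time() - filemtime($file) < $ttl) {
        return unserialize(file_get_contents($file)); // hit: skip the work
    }
    $value = call_user_func($compute);                // miss: do the work
    file_put_contents($file, serialize($value));
    return $value;
}

// The first call computes; later calls within 5 minutes are served
// straight from the cache file.
$total = cached('expensive-report', 300, function () {
    return array_sum(range(1, 1000)); // stand-in for a slow query
});
echo $total, "\n"; // 500500
```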

    Ok – enough ranting about what you should do (i.e. actual best practices) – here’s why you don’t care about the micro-benchmarks unless you actually work on the PHP runtime itself. You visit the site and see: hey, I can speed up my code 31% by using double quotes to assign strings rather than single quotes! I should go back through all my old code and make replacements!

    No, you shouldn’t. Single quotes and double quotes mean different things in PHP. Single quotes mean no interpolation! If I see a line like

    $var = 'some text';

    it tells me that there won’t be special characters like \n, and that I won’t find variables interpolated into the string somewhere. It means something. And no, you won’t get a 31% speedup – how many lines in your 1000-line codebase are string assignments? 5%? Great, you just made 50 changes to get a 1.5% speedup. And of course you used search-and-replace and didn’t realise that one of those strings contained a $ sign, so now you’ve introduced a bug. Oh, and in the next point release of the PHP runtime the performance characteristics of the language change, and your performance gains are gone.
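    A minimal illustration of that semantic difference:

```php
<?php
$name = 'world';

$single = 'Hello $name\n'; // literal: no interpolation, no \n escape
$double = "Hello $name\n"; // interpolated: $name expands, \n is a newline

echo $single, "\n"; // prints the literal text: Hello $name\n
echo $double;       // prints: Hello world
```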

    Get the picture? Language features like foreach($var as $k=>$v) exist because they communicate your intent more precisely and concisely, and that is what you should care about as a programmer. A well-written, logically structured program with clear code? Easy to optimise. A ball of mud with liberal use of microbenchmark “best practices”? Good luck… Programming is writing for people first (you, if nobody else) and computers second. CPU time is cheap; programmer time is expensive.

    My rant is running out of steam. Let me address Chris and Akash. Chris first – your site is valuable to two groups of people: yourself (because writing code and completing projects is good for programmers) and the PHP language developers. The only people who should look at micro-benchmarks are the core devs, who should be making foreach as fast as while and single quotes as fast as double quotes. For everybody else, this stuff is bikeshedding at best and counterproductive for the inexperienced who don’t know better (think about it – beginners read this stuff and don’t even know whether they have an opcode cache or not). Seriously – put up a big disclaimer at the top of the page directing people looking to make their code run faster to the Wikipedia page on Big O notation, or at least direct them to genuinely useful optimization advice.

    And Akash – you write a fine article. But if you want to talk about optimization, do the PHP world a favor and write about algorithms, caching options, server (Apache, MySQL) tuning and so on. Teaching people how to run a profiler and actually find out where their code is slow would be useful. If, on the other hand, you want to write about how to code, please don’t add to the stream of advice that misses the forest for the trees; only one of the links you posted was that useful. People write “637 ways to speed up your code” style posts because they’re easy! Don’t do easy (and trivial, transient, and useless) – do hard. Do algorithm analysis. Talk about database performance. Look at YSlow’s list for optimizing the client side. My intent isn’t to be mean to anybody, and I know it’s fun to look at the numbers. Geeks like to argue over minutiae. So be it. But don’t pretend that it’s important or “best practice”, and don’t confuse newbies who might think they should pay attention to this stuff…

  • Dorsey

    I agree completely with metapundit: spend more time on design issues (algorithms, DB schema and tuning, and a usable human interface) and less on dopey little coding issues such as single vs. double quotes. In the long run (as has been amply demonstrated and documented over the past 30 years, and as any senior IT manager will tell you), the biggest cost of software is maintenance, so make your code readable and logical (as opposed to really, really clever) so that the person picking it up after you can make sense of it, to the point of developing the confidence to improve or correct your work.

    By the way, am I the only one appalled that SitePoint (which I really like and depend upon for accurate commentary) blundered so badly on this article? It seems as if nearly everyone who ran the benchmarks themselves found just the opposite of the author’s results, and a simple peer review might have avoided that embarrassment.

    My compliments and appreciation to metapundit for expressing all of those thoughts.


  • If anything, that site is advocating the worst practices…

    First off, the results are horribly wrong for most of the tests, not to mention that the most efficient solutions are, in many cases, not even present.

    The author (Chris) also does not seem to understand why the tests “vary”. Performance benchmarks of this type must always be run in a confined environment with no unnecessary processes running. Running them on a shared hosting account is a clear sign that he doesn’t know what he’s doing.

    I also strongly recommend that you read up on the differences between PHP 4 and PHP 5, Chris.

    To be honest, I had expected better from a SitePoint article; this place is starting to provide articles of similar quality to those you find in gossip magazines. It is articles like this one that are the reason there are 99 bad PHP developers for every good one.

    As others have mentioned above, premature optimization is the root of all evil. Don’t give away code readability for optimization you’re not even sure you need. In most cases, easily readable code is also the most efficient. Though every developer has their own “version” of what easily readable code is ;)

  • — This is written by “the” chris vincent… —

    Not to be intruding or anything… but if you actually read everything on the site, including the item down at the bottom, you’ll notice that I am indeed well aware of all that has been said on this page. Showing off these benchmarks was never the purpose of the site.

    I am no expert at what I do, but I put most of my time into algorithms, MySQL benchmarking and server optimisation, as any PHP expert should. Please don’t lecture people on the “right” way of doing things without understanding the situation; instead, please just put your ideas on the table and discuss this through like the responsible people you are.

    I’d continue, but I think that’s all that needs to be said.

  • d

    Regarding foreach: please keep in mind that foreach works on a copy of your array, while a while/each() loop works on a reference to it. This means that for us in the 1% who do need to optimize ALL of our code, every nanogram of performance counts. Using big iron doesn’t matter, because no matter how much you boost your systems, sales is coming up with brave new ways to bog it down. foreach was a great leap forward stylistically from pre-PHP 4 days, and is a well-loved workhorse of a tool, but readability is secondary because (USER === GOD)

    Need to do a massive sort when massaging data, without having ORDER BY put a temp table on disk, under relatively tight memory conditions with dozens of requests per second, where C/C++ isn’t a good fit because of platform interoperability? Profiling can find the bottlenecks, but it’s lists like this that’ll give you some options to try when foreach copies your array. The same idea applies to using YSlow, EXPLAINing queries, checking the slow query log, etc., etc. Best practices come from some reading and lots and lots of practice, or, to quote Niels Bohr: “An expert is a person who has made all the mistakes that can be made in a very narrow field.”
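    One more option worth noting for the modify-in-place case (my addition, not from the comment above): foreach can take each value by reference, which writes straight into the array without a per-iteration each() call.

```php
<?php
$data = array(1, 2, 3);

// foreach by reference (PHP 5+) modifies the array in place.
foreach ($data as &$val) {
    $val *= 10;          // writes through to $data directly
}
unset($val);             // always break the lingering reference

// $data now holds 10, 20, 30.
```

The trailing unset() matters: without it, $val stays bound to the last element and a later reuse of the variable can silently corrupt the array.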

  • Brian ….

    What a load of pony

    For example, foreach($aHash as $val) has script execution time of 558% compared to while(list($key) = each($aHash)) $tmp[] = $aHash[$key] with 100%.

    That is not what is on Vincent’s site … in fact it says exactly the opposite … this is two teenagers w4nking each other off … expert php programmers my expert a$$
