How Readable is Your PHP?

New today is a post on Smoking toooooo much PHP entitled Six deadly PHP sins, this week…. The article gives a quick list of tips for PHP developers on why they definitely should avoid in their code.

One of the entries is titled Every variable should start somewhere. In a previous blog post, I mentioned how frustrating it was to be trying to read some code and asking yourself, ‘Where does this variable come from?!’. Alan labels PHP language constructs such as extract and eval as ‘evil’ because they disguise and obfuscate code. That isn’t news. However he makes a valid point – that using these types of shortcuts is not a security problem in itself – the security problem occurs when your code is too hard to understand and you inadvertently introduce additional problems.

Take register_globals for example. The register_globals feature in itself is not a security problem, but the fact that it makes understanding and reviewing code so much harder can easily lead to security holes throughout your code. The PHP manual states that ‘register_globals will inject (poison) your scripts will all sorts of variables’ which means that ‘writing insecure code is that much easier’.

When developing in PHP, the following steps can be taken to make your code easier to understand.

Don’t make your code overly complicated. If you have a choice between doing something on one line or doing it on four lines, choose the one that’s easiest to read and understand over the one that takes up less space. Take a similar approach to variable naming. A variable named $j is a lot more confusing than a variable named $paying_customers.

Break code into small, separate functions (or methods). When a piece of code is self-contained inside a function, it benefits from the local variable scope of that function. If that function is only 20 to 30 lines long (including comments), it can fit on your screen all at once, hence making it very easy to see where each variable is coming from. If you want to know where something was initialised, you won’t have to go off and grep for it (or give up and trust it) if its declaration/initlialisation all occurs within one page of where it is used. Of course, using the global keyword negates this benefit.

Comment your code. A common question from PHP newbies (or newbies to any language) is ‘how many comments should I use?’. There’s no right answer to this, and experienced coders who have spent some time maintaining other people’s code will probably know better than most. I would recommend that for every method or function, you make sure it’s clear what type of arguments are expected, what the return value if any means, and anything else that needs to be known about the method in order to use it. The purpose and return value should be clear from the name of the function/method, but to remove any doubt it doesn’t hurt to comment it. If the method should be ‘private’, you should indicate that. If the method should be called statically, you should indicate that. In addition to this, you should insert an inline comment every time you have done something somewhat complicated. Commenting styles are hotly debated topic, just as are coding styles (spaces or tabs?). The one thing that is certain is that you should comment your code. Commenting your code makes it easier to understand how it works, and thus easier to review it later and to ensure it is secure.

Unfortunately a lot of hard-to-understand code is, understandably, a result of laziness or unrealistic deadlines. Keep in mind that regardless of the current specification of the project, it is likely to change in future and when that happens, somebody is going to go through your mess of code and try to change its direction and purpose. If the code is self-explanatory, it will be a lot easier to modify it and doing so will be less likely to introduce new security problems.

Making sure your code is easy to read and understand is the first, and in my opinion the most important, step towards writing secure code.

Free book: Jump Start HTML5 Basics

Grab a free copy of one our latest ebooks! Packed with hints and tips on HTML5's most powerful new features.

  • Etnu

    The biggest piece of irony here is that this site (Sitepoint.com) uses vbulletin — a product that I would argue is the single worst-designed (internally) piece of code out there. Now, don’t get me wrong, I think vbulletin does a whole lot of things *right* (their interfaces are great). But every time I have to hack apart their code to fix various bugs and the like, I want to cringe.

    Every single thing that vbulletin renders to the screen is done via eval(). Not only is it a major security risk (especially when combined with register_globals), but it’s impossible to take advantage of opcode caches.

  • chris ward

    without knowing too much about the structure of my code… the correct naming of variables is an invaluable way to keep track of shared code

    how i normally manage things is simple.

    php might not be strict to it’s datatypes, but take a variable called “aFirstNames”

    at first glance it’s easy to tell that this is an array called First Names :)

    iFirstNamesCount, sFirstName, sqlGetFirstNames

    data-type abbreviation is so underated in the php environment, and it’s good practice, easy to serialise and port to c#,perl or java if the need arises.

    in regards to commenting, obviously you can’t comment everything, but it only takes a couple of seconds to help your colleagues out.

    by rule of thumb, if im doing anything ‘clever’ i’ll normally write pseudo-code in the comments before i get stuck in… more as a guide to myself, it means i dont get stuck pondering on what my next move is!

    i hope that helps

    and here’s some sample code, of a function, part of a pagination class i’ve written


    function iLimitEnd()
    {
    if($this->iLastRemainder() == 0 && $this->iThisPage == $this->iTotalPages())
    return $this->iThisPage * $this->iResultsPerPage;

    if($this->iThisPage == $this->iTotalPages())
    return $this->iLimitStart() + $this->iLastRemainder();

    return $this->iLimitStart() + $this->iResultsPerPage;
    }

    immediately you’ll notice that this function returns an integer, iLimitEnd. easy!

  • http://www.deanclatworthy.com Dean C

    Etnu, do you actually know what you’re talking about? Every single variable in vBulletin is sanitized before output or input. It’s one of the most secure pieces of software available.

  • TheLunchBox

    Are you gonna site their and take that Etnu?! GEEK FIGHT!

  • http://www.maxhyatt.com MystaMax

    I thought that sounded a little off about vBulletin!

  • http://www.tbz.com Quadzoola

    Well it does use eval() everywhere, however it has been made very secure of the years. The part about it being horrendous code to work with is very true. I’ve worked on mods for vBulletin in the past and I must say it is the second worst code base I’ve every had to deal with on a readability level.

  • http://www.lastcraft.com/ lastcraft

    Hi…

    I’ve found that the more experienced developers actually write less comments, often none at all. They just break methods up into small chunks. Each method name becomes what the comment would have been.

    yours, Marcus

  • Etnu

    Etnu, do you actually know what you’re talking about? Every single variable in vBulletin is sanitized before output or input. It’s one of the most secure pieces of software available.

    I’ve worked with vBulletin since it first came out, and continue to recommend it as the best forum software out there today to all of my clients / employers.

    However –

    I’ve installed it on 8 major websites and dozens of smaller ones. I’ve integrated it with everything under the sun, and constantly had to deal with its shortcomings. If you think it’s one of the most secure pieces of software available, you’ve never used anything but the latest version (which has no known security holes that I’m aware of), and likely never dealt with any other piece of code in your life. Even versions as recent as 3.0.5 had major flaws (question: why should turning on COMMENTS IN OUTPUT be exposing me to a security risk?)

    vBulletin is awful, awful software. If it weren’t for the fact that ALL forum software out there is pretty much crap (the forums with good code tends to be feature-poor), it’d never gotten where it is today. Well, the fact that phpBB and IPB suck worse probably helps a bit, too.

    Lets not even get started on the developers practice of encouraging “hacks” to accomodate their poor design. Things are improved quite a bit in 3.5, but it’s still sorely lacking. That’s what you get when you have a company more concerned with making sure that every installation is legit than with making sure their software won’t expose the users to potentially major security risks.

  • http://www.napathon.net/ vinyl-junkie

    Unfortunately a lot of hard-to-understand code is, understandably, a result of laziness or unrealistic deadlines.

    You can also add incompetence and incomplete requirements to your list. In my opinion, unrealistic deadlines and incomplete requirements are the biggest reasons for poorly written code.

    As a developer, I am often asked to start writing code before the requirements are fully defined. What started out as well-written code rapidly degenerates when requirements are constantly being changed in the middle of coding. That has occasionally resulted in other developers questioning my competence when they look at my code. They have no idea what caused my code to turn out like it did because they weren’t involved in the initial project to see these problems.

    Conventional wisdom states that if one adheres to established project management methodology (analysis, then design, then construction), problems such as I have stated can be avoided. However, unrealistic deadlines and incomplete requirements definition force us to shortcut the development process, which often results in poorly written code.

  • Sky

    Just to say that i liked reading that article and all your comments are very interesting.
    Thx Thomas.

  • http://mgaps.highsidecafe.com BDKR

    Well, all I can say is “preach it brother”! Being in the position of having to manage and extend a code base that is the stuff of nightmares, I feel this very topic should be one of the most loudly ballyhooed on the net today.

    By far, the most difficult challenge with my present employment is in extending portions of existing functionality that was never clearly abstracted or defined in the first place. So as a result of this hastily jumbled together moras of spaghetti, the vast majority of my time is dedicated to merely “grokking” what I see before me. And we all know you should only attempt to extend what you understand.

  • gmilner

    Of course, the best way to make your PHP code seem immediately more readable is to learn Perl. It’s subjective, but it really works.

  • http://www.daholygoat.com daholygoat

    In some software engineering methodologies (Agile Software Development for example), instructors encourage you to reduce comments, and make entity names such as function, class and variable names obvious and precise. In the book “Refactoring – Improving existing designs”, Martin Fowler explains his view on comments as being a “smell” (read about Refactoring smells) that implies a code structure or naming that is not concise. He encourages to only comment algorithms that are so complex that they can only be made obvious by commenting them. So in short his lesson is: try to improve your code before you comment. It might be better to stress on well structured, obvious code than on commenting, especially (no offense) in the PHP world.

  • http://navin.biz Navin

    I would recommend to all beginning php-programmers to make familiar with PEAR php coding standards (pear.php.net). There you will find far more tips on how a good code must be organized and developed then in this article.

  • Fred

    and make entity names such as function, class and variable names obvious and precise

    This can be debatable, one may find a piece of code to be clean and clear whilst the others might not. People tend to have different logic and solution on a same problem.

    So I reckon appropriate comments is necessary.

  • Pingback: Canglan’s Blog 沧蓝品味 » 代码标准 (Coding Standards)

  • Fred

    You could make some functions to make your code more readable. Could also make some classes to make your code more structured, and use low coupling and high cohesion program design. I recommend to use low coupling because it can make a good structured program design, good maintainability, information hiding and encapsulation.