PHP Gotchas: Part 1


PHP is a remarkably easy language to get started with but from there, if my own experience is anything to go by, developers seem to experience a “rollercoaster ride” in terms of productivity. Some people refer to PHP as the “Visual Basic of Open Source”, which is both a complaint and a compliment. A quote attributed to Bjarne Stroustrup (designer of C++): “There are only two kinds of programming languages: those people always ***** about and those nobody uses.”

Over the next few weeks (perhaps months) I will be attempting to highlight PHP “gotchas”: things that lead to developer slow-down and *****ing when working with PHP. In other words, the types of problem which aren’t obvious up front and only become clear once you’ve “been there”. Some will be purely technical issues (PHP configuration, legacy headaches etc.) while others will be more theoretical (what “works” and what doesn’t in terms of code design).

The purpose is to signpost “gotchas” to developers getting started with PHP and, hopefully, prevent frustration before it happens. It will be based primarily on my own experiences, after almost five years of PHP, as well as things I’ve seen on Sitepoint’s PHP forums. Further input / insight is much appreciated, as are requests for subjects.

PHP Environment and Portability Gotchas

Kicking off, these are some of the common php.ini related gotchas. When talking about “portability” here, I’m referring to running code under different PHP installations, as opposed to operating system portability or backwards compatibility with older PHP versions, both of which need examining separately.

Some of these are already covered here, so excuse me re-iterating; I think it’s worth attempting to put together a complete list as I see some of these problems over and over again when looking at Open Source PHP projects.

The basic misconception seems to be the assumption that all PHP installations are equal; code that runs under one should run fine under all. While that’s largely true, some key PHP configuration settings and legacy issues conspire to make headaches. It is possible to write code that runs fine under any PHP installation (assuming comparable PHP versions) but care is needed.

Controlling Runtime Configuration

First up, you need to know how to change PHP’s runtime configuration (runtime as opposed to compile time configuration when PHP is built and installed).

There are, essentially, four basic mechanisms to control PHP’s runtime configuration: the php.ini file, Apache’s httpd.conf file (or similar, such as the Windows registry), Apache .htaccess files, or the scripts themselves, using functions like ini_set(). It’s worth reading the manual on Runtime Configuration as well as browsing the core directives and the more or less complete reference found under ini_set(). Further notes can be found commented in the php.ini file itself.

The key point to note here is that on a shared web server (your typical PHP host) users will only be able to change settings via the scripts themselves and possibly using .htaccess files (few hosts will let users change php.ini or httpd.conf). Changing settings with a .htaccess file requires Apache to be configured to give users the “AllowOverride Options” or “AllowOverride All” privileges (normally placed in httpd.conf within <Directory> sections). This is fairly common but cannot be 100% relied upon.

The mechanism by which a runtime configuration setting can be changed depends on the setting itself. Looking at the list found under ini_set(), you’ll notice values in the “Changeable” column like PHP_INI_PERDIR and PHP_INI_SYSTEM. These are actually constants defined as follows;

- PHP_INI_USER: the configuration option can be changed inside a PHP script (in fact you’ll never see this listed – it falls under PHP_INI_ALL below).

- PHP_INI_PERDIR: the setting can be changed in php.ini, httpd.conf or a .htaccess file.

- PHP_INI_SYSTEM: the setting can only be changed in php.ini or httpd.conf.

- PHP_INI_ALL: the setting can be changed by all available mechanisms, including a user’s script.

In other words, for portability, avoid writing code that relies on PHP_INI_SYSTEM and be aware that PHP_INI_PERDIR may be a problem for some users.
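As a rough illustration of the list above, a script can look up how a given setting may be changed by reading the 'access' entry that ini_get_all() reports for it. This is only a sketch: the function name canChangeInScript() is made up for this example, and it assumes the INI_USER constant and the 'access' bitmask behave as the manual describes for ini_get_all().

```php
<?php
// Hypothetical helper: can this setting be changed with ini_set()?
// Returns true or false, or null if PHP doesn't know the setting.
function canChangeInScript($name)
{
    $all = ini_get_all(); // details (including 'access') are on by default
    if (!isset($all[$name])) {
        return null;
    }
    // 'access' is a bitmask; INI_USER marks "changeable in user scripts"
    return ($all[$name]['access'] & INI_USER) === INI_USER;
}

var_dump(canChangeInScript('include_path'));   // listed as PHP_INI_ALL
var_dump(canChangeInScript('short_open_tag')); // listed as PHP_INI_PERDIR
```

Checking this up front can be a friendlier way to fail than calling ini_set() on a setting the host doesn't let you touch.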

Apache Directives

There are two Apache directives, which can be used in httpd.conf and .htaccess files, available for changing configuration settings, namely php_value for settings which have string values and php_flag for settings which have boolean (0 or 1 in fact) values. An example .htaccess file containing one of each;


# Switch off register_globals
php_flag "register_globals" 0

# Set the include_path - Unix! See below...
php_value "include_path" ".:/usr/local/lib/php"

Place this in some directory on your server, along with a PHP script containing;

<?php
phpinfo();
?>

You should see that the local values for these settings have been changed (the global values are those set in php.ini or httpd.conf).
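If phpinfo() is more output than you want, something like the following sketch prints just the local and global values of the settings you care about. The helper name localAndGlobal() is my own invention, and a setting that the running PHP version doesn't have (register_globals on newer installations, for example) is simply reported as unavailable.

```php
<?php
// Hypothetical helper: return the local and global value of one
// setting, or null if PHP doesn't know the setting.
function localAndGlobal($name)
{
    $all = ini_get_all(); // details are included by default
    if (!isset($all[$name])) {
        return null;
    }
    return array(
        'local'  => $all[$name]['local_value'],
        'global' => $all[$name]['global_value'],
    );
}

foreach (array('register_globals', 'include_path') as $name) {
    $values = localAndGlobal($name);
    if ($values === null) {
        echo $name . ": not available in this PHP version\n";
    } else {
        echo $name . ': local=' . $values['local']
            . ', global=' . $values['global'] . "\n";
    }
}
```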

Note for sysadmins – there are also two more directives, php_admin_value and php_admin_flag described here.

Script Configuration

To change configuration settings within a PHP script, the main functions are ini_set() to change a configuration value, ini_get() to get the current local value of a configuration setting, get_cfg_var() to get the global value from php.ini, ini_get_all() for a giant array of all settings, containing both local and global values, and ini_restore() to revert a local option to its global value (overriding .htaccess files as well). Other functions, such as set_include_path(), act as aliases for a specific configuration option, but pay close attention to the PHP version information in the manual when using these.

An example to append a value to the include path, from within a PHP script;


// Include path separator is different on Windows
if (strtoupper(substr(PHP_OS, 0, 3)) == 'WIN') {
    $separator = ';';
} else {
    $separator = ':';
}

// Some extra include path
$extraIncPath = '/home/harryf/lib';

// The current local include path
$currentIncPath = ini_get('include_path');

// Append extra path to the local include path
ini_set('include_path', $currentIncPath . $separator . $extraIncPath);
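To see how ini_set(), ini_get() and ini_restore() fit together, here is a small sketch that changes include_path, reads the new local value back and then reverts it. The directory is the same placeholder used above, the variable names are mine, and it leans on the PATH_SEPARATOR constant, so it assumes a PHP version that provides it.

```php
<?php
// Remember what we started with
$original = ini_get('include_path');

// Hypothetical extra directory, purely for illustration
$changed = $original . PATH_SEPARATOR . '/home/harryf/lib';

// ini_set() hands back the previous value on success
$previous = ini_set('include_path', $changed);

// The local value now reflects the change
$afterSet = ini_get('include_path');

// Revert the local value to the configured one
ini_restore('include_path');
$afterRestore = ini_get('include_path');

echo "before:  $original\n";
echo "changed: $afterSet\n";
echo "restored: $afterRestore\n";
```

Keeping the value returned by ini_set() is handy if you only want to change a setting temporarily inside a library function.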

Development Settings

The settings described below are things you should set in php.ini itself, in your development environment. Some are a pain, in that code modification may be required if you've written stuff without them, but the fixes are worth the effort if your code will be used by other people.

Error Reporting: E_ALL

When developing, switch error_reporting to E_ALL. In particular this catches E_NOTICE type errors, which you can normally get away with but which, with this setting, may display error messages to users and may break code when it comes to sending HTTP headers. It will mean code updates e.g. where you used to write;


if ( $_GET['doSomething'] == 'yes' ) {
    // do something
} else {
    // do the default
}

You'll need to write;


if ( isset($_GET['doSomething']) && $_GET['doSomething'] == 'yes' ) {
    // do something
} else {
    // do the default
}

This prevents error notices when $_GET['doSomething'] isn't set. Note that using the @ operator to suppress error messages is generally slower than using the isset() construct.
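If you find yourself writing that isset() check over and over, one option is to wrap it in a small function. What follows is only a sketch under my own naming (requestValue() is not a PHP built-in); it returns a default when the key is missing, so the calling code stays notice-free under E_ALL.

```php
<?php
// Hypothetical helper: fetch $key from $source without triggering
// a notice, falling back to $default when the key isn't set.
function requestValue($source, $key, $default = null)
{
    return isset($source[$key]) ? $source[$key] : $default;
}

// Typical use with the example above
if (requestValue($_GET, 'doSomething') == 'yes') {
    // do something
} else {
    // do the default
}
```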

PHP Tags

Switch short_open_tag off and avoid asp_tags. Code modification may be required: code opened with the short tag <? needs to be opened with the full <?php tag instead.

The problem with short_open_tag is that the PHP interpreter will be confused by XML declarations (plus anyone with it switched off will see the short tags as HTML) e.g.;

<?xml version="1.0"?>
<?php
$xml_body = file_get_contents('nodeclaration.xml');
?>

PHP will trip on the XML declaration, thinking it's PHP. The short_open_tag setting is, sadly, PHP_INI_PERDIR so there's no way to modify it inside a script (which would be nice to have, IMO, but is no doubt tricky to implement).
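One workaround, if you can't be sure how short_open_tag is set on the target server, is to have PHP print the XML declaration itself rather than leaving it in the template as literal text. A minimal sketch follows; the function name xmlDeclaration() and its default arguments are my own choices for the example.

```php
<?php
// Hypothetical helper: build the XML declaration as a plain string.
// Inside a quoted string the "<?" and "?" ">" sequences are just text,
// so the short_open_tag setting no longer matters.
function xmlDeclaration($version = '1.0', $encoding = 'UTF-8')
{
    return '<?xml version="' . $version . '" encoding="' . $encoding . '"?>' . "\n";
}

echo xmlDeclaration();
```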

Register Globals: Off

Hopefully you've realised that having register_globals switched on is generally bad news for security, as explained here. I will cover security "gotchas" another time.

From the point of view of portability, code written with register_globals switched off should run with register_globals switched on (but may not be secure!) - the same probably won't work in reverse.

Cutting a long story short, switch off register_globals!

Call Time Reference Passing: Off

References in PHP4 are a tricky subject that you'll find more on here and probably need their own "gotchas" discussion.

For portability, switch off allow_call_time_pass_reference. This refers to code like;


<?php
function someFunc($msg) {
    // Do something
}

$msg = 'Hello World!';

// Call time reference passing!
someFunc(& $msg);
?>

Switching allow_call_time_pass_reference off will result in PHP warning errors being generated if you attempt to use it. Once you understand how references work, there's no need to do this anyway and it can make code extremely hard to follow.
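The portable alternative is to declare the reference in the function signature, so the caller passes the variable normally. A short sketch, with a made-up function name; note that later PHP versions (5.4 onwards) reject the call-time form outright, so this is also the version that keeps working.

```php
<?php
// The & lives in the declaration, not at the call site
function addExclamation(&$msg)
{
    $msg .= '!';
}

$msg = 'Hello World';

// Plain call: $msg is still passed by reference
addExclamation($msg);

echo $msg . "\n";
```

Anyone reading the function declaration can now see that the argument may be modified, which is much easier to follow than hunting for & at every call.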

Magic Quotes

Magic quotes are a tricky subject. They do a lot to prevent beginners shooting themselves in the foot but can cause big headaches later. There's more in depth discussion here and here - be aware that there are important security concerns regarding magic quotes.

From a portability perspective, it's best to write code that doesn't rely on magic_quotes_gpc being switched on (e.g. use mysql_escape_string()) but can function correctly irrespective of whether magic_quotes_gpc is on or off. A quick way to do this is to execute something like the following, before the rest of your code;


// Is magic quotes on?
if (get_magic_quotes_gpc()) {

// Yes? Strip the added slashes

$_GET = array_map('stripslashes', $_GET);
$_POST = array_map('stripslashes', $_POST);
$_COOKIE = array_map('stripslashes', $_COOKIE);

}
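One limitation of the snippet above is that stripslashes() is not recursive, so nested arrays (a group of checkboxes posted as an array, say) are not handled. The sketch below is one way around that; stripslashesDeep() is a name I've picked for the example, and the function_exists() guard is my own assumption to cope with newer PHP versions where get_magic_quotes_gpc() is no longer available.

```php
<?php
// Hypothetical helper: strip slashes from a string, or from every
// string inside a (possibly nested) array.
function stripslashesDeep($value)
{
    if (is_array($value)) {
        return array_map('stripslashesDeep', $value);
    }
    return stripslashes($value);
}

// Only undo magic quotes where the feature exists and is switched on
if (function_exists('get_magic_quotes_gpc') && get_magic_quotes_gpc()) {
    $_GET     = stripslashesDeep($_GET);
    $_POST    = stripslashesDeep($_POST);
    $_COOKIE  = stripslashesDeep($_COOKIE);
    $_REQUEST = stripslashesDeep($_REQUEST);
}
```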

Include Path Separator

Although I said I wasn't going to talk about operating system related issues, as I've mentioned it above it's worth being aware that the include_path separator is different on Unix and Windows. If you're setting it within a PHP script, the trick you've already seen above can help;


if (strtoupper(substr(PHP_OS, 0, 3)) == 'WIN') {
    $separator = ';';
} else {
    $separator = ':';
}

[update]
PHP 4.3.4 provides the predefined constant PATH_SEPARATOR, which contains the separator character needed for include paths on the current platform.

Thanks to Josheli for the tip
[/update]
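With PATH_SEPARATOR available, the OS check can be dropped altogether. Here's a sketch of a small helper built on get_include_path() and set_include_path(); the name appendIncludePath() is hypothetical, and it assumes a PHP version that has both functions and the constant.

```php
<?php
// Hypothetical helper: add $dir to the include path (once only)
// and return the resulting include path.
function appendIncludePath($dir)
{
    $paths = explode(PATH_SEPARATOR, get_include_path());
    if (!in_array($dir, $paths, true)) {
        $paths[] = $dir;
    }
    set_include_path(implode(PATH_SEPARATOR, $paths));
    return get_include_path();
}

echo appendIncludePath('/home/harryf/lib') . "\n";
```

Because the helper checks for the directory first, calling it from several files won't keep growing the include path.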

Safe Mode

Errr - no thanks. Personally I don't write code for users running with safe mode on. If anyone wants to fill this blank, please do.

SAPI Issues

PHP has a number of Server APIs, perhaps the two most popular being the Apache API and the CGI API. The new CLI API adds further issues. The PHP function php_sapi_name() can be useful.
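As a quick illustration of php_sapi_name(), the following sketch branches on whether the script is running from the command line or behind a web server. The helper name isCommandLine() is mine, and comparing against the string 'cli' assumes the CLI binary reports itself under that name.

```php
<?php
// Hypothetical helper: true when running under the CLI SAPI
function isCommandLine()
{
    return php_sapi_name() === 'cli';
}

if (isCommandLine()) {
    // No web server involved, so plain text output
    $lineEnd = "\n";
} else {
    // Assume the output ends up in a browser
    $lineEnd = "<br />\n";
}

echo 'Running under the ' . php_sapi_name() . ' SAPI' . $lineEnd;
```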

There's some discussion of the Apache vs. CGI APIs here, in particular related to the $_SERVER['PATH_TRANSLATED'] variable. Notes on compatibility between the CLI and CGI binaries, when running command line scripts, can be found on the latter half of this page.

Enough already for now. Feel free to add / correct - will update this blog with things I've missed.


  • Raqueeb Hassan

    Great article, keep up the good work!

  • Dangermouse

    Another remarkable blog/article Harry :)

  • Firestorm2003

    Should be a whole article!

  • Jazeps

    I haven’t written any other language but PHP, maybe that’s the reason why I find no sense in the natural behaviour of PHP interpreter which takes this code:

    <?php
    // echo '?>';

    /*
    echo '<?php /* echo "hello world"; */ ?>';
    */

    // echo '<?php echo "hello world"; ?>';

    echo '12';
    ?>

    And does not return plain '12'. Instead, it returns all this:

    '; /* echo ''; */ // echo 'hello world'; echo '12'; ?>

    The manual clearly states why this is so, but I find no sense in this. Haven’t seen any PHP tutorial which stresses this.

  • Jazeps

    One more thing I came across recently: builtin function number_format. Its declaration states:
    string number_format ( float number, int decimals, string dec_point, string thousands_sep)
    The manual entry also states that if thousands separator is a string of multiple characters, only the first is used. What if you’d like to use &nbsp; as thousands separator???

  • josheli

    as of at least version 4.3.4 (or so) you can use the predefined constant PATH_SEPARATOR for the “Include Path Separator”.

    $ php -r "echo PATH_SEPARATOR;"
    :

    and DIRECTORY_SEPARATOR as of version 4.0.6:

    $ php -r 'echo DIRECTORY_SEPARATOR;'
    /
    /

  • http://www.phppatterns.com HarryF

    Thanks Jazeps – will see if I can incorporate some answers to those into another gotchas blog. Definitely room for one on PHP syntax “gotchas”. Specific function “gotchas” may be more than I can manage but will see what I can do.

  • http://www.phppatterns.com HarryF

    And thanks Josheli – will make an update re that.

  • Chris Shiflett

    Very helpful and informative, Harry. Keep up the good work.

    As for safe_mode, since it doesn’t really solve the specific security problem it attempts to address, I think you gave it plenty of attention. :-)

  • http://www.procata.com/ Selkirk

    Great topic, Harry.

  • plugged

    Jazeps: That makes perfect sense to me.
    Perhaps the following looks clearer to you:

    <?php // comment ?>'; /* echo '<?php /* comment */ ?>'; */ // echo '<?php echo "hello world"; ?>'; echo '12'; ?>

    = (nothing)'; /* echo '(nothing)'; */ // echo 'hello world'; echo '12'; ?>

    = '; /* echo ''; */ // echo 'hello world'; echo '12'; ?>

  • http://www.phpmystery.com phpMystery

    Thanks for valuable tips about configuration issues. Every time I read your blog, there are topics that I wasn’t aware of before. I wanted to buy your books, unfortunately there is no shipping to my country. Is there any way to get these books, such as an ebook version?
    Sorry for my english if there is any mistake.

  • Sander

    @Jazeps

    Each time a php-session is ended by a “?>”, the last command is also ended. You can understand the output of your command by replacing all “?>” by “;?>”, then it makes perfect sense.

  • tqbsl(rot1)

    @Jazeps

    Perhaps if you broke up your sample code with whitespaces it becomes more obvious.

    <?php // comment ?>'; /* echo '<?php /* comment */ ?>'; */ // echo '<?php echo "hello world"; ?>'; echo '12'; ?>

    Each php comment section (IIRC) is “terminated” by the end of line explicitly but also apparently by the end of code block delimiter. So the first code block’s comment is terminated at the close of that code block.

  • Richard Cyganiak

    Great post. It’s worth noting that setting ini values in .htaccess files requires the Apache server API. Low-cost shared hosts often use the CGI server API instead.

  • Niakie

    Great post Harry! Ever thought of writing a third volume to the Anthology series specifically on ‘gotchas’? I know I would be one happy customer!

  • http://www.limb-project.com dbrain

    Jazeps, you can use the following trick to make &nbsp; the thousands separator:
    <?
    $num = number_format(123456.89, 2, ',' , ' ');
    echo str_replace(' ', '&nbsp;', $num)
    ?>

  • Derek

    I would stick with:
    $num = number_format(123456.89, 2, ',' , '&');
    …then, when you’re ready to publish the page you can HTMLencode all the relevant strings…

  • David Duret

    One can use the following piece of code to emulate the PATH_SEPARATOR for PHP < 4.3.4:

    if ( !defined('PATH_SEPARATOR') ) {
    define('PATH_SEPARATOR', ( substr(PHP_OS, 0, 3) == 'WIN' ) ? ';' : ':');
    }

  • Ren

    Jazeps,

    $num = htmlentities(number_format(123456.89, 2, ',' , chr(0xA0)));

  • Ren

    Another problem (imo) is

    arg_separator.output config value.

    Some people are seemingly using this to put pre-encoded values in (namely &amp;) rather than properly encoding all the output.

    See http://pear.php.net/bugs/bug.php?id=704

    This leaves those of us that encode our URLs with the possibility of double encoding and ending up outputting &amp;amp;.

  • http://alan.caint.com rekcah

    I just put the Magic Quotes code above into a project I’m working on. I use the $_REQUEST super global a lot and I noticed that the slashes don’t get stripped, so maybe $_REQUEST = array_map('stripslashes', $_REQUEST); should be added to the above code?

  • drtebi

    The code snippet to strip slashes only works as long as the variables are not arrays. Otherwise PHP will complain:

    Notice: Array to string conversion in /example/test.php on line 125

    And slashes are not stripped, but rather unexpected things may happen.

    There is a better function that does strip slashes recursively posted on PHP’s manual pages:
    http://us3.php.net/stripslashes

  • bhutz

    When writing your include_path it’s worth noting that spaces may cause a problem…well they did for me on Windows anyway.
    This didn’t work
    include_path = ".;C:\php-5.2.0\extras\fpdf-ext; C:\php-5.2.0\extras\fpdf;C:\php-5.2.0\extras\smarty\libs;"
    But this did…notice the space after ‘fpdf-ext;’ was removed
    include_path = ".;C:\php-5.2.0\extras\fpdf-ext;C:\php-5.2.0\extras\fpdf;C:\php-5.2.0\extras\smarty\libs;"

  • marshn

    regarding the magic_quotes script mentioned in this article. It does not take into account the fact that stripslashes() is not recursive, and that when using checkboxes in a form $_POST or $_REQUEST can be multidimensional arrays.
    From the manual “Note: stripslashes() is not recursive. If you want to apply this function to a multi-dimensional array, you need to use a recursive function.”
    The manual page has a simple example of how to fix this issue:
    http://us3.php.net/manual/en/function.stripslashes.php

  • marshn

    jeez.. I’m about a year too late.. should have read the comments first before blurting ;)

  • Anonymous