Spooky Scary PHP

Break out the candy corn and apple cider; it’s that time of year again! The rest of the world may not go as hog wild for Halloween as America does, but I thought it’d be fun to share some scary PHP stuff to mark the holiday. This is a fun article, sharing with you some scary (but logical) behavior found in PHP itself, and spooky (and possibly quite illogical) ways in which some have twisted PHP to do their bidding. Think of this as my treat to you, a little bit of geek “mind-candy” – because why should the trick-or-treaters have all the goodies?

The Haunted Array

Once upon a time in a development shop not so far away, Arthur was up late at night hacking out some code. Little did he know the array he was about to use was haunted! He felt a chill run down his spine as he typed out his statements, keystroke by keystroke, but foolishly brushed off his subtle premonition.

<?php
$spell = array("double", "toil", "trouble", "cauldron", "bubble");
foreach ($spell as &$word) {
    $word = ucfirst($word);
}
foreach ($spell as $word) {
    echo $word . "\n";
}

Alright, so the array wasn’t really haunted, but the output certainly was unexpected:

Double
Toil
Trouble
Cauldron
Cauldron

The reason for this spooky behavior lies in how PHP persists the reference outside of the first foreach loop. $word was still a reference pointing to the last element of the array when the second loop began. The first iteration of the second loop assigned “Double” to $word, which overwrote the last element. The second iteration assigned “Toil” to $word, overwriting the last element again. By the time the loop read the value of the last element, it had already been trampled several times and held “Cauldron”, the value copied into it on the previous iteration.

For an in-depth explanation of this behavior, I recommend reading Johannes Schlüter’s blog post on the topic, References and foreach. You can also run this slightly modified version and examine its output for better insight into what PHP is doing:

<?php
$spell = array("double", "toil", "trouble", "cauldron", "bubble");
foreach ($spell as &$word) {
    $word = ucfirst($word);
}
var_dump($spell);
foreach ($spell as $word) {
    echo join(" ", $spell) . "\n";
}

Arthur learned a very important lesson that night and fixed his code using the array’s keys to assign the string back.

<?php
foreach ($spell as $key => $word) {
    $spell[$key] = ucfirst($word);
}
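A lighter-touch alternative, sketched here on the assumption that the by-reference loop has to stay, is to break the reference with unset() as soon as the first loop finishes. After that, $word is an ordinary variable again and the second loop can no longer write through it to the last element.

```php
<?php
$spell = array("double", "toil", "trouble", "cauldron", "bubble");

foreach ($spell as &$word) {
    $word = ucfirst($word);
}
// Destroy the lingering reference to the last element of $spell.
unset($word);

foreach ($spell as $word) {
    echo $word . "\n";
}
```

With the reference gone, the last line of output is Bubble rather than a second Cauldron.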

The Phantom Database Connection

More and more, PHP is asked to do more than just generate web pages. The number of shell scripts written in PHP is on the rise, and the duties such scripts perform are increasingly complex, as developers see merit in consolidating development languages. Often the performance of these scripts is acceptable and the trade-off for convenience can be justified.

And so Susan was writing a parallel-processing task which resembled the following code:

#! /usr/bin/env php
<?php
$pids = array();
foreach (range(0, 4) as $i) {
    $pid = pcntl_fork();
    if ($pid > 0) {
        echo "Fork child $pid.\n";
        // record PIDs in reverse lookup array
        $pids[$pid] = true;
    }
    else if ($pid == 0) {
        echo "Child " . posix_getpid() . " working...\n";
        sleep(5);
        exit;
    }
}
// wait for children to finish
while (count($pids)) {
    $pid = pcntl_wait($status);
    echo "Child $pid finished.\n";
    unset($pids[$pid]);
}
echo "Tasks complete.\n";

Her code forked child processes to do some long-running work in parallel while the parent process continued on to monitor the children, reporting back when all of them had terminated.

Fork child 1634. 
Fork child 1635. 
Fork child 1636. 
Child 1634 working... 
Fork child 1637. 
Child 1635 working... 
Child 1636 working... 
Fork child 1638. 
Child 1637 working... 
Child 1638 working... 
Child 1637 finished. 
Child 1636 finished. 
Child 1638 finished. 
Child 1635 finished. 
Child 1634 finished. 
Tasks complete.

Instead of outputting status messages to stdout though, Susan’s supervisor asked her to log the times when processing started and when all of the children finished. Susan extended her code using the Singleton-ized PDO database connection mechanism that was already part of the company’s codebase.

#! /usr/bin/env php
<?php
$db = Db::connection(); 
$db->query("UPDATE timings SET tstamp=NOW() WHERE name='start time'"); 

$pids = array();
foreach (range(0, 4) as $i) {
    ...
}
while (count($pids)) {
    ...
}

$db->query("UPDATE timings SET tstamp=NOW() WHERE name='stop time'"); 

class Db 
{ 
    protected static $db; 

    public static function connection() { 
        if (!isset(self::$db)) { 
            self::$db = new PDO("mysql:host=localhost;dbname=test",
                "dbuser", "dbpass"); 
            self::$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION); 
        } 
        return self::$db; 
    } 
}

Susan expected to see the rows in the timings table updated; the “start time” row should have listed the timestamp when the whole process was kicked off, and the “stop time” should have when everything finished up. Unfortunately, execution threw an exception and the database didn’t mirror her expectations.

PHP Fatal error:  Uncaught exception 'PDOException' with message 'SQLSTATE[HY000]: General error: 2006 MySQL server has gone away' in /home/susanbrown/test.php:21 
Stack trace: 
#0 /home/susanbrown/test.php(21): PDO->query('UPDATE timers S...') 
#1 {main}

+------------+---------------------+ 
| name       | tstamp              | 
+------------+---------------------+ 
| start time | 2012-10-13 01:11:37 | 
| stop time  | 0000-00-00 00:00:00 | 
+------------+---------------------+ 

Like Arthur’s array, had Susan’s database become haunted? Well, see if you can piece together the puzzle if I give you the following clues:

  1. When a process forks, the parent process is copied as the child. These duplicate processes then continue executing from that point onward side by side.
  2. Static members are shared between all instances of a class.
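The second clue is easy to see in isolation. The following is only an illustrative sketch; the Ghost class and its members are made-up names that have nothing to do with Susan’s code. Two separate instances increment the same static counter, so the second instance sees the change made through the first.

```php
<?php
// Hypothetical class used only to illustrate shared static state.
class Ghost
{
    protected static $sightings = 0;

    public function spot() {
        // self::$sightings belongs to the class, not to any one instance.
        return ++self::$sightings;
    }
}

$a = new Ghost();
$b = new Ghost();

$first = $a->spot();   // counter becomes 1
$second = $b->spot();  // same counter, now 2
echo $second . "\n";
```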

The PDO connection was wrapped as a Singleton, so any references to it across the application all pointed to the same resource in memory. Db::connection() first returned the object reference, the parent forked, and each child received a copy of that object which still used the same underlying connection to MySQL. The children continued processing while the parent waited, the child processes terminated and PHP cleaned up their resources, and then the parent tried to use the database object again. The connection to MySQL had been closed by a child process, so the final call failed.

Naively trying to obtain the connection again before the end logging query wouldn’t help Susan as the same defunct PDO instance would be returned because it’s a Singleton.

I recommend avoiding Singletons – they’re really nothing more than fancy OOP’ized global variables which can make debugging difficult. And even though the connection would still be closed by a child process in our example, at least Db::connection() would return a fresh connection when invoked before the second query if it weren’t a Singleton.
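As a rough sketch of what that could look like, here is a non-Singleton variant. The class name DbFactory is an assumption made for illustration and is not part of Susan’s codebase; the DSN and credentials are the same placeholders used above. Because nothing is cached in a static member, every call opens a new connection.

```php
<?php
// Hypothetical non-Singleton variant: no static cache, so each call
// to connection() builds and returns a brand-new PDO instance.
class DbFactory
{
    public static function connection() {
        $db = new PDO("mysql:host=localhost;dbname=test",
            "dbuser", "dbpass");
        $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
        return $db;
    }
}
```

Calling DbFactory::connection() again just before the final logging query would then hand back a live connection instead of the defunct one, at the cost of opening a new connection on every call.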

But still better would be to understand how the execution environment is cloned when forking and how various resources can be affected across all the processes. In this case, it’s wiser to connect to the database in the parent process after the children have been forked, and the children would connect themselves if they needed to. The connection shouldn’t be shared.

#! /usr/bin/env php
<?php
$pids = array();
foreach (range(0, 4) as $i) {
    ...
}

$db = Db::connection(); 
$db->query("UPDATE timings SET tstamp=NOW() WHERE name='start time'"); 

while (count($pids)) {
    ...
}

$db->query("UPDATE timings SET tstamp=NOW() WHERE name='stop time'"); 

An API Worthy of Dr. Frankenstein

Mary Shelley’s Frankenstein is a story of a scientist who creates life, but is so repulsed by its ugliness that he abandons it. After a bit of gratuitous death and destruction, Dr. Frankenstein pursues his creation literally to the ends of the earth seeking its destruction. Many of us have breathed life into code so hideous that we later wish we could just run away from it – code so ugly, so obtuse, so tangled that it makes us want to retch, but it only wants love and understanding.

Years ago I was toying around with an idea focused on database interfaces and how they might look if they adhered more closely to Unix’s “everything is a file” philosophy: the query would be written to the “file”, the result set would be read from the “file.” One thing led to another, and after some death and destruction coding of my own, I had written the following class which has little relevance to my original idea:

<?php
class DBQuery implements Iterator
{
    protected $db;
    protected $query;
    protected $result;
    protected $index;
    protected $numRows;
    
    public function __construct($host, $dbname, $username, $password) {
        $this->db = new PDO("mysql:dbname=$dbname;host=$host",
            $username, $password);
    }
    
    public function __get($query) {
        $this->query = $query;
        $this->result = $this->db->query($query);
        return $this->numRows = $this->result->rowCount();
    }
    
    public function __call($query, $values) {
        $this->query = $query;
        $this->result = $this->db->prepare($this->query);
        $this->result->execute($values);
        return $this->numRows = $this->result->rowCount();
    }
    
    public function clear() {
        $this->index = 0;
        $this->numRows = 0;
        $this->query = "";
        $this->result->closeCursor();
    }
    
    public function rewind() {
        $this->index = 0;
    }
    
    public function current() {
        return $this->result->fetch(PDO::FETCH_ASSOC,
            PDO::FETCH_ORI_ABS, $this->index);
    }
    
    public function key() {
        return $this->index;
    }
    
    public function next() {
        $this->index++;
    }
    
    public function valid() {
        return ($this->index < $this->numRows);
    }
    
    public function __toString() {
        return $this->query;
    }
}

The result was genius, but repulsive: an instance which looked like an object (with no real API methods), or an array, or a string…

<?php
$dbq = new DBQuery("localhost", "test", "dbuser",
    "dbpassword");

// query the database if the user is authorized
// (instance behaves like an object)
$username = "administrator";
$password = sha1("password");
if (!$dbq->{"SELECT * FROM admin_user WHERE username=? " .
    "AND password=?"}($username, $password)) {
    die("Unauthorized.");
}

// query the database and display some records
// (instance is iterable like an array)
$dbq->{"SELECT id, first_name, last_name FROM employee"};
foreach ($dbq as $result) {
    print_r($result);
}

// casting the object to a string yields the query
echo "Query: $dbq";

I blogged about it shortly thereafter and branded it as evil. Friends and colleagues who saw it reacted pretty much the same way: “Brilliant! Now kill it… kill it with fire.”

But over the years since, I admit I’ve softened on it. The only rule it really bends is that of the programmer’s expectation of blandly named methods like query() and result(). Instead, it uses the query string itself as the querying method, and the object is both the interface and the result set. Certainly it’s no worse than an overly-generalized ORM interface with select() and where() methods chained together to look like an SQL query but with more ->‘s. Maybe my class really isn’t so evil after all? Maybe it just wants to be loved? I certainly don’t want to die in the Arctic!
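For anyone wondering how a query string can act as a method name at all, the mechanism can be shown without a database. This is a minimal sketch with a made-up class, MagicEcho, standing in for DBQuery: the curly-brace syntax lets any string be used as a property or method name, and PHP routes the undefined name to __get() or __call().

```php
<?php
// Hypothetical stand-in for DBQuery that only reports what the
// magic methods receive; it never touches a database.
class MagicEcho
{
    public function __get($name) {
        // Triggered by reading an undefined property.
        return "__get received: $name";
    }

    public function __call($name, $args) {
        // Triggered by calling an undefined method.
        return "__call received: $name with " . count($args) . " argument(s)";
    }
}

$m = new MagicEcho();
$asProperty = $m->{"SELECT 1"};
$asMethod = $m->{"SELECT ? + ?"}(1, 2);

echo $asProperty . "\n";
echo $asMethod . "\n";
```

In DBQuery the same two hooks run the string as a plain query and as a prepared statement respectively.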

In Closing

I hope you enjoyed the article and the examples won’t give you (too many) nightmares! I’m sure you’ve got your own tales of haunted or monstrous code, and there’s no need to let the holiday fun fizzle out regardless of where you are in the world, so feel free to share your scary PHP stories in the comments below!


  • http://www.audero.it/ Aurelio De Rosa

    Hi Timothy and thank you for sharing these amazing examples. Just a side note…in the first example wouldn’t it be simpler to add unset($word); after the first loop instead of rewriting the code? Of course for this example there’s just one line of code, but in real source the lines to change could be much more.

    • http://prorankstudios.com Jeremy Rimer

      Solve the first problem with
      foreach ($array as $key=>$value)
      {
      $array[$key] = ucfirst($value);
      }

      …instead of using a reference to the current element.

  • Omer Sabic

    I think the “float comparison” could also fit on this list. The day I discovered it I thought I was losing my mind!

  • Ignatius Teo

    References are inherently evil and should be avoided where possible. One way Arthur could have avoided all of this was to use array_map().

    By the way, Tim, loved the Frankenstein API. That’s pure evil genius!

  • http://www.fractalserver.com Manuel Herrera

    Hi Timothy,
    In “The Haunted Array” chapter, I would recommend the use of the following structure:
    while(list($key,$var) = each($array)){
    $array[$key] = transform($var);
    }
    reset($array);
    Probably a little bit verbose, but you won’t experience problems with array item modifications.

  • Joe

    No. Thank goodness Halloween isn’t big here – but there is some Halloween junk in our supermarket. Not happy Jan.

  • http://flowerpoop.com James Lee

    wouldn’t using array_map just be as helpful?
    $spell = array("double", "toil", "trouble", "cauldron", "bubble");
    $spell = array_map(function($value){
    return ucfirst($value);
    }, $spell);
    print_r($spell);

    Array
    (
    [0] => Double
    [1] => Toil
    [2] => Trouble
    [3] => Cauldron
    [4] => Bubble
    )

  • http://www.coreyballou.com Corey Ballou

    PHP 5.3+ could help clean up your PDO API class with dynamic method calls.

    $query('username', 'my_password')) {
    die("Unauthorized.");
    }

  • http://www.mightmedia.net FDisk

    For the first bug there is an explanation:
    https://bugs.php.net/bug.php?id=29992