The Delicious Evils of PHP

By Christopher Pitt

This article was peer reviewed by Wern Ancheta and Deji Akala. Thanks to all of SitePoint’s peer reviewers for making SitePoint content the best it can be!


I want to look at two PHP functions: eval and exec. They’re so often thrown under the sensible-developers-never-use-these bus that I sometimes wonder how many awesome applications we miss out on.

Like every other function in the standard library, these have their uses. They can be abused. Their danger lies in the amount of flexibility and power they offer even the most novice of developers.

Let me show you some of the ways I’ve seen these used, and then we can talk about safety precautions and moderation.

Dynamic Class Creation

The first time I ever saw dynamic class creation was in the bowels of CodeIgniter. At the time, CodeIgniter was using it to create ORM classes. CodeIgniter still uses eval to rewrite short open tags on systems that don’t have the feature enabled…

More recently, though, my friend Adam Wathan tweeted about using it to dynamically create Laravel facades. Take a look at what the classes usually look like:

namespace Illuminate\Support\Facades;

class Artisan extends Facade
{
    protected static function getFacadeAccessor()
    {
        return "Illuminate\Contracts\Console\Kernel";
    }
}

This is from github.com/laravel/framework/blob/5.3/src/Illuminate/Support/Facades/Artisan.php

These facade classes aren’t facades in the traditional sense, but they do act as static references to objects stored in Laravel’s service locator class. They provide an easy way to refer to objects defined and configured elsewhere, and have benefits over traditional static (or singleton) classes. One of these benefits is in testing:

public function testNotificationWasQueued()
{
    Artisan::shouldReceive("queue")
        ->once()
        ->with(
            "user:notify",
            Mockery::subset(["user" => 1])
        );

    $service = new App\Service\UserService();
    $service->notifyUser(1);
}

…and though these facades are simple to create, there are a lot of them. That’s not the kind of code I find interesting to write. It seems Adam felt the same way when he wrote the tweet.

So, how could we create these facade classes dynamically? I haven’t seen Adam’s implementation code, but I’m guessing it looks something like:

function facade($name, $className) {
    if (class_exists($name)) {
        return;
    }

    eval("
        class $name extends Facade
        {
            protected static function getFacadeAccessor()
            {
                return $className::class;
            }
        }
    ");
}

That’s a neat trick. Whether or not you use (or even like) Laravel facades, I’m guessing you can see the benefits of writing less code. Sure, this probably adds to the execution time of each request, but we’d have to profile performance to decide if it even matters much.
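To try the idea outside Laravel, here is a minimal self-contained sketch. The Facade base class and its $instances registry are simplified stand-ins invented for this example (not Laravel's real implementation), and GreeterService is a hypothetical service. Unlike the guess above, it uses var_export to write the accessor as a quoted string rather than relying on ::class inside the evaluated code.

```php
<?php

// Simplified stand-in for Laravel's Facade base class (an assumption for
// this sketch): it forwards static calls to an object kept in a registry.
abstract class Facade
{
    public static $instances = [];

    abstract protected static function getFacadeAccessor();

    public static function __callStatic($method, $args)
    {
        $instance = static::$instances[static::getFacadeAccessor()];

        return $instance->$method(...$args);
    }
}

// Hypothetical service we want a facade for.
class GreeterService
{
    public function greet($name)
    {
        return "hello {$name}";
    }
}

function facade($name, $className) {
    if (class_exists($name)) {
        return;
    }

    // var_export() turns the class name into a safely quoted PHP string
    $accessor = var_export($className, true);

    eval("
        class $name extends Facade
        {
            protected static function getFacadeAccessor()
            {
                return $accessor;
            }
        }
    ");
}

Facade::$instances[GreeterService::class] = new GreeterService();

facade("Greeting", GreeterService::class);

print Greeting::greet("world"); // hello world
```

The facade function should only ever receive names written by the developer. Anything interpolated into the class declaration becomes code.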

I’ve used a similar trick before, when working on a functional programming library. I wanted to create the supporting code to enable me to write applications using fewer (if any) classes. This is what I used eval to create:

functional\struct\create("person", [
    "first_name" => "string",
    "last_name" => "string",
]);

$me = person([
    "first_name" => "christopher",
    "last_name" => "pitt",
]);

I wanted these structures to be accessible from anywhere, and self-validating. I wanted the freedom to be able to pass in an array of properties, and the control to be able to reject invalid types.

To allow this, I created the code below. It uses many instances of ƒ, a Unicode character I’ve used as a sort of pseudo-namespace for properties and methods I can’t make private, yet don’t want others to access directly. It’s interesting to know that most Unicode characters are valid in class, method, and property names:

abstract class ƒstruct
{
    /**
     * @var array
     */
    protected $ƒdef = [];

    /**
     * @var array
     */
    protected $ƒdata = [];

    /**
     * @var string
     */
    protected $ƒname = "structure";

    public function __construct(array $data)
    {
        foreach ($data as $prop => $val) {
            $this->$prop = $val;
        }

        assert($this->ƒthrow_not_all_set());
    }

    private function ƒthrow_not_all_set()
    {
        foreach ($this->ƒdef as $prop => $type) {
            $typeIsNotMixed = $type !== "mixed";
            $propIsNotSet = !isset($this->ƒdata[$prop]);

            if ($typeIsNotMixed and $propIsNotSet) {
                // throw exception
            }
        }

        return true;
    }

    public function __set($prop, $value)
    {
        assert($this->ƒthrow_not_defined($prop, $value));
        assert($this->ƒthrow_wrong_type($prop, $value));

        $this->ƒdata[$prop] = $value;
    }

    private function ƒthrow_not_defined(string $prop)
    {
        if (!isset($this->ƒdef[$prop])) {
            // throw exception
        }

        return true;
    }

    private function ƒthrow_wrong_type(string $prop, $val)
    {
        $type = $this->ƒdef[$prop];

        $typeIsNotMixed = $type !== "mixed";
        $typeIsNotSame = $type !== type($val);

        if ($typeIsNotMixed and $typeIsNotSame) {
            // throw exception
        }

        return true;
    }

    public function __get($prop)
    {
        if ($prop === "class") {
            return $this->ƒname;
        }

        assert($this->ƒthrow_not_defined($prop));

        if (isset($this->ƒdata[$prop])) {
            return $this->ƒdata[$prop];
        }

        return null;
    }
}

function type($var) {
    $checks = [
        "is_callable" => "callable",
        "is_string" => "string",
        "is_integer" => "int",
        "is_float" => "float",
        "is_null" => "null",
        "is_bool" => "bool",
        "is_array" => "array",
    ];

    foreach ($checks as $func => $val) {
        if ($func($var)) {
            return $val;
        }
    }

    if ($var instanceof ƒstruct) {
        return $var->class;
    }

    return "unknown";
}

function create(string $name, array $definition) {
    if (class_exists("\\ƒ" . $name)) {
        // throw exception
    }

    $def = var_export($definition, true);

    $code = "
        final class ƒ$name extends ƒstruct {
            protected \$ƒdef = $def;
            protected \$ƒname = '$name';
        }

        function $name(array \$data = []) {
            return new ƒ$name(\$data);
        }
    ";

    eval($code);
}

This is similar to code found at github.com/assertchris/functional-core

There’s a lot going on here, so let’s break it down:

  1. The ƒstruct class is the abstract basis for these self-validating structures. It defines __get and __set behavior that includes checks for presence and validity of the data used to initialize each struct.
  2. When a struct is created, ƒstruct checks if all required properties have been provided. That is, every property must be set unless its type is mixed.
  3. As each property is set, the value provided is checked against the expected type for that property.
  4. All of these checks are designed to work with (and wrapped in) calls to assert. This means the checks are only performed in development environments.
  5. The type function is used to return predictable type strings for the most common types of variables. In addition, if the variable is an instance of a ƒstruct subclass, the ƒname property value is returned as the type string. This means we can define nested structures as easily as: create("account", ["holder" => "person"]). A caveat is that the pre-defined types (like "int" and "string") will always be resolved before structures of the same name.
  6. The create function uses eval to create new subclasses of ƒstruct, containing the appropriate class name, ƒname, and ƒdef. var_export takes the value of a variable and returns the syntax string form of it.
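The last point depends on var_export producing valid PHP source. A short sketch of that round trip, using the same definition array as the person example:

```php
<?php

$definition = [
    "first_name" => "string",
    "last_name" => "string",
];

// with true as the second argument, var_export() returns PHP source code
// for the value instead of printing it
$source = var_export($definition, true);

// evaluating that source gives back an identical array
$copy = eval("return $source;");

var_dump($copy === $definition); // bool(true)
```

This is why create can drop $def straight into the class body it generates.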

The assert function is usually disabled in production environments by setting zend.assertions to 0 (or to -1, which doesn’t even compile the assertions) in php.ini. If you’re not seeing assertion errors where you expect them, check this setting.
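A quick way to check this from inside a script is to read the setting with ini_get. This is only a small diagnostic sketch:

```php
<?php

// ini_get() returns the current value as a string
$mode = ini_get("zend.assertions");

print "zend.assertions is {$mode}\n";

if ($mode === "1") {
    print "assert() checks will run\n";
} else {
    print "assert() checks are skipped\n";
}
```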

Domain Specific Languages

Domain Specific Languages (or DSLs as they’re usually referred to) are alternative programming languages that express an idea or problem domain well. Markdown is an excellent example of this.

I’m writing this post in Markdown, because it allows me to define the meaning and importance of each bit of text, without getting bogged down in the visual appearance of the post.

CSS is another excellent DSL. It provides many and varied means of addressing one or more HTML elements (by a selector), so that visual styles can be applied to them.

DSLs can be internal or external. Internal DSLs use an existing programming language as their syntax, but they are uniquely structured within that syntax. Fluent interfaces are a good example of this:

Post::where("is_published", true)
    ->orderBy("published_at", "desc")
    ->take(6)
    ->skip(12)
    ->get();

This is an example of some code you might see in a Laravel application. It’s using an ORM called Eloquent, to build a query for a SQL database.

External DSLs use their own syntax, and need some kind of parser or compiler to transform this syntax into machine code. SQL syntax is a good example of this:

SELECT * FROM posts
    WHERE is_published = 1
    ORDER BY published_at DESC
    LIMIT 12, 6;

The Eloquent query above should produce approximately this SQL. It’s sent over the wire to a MySQL server, which parses it and turns it into operations the database engine can execute.

If we wanted to make our own external DSL, we would need to transform custom syntax into code a machine can understand. Short of learning how assembler works, we could translate custom syntax into a lower-level language. Like PHP.

Imagine we wanted to make a language that was a super-set language. That means the language would support everything PHP does, but also a few extra bits of syntax. A small example could be:

$numbers = [1, 2, 3, 4, 5];
print_r($numbers[2..4]);

How could we convert this into valid PHP code? I answered this exact question in a previous post, but the gist of it is by using code similar to:

function replace($matches) {
    return '
        call_user_func(function($list) {
            $lower = '.explode('..', $matches[2])[0].';
            $upper = '.explode('..', $matches[2])[1].';
            return array_slice(
                $list, $lower, $upper - $lower
            );
        }, $'.$matches[1].')
    ';
}

function parse($code) {
    $replaced = preg_replace_callback(
        '/\$(\S+)\[(\S+)\]/', 'replace', $code
    );

    eval($replaced);
}

parse('
    $numbers = [1, 2, 3, 4, 5];
    print_r($numbers[2..4]);
');

This code takes a string of PHP-like syntax and parses it by replacing new syntax with standard PHP syntax. Once the syntax is standard PHP, the code can be evaluated. It essentially does an inline code replacement, which is only possible when code can be executed dynamically.

To do this, without the eval function, we’d need to build a compiler. Something that takes high-level code and gives back low-level code. In this case, it would need to take our PHP super-set language code, and give back valid PHP code.
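A very small step in that direction is to keep the string replacement, but write the result to a file and include it rather than passing it to eval. The sketch below makes several assumptions: the compile function, its stricter regular expression, and the temporary file handling are invented for illustration, and a regular expression is nowhere near a real parser.

```php
<?php

// Turns `$list[a..b]` into a plain array_slice() call.
function compile($code) {
    return preg_replace_callback(
        '/\$(\w+)\[(\d+)\.\.(\d+)\]/',
        function ($matches) {
            $length = $matches[3] - $matches[2];

            return 'array_slice($' . $matches[1] . ', '
                . $matches[2] . ', ' . $length . ')';
        },
        $code
    );
}

$compiled = compile('
    $numbers = [1, 2, 3, 4, 5];
    $slice = $numbers[2..4];
');

// write the standard PHP to a temporary file and include it, instead of
// handing the string to eval
$path = tempnam(sys_get_temp_dir(), "dsl");
file_put_contents($path, "<?php\n" . $compiled);

include $path;
unlink($path);

print_r($slice); // the elements 3 and 4
```

The generated file can be cached, inspected, and stepped through in a debugger, which evaluated strings don't offer.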

Parallelism

Let’s take a look at another maligned core function: exec. Perhaps decried even more than eval, exec is denounced by all but the more adventurous developers. And I have to wonder why.

In case you’re unfamiliar, exec works like this:

exec("ls -la | wc -l", $output);
print $output[0]; // number of lines printed by ls -la (the "total" line, ".", "..", and one line per entry)

exec is a way for PHP developers to run an operating system command, in a new sub-process of the current script. With a little bit of prodding, we can actually make this sub-process run completely in the background:

exec("sleep 30 > /dev/null 2> /dev/null &");

To do this, we redirect stdout and stderr to /dev/null and add an & to the end of the command we want to run in the background. There are many reasons you’d want to do something like this, but my favorite is to be able to perform slow and/or blocking tasks away from the main PHP process.
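If we want to keep track of the background process, we can ask the shell for its process ID. This sketch assumes a POSIX shell, where $! holds the ID of the most recent background command:

```php
<?php

// `echo $!` prints the process ID of the command the shell just put in
// the background; exec() collects that line in $output
exec('sleep 30 > /dev/null 2> /dev/null & echo $!', $output);

$pid = (int) $output[0];

print "started background process {$pid}\n";

// later, if we no longer need it (POSIX systems only)
exec("kill {$pid}");
```

The command is in single quotes so that the shell, not PHP, sees $!.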

Imagine you had a script like this:

foreach ($images as $image) {
    $source = imagecreatefromjpeg($image["src_path"]);

    $icon = imagecreatetruecolor(64, 64);

    // the destination image comes first, then the source image
    imagecopyresampled(
        $icon, $source, 0, 0, 0, 0,
        64, 64, $image["width"], $image["height"]
    );

    imagejpeg($icon, $image["ico_path"]);

    imagedestroy($icon);
    imagedestroy($source);
}

This is fine, for a few images. But imagine hundreds of images, or dozens of requests per second. Traffic like that could easily affect server performance. In cases like these, we can isolate slow code and run it in parallel (or even remotely) to user-facing code.

Here’s how we could run the slow code:

exec("php slow.php > /dev/null 2> /dev/null &");

We could even take it a step further by generating a dynamic script for the PHP command-line interface to run. To begin with, we can install SuperClosure and use it to serialize a closure:

require __DIR__ . '/vendor/autoload.php';

use SuperClosure\Serializer;

function defer(Closure $closure) {
    $serializer = new Serializer();
    $serialized = $serializer->serialize($closure);

    $autoload = __DIR__ . '/vendor/autoload.php';

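    // var_export() quotes and escapes both values, so quotes or
    // backslashes inside them can't break the generated script
    $raw = '
        require ' . var_export($autoload, true) . ';

        use SuperClosure\Serializer;

        $serializer = new Serializer();
        $serialized = ' . var_export($serialized, true) . ';

        call_user_func(
            $serializer->unserialize($serialized)
        );
    ';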

    $encoded = base64_encode($raw);

    $script = 'eval(base64_decode(\'' . $encoded . '\'));';

    exec('php -r "' . $script . '"', $output);

    return $output;
}

$output = defer(function() {
    print "hi";
});

Why do we need to hard-code a script (to run in parallel) when we could just dynamically generate the code we want to run, and pass it directly to the PHP binary?

We can even combine this exec trick with eval, by encoding the source code we want to run, and decoding it upon execution. This makes the command to start the sub-process much neater overall.

We can also add a unique identifier, so that the sub-process is easier to track and kill:

function defer(Closure $closure, $id = null) {
    // create $script

    if (is_string($id)) {
        $script = '/* id:' . $id . ' */' . $script;
    }

    $shh = '> /dev/null 2> /dev/null &';

    exec(
        'php -r "' . $script . '" ' . $shh,
        $output
    );

    return $output;
}

Staying Safe

The main reason so many developers dislike and/or advise against eval and exec is that their misuse leads to far more disastrous outcomes than, say, count.

I’d suggest, instead of listening to these folks and immediately dismissing eval and exec, you learn how to use them securely. The main thing you want to avoid is using them with unfiltered user-supplied input.

Avoid at all costs:

exec($_GET["op"] . " " . $_GET["path"]);

Try instead:

$op = $_GET["op"];
$path = $_GET["path"];

if (allowed_op($op) and allowed_path($path)) {
    $clean = escapeshellarg($path);

    if ($op === "touch") {
        exec("touch {$clean}");
    }

    if ($op === "remove") {
        exec("rm {$clean}");
    }
}

…or better yet: avoid putting any user-supplied data directly into an exec command! You can also try other escaping functions, like escapeshellcmd. Remember that this is a gateway into your system. Anything the user running the PHP process is allowed to do, exec is allowed to do. That’s why it’s often disabled on shared hosting.
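The allowed_op and allowed_path helpers in the example above aren’t defined anywhere, so here is one way they might look. Both bodies are assumptions made for illustration: an explicit list of operations, and a check that the resolved directory sits inside a base directory we control (the /tmp/uploads default is hypothetical).

```php
<?php

function allowed_op($op) {
    // only operations we have explicitly written code for
    return in_array($op, ["touch", "remove"], true);
}

function allowed_path($path, $base = "/tmp/uploads") {
    if (!is_string($path) || $path === "") {
        return false;
    }

    // resolve the directory part, so "../" tricks and symlinks are
    // expanded before comparing against the allowed base directory
    $dir = realpath(dirname($path));
    $root = realpath($base);

    if ($dir === false || $root === false) {
        return false;
    }

    return $dir === $root
        || strpos($dir, $root . DIRECTORY_SEPARATOR) === 0;
}

$base = sys_get_temp_dir();

var_dump(allowed_op("touch"));                            // bool(true)
var_dump(allowed_op("rm -rf /"));                         // bool(false)
var_dump(allowed_path($base . "/icon.jpg", $base));       // bool(true)
var_dump(allowed_path($base . "/../etc/passwd", $base));  // bool(false)
```

Checks like these belong alongside escapeshellarg, not in place of it.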

As with all PHP core functions, use these in moderation. Pay special attention to the data you allow in, and avoid unfiltered user-supplied data in exec. But, don’t avoid the functions without understanding why you’re avoiding them. That’s not helpful for you or the people who will learn from you.

  • Oh man, please. You will actually write less code, but, later, other developers will suffer over your code because of poor legibility and obscure code. And chances are that as developers work on that kind of code, it will become worse and worse, and will rot incredibly fast.

    • Chris

      I’m not sure what parts you particularly dislike, nor what your motivation for disliking them really is. “This will lead to abuse” isn’t a compelling argument, and in truth it is the very cargo-cult mentality the post seeks to unpack.

      • Konstantin Pereiaslov

        André is referring to “Dynamic class creation” section.
        It’s same as with using magic __get and __set methods. It might save writing some code, but you’re losing ability to statically analyze this code. This means, you’re losing many features of IDE.
        Also it becomes impossible to step through the code in the debugger.

        It is useful if you’re writing throw-away code, but in the long term it’s harmful for development process.

        • Chris

          This is a lot clearer to talk about.

          I think some of it can be mitigated by documentation, both intentional and phpDoc. For instance, dynamic methods and properties can be defined for an IDE via the @method and @property declarations. This works exceedingly well for PHPStorm, for instance. ~~I’m fairly sure Xdebug steps through __get, __set, and __call; so that would only be a problem for custom step-through debuggers.~~ [I’m unsure whether Xdebug actually works on evaluated classes. Interested to see what it does…]

          I’d have to see more data to agree that it’s always harmful in the long-term. I’ve worked on countless projects, and with many frameworks, that use __get and __set. The long-term maintainability has never been affected by the use or avoidance of __get and __set. But then, my anecdotal evidence is as valuable as yours… :)

          • Konstantin Pereiaslov

            You can step through __get and __set, yes, but not with eval, which article is about.

            Even with @method and @property you can’t control-click on method in PHPStorm to see the code that will actually be executed. Every time I have to deal with this in 3rd-party libraries, I’m losing some time. And now you have to support both annotations and methods themselves, scattered over the file.

          • Chris

            Yeah, I see what you’re saying. The important thing is that you understand well why you don’t want to use them. That’s good. Far better than dismissing them because you’ve heard from someone else that they’re bad.

            I’ve not actually used the code I’ve demonstrated here in any projects. It’s all to show what is possible, free from the constraints of absolutism and cargo-cult thinking. Thanks for taking the time to talk about it. :)

          • Rasmus Schultz

            You saved me the time of explaining this, thank you :-)

            The truth is that almost anything that can be implemented poorly with eval() can also be implemented well by some other means. There are very few exceptions. For example, I’m working on an experimental library which, given an interface name, will generate a class and instance of an RPC client so you can host an actual implementation on a different server. Even that is bordering on hocus-pocus, even if it does have IDE support and passes static analysis ;-)

  • Interesting article..I just so don’t want to ever get to work with such a code..If I would get a real project with things introduced here I would tear my hair out :)
