The Delicious Evils of PHP

This article was peer reviewed by Wern Ancheta and Deji Akala. Thanks to all of SitePoint’s peer reviewers for making SitePoint content the best it can be!

I want to look at two PHP functions: eval and exec. They’re so often thrown under the sensible-developers-never-use-these bus that I sometimes wonder how many awesome applications we miss out on.

Like every other function in the standard library, these have their uses. They can be abused. Their danger lies in the amount of flexibility and power they offer even the most novice of developers.

Let me show you some of the ways I’ve seen these used, and then we can talk about safety precautions and moderation.


Key Takeaways

Dynamic Class Creation with `eval`: Demonstrates the flexibility of PHP through dynamic class creation, using `eval` to dynamically generate classes like Laravel facades, reducing boilerplate code and potentially enhancing productivity despite performance concerns.
Innovative Use of Unicode in PHP Structures: Introduces the concept of using Unicode characters as pseudo-namespaces in PHP classes (`ƒstruct`), which allows for creating self-validating, structured data types that enforce type and presence checks during development.
Domain Specific Languages (DSLs) in PHP: Explores how PHP can be used to create both internal and external DSLs, enhancing the expressiveness of code and allowing for clearer domain-specific logic, such as SQL query builders or Markdown parsers.
Parallel Execution with `exec`: Discusses the use of `exec` for running background processes in PHP, enabling asynchronous task handling and better resource management, which is crucial for performance in web applications handling intensive tasks.
Security Practices for `eval` and `exec`: Emphasizes the importance of secure usage of `eval` and `exec`, advocating for stringent input sanitization and validation to prevent code injection attacks, and suggesting controlled environments for their usage to mitigate risks.

Dynamic Class Creation

The first time I ever saw dynamic class creation was in the bowels of CodeIgniter. At the time, CodeIgniter was using it to create ORM classes. eval is still used to rewrite short open tags for systems that don’t have the feature enabled…

More recently, though, my friend Adam Wathan tweeted about using it to dynamically create Laravel facades. Take a look at what the classes usually look like:

namespace Illuminate\Support\Facades;

class Artisan extends Facade
{
    protected static function getFacadeAccessor()
    {
        return "Illuminate\Contracts\Console\Kernel";
    }
}

This is from github.com/laravel/framework/blob/5.3/src/Illuminate/Support/Facades/Artisan.php

These facade classes aren’t facades in the traditional sense, but they do act as static references to objects stored in Laravel’s service locator class. They provide an easy way to refer to objects defined and configured elsewhere, and have benefits over traditional static (or singleton) classes. One of these benefits is in testing:

public function testNotificationWasQueued()
{
    Artisan::shouldReceive("queue")
        ->once()
        ->with(
            "user:notify",
            Mockery::subset(["user" => 1])
        );

    $service = new App\Service\UserService();
    $service->notifyUser(1);
}

…and though these facades are simple to create, there are a lot of them. That’s not the kind of code I find interesting to write. It seems Adam felt the same way when he wrote the tweet.

So, how could we create these facade classes dynamically? I haven’t seen Adam’s implementation code, but I’m guessing it looks something like:

function facade($name, $className) {
    if (class_exists($name)) {
        return;
    }

    eval("
        class $name extends \\Illuminate\\Support\\Facades\\Facade
        {
            protected static function getFacadeAccessor()
            {
                return $className::class;
            }
        }
    ");
}

That’s a neat trick. Whether or not you use (or even like) Laravel facades, I’m guessing you can see the benefits of writing less code. Sure, this probably adds to the execution time of each request, but we’d have to profile performance to decide if it even matters much.
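
Here is a self-contained sketch of the same trick that runs without Laravel. `SketchFacade` and `facade_sketch` are hypothetical names I made up for illustration, and the name validation is my own addition: since the class name ends up inside a string of code, it seems sensible to reject anything that isn't a plain identifier before calling eval.

```php
<?php
// Hypothetical stand-in for a framework's facade base class.
abstract class SketchFacade
{
    public static function accessor()
    {
        // Late static binding picks up the generated subclass's method.
        return static::getFacadeAccessor();
    }
}

function facade_sketch(string $name, string $className)
{
    // Only plain identifiers (and namespaced names) may reach eval.
    $identifier = '/\A[A-Za-z_][A-Za-z0-9_]*\z/';
    $qualified = '/\A[A-Za-z_][A-Za-z0-9_]*(\\\\[A-Za-z_][A-Za-z0-9_]*)*\z/';

    if (!preg_match($identifier, $name) || !preg_match($qualified, $className)) {
        throw new InvalidArgumentException("Invalid class name");
    }

    if (class_exists($name, false)) {
        return;
    }

    // var_export turns the string into a safely quoted PHP literal.
    $accessor = var_export($className, true);

    eval("
        class $name extends SketchFacade
        {
            protected static function getFacadeAccessor()
            {
                return $accessor;
            }
        }
    ");
}

facade_sketch("Artisan", "Illuminate\\Contracts\\Console\\Kernel");

print Artisan::accessor(); // Illuminate\Contracts\Console\Kernel
```

Calling `facade_sketch` a second time with the same name does nothing, because of the `class_exists` guard.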

I’ve used a similar trick before, when working on a functional programming library. I wanted to create the supporting code to enable me to write applications using fewer (if any) classes. This is what I used eval to create:

functional\struct\create("person", [
    "first_name" => "string",
    "last_name" => "string",
]);

$me = person([
    "first_name" => "christopher",
    "last_name" => "pitt",
]);

I wanted these structures to be accessible from anywhere, and self-validating. I wanted the freedom to be able to pass in an array of properties, and the control to be able to reject invalid types.

The following code uses many instances of ƒ. It’s a Unicode character I’ve used as a sort of pseudo-namespace for properties and methods I can’t make private, yet don’t want others to access directly. It’s interesting to know that most Unicode characters are valid in class, method, and property names.

To make these structures possible, I created the following code:

abstract class ƒstruct
{
    /**
     * @var array
     */
    protected $ƒdef = [];

    /**
     * @var array
     */
    protected $ƒdata = [];

    /**
     * @var string
     */
    protected $ƒname = "structure";

    public function __construct(array $data)
    {
        foreach ($data as $prop => $val) {
            $this->$prop = $val;
        }

        assert($this->ƒthrow_not_all_set());
    }

    private function ƒthrow_not_all_set()
    {
        foreach ($this->ƒdef as $prop => $type) {
            $typeIsNotMixed = $type !== "mixed";
            $propIsNotSet = !isset($this->ƒdata[$prop]);

            if ($typeIsNotMixed and $propIsNotSet) {
                // throw exception
            }
        }

        return true;
    }

    public function __set($prop, $value)
    {
        assert($this->ƒthrow_not_defined($prop, $value));
        assert($this->ƒthrow_wrong_type($prop, $value));

        $this->ƒdata[$prop] = $value;
    }

    private function ƒthrow_not_defined(string $prop)
    {
        if (!isset($this->ƒdef[$prop])) {
            // throw exception
        }

        return true;
    }

    private function ƒthrow_wrong_type(string $prop, $val)
    {
        $type = $this->ƒdef[$prop];

        $typeIsNotMixed = $type !== "mixed";
        $typeIsNotSame = $type !== type($val);

        if ($typeIsNotMixed and $typeIsNotSame) {
            // throw exception
        }

        return true;
    }

    public function __get($prop)
    {
        if ($prop === "class") {
            return $this->ƒname;
        }

        assert($this->ƒthrow_not_defined($prop));

        if (isset($this->ƒdata[$prop])) {
            return $this->ƒdata[$prop];
        }

        return null;
    }
}

function type($var) {
    $checks = [
        "is_callable" => "callable",
        "is_string" => "string",
        "is_integer" => "int",
        "is_float" => "float",
        "is_null" => "null",
        "is_bool" => "bool",
        "is_array" => "array",
    ];

    foreach ($checks as $func => $val) {
        if ($func($var)) {
            return $val;
        }
    }

    if ($var instanceof ƒstruct) {
        return $var->class;
    }

    return "unknown";
}

function create(string $name, array $definition) {
    if (class_exists("\\ƒ" . $name)) {
        // throw exception
    }

    $def = var_export($definition, true);

    $code = "
        final class ƒ$name extends ƒstruct {
            protected \$ƒdef = $def;
            protected \$ƒname = '$name';
        }

        function $name(array \$data = []) {
            return new ƒ$name(\$data);
        }
    ";

    eval($code);
}

This is similar to code found at github.com/assertchris/functional-core

There’s a lot going on here, so let’s break it down:

The ƒstruct class is the abstract basis for these self-validating structures. It defines __get and __set behavior that includes checks for presence and validity of the data used to initialize each struct.
When a struct is created, ƒstruct checks if all required properties have been provided. That is, every property must be defined unless its type is mixed.
As each property is set, the value provided is checked against the expected type for that property.
All of these checks are designed to work with (and wrapped in) calls to assert. This means the checks are only performed in development environments.
The type function is used to return predictable type strings for the most common types of variables. In addition, if the variable is an instance of a ƒstruct subclass, the ƒname property value is returned as the type string. This means we can define nested structures as easily as: create("account", ["holder" => "person"]). A caveat is that the pre-defined types (like "int" and "string") will always be resolved before structures of the same name.
The create function uses eval to create new subclasses of ƒstruct, containing the appropriate class name, ƒname, and ƒdef. var_export takes the value of a variable and returns the syntax string form of it.
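
The var_export step is what makes the definition array survive the trip through eval. A small sketch of that round trip, on its own (the variable names are mine):

```php
<?php
$definition = [
    "first_name" => "string",
    "last_name" => "string",
];

// With true as the second argument, var_export returns the PHP source
// for the value instead of printing it.
$source = var_export($definition, true);

// Evaluating that source gives back an identical array.
$copy = eval("return $source;");

var_dump($copy === $definition); // bool(true)
```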

The assert function is usually disabled in production environments by setting zend.assertions to 0 or -1 in php.ini (with -1, PHP doesn’t even compile the assertion code). If you’re not seeing assertion errors where you expect them, check what this setting is set to.
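
If you’re unsure which mode you’re running in, something like the following can tell you. `assertion_mode` is a hypothetical helper name; the only real API it relies on is ini_get.

```php
<?php
// Hypothetical helper: describes the current zend.assertions setting.
function assertion_mode(): string
{
    $setting = (int) ini_get("zend.assertions");

    if ($setting === 1) {
        return "enabled";
    }

    if ($setting === 0) {
        return "compiled but skipped";
    }

    return "not compiled";
}

print assertion_mode() . "\n";
```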

Domain Specific Languages

Domain Specific Languages (or DSLs as they’re usually referred to) are alternative programming languages that express an idea or problem domain well. Markdown is an excellent example of this.

I’m writing this post in Markdown, because it allows me to define the meaning and importance of each bit of text, without getting bogged down in the visual appearance of the post.

CSS is another excellent DSL. It provides many and varied means of addressing one or more HTML elements (by a selector), so that visual styles can be applied to them.

DSLs can be internal or external. Internal DSLs use an existing programming language as their syntax, but they are uniquely structured within that syntax. Fluent interfaces are a good example of this:

Post::where("is_published", true)
    ->orderBy("published_at", "desc")
    ->take(6)
    ->skip(12)
    ->get();

This is an example of some code you might see in a Laravel application. It’s using an ORM called Eloquent, to build a query for a SQL database.

External DSLs use their own syntax, and need some kind of parser or compiler to transform this syntax into machine code. SQL syntax is a good example of this:

SELECT * FROM posts
    WHERE is_published = 1
    ORDER BY published_at DESC
    LIMIT 12, 6;

The above PHP code should approximately render to this SQL code. It’s sent over the wire to a MySQL server, which transforms it into code servers can understand.
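
To make the relationship between the two concrete, here is a rough sketch of a fluent builder that renders SQL. `QuerySketch` is a hypothetical class, not how Eloquent works internally; it only supports integer and boolean values, and a real builder would use bound parameters instead of inlining them.

```php
<?php
// Hypothetical, minimal fluent query builder (not Eloquent's implementation).
final class QuerySketch
{
    private $table;
    private $wheres = [];
    private $order = null;
    private $limit = null;
    private $offset = null;

    public function __construct(string $table)
    {
        $this->table = $table;
    }

    public function where(string $column, $value): self
    {
        $this->wheres[] = [$column, $value];
        return $this;
    }

    public function orderBy(string $column, string $direction = "asc"): self
    {
        $this->order = $column . " " . strtoupper($direction);
        return $this;
    }

    public function take(int $count): self
    {
        $this->limit = $count;
        return $this;
    }

    public function skip(int $count): self
    {
        $this->offset = $count;
        return $this;
    }

    public function toSql(): string
    {
        $sql = "SELECT * FROM " . $this->table;

        $parts = [];
        foreach ($this->wheres as list($column, $value)) {
            // Sketch only: values are cast to int rather than bound.
            $parts[] = $column . " = " . (int) $value;
        }
        if ($parts) {
            $sql .= " WHERE " . implode(" AND ", $parts);
        }
        if ($this->order !== null) {
            $sql .= " ORDER BY " . $this->order;
        }
        if ($this->limit !== null) {
            $sql .= $this->offset !== null
                ? " LIMIT {$this->offset}, {$this->limit}"
                : " LIMIT {$this->limit}";
        }

        return $sql;
    }
}

$sql = (new QuerySketch("posts"))
    ->where("is_published", true)
    ->orderBy("published_at", "desc")
    ->take(6)
    ->skip(12)
    ->toSql();

print $sql;
// SELECT * FROM posts WHERE is_published = 1 ORDER BY published_at DESC LIMIT 12, 6
```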

If we wanted to make our own external DSL, we would need to transform custom syntax into code a machine can understand. Short of learning how assembler works, we could translate custom syntax into a lower-level language. Like PHP.

Imagine we wanted to make a language that was a super-set language. That means the language would support everything PHP does, but also a few extra bits of syntax. A small example could be:

$numbers = [1, 2, 3, 4, 5];
print_r($numbers[2..4]);

How could we convert this into valid PHP code? I answered this exact question in a previous post, but the gist of it is by using code similar to:

function replace($matches) {
    return '
        call_user_func(function($list) {
            $lower = '.explode('..', $matches[2])[0].';
            $upper = '.explode('..', $matches[2])[1].';
            return array_slice(
                $list, $lower, $upper - $lower
            );
        }, $'.$matches[1].')
    ';
}

function parse($code) {
    $replaced = preg_replace_callback(
        '/\$(\S+)\[(\S+)\]/', 'replace', $code
    );

    eval($replaced);
}

parse('
    $numbers = [1, 2, 3, 4, 5];
    print_r($numbers[2..4]);
');

This code takes a string of PHP-like syntax and parses it by replacing new syntax with standard PHP syntax. Once the syntax is standard PHP, the code can be evaluated. It essentially does an inline code replacement, which is only possible when code can be executed dynamically.

To do this, without the eval function, we’d need to build a compiler. Something that takes high-level code and gives back low-level code. In this case, it would need to take our PHP super-set language code, and give back valid PHP code.
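
A very small step in that direction might look like this: the same kind of replacement, but the output is written to a file and included rather than passed to eval. This is still regex-based (so nowhere near a real parser), `compile_slices` is a hypothetical name, and I’ve narrowed the pattern to numeric ranges to keep the sketch predictable.

```php
<?php
// Hypothetical source-to-source step: rewrites $list[a..b] into array_slice.
function compile_slices(string $code): string
{
    return preg_replace_callback(
        '/\$(\w+)\[(\d+)\.\.(\d+)\]/',
        function (array $m): string {
            $lower = (int) $m[2];
            $length = (int) $m[3] - $lower;

            return sprintf('array_slice($%s, %d, %d)', $m[1], $lower, $length);
        },
        $code
    );
}

$compiled = compile_slices('
    $numbers = [1, 2, 3, 4, 5];
    return $numbers[2..4];
');

// Write the generated PHP to a temporary file and include it.
$file = tempnam(sys_get_temp_dir(), "dsl");
file_put_contents($file, "<?php\n" . $compiled);

$result = include $file;
unlink($file);

print_r($result); // the elements 3 and 4
```

The generated file could just as easily be cached and reused between requests, which is something eval can’t offer.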

Parallelism

Let’s take a look at another maligned core function: exec. Perhaps more decried than even eval, exec is denounced by all but the more adventurous developers. And I have to wonder why.

In case you’re unfamiliar, exec works like this:

exec("ls -la | wc -l", $output);
print $output[0]; // number of lines ls printed (entries, plus "total", "." and "..")

exec is a way for PHP developers to run an operating system command, in a new sub-process of the current script. With a little bit of prodding, we can actually make this sub-process run completely in the background:

exec("sleep 30 > /dev/null 2> /dev/null &");

To do this, we redirect stdout and stderr to /dev/null and add an & to the end of the command we want to run in the background. There are many reasons you’d want to do something like this, but my favorite is to be able to perform slow and/or blocking tasks away from the main PHP process.
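
A quick way to convince yourself the command really is running in the background is to time the call. This sketch assumes a Unix-like system where sleep is available:

```php
<?php
// Assumes a Unix-like shell; "sleep 2" stands in for any slow command.
$start = microtime(true);

exec("sleep 2 > /dev/null 2> /dev/null &");

$elapsed = microtime(true) - $start;

// exec should return almost immediately, well before the 2 seconds are up.
printf("exec returned after %.2f seconds\n", $elapsed);
```

Without the redirects and the trailing &, the same call would block until sleep finished.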

Imagine you had a script like this:

foreach ($images as $image) {
    $source = imagecreatefromjpeg($image["src_path"]);

    $icon = imagecreatetruecolor(64, 64);

    imagecopyresampled(
        $icon, $source, 0, 0, 0, 0,
        64, 64, $image["width"], $image["height"]
    );

    imagejpeg($icon, $image["ico_path"]);

    imagedestroy($icon);
    imagedestroy($source);
}

This is fine, for a few images. But imagine hundreds of images, or dozens of requests per second. Traffic like that could easily affect server performance. In cases like these, we can isolate slow code and run it in parallel (or even remotely) to user-facing code.

Here’s how we could run the slow code:

exec("php slow.php > /dev/null 2> /dev/null &");

We could even take it a step further by generating a dynamic script for the PHP command-line interface to run. To begin with, we can install SuperClosure:

require __DIR__ . '/vendor/autoload.php';

use SuperClosure\Serializer;

function defer(Closure $closure) {
    $serializer = new Serializer();
    $serialized = $serializer->serialize($closure);

    $autoload = __DIR__ . '/vendor/autoload.php';

    $raw = '
        require \'' . $autoload . '\';

        use SuperClosure\Serializer;

        $serializer = new Serializer();
        $serialized = ' . var_export($serialized, true) . ';

        call_user_func(
            $serializer->unserialize($serialized)
        );
    ';

    $encoded = base64_encode($raw);

    $script = 'eval(base64_decode(\'' . $encoded . '\'));';

    exec('php -r "' . $script . '"', $output);

    return $output;
}

$output = defer(function() {
    print "hi";
});

Why do we need to hard-code a script (to run in parallel) when we could just dynamically generate the code we want to run, and pipe it directly into the PHP binary?

We can even combine this exec trick with eval, by encoding the source code we want to run, and decoding it upon execution. This makes the command to start the sub-process much neater overall.

We can even add a unique identifier, so that the sub-process is easier to track and kill:

function defer(Closure $closure, $id = null) {
    // create $script

    if (is_string($id)) {
        $script = '/* id:' . $id . ' */' . $script;
    }

    $shh = '> /dev/null 2> /dev/null &';

    exec(
        'php -r "' . $script . '" ' . $shh,
        $output
    );

    return $output;
}
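
Another way to keep track of a background process, as an alternative to tagging the command, is to ask the shell for its process ID. The helper names below are hypothetical, and the sketch assumes a Unix-like shell where `$!` holds the PID of the last background job and `kill -0` checks whether a process exists.

```php
<?php
// Hypothetical helpers; assumes a Unix-like shell.
function start_background(string $command): int
{
    // "echo $!" prints the PID of the job that was just backgrounded.
    exec($command . ' > /dev/null 2> /dev/null & echo $!', $output);

    return (int) $output[0];
}

function is_running(int $pid): bool
{
    // kill -0 sends no signal; it only reports whether the process exists.
    exec('kill -0 ' . $pid . ' 2> /dev/null', $ignored, $code);

    return $code === 0;
}

function stop_background(int $pid)
{
    exec('kill ' . $pid . ' 2> /dev/null');
}

$pid = start_background('sleep 30');
$running = is_running($pid);

stop_background($pid);
```

Only pass commands you control to a helper like this; the command string goes to the shell as-is.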

Staying Safe

The main reason so many developers dislike and/or advise against eval and exec is that their misuse leads to far more disastrous outcomes than, say, count.

I’d suggest, instead of listening to these folks and immediately dismissing eval and exec, you learn how to use them securely. The main thing you want to avoid is using them with unfiltered user-supplied input.

Avoid at all costs:

exec($_GET["op"] . " " . $_GET["path"]);

Try instead:

$op = $_GET["op"];
$path = $_GET["path"];

if (allowed_op($op) and allowed_path($path)) {
    $clean = escapeshellarg($path);

    if ($op === "touch") {
        exec("touch {$clean}");
    }

    if ($op === "remove") {
        exec("rm {$clean}");
    }
}

…or better yet: avoid putting any user-supplied data directly into an exec command! You can also try other escaping functions, like escapeshellcmd, which escapes shell metacharacters in a whole command string rather than quoting a single argument. Remember that this is a gateway into your system. Anything the user running the PHP process is allowed to do, exec is allowed to do. That’s why it’s often disabled on shared hosting.
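
The allowed_op and allowed_path checks aren’t defined in the snippet above, so here is one way they might look. Everything in this sketch is an assumption made for illustration: the list of operations, the base directory, and the rule that only plain file names are accepted.

```php
<?php
// Hypothetical allow-list rules; adjust to your own application.
const ALLOWED_OPS = ["touch", "remove"];
const BASE_DIR = "/tmp/uploads"; // assumed working directory

function allowed_op($op): bool
{
    return is_string($op) && in_array($op, ALLOWED_OPS, true);
}

function allowed_path($path): bool
{
    // Plain file names only: no slashes, no spaces, no leading dot or dash.
    return is_string($path)
        && preg_match('/\A[A-Za-z0-9_][A-Za-z0-9_.-]*\z/', $path) === 1;
}

// Builds the command string, or returns null when the input is rejected.
function build_command($op, $path)
{
    if (!allowed_op($op) || !allowed_path($path)) {
        return null;
    }

    // The binary comes from our own code, never from user input.
    $binary = $op === "touch" ? "touch" : "rm";

    return $binary . " " . escapeshellarg(BASE_DIR . "/" . $path);
}

var_dump(build_command("touch", "notes.txt") !== null);     // bool(true)
var_dump(build_command("touch", "../etc/passwd") === null); // bool(true)
var_dump(build_command("format", "notes.txt") === null);    // bool(true)
```

The point of the design is that user input only ever selects from choices the code already knows about, and the one value that does reach the shell is both validated and escaped.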

As with all PHP core functions, use these in moderation. Pay special attention to the data you allow in, and avoid unfiltered user-supplied data in exec. But, don’t avoid the functions without understanding why you’re avoiding them. That’s not helpful for you or the people who will learn from you.

Frequently Asked Questions (FAQs) about PHP’s Delicious Evils

What are the potential risks of using PHP’s eval() function?

The eval() function in PHP is a powerful tool that allows you to execute PHP code stored as a string. However, it comes with significant security risks. If user-supplied data is passed directly to this function without proper sanitization, it can lead to code injection attacks. This is because eval() executes the code it’s given with the privileges of the script. If you must use eval(), ensure you’re using it in a controlled environment and always sanitize and validate your inputs.

How can I safely use the exec() function in PHP?

The exec() function is another powerful PHP function that allows you to execute an external program. However, like eval(), it can be dangerous if misused. To use exec() safely, you should always escape your arguments using escapeshellarg() or escapeshellcmd(). Never pass unsanitized user input to exec(). If possible, avoid using exec() altogether and opt for safer, built-in PHP functions instead.

What is the difference between eval() and exec() in PHP?

While both eval() and exec() allow you to execute code, they do so in different ways. eval() executes PHP code stored as a string, while exec() runs an external program. This means that eval() operates within the PHP environment, while exec() can interact with the system outside of PHP. Both functions can be dangerous if misused and should be used with caution.

Why is the use of eval() and exec() often discouraged in PHP?

The use of eval() and exec() is often discouraged due to the security risks they pose. Both functions can execute code, which can lead to code injection attacks if user-supplied data is passed to them without proper sanitization. Additionally, these functions can often be misused to execute code that could have been written more safely and efficiently using other PHP functions.

What are some alternatives to using eval() in PHP?

There are several safer alternatives to using eval() in PHP. For example, you can use anonymous functions (closures) to define behavior at runtime without building code from strings. You can also use call_user_func() or call_user_func_array() to call a function chosen at runtime. Avoid create_function(): it relies on eval() internally, was deprecated in PHP 7.2, and was removed in PHP 8.0.
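
When the set of possible operations is known in advance, a lookup table of closures is often enough. A minimal sketch (the operation names and `run_operation` are made up for illustration):

```php
<?php
// A fixed table of operations, instead of code assembled from strings.
$operations = [
    "add" => function ($a, $b) {
        return $a + $b;
    },
    "multiply" => function ($a, $b) {
        return $a * $b;
    },
];

function run_operation(array $operations, string $name, array $arguments)
{
    if (!isset($operations[$name])) {
        throw new InvalidArgumentException("Unknown operation: " . $name);
    }

    return call_user_func_array($operations[$name], $arguments);
}

print run_operation($operations, "add", [2, 3]); // 5
```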

How can I protect my PHP code from injection attacks?

To protect your PHP code from injection attacks, always sanitize and validate your inputs. This means removing any potentially harmful characters and ensuring the input is of the correct type and format. Additionally, avoid using functions like eval() and exec() that can execute code, especially with user-supplied data.

What is the purpose of the @ symbol in PHP?

The @ symbol in PHP is an error control operator. When prepended to an expression, it suppresses the diagnostic messages (such as warnings and notices) that expression would otherwise generate. As of PHP 8.0, it no longer suppresses fatal errors. Using the @ symbol is generally discouraged, as it can make debugging more difficult by hiding potential errors.

What is the difference between the die() and exit() functions in PHP?

The die() and exit() functions in PHP are equivalent: die() is an alias of exit(). They both halt the execution of the script and can optionally output a message. The only difference is one of convention: die() is typically used to output an error message and terminate the script, while exit() is often used for a normal, deliberate end to the script.

How can I prevent my PHP code from being easily readable if someone gains access to my files?

To prevent your PHP code from being easily readable, you can use a process called obfuscation. This involves making your code more complex and harder to understand, while still maintaining its functionality. There are several PHP obfuscators available that can help with this.

What is the purpose of the backtick operator in PHP?

The backtick operator in PHP is used to execute shell commands. Anything enclosed within backticks will be executed as a shell command, and the output will be returned. Like eval() and exec(), the backtick operator can be dangerous if misused and should be used with caution.