PHP - - By Bert Ramakers

Poka Yoke – Saving Projects with Hyper-Defensive Programming

This article was peer reviewed by Deji Akala and Marco Pivetta. Thanks to all of SitePoint’s peer reviewers for making SitePoint content the best it can be!


When working in a medium to large team on the same codebase, it can sometimes become hard to understand each other’s code and how to use it. Various solutions exist to help with this. For example, you can agree to follow a set of coding standards to make the code more readable for each other, or use a framework that’s known to all (we’ve got a great Laravel intro premium course here).

However, this is often not enough, especially when someone has to dig back into a part of the application written some time ago to fix a bug or add a new feature. It can be quite hard to remember how particular classes were intended to work, both on their own and in combination with each other. At that point, it becomes easy to accidentally introduce side effects or bugs without realizing it.

A detected bug

These mistakes might get caught in quality assurance, but there’s a realistic chance they might slip through. And even if they get caught, it can take a lot of time to send the code back and get it fixed.

So how can we prevent this? Enter “Poka Yoke”.

What is Poka Yoke?

Poka Yoke is a Japanese term which roughly translates to “mistake-proofing”. The term
originated in lean manufacturing, where it refers to any mechanism that helps a machine operator avoid mistakes.

Outside of manufacturing, Poka Yoke is also often used in consumer electronics. Take, for example, a SIM card, which can only fit one way a sim tray because of its asymmetrical shape.

A SIM card and tray, illustrating how the SIM card can only fit one way.

An example of hardware lacking in Poka Yoke is the PS/2 port, which has exactly the same shape for a keyboard connector and a mouse connector. They can only be differentiated by using color codes, so it is easy to accidentally switch the connectors and insert them in the wrong ports because they both fit the same way.

PS/2 ports for keyboards and mice, illustrating how they use an interchangeable format making it too easy to accidentally insert one into the other.

Besides being used in hardware, the concepts of Poka Yoke can also be applied to programming. The idea is to make the public interfaces of our code as easy as possible to understand, and to raise errors as soon as our code is being used incorrectly. This might seem obvious, but in reality we often come across code that is lacking in these regards.

Note however, that Poka Yoke is not meant to prevent intentional abuse. The goal is only to prevent accidental mistakes, not to secure your code against malicious usage. As long as someone has access to your code, they will always be able to get around your safeguards if they really want to.

Before discussing what specific measures you can take to make your code more mistake-proof, it is important to know that Poka Yoke mechanisms can generally be divided into two categories:

  • Mistake prevention
  • Mistake detection

Mistake prevention techniques are helpful for catching mistakes early on. They are meant to make sure no one can accidentally use our code incorrectly, by making the interfaces and behavior as straightforward as possible. Think of the example of the SIM card, which can only fit one way in a SIM tray.

Mistake detection mechanisms, on the other hand, live outside of our code. They monitor our applications to watch for potential mistakes and warn us about them. An example can be software that detects whether a device connected to a PS/2 port is of the correct type, and if not show a warning to the user clarifying why it doesn’t work. This particular software could not prevent a mistake, because the connectors are interchangeable when plugging them in, but it can detect one and warn us about it so the mistake can be fixed.

In the rest of this article we will explore several methods we can use to implement both mistake prevention and mistake detection in our applications. But keep in mind that this list is just a starting point. Depending on your specific application, additional measures might be possible to make your code more mistake-proof. Also, it’s important to keep the upfront cost of Poka Yoke in mind and make sure it’s worth it for your specific project. Depending on the complexity and size of your application, some measures may be too costly compared to the potential cost of mistakes. So it is up to you and your team to decide which measures are best for you to take.

Mistake Prevention Examples

A targeted virus, molecular scale view CG render

(Scalar) type declarations

Previously known as type hints in PHP 5, type declarations are an easy way to start mistake-proofing your function and method signatures in PHP 7.

By assigning specific types to your function arguments, it becomes harder to mix up the order of arguments when calling the function.

For example, let’s take this Notification that we might want to send to a user:

<?php

class Notification {
    private $userId;
    private $subject;
    private $message;

    public function __construct(
        $userId,
        $subject, 
        $message
    ) {
        $this->userId = $userId;
        $this->subject = $subject;
        $this->message = $message;
    }

    public function getUserId()
    {
        return $this->userId;
    }

    public function getSubject()
    {
        return $this->subject;
    }

    public function getMessage()
    {
        return $this->message;
    }
}

Without type declarations, we can easily inject the incorrect types of variables which would probably break our application. For example, we could assume that the $userId should be a string, while it might actually have to be an int.

If we injected the wrong type, the error would probably go undetected until the application tries to actually do something with the Notification. By then, we’d probably get some cryptic error message about an unexpected type, but nothing that immediately points to our code where we inject a string instead of an int.

Because of this, it is often more interesting to force the application to crash as soon as possible, so that bugs like these get caught early on during development.

In this case, we could simply add some type declarations and PHP will stop and warn us immediately with a fatal error when we mix up the types of our arguments:

<?php

declare(strict_types=1);

class Notification {
    private $userId;
    private $subject;
    private $message;

    public function __construct(
        int $userId,
        string $subject, 
        string $message
    ) {
        $this->userId = $userId;
        $this->subject = $subject;
        $this->message = $message;
    }

    public function getUserId() : int
    {
        return $this->userId;
    }

    public function getSubject() : string
    {
        return $this->subject;
    }

    public function getMessage() : string
    {
        return $this->message;
    }
}

Note however that, by default, PHP will try to coerce incorrect arguments to their expected types. To prevent this, it is important that we enable strict_types so we actually get a fatal error when a mistake is made. Because of this, scalar type declarations are not an ideal form of Poka Yoke, but they’re a good start to reducing mistakes. Even with strict_types disabled, they can still serve as an indication of what type is expected for an argument.

Additionally, we declared return types for our methods. These make it easier to determine what kind of values we can expect when calling a certain function.

Clearly defined return types are also useful to avoid a lot of switch statements when working with return values, because without explicitly declared return types, our methods could return various types. Therefore, someone using our methods would have to check which type was actually returned in a specific scenario. These switch statements can obviously be forgotten, and lead to bugs that can be hard to detect. Mistakes like this become much less prevalent with return types.

Value Objects

One problem that scalar type hints can not easily fix for us, is that having multiple function arguments makes it possible to mix up the order of said arguments.

Child shape sorter toy, an analogy to sorting the right arguments by type when passing them into a function

When all arguments have a different scalar type, PHP can warn us when we mix up the order of arguments, but in most cases we will probably have some arguments with the same type.

To fix this, we could wrap our arguments in value objects like so:

class UserId {
    private $userId;

    public function __construct(int $userId) {
        $this->userId = $userId;
    }

    public function getValue() : int
    {
        return $this->userId;
    }
}

class Subject {
    private $subject;

    public function __construct(string $subject) {
        $this->subject = $subject;
    }

    public function getValue() : string
    {
        return $this->subject;
    }
}

class Message {
    private $message;

    public function __construct(string $message) {
        $this->message = $message;
    }

    public function getMessage() : string
    {
        return $this->message;
    }
}

class Notification {
    /* ... */

    public function __construct(
        UserId $userId,
        Subject $subject, 
        Message $message
    ) {
        $this->userId = $userId;
        $this->subject = $subject;
        $this->message = $message;
    }

    public function getUserId() : UserId { /* ... */ }

    public function getSubject() : Subject { /* ... */ }

    public function getMessage() : Message { /* ... */ }
}

Because our arguments now each have a very specific type, it becomes near impossible to mix them up.

An additional advantage of using value objects over scalar type declarations, is that we no longer have to enable strict_types in every file. And if we don’t have to remember it, we can’t forget it by accident.

Validation

A list of checkboxes being checked, analogy to validation

When working with value objects, we can encapsulate the validation logic of their data inside the objects themselves. Doing so, we can prevent the creation of a value object with an invalid state, which would probably lead to problems down the road in other layers of our application.

For example, we might have a rule that says that any given UserId should always be positive.

We could obviously validate this rule whenever we get a UserId as input, but on the other hand it can also easily be forgotten in one place or another.

And even if this mistake would result in an actual error in another layer of our application, the error message could be unclear about what actually went wrong and it becomes hard to debug.

To prevent mistakes like this, we could add some validation to the UserId constructor:

class UserId {
    private $userId;

    public function __construct($userId) {
        if (!is_int($userId) || $userId < 0) {
            throw new \InvalidArgumentException(
                'UserId should be a positive integer.'
            );
        }

        $this->userId = $userId;
    }

    public function getValue() : int
    {
        return $this->userId;
    }
}

This way we can always be sure that when we’re working with a UserId object, it has a valid state. This prevents us from having to constantly re-validate our data throughout the various layers of our application.

Note that we could add a scalar type declaration instead of using is_int here, but it would force us to enable strict_types everywhere we use UserId.

If we wouldn’t enable strict_types, PHP would silently try to coerce other types to int whenever they are passed into UserId. This can be problematic, as we might for example inject a float which might actually be an incorrect variable as user ids are generally not floats.

In other cases, where we might for example be working with a Price value object, disabling strict_types could result in rounding errors as PHP would automatically convert float variables to int.

Immutability

By default, objects are passed by reference in PHP. This means that when we make a change to an object, it becomes altered throughout our whole application instantly.

Lego blocks, perhaps the most immutable object in the known universe

While this approach has its advantages, it also has some downsides.
Take this example of a Notification being sent to a user via both SMS and e-mail:

interface NotificationSenderInterface
{
    public function send(Notification $notification);
}

class SMSNotificationSender implements NotificationSenderInterface
{
    public function send(Notification $notification) {
        $this->cutNotificationLength($notification);

        // Send an SMS...
    }

    /**
     * Makes sure the notification does not exceed the length of an SMS.
     */
    private function cutNotificationLength(Notification $notification)
    {
        $message = $notification->getMessage();
        $messageString = substr($message->getValue(), 160);
        $notification->setMessage(new Message($messageString));
    }
}

class EmailNotificationSender implements NotificationSenderInterface
{
    public function send(Notification $notification) {
        // Send an e-mail ...
    }
}

$smsNotificationSender = new SMSNotificationSender();
$emailNotificationSender = new EmailNotificationSender();

$notification = new Notification(
    new UserId(17466),
    new Subject('Demo notification'),
    new Message('Very long message ... over 160 characters.')
);

$smsNotificationSender->send($notification);
$emailNotificationSender->send($notification);

Because the Notification object is being passed by reference, we have caused an unintended side effect. By cutting the message’s length in the SMSNotificationSender, the referenced Notification object was updated throughout the whole application, which means it was also cut when it was sent by the EmailNotificationSender later on.

To fix this, we can make our Notification object immutable. Instead of providing set methods to make changes to it, we can add some with methods that make a copy of the original Notification before applying changes:

class Notification {
    public function __construct( ... ) { /* ... */ }

    public function getUserId() : UserId { /* ... */ }

    public function withUserId(UserId $userId) : Notification {
        $c = clone $this;
        $c->userId = clone $userId;
        return $c;
    }

    public function getSubject() : Subject { /* ... */ }

    public function withSubject(Subject $subject) : Notification {
        $c = clone $this;
        $c->subject = clone $subject;
        return $c;
    }

    public function getMessage() : Message { /* ... */ }

    public function withMessage(Message $message) : Notification {
        $c = clone $this;
        $c->message = clone $message;
        return $c;
    }
}

This way, whenever we make a change to our Notification class by for example cutting the message’s length, the change no longer ripple throughout the whole application, preventing any unintended side effects.

Note however that it is very hard (if not impossible) to make an object truly immutable in PHP. But for the sake of making our code mistake-proof, it already helps a lot if we add “immutable” with methods instead of set methods, as users of the class no longer have to remember to clone the object themselves before making changes.

Returning Null Objects

Sometimes we might have functions or methods that can either return some value, or null. These nullable return values can pose a problem because they almost always requires a check to see whether or not they are null before we can do something with them. Again, this is something that we could easily forget. To prevent us from always having to check the return values, we could return null objects instead.

An empty poker chip, analogous to a null object

For example, we could have a ShoppingCart with either a discount applied to it or not:

interface Discount {
    public function applyTo(int $total);
}

interface ShoppingCart {
    public function calculateTotal() : int;

    public function getDiscount() : ?Discount;
}

When calculating the final price of our ShoppingCart, we now always have to check whether getDiscount() returns null or an actual Discount before calling the applyTo method:

$total = $shoppingCart->calculateTotal();

if ($shoppingCart->getDiscount()) {
    $total = $shoppingCart->getDiscount()->applyTo($total);
}

If we didn’t do this check, we’d probably get a PHP warning and/or other unintended effects when getDiscount() returns null.

On the other hand, these checks could be removed altogether if we’d return a null object instead when no Discount is set:

class ShoppingCart {
    public function getDiscount() : Discount {
        return !is_null($this->discount) ? $this->discount : new NoDiscount();
    }
}

class NoDiscount implements Discount {
    public function applyTo(int $total) {
        return $total;
    }
}

Now, when we call getDiscount(), we always get a Discount object even if no discount is available. This way, we can apply the discount to our total, even if it there is none, and we no longer need an if statement:

$total = $shoppingCart->calculateTotal();
$totalWithDiscountApplied = $shoppingCart->getDiscount()->applyTo($total);

Optional Dependencies

For the same reasons that we would want to avoid nullable return types, we might want to avoid optional dependencies and just make all of our dependencies required.

Take for example the following class:

class SomeService implements LoggerAwareInterface {
    public function setLogger(LoggerInterface $logger) { /* ... */ }

    public function doSomething() {
        if ($this->logger) {
            $this->logger->debug('...');
        }

        // do something

        if ($this->logger) {
            $this->logger->warning('...');
        }

        // etc...
    }
}

There are two issues with this approach:

  1. We constantly have to check for the existence of a logger in our doSomething() method.
  2. When setting up the SomeService class in our service container, someone might forget to actually set a logger, or they might not even know the class has the option to set a logger.

We can simplify this by making the LoggerInterface a required dependency instead:

class SomeService {
    public function __construct(LoggerInterface $logger) { /* ... */ }

    public function doSomething() {
        $this->logger->debug('...');

        // do something

        $this->logger->warning('...');

        // etc...
    }
}

This way our public interface becomes less cluttered, and whenever someone creates a new instance of SomeService, they know that the class requires an instance of LoggerInterface so they cannot forget to inject one.

Additionally, we have omitted the need for if statements to check whether a logger is injected or not, which makes our doSomething() easier to read, and less susceptible to mistakes whenever someone would make changes to it.

If at some point we wanted to use SomeService without a logger, we could apply the same logic as with return statements and just use a null object instead:

$service = new SomeService(new NullLogger());

In the end this has the same effect as using an optional setLogger() method, but it makes our code easier to follow and reduces the chances of a mistake in our dependency injection container.

Public Interfaces

To make our code easier to use, it is best to keep the amount of public methods on our classes to the bare minimum. This way, it becomes less confusing how our code should be used, and we have less code to maintain and lesser chances of breaking backwards compatibility when refactoring.

Arrows pointing towards a machine-based center, indicating public methods entering a class' major functionality

To keep public methods to a minimum, it can help to think of public methods as transactions.

Take for example this example of transferring money between two bank accounts:

$account1->withdraw(100);
$account2->deposit(100);

While the underlying database could provide a transaction to make sure no money would be withdrawn if the deposit could not be made or vice-versa, the database can not prevent us from forgetting to call either $account1->withdraw() or $account2->deposit(), which would result in incorrect balances.

Luckily, we can easily fix this by replacing our two separate methods with a single transactional method:

$account1->transfer(100, $account2);

As a result our code becomes more robust, as it becomes harder to make a mistake by only completing the transaction partially.

Mistake Detection Examples

Mistake detection mechanisms are, contrary to mistake prevention mechanisms, not meant to prevent errors. Instead they’re meant to warn us about problems whenever they get detected.

Most of the time they live outside of our application, and run at regular intervals to monitor our code or specific changes to it.

Unit Tests

Unit tests can be a great way to make sure new code works correctly, but it can also help to make sure existing code still works as intended whenever someone refactors part of the system.

Because someone could still forget to actually run our unit tests, it is advisable to run them automatically when changes are made using services like Travis CI and Gitlab CI. This way, developers automatically get notified when breaking changes occur, and it also helps us when reviewing pull requests to make sure the changes work as intended.

Besides mistake detection, unit tests are also a great way to provide examples of how specific parts of code are intended to work, which can in turn prevent mistakes when someone else uses our code.

Code Coverage Reports and Mutation Tests

Because we could always forget to write enough tests, it can be beneficial to automatically generate code coverage reports using services like Coveralls whenever our unit tests run. Coveralls will send us a notification whenever our code coverage drops so we can add some unit tests, and we can also get a grasp of how our code coverage evolves over time.

Graphs indicating change in trends over time

Another, even better, way to make sure we have enough unit tests for our code is to set up some mutation tests, for example using Humbug. As the name implies, these tests are meant to verify that we have a decent amount of code coverage by slightly altering our source code, running our unit tests afterwards, and making sure the relevant tests start failing because of the mutations.

Using both code coverage reports and mutation tests, we can make sure that our unit tests cover enough code to prevent accidental mistakes or bugs.

Code Analyzers

Code analyzers can detect bugs in our application early in the development process. IDEs like PHPStorm, for example, use code analyzers to warn us about errors and to give suggestions when we’re writing code. These can range from simple syntax errors to the detection of duplicate code.

Besides the analyzers built into most IDEs, it is possible to incorporate third-party and even custom analyzers into the build process of our applications to spot specific problems. A non-exhaustive list of analyzers suitable for PHP projects can be found at exakat/php-static-analysis-tools, ranging from coding standard analyzers, to analyzers that check for security vulnerabilities.

Online solutions exist as well, for example SensioLabs Insights.

Log Messages

Contrary to most other mistake detection mechanisms, log messages can help us detect mistakes in our application when it’s running live in production.

Illustration of a scroll, indicating a long log

Of course it is first required that our code actually logs messages whenever something unexpected happens. Even when our code supports loggers, they can easily be forgotten when setting everything up. Because of this we should try to avoid optional dependencies (see above).

While most applications will log at least some messages, the information they provide only becomes really interesting when they are actively analyzed and monitored using tools like Kibana or Nagios. Tools like these can give new insights in what errors and warnings occur in our application when users are actively using it, instead of when it’s being tested internally. We’ve got a great post about monitoring PHP apps with this ELK stack here.

Don’t Suppress Errors

Even when actively logging error messages, it often happens that some errors are being suppressed. PHP has the tendency to carry on whenever a “recoverable” error occurs, as if it wants to help us by keeping the application running. However, errors can often be very useful when developing or testing a new feature, as they often indicate bugs in our code.

This is why most code analyzers will warn you when they detect you’re using @ to suppress errors, as it can hide bugs that will inevitably pop up again as soon as the application is actually being used by visitors.

Generally, it is best to set PHP’s error_reporting level to E_ALL so even the slightest warnings get reported. However, make sure to log these messages somewhere and hide them from your users so no sensitive information about your application’s architecture or potential security vulnerabilities are being exposed to end users.

Aside from the error_reporting configuration, it is also important to always enable strict_types so PHP doesn’t try to automatically coerce function arguments to their expected type, as this can often lead to hard-to-detect bugs when converting from one type to another (for example, rounding errors when casting from float to int).

Usages outside of PHP

As Poka Yoke is more of a concept rather than a specific technique, it can also be applied to areas outside of (but related to) PHP.

Infrastructure

At an infrastructure level, a lot of mistakes can be prevented by having a shared development setup that is identical to the production environment, using tools like Vagrant.

Automating the deployment process using build servers like Jenkins and GoCD can also help a lot to prevent mistakes when deploying changes to our application, as this can often include a wide range of required steps depending on the application that can easily be forgotten.

REST APIs

When building REST APIs, we can incorporate Poka Yoke to make our API easier to use. For example, we could make sure we always return an error whenever an unknown parameter is passed in the URL query or request body. This might seem strange as we obviously want to avoid “breaking” our API’s clients, but it is generally better to warn the developers using our API as soon as possible about incorrect usage so bugs can be fixed early in the development process.

For example, we could have a color parameter on our API, but someone consuming our API might accidentally use a colour parameter instead. Without any warnings, this mistake can easily make its way through to the production environment until it is only noticed by end users because of unintended behavior. To learn how to build APIs that won’t bite you later, a good book like this one might come in handy.

Application Configuration

Practically all applications depend on at least some custom configuration. More often than not, developers like to provide as many default values as possible for configuration, so it’s less work to configure the application.

However, just like the color and colour example above, it can be easy to mistype configuration parameters which would cause our application to unexpectedly fall back to the default values. These kinds of mistakes can be hard to track down when the application does not raise an error, and the best way of raising an error for incorrect configuration is simply to not provide any defaults and raise an error as soon as a configuration parameter is missing.

Preventing User Mistakes

Policeman illustration holding up an open palm towards the viewport as if to say 'stop'

Poka Yoke concepts can also be applied to prevent or detect user mistakes. For example in payment software, an account number entered by the user can be validated using the check digit algorithm. This prevents the user from accidentally entering an account number with a typo.

Conclusion

While Poka Yoke is more of a concept rather than a specific set of tools, there are various principles we can apply to our code and development process to make sure mistakes get prevented or detected early on. Very often these mechanisms will be specific to the application itself and its business logic, but there are some simple techniques and tools we can use to make any code more fool-proof.

Probably the most important thing to remember is that while we obviously want to avoid errors in production, they can be very useful during development and we should not be afraid to raise them as soon as possible so mistakes are easier to track down. These errors can be raised either by the code itself, or by separate processes that run separately from our application and monitor it from the outside.

To further reduce errors, we should aim to keep the public interfaces of our code as simple and straightforward as possible.

If you have any more tips on how Poka Yoke can be applied to PHP development or programming in general, feel free to share them in the comments!

Further Reading

Poka Yoke

Poka Yoke in PHP

Sponsors