Implementing a Unit of Work – Handling Domain Objects through a Transactional Model

Even in the most basic scenario you can picture, where the logic of an application’s core is boiled down to just pulling in a few records from the database, bringing some domain objects to life, and then dumping them to the screen through an API of some basic rendering mechanism, there’s always an ongoing transaction behind the scenes whose most expensive facet often gets blurred beneath the appealing outward influence of the user interface.

If you think this through, you’ll notice that the crux of the matter lies, not surprisingly, in the heap of database trips, even though they can be largely mitigated by a clever caching strategy. In relatively small applications, where there are just a few basic domain objects involved in each transaction, and where the hike to the database is just for retrieving data most of the time, a simple caching system dropped into the appropriate place can certainly help get things sorted out efficiently.

Sad but true, reality is a ruthless creature, always shouting at us things radically different from the sweet ones we’d rather hear. In most cases, because of the intrinsic, unavoidable mutability of domain objects (with a few scarce exceptions when the dependencies of domain classes are modelled around the concept of immutable Value Objects), chances are that some objects will need to be modified across multiple requests, and even new ones will be put in memory in response to some user-related event.

In short, this means that even dummy CRUD applications that don’t encapsulate extensive chunks of additional business logic can quickly become bloated and generate a lot of overhead under the hood when it comes to performing multiple database writes. What if they reach a point where it’s necessary to handle a huge number of domain objects, which must be persisted and removed in sync, without compromising what we programming plebs loosely call data integrity?

Let’s be honest with ourselves (at least once). Neither all the lofty data source architectural patterns that we could just pick up along the way, nor that cool new approach we might have figured out overnight, can tackle satisfactorily something as predictable and mundane as writing out and removing multiple sets of data from storage. In light of this, should we just give up and call the issue pretty much a lost cause?

Admittedly, the question is rhetorical. In fact, it’s feasible to wrap collections of domain objects inside a fairly flexible business transactional model and just perform several database writes/deletes in one go, thereby avoiding having to break down the process into more atomic and expensive database calls, which tends to lead to the session-per-operation antipattern. Moreover, this transaction-based mechanism rests on the academic formalities of a design pattern commonly known as Unit of Work (UOW), which has been implemented in several popular enterprise-level packages, such as Hibernate.

On the flip side, PHP still has few UOWs running in production, except in a few well-trusted libraries like Doctrine and RedBeanPHP, which use the pattern at different levels in order to process and coordinate operations on entities. Despite this, it would certainly be educational to take a closer look at the benefits a UOW provides, so you can see if they meet your requirements.

Registering Domain Objects with a Unit of Work

In his book Patterns of Enterprise Application Architecture, Martin Fowler discusses two mainstream approaches that can be followed when it comes to implementing a UOW: the first makes the UOW directly responsible for registering or queuing domain objects for insertion, update, or deletion, and the second shifts this responsibility over to the domain objects themselves.

In this case, since I’d like to have the domain model only encapsulating my business logic and remain agnostic about any form of persistence that may exist further down in other layers, I’m going to just stick to the commandments of the first option. In either case, you’re free to pick the approach you feel will fit the bill the best.
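For comparison, the second option (object registration) could be sketched roughly as follows. This is only an illustration: SimpleUnitOfWork and TrackedEntity are hypothetical names that are not part of the implementation built in this article, and the states are tracked with a bare SplObjectStorage.

```php
<?php
// Hypothetical sketch of "object registration": the entity itself reports
// its state changes to the unit of work. None of these classes are part of
// the article's code.
class SimpleUnitOfWork
{
    private $states;

    public function __construct() {
        $this->states = new SplObjectStorage();
    }

    public function register($entity, $state) {
        // assigning to an object that is already tracked overwrites its state
        $this->states[$entity] = $state;
    }

    public function stateOf($entity) {
        return isset($this->states[$entity]) ? $this->states[$entity] : null;
    }
}

class TrackedEntity
{
    private $unitOfWork;
    private $name;

    public function __construct(SimpleUnitOfWork $unitOfWork, $name, $state = "NEW") {
        $this->unitOfWork = $unitOfWork;
        $this->name = $name;
        $this->unitOfWork->register($this, $state);
    }

    public function rename($name) {
        $this->name = $name;
        // a brand-new object stays NEW; only already persisted ones become DIRTY
        if ($this->unitOfWork->stateOf($this) !== "NEW") {
            $this->unitOfWork->register($this, "DIRTY");
        }
    }
}

$uow = new SimpleUnitOfWork();
$loaded = new TrackedEntity($uow, "John", "CLEAN"); // pretend it came from the database
$fresh = new TrackedEntity($uow, "Mary");

$loaded->rename("Joe");
$fresh->rename("Maria");

echo $uow->stateOf($loaded), PHP_EOL; // DIRTY
echo $uow->stateOf($fresh), PHP_EOL;  // NEW
```

The trade-off is visible even in this small sketch: client code no longer has to remember to call a registering method, but the entity now carries a dependency on the persistence machinery.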

A lightweight implementation of a UOW might look like this:

<?php
namespace Model\Repository;
use Model\EntityInterface;

interface UnitOfWorkInterface
{
    public function fetchById($id);
    public function registerNew(EntityInterface $entity);
    public function registerClean(EntityInterface $entity);
    public function registerDirty(EntityInterface $entity);
    public function registerDeleted(EntityInterface $entity);
    public function commit();
    public function rollback();
    public function clear();
}
<?php
namespace Model\Repository;
use Mapper\DataMapperInterface,
    Library\Storage\ObjectStorageInterface,
    Model\EntityInterface;

class UnitOfWork implements UnitOfWorkInterface
{
    const STATE_NEW     = "NEW";
    const STATE_CLEAN   = "CLEAN";
    const STATE_DIRTY   = "DIRTY";
    const STATE_REMOVED = "REMOVED";
    
    protected $dataMapper;
    protected $storage;

    public function __construct(DataMapperInterface $dataMapper, ObjectStorageInterface $storage) {
        $this->dataMapper = $dataMapper;
        $this->storage = $storage;
    }
    
    public function getDataMapper() {
        return $this->dataMapper;
    }
    
    public function getObjectStorage() {
        return $this->storage;
    }
    
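    public function fetchById($id) {
        $entity = $this->dataMapper->fetchById($id);
        // the mapper returns null when no row matches, so only register real entities
        if ($entity !== null) {
            $this->registerClean($entity);
        }
        return $entity;
    }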
    
    public function registerNew(EntityInterface $entity) {
        $this->registerEntity($entity, self::STATE_NEW);
        return $this;
    }
    
    public function registerClean(EntityInterface $entity) {
        $this->registerEntity($entity, self::STATE_CLEAN);
        return $this;
    }
    
    public function registerDirty(EntityInterface $entity) {
        $this->registerEntity($entity, self::STATE_DIRTY);
        return $this;
    }
    
    public function registerDeleted(EntityInterface $entity) {
        $this->registerEntity($entity, self::STATE_REMOVED);
        return $this;
    }
    
    protected function registerEntity($entity, $state = self::STATE_CLEAN) {
        $this->storage->attach($entity, $state);
    }
    
    public function commit() {
        foreach ($this->storage as $entity) {
            switch ($this->storage[$entity]) {
                case self::STATE_NEW:
                case self::STATE_DIRTY: 
                    $this->dataMapper->save($entity);
                    break;
                case self::STATE_REMOVED:
                    $this->dataMapper->delete($entity);
            }
        }
        $this->clear();
    }
    
    public function rollback() {
        // your custom rollback implementation goes here
    }
    
    public function clear() {
        $this->storage->clear();
        return $this;
    }  
}

It should be clear to see that a UOW is nothing but plain, in-memory object storage which keeps track of which domain objects should be scheduled for insertion, update, and removal. In short, the convention could be boiled down to something along these lines: domain objects that need to be added to the storage will be registered “NEW”; those being updated will be marked “DIRTY”; the ones flagged “REMOVED” will be… yep, dropped from the database. In addition, any object registered “CLEAN” will be kept frozen and safe in memory until the client code explicitly requests to modify its associated state.
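One detail worth spelling out is that registering the same object twice doesn’t create a second entry; the newer state simply replaces the older one. The following is a minimal sketch of that bookkeeping using SplObjectStorage directly, with a stdClass object standing in for a real entity (an assumption made only to keep the example self-contained). It uses array-style assignment, which the PHP manual documents as an alias of attach().

```php
<?php
// Sketch of the state bookkeeping only; stdClass stands in for a real entity.
$storage = new SplObjectStorage();

$user = new stdClass();
$user->name = "John Doe";

$storage[$user] = "CLEAN";   // roughly what fetchById() does
$user->name = "Joe";
$storage[$user] = "DIRTY";   // roughly what registerDirty() does

echo count($storage), PHP_EOL;   // 1, the object is tracked only once
echo $storage[$user], PHP_EOL;   // DIRTY
```

This is why fetching an object (which marks it “CLEAN”) and later calling registerDirty() on it leads to a single update on commit rather than two conflicting entries.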

Of course, the method that performs these persistence-related operations in just one single transaction is commit(), which relies on a still undefined data mapper to get access to the persistence layer. It will be even easier to understand the UOW’s inner workings if I show you the implementation of the collaborators injected in its constructor, so here are the components that compose the object storage module:

<?php
namespace Library\Storage;

interface ObjectStorageInterface extends \Countable, \Iterator, \ArrayAccess
{
    public function attach($object, $data = null);
    public function detach($object);
    public function clear();
}
<?php
namespace Library\Storage;

class ObjectStorage extends \SplObjectStorage implements ObjectStorageInterface
{
    public function clear() {
        // removeAll() needs a separate storage to read from, hence the clone
        $this->removeAll(clone $this);
    }
}

In this case in particular, I decided to use a slightly-customized implementation of the SplObjectStorage class for registering domain objects without much fuss along with their related states with the UOW, even though pretty much the same can be also achieved using plain arrays. Again, it’s up to you to have the domain objects registered by using the method that best accommodates your needs.
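To illustrate the plain-array alternative mentioned above, here is a rough sketch. ArrayObjectTracker is a hypothetical class, not part of the article’s code; it keys both the objects and their states by spl_object_hash().

```php
<?php
// Hypothetical array-backed tracker; not part of the article's code base.
class ArrayObjectTracker
{
    private $entities = array();
    private $states = array();

    public function attach($entity, $state) {
        $hash = spl_object_hash($entity);
        // holding on to the object keeps it alive, so its hash cannot be
        // handed out to a different object while it is being tracked
        $this->entities[$hash] = $entity;
        $this->states[$hash] = $state;
    }

    public function getState($entity) {
        $hash = spl_object_hash($entity);
        return isset($this->states[$hash]) ? $this->states[$hash] : null;
    }

    public function clear() {
        $this->entities = array();
        $this->states = array();
    }

    public function count() {
        return count($this->entities);
    }
}

$tracker = new ArrayObjectTracker();
$user = new stdClass();

$tracker->attach($user, "NEW");
$tracker->attach($user, "DIRTY"); // same object, so the state is replaced

echo $tracker->count(), PHP_EOL;         // 1
echo $tracker->getState($user), PHP_EOL; // DIRTY
```

The array version works, but it has to manage two parallel arrays by hand, which is essentially what SplObjectStorage already does internally.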

With the custom ObjectStorage class in place, let’s take a look at the implementation of the aforementioned data mapper:

<?php
namespace Mapper;
use Model\EntityInterface;

interface DataMapperInterface
{
    public function fetchById($id);
    public function fetchAll(array $conditions = array());
    public function insert(EntityInterface $entity);
    public function update(EntityInterface $entity);
    public function save(EntityInterface $entity);
    public function delete(EntityInterface $entity);
}
<?php
namespace Mapper;
use Library\Database\DatabaseAdapterInterface,
    Model\Collection\EntityCollectionInterface,
    Model\EntityInterface;

abstract class AbstractDataMapper implements DataMapperInterface
{
    protected $adapter;
    protected $collection;
    protected $entityTable;
    
    public function __construct(DatabaseAdapterInterface $adapter, EntityCollectionInterface $collection, $entityTable = null) {
        $this->adapter = $adapter;
        $this->collection = $collection;
        if ($entityTable !== null) {
            $this->setEntityTable($entityTable);
        }
    }
        
    public function setEntityTable($entityTable) {
        if (!is_string($entityTable) || empty($entityTable)) {
            throw new \InvalidArgumentException(
                "The entity table is invalid.");
        }
        $this->entityTable = $entityTable;
        return $this;
    }
    
    public function fetchById($id) {
        $this->adapter->select($this->entityTable, 
            array("id" => $id));
        if (!$row = $this->adapter->fetch()) {
            return null; 
        }
        return $this->loadEntity($row);
    }
    
    public function fetchAll(array $conditions = array()) {
        $this->adapter->select($this->entityTable, $conditions);
        $rows = $this->adapter->fetchAll();
        return $this->loadEntityCollection($rows);
    }
    
    public function insert(EntityInterface $entity) {
        return $this->adapter->insert($this->entityTable,
            $entity->toArray());
    }
    
    public function update(EntityInterface $entity) {
        return $this->adapter->update($this->entityTable,
            $entity->toArray(), "id = $entity->id");
    }
    
    public function save(EntityInterface $entity) {
        return !isset($entity->id) 
            ? $this->adapter->insert($this->entityTable,
                $entity->toArray()) 
            : $this->adapter->update($this->entityTable,
                $entity->toArray(), "id = $entity->id");   
    }
    
    public function delete(EntityInterface $entity) {
        return $this->adapter->delete($this->entityTable,
            "id = $entity->id");
    }
    
    protected function loadEntityCollection(array $rows) {
        $this->collection->clear();
        foreach ($rows as $row) {
            $this->collection[] = $this->loadEntity($row);
        }
        return $this->collection;
    }
    
    abstract protected function loadEntity(array $row);
}

The AbstractDataMapper puts the bulk of the logic required for pulling domain objects in and out of the database behind a pretty standard API. To make things even easier, it’d also be nice to derive a refined implementation of it, so we can easily test the UOW with a few sample user objects. Here’s how this extra mapping subclass looks:

<?php
namespace Mapper;
use Model\User;

class UserMapper extends AbstractDataMapper
{
    protected $entityTable = "users";
    
    protected function loadEntity(array $row) {
        return new User(array(
            // User::setId() insists on an integer, and database drivers often return strings
            "id"    => (int) $row["id"], 
            "name"  => $row["name"], 
            "email" => $row["email"],
            "role"  => $row["role"]));
    }
}

At this point we could just put our hands on the UOW and see if its transactional schema delivers what it promises. But before we do, we really should drop at least a few domain objects in memory. That way, we can get them neatly registered with the UOW. So let’s now define a primitive Domain Model which will be charged with supplying the objects in question.

Defining a Basic Domain Model

Frankly speaking, there are several ways to implement a functional Domain Model (most likely there exists one per developer living and breathing out there). Since in this case I want the process to be both painless and short, the model I’ll be using for testing the UOW will be composed of just a prototypical entity class, along with a derivative, which will be charged with spawning basic user objects:

<?php
namespace Model;

interface EntityInterface
{
    public function setField($name, $value);
    public function getField($name);
    public function fieldExists($name);
    public function removeField($name);
    public function toArray();      
}
<?php
namespace Model;
use InvalidArgumentException;

abstract class AbstractEntity implements EntityInterface
{
    protected $fields = array(); 
    protected $allowedFields = array(); 

    public function __construct(array $fields = array()) {
        if (!empty($fields)) {
            foreach ($fields as $name => $value) {
                $this->$name = $value;
            } 
        }
    }
    
    public function setField($name, $value) {
        return $this->__set($name, $value);
    }
    
    public function getField($name) {
        return $this->__get($name);
    }
    
    public function fieldExists($name) {
        return $this->__isset($name);
    }
    
    public function removeField($name) {
        return $this->__unset($name);
    }
    
    public function toArray() {
        return $this->fields;
    }
             
    public function __set($name, $value) {
        $this->checkAllowedFields($name);
        $mutator = "set" . ucfirst(strtolower($name));
        if (method_exists($this, $mutator) && 
            is_callable(array($this, $mutator))) {
            $this->$mutator($value);
        }
        else {
            $this->fields[$name] = $value;
        }
        return $this;                 
    }
    
    public function __get($name) {
        $this->checkAllowedFields($name);
        $accessor = "get" . ucfirst($name);
        if (method_exists($this, $accessor) &&
            is_callable(array($this, $accessor))) {
            return $this->$accessor();
        }
        if (!$this->__isset($name)) {
            throw new InvalidArgumentException(
                "The field '$name' has not been set for this entity yet.");
        }
        return $this->fields[$name];
    }
    
    public function __isset($name) {
        $this->checkAllowedFields($name);
        return isset($this->fields[$name]);
    }
    
    public function __unset($name) {
        $this->checkAllowedFields($name);
        if (!$this->__isset($name)) {
            throw new InvalidArgumentException(
                "The field '$name' has not been set for this entity yet.");
        }
        unset($this->fields[$name]);
        return $this;
    }
    
    protected function checkAllowedFields($field) {
        if (!in_array($field, $this->allowedFields)) {
            throw new InvalidArgumentException(
                "The requested operation on the field '$field' is not allowed for this entity.");
        }
    }
}
<?php
namespace Model;
use BadMethodCallException,
    InvalidArgumentException;

class User extends AbstractEntity
{
    const ADMINISTRATOR_ROLE = "Administrator";
    const GUEST_ROLE         = "Guest";
    
    protected $allowedFields = array("id", "name", "email", "role");
    
    public function setId($id) {
        if (isset($this->fields["id"])) {
            throw new BadMethodCallException(
                "The ID for this user has been set already.");
        }
        if (!is_int($id) || $id < 1) {
            throw new InvalidArgumentException(
                "The user ID is invalid.");
        }
        $this->fields["id"] = $id;
        return $this;
    }
    
    public function setName($name) {
        if (strlen($name) < 2 || strlen($name) > 30) {
            throw new InvalidArgumentException(
                "The user name is invalid.");
        }
        $this->fields["name"] = htmlspecialchars(trim($name),
            ENT_QUOTES);
        return $this;
    }
    
    public function setEmail($email) {
        if (!filter_var($email, FILTER_VALIDATE_EMAIL)) {
            throw new InvalidArgumentException(
                "The user email is invalid.");
        }
        $this->fields["email"] = $email;
        return $this;
    }
    
    public function setRole($role) {
        if ($role !== self::ADMINISTRATOR_ROLE &&
            $role !== self::GUEST_ROLE) {
            throw new InvalidArgumentException(
                "The user role is invalid.");
        }
        $this->fields["role"] = $role;
        return $this;
    }
}

While the implementations of the AbstractEntity and User classes might look complex at first glance, I assure you this is just a fuzzy impression. In fact, the former is a skeletal wrapper for some typical PHP magic methods, while the latter encapsulates some straightforward mutators, in order to assign the appropriate values to the fields of generic user objects.
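The routing from a magic method to a mutator is easier to follow in isolation. Below is a stripped-down sketch of the same idea, assuming a made-up Point class with a single validated field; it leaves out the allowed-fields check that AbstractEntity performs.

```php
<?php
// Stripped-down illustration of the __set()/mutator routing.
// Point is a made-up class used only for this example.
class Point
{
    private $fields = array();

    public function __set($name, $value) {
        $mutator = "set" . ucfirst($name);
        if (method_exists($this, $mutator)) {
            $this->$mutator($value);         // validated path
        } else {
            $this->fields[$name] = $value;   // plain assignment
        }
    }

    public function __get($name) {
        return isset($this->fields[$name]) ? $this->fields[$name] : null;
    }

    public function setX($x) {
        if (!is_int($x)) {
            throw new InvalidArgumentException("x must be an integer.");
        }
        $this->fields["x"] = $x;
    }
}

$point = new Point();
$point->x = 10;        // routed through setX()
$point->label = "A";   // there is no setLabel(), so the value is stored directly

try {
    $point->x = "ten"; // rejected by setX()
    $rejected = false;
} catch (InvalidArgumentException $e) {
    $rejected = true;
}
```

Client code keeps the convenience of property syntax, while every field that has a mutator is still validated on the way in.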

With these domain classes already doing their business in relaxed insulation, let’s now add the last building block of the model. In reality, this is an optional class which can be skipped if the situation warrants, and its responsibility is just to wrap collections of entities. Its implementation is as follows:

<?php
namespace Model\Collection;
use Model\EntityInterface;

interface EntityCollectionInterface extends \Countable, \ArrayAccess, \IteratorAggregate
{
    public function add(EntityInterface $entity);
    public function remove(EntityInterface $entity);
    public function get($key);
    public function exists($key);
    public function clear();
    public function toArray();
}
<?php
namespace Model\Collection;
use ArrayIterator,
    InvalidArgumentException,
    Model\EntityInterface;
    
class EntityCollection implements EntityCollectionInterface
{
    protected $entities = array();
    
    public function __construct(array $entities = array()) 
    {
        if (!empty($entities)) {
            $this->entities = $entities;
        }
    }
    
    public function add(EntityInterface $entity) {
        // a null key tells offsetSet() to append the entity
        $this->offsetSet(null, $entity);
    }
    
    public function remove(EntityInterface $entity) {
        $this->offsetUnset($entity);
    }
    
    public function get($key) {
        return $this->offsetGet($key);
    }
    
    public function exists($key) {
        return $this->offsetExists($key);
    }
    
    public function clear() {
        $this->entities = array();
    }
    
    public function toArray() {
        return $this->entities;
    }
    
    public function count() {
        return count($this->entities);
    }
    
    public function offsetSet($key, $entity)
    {
        if (!$entity instanceof EntityInterface) {
            throw new InvalidArgumentException(
                "Could not add the entity to the collection.");
        }
        if (!isset($key)) {
            $this->entities[] = $entity;
        }
        else {
            $this->entities[$key] = $entity;
        }
    }
    
    public function offsetUnset($key) {
        if ($key instanceof EntityInterface) {
            $this->entities = array_filter($this->entities, 
                function ($v) use ($key) {
                    return $v !== $key;
                });
        }
        else if (isset($this->entities[$key])) {
            unset($this->entities[$key]);
        }
    }
    
    public function offsetGet($key) {
        if (isset($this->entities[$key])) {
            return $this->entities[$key];
        }
    }
    
    public function offsetExists($key) {
        // in_array() avoids the pitfall of array_search() returning index 0
        return $key instanceof EntityInterface 
            ? in_array($key, $this->entities, true) 
            : isset($this->entities[$key]);
    }
    
    public function getIterator() {
        return new ArrayIterator($this->entities);
    }
}

At this point we’ve managed to create a primitive domain model, which we can certainly use for engendering user objects without a major hassle. In doing so, we have a real chance to see if the UOW is actually the functional component it seems to be when it comes to persisting multiple entities in the database as one single transaction.

Putting the UOW Under Test

If you’ve reached this point of the article, you probably feel like you’re being pulled in opposite directions, wondering if all of the hard up-front work required in writing a bunch of interfaces and classes was really worth it. In fact, it was. Moreover, if you’re still skeptical, make sure to check the following code snippet, which shows how to put the UOW to work in sweet synchrony with some naïve user objects:

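<?php
// The Library\Loader and Library\Database namespaces are assumed from the
// directory layout; adjust them to match your own autoloader setup.
use Library\Loader\Autoloader,
    Library\Database\PdoAdapter,
    Library\Storage\ObjectStorage,
    Mapper\UserMapper,
    Model\Collection\EntityCollection,
    Model\Repository\UnitOfWork,
    Model\User;
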
require_once __DIR__ . "/Library/Loader/Autoloader.php";
$autoloader = new Autoloader;
$autoloader->register();

$adapter = new PdoAdapter("mysql:dbname=test", "myfancyusername",
    "myhardtoguesspassword");

$unitOfWork = new UnitOfWork(new UserMapper($adapter,
    new EntityCollection), new ObjectStorage);

$user1 = new User(array("name" => "John Doe", 
    "email" => "john@example.com"));
$unitOfWork->registerNew($user1);

$user2 = $unitOfWork->fetchById(1);
$user2->name = "Joe";
$unitOfWork->registerDirty($user2);

$user3 = $unitOfWork->fetchById(2);
$unitOfWork->registerDeleted($user3);

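// $user4 is only registered as clean by fetchById(), so without a call to
// registerDirty() this change is not written out by commit()
$user4 = $unitOfWork->fetchById(3);
$user4->name = "Julie";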

$unitOfWork->commit();

Leaving aside some irrelevant details, such as assuming there’s effectively a PDO adapter living somewhere, the driving logic of the earlier script should be fairly easy to assimilate. Simply put, it shows off how to get things rolling with the UOW, which drags in some user objects from the database and queues them for insertion, update, and deletion by using the corresponding registering methods. At the end of the process, commit() just loops internally over the registered objects and performs the proper operations all in one go.
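Since rollback() was left as a stub in the UOW above, here is one possible way to think about it. The sketch below is an assumption-laden illustration, not the article’s implementation: InMemoryStore and commitAll() are hypothetical, and the store merely mimics the beginTransaction(), commit(), and rollBack() trio that PDO exposes. A real version would delegate those calls to the database adapter.

```php
<?php
// Hypothetical in-memory store that mimics begin/commit/rollBack by
// snapshotting an array. It only illustrates the all-or-nothing idea.
class InMemoryStore
{
    private $rows = array();
    private $snapshot = null;

    public function beginTransaction() { $this->snapshot = $this->rows; }
    public function commit() { $this->snapshot = null; }

    public function rollBack() {
        $this->rows = $this->snapshot;
        $this->snapshot = null;
    }

    public function save($id, $name) {
        if ($name === "") {
            throw new InvalidArgumentException("Empty names cannot be saved.");
        }
        $this->rows[$id] = $name;
    }

    public function delete($id) { unset($this->rows[$id]); }
    public function all() { return $this->rows; }
}

// Applies every queued operation, or none of them if one fails.
function commitAll(InMemoryStore $store, array $operations) {
    $store->beginTransaction();
    try {
        foreach ($operations as $operation) {
            list($type, $id, $name) = $operation;
            if ($type === "save") {
                $store->save($id, $name);
            } else {
                $store->delete($id);
            }
        }
        $store->commit();
        return true;
    } catch (Exception $e) {
        $store->rollBack();
        return false;
    }
}

$store = new InMemoryStore();
$ok = commitAll($store, array(
    array("save", 1, "Joe"),
    array("delete", 2, null),
));
$failed = commitAll($store, array(
    array("save", 3, "Julie"),
    array("save", 4, ""),   // fails, so "Julie" is rolled back as well
));
```

After both calls the store contains only the row for "Joe", because the second batch was undone as a whole.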

While in a standard implementation a UOW does expose the typical set of registering methods that we’d expect to see, its formal definition doesn’t provide any kind of finder. In this case, however, I intentionally decided to implement a generic one so you can see more clearly how to pull in objects from storage and in turn register them with the UOW without struggling with the oddities of a standalone, closer-to-the-domain structure, such as a Repository or even an overkill Service.

Closing Thoughts

Now that you’ve peeked behind the curtain at a UOW and learned how to implement a naïve one from scratch, let your wild side show and tweak it at your will.

Keep in mind, though, that while the pattern has its benefits, it’s far from being a panacea that will solve all of the issues associated with massive accesses to the persistence layer. In enterprise-level applications that must perform expensive database writes in several places, however, a UOW provides an effective, transactional-like approach that reduces the underlying overhead, hence becoming a solid, multifaceted solution when properly coupled to a caching mechanism.

Image via Zhukov Oleg / Shutterstock


  • Kyle

    Great article as always, Alejandro! You could also inject the UnitOfWork class into the entity, and allow the entity to register itself. Doing that makes the contract with the entity a little cleaner, imo, and take less lines to instantiate if you’re using a DIC. Of course it has the pitfall of behaving “magically”, so there’s an argument to be made for both I suppose.

    • Alex Gervasio

      Glad you liked the post, Kyle. And you’re absolutely right: it’s possible to inject the UoW into the entities’ internals and given them with the ability for neatly registering themselves, “behind the scenes”. I really enjoy a lot getting comments like yours, because they come in useful for showing different slants that can be consumed when it comes down to implement a pattern in particular. Thanks for the insights :)

  • Andrew

    Thank you, excelent article. Help me understand Doctrine internals, little bit :) I’m thinking about entity relations. Where should be logic for commiting related entities? Have to be saved in specific order according type. UnitOfWork holds this “strategy” how to save?

    • Alex Gervasio

      Hey Andrew,
      Thanks for the comments. Well, as you know, there’re many ways to skin a cat and certainly this goes with any UoW implementation. Anyway, in most cases all the logic that handles the relationships between domain objects is encapsulated in the data mappers, thus freeing up the UoW of having too many responsibilities, hence just focusing on registering the domain objects. The UoW usually acts like simple a transactional wrapper, which shoots the queued operations in one go, without worrying if they will ever impact one or multiple entities in the storage.
      Hope that helps. Thanks!

  • Benoit

    Hi Alex,

    First, thanks for this great article.
    I have one question, how do you handle many-to-many, many-to-one, etc.. relationships between tables ? Thanks for your help ;)

    • Alex Gervasio

      Hey Benoit,
      Glad you enjoyed the writeup. As I posted before, if your UOW implementation is coupled to a batch of data mappers, all the burdens of handling table relationships is delegated right to them. For obvious reasons, this is cry away for being a banal, school-like task that can be tackled in a jiffy. Relational data mappers are complex to set up creatures indeed. That’s why 3rd party libraries like Doctrine exist in the first place, as they neatly do the hard leg work and handle from top to bottom all the relationships for you. Anyway, if you’re reluctant to climb up Doctrine’s learning curve (or the one of any other package) and just need to handle a few simple relationships, (one-to-one, one-to-many), you can embrace the challenge and set up your own set of data mappers. If you’re interested in the topic, I recently wrote a tutorial on how to accomplish this with minor hassles here http://phpmaster.com/integrating-the-data-mappers/.
      Thanks for the feedback!