Some C++ Wisdom Brought to PHP: Resource Acquisition Is Initialization

Jeff_Mott · April 29, 2015, 7:18am

Consider the following:

function doAtomicTask($dbh)
{
    $dbh->beginTransaction();

    // ...

    $dbh->commit();
}

Is there anything wrong here? We begin a transaction and we have a matching commit. But what if there’s an early return?

function doAtomicTask($dbh)
{
    $dbh->beginTransaction();

    if (/* some condition */ true) {
        return;
    }

    $dbh->commit();
}

Or what if an exception is thrown?

function doAtomicTask($dbh)
{
    $dbh->beginTransaction();

    mightThrow();

    $dbh->commit();
}

In both cases, the transaction is left dangling, neither committed nor rolled back. And if later your application needs to perform another transaction, then you might encounter the error, “There is already an active transaction.”

In C++ jargon, a transaction would be considered a resource. A resource is something you must give back once you have finished using it. In C++, the obvious example of a resource is memory that you get from the free store (using new) and have to give back to the free store (using delete). But other examples of resources are files (if you open one, you also have to close it), locks, sockets… or transactions.

Hazardous knee-jerk solution

When people first encounter this problem, they tend to consider it a problem with exceptions rather than with resource management, and they come up with a solution by catching the exception.

function doAtomicTask($dbh)
{
    $dbh->beginTransaction();

    try {
        mightThrow();
    } catch (Exception $e) {
        $dbh->rollBack();
        throw $e; // rethrow so we never fall through to commit()
    }

    $dbh->commit();
}

But this solution doesn’t generalize well. Consider acquiring more resources.

function doAtomicTask($dbh)
{
    $dbh->beginTransaction();
    $fileHandle = fopen('file.txt', 'r');
    flock($fileHandle, LOCK_EX); // acquire an exclusive lock

    try {
        mightThrow();
    } catch (Exception $e) {
        flock($fileHandle, LOCK_UN); // release the lock
        fclose($fileHandle);
        $dbh->rollBack();
        throw $e; // rethrow so the cleanup below does not run a second time
    }

    flock($fileHandle, LOCK_UN); // release the lock
    fclose($fileHandle);
    $dbh->commit();
}

This solves the problem, but it repeats the resource release code, and repetitive code is a maintenance hazard.

Resource acquisition is initialization

Fortunately, we don’t need to litter our code with try/catch statements to deal with resource leaks. A better solution is to acquire a resource in the constructor of some object and release it in the matching destructor. This approach is called Resource Acquisition Is Initialization (RAII), or alternatively “Constructor Acquires, Destructor Releases.”

function doAtomicTask($dbh)
{
    $transaction = new Transaction($dbh);

    mightThrow();

    $dbh->commit();
}

// This class "owns" a transaction resource.
// A transaction is acquired in the constructor and released in the destructor.
class Transaction
{
    private $dbh;

    public function __construct($dbh)
    {
        $this->dbh = $dbh;

        // Acquire
        $this->dbh->beginTransaction();
    }

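    public function __destruct()
    {
        // Release: roll back only if the transaction is still open.
        // Assumes $dbh is a PDO connection, which provides inTransaction().
        if ($this->dbh->inTransaction()) {
            $this->dbh->rollBack();
        }
    }
}

Now whichever way we leave doAtomicTask(), whether an early return or an exception, the destructor for $transaction will be invoked and the transaction released (rolled back, unless it was already committed).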
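
For anyone who wants to try this without a database, here is a minimal runnable sketch. FakeConnection is a hypothetical stand-in, not part of the original post: it only mimics the four PDO methods used here (beginTransaction, commit, rollBack, inTransaction) and records each call. The guard class is named TransactionGuard only to keep it apart from the Transaction class above, and the $mode parameter is likewise an assumption made for the demonstration.

```php
<?php

// Hypothetical stand-in for a PDO connection: it mimics only the methods
// used below and records each call in $log so the order can be inspected.
class FakeConnection
{
    public $log = [];
    private $active = false;

    public function beginTransaction() { $this->active = true;  $this->log[] = 'begin'; }
    public function commit()           { $this->active = false; $this->log[] = 'commit'; }
    public function rollBack()         { $this->active = false; $this->log[] = 'rollBack'; }
    public function inTransaction()    { return $this->active; }
}

// Constructor acquires the transaction, destructor releases it.
class TransactionGuard
{
    private $dbh;

    public function __construct($dbh)
    {
        $this->dbh = $dbh;
        $this->dbh->beginTransaction();
    }

    public function __destruct()
    {
        // Roll back only if nobody committed before we went out of scope.
        if ($this->dbh->inTransaction()) {
            $this->dbh->rollBack();
        }
    }
}

function doAtomicTask($dbh, $mode)
{
    $transaction = new TransactionGuard($dbh);

    if ($mode === 'return') {
        return; // early return
    }
    if ($mode === 'throw') {
        throw new RuntimeException('something went wrong');
    }

    $dbh->commit();
}

// Runs the task in the given mode and returns the recorded calls.
function runTask($mode)
{
    $dbh = new FakeConnection();
    try {
        doAtomicTask($dbh, $mode);
    } catch (RuntimeException $e) {
        // Swallowed here only so the sketch can inspect the log afterwards.
    }
    return $dbh->log;
}
```

In this sketch, runTask('return') and runTask('throw') both record begin followed by rollBack, while runTask('commit') records begin followed by commit, with no rollback attempted afterwards.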

s_molinari · April 29, 2015, 9:16am

Isn’t this pretty much a moot point with PHP, because PHP will release memory and clear resources once the script ends anyway? This is one of the things that makes PHP slow compared to other languages, because initialization and cleanup have to happen with each and every script execution.

Scott

TheRedDevil · April 29, 2015, 9:39am

While this would work in simpler transaction-based systems, I am not sure I would consider it anything other than a band-aid on the initial issue.

Though I guess it mainly depends on how you set up the structure of the classes, i.e. whether the architecture depends on uncaught exceptions for trivial issues, such as the user not filling out a field, or whether that is handled another way.

If it does not, this is no different from handling the rollback of the transaction in the same method that initiated it, since you will return the failure/error anyway.

In the end, if you rely on uncaught exceptions at that stage, the idea makes sense, since it allows you to keep this approach. If not, adding the rollbacks directly makes more sense in my opinion, especially if you deal with multiple transaction layers at the same time.

TomB · April 29, 2015, 9:40am

There are a couple of other caveats to this approach as well: 1) There are a couple of situations where __destruct is not called. 2) In some implementations (I believe FastCGI is a culprit) the script’s CWD in __destruct is different from the one in __construct, which has odd implications if you’re using relative paths and writing to a file in __destruct.

Michael_Morris · April 29, 2015, 12:34pm

By default, yes. But what if you have specific logic you want executed during that process, for any reason?

Jeff_Mott · April 29, 2015, 2:52pm

It’s certainly less of an issue, yes, but you might still need more than one transaction even during a single run of a PHP script. Or you might need to read/write to the same file more than once. Better to clean up our own resources.

s_molinari · April 29, 2015, 2:56pm

Agreed.

Scott

Jeff_Mott · April 29, 2015, 3:07pm

I agree that exceptions shouldn’t be used to communicate validation results. But I’m not sure why we’re assuming that any exception that might bubble through your call stack could only be for a trivial case. Plus, it’s not always up to you to use exceptions or not. Even if you choose to not throw in your own function body, any other function you invoke might throw.

Jeff_Mott · April 29, 2015, 3:14pm

True, such as if someone invokes exit in a destructor. And if someone does that, then I don’t think any trick or technique can save you.

TheRedDevil · April 30, 2015, 1:47pm

Perhaps we are looking at this from two different sides. I am not sure whether you consider that if a transaction fails then everything fails, or whether you look at it from the same perspective as me.

That is, even if one part of a transaction fails, you either roll back to the last savepoint and try again (depending on the error) or commit up to the last savepoint, and then log the part that failed so it can be reviewed and reprocessed later. In short, you still use the data up to the point closest to where it became invalid.

I guess what I am trying to say is: if you know the code you call might throw an exception that could affect your code at run time, why don’t you add a catch for it instead of depending on cleanup as a fatal termination?

Jeff_Mott · April 30, 2015, 2:49pm

Because using automatically invoked destructors is less typing, less prone to error, and less repetitive.

Less typing because you don’t need to write try/catches everywhere you have a resource.
Less prone to error because you don’t have to remember to write try/catches everywhere you have a resource. Nor do you have to remember which resources might be opened and need to be cleaned up. Same goes if there’s an early return. You don’t have to remember to clean up resources before returning, you can just return and let destructors clean up automatically.
And less repetitive because otherwise you’ll often have to repeat resource cleanup code at the end if successful, in a catch if there was an exception, and just before an early return.

TomB · April 30, 2015, 2:54pm

I’m not sure how well PHP implemented finally but in Java the finally block in try/catch gets run regardless of whether there was a return statement in the try or catch block e.g.

try {
	return 1;
}
catch (Exception e) {
	return 2;
}
finally {
	System.out.println("Finally called");
}

In Java the message is still printed. I’m not sure if the same is true in PHP, however.

Edit: PHP appears to behave the same way:

<?php

function test() {
	try {
		return 1;
	}
	catch (Exception $e) {
		return 2;
	}
	finally {
		echo 'TEST';
	}
}

test();

As you would expect, “TEST” is echoed in this script.

Jeff_Mott · April 30, 2015, 3:17pm

Yup, “finally” does indeed try to solve the same sort of problem. And when comparing the “constructor acquires, destructor releases” technique to “finally”, the differences become smaller and more nuanced. Probably “constructor acquires, destructor releases” would still result in less code, because the acquire/release logic would exist in just one place – a constructor and matching destructor – rather than at every occasion where you need to acquire that resource. Or, in Stroustrup’s words, “there are far more resource acquisitions than kinds of resources.”
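
For comparison, here is a rough sketch of the transaction cleanup written with finally (available since PHP 5.5). RecordingConnection is a hypothetical stand-in that only mimics the PDO methods used and records the calls, and the $shouldThrow parameter exists only to drive the demonstration; neither comes from the posts above.

```php
<?php

// Hypothetical stand-in for a PDO connection that records each call.
class RecordingConnection
{
    public $log = [];
    private $active = false;

    public function beginTransaction() { $this->active = true;  $this->log[] = 'begin'; }
    public function commit()           { $this->active = false; $this->log[] = 'commit'; }
    public function rollBack()         { $this->active = false; $this->log[] = 'rollBack'; }
    public function inTransaction()    { return $this->active; }
}

function doAtomicTaskWithFinally($dbh, $shouldThrow)
{
    $dbh->beginTransaction();

    try {
        if ($shouldThrow) {
            throw new RuntimeException('something went wrong');
        }
        $dbh->commit();
    } finally {
        // Runs on normal exit, early return, and exception alike.
        // Unlike a destructor, this block has to be written again
        // at every place that begins a transaction.
        if ($dbh->inTransaction()) {
            $dbh->rollBack();
        }
    }
}
```

The cleanup is just as reliable here, but it lives at the call site, so every function that begins a transaction needs its own try/finally.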

EDIT: Another useful behavior (though possibly less so in the PHP world) is that acquiring in a constructor and releasing in a destructor gives us explicit ownership of a resource. We might lock a file, open a socket, or connect to a database in one function, then defer the closing of that resource to a later time and place. Usually it’s the programmer’s responsibility to remember that a function acquired some resource, and that we’re responsible for releasing it later on. But if instead the lifetime of the resource is tied to the lifetime of an object, then we can return and pass around that object, and once we’re done with it (the object’s reference count goes to 0), then the resource gets released automatically.
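
A small sketch of that ownership idea, using a file lock. FileLock, openLockedFile() and canLock() are hypothetical names made up for this illustration, and the sketch assumes flock() locks taken through separately opened handles conflict with each other, as they do with the flock system call on Linux.

```php
<?php

// Hypothetical helper class: owns an exclusive lock on a file for as long
// as the object is alive.
class FileLock
{
    private $handle;

    public function __construct($path)
    {
        // 'c' opens for writing, creates the file if missing, never truncates.
        $handle = fopen($path, 'c');
        if ($handle === false) {
            throw new RuntimeException("Cannot open $path");
        }
        if (!flock($handle, LOCK_EX)) {
            fclose($handle);
            throw new RuntimeException("Cannot lock $path");
        }
        $this->handle = $handle;
    }

    public function __destruct()
    {
        // Release the lock and the handle when the last reference goes away.
        if (is_resource($this->handle)) {
            flock($this->handle, LOCK_UN);
            fclose($this->handle);
        }
    }
}

// The lock is acquired here, but it is released wherever the last
// reference to the returned object disappears.
function openLockedFile($path)
{
    return new FileLock($path);
}

// Demonstration helper: can a second handle take the lock right now?
function canLock($path)
{
    $probe = fopen($path, 'c');
    $acquired = flock($probe, LOCK_EX | LOCK_NB);
    if ($acquired) {
        flock($probe, LOCK_UN);
    }
    fclose($probe);
    return $acquired;
}
```

A caller can keep the object returned by openLockedFile() for as long as it needs the lock, pass it to other functions, and then simply unset() it or let it fall out of scope; nobody has to remember to call flock() and fclose() by hand.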
