Persistence and the 'new' keyword: create new object or retrieve persisted object?

I am trying to work out the most elegant way to instantiate a persisted object. I have a system where people can prepare and edit documents. It is similar to a shopping cart system: users create a new document, add items to it or remove them, and then submit the document. So the document behaves like a shopping cart in that its content is preserved across the session, so users can come back to it at any time. The only difference is that one document is shared among all users within a location, so they all create and edit a single document, which is then committed to the db (currently there are 2 locations, so in fact there are 2 separate documents, each edited by its respective group). So far this has been easy. Here is a simplified way of instantiating the document object:


$doc = new Document($user, $location); // $user is a User object

This statement does one of two things:

  1. If there is no document created for the given location, then create it and store it in a session

  2. If there is a document for the given location stored in a session then retrieve it (don’t create a new one)
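To make the two scenarios concrete, here is a rough sketch of the "retrieve or create" logic written as a plain function. It is only an illustration: the array $session stands in for $_SESSION, and the function name openDocument and the 'documents' key are made up for this example, not taken from the real system.

```php
<?php
// Hypothetical sketch: $session stands in for $_SESSION; the 'documents'
// key and the function name are invented for illustration.

class Document
{
    public $location;
    public $items = array();

    public function __construct($location)
    {
        $this->location = $location;
    }
}

function openDocument(array &$session, $location)
{
    // Scenario 2: a document for this location is already in the session.
    if (isset($session['documents'][$location])) {
        return $session['documents'][$location];
    }

    // Scenario 1: nothing stored yet, so create it and remember it.
    $document = new Document($location);
    $session['documents'][$location] = $document;

    return $document;
}

$session = array();
$first = openDocument($session, 'A');
$first->items[] = 'item 1';
$second = openDocument($session, 'A'); // same object that $first refers to
```

Both branches live in one place here, which is exactly the mixing of responsibilities the question is about.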

Now that I want to extend the system I find mixing these two responsibilities a bit messy. I want to add the functionality to be able to correct a document committed in the past. So in the session there can be 3 documents open for editing: a normal one for location A, a normal one for location B, and a correction document shared between both locations. Now the constructor would need to look like this:


// constructor signature:
// __construct(User $user, $location = null, Document $docToCorrect = null)
$doc = new Document($user, $location, $docToCorrect);

To create a new correction document, $docToCorrect has to be passed so that the object knows which document is being corrected, and $location becomes redundant because I want users from both locations to be able to correct any document in the db. But once the document has been created and persisted in the session, it seems I need to pass the exact same parameters on every subsequent instantiation so the constructor can tell whether it should build a new object or return the existing one from the session. And because the real-world system has more variables than this simplified example, the constructor has become really messy trying to be intelligent enough to decide whether to create a new document or pull an existing one from the session.

So the main question is this: can the constructor share these 2 responsibilities (creating a new doc and retrieving an existing one from a session), or should I delegate the creation of a new document to a different method or class?

For example, in the Propel ORM I see a clear distinction between these two, because the ‘new’ keyword always creates a new object (=new data row), for example:

$book = new Book;

If I want to retrieve an existing (persisted) book, I use separate Peer classes and I need to provide an id:

$book = BookPeer::retrieveByPK(10);

However, some people use the Active Record pattern, where the ‘new’ keyword retrieves an existing object:

$book = new Book(10); // fetches Book from db with PK=10

Therefore, I’m not sure which way to go. In my case, I think this would work well: when I want to create a new document, then I create the object like this:


// here I pass all the necessary parameters depending on what
// type of document I want to create (create = begin editing)
// signature: create(User $user, $location = null, Document $docToCorrect = null)
DocumentCreator::create($user, $location, $docToCorrect);

This creates a new object and stores it in a session. Then on subsequent pages I just instantiate the document object like this:


// when I want to edit a normal document (one created from scratch)
$document = new Document($user, 'normal');

// when I want to edit a correction document
$document = new Document($user, 'correction');

Each of these would throw an exception if a document of a given type (normal or correction) has not been created for the given user by the DocumentCreator class.

But I’m not sure if it’s the best way. Perhaps I could extend the Document class with a CorrectionDocument class that would have a separate constructor for correction documents? Or maybe have two separate classes, Document and CorrectionDocument, each sharing the same interface? I hope I have explained it clearly enough. I’m sure this problem must be known in the OOP world and I hope there is a pattern to deal with it in an elegant way.

Off Topic:

Do you realize that what you’re describing has already been done really, really well by the folks at Mercurial?

That is one solution, but then the parameters are not available to the constructor. I once saw a simple pattern for such cases where a config object was passed to the constructor with all parameters in a very readable form. I think I might prefer that. Or maybe it was an array that was parsed by a config object in the constructor.
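For what it’s worth, the config-object idea could look roughly like the sketch below. Everything in it is an assumption made for illustration: the class name DocumentConfig, the option keys ('type', 'location', 'docToCorrect') and the placeholder id 42 are not from any real library or from the actual system.

```php
<?php
// Hypothetical sketch; DocumentConfig and its option keys are invented.

class DocumentConfig
{
    private $values;

    public function __construct(array $values)
    {
        // Defaults for anything the caller leaves out.
        $defaults = array(
            'type' => 'normal',
            'location' => null,
            'docToCorrect' => null,
        );

        // Reject misspelled or unsupported options early.
        $unknown = array_diff_key($values, $defaults);
        if ($unknown) {
            throw new InvalidArgumentException(
                'Unknown option(s): ' . implode(', ', array_keys($unknown))
            );
        }

        $this->values = array_merge($defaults, $values);
    }

    public function get($name)
    {
        return $this->values[$name];
    }
}

class Document
{
    private $config;

    // All parameters arrive through one readable object.
    public function __construct(DocumentConfig $config)
    {
        $this->config = $config;
    }

    public function getType()
    {
        return $this->config->get('type');
    }
}

$document = new Document(new DocumentConfig(array(
    'type' => 'correction',
    'docToCorrect' => 42, // placeholder id, purely illustrative
)));
```

The named keys keep the call readable and let options be left out without passing a row of nulls, while the constructor still receives everything it needs.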

The mapper deals with storing the data. Whether it’s written to the filesystem, session or database is irrelevant, the syntax would be the same.

Yes, this is exactly what I need. I think I will create a simple Data Mapper implementation based on what I can find. I don’t need it for database storage and retrieval, just for simple object persistence. It is done in the db in this case but sure I don’t want to be dependent on this. Thanks, this helped a lot and looks sensible.

In some cases, you might want to set some default properties on the document or pass some dependencies to it, in which case the mapper can act as a factory for document creation: you’d use $mapper->getNew() rather than sending the same parameters to “new Document()” every time you create one. It is even worth doing this anyway, in case one day Document does have a dependency. That way you can change the object creation routine in one place even though documents are created from multiple places within the application.

Good to know that. I also thought about a factory for document creation, which would come in handy. But in this particular case I don’t really imagine I will ever create a new document from more than one place. However, I will access it from multiple places.

Yep that’s right. Generally I’d advise against sending the properties of the object in the constructor because it gets messy if you have to add lots or want to leave some blank.

I would use something more like:


$document = new Document;
$document->name = 'Tom\'s document';
$document->author = 'Tom';
$document->body = 'Work in progress..';
$mapper = new DocumentMapper;
$mapper->save($document);

The mapper deals with storing the data. Whether it’s written to the filesystem, session or database is irrelevant, the syntax would be the same.

edit: In some cases, you might want to set some default properties on the document or pass some dependencies to it, in which case the mapper can act as a factory for document creation: you’d use $mapper->getNew() rather than sending the same parameters to “new Document()” every time you create one. It is even worth doing this anyway, in case one day Document does have a dependency. That way you can change the object creation routine in one place even though documents are created from multiple places within the application.
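A minimal sketch of the mapper acting as a factory might look like this. Only the method name getNew() comes from the post above. The rest is assumed for illustration: the $defaults constructor argument and the way the defaults are applied are just one possible way to do it.

```php
<?php
// Hypothetical sketch: only getNew() is named in the thread; the rest
// is invented to show the idea.

class Document
{
    public $name = '';
    public $author = '';
    public $body = '';
}

class DocumentMapper
{
    private $defaults;

    public function __construct(array $defaults = array())
    {
        $this->defaults = $defaults;
    }

    // The single place where Document objects are constructed. If Document
    // ever gains a dependency, only this method needs to change.
    public function getNew()
    {
        $document = new Document;
        foreach ($this->defaults as $property => $value) {
            $document->$property = $value;
        }

        return $document;
    }
}

$mapper = new DocumentMapper(array('author' => 'Tom'));
$document = $mapper->getNew();
$document->name = 'Draft';
```

Callers ask the mapper for a fresh document and never repeat the default values themselves.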

I have something like this:


# Active record
$doc = new Document();
$doc->user = $user;
$doc->url = 'bla';
$doc->save();
# Active record load
$sameDoc = new Document( $doc->id );
$sameDoc->url = 'http://example.com/';
$sameDoc->save();
# Active record search (Simple split on AND and use the User and Url as fields for first and second param).
$doc = Document::getOneByUserAndUrl($user, $url);

Ok, thanks, I assume this would be an example of fetching an object previously created. Then how would I create a new document so that it gets saved for future page requests? I have come up with this:


$document = new Document($param1, $param2, ...);
$mapper = new DocumentMapper('...');
$mapper->save($document);

Is this correct?

I tend to have the same opinion. What would you suggest then? The most intuitive way for me is using the ‘new’ keyword to create a new object (= a new document). Then how do I access this object on subsequent page requests if not with the ‘new’ keyword?

I would suggest a Data Mapper (look up the pattern).


$mapper = new DocumentMapper('...');
$document = $mapper->getDocument($key);

$document->author = 'TomB';
$mapper->save($document);

Whether the document is coming from a database, session or filesystem makes no difference. You could even have a mapper for each case so the document can come from anywhere. The beauty of this is that the Document class definition remains unchanged yet the source of the data can be substituted.
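As a rough sketch of how the storage can be swapped without touching Document, something along these lines could work. It rests on several assumptions: the DocumentStorage interface and the ArrayStorage class are invented for this example, and save() takes an explicit $key here, unlike the one-argument call above, so the sketch does not have to guess where the key lives.

```php
<?php
// Hypothetical sketch: DocumentStorage and ArrayStorage are invented names,
// and save() takes a $key here purely to keep the example small.

class Document
{
    public $author = '';
    public $body = '';
}

// Storage contract; the mapper only ever talks to this.
interface DocumentStorage
{
    public function read($key);              // array of fields, or null
    public function write($key, array $data);
}

// Keeps data in an array held by reference. Passing $_SESSION (after
// session_start()) instead of a local array would give session persistence.
class ArrayStorage implements DocumentStorage
{
    private $store;

    public function __construct(array &$store)
    {
        $this->store = &$store;
    }

    public function read($key)
    {
        return isset($this->store[$key]) ? $this->store[$key] : null;
    }

    public function write($key, array $data)
    {
        $this->store[$key] = $data;
    }
}

class DocumentMapper
{
    private $storage;

    public function __construct(DocumentStorage $storage)
    {
        $this->storage = $storage;
    }

    // Returns the stored document, or null if nothing was saved under $key.
    public function getDocument($key)
    {
        $data = $this->storage->read($key);
        if ($data === null) {
            return null;
        }

        $document = new Document;
        $document->author = $data['author'];
        $document->body = $data['body'];

        return $document;
    }

    public function save($key, Document $document)
    {
        $this->storage->write($key, array(
            'author' => $document->author,
            'body' => $document->body,
        ));
    }
}

$store = array(); // stand-in for $_SESSION
$mapper = new DocumentMapper(new ArrayStorage($store));

$document = new Document;
$document->author = 'TomB';
$mapper->save('location-a', $document);

$again = $mapper->getDocument('location-a');
```

A database or filesystem backend would be another class implementing the same interface, and neither Document nor the calling code would need to change.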

I don’t like the Active Record style, or anything where instantiating an object also fetches its data. It gives one piece of code multiple tasks to perform, which is an OO no-no™ :wink: