Domain model and PHP's create/destroy nature

I’m sure this has been discussed before, so forgive me.

I’ve been reading PoEAA again because I feel lately my code has been becoming much less OO-pure and more procedural, but reading through this excellent book again got me thinking… with PHP’s nature being request/response cycle driven, with nothing kept in memory between cycles, everything has to be persisted in some form or another between each request. Everything is created, everything is destroyed. All walls are built, all are torn down.

It made me want to ask of the clever people in this forum: how TRUE of a domain model do you actually use within PHP? How big are the object graphs you’re pulling into memory per R/R cycle? Do you use some form of lazy loading? When you perform an operation on your model, does it modify only the object graph in memory, and then persist to the database (or other, session?) when the walls are torn down? Or are you performing ad hoc queries to/from the database when manipulating your object graph?

I’m just thinking that because PHP continually needs to work with persistent storage… databases primarily, how far does one go with the “pure” domain model approach, and how do you find the right balance between performance, persistence and memory usage per R/R cycle? Need some thoughts and opinions from those brighter than me…

Anyone? The silence is deafening… :shifty:

The way that PHP’s runtime architecture is designed certainly has some impact on how applications get designed. It’s not just the internal architecture, though, but also the way that HTTP works.

I wouldn’t say that it’s completely at odds with a rich object oriented domain model, but there are certain patterns that seem to work better than others. Hiding the database completely, for example, is not really a good idea.

Hi…

The popularity of ActiveRecord and TransactionScript is a concession to this problem. You have hit on the elephant in the PHP (and Ruby) room.

For highly concurrent apps like web servers there are really three OO domain model approaches: (a) in memory persistent model (Java), (b) continuations (Smalltalk, Scheme) and (c) on demand reading of a domain subgraph (PHP, Ruby). (c) is trying to give the illusion of (a). (b) has to give that illusion as it’s hiding the request cycle altogether.

The lazy subgraph approach clearly involves a lot of intricate mechanics. Not just ORMs, but all the other lifecycle bits and bobs we carry around such as sessions, cookies and shared memory. In other words we inflict a lot of work on ourselves, although dwarfed by the mountain that is a modern ORM. Selectively creating object hierarchies is the subject of patterns like DI, so not surprisingly it’s becoming a popular topic in the PHP world. Even the Java folks get concurrency issues with their big in-memory super models. And we have to make business processes transactional :eek:. It’s all so hard.
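For anyone who hasn't met the lazy subgraph idea, here is a minimal sketch of the smallest possible version: a reference that only hits storage the first time it is touched in a request. The class name `LazyReference`, the loader closure and the fake order data are all hypothetical, made up for illustration. A real ORM generates proxies that are far more involved than this.

```php
<?php
// Hypothetical sketch: a lazy reference that defers loading part of the
// object graph until it is first touched in the current request.
final class LazyReference
{
    private $loader;          // callable that fetches the real value
    private $loaded = false;  // has the loader run yet?
    private $value = null;

    public function __construct(callable $loader)
    {
        $this->loader = $loader;
    }

    public function get()
    {
        if (!$this->loaded) {
            // First access: hit storage once, then memoise for this request.
            $this->value = ($this->loader)();
            $this->loaded = true;
        }
        return $this->value;
    }
}

// Stand-in for a database query; counts how often "storage" is hit.
$queries = 0;
$orderLines = new LazyReference(function () use (&$queries) {
    $queries++;
    return ['widget', 'gadget'];
});

// Nothing is loaded until the first get(); the second get() is memoised.
$first = $orderLines->get();
$second = $orderLines->get();
```

If the request never calls `get()`, the loader never runs, which is the whole point: you only pay for the part of the graph you actually walk.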

Further pressure is being placed on the big domain model view of the world from the DB people. They are shifting to stream databases for processing and using structured storage only for OLAP. Like the comprehension folks, business processes seem to be becoming more visible than business models these days.

And then you have all the work to develop the domain model right in the first place. Get it wrong and you’ll suffer until you fix it. So you have to be able to migrate it too. I still think you win by having the model, but the visibility of the errors adds to the feeling of weight. We take on the job of the DBA and data modeler when we build these things. They have all the books.

The web developers are leading the “no-SQL” movement and producing much more fragmented models of the world. They win in being able to choose the right tool for the job and in being more scalable. Of course there has to be more mechanics and generally more stuff to cope with in this more primitive, hands on approach. A bit of a step back for me. I feel it’s easier to incrementally write the website, but harder to run the business that way.

Myself I am moving away from having domain models in PHP code. I use a lot of OO, but that is for the mechanics. I’m not anti-model by any means. They give the code an overall plan that is the last line of defence to software entropy. Just that I don’t want to keep loading the damn thing.

My last PHP project was accounting related, so I could move almost everything into the DB and have the business domain expressed in SQL procs and views. Lots of modelling, much less code.

I want my next private project to involve using message queues such as RabbitMQ. That way the DB can be used for modelling and storage, the message queue can handle the transactional behaviour, and the PHP/Javascript can handle the interactions. I’m looking forward to trying this. It feels right to me.

Sorry, rambled a bit there. My thoughts aren’t clear yet, but I felt compelled to reply. I think you’ve asked the most important question on this forum in quite a while.

yours, Marcus

I’m just surprised you’d be moving from OO to procedural, I always thought the natural progression was the other way round…

HTTP is essentially procedural - PHP’s architecture mimics this, and this has implications for the application architecture. But you can implement an application with a mostly procedural architecture while still using object oriented implementation techniques, which is what I assume Marcus meant.

Hi…

I’m not. At least I don’t think I am :).

I’m moving to relational for data, OO for the mechanics stuff, functional for the presentation, and message queueral (?) for the integration. I guess you’d say event driven for the integration.

If I were using the Java style application server stack, the domain model were complex, the domain involved non-DB stuff or multiple DBs and load was not too big an issue, I’m absolutely sure I would be domain modeling in pure OO. But I’m not. I’m often using a simple web server stack backed by a DB.

OO is my default coding paradigm (PHP, Ruby, Scala) as it copes well with vagueness of requirements and modularity and refactoring gets baked in. It’s just that I like relational for data (SQL), functional (actually actors in Erlang) for networking, functional for presentation (R, Javascript), procedural/OO for performance (C) and functional for prototyping (Scheme).

A recent small project was an online survey with fancy reporting. 70 lines of PHP, 50 lines of SQL, 16 lines of R, 5 lines of Javascript and 1 week of work from conception. No framework if you don’t count jQuery.

I guess I’m just trying to find the path where I have to write the least amount of code. Often that’s OO, but if I catch myself implementing lots of stuff I’ll switch.

yours, Marcus

Awesome question. :slight_smile:

Depends on the project I’m working on. Example: some of the SitePoint stuff I’m working on has very strange storage requirements. A lot of it has been upgraded over the years and previous decisions have constrained the model somewhat.

For brand-spanking shiny new stuff, I try to create a fairly sane and sensible domain without going overboard - I find conceptually that it really helps early on the project lifecycle, even if it might make optimising things a little trickier later on.

Usually pretty trivial; most of the apps I’ve worked on don’t include massive data sets, and if they do I’ve cached the bejaysus out of them.

Depends. If I’m using a framework, I usually get that functionality “for free” and it’s nice to have if so. If I’m doing something that requires real lightning-fast response time or has huge processing overheads, I’ll cache the heck out of the output anyway.

My philosophy is generally “right tool for the right job”. In the web world, I’m seldom in a position to require something as heavy as Java - most of my work is straightforward stuff where stateless HTTP calls are fine (and PHP is, therefore, lightweight enough to be useful).

If I was doing real-time stock market monitoring, however, I might consider a Java app, accessible as a web service… :stuck_out_tongue:

Can you elaborate on the main reasons why? Not a criticism, I would just like to learn more from your reasoning.

Are you able to give a basic example of how DI is used for selectively creating an object hierarchy?

Hmmm, I haven’t explored message queues… thanks, I’ll have a further look into this.

Wow, I’m humbled, haha. No need to apologise for ‘rambling’, getting all of this stuff extracted from our brains and into a forum is how we can work through these issues. Yes it was something I needed input on because this red flag kept coming up inside my head whenever I would start to create a big model and then look at all the BS you need to go through to keep loading and persisting over and over again. It’s frustrating and never feels “right”.

Lastcraft, I think rolfen might be replying to me. In any case, the stuff I’ve been doing lately is a kind of weird “blend” between OO and procedural. OO for the model (ActiveRecord, with 1:1 and 1:M in-memory relations between objects), and procedural for the controller/core system functionality. The global nature of some parts of this system makes me cringe a little, and I would like to once again get into a fully OO system, but the complexity at the moment is low, and it falls pretty well under the “GSD” (Gets S**t Done) category of development.

Thanks again to all for your opinions, much appreciated.

Hi…

Turns out we are not the only ones thinking about this right now. More on this topic:

http://blog.objectmentor.com/articles/2009/04/20/is-the-supremacy-of-object-oriented-programming-over

yours, Marcus

I declare all the appropriate objects in the wiring file…


...
$injector->forVariable('session')->willUse('MemcacheSession');
$injector->forVariable('mailer')->willUse('MailChimp');
...

Here in PHP, but more likely in your framework author’s accent of XML.

Then the very constructor declaration only instantiates the needed classes…


class SendSpam implements Action {
    function __construct($mailer) {
        ...
    }
}

You declare an application as a whole just once. Only those parts of the application that are needed are instantiated.
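To make the "only what is needed gets instantiated" point concrete, here is a toy injector that could sit behind wiring like the above. It is a sketch under assumptions, not the library used in the example: the `forVariable`/`willUse` names are borrowed from the post, while `ToyInjector`, `create()`, `ConstructionLog` and the fake classes are hypothetical. It matches constructor parameter names to wired classes using reflection.

```php
<?php
// Hypothetical toy injector, NOT the library used in the post above.
final class ToyInjector
{
    private $wiring = [];     // constructor variable name => class name
    private $pending = null;  // name waiting for its willUse() call

    public function forVariable(string $name): self
    {
        $this->pending = $name;
        return $this;
    }

    public function willUse(string $class): self
    {
        $this->wiring[$this->pending] = $class;
        return $this;
    }

    // create() is an assumed method name for "build me this class".
    public function create(string $class)
    {
        $constructor = (new ReflectionClass($class))->getConstructor();
        $arguments = [];
        if ($constructor !== null) {
            foreach ($constructor->getParameters() as $parameter) {
                $name = $parameter->getName();
                if (!isset($this->wiring[$name])) {
                    throw new InvalidArgumentException("Nothing wired for \$$name");
                }
                // Build only the dependency this constructor asks for.
                $arguments[] = $this->create($this->wiring[$name]);
            }
        }
        return new $class(...$arguments);
    }
}

// Stand-in classes; the log records which ones really get constructed.
final class ConstructionLog { public static $created = []; }
class FakeSession { public function __construct() { ConstructionLog::$created[] = 'session'; } }
class FakeMailer { public function __construct() { ConstructionLog::$created[] = 'mailer'; } }
class SendSpamAction
{
    public $mailer;
    public function __construct($mailer) { $this->mailer = $mailer; }
}

$injector = new ToyInjector();
$injector->forVariable('session')->willUse('FakeSession');
$injector->forVariable('mailer')->willUse('FakeMailer');

// The session is wired but never built, because nothing asked for it.
$action = $injector->create('SendSpamAction');
```

The wiring describes the whole application, but a request that only sends mail never pays for the session object.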

yours, Marcus

Nice example, thanks!

The elephant in the room is the ORM impedance mismatch. If you hide the mechanics of the database behind an object interface, you can severely impact performance. It’s simply not practically feasible to write your application code as totally ignorant of the underlying relational model.

But there’s more to it. The standpoint of ORMs is that the relational model is inferior and should be reduced to a storage for the object model. But a relational model is a very powerful one, and for some tasks it’s much more appropriate than an imperative/object oriented model. Lists of data that cut across tables (queries, views) fit well into this paradigm, and it turns out that this is something that is used a lot in typical business applications.

Hi…

Which begs the question:

“Suppose we had the perfect persistence layer or ORM or OO DB for our objects. That is all our objects are automatically persistent. Would it then always be best to have a domain model?”

I might nearly say yes.

In that case you have a continuation based framework (but are still putting up with a visible request cycle). Now these are popular aren’t they?

The seeds of doubt come from the difficulty of searching such a structure, the performance characteristics algorithmically speaking, the effort to maintain such a glorious tribute to the enterprise gods, the centralised change resistant nature of the solution (don’t ask the DBA, ask the Architect), having to add transactions everywhere (ACID), and the sheer bloody bloat of this perfect ORM.

It might be possible. The Smalltalkers may have done it already.

We kind of lump OLTP, responsiveness, OLAP and archival, process engineering, and top level expressiveness all into one super thing. We could have OLTP, transactions and processes in one layer, and then have a lower layer of OLAP, storage and search that is a few seconds out of date. Wouldn’t that be easier? If you did that, would you still use an OO domain model for the lower layer?

yours, Marcus

This whole thread, and the quote above especially, have intrigued me. Is there any chance you could go into a little more depth on what you do use? Specifically the “relational for data, OO for the mechanics stuff” that you mentioned earlier. Were there other factors that led you to move away from domain models, other than the difficulty of persistence?

I’m not sure where this belief comes from that domain models in Java web applications are handled hugely differently than in PHP web applications in that regard. They’re not, except in Java you have better tools for working with them. Java web applications usually work like your described approach c), not a), in-memory / second-level caches put aside. Maybe you need to elaborate on what you mean by “(a) in memory persistent model (Java)” but if you mean having the whole domain model in-memory and each object only once and working on that domain model from all request-processing threads, then you have a concurrency nightmare unless your whole domain model is immutable.

Yes, in a Java web application you have the ability to keep things around in memory between requests efficiently, however, you do not want to do that with everything and you especially don’t want to share all (domain) objects between all requests. Not even if you have unlimited RAM, because there is one big issue here: thread-safety. You pick carefully which objects you want to share/keep and which not. It is best-practice in Java web development to start with request-scoped (or even narrower scoped) objects unless you need them in a wider scope (conversation/session/application). Preferring request-scoped objects results in better scalability, not worse, and in code that is easier to understand and does not need to care about threading issues that much (of course it’s also good practice to make objects immutable if possible).

The standard approach for domain model persistence with Java is JPA/Hibernate and the typical pattern for a Hibernate Session / JPA EntityManager is not 1 Session/EM for the whole lifetime of the application. JPA EntityManagers in Java EE are by default transaction-scoped, which means the part of the domain model you load becomes detached (and eventually subject to garbage collection during the same request) as soon as the transaction commits (that is, within the same request). You can apply second-level caches to alleviate database load greatly but this is a separate concern and even from caches every request/session gets different domain object instances usually. A Session/EntityManager guarantees uniqueness of the entity only in its persistence context, which is thread-local and ends either when the transaction ends or when the EntityManager/Session is closed explicitly. EntityManagers/Sessions are generally not thread-safe.
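Since this is a PHP forum, here is the uniqueness guarantee from the paragraph above sketched in PHP rather than Java. It is only an analogy under assumptions: `PersistenceContext`, `find()` and the customer loader are hypothetical names, and the sketch behaves like a plain identity map with none of the transaction or detachment handling a real EntityManager does.

```php
<?php
// Hypothetical sketch of a persistence context acting as an identity map:
// within one context, the same id always yields the same object instance.
final class PersistenceContext
{
    private $loader;          // callable(int $id): object, stands in for a query
    private $entities = [];   // id => entity already loaded in this context

    public function __construct(callable $loader)
    {
        $this->loader = $loader;
    }

    public function find(int $id)
    {
        if (!isset($this->entities[$id])) {
            $this->entities[$id] = ($this->loader)($id);
        }
        return $this->entities[$id];
    }
}

// Stand-in for loading a row and hydrating an entity.
$loadCustomer = function (int $id) {
    $customer = new stdClass();
    $customer->id = $id;
    return $customer;
};

// Two contexts, as if two requests (or two transactions) were in flight.
$requestA = new PersistenceContext($loadCustomer);
$requestB = new PersistenceContext($loadCustomer);

// === on objects compares identity, not just equal field values.
$sameWithinA = $requestA->find(42) === $requestA->find(42);
$sameAcrossContexts = $requestA->find(42) === $requestB->find(42);
```

Within one context you get one instance per id; across contexts you get separate instances, so nothing mutable is shared between requests.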

Alternatively a JPA EM can have an extended persistence context which means it remains open until it is explicitly closed. This can be used to keep state between long-running business transactions that involve multiple request-response cycles (and thus span user think-time and potentially multiple database transactions since you don’t want to keep transactions running during user think-time). The lifecycle of the extended persistence context is usually bound to the lifecycle of the stateful session bean that uses it, so it’s also not “global”. Multiple users in different sessions associated with different stateful session beans will get different domain object instances (different in the sense of JVM object identity).

It’s funny that even though many people seem to want the ability to keep stuff (mostly objects) between requests in PHP, I bet 99% of them would be steamrolled by threading issues anyway. The limited PHP “shared nothing (except the database)” architecture protects you from dealing with a lot of nasty threading issues while at the same time taking away a lot of possibilities. There is nothing special in pushing all shared state to a database (often the slowest part of the stack in a large system), you can do that in Java, too, but you can also do otherwise, so anyone who is thinking this is some PHP awesomeness is really dreaming.

Summary:

  • Working with a domain model in PHP or Java in a web application is largely the same. Most objects are request-scoped. However, in Java you have the better tools (and caches).
  • The value of a domain model does not depend on whether you can stick it all in memory. Neither do you usually want that nor do you need that in order to make good use of it.
  • Sharing objects between requests and even more between threads is something that needs to be done carefully. OO with its encouragement of mutability makes thread-safety a difficult undertaking.

Regards

Roman

Hi…

I was doing a half-assed philosophical caricature, and ignoring minor details like practicalities. I was able to do this by drawing on my huge range of ignorance :).

All we need is a Gemstone/Smalltalker to say that they have all these object graph hassles too, and that will be that. I’ll just be buying SQL books from now on :eek:.

yours, Marcus

Hi…

The main objects in play besides the persistent ones were things like Session and PaymentGateway, and many PageController subclasses. All of these were made manageable by DI as they were either mostly shared or completely stateless.

My last PHP project was accounting related. We had one Accounts class (stateless except memoisation) where each of the 200+ methods was actually a stored proc. This handled the whole of the web interface. You could do things like…

$accounts->signUp('Fred', 'Bloggs', ...)
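One way a class like that could be kept thin is to forward every method call to a stored procedure of the matching name. This is a guess at the shape, not the code from that project: `StoredProcFacade`, the camelCase to snake_case convention, the `CALL` syntax and the injected executor are all assumptions. The executor is a plain callable so the sketch runs without a database.

```php
<?php
// Hypothetical sketch: a stateless facade where every method call is
// forwarded to a stored procedure with the matching snake_case name.
final class StoredProcFacade
{
    private $execute;   // callable(string $sql, array $params): mixed

    public function __construct(callable $execute)
    {
        $this->execute = $execute;
    }

    public function __call(string $method, array $arguments)
    {
        // signUp -> sign_up (assumed naming convention). The name comes
        // from our own code, never from user input.
        $procedure = strtolower(preg_replace('/(?<!^)[A-Z]/', '_$0', $method));
        // One bound placeholder per argument keeps values out of the SQL text.
        $placeholders = implode(', ', array_fill(0, count($arguments), '?'));
        return ($this->execute)("CALL {$procedure}({$placeholders})", $arguments);
    }
}

// Stand-in executor that records the call instead of talking to a database.
// A real one might prepare the statement with PDO and execute it with $params.
$log = [];
$accounts = new StoredProcFacade(function (string $sql, array $params) use (&$log) {
    $log[] = [$sql, $params];
    return true;
});

$accounts->signUp('Fred', 'Bloggs');
```

The trade-off with `__call` is that you lose explicit method signatures, so a project with 200+ procs might well prefer real methods or generated ones.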

Reporting was views or specialised tools. Integration was with views, but given a choice it would have been a message queue.

As the model was now behind the procs/views our main means of abstraction was internal views. A very powerful mechanism and one I now have a lot of respect for.

What was remarkable was the code size. 4500 lines of SQL, less than 500 on the web framework (Session, Request, Page, etc) and less than 1000 on all the other mechanics (the Accounts class, migrations, PaymentGateway, reporting tools). That was after nearly three years’ work (on and off) with between 2 and 5 people at a time.

Our biggest bottleneck to change was always the HTML/Javascript by two orders of magnitude.

Now obviously an accounting project is a very good fit for a SQL system and that was the reason for doing it that way from the start (after a quick prototype). What surprised me was the ease. Changes to the data model would hurt a little, but buffered by the views, only a little worse than with an OO system.

The only complication was when we had a transaction that involved the payment gateway. That meant some extra code in maybe three controllers, because the controller was seeing two separate objects…


try {
    $accounts->signUp(...);
    $confirmation = $payment_gateway->pay(...);
    $accounts->pay(...);
} catch (...) { ... }

There is a long list of parameters in each call here and a dozen exceptions worth of catch blocks.

That’s the only time the lack of domain model hurt the controllers. A good trade.

Not a typical project, and not the way I would recommend for every system by any means. Just that I am a lot more willing to look at alternatives before require_once’ing an ORM.

yours, Marcus

p.s. I’ve recently changed job (since this thread started) and I’m straight back into an ORM world. Suddenly I have to understand thousands and thousands of lines of code :(.

Domain model code or ORM code? You should not need to understand the ORM code :slight_smile:

Avoiding complicated ORM by avoiding OO domain models (in combination with relational databases) is a valid choice. I wish more people would be aware of the fact that you cannot just “not use an ORM”, you have to avoid using the stuff that brings up the need for it.

I do like domain models and in fact I do like ORM, simply because I’m so deep into it that I don’t see the complexity anymore. For me it’s relatively simple, it has become “implicit complexity”, maybe much like a C/C++ guru does not think much about memory management anymore, he just does it. But I’m not so blind as not to see that it is very complicated for anyone not used to it.

Roman

Hi…

Domain model code, the custom extensions for missing ORM features, ORM leakage, clumsy migrations and the inevitable impedance.

Which is not to say that ORMs don’t work (and I’ve used and written them a few times). You win if the ORM controls the show (no need to integrate with anything else) and you can drop the objects into UI widgets without a lot of marshalling.

Of course an ORM and a domain model are not the same thing. In primitive cases the object model and data model may be isomorphic to start with. The ORM is presenting a more OO version of a data model. This allows augmentation of the data model invisibly to the client code. As that model grows into a full domain model the ORM may be mostly reduced to data transfer objects. The result can end up as something of a mish-mash, with the domain model forced to fit an old data model and the mechanics (e.g. ActiveRecord inheritance) of the ORM.

Is that path always the right one?

yours, Marcus