Registry Implementation for review

I’m going to be building a registry object later today that gathers up several property groups. It will be holding a copy of the template map, the settings from the database, the database object itself and some other odds and ends no doubt.

Registries bother me. They seem to undo the effort spent elsewhere on a dependency injection pattern where dependencies are clearly defined. That said, having a central point for such information has its uses; if it didn’t, programmers wouldn’t continue down this road.

I’m going to try the following variation on the pattern. I’ll be creating a Registry object and passing it to the EventDispatcher. Most of the classes in the system, and I think all classes that need to see a portion of the registry, are events. As such they are handled by the EventDispatcher before they do anything. Before the EventDispatcher has them act I’m going to have it pass the object to a bind method on the Registry object…

The Registry object will inspect the type of object and only bind relevant data to the object as determined by its type. Hence responders will get a template library and template settings, but that’s it - they won’t see settings from elsewhere in the registry hive.

As a result the registry will have coded documentation of which objects are getting what properties. I hope this will solve my greatest fear with a registry - that once you open an object to it you can’t be exactly sure what settings in the registry affect the object.

After binding relevant data to the object the registry gives it back to the event dispatcher for dispatching.

Thoughts or comments?

Hi…

The problem with a registry is that when you get a bug, the number of lines of code that could have caused the bug is huge. Anyone can mess with that registry, even writing over objects with similar ones. The initial convenience can result in an almighty payback once the registry has found its way into every nook and corner.

I don’t know anyone who uses DI who has gone back to this pattern, but I could be wrong.

From your description of your pros and cons, you might be after a ServiceLocator. You can’t set objects into it like a registry, as it does the construction itself. It is not as decoupled as DI with respect to using the dependent code in another app, but at least the bug is either in the class causing the weirdness or in the ServiceLocator.

yours, Marcus

I’ve seen service locators and that isn’t quite what I’m gunning for. With a service locator the calling class can arbitrarily call for any service.

Here the only control the class has over what it gets out of registry will be from inheritance. Perhaps a draft of the bind function I have in mind will help - this will be in the registry.


public function bind( $object ) {
  // Check each registered type; an object may match more than one.
  foreach ($this->registeredType as $type) {
    if ($object instanceof $type) {
      $object->bound( $this->getAttributes($type) );
    }
  }
}

Again, that’s a draft. The internal registered types will start at Page, Event, HTMLResponder, JavaScriptResponder, Responder. The system will check from most specific to least specific. Since instanceof picks up on interfaces a class could receive multiple batches for implementing multiple interfaces.

In any event, the object gets a private copy of the data. Whatever it does with that data has no effect on the data the registry hands off elsewhere in the system.
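Here is a rough, self-contained sketch of that binding idea. The constructor map, the sample settings and the way `bound()` merges batches are assumptions made for illustration; only `bind()`, `bound()` and the type names come from the draft above.

```php
<?php
// Hypothetical stand-ins for two of the types named in the thread.
interface Responder {}

class HTMLResponder implements Responder {
    public $bound = array();

    public function bound(array $attributes) {
        // Merge each batch; earlier (more specific) batches win on key clashes.
        $this->bound = $this->bound + $attributes;
    }
}

class Registry {
    private $attributes; // type name => settings, listed most specific first

    public function __construct(array $attributesByType) {
        $this->attributes = $attributesByType;
    }

    public function bind($object) {
        foreach ($this->attributes as $type => $attributes) {
            if ($object instanceof $type) {
                // Arrays are passed by value, so the object cannot alter the registry's data.
                $object->bound($attributes);
            }
        }
        return $object;
    }
}

$registry = new Registry(array(
    'HTMLResponder' => array('templates' => 'default'),
    'Responder'     => array('cacheMode' => 'no-cache'),
    'Page'          => array('hierarchy' => 'root'),
));
$responder = $registry->bind(new HTMLResponder());
```

The responder ends up with the template and cache settings only. The `Page` batch never reaches it because the `instanceof` check fails for that type.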

The EventDispatcher alone is privy to all registry information, but even it cannot write to the registry, merely read.

If I can find something tighter I will. I’ve culled the registry from where it was to a scant handful of items and am continually working to split it all the way up till it doesn’t exist anymore, if possible.

Hi…

A sort of namespace/partitioned Registry?

yours, Marcus

Yeah, I suppose. I mean, template libraries only matter to HTMLResponders, HTTP 1.0 Cache modes are of value to all responders, and so on. The model classes do not and should not know of these details.

Many of these settings are kept in a unified settings table with the columns group, name and value. It’s easiest to store them that way as their method of modification is the same (for the moment phpMyAdmin :wink: ) and it’s easiest to slurp them up with one query. But I’m wary of giving too much information to my individual classes. I’m starting to figure out that the less a class knows about the rest of the system, the better it is at focusing on its task.

So that leads to a “registry” approach where the registry hands out relevant pieces to classes when needed rather than the whole shebang. Should make error tracing easier.
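As an illustration, the rows from a group/name/value table could be folded into one nested array keyed by group, so each class is handed only its own slice. The sample rows and the `groupSettings()` helper below are made up for the sketch; the real table contents are not shown in the thread.

```php
<?php
// Rows as they might come back from the unified settings table (values are invented).
$rows = array(
    array('group' => 'template', 'name' => 'library', 'value' => 'default'),
    array('group' => 'template', 'name' => 'doctype', 'value' => 'html'),
    array('group' => 'cache',    'name' => 'mode',    'value' => 'no-cache'),
);

// Hypothetical helper: fold flat rows into group => (name => value).
function groupSettings(array $rows) {
    $groups = array();
    foreach ($rows as $row) {
        $groups[$row['group']][$row['name']] = $row['value'];
    }
    return $groups;
}

$settings = groupSettings($rows);
// A responder would then receive $settings['template'] and nothing else.
```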

I’m not sure about the error tracing thing - registry objects can make it easier to spot errors. For example, assume an object is overridden - the registry object could simply use a __set function and list all of the times an object is overridden in the run of an application - more specifically, the last time. Using standard debugging techniques you can find what went wrong pretty easily.
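A minimal sketch of that `__set` idea, assuming a hypothetical `TracingRegistry` class that stores values in a private array and records the file and line of each overwrite using `debug_backtrace()`:

```php
<?php
class TracingRegistry {
    private $values = array();
    private $overrides = array(); // one entry per overwrite, oldest first

    public function __set($name, $value) {
        if (array_key_exists($name, $this->values)) {
            // Frame 0 of the backtrace describes the assignment that triggered __set.
            $trace = debug_backtrace();
            $at = isset($trace[0]['file'])
                ? $trace[0]['file'] . ':' . $trace[0]['line']
                : 'unknown';
            $this->overrides[] = array('name' => $name, 'at' => $at);
        }
        $this->values[$name] = $value;
    }

    public function __get($name) {
        return $this->values[$name];
    }

    public function overrides() {
        return $this->overrides;
    }
}

$registry = new TracingRegistry();
$registry->db = 'first';
$registry->db = 'second'; // recorded as an override of "db"
```

The last element of `overrides()` then points at the most recent overwrite, which is usually where a debugging session would start.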

More importantly - why limit what a class can access? Surely that’s rather inefficient?

As you know, when an object A is shared amongst objects, they all have only the address of object A rather than its value. So, say 20 objects utilising A, they use round about the same amount of memory for this functionality as 400 objects utilising A.

But if A has to decide what to allow each of these 400 objects to access - that’s overkill surely?

Or am I missing something?

Good points Jake. The processing overhead and creating copies of sections of the objects will be wasteful of memory.

I guess I’ll keep the thing as small as possible and split it up later with an eye to removing it. Here it is as of this morning.


<?php
namespace Gazelle;

class Services extends ReadOnlyArray {

	/**
	 * Builds the service container from the settings array.
	 * The database entry stays as raw config until it is first requested.
	 * @param array $settings
	 */
	public function __construct( $settings ) {
		$this->storage = $settings;
		$this->storage['db'] = $settings['config']['database'];
		$this->storage['pages'] = new $settings['config']['events']['pages']( $this );
		$this->storage['settings'] = new $settings['config']['events']['settings']( $this );
	}

	public function offsetGet($offset) {
		switch($offset) {
			case 'db':
				// Lazy start: swap the config array for a live object on first access.
				if (is_array($this->storage['db'])) {
					$this->storage['db'] = new $this->storage['db']['class']($this->storage['db']['config']);
				}
				return $this->storage['db'];
			default:
				if ( isset($this->storage[$offset] )) {
					return $this->storage[$offset];
				}
				// Unqualified, so this resolves to Gazelle\Exception inside this namespace.
				throw new Exception("$offset does not exist");
		}
	}
}

The read only array object is getting used more heavily than I thought it would be.

You’re special casing ‘db’, which looks like a code smell to me - but don’t listen to me. :stuck_out_tongue:

I’m not happy with db instantiation anyway. I want to delay it until I actually need to run a query rather than waste time setting it up when it won’t be used. For that reason no outside class obliquely starts the database.

The service is set up to start it when the first query is fired. The alternative to that special case is to start the database whether it’s needed or not.

If you know of a better way to accomplish that…

You don’t (and probably shouldn’t) need to connect to the database just to make the database object available to your application, just add a spot of lazy loading in the object or implement a connection manager of sorts.

Here’s a rough idea, very rough. :slight_smile:


<?php
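class db
{
  private $credentials;
  private $handle = null; // connection resource, set by connect()

  public function __construct($host, $user, $pass, $port){
    // Only store the credentials; no connection is opened here.
    $this->credentials = array(
      'host'  => $host,
      'user'  => $user,
      'pass'  => $pass,
      'port'  => $port
    );
  }
  
  /* throws db_exception */
  public function connect(){
    // Open the connection with $this->credentials and store it in $this->handle.
  }
  
  public function isConnected(){
    return is_resource($this->handle);
  }
  
  public function query($sql){
    // Connect on first use only.
    if(false === $this->isConnected()){
      $this->connect();
    }
    $resultset = null; // placeholder: run $sql against $this->handle here
    return $resultset;
  }
}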
?>

I know and appreciate what you’re saying, but tell the PHP team that - PDO::__construct makes a connection - no way around it. I suppose I could have my db object defer the calling of its parent construct method - but calling parent::__construct from anywhere but self::__construct is VERY dangerous and a much worse code smell than what you’ve found.

The other route is to have the PDO object be a member of database class rather than database be a child, but I don’t like that idea much at all.

Yep, you’re right - but this is already getting messy isn’t it…

Refactor it, you know you want to. :smiley:

I agree with that. It’s better to be able to access all PDO functionality via your database rather than having to redirect calls to database to PDO.

In fact one of my recent framework tests used a database which wasn’t passed around explicitly, rather was a child of a data access class. Everything ran through that in a custom way. Turns out all it did was limit my options - although it allowed me to port to one of many database systems including XML and even flatfile if I wrote the classes for it. In the end - pointless :lol:

However if you do want a lighter load, extending isn’t the way to go.

Hi…

PHP uses copy on write internally - you’ll hardly notice any performance hit. The reason for passing references is semantic.

yours, Marcus

Objects copy by reference, everything else copies by value unless reference is specified. That’s the behavior and it’s annoying - I wish PHP would do one or the other but not a mix of both - Javascript is always by reference by comparison.

The implementation of this I don’t know in detail. If

$a = 3;
$b = $a;
$a = 4;

I know $b doesn’t change here, but I don’t know when the engine decouples from $a.

When the service hands out copies of objects they’ll be by reference so you’re right - memory isn’t an issue. It’s the array fragments I worry about a little bit since they are copies :\

I’ll need to look at it in more detail. I don’t like this section of the code but the rest is shaping up nicely. I just need to limit its influence until I have time to look at it properly.

Yes and no. Semantically, that is the behavior, but the internal implementation is copy-on-write. In PHP, variables and values are completely decoupled, except for primitives (scalar). So a variable $a referencing an array is simply a pointer to a place in memory that holds the actual array. When you assign the value to a new variable, it will initially reference the same value. Only when you manipulate the value will PHP make an actual copy and update the pointers to point to two different places.
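A small sketch of the observable behavior (the variable names are arbitrary). It shows only the semantics; the point at which the engine performs the internal copy is not visible from userland code like this.

```php
<?php
$settings = array('templates' => 'default');
$copy = $settings;              // no data is duplicated yet; both names share one array
$copy['templates'] = 'custom';  // the write is what triggers the actual copy

$service = new stdClass();
$service->name = 'db';
$alias = $service;              // copies the object handle, not the object itself
$alias->name = 'cache';         // visible through $service as well
```

After this runs, `$settings` still holds `'default'` while `$service->name` is `'cache'`, which is the array/object difference discussed above.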

The thing is, if you store your settings in the DB, then the DB needs to be the first object to be instantiated. Since the settings array is nearly always needed, there is no reason to lazy load the DB. There are usually a few objects that you know you are going to have to instantiate on every page load, and in database-driven sites the DB is one of those objects.

Until performance becomes a priority and you implement caching. At that point the reliance on the database becomes less and lazy loading the connection becomes worthwhile.

The answer is not to make DB extend PDO. Just make your db class create a PDO object when it needs it.
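One possible shape for that, as a sketch only: a hypothetical `LazyDatabase` wrapper that takes a factory callable, opens the connection on first use, and forwards every other method call to the wrapped object through `__call`, so the PDO methods stay reachable. The class name, the factory argument and the SQLite DSN are assumptions for the example.

```php
<?php
class LazyDatabase {
    private $factory;           // callable that opens the real connection
    private $connection = null; // stays null until something needs it

    public function __construct($factory) {
        $this->factory = $factory;
    }

    private function connection() {
        if ($this->connection === null) {
            $this->connection = call_user_func($this->factory);
        }
        return $this->connection;
    }

    public function isConnected() {
        return $this->connection !== null;
    }

    // Forward anything else (query, prepare, ...) to the wrapped object.
    public function __call($method, $arguments) {
        return call_user_func_array(array($this->connection(), $method), $arguments);
    }
}

// Building the wrapper does not run the closure, so no connection is opened here.
$db = new LazyDatabase(function () {
    return new PDO('sqlite::memory:'); // example DSN only
});
```

Because the wrapper only knows about a factory, swapping PDO for another extension later would mean changing the closure rather than the class hierarchy.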

Do you really want to tie your application down to a specific connection implementation anyway? PDO is lacking several features which are available in MySQLi.

How long ago was it that we replaced mysql_connect() with new mysqli()? Then to replace that with new PDO()? My point is, things change. It’s likely something will come out and replace PDO as the fad-of-the-month database connection extension. By not extending PDO you don’t need to worry about changes to that.

I’m thinking of creating a data management object that will pick between cache or database as necessary, and it will do the ‘lazy’ connecting. The existing Database class will continue to be an extension of PDO.

Yeah, I think this is going to work. Instead of one monolithic dispatcher that tries to keep tabs on everything, one dispatcher for each of the three areas of responsibility - model, view, control.

The DataDispatcher receives calls for data entities from the other two and decides how it’s going to respond based on cache state and availability. It makes the decision whether or not the database gets started; neither the EventDispatcher (control) nor the ResponseDispatcher (view) will do this.

Part of this is pulling back to this thought - what the registry is storing is application data state. That’s model stuff. The implementation of how it is stored and retrieved is not the business of control or view. This makes things clearer from here out.

So when the EventDispatcher asks the DataDispatcher about the page hierarchy, it doesn’t have to care where the DataDispatcher gets the answer.

The Database class gets to be an extension of PDO. It doesn’t have to worry about the fact that it starts a connection when it’s started, because caching is NOT its responsibility. That belongs to the Cache object, and if the DataDispatcher starts the database there WILL be a query. The DataDispatcher is in charge of choosing between the two when fulfilling requests from views and controllers.
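A very rough sketch of how such a DataDispatcher might choose between the two. Everything except the `DataDispatcher` name is a stand-in: the cache is a plain array, `StubDatabase` replaces the PDO-based Database class, and `get()` and `fetch()` are invented method names.

```php
<?php
// Stand-in for the real Database class; counts how often it gets started.
class StubDatabase {
    public static $started = 0;

    public function __construct() {
        self::$started++;
    }

    public function fetch($entity) {
        return "$entity from database";
    }
}

class DataDispatcher {
    private $cache;            // plain array standing in for a Cache object
    private $databaseFactory;  // callable, only invoked on a cache miss
    private $database = null;

    public function __construct(array $cache, $databaseFactory) {
        $this->cache = $cache;
        $this->databaseFactory = $databaseFactory;
    }

    public function get($entity) {
        if (array_key_exists($entity, $this->cache)) {
            return $this->cache[$entity]; // cache hit: the database is never started
        }
        if ($this->database === null) {
            $this->database = call_user_func($this->databaseFactory);
        }
        // Remember the answer so the next request is served from the cache.
        $this->cache[$entity] = $this->database->fetch($entity);
        return $this->cache[$entity];
    }
}

$data = new DataDispatcher(
    array('pageHierarchy' => 'cached tree'),
    function () { return new StubDatabase(); }
);
$hierarchy = $data->get('pageHierarchy'); // answered without touching the database
```

The caller never learns whether the answer came from the cache or the database.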