Confused about Factories

Hey guys,

I have been trying to figure out the factory pattern for a while, but I keep getting different versions of it, and different explanations.

So are all these different patterns?
Factory
Factory Method
Abstract Factory

You don’t have to explain them to me, but why is a pattern called Factory Method? That makes it sound confusing.

Does anyone have a link that shows a clear distinction between the 3?

The factory pattern is great when multiple objects share a common construction scheme, and/or when the construction of an object or its resources is especially convoluted and the object is better off with the methods for creating it in another class.

The factory pattern is horrible when applied to objects which have little to no construction activity in the first place. Misapplied in this manner it becomes useless boilerplate busybox code. The notion that the new keyword is bad is just plain stupid - there are times to use it.

Red herring. Dependencies are going to exist. You can write 20,000 lines of busybox factory code if you want, you’ll only accomplish moving where the dependencies are passed.

Complete. Waste. Of. Time.

Especially when you consider the factory is only abstracting the dependency - it by no means and in no way removes it. Indeed, it’s the worst of both worlds with a misapplied factory - you have both the dependencies and more miles of code to dig through to see them.

-The constructor is liable to change during development

The same can be said of any other piece of code in the system, so another moot point.

-It’s ever going to exist outside the scope of where it was created.

The same can be said of the factory constructor, unless the factory is static, in which case congratulations - you just went to the global access pattern.

-You may need polymorphism. Are you sure it’s always going to be that class you want to use? In 6 months you may create a child class to deal with specific cases. If you use a factory, you only need the logic to decide which to create in one place.

You don’t need a factory pattern to do new $class($args);

This is a good example of a bad example (: I know examples always have their problems… But nevertheless, if A really only needs B then B should be the dependency needed to instantiate A.
I personally use factories only for higher level objects, mostly living in the application scope. When the factories are getting bigger and bigger it’s a good time to switch to an IoC container or something like that…

cheers
Chris

That separation doesn’t require a factory pattern though.


class A { 
    public $b = array(); 
    public $container = 'B';
      
    public function __construct(C $c) { 
        for ($i = 0 ; $i < 5; $i++) {
            // Not sure of this syntax at the moment:
            // $this->b[] = new $this->{'container'}($c, $i);

            // This syntax will work for certain though.
            $container = $this->container;
            $this->b[] = new $container($c, $i);
        }   
    }       
} 

class B { 
    public function __construct(C $c, $num) { 
          
    }      
} 


class C { 
          
}

The factory’s only advantage now is it allows the container class to be arbitrarily set. That goes away with the inclusion of a single argument to class A though.


public function __construct(C $c, $container = 'B') { 
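
For reference, a minimal runnable sketch of that constructor, assuming PHP 7 or later. The class names come from the examples in this thread; the is_a() guard and the $num property are additions for illustration, not part of the original suggestion.

```php
<?php

class C {}

class B {
    public $num;

    public function __construct(C $c, $num) {
        $this->num = $num;
    }
}

class Extended_B extends B {}

class A {
    public $b = array();

    // $container names the class to build; it defaults to plain B.
    public function __construct(C $c, $container = 'B') {
        // Assumed guard: refuse anything that is not B or a subclass of B.
        if (!is_a($container, 'B', true)) {
            throw new InvalidArgumentException("$container is not a B");
        }
        for ($i = 0; $i < 5; $i++) {
            $this->b[] = new $container($c, $i);
        }
    }
}

// Callers that want the subclass say so; everyone else gets B.
$a = new A(new C, 'Extended_B');
```

This only works while every candidate class accepts the same ($c, $num) constructor arguments.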

So here’s where we split. You’re arguing everything under the sun needs a factory. I’m arguing that factories are only justified when the setup logic becomes burdensome. When that occurs is a judgement call.

Maybe I’m splitting hairs here, but I don’t call such abstraction a factory unless the class in question quite literally does nothing but build other objects. And I’m considering dealing with that problem with only a factory for basic cases in this manner. This is a draft:


class Factory {
  public function __invoke() {
     $args = func_get_args();

     // The first argument is the object we are creating.
     $object = array_shift($args);

     if (is_array($object)) {
         $args = $object;
         $object = array_shift($args);
     }

     // Reflection
  }
}

Pseudocode here; this needs further research on my part:

  • Create a Reflection object for the class being built. Check the type hinting of the constructor.
  • Check each argument. If match, pass along.
  • If array and the argument expects an object, recurse - so the first element of the array would be the object name and the remaining values would be the arguments that object receives. Try to create the object, test for match of type with reflection, and if match pass along.
  • For each object expected with no argument recurse to try to create an empty one.
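
The steps above could be sketched roughly as follows. This is only one possible reading of the draft, assuming PHP 7.1 or later (for ReflectionNamedType); the build() helper, the exception messages and the rule that an unmatched object parameter is auto-created without consuming an argument are all assumptions, and union types are simply passed through.

```php
<?php

class Factory {
    // Usage: $factory('ClassName', $arg, ...) or $factory(array('ClassName', $arg, ...)).
    public function __invoke(...$args) {
        // The first argument is the object we are creating.
        $object = array_shift($args);
        if (is_array($object)) {
            $args = $object;
            $object = array_shift($args);
        }
        return $this->build($object, $args);
    }

    // Hypothetical helper: resolves constructor arguments via reflection.
    protected function build($class, array $args) {
        $reflection = new ReflectionClass($class);
        $constructor = $reflection->getConstructor();
        $params = $constructor ? $constructor->getParameters() : array();
        $resolved = array();

        foreach ($params as $param) {
            $type = $param->getType();
            // Only class/interface hints are resolved; everything else passes through.
            $expects = ($type instanceof ReflectionNamedType && !$type->isBuiltin())
                ? $type->getName()
                : null;

            if ($expects === null) {
                if ($args) {
                    $resolved[] = array_shift($args);
                } elseif ($param->isDefaultValueAvailable()) {
                    $resolved[] = $param->getDefaultValue();
                } else {
                    throw new InvalidArgumentException(
                        "Missing argument \${$param->getName()} for $class"
                    );
                }
            } elseif ($args && $args[0] instanceof $expects) {
                // The supplied object matches the type hint: pass it along.
                $resolved[] = array_shift($args);
            } elseif ($args && is_array($args[0])) {
                // Nested spec: array('ClassName', ...constructor arguments).
                $spec = array_shift($args);
                $name = array_shift($spec);
                $made = $this->build($name, $spec);
                if (!$made instanceof $expects) {
                    throw new InvalidArgumentException("Expected $expects for $class");
                }
                $resolved[] = $made;
            } else {
                // Nothing usable was supplied: try to create an empty one.
                $resolved[] = $this->build($expects, array());
            }
        }
        return $reflection->newInstanceArgs($resolved);
    }
}

// Illustrative classes, shaped like the A/B/C examples in this thread.
class C {}
class B {
    public $c;
    public $num;
    public function __construct(C $c, $num) { $this->c = $c; $this->num = $num; }
}
class A {
    public $b;
    public function __construct(B $b) { $this->b = $b; }
}

$factory = new Factory();
$b = $factory('B', 3);                    // C is created automatically
$a = $factory('A', array('B', new C, 7)); // nested spec for the B that A needs
```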

My main worry is that this over-abstracts things, because if this is abused you could only see dependencies in action by opening the class files and reading the type hints.

The problem is you still have the dependency of C in A and you have just moved the “which type of object should I create” logic up a level.

As I said earlier, by using the new keyword like that you forfeit polymorphism.
You’re suggesting this:


class A { 
    public $b = array(); 
    public $container = 'B';
      
    public function __construct(C $c) { 
        for ($i = 0 ; $i < 5; $i++) {
            // Not sure of this syntax at the moment:
            // $this->b[] = new $this->{'container'}($c, $i);

            // This syntax will work for certain though.
            $container = $this->container;
            $this->b[] = new $container($c, $i);
        }   
    }       
} 

class B { 
    public function __construct(C $c, $num) { 
          
    }      
} 


class C { 
          
} 

class Extended_B extends B {

}

$a = new A(new C);
// ignore the fact that we're using the constructor and assume this works:
$a->container = 'Extended_B';

Which works fine until Extended_B needs something else, e.g.:



class Extended_B extends B {
	public function __construct(C $c, D $d, $num) { 
          
    }      
}

Now you need to modify A to account for the addition of the D parameter to the B class (sorry I’m starting to wish I’d not used letters in this example it’s all a bit confusing to read).

It’s likely B (or Extended_B) is being initialised in several places. Each time A is created you need the logic to set $a->container, and you need to extend it to account for any dependency that any subclass of B may ever have.

By moving it to a factory, A (or any other class which initialises B) never has to change, no matter if you want to use something like:


class Extended_B {
	public function __construct($c, $d, $e, $f) {

	}
}


Only the factory would ever need to be modified. Once A is written it’s done; it can continue doing its own thing, enjoying the benefits of polymorphism, while the factory worries about getting the correct dependencies to any descendant of B.
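
To make that concrete, here is a minimal sketch. It assumes a BFactory class (the name is borrowed from a later example in this thread) and an empty D class invented for illustration; the $d property exists only so the result can be inspected.

```php
<?php

class C {}
class D {}

class B {
    public function __construct(C $c, $num) {}
}

class Extended_B extends B {
    public $d;

    public function __construct(C $c, D $d, $num) {
        parent::__construct($c, $num);
        $this->d = $d;
    }
}

class BFactory {
    protected $c;
    protected $d;

    public function __construct(C $c, D $d) {
        $this->c = $c;
        $this->d = $d;
    }

    // The only place that knows Extended_B needs a D.
    public function create($num) {
        return new Extended_B($this->c, $this->d, $num);
    }
}

class A {
    public $b = array();

    // A sees only the factory; it never mentions C or D.
    public function __construct(BFactory $factory) {
        for ($i = 0; $i < 5; $i++) {
            $this->b[] = $factory->create($i);
        }
    }
}

$a = new A(new BFactory(new C, new D));
```

If Extended_B later grows an E or F parameter, the change stays inside BFactory and wherever the factory itself is built.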

From a testability point of view, I disagree. But even ignoring that, you should never use the new keyword if:

-What you’re initialising has dependencies
-The constructor is liable to change during development
-It’s ever going to exist outside the scope of where it was created
-You may need polymorphism. Are you sure it’s always going to be that class you want to use? In 6 months you may create a child class to deal with specific cases. If you use a factory, you only need the logic to decide which to create in one place.

Essentially, using new in arbitrary places in the code impedes testing and scalability by introducing tight coupling.

That said, delegating all object creation to factories is going to be a performance hit. In places which are unlikely to change such as core framework initialisation code the new keyword is probably preferable.

Consider this:


<?php 

class A {
	public $b;		
	
	public function __construct(C $c) {
		$this->b = new B($c); 		
	}
	
}

class B {
	public function __construct(C $c) {
		
	}	
}


class C {
		
}
?>

A now has a dependency on “C” for the sole purpose of passing it on to “B”. It has no use for the dependency itself. If B had 4 dependencies, A would have 4 extra properties that are beyond its real scope. By delegating object creation to a factory, A doesn’t have to care at all about what dependencies B needs.

The same can be said of any other piece of code in the system, so another moot point.

Far from it. If the constructor changes (e.g. a new dependency is added) then I have only to change the factory to account for it. If arbitrary code is calling “new $class” then I have to:
-Locate each part of code doing this
-Update it… probably finding the dependency from somewhere and then updating that class to have the extra dependency too

You don’t need a factory pattern to do new $class($args);

Nope but you need something like:


if ($foo) $class = 'a';
else if ($bar) $class = 'b';
else $class = 'c';

This logic can be delegated to a factory. If you have this repeated throughout the application you again need to go through and update the logic in multiple places if it changes.
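
A minimal sketch of that delegation follows. ThingFactory and the empty classes A, B and C are placeholder names invented for illustration, and $foo and $bar are passed in as plain arguments because the thread does not say where they come from.

```php
<?php

class A {}
class B {}
class C {}

class ThingFactory {
    // The decision about which class to build lives here and nowhere else.
    public function create($foo, $bar) {
        if ($foo) {
            $class = 'A';
        } elseif ($bar) {
            $class = 'B';
        } else {
            $class = 'C';
        }
        return new $class();
    }
}

$factory = new ThingFactory();
$thing = $factory->create(false, true); // an instance of B
```

If the rules change, only create() is edited; callers keep asking the factory for "a thing".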

Thanks guys! I will read the pages!
“The notion that the new keyword is bad is just plain stupid - there are times to use it.”
– I think so too

Here is a pretty clear (with examples in PHP too!) overview of the Abstract Factory and Factory Method (http://sourcemaking.com/design_patterns/factory_method) design patterns. ;)
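
As a rough sketch of the distinction asked about at the top of the thread (every class name below is made up for illustration): Factory Method is a single overridable method that lets subclasses choose what gets created, while Abstract Factory is an interchangeable object that creates a whole family of related objects. The plain "Factory" many tutorials show is usually just a class with a create method and no inheritance involved.

```php
<?php

interface Button {
    public function label();
}

class PlainButton implements Button {
    public function label() { return 'plain button'; }
}

class FancyButton implements Button {
    public function label() { return 'fancy button'; }
}

// Factory Method: the base class leaves ONE creation step to its subclasses.
abstract class Dialog {
    abstract protected function createButton(); // the "factory method"

    public function render() {
        return 'Dialog with a ' . $this->createButton()->label();
    }
}

class PlainDialog extends Dialog {
    protected function createButton() { return new PlainButton(); }
}

class FancyDialog extends Dialog {
    protected function createButton() { return new FancyButton(); }
}

// Abstract Factory: one swappable object builds a FAMILY of related products.
interface WidgetFactory {
    public function createButton();
    public function createDialog();
}

class PlainWidgetFactory implements WidgetFactory {
    public function createButton() { return new PlainButton(); }
    public function createDialog() { return new PlainDialog(); }
}

class FancyWidgetFactory implements WidgetFactory {
    public function createButton() { return new FancyButton(); }
    public function createDialog() { return new FancyDialog(); }
}

$factory = new FancyWidgetFactory();
$rendered = $factory->createDialog()->render();
```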

Well it was a really trivial example to illustrate my point :p. In reality in that case a better solution would be to also inject B into A (not that Michael will agree with that either). A may need to create multiple instances of B in the real world. My point was that forcing A to contain a reference to C just to pass it on to objects it creates is messy and causes issues with both testing and scalability.

Perhaps a better example would be:


<?php 

class A { 
    public $b = array(); 
         
    public function __construct(C $c) { 
        for ($i = 0 ; $i < 5; $i++) $this->b[] = new B($c, $i);  
    } 
     
} 

class B { 
    public function __construct(C $c, $num) { 
         
    }     
} 


class C { 
         
}

By changing it to:


<?php 

class A { 
    public $b = array(); 
         
    public function __construct(BFactory $factory) { 
        for ($i = 0 ; $i < 5; $i++) $this->b[] = $factory->create($i);
    } 
     
} 

class B { 
    public function __construct(C $c, $num) { 
         
    }     
} 


class C { 
         
}

class BFactory {
	protected $c;
	
	public function __construct(C $c) {
		$this->c = $c;	
	}
	
	public function create($num) {
		return new B($this->c, $num);
	}
}
?>

If a new dependency (let’s say D) is added to B, only the factory needs to be updated. A (and every other class which ever needs to instantiate B) remains unchanged. A now works without C. A doesn’t know/care about C’s existence at all. What B needs is irrelevant to A.

It’s a simple case of separation of concerns.

And you’re wasting time planning for eventualities that may never come to be, and creating an unnecessary layer of complexity by religiously and blindly following the pattern.

If something is instantiated only once at one spot in the code new is perfectly fine. The second time it comes up a factory is to be considered. On the third go a factory or something like it is demanded.

Math analogy - Just because 2+2+2+2 is better written as 4x2 does not mean 2x2 is a better way to write 2+2.

Use the pattern when and where the situation demands it. The factory pattern is not suited to all object instantiations, or even most of them.

I’d rather refactor code when needed than lose time (both mine and the computer’s) refactoring code that doesn’t need it.

If something is instantiated only once at one spot in the code new is perfectly fine. The second time it comes up a factory is to be considered. On the third go a factory or something like it is demanded.

Which is fine, until you or another developer come back to something in 6 months’ time and can’t remember exactly where or how many times it’s been used. If I can see it using a factory I know that any changes I make are not going to break something elsewhere in the system. I also don’t need to worry if I add a subclass: I can put the logic which decides which class to create here and know it affects everything from the start.

Finally, there’s testing. I can pass a mock factory. Using new B() means you need to test A and B at the same time.
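
A hand-rolled version of such a mock might look like the sketch below, reusing the A, B, C and BFactory shapes from the earlier example. MockBFactory, its $calls log and the stdClass placeholder are assumptions for illustration; a mocking library would do the same job.

```php
<?php

class C {}

class B {
    public function __construct(C $c, $num) {}
}

class BFactory {
    protected $c;

    public function __construct(C $c) {
        $this->c = $c;
    }

    public function create($num) {
        return new B($this->c, $num);
    }
}

class A {
    public $b = array();

    public function __construct(BFactory $factory) {
        for ($i = 0; $i < 5; $i++) {
            $this->b[] = $factory->create($i);
        }
    }
}

// Records what A asked for and hands back a placeholder instead of a real B.
class MockBFactory extends BFactory {
    public $calls = array();

    public function __construct() {
        // Deliberately empty: the mock needs no C.
    }

    public function create($num) {
        $this->calls[] = $num;
        return new stdClass();
    }
}

$mock = new MockBFactory();
$a = new A($mock);
// $mock->calls now lists every argument A passed to create().
```

No B or C is ever constructed here, so a failure in this test can only come from A.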

Hi…

When to use a factory? You use a factory when you want to alter the object being instantiated, usually a different class. There are 3 main reasons I can think of:

  1. You actually need to decide the starting state of an object within the running app. This is a pretty clear cut decision. The factory will spit out an object in some state depending on the factory’s input. This input could be the incoming request, configuration information or some error state.

  2. You want to swap things out for testing/mocking. This is often the biggest driver if your process is test driven. You start by just passing the needed helper objects in, but find you need them in groups or you end up passing too many around. You group the instantiations into a factory and pass that instead.

Note that the driver here is second order. You just wanted to pass in dependent objects to make testing easier, but then ran into the problem of lots of passing when integrating the larger app. You can actually go a long way before this becomes a problem, so the decision to use a factory can be deferred until it’s obvious.

Why use a factory rather than just instantiating them in the higher up objects? Because that will fill those classes with creation and configuration code that will clutter the mechanics of the existing class. It will also make the class dependent on external input when it doesn’t need to be. Separating the instantiation of foreign objects will stop classes getting bloated.

  3. You want to decouple your system into clearly separated modules. The measure of separation is being able to substitute them cleanly.

Although constantly doing 2 will cause you to do 3 anyway, it’s 3 that has the biggest benefit long term. It’s so much easier to write correct code within a small isolated chunk. Even if your processes are not strongly test driven, most large apps will end up using some kind of dependency management tool. Otherwise the big ball of mud beckons.

The ultimate versions of these organisational factories are the ServiceLocator pattern and dependency injection.

yours, Marcus

That’s a failure in documentation procedure and coder discipline. If those are lacking a factory or anything else isn’t going to save you.

All three are great reasons, but let me add a corollary to this one and maybe get to something I’ve been trying to drive at - not all classes have multiple possible initial states, and some by their very nature do an adequate job of segregating themselves off from the rest of the code, especially if they have no innate dependencies other than PHP itself.

For example, in my framework there’s the ReadOnlyArray object. It takes an array as an argument and essentially locks it down. It has no factory of its own, but a first generation child of it - Library - has one of the most complicated factories I’ve written. The final reason it needs no factory is that it is what I call a first generation object - meaning it has no parents in the framework (it does implement Iterator and ArrayAccess though).

If I use a ReadOnlyArray and decide I need to use a Library then having a factory for ReadOnlyArray will not help - A Library takes very different arguments from its parent anyway. If I wrote a factory to try to handle both cases it would be quite lengthy and convoluted because it was doing too much. And if I don’t well, back to square one.
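
To show the kind of class meant here, below is a guess at what a minimal ReadOnlyArray could look like, assuming PHP 8.0 or later (for the mixed type). This is not the actual framework class; only the name and the two interfaces come from the description above, and throwing LogicException on writes is an assumption.

```php
<?php

class ReadOnlyArray implements Iterator, ArrayAccess {
    protected $data;

    // One plain array in, nothing else needed: no real setup to separate out.
    public function __construct(array $data) {
        $this->data = $data;
    }

    // ArrayAccess: reads are allowed, writes are refused.
    public function offsetExists(mixed $offset): bool {
        return array_key_exists($offset, $this->data);
    }

    public function offsetGet(mixed $offset): mixed {
        return $this->data[$offset];
    }

    public function offsetSet(mixed $offset, mixed $value): void {
        throw new LogicException('ReadOnlyArray cannot be modified');
    }

    public function offsetUnset(mixed $offset): void {
        throw new LogicException('ReadOnlyArray cannot be modified');
    }

    // Iterator: delegate to the wrapped array's internal pointer.
    public function current(): mixed { return current($this->data); }
    public function key(): mixed { return key($this->data); }
    public function next(): void { next($this->data); }
    public function rewind(): void { reset($this->data); }
    public function valid(): bool { return key($this->data) !== null; }
}

$config = new ReadOnlyArray(array('debug' => true, 'name' => 'demo'));
```

With a single array argument and no dependencies, "new ReadOnlyArray($data)" is already about as simple as a factory call would be.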

My main point is this, everyone - simple objects do not need factories. None of you build an arrayFactory into your code to build the array primitives of PHP, do you? I sure hope you don’t have a stdClassFactory either. Some classes do not have variant setup states, and in any event I find it hard to justify a factory for any class with no arguments or only one argument, especially if it has no dependencies and no inheritance.

Tom’s argument about changing classes hinges on the idea that child classes constitute a different starting state. If you write your classes like that then yeah, you’re forcing yourself to use factories. I personally write child classes to significantly modify the behavior of the class - not as a shorthand for expressing an argument. Hence


class A {
  protected $Foo = 'Baa';
}

class B extends A {
  protected $Foo = 'Blee';
}

You pretty much HAVE to have a factory here even if the constructor has no argument. But I discounted this in earlier arguments because I see it as a very bad use of polymorphism.

If a class starts another primitive class I don’t see it needing a factory - again, primitive classes shouldn’t need factories.

It’s complex classes that do. Any class that takes any complex class as an argument or needs to start one.

Maybe this grousing makes more sense.

[ot]Yes I said I put lastcraft on ignore and I did for about a week. Not because he did anything wrong but because I was about to - I use the list as a means to control my temper more often than to block idiots because, truth be known, I’m the worst idiot here.

lastcraft, my apologies for my behavior in the previous thread.
[/ot]

EDIT: One other thing… Child classes must, in my opinion, fulfill the expectations of their parents unless their parents are abstract. Going back to the example above, if a class takes ReadOnlyArray as an argument, it shouldn’t care if it gets a Library or Registry instead, both of which are children of ReadOnlyArray. These child classes give the same public returns as their parent for the methods they inherit from their parent (They are free of course to establish new returns on methods unique to them). This furthers decoupling without resorting to the factory pattern.

The problem occurs when your “primitive class” has behaviour. While testing the class which is using it you are also testing the class itself. You will never know whether there is a bug in the class you’re testing or your primitive object. You seem to be suggesting that data structures and objects with no dependencies are the same. They’re not.


public function foo() {
	$bar = new Bar();
	$bar->process($this->data);
	return $this->data;
}

When it comes to testing this method, if it fails, is it failing because there is something wrong with the method itself or because $bar->process() is doing something iffy?

There is no way to know. This is the problem. If $bar was injected it can be substituted for a mock object and therefore ruled out. Same for a factory.
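
A minimal sketch of the injected version, for comparison. The class name Foo, the body of Bar::process(), the StubBar class and the choice to return the processed value are all assumptions made for illustration.

```php
<?php

class Bar {
    public function process($data) {
        return strtoupper($data);
    }
}

class Foo {
    protected $data;
    protected $bar;

    // Bar is handed in rather than created with "new" inside foo().
    public function __construct(Bar $bar, $data) {
        $this->bar = $bar;
        $this->data = $data;
    }

    public function foo() {
        return $this->bar->process($this->data);
    }
}

// Stub with known, trivial behaviour, so a failing test points at Foo itself.
class StubBar extends Bar {
    public function process($data) {
        return 'stubbed';
    }
}

$foo = new Foo(new StubBar(), 'input');
$result = $foo->foo(); // Bar's real logic never runs here
```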

The best solution is still usually dependency injection rather than creating the object or a factory.

That’s a failure in documentation procedure and coder discipline. If those are lacking a factory or anything else isn’t going to save you

True but if there was a factory in the first place there would be no need to keep a running tally of what’s used where.

O.o Seriously?

Have you ever experienced a bug out of PHP itself?? Trust me, it’s Hell, because it’s the last place you’ll look for a problem.

And guess what, Factory or no I guarantee that if a primitive class that has passed all its unit tests and hasn’t caused problems for months or years has a problem that will also be the last place you’ll look for the problem - substituting it out for a “mock” object will be the last thing you try.

Maybe the notion of a stable API is foreign to you.

So again, your beloved factory pattern here has wasted time to no effect when you insist on misapplying it for eventualities which will never come up. Because I’m willing to bet tracing the problem to the primitive will take just as long with the factory as without so all the time building the factory was wasted.

A Factory is supposed to be a separation of concerns, not a substitute for the new operator. Separate the building of an object from its use. Some objects just don’t have any building process to speak of, so what are you separating? Nothing. Busybox code is what I call that sort of stuff.

You are building an object whatever you do. If that object has behaviour (e.g. it’s not just acting as a data structure), that behaviour can contain bugs.

If you can’t see why isolating code and doing tests on only that code is desirable I don’t think you’ll understand why the new keyword is bad.

o.O Seriously?

That’s not my argument. My argument is it’s no worse than the cure you propose. After all, your solution to code that could break is more code that can break.

At some point something has to depend on something else. That’s why I brought up the specter of PHP itself going bad. What then Tom? You can’t do any effective tests on PHP with a bad PHP Interpreter. That is what you don’t understand.

There is truth in the old adage “The more you overwork the plumbing the easier it is to stop up the drain.” Factories for everything is a perfect example of this - increased complexity at no benefit.

So run with that ball if you want. I’m done with this conversation.

At some point something has to depend on something else. That’s why I brought up the specter of PHP itself going bad. What then Tom? You can’t do any effective tests on PHP with a bad PHP Interpreter. That is what you don’t understand.

This is actually a perfect example. Say you upgrade PHP and the method above stops passing tests. Is it because of something in the method itself or in the $bar->process() method? Sure, it may have been working for years, but some odd PHP bug and/or a removed feature may have broken it.

If you had isolated the code not to rely on “new Bar” then only $bar->process() would fail tests. If not, anywhere which calls “new Bar” now stops passing tests and you have to look through much more code to find the source of the problem.

Anyway, as you say, this conversation is pretty much going around in circles so I’ll leave it there.