Dependency Injection Breaks Encapsulation

Jeff_Mott · March 4, 2015, 1:23am

No, the kind of code your end-user would write probably should not require knowledge of the BankManager’s dependencies.

Dependency injection almost always comes hand-in-hand with a dependency injection container. The container would manage object creation, configuration, lifetime, and dependencies. And you could even use it as the new entry point of your application.

<?php

namespace BankModule;

class ServiceContainer
{
    // This replaces singletons, globals, statics, etc.
    // Any instance that needs to be reused throughout your application will be saved here.
    private $services = [];
    
    public function getBankManager()
    {
        // This "if not set" guard looks very similar to the singleton pattern, and indeed it is.
        // It's still necessary to restrict some objects to just a single instance.
        // The crucial difference is that there's no statics, no global state.
        if (!isset($this->services['bankManager'])) {
            $this->services['bankManager'] = new BankManager($this->getAccountManager());
        }
        
        return $this->services['bankManager'];
    }
    
    public function getAccountManager()
    {
        if (!isset($this->services['accountManager'])) {
            $this->services['accountManager'] = new AccountManager();
        }
        
        return $this->services['accountManager'];
    }
}

And your end-users would use it this way:

$bankServiceContainer = new BankModule\ServiceContainer();

$bankManager = $bankServiceContainer->getBankManager();
// ...

The latter is more adaptable to change, which is a good thing, because software requirements change frequently. Good software architecture yields a long term payoff. Over the years, as you add, remove, and change features, good architecture makes those changes easy. Bad architecture increases the likelihood of hacks, bugs, inconsistencies, and all-around spaghetti-ness.

For me, good design means that when I make a change, it’s as if the entire program was crafted in anticipation of it. I can solve a task with just a few choice function calls that slot in perfectly, leaving not the slightest ripple on the placid surface of the code.

In the former example, your ShoppingCart class is hardcoded to use BankManager. It can’t swap in any alternative implementation. Also, your ShoppingCart has to know how to create a BankManager, which requires knowledge of, and access to, all of BankManager’s dependencies. That’s something ShoppingCart doesn’t need to concern itself with.

Instead, we resolve the dependencies in the container:

class ServiceContainer
{
    // ...
    
    public function getShoppingCart()
    {
        // No "if not set" guard this time.
        // Remember, the container manages object lifetime.
        // It's responsible for knowing whether you need to reuse an instance,
        // or whether you need to create a fresh instance with each request.
        return new ShoppingCart($this->getBankManager());
    }
}
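
To make the contrast concrete, here is a minimal sketch of the two styles of ShoppingCart side by side. The PaymentProcessor interface, the charge() and checkout() methods, and the names HardcodedShoppingCart, InjectedShoppingCart and DecliningProcessor are all hypothetical, invented for illustration; the stub BankManager and AccountManager here are stand-ins, not the real classes.

```php
<?php

// Hypothetical interface: anything that can take a payment.
interface PaymentProcessor
{
    public function charge(int $amountInCents): bool;
}

// Stand-in for the real AccountManager.
class AccountManager
{
}

// Stand-in for the real BankManager.
class BankManager implements PaymentProcessor
{
    private $accountManager;

    public function __construct(AccountManager $accountManager)
    {
        $this->accountManager = $accountManager;
    }

    public function charge(int $amountInCents): bool
    {
        // A real implementation would talk to the bank here.
        return $amountInCents > 0;
    }
}

// Hardcoded style: the cart must know how to build a BankManager,
// including BankManager's own dependency on AccountManager.
class HardcodedShoppingCart
{
    private $processor;

    public function __construct()
    {
        $this->processor = new BankManager(new AccountManager());
    }

    public function checkout(int $totalInCents): bool
    {
        return $this->processor->charge($totalInCents);
    }
}

// Injected style: the cart only knows the interface it was handed.
class InjectedShoppingCart
{
    private $processor;

    public function __construct(PaymentProcessor $processor)
    {
        $this->processor = $processor;
    }

    public function checkout(int $totalInCents): bool
    {
        return $this->processor->charge($totalInCents);
    }
}

// An alternative implementation (for a test, say) that always declines.
class DecliningProcessor implements PaymentProcessor
{
    public function charge(int $amountInCents): bool
    {
        return false;
    }
}

$realCart = new InjectedShoppingCart(new BankManager(new AccountManager()));
$testCart = new InjectedShoppingCart(new DecliningProcessor());
```

Only the injected cart can be handed DecliningProcessor; the hardcoded one would need its constructor edited to do the same.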

saamorim · March 4, 2015, 10:23am

@Jeff_Mott, right, I mentioned before that dependency injection is usually resolved by a container.

So do you believe that you should do the configuration for each of the “modules”, so others don’t need to care about your internals? In a way, this is how Symfony’s container works: although only a single container exists, each module specifies how its own dependencies are resolved. If so, each module resolves its own dependencies, and leaves extension points to others where appropriate. In the extreme case, we can say that it is completely encapsulated without any “dependency injection”.

But having a container has its drawbacks too. First, your dependencies can scale “out of control”, whether they are defined in YAML, XML or hard-coded (just running app/console container:debug in Symfony scares me sometimes). Second, if you have business logic that decides which of the dependencies to inject in a specific scenario, you must, somehow, do that outside the scope of their “use”, because you’re doing $this->service->get("something"); abstractions like factories can help with this, or the container can handle it directly if it supports scopes. Also, the container should resolve circular dependencies, singletons, factories and version conflicts, and be sufficiently robust in terms of memory management and performance. All of those should be taken into account.
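
One way to keep that kind of business decision next to its use, without reaching into the container, is to inject a factory. The following is only a rough sketch: TransferStrategy, TransferStrategyFactory, the quote() method and the fee values are all made up for illustration.

```php
<?php

// Hypothetical abstraction over "how a transfer is priced".
interface TransferStrategy
{
    public function fee(int $amount): int;
}

class DomesticTransfer implements TransferStrategy
{
    public function fee(int $amount): int
    {
        return 0;
    }
}

class InternationalTransfer implements TransferStrategy
{
    public function fee(int $amount): int
    {
        // Illustrative flat fee.
        return 15;
    }
}

// The factory owns the "which implementation?" decision.
class TransferStrategyFactory
{
    public function forCountries(string $from, string $to): TransferStrategy
    {
        return $from === $to ? new DomesticTransfer() : new InternationalTransfer();
    }
}

// BankManager receives the factory, so it never has to ask a container for anything.
class BankManager
{
    private $strategies;

    public function __construct(TransferStrategyFactory $strategies)
    {
        $this->strategies = $strategies;
    }

    public function quote(string $from, string $to, int $amount): int
    {
        $strategy = $this->strategies->forCountries($from, $to);

        return $amount + $strategy->fee($amount);
    }
}

$bankManager = new BankManager(new TransferStrategyFactory());
```

The factory itself still calls new, so this moves the decision rather than removing it; whether that trade is worthwhile depends on how often the choice changes.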

But, getting back to our question: what if you don’t have a service container (DI doesn’t require one) and you just want plain vanilla DI? Should the BankManager do its own new AccountManager()? A new Address() in the new Person()? A new Address in an AddressFactory->create()?

Where “should” the DI be resolved? Which can also mean: when and where do you make your business assumptions?

So… does Dependency Injection Break Encapsulation?

TomB · March 4, 2015, 12:46pm

The problem with this quite loose approach to the definition of encapsulation is that here:

new A(new B);

The calling code knows that A needs B, which by that definition is “broken encapsulation”. The problem is, from the other perspective:

class A {
    public function __construct() {
        $this->b = new B;
    }
}

Has just moved the issue. The A class, and the author of the A class know which exact implementation of B is being used. Which, by the same definition is “broken encapsulation” from the other end of the program: A knows about the implementation of B when it shouldn’t.

It really is an issue of perspective:

If you’re looking at encapsulation between A and B from the perspective of the A class itself, then DI has better encapsulation because A knows nothing about the implementation of B.
If you’re looking at encapsulation between A and B from the perspective of the code constructing A, then constructing B in A’s constructor has better encapsulation because the calling code doesn’t know anything about the dependencies of A.

The fundamental issue here is that you need to express the relationships between objects somewhere. The advantage of DI is that it’s only one place: The program entry point, not spread among every class.

Jeff_Mott · March 4, 2015, 1:03pm

That sounds like a decent idea to me, yes. The use of DI would allow the vast majority of your classes to be decoupled from each other, and only the front-most class, the container, would have knowledge of the bigger picture and act as the glue between the otherwise decoupled classes.

If for some reason you can’t use even a rudimentary container, like in my example above – maybe you need to maintain a public API and can’t break BC, or something like that – then under those restrictions, I would reconsider using DI. More than anything else, you should make life simple for your end-user.

I think so, yes. Though, the consensus among widely known and highly respected developers (Miško Hevery of Google, for example, has been mentioned in this topic) seems to be that dependencies should not be encapsulated. Instead, a class should make its dependencies obvious.

Jeff_Mott · March 4, 2015, 1:12pm

I don’t think merely knowing that B exists, or knowing its public API, count as breaking encapsulation. We often say it’s better to code to an interface, not an implementation – and it is – but that idea is separate from encapsulation.

TomB · March 4, 2015, 1:30pm

But you are knowing more than its public API, you’re knowing which particular implementation of that API is being used. Let’s put it this way:

$this->foo = new Foo;

$this->foo = $foo;

In the first example, I know a lot more about the nature of $this->foo than I do in the second. I essentially know which implementation is being used. In the second I do not. This is as much “broken encapsulation” as new Bar(new Foo).

saamorim · March 4, 2015, 2:58pm

Glad to know that I put us again discussing the topic rather than code quality.

@TomB somewhere you have to create objects on the program, so you need to make assumptions. The questions that I place are “Where should you do the new Foo()?” and “What is an entry point ?”

If I want some isolation of my module, of my API, I will consider it as my “entry point”. I don’t want others to be messing around and report bugs that could be difficult to make the diagnosis. That doesn’t mean that, internally to the module, I won’t do DI. But I will only expose what I want to. In the scenario that I presented (the Bank Manager), I can test it and guarantee that it works under certain specifications, for instance “the sum of all accounts is equal before and after executing the operations”. If I make all my injections possible, I cannot guarantee that, because I don’t know whether the injected object AccountManagerExtended will hold the same assumptions that I made in the makeDebitTransfer method. I don’t know if it will follow the Liskov principle. The other way to put it is: “Do I trust my callers?” This is almost the same principle as letting the user access the database (as he can supply his own DataAccess layer implementation).
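
The worry about a substitute that breaks the module's assumptions can be sketched as follows. Everything here is hypothetical: the AccountManagerInterface, its debit(), credit() and total() methods, and the behaviour of AccountManagerExtended are invented to illustrate the point, with a runtime check standing in for the "sum of all accounts" guarantee.

```php
<?php

interface AccountManagerInterface
{
    public function debit(string $account, int $amount): void;
    public function credit(string $account, int $amount): void;
    public function total(): int;
}

class AccountManager implements AccountManagerInterface
{
    protected $balances;

    public function __construct(array $balances)
    {
        $this->balances = $balances;
    }

    public function debit(string $account, int $amount): void
    {
        $this->balances[$account] -= $amount;
    }

    public function credit(string $account, int $amount): void
    {
        $this->balances[$account] += $amount;
    }

    public function total(): int
    {
        return array_sum($this->balances);
    }
}

// A substitute that quietly breaks the provider's assumption by skimming one unit.
class AccountManagerExtended extends AccountManager
{
    public function credit(string $account, int $amount): void
    {
        parent::credit($account, $amount - 1);
    }
}

class BankManager
{
    private $accounts;

    public function __construct(AccountManagerInterface $accounts)
    {
        $this->accounts = $accounts;
    }

    public function makeDebitTransfer(string $from, string $to, int $amount): void
    {
        $before = $this->accounts->total();

        $this->accounts->debit($from, $amount);
        $this->accounts->credit($to, $amount);

        // The invariant from the post: the sum of all accounts must not change.
        if ($this->accounts->total() !== $before) {
            throw new LogicException('Injected account manager broke the transfer invariant');
        }
    }
}
```

A check like this only detects the violation after the fact. It does not settle whether the injection point should be exposed in the first place.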

Again, for me it is a matter of balance based on your context: knowing where to encapsulate (resolve the dependencies on your own) and where to open them up.

TomB · March 4, 2015, 3:10pm

I meant the entry point of the program. In Java this is the main method, in PHP this is the place that creates the first object. Usually index.php or similar.

The problem I have with this is the caller is either:

Some code you wrote
A test case
Someone else using your module

In any case, it should be up to the caller to configure the application. If a third party using your library wants it to work differently, and so doesn’t inject the implementation you provided, that should be up to them. Without DI, if their project has slightly different requirements and your module doesn’t meet them, then they simply cannot use your module. With DI, they can tweak it by changing the dependencies.

I don’t want others to be messing around and report bugs that could be difficult to make the diagnosis

By using injection and TDD this simply won’t be an issue as everything is isolated for debugging purposes.
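
As a rough illustration of that isolation, a test can hand A a fake B and observe exactly what A did with it. BInterface, fetch(), shout() and FakeB are hypothetical names, and the checks are plain PHP rather than any particular test framework.

```php
<?php

interface BInterface
{
    public function fetch(): string;
}

class A
{
    private $b;

    public function __construct(BInterface $b)
    {
        $this->b = $b;
    }

    public function shout(): string
    {
        return strtoupper($this->b->fetch());
    }
}

// Test double standing in for a real B, which might hit a database or the network.
class FakeB implements BInterface
{
    public $calls = 0;

    public function fetch(): string
    {
        $this->calls++;

        return 'hello';
    }
}

$fake = new FakeB();
$a = new A($fake);
$result = $a->shout();
```

If $result is wrong here, the fault can only be in A, because the fake has no behaviour of its own to go wrong.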

TomB · March 4, 2015, 3:25pm

I want to demonstrate another problem with the new keyword in constructors. Consider this class hierarchy:


class A {
	public function __construct() {
		$this->b =  new B;
	}
}

class B {
	public function __construct()	{
		$this->c = new C;
	}
}

class C {

}

Fine, this works, but the problem is, if the requirements for C change slightly and it now needs to write to a temporary directory:

class C {

	public function __construct($tmpDir) {
		$this->tmpDir = $tmpDir;
	}

	public function doSomething($data) {
		file_put_contents($this->tmpDir . '/foo.txt', $data);
	}
}

This causes a problem. $tmpDir must be supplied by the application developer as it’s likely going to be different on each system it runs on.

The only way to resolve this is to change the classes higher in the object graph so that they ask for a $tmpDir variable:


class A {
	public function __construct($tmpDir) {
		$this->b =  new B($tmpDir);
	}
}

class B {
	public function __construct($tmpDir)	{
		$this->c = new C($tmpDir);
	}
}

Of course, as the system grows, this gets very messy:


class A {
	public function __construct($tmpDir, $language) {
		$this->b =  new B($tmpDir);
		$this->d = new D($language);
	}
}

class B {
	public function __construct($tmpDir)	{
		$this->c = new C($tmpDir);
	}
}

class D {

	public function __construct($language)	{
		
	}
}

A now has a dependency on application configuration for everything in the object graph. Clearly this becomes a bigger issue the bigger the object graph gets.

This breaks encapsulation the other way. A now knows about an implementation detail of C and D which it really doesn’t need to.

Every time I add a constructor argument to C that is application-configurable I must update B and A (and if the object graph is deep that could be a lot of places!)

The alternative to this is global/static variables but that’s worse!

saamorim · March 4, 2015, 4:09pm

Ok. It’s your point of view and I respect it, but you’re forgetting to put yourself in the role of a provider. Do you want to receive notifications that your code breaks when it’s up to you to check why it broke?

BTW, a FactoryFoo in the Factory pattern does do the new Foo(). Will you resolve it as a dependency, or do you just not use this pattern?

But most importantly, you’ve contradicted yourself:

You’ve just shown that, by your rules, if your application gets bigger, no matter what, you’ll get a system that is just “not right”. You mentioned before, several times:

So, where does this best practice fail if you must follow it?

I think you’re missing the fact that you should consider alternatives. There isn’t only one way, one dogma, one religion; programming isn’t a dictatorship. Again, like I said before, I respect the “best practices” and tend to follow them, but knowing when to break them is as important as following them. Also keep in mind that no program will last forever; everything changes. The question is one of minimizing the changes and trying to get the most value out of them.

Be reasonable

TomB · March 4, 2015, 4:15pm

But if someone is using a custom implementation, they’ve actively decided to do that; if it then breaks, it’s their own fault. If the documentation says use the code as “new A(new B)” and they use a different argument, it’s their own fault. The same goes for non-argument config: $foo->setNumRows(“FIVE!”); If someone uses an invalid argument, it’s their own fault it breaks…

Sorry, I don’t understand how this relates to what you quoted.

In DI, I can easily solve that problem:

new A(new B(new C));

The implementation of C changes, I change the C class and one other place:

new A(new B(new C('/tmp')));

I don’t need to edit all the classes in the object graph. When using the new keyword in constructors, these kinds of requirement changes are very hard to implement and mean extra constructor arguments for everything down the chain of created objects.

As the application grows and constructor arguments change, I only ever have to change them in one place, not pass them through potentially dozens of other classes.
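
Spelled out with classes, the DI version might look like the sketch below. The accessor methods b(), c() and path() are hypothetical, added only so the wiring can be inspected; they are not part of the earlier examples.

```php
<?php

class C
{
    private $tmpDir;

    public function __construct(string $tmpDir)
    {
        $this->tmpDir = $tmpDir;
    }

    // Where C would write its file; no file is actually created here.
    public function path(): string
    {
        return $this->tmpDir . '/foo.txt';
    }
}

// B asks for a C and neither knows nor cares how C was configured.
class B
{
    private $c;

    public function __construct(C $c)
    {
        $this->c = $c;
    }

    public function c(): C
    {
        return $this->c;
    }
}

// A asks for a B; $tmpDir never appears in A or B.
class A
{
    private $b;

    public function __construct(B $b)
    {
        $this->b = $b;
    }

    public function b(): B
    {
        return $this->b;
    }
}

// Entry point (index.php or similar): the only place that mentions the directory.
$a = new A(new B(new C('/tmp')));
```

Adding another configurable argument to C would touch the C class and that last line, while A and B stay as they are.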

With DI they mostly become irrelevant… why add extra code if you don’t need to? You also then need some way of informing the factory what the constructor arguments for the class are.

saamorim · March 4, 2015, 4:30pm

Don’t get me wrong @TomB, but it seems you’re suffering from the same symptom that @tony_marston404 does: “My way is the only and best way in any scenario”. But I respect you both, as long as you stick to and own your decisions.

I conclude with my previous message.

Regards

TomB · March 4, 2015, 4:56pm

But this is @tony_marston style logic. “I DONT WANT TO DO IT” without any kind of reasoning. I didn’t say never break them, however I will say that I’ve yet to be shown an example where breaking best practices in a non-trivial application could be classed as better design. Sometimes it saves some work as a shortcut, but that also often leads to you refactoring it when the requirements change.

As the discussion has mainly focussed on when you should use DI and when you should not, and nobody has really presented a clear description or use-case for when you should not, it doesn’t really leave much room for useful discussion.

Essentially we have a list of tangible reasons for using DI (flexibility, loose coupling, easier maintenance, better separation of concerns) and a few vague allusions to use-cases where you shouldn’t use DI that aren’t really backed up by anything more concrete than “always following the rules is bad”.

tony_marston404 · March 5, 2015, 3:21pm

But only in those situations for which it was not designed.

Nobody needs to know the implementation of any of my Model classes. They only need to know which methods are available. So the getData() method on the Person class will retrieve Person data while the same method on the Address class will retrieve Address data. The fact that the data may have to be retrieved from more than one table should be irrelevant.

tony_marston404 · March 5, 2015, 3:24pm

But there is no single document called “best practices” which is universally accepted by all programmers. I may not be following your personal version of best practices, but so what?

TomB · March 5, 2015, 3:40pm

This is entirely irrelevant to what you’re quoting. The article I quoted stated that DI caused that to happen. Please stop relating everything back to your own code.

He shoots! He misses entirely! The question you should be asking here is “how do you define best?” Which of course, is a valid question. The answer to that, however, is generally considered to be software that is easy to maintain or modify when the requirements change. The article that saamorim posted earlier had a fantastic quote:

Uncle Bob Martin once said, I believe in his Clean Coders video series, that (paraphrased) the second most important characteristic of good software is that it meet the customer requirements. The most important characteristic is that it be easy to change. Reason being, if a system is correct today but rigid, it will be wrong tomorrow when the customer wants changes. If the system is wrong today but flexible, it’s easy to make it right tomorrow. It may not be perfect, but I like DI because I need to be right tomorrow.

Once we’ve established what “best” means only then can you use it as a metric.

“Best” in programming tends to mean flexibility, and the list at http://c2.com/cgi/wiki?GoodArchitecture is quite good:

Robust - lacking bugs and tolerant of external faults
Maintainable - easy to maintain and extend
Useful - utility, beyond the immediate need (due to flexibility and extensibility)
Scalable - ability to grow in capacity, not in features
Common Vision - direction, strategy
Agile - simple and “elegant” enough to refactor easily; flexible
Extensible - ability to grow in features or in depth
Responsive - performance now and after adding features or expanding scale

Then we can accurately determine whether a given practice meets those requirements; only then can it be deemed “best” or “bad”. Once you have that metric, you can accurately compare two practices to see how well they meet the given criteria and mark one as better than the other.

We can also, at that point create a list of traits that go against what we consider best and say they are “bad practice”. If flexibility is an underlying metric for “best” then things such as tight coupling and global variables actively work against this so can be considered “bad” if your metric is flexibility.

So Tony, the question to you is, how do you define “best”?

tony_marston404 · March 5, 2015, 3:43pm

Is this true? Object “A” knows about the existence of object “B”, and it knows about B’s methods, but it should not know the implementation details for any of those methods.

Encapsulation has nothing to do with relationships between different objects. Object A encapsulates all the properties and methods of entity A, while object B encapsulates all the properties and methods of entity B. Object A may call object B, which means that A is dependent on B, that A is coupled to B, but there is never any encapsulation “between” objects.

But what types of object? I can inject any one of my 300+ Model classes into any one of my reusable Controllers and Views, which is why I use DI there. But when it comes to Model classes talking to other Model classes, I don’t use DI simply because I wish to talk to a particular model class that has no alternatives, so as far as I am concerned it is far easier to have that dependency hardcoded and hidden from the outside world.

TomB · March 5, 2015, 3:44pm

Please stop relating everything back to your framework and provide complete, minimal examples that don’t need an understanding of a third party architecture that uses 9000 line god classes.

As I said to Jeff_Mott earlier.

Consider the difference between:

$this->foo = $foo;

and

$this->foo = new Foo();

In the second example, the code knows exactly which implementation of Foo is being used. In the first, it does not. In the first example that information is hidden from the object that contains the code.

tony_marston404 · March 5, 2015, 3:51pm

This is the point I have been trying to make. DI is most useful where an object has a dependency which can be fulfilled from a variety of alternatives, so you choose which object you want and inject it rather than having some complicated code inside the object which has to make that decision internally. On the other hand, if an object has a dependency which is fixed and where there are no alternatives, it is possible to keep that dependency hidden and not have it injected.
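
A minimal sketch of that mixed approach, assuming a shared Model interface with a getData() method as described earlier in the thread. The ListController class, its run() method and the sample data are invented for illustration and are not taken from any actual framework.

```php
<?php

// Assumed common contract for all Model classes.
interface Model
{
    public function getData(): array;
}

class Address implements Model
{
    public function getData(): array
    {
        // Placeholder data; a real model would read from its table(s).
        return ['city' => 'Lisbon'];
    }
}

class Person implements Model
{
    private $address;

    public function __construct()
    {
        // Fixed dependency with no alternatives: created internally and hidden.
        $this->address = new Address();
    }

    public function getData(): array
    {
        return ['name' => 'Alice', 'address' => $this->address->getData()];
    }
}

// Reusable controller: works with whichever Model it is handed, so the model is injected.
class ListController
{
    private $model;

    public function __construct(Model $model)
    {
        $this->model = $model;
    }

    public function run(): array
    {
        return $this->model->getData();
    }
}

$personList = new ListController(new Person());
$addressList = new ListController(new Address());
```

The cost of hiding Address inside Person is the one raised earlier in the thread: Person cannot be given a different Address implementation without editing Person itself.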

tony_marston404 · March 5, 2015, 3:56pm

You are assuming that everyone is writing reusable libraries with interchangeable components that must be configured before they can be used. But what if that is not the case? What happens if a component within an application has hard-coded and unchangeable dependencies which exist within the same application, and which do not need external configuration?