OOP: The courier anti-pattern

10 October 2012

Introduction

I'm going to coin this term because I don't believe this anti-pattern is formally recognised. Please email me (tom@r.je) if you know of an existing definition of this and I'll amend this page. I'm writing this because I can't find any documentation on the issues this code smell creates. Even though the implications can be huge and I see this anti-pattern used without question far more frequently than I'd like. Hopefully, providing a solid reasoning against its use and explaining better alternatives will help people avoid using it.

This anti-pattern is easy to identify: It's used as a method of making dependencies accessible to parts of the system which need them by passing them through components which don't require them. This code doing the passing is what I refer to as "the courier"

The two main signs of this are:

1) A class has a dependency but never calls any methods on it

2) A class has a dependency in expectation that subclasses will use it

The problems this causes aren't immediately obvious. Before discussing the implications, here's a simplified example of the anti-pattern:

Example 1a

This often appears when people are using dependency injection at its most basic level without understanding the implications of it.

class Dependency {}

class B {
    public function __construct(Dependency $dependency) {
        //...
    }
}

class A {
    private $dependency;

    public function __construct(Dependency $dependency) {
        $this->dependency = $dependency;
    }

    public function foo() {
        $b = new B($this->dependency);
        //...
    }
}

Here, the class A requries the dependency for the sole purpose of passing it on when it creates an instance of B. This is the most basic form of the courier anti-pattern.

However, a variant of this exists where the dependency is passed into a collaborator:

Example 1b

class Dependency {}

class B {
    public function bar(Dependency $dependency) {
        //...
    }
}

class A {
    private $dependency;
    private $b;

    public function __construct(Dependency $dependency, B $b) {
        $this->dependency = $dependency;
        $this->b = $b;
    }

    public function foo() {
        $b = $b->bar($this->dependency);
        //...
    }
}

Why is this a problem?

The downsides aren't immediately obvious. It's clear that A has a redundant property but that alone doesn't cause any real issues beyond untidyness.

The problems come with maintainability and scalability. What if the implementation of B changes and now requires a second dependency? Both the client and the supplier need to be updated to account for it along with the code anywhere else the supplier is created in this manner. Suddenly yet another dependency has been introduced in A. If A is creating objects alongside B, all the dependencies for those are required in A as well and suddenly A has become a god class.

From an OOP perspective, the underlying issue is one of encapsulation. A shouldn't be aware of the implementation of B. It shouldn't be concerned with what B has as dependencies and it certainly should not force those dependencies up the chain. Using this method throughout the project, every class ends up having a dependency on every dependency that every object below it in the chain has. This makes for very rigid code with zero portability. It becomes impossible to move any class within the chain to a new project unless everything below it is moved as well!

Imagine if, in example 1a, a new dependency is added to B. In this instance, A must be updated to account for it, but in the bigger picture, whatever is initialising A must as well, and whatever is initialising that needs updating too this knock on effect will happen all the way to the very top level of the application code.

Suddenly, every object in the chain has an updated constructor. And who knows where all those objects are constructed? It becomes a huge task to track down all the creation code that exists in the entire application that is affected by the API change forced by simply adding a new dependency to one class at the very bottom of the chain.

The fix

The root cause of this problem is having object creation in business logic. The simplest fix is to inject a fully constructed B into A:

class B {
    public function __construct(Dependency $dependency) {
        //...
    }
}

class A {
    private $b;

    public function __construct(B $b) {
    $this->b = $b;
    }

    public function foo() {
    //...
    }
}

Simple and less code. The problem is, it's also less convenient because whatever creates A has to also create B which pushes the object initiation code up the chain. However, this is preferrable because A is now entirely self-contained. It has a dependency on B but nothing else. This is ideal because any dependencies B has are irrelevant to A. They can change as much as needed throughout the lifetime of the project and A stays exactly as it is.

Sometimes an object does genuinley need to initiate a collaborator. In this instance, the answer is a Factory. This allows the factory to keep track of what dependencies the subclass has and return a fully constructed instance.

class Dependency{}

class B {
    public function __construct(Dependency $dependency, $name) {
        //...
    }
}

class BFactory {
    private $dependency;

    public function __construct(Dependency $dependency) {
        $this->dependency = $dependency;
    }

    public function create($name) {
        return new B($this->dependency, $name);
    }
}

class A {
    private $bFactory;

    public function __construct(BFactory $bFactory) {
        $this->bFactory = $bFactory;
    }

    public function foo() {
        $b = $this->bFactory->create('foo');
        //...
    }
}

In this example, object creation has been delegated to BFactory (Aside: Factories should not be static). This keeps the B's dependencies hidden from any object which needs to initialise an instance of B. Any dependencies B needs can change throught the project's lifetime without needing any modification to A. If B suddenly gets an extra dependency C, this new dependency can be passed to the factory and A remains unchanged and unaware this has happened.

The obvious flaw here is that BFactory contains all the problems A used to, they've just been moved. But, this is the goal, now the logic is all in one place and is therefore reusable. The factory is the only thing to know what dependencies B needs, everything else in the system is totally apathetic. Any place B is created can use the factory and avoid all the problems associated with trying to locate the dependencies needed to initiate B.

Although the factory shows the symptoms of the courier anti-pattern, its sole responsibility is to create B objects. However, it will only ever have the exact same dependencies as B, not the dependencies of B's dependencies and the dependencies of B's depenendencies dependencies (Confusing, eh? Factories help keep things simple) removing the problem of having every dependency in the system in one place. Most of the time, the factory is irrelevant because the object can simply be injected fully constructed. The factory simply makes the problem manageable by recognising it exists.

Variant 1: Inheritance courier

The second case of the courier anti-pattern is a bit more subtle and arguably less problematic. It abuses inheritance by defining a dependency in the parent class for the purpose of making it available to the child class.

Example 2

class Dependency {}


class A {
    protected $dependency;

    public function __construct(Dependency $dependency) {
        $this->dependency = $dependency;
    }
}

class B extends A {
    public function foo() {
        $this->dependency->bar();
    }
}

Here, A has a dependency but doesn't use it. It's there to allow child classes to make use of it. While this provides some convenience for the programmer, it couples the parent class to the dependency for no reason whatsoever. This greatly reduces the portability of it and assumes a lot about its subclasses.

I don't want to stray too much into the subject of "Inheritance breaks encapsulation" as thats a quite large topic all by itself but the biggest issue here is that the parent class has implied knowledge of the implementation of potential subclasses. This directly breaks encapsulation. It also directly exposes the child classes to the implementation of the parent class because of the protected property, but that really is another topic for another time! (Protected/public properties always break encapsulation!)

The problems this causes are arguably minimal and it's certainly less of an issue than the first example. However, scale this type of thinking up and it starts to become messy. What if one child class needs one dependency but another has a different one? Should they both be available to both? As the system grows, so does the base class. It will eventually reach the point where the base class is essentially a service locator. Extending a service locator with arbitrary classes doesn't sound like a good idea does it? But that's essentially all this is.

Although it doesn't present any major problems on the small scale, it's still best avoided in order to offer maximum flexibility. This type of code also hints at bigger issues in the codebase. If all child classes do really need the dependency, what are they doing with it? Presumably it's something similar. Perhaps it's time to refactor that similar code into something reusable elsewhere in the system, possibly even a protected method in the parent with some relevant parameters.

Fixing this issue

class Dependency {}

class A {

}

class B extends A {
    private $dependency;

    public function __construct(Dependency $dependency) {
        $this->dependency = $dependency;
    }

    public function foo() {
        $this->dependency->bar();
    }
}

Once again, the simplest fix is removing the problem entirely. Now this is done, the API on both A and B are explicit in exactly what they need. A knows nothing of the implementation of its subclasses and B has no knowledge of the inner workings of A. Perfect! A can be moved to an entirely different project because it is entirely self-contained.

In the real world, thing's aren't so simple.

Taking a step back to look at the bigger picture, a lot of people will have looked at my suggested fix for the second example and taken issue with it along the lines of thinking "But that won't work in my code because of polymorphism". That's true to an extent! In fact, the main reason code like example 2 gets created in the first place is so that subclasses can be initiated on the fly because the constructor is consistent across them all. With the same constructor this is a lot simpler for the developer.

But once again, it's sacrificing encapsulation and flexibility for convenience. Which is fine, but these kind of things tend to spiral out of control as projects grow and it becomes very difficult to fix. It's impossible to remove a dependency from the parent because there's no way to know which (if any) subclasses use it and as more subclasses with more dependencies are added the consistent constructor suddenly needs to take more and more parameters.

You end up with a very brittle base class that cannot be touched by developers for fear of breaking subclasses. I'm sure everyone has worked with code like this at some point. It's never fun, it's like spinning plates and is one of the biggest problems with breaking encapsulation. The best rule of thumb about breaking encapsulation is: don't. Although there are cases where it's needed, any time encapsulation can be ensured it should be.

So, how can you keep polymorphism and encapsulation? There are plenty of methods: Either a factory for creating the specific subclasses which contains logic to ensure that each subclass is created with the exact dependencies it needs and none that it doesn't. Or even better, a Dependency Injection Container such as Dice can handle all object creation and make this task even easier than calling new $subclass!

Conclusion

This anti-pattern is often used as a method of handling dependencies without the forethought of the implications it causes. I hope that now I've pointed out the serious problems it causes it can be avoided by developers.

Although like a lot of code smells, most of the problems associated with this anti-pattern don't become clear until projects are scaled up or code is reused, it's worth taking the steps to avoid it as there is little to no extra work involved and the minimal benefit (slight convenience for the programmer) is heavily outweighed by the maintainability and flexibility problems it creates.

The real alternative is a proper Dependency Injection Container with factories that sits at a level above the business logic that allows a true separation of object creation and the usage of those objects.