The $this variable isn't as Object-Oriented as you think it is

17 October 2013

Introduction

Update: 07/11/2013 - Added several links to real world examples of this problem.

Update: 24/10/2013 - Thanks for your questions, I have amended this post to improve clarity. Feel free to ask further questions.

Here's something I've had drafted for a few years but never been able to express eloquently enough to pre-empt all the questions I'll get from it. However, it's been on my to-do list too long so I'm just going to put it out there and provide clarification based on any questions I get so please, ask away! Today I present my attempt at explaining a concept which to most people reading this will seem very counter-intuitive:

$this->method(); is essentially procedural programming.

$this keyword

Before I explain, let's take a look at the $this keyword. It only exists for the purposes of Object-Oriented Programming and has no meaning whatsoever outside OOP. What could possibly be more OOP than that?

It's the $this keyword which magically allows each instance of an object to access its own state. $this refers to the current object. In this regard, when using variables, the $this keyword is incredibly powerful in the world of OOP.

For accessing state, $this is fantastic and poses no problems with functionality, flexibility or any of the other things that OOP strives for.

However, as soon as you use it to call a method in the same class, that line blurs and it's less Object-Oriented than it first appears. You have essentially locked yourself into very specific implementation for that method. Ideally, in OOP, the implementation for almost every method should be able to be substituted. Firstly to make testing easier, and secondly, because it gives the whole application far greater flexibility.

Imagine a program that consisted of a single instance of a single class. It would still work exactly the same if you removed the class {} wrapper and had all the variables as global and the methods as procedural functions. Then imagine multiple instances of that same single class. Now you can use different data but the behaviour (methods) cannot be changed.

Consider this:

class A {
    public function foo() {
        $this->bar('x');
    }


    public function bar($x) {
        echo $x;
    }
}

The $this->bar() call in A::foo() can only ever reference A::bar(). It cannot be substituted. Any time $this->bar(); is called from any other method in the A class, the $this variable will contain an instance of A and the implementation for the method call must be supplied by A::bar() (albeit being called non-statically).

The effect of the same-class method call is that it essentially makes the method call... static. Ignoring access to object state, consider the functional difference between these two classes:

class A {
    public function foo() {
        $this->bar('x');
    }


    public function bar($x) {
        echo $x;
    }
}

class A {
    public function foo() {
        A::bar('x');
    }


    public static function bar($x) {
        echo $x;
    }
}

Or even:

function foo() {
    bar('x');
}

function bar($x) {
    echo $x;
}

Answer: there is none. They all work in exactly the same way. Once A::foo() gets called, A::bar gets called. In all three scenarios, there is no way to alter that and initiate an A object with a different implementation of the bar() method. In some cases that may be useful, but in most, it causes problems. Consider unit testing. It's impossible to test A::foo(); without also testing A::bar(). If there's a bug in A::bar() it will directly affect A::foo(). By using $this->bar(); it's impossible to substitute the bar() implementation during testing and quickly isolate the location of the bug (whether it's in the foo method or the bar method). If a test on the a::foo() method fails, it's impossible to know whether that was caused by code in the foo() method or code in the bar() method.

Essentially, at a less significant level, it's the same reasoning as dependency inejction vs singletons. It's Tell, Don't Ask at a method level rather than an object level. With dependency injection you can inject a mock object for testing. Using singletons, changing the behaviour of the dependency becomes problematic and I'm sure most people reading this will have already come across the hundreds of articles which explain why singletons are a bad idea.

Although the negative impact of $this->method(); is small compared to singletons because all the methods involved exist within one class, the reasoning is the same and generally an inappropriate $this->method(); call highlights a much bigger issue with Separation of Concerns.

You'll also notice these arguments are identical to those provided by OO purists such as Misko Hevery regarding static methods. That's because, as I mentioned earlier, a call to $this->method(); is essentially a static method call because it cannot be substituted during testing (and by extension at runtime or during the life of the application).

If those three examples are all functionally equivalent, what is gained by using OOP in the first place?

What if I wanted to write to a file instead of echoing to the screen? Inheritance is one possibility:

class A {
    public function foo() {
        $this->bar('x');
    }


    public function bar($x) {
        echo $x;
    }
}

class B extends A {
    public function bar($x) {
        file_put_contents('file', $x);
    }
}

At first glance, this solves the problems I presented earlier. The bar() method has been successfully substituted and:

$this is actually context-dependent. It can refer to an A object or a B object.
The method bar() can be substituted using a subclass so doesn't always mean A::bar()

However, it's been done in a very inflexible way.

Extending the whole class to override one method is problematic because what if you want a variant that overrides two methods in different ways?

It's still not very OO because at the point that the object is constructed, the implementation is locked. An A Object will always echo and an B object will always write to a file. Consider this:

class A {
    public function foo() {
        $this->bar('x');
    }


    public function bar($x) {
        echo $x;
    }
}

class B extends A {
    public function bar($x) {
        file_put_contents('file', $x);
    }
}

$a = new A;
$b = new B;

$a->foo();
$b->foo();

At this point, the instance of A in the $a variable can only ever output to the screen and the instance of B in the $b variable can only ever write to a file. That's because of the internal method call $this->bar(); referencing a predetermined implementation of bar(), determined at the time the object was declared a specific type (A or B)

So how is that fixed? As many will have identified, this is a separation of concerns issue and would be fixed by splitting the class out into two parts:

interface AOutputter {
    public function bar($output);
}


class A {
    public function foo(AOutputter $outputter) {
        $outputter->bar('x');
    }
}


class AScreenOutput implements AOutputter {
    public function bar($x) {
        echo $x;
    }
}

class AFileOutput implements AOutputter {
    public function bar($x) {
        file_put_contents('file', $x);
    }
}


$a = new A;
$b = new A;

$a->foo(new AScreenOutput);
$b->foo(new AFileOutput);

The B class has been scrapped entirely and two new classes with the sole responsibility of outputting have been created and passed into A's foo method

The result is the lack of a $this->bar() call, instead it's being called on an dynamic object. The $outputter variable, could refer to an object of any type. Whereas $this can only refer to an object of the current object's type. The benefit here is that the output mechanism for any A object can be decided at runtime on the fly and not at the moment the object is initialised.

This is a crucial concept in OOP and the reason that $this->method(); should be avoided wherever possible.

How to avoid this problem

Every time you're calling $this->method(); in a class, move the method into another class and add it as a dependency to the original class. For example:

class A {
    public function foo() {
        $this->bar();
    }

    public function bar() {
        //....
    }
}

Should be refactored to:

class Dependency {
    public function bar() {
        //....
    }
}


class A {
    private $dependency;

    public function __construct(Dependency $dependency) {
        $this->dependency = $dependency;
    }

    public function foo() {
        $this->dependency->bar();
    }
}

Here, the implementation of the bar() method is not contained with A. That means that as A is constructed, the implementation of the bar() method that it will use can be decided and passed in. Whereas in the first example, the bar() implementation is entirely static and every A object must use the exact same implementation. With dependency injection method, the bar() implementation can be decided at runtime.

Real world

The example above is very contrived and makes it easy to see the issues. In the real world it tends to be slightly more complicated than that, but here's a very real widely implemented example of this type of problem:

class BlogController extends Controller {

    public function indexAction() {
        //Assign some variables to a template

        //..
        //..
        $this->render();

    }

    public function editAction() {
        if ($submitted) {
            //Do some data processing then...
            //...
            //...
            $this->indexAction();
        }
    }

    public function deleteAction() {
        //Delete selected entry then....

        $this->indexAction();

    }

}

Look familiar? This has the exact same issues with flexibility that the contrived example above does. It always calls $this->render() so must render the data in a very specific, predetermined way. It also suffers poor separation of concerns by calling $this->indexAction() from other methods. It's not possible to execute the logic for editing/deleting and then show a different view. It can only ever call BlogController::indexAction() and never anything else!

Calling $this->method(); is almost always a symptom of poor separation of concerns. Method chaining within the same class can usually be better represented by splitting those methods into separate classes which have a single responsibility.

Feedback

I recieved several emails commenting that this isn't a problem with the $this keyword, but a poor separation of concerns. And I agree, $this->method is exactly that. However, it's also common:

Play Framework (Action chaining) - In this example it's impossible to do anything other than show a client after creating one. For instance, it's impossible to reuse the client creation logic and in some circumstances send the new client a welcome email. Or create a user using the same logic without then rendering data about the newly created user.
Symfony Blog Tutorial - Here it's impossible to reuse the form processing logic and have it do anything apart from redirect on success or show the view on failure. That's incredibly infelexible.
CakePHP Blog Tutorial impossible to substitute what is having the variables set and impossible to change the behaviour of what happens on success/failure

There are plenty more similiar real-world examples and the most common culprit is in the controller layer however it also exists in the model layer.

I'd like to dissect one of the examples, the Symfony Blog tutorial:

public function createAction($blog_id)
    {
        $blog = $this->getBlog($blog_id);

        $comment  = new Comment();
        $comment->setBlog($blog);
        $request = $this->getRequest();
        $form    = $this->createForm(new CommentType(), $comment);
        $form->bindRequest($request);

        if ($form->isValid()) {
             $em = $this->getDoctrine()
                       ->getEntityManager();
            $em->persist($comment);
            $em->flush();

            return $this->redirect($this->generateUrl('BloggerBlogBundle_blog_show', array(
                'id' => $comment->getBlog()->getId())) .
                '#comment-' . $comment->getId()
            );
        }

        return $this->render('BloggerBlogBundle:Comment:create.html.twig', array(
            'comment' => $comment,
            'form'    => $form->createView()
        ));
    }

Without modifying the above createAction() method (or other methods in the class), can you:

Send an email to an admin when a comment is added?
Send an email and redirect when a comment is added?
Redirect to a different page if the user adding a comment is logged in?
Send an email to an admin when a comment is added to a blog in a specific category?
Use a different form for logged in users (e.g. logged in users do not need a CAPTCHA)?
Add a comment without redirecting?
Stop spam by only allowing 10 comments every hour per user?
Use a different template engine instead of Twig?

The answer to all of these answers is of course No. However that is due entirely to the poor separation of concerns whose symptom is the $this->method() calls.

Consider this minor amendment to the functionality of the original class:

interface CommentAction() {
    public function success($comment);
    public function failure($comment);
}

interface Renderer {
    public function render(array $data);
}


class TwigRenderer implements Renderer {
    private $template;

    public function __construct($template) {
        $this->template = $template;
    }

    public function render($data) {
        $this->twigRender($this->template, $data)
    }

    private function twigRender($template, $data) {
        //whatever the controller::render() method was doing previously.
    }

}

class CommentController {
  private $form;
  private $commentAction;
  private $renderer;

  public function __construct(Form $form, CommentAction $commentAction, Renderer $renderer) {
          $this->form = $form;
          $this->commentAction = $commentAction;
          $this->renderer = $renderer;
  }

  public function createAction($blog_id)
    {
        $blog = $this->getBlog($blog_id);

        $comment  = new Comment();
        $comment->setBlog($blog);
        $request = $this->getRequest();
        $form    = $this->form;
        $form->bindRequest($request);

        if ($form->isValid()) {
             $em = $this->getDoctrine()
                       ->getEntityManager();
            $em->persist($comment);
            $em->flush();

            $this->commentAction->success($comment);

            );
        }
        else {
            $this->commentAction->failure($comment);
        }

        return $this->renderer->render(array(
            'comment' => $comment,
            'form'    => $form->createView() ));
    }
}

I've made some minor changes, removing several of the $this->method(); calls. Now let's ask the same set of questions:

1. Send an email to an admin when a comment is added?

class EmailCommentAction implements CommentAction {
    public function success($comment) {
        mail('.....');
    }

    public function failure() {
        return;
    }

}

$commentController = new CommentController($form, new EmailCommentAction(), new TwigRenderer('BloggerBlogBundle:Comment:create.html.twig'));

2. Send an email and redirect when a comment is added?

class RedirectCommentAction implements CommentAction {
    public function success($comment) {
        header('Location:: ...');
    }

    public function failure() {
        return;
    }

}

class CommentActionList implements CommentAction {
    private $commentActions = array();

    public function addAction(CommentAction $action) {
        $this->commentActions[] = $action;
    }

    public function success($comment) {
        foreach ($this->commentActions as $action) $action->success($comment);
    }

    public function failure() {
        foreach ($this->commentActions as $action) $action->failure($comment);
    }

}

$actions = new CommentActionList();
$actions->addAction(new EmailAction());
$actions->addAction(new RedirectAction());

$commentController = new CommentController($form, $actions, new TwigRenderer('BloggerBlogBundle:Comment:create.html.twig'));

3. Redirect to a different page if the user adding a comment is logged in?

class SmartRedirectCommentAction implements CommentAction {
    private $auth;

    public function __construct(Auth $auth) {
        $this->auth = $auth;
    }

    public function success($comment) {
        if ($this->auth->isLoggedIn()) {
            header('Location:: /user/page');
        }
        else {
            header('Location:: /guest/page');
        }
    }

    public function failure() {
        return;
    }

}


$commentController = new CommentController($form, new SmartRedirectCommentAction, new TwigRenderer('BloggerBlogBundle:Comment:create.html.twig'));

4. Send an email to an admin when a comment is added to a blog in a specific category?

class EmailForCategoryCommentAction implements CommentAction {
    public function success($comment) {
        if ($comment->getBlog()->cateogry == 4) {
            mail('....');
        }
    }

    public function failure() {
        return;
    }

}


$commentController = new CommentController($form, new EmailForCategoryCommentAction, new TwigRenderer('BloggerBlogBundle:Comment:create.html.twig'));

5. Use a different form for logged in users (e.g. logged in users do not need a CAPTCHA)?

if ($auth->isLoggedIn()) {
    $form = new UserForm;
}
else $form = new GuestForm;

$commentController = new CommentController($form, new EmailForCategoryCommentAction, new TwigRenderer('BloggerBlogBundle:Comment:create.html.twig'));

6. Add a comment without redirecting?

class EmptyCommentAction implements CommentAction {
    public function success($comment) {
        return;
    }

    public function failure($comment) {
        return;
    }
}

$commentController = new CommentController($form, new EmptyCommentAction, , new TwigRenderer('BloggerBlogBundle:Comment:create.html.twig');

7. Stop spam by only allowing 10 comments every hour per user?

class AntiSpamCommentAction implements CommentAction {
    public function success($comment) {
        //incomplete, needs the hour check but you get the idea.
        if ($_SESSION['numcomments'] > 10) {
            $comment->delete();
        }
        $_SESSION['numcomments']++;
    }

    public function failure($comment) {
        return;
    }
}
$commentController = new CommentController($form, new AntiSpamCommentAction, , new TwigRenderer('BloggerBlogBundle:Comment:create.html.twig');

8. Use a different template engine instead of Twig?

class SmartyRenderer implements Renderer {
    public function render($data) {
        //Use smarty to deal with $data
    }
}


$commentController = new CommentController($form, new SomeCommentAction, , new SmartyRenderer('x.tpl.php');

With some very minor changes the flexilibty and reusability of that piece of code has been greatly enhanced. I can now reuse the logic for different use-cases. This works particulary well when you want to share code across projects and the rules are slightly different. You can fix a bug in your CommentController class and deploy it across all your projects because the code is shared, even if the rules are different. On one site you might send an email to the admin, on another you might redirect logged in users to a different page, on another you might use a different template engine, on another you might send an email to a group of administrators instead of just one, etc etc.

All these scenarios are accounted for with the exact same CommentController class. So you can fix a bug in the CommentController::createAction() method and push it to all of the sites even though their behaviour is slightly different.

The same goes for adding a new action to the controller, making changes to other methods in the class, you can do it once and deploy it to all your sites without worrying about the differences in the behaviour between them.

And how beautiful is the answer to question 2? The CommentActionList class which lets you chain actions in any order/combination you want without a single if statement.

Conclusion

$this->method(); is always procedural programming because the implementation of the method cannot be substituted at runtime, severely limiting the application's flexibility. Method chaining in this manner is a symptom of poor separation of concerns. Instead, move the implementation to another class so that it can be substituted at runtime.

That's not to say calling $this->method(); is always bad. If you're calling a private method which builds a data structure or reduces repetition of code for a task then a $this->method() call is probably ok. However, if you're using $this->method(); to chain actions and alter program flow then you're almost certainly doing something wrong and needlessly limiting the flexibility of your code.

Symptoms of a problematic $this->method(); call include:

The method being called causes a change in state (but method() is not a simple setter)
The method being called is defined in a parent class but the method is only ever called by subclasses
Calling $this->method(); when method() does not have a return value
Directly returning the return value of the method call: return $this->method();
Using $this->method(); to chain different behaviours (e.g. Add to the database then send a confirmation email, or process a form then perform a redirect)
Calling a method in a parent class which is never used by anything other than sublcasses (E.g. $this->redirect() or $this->set() in controllers)

It's not all $this->method(); calls which are problematic, however, a lot of cases the same class method call indicates a poor separation of concerns.

Good uses of $this->method(); call include:

Calling public getter/setter methods which are also used by client code (if they exist only for subclasses to use then it highlights a poor separation of concerns)
Calling private methods which reduce repetition or help break up a problem that do not alter the object's state (However, it's often still better to err on the side of caution and move this to its own class).

Potentially good use of $this->method():

class Product {
    private $price;
    private $taxRate;

    public function foo() {
        $price = $this->getTotal();

    }

    public function getTotal() {
        return $this->price * $this->taxRate;
    }

}

(However, even this could be better described using a TaxCalculator Object, why would a product need to know what the tax rate is?)

Problematic uses of $this->method():

class ProductController {

    public function insertAction() {
        if ($this->model->insert($data)) {
            $this->redirect('/success.html');
        }

    }

}

class ProductController {
    public function ViewAction() {
        $product = $this->model->findById($_GET['id']);

        $this->setVar('title', $product->title);
    }
}

class BlogModel {
    public function add($data) {
        $blog = $this->mapper->insert('...');
        $this->addImage($data['image']);
    }
}

class BlogModel {
    public function addComment($blogId, $commentData) {
        //Add blog comment
        //....
        //....

        $this->sendNotifications();
    }
}

Generally speaking, the negative impact of a $this->method(); call itself is minimal and the only direct effect is limiting flexibility. However, that limited flexibility is a symptom of a much larger, more problematic issue: Poor Separation of Concerns.

The problem with imposing restrictions on flexibility is that it becomes difficult to work around if you need to at a later date. If you've already deployed an appliction with limited flexibility trying to add flexibility back in afterwards is considerably more difficult that doing so up front.