Tom Butler's programming blog

Model-View-Confusion part 1: The View gets its own data from the Model

Update 30/11/2012:

I have received several requests for clarification on the code examples at the bottom of this article. Because of this, I have updated the code to be far more demonstrative and complete. Feel free to send more feedback or questions you may have.

Abstract

MVC is a common subject in the PHP community at the moment. There is a large body of articles, tutorials and code examples on the subject, most of which deviate from the traditional approach to MVC and subscribe to an approach closer to MVP. Very few of these follow the MVC standard of allowing the view access to the model, yet MVC is what they claim to be. This article explains why the view should have access to the model and the problems which are created by taking the differing approach commonly used throughout the PHP community.

MVC: Model View Confusion

MVC is a very common architecture in PHP these days. Everyone seems to be writing their own MVC framework, something which I wholly support and encourage. It's a great learning exercise. The problem, however, is the information which is available to PHP developers. Much of it is poorly written, and a great deal is fairly old-fashioned and teaches bad practices. The majority of MVC-related articles go off on tangents about such topics as the merits of DataMapper vs ActiveRecord, template systems, directory structures, how to do form processing and validation, and other irrelevant implementation details. The biggest problem, however, is that despite the inclusion of all this unrelated information they still almost all get the basics wrong, from the model (Brady, 2008) to the view. In this article I'd like to discuss the view specifically and its interaction with the model in MVC.

In MVC the view should access the model directly. This is something which I have to argue a lot and it's one of the most overlooked facts about the architecture. Among the PHP community there is a lot of erroneous information and general confusion surrounding MVC.

For example, this article on SitePoint states:

It is important to note that in order to correctly apply the MVC architecture, there must be no interaction between models and views: all the logic is handled by controllers.

This is wrong. Nowhere does MVC specify this. In fact, it states the exact opposite. This article is a good example because it epitomises the attitude of most PHP developers and what's taught as "MVC" throughout the PHP community.

Let's take a step back and look at some explanations of MVC from outside the PHP world.

As a starting point, the Wikipedia article on MVC states:

A view queries the model in order to generate an appropriate user interface (for example, the view lists the shopping cart's contents). The view gets its own data from the model. The controller may (in some implementations) issue a general instruction to the view to render itself. In others, the view is automatically notified by the model of changes in state (Observer) which require a screen update.

Of course Wikipedia is not a credible source, so here are some more academic explanations:

The views know of the model and will interact with the model.
So the view clearly must be informed of each change to the text so that it can update its display. Yet the model (which we will assume is an instance of String) need not take responsibility for communicating the changes to the view because these changes occur only by requests from the user. The controller can assume responsibility for notifying the view of any changes because it interprets the user's requests. It could simply notify the view that something has changed -- the view could then request the current state of the string from its model
The standard interaction cycle in the Model-View-Controller metaphor, then, is that the user takes some input action and the active controller notifies the model to change itself accordingly. The model carries out the prescribed operations, possibly changing its state, and broadcasts to its dependents (views and controllers) that it has changed, possibly telling them the nature of the change. Views can then inquire of the model about its new state, and update their display if necessary.
The view renders the contents of a model. It accesses enterprise data through the model and specifies how that data should be presented.

Nowhere does MVC specify that the controller queries the model and passes the result to the view. In fact, quite the opposite: it states that the view gets its own data from the model. The article is wrong. It is clearly teaching erroneous information about MVC. Now, don't get me wrong, I'm not singling out this article or its author. I don't blame the author. He is far from alone. Simply google "PHP MVC tutorial" and it's the same thing over and over, again and again. I could go on but I'll stop. It's no wonder people get confused!

Most of these articles don't state anything quite as strong as "there must be no interaction between models and views" but they all imply and teach it with their code examples. They all support and encourage an architecture where the controller extracts data from the model and passes it to the view with no reasoning as to why they do it this way.

This kind of confusion is widespread. The more people write it, the more people learn from it, the more people blog about it and the more erroneous information there is. It becomes a de facto standard. This has certainly happened in this case. The sheer number of MVC tutorials and code examples which do it this way speaks for itself. It's not all bad. There are articles out there which get it right, but they are few and far between, far outnumbered by those spreading the same skewed interpretation of MVC, which unfortunately includes most of the popular "MVC" frameworks.

What all these articles fail to do is explain why the controller should do all the work. They all effectively state "this is how MVC works". In fact, it's not. Read through the first few pages of the academic references I have provided from outside the PHP community and you'll see this isn't how MVC works at all. As for the articles I linked to, I can't second-guess their reasoning for doing it this way, though I suspect it boils down to argumentum ad populum. Perhaps there is a reason; perhaps I'm missing something important. The problem is that none of them specify why they do things this way. Where an explanation is given, it's "this is how it's done in MVC", which is neither a valid argument (argumentum ad antiquitatem) nor even true.

Usually at this point I'd examine the validity of their reasoning. However, as there is none, all I can do is point to academic examples outside PHP, explain why the controller should not do all the work and include my own reasoning.

The view is more than a template

The catalyst for this error appears to be the idea that a "view is a template". In this reading the view contains the HTML and perhaps a few processing instructions for looping through arrays, and it gets passed its variables by the controller.

This self-imposed restriction makes putting any logic at all in the MVC "view" seem wrong, harking back to the days of mixing HTML and PHP. Don't get me wrong, I'm not going to argue that mixing HTML and PHP is a good thing. Templates serve a valuable purpose. They just don't constitute a view in MVC.

Of course, when the model is incorrect or even non-existent (Brady, 2008) and the controller is doing the model's job it presents yet another hurdle. When the controller is doing the job of a model anyway, as in most PHP "MVC" frameworks, there is no model for the view to access! This has to be done by the controller.

Because of these fundamental problems, the only solution is to make the controller act as a mediator, generally by querying the model and passing the result to the view.

Example 1

PHP Code:

$user = $this->model->user->findById($id);
$this->template->assign('firstname', $user['firstName']);
$this->template->assign('surname', $user['surname']);
$this->template->assign('address1', $user['address1']);

or pass a simple key-value array which has been retrieved from the model (which is just a data access layer) to the template:

Example 2

PHP Code:

$user = $this->model->user->findById($id);
$this->template->assign('user', $user);

While the latter is certainly an improvement, it merely makes the problem slightly less noticeable: binding logic.

This puts extra work into the controller: Binding variables to the template. Controllers are not designed to be reused (Brady, 2009; Dalling, 2009). By putting this logic here the reusability has instantly been reduced. There are bigger issues than that though.

Binding logic is problematic

By including this binding logic, a large number of potential issues have been created in terms of reusability and maintainability. Views are intended to be reused. In the examples above I'm showing some basic user information. The template here may well (and, because the template is separated, can) be reused. This is a good thing and certainly something to strive for. However, when the binding logic is introduced, much of this reusability is lost.

The first problem, then, is repeated code. The binding logic needs to be written in each and every controller which uses the view. This will do nothing but slow down development.

There is a potential for inconsistent naming. For example, this is possible:

PHP Code:

$this->template->assign('lastname', $user['surname']);

This is something easily dismissed as "I wouldn't do that". However, when working with a team, it's easy for things like this to slip in unnoticed and cause bugs which are difficult to track down.

What if there are two users being represented in the view? Both of them have surnames and first names. The view now needs to reference "firstname1" and "firstname2". How can this be kept consistent across controllers and views? It can't. This all causes confusion for anyone else looking at the code. Isn't it better to prevent this in the first place?

In a similar vein, it's possible for the controllers to be inconsistent. Because this logic even exists, it removes any contract between the view and the model. This means 'address1' in the view doesn't really represent anything. It can be anything. There's nothing to stop the controller sending the wrong data.

For example, in one job I worked on a site with a similar setup. The requirements changed along the lines of "can you remove the address from this page". Rather than modify the view, one of my colleagues decided it would be a good idea to change the controller to call $template->assign('address', null). Of course, when the view was reused elsewhere by another developer, he had to spend time working out why the address was showing on the new page and not on the old one. The controller now inadvertently contained display logic: it was telling the view not to show the address. Again, prevention here is better than cure.

The last two are somewhat trivial and should be manageable with proper team training but they're the kind of things that do happen and create bugs which are difficult to track down. Even ignoring those...

...consider a situation where the view (which in this case is just a template) is used in a dozen places and the user's phone number has to be added. In Example 1 this presents a substantial problem. The developer now needs to locate and edit a dozen controllers which are using the template so that they also supply the phone number. If the view was fetching its own data from its model then this wouldn't be a problem: controllers wouldn't need to be opened or modified, only the view and the model. In Example 2 there is no problem in this case. But what if it was the user's latest invoice number instead of the phone number? Now each controller still needs to be modified to supply an invoice as well as a user. Example 2 shares the problem of binding. It is just less apparent.
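To make the phone number case concrete, here is a minimal sketch of a view that fetches its own data. The names UserModel, UserView, UserController, selectUser() and getUser() are hypothetical, and an in-memory array stands in for whatever data source a real model would use. The point is only that showing the phone number is a change to the view, with no controller being touched.

```php
<?php
// Hypothetical model: an in-memory array stands in for a real data source.
class UserModel {
    private $users;
    private $selectedId;

    public function __construct(array $users) {
        $this->users = $users;
    }

    // The controller only sets state on the model.
    public function selectUser(int $id): void {
        $this->selectedId = $id;
    }

    public function getUser(): array {
        return $this->users[$this->selectedId];
    }
}

// Hypothetical view: it pulls what it needs from the model.
class UserView {
    private $model;

    public function __construct(UserModel $model) {
        $this->model = $model;
    }

    public function output(): string {
        $user = $this->model->getUser();
        // Showing the phone number is a change here only;
        // no controller has to bind an extra variable.
        return sprintf(
            '%s %s, %s, tel. %s',
            $user['firstName'], $user['surname'], $user['address1'], $user['phone']
        );
    }
}

// Every controller that shows a user does the same single thing.
class UserController {
    private $model;

    public function __construct(UserModel $model) {
        $this->model = $model;
    }

    public function show(int $id): void {
        $this->model->selectUser($id);
    }
}

$model = new UserModel([
    7 => [
        'firstName' => 'Ada',
        'surname'   => 'Lovelace',
        'address1'  => '12 Example Street',
        'phone'     => '555-0100',
    ],
]);
$controller = new UserController($model);
$view = new UserView($model);

$controller->show(7);
echo $view->output(), "\n";
```

However many controllers reuse UserView, none of them mention firstName, address1 or phone, so none of them need editing when the view starts displaying another field.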

The most obvious solution is that, in Example 2, the $user variable should be an object which has a ->getLatestOrder() method. Ignoring the issue regarding separation of concerns, this creates logic in the "User" class which is used solely for the purposes of one specific view. What if there are 100 views related to the user and they all need different information? Should the user class be polluted with functions that are only really used for display purposes? At best, it causes the user class to become bloated. Worse, the view has now lost any real reusability: it needs to be passed an object which has the method "getLatestOrder()", meaning it is tied to a very specific model. This conflict of interest will be covered in detail in the next article.

Performance should be the final consideration. The binding logic creates more code. Although code cleanliness/reusability/maintainability shouldn't be sacrificed for performance, in this case worse performance is a result of worse code.

When the view really needs the model

The above examples are trivial. As with most trivial examples, the bigger problems don't become apparent until the scale and complexity are increased. A better example is pagination.

A pagination controller may look like this:

PHP Code:

$perPage = 10;
$page = $_GET['page'];
// Arguments are the limit, then the offset.
$users = $this->model->user->find($perPage, $perPage * $page);
// Round up so that a partial last page is still counted.
$totalPages = ceil($this->model->user->count() / $perPage);
$this->view->assign('users', $users);
$this->view->assign('totalPages', $totalPages);
$this->view->assign('page', $page);

Here is the basic logic behind pagination, placed in the controller. The big problem here is that none of this code is reusable, even though it will be the same for any paginated data. The only things that will change are the current page and the number of records to show per page.

Just because the controller is calling a function on the model doesn't mean the display and business logic have been moved out of the controller.

How else could this be done? A better solution would be to set the state in the model and have the view read it. This way the view is agnostic about the state of the model (which page it's on) and all the pagination logic is contained (and reusable) within the view.

PHP Code:

$this->model->setPage($_GET['page']);

This should be the entirety of the controller code.

This way, all the pagination logic: working out the total number of pages, working out the limit/offset (Details which are only relevant to how the data is going to be displayed when it's paginated) can be moved to the view. The view will call $user->find() and $user->count() with the relevant parameters. The controller is doing less and the view is doing more. This inevitably leads to higher reusability.
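As a rough sketch of what that might look like, the view below works out the offset and the total number of pages itself, while the model only holds the current page and answers find() and count(). The class names PagedUserModel and PaginatedListView, the getPage() method, zero-based page numbers, and the array standing in for a database are all assumptions made for illustration.

```php
<?php
// Hypothetical model: holds the page state; an array stands in for the database.
class PagedUserModel {
    private $rows;
    private $page = 0;

    public function __construct(array $rows) {
        $this->rows = $rows;
    }

    public function setPage(int $page): void {
        $this->page = max(0, $page);
    }

    public function getPage(): int {
        return $this->page;
    }

    public function find(int $limit, int $offset): array {
        return array_slice($this->rows, $offset, $limit);
    }

    public function count(): int {
        return count($this->rows);
    }
}

// Hypothetical view: owns the page size, offset and total-page calculations.
class PaginatedListView {
    private $model;
    private $perPage;

    public function __construct(PagedUserModel $model, int $perPage = 10) {
        $this->model = $model;
        $this->perPage = $perPage;
    }

    public function getTotalPages(): int {
        // Round up so that a partial last page is counted.
        return (int) ceil($this->model->count() / $this->perPage);
    }

    public function output(): string {
        $page = $this->model->getPage();
        $rows = $this->model->find($this->perPage, $page * $this->perPage);
        return implode("\n", $rows)
            . sprintf("\nPage %d of %d", $page + 1, $this->getTotalPages());
    }
}

// 25 rows: user1 ... user25
$rows = array_map(function ($i) {
    return 'user' . $i;
}, range(1, 25));

$model = new PagedUserModel($rows);
$view = new PaginatedListView($model);

// The whole of the controller's job; in a web request the value
// would come from $_GET['page'].
$model->setPage(2);

echo $view->output(), "\n";
```

With 25 rows and 10 per page, page index 2 is the third and last page, so the view lists user21 to user25 followed by "Page 3 of 3".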

It could be argued that the view should contain its own state: Which page it's on. However, the reason that the model holds the state and not the view is that the model can be in use by multiple views. It may be useful for multiple views to use the same data. For example, having a second view which used the paginated data and created some aggregated statistics about the data displayed on the current page. This would need to be in sync with the first view and always use the same data set. By storing the state in the model rather than the view, the page can be changed and both views will reflect the changes. This reduces repeated code and adds flexibility while improving separation of concerns.
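The following sketch illustrates the two-view case under some assumptions: PagedModel, NameListView and PageStatsView are hypothetical names, the rows are a small in-memory array, and both views are simply constructed with the same page size. Because the current page lives in the model, one call to setPage() moves both views together.

```php
<?php
// Hypothetical model: the current page is state held by the model.
class PagedModel {
    private $rows;
    private $page = 0;

    public function __construct(array $rows) {
        $this->rows = $rows;
    }

    public function setPage(int $page): void {
        $this->page = $page;
    }

    public function getPage(): int {
        return $this->page;
    }

    public function find(int $limit, int $offset): array {
        return array_slice($this->rows, $offset, $limit);
    }
}

// First view: lists the names on the current page.
class NameListView {
    private $model;
    private $perPage;

    public function __construct(PagedModel $model, int $perPage) {
        $this->model = $model;
        $this->perPage = $perPage;
    }

    public function output(): string {
        $rows = $this->model->find($this->perPage, $this->model->getPage() * $this->perPage);
        return implode(', ', array_column($rows, 'name'));
    }
}

// Second view: aggregates the same page of data.
class PageStatsView {
    private $model;
    private $perPage;

    public function __construct(PagedModel $model, int $perPage) {
        $this->model = $model;
        $this->perPage = $perPage;
    }

    public function output(): string {
        $rows = $this->model->find($this->perPage, $this->model->getPage() * $this->perPage);
        return 'Total spent on this page: ' . array_sum(array_column($rows, 'spent'));
    }
}

$model = new PagedModel([
    ['name' => 'Ann', 'spent' => 10],
    ['name' => 'Bob', 'spent' => 20],
    ['name' => 'Cy',  'spent' => 5],
    ['name' => 'Di',  'spent' => 7],
]);
$listView  = new NameListView($model, 2);
$statsView = new PageStatsView($model, 2);

// One state change in the model; both views follow it.
$model->setPage(1);
echo $listView->output(), "\n";
echo $statsView->output(), "\n";
```

If each view kept its own page number instead, the controller would have to update both and keep them in step, which is exactly the kind of binding logic being argued against.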

The View always has a contract with a model. Whether it's a primitive array or a complex object, the view requires specific information. This is naturally unavoidable. By using the Model directly, this contract is immediately fulfilled. If primitives are passed to the view, it has no way of knowing whether this contract is fulfilled until it breaks. Even if the result is a set of objects, the view cannot know whether the set is correct (because it's likely a primitive array containing objects) until it tries to use an object. By using the Model, the view can check this contract is fulfilled as the model is assigned to it, preferably in its constructor.

Even ignoring this, and assuming your controller always passes a valid data set, you still lose reusability. The biggest issue with the "controller as a mediator" approach is that all the data fetching logic (the calls to the model) has to be done in every single controller using the view, causing repeated code and therefore reduced reusability. In the pagination example, all the calculations need to be repeated in each controller that uses the view. As an application developer, each time you want to use pagination, would you rather pass a single variable to the model which sets the page and does everything for you, or fetch the records, work out the total number of pages, then pass the result to the view?

And pagination is a trivial example. In the real world, that controller logic is liable to change. If it's repeated, it needs to be updated in every controller. By creating the reusability and separation of concerns suggested by MVC, the code doesn't need to be repeated and changes to the model will automatically be reflected in any view without changes to the code. This is a result of reducing the binding logic and therefore increasing reusability.
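One way to check the contract at construction time is a type declaration on the view's constructor. The sketch below assumes a hypothetical UserSource interface, an array-backed implementation and a view called UserTableView; none of these names come from the article. PHP 7 and later throw a TypeError when the argument does not satisfy the declared type, so the failure happens when the view is created rather than part-way through rendering.

```php
<?php
// Hypothetical contract: what the view requires of its model.
interface UserSource {
    public function getUsers(): array;
}

// Hypothetical model backed by an array.
class ArrayUserSource implements UserSource {
    private $users;

    public function __construct(array $users) {
        $this->users = $users;
    }

    public function getUsers(): array {
        return $this->users;
    }
}

class UserTableView {
    private $model;

    // The type declaration is the contract check: anything that is not
    // a UserSource is refused when the view is constructed.
    public function __construct(UserSource $model) {
        $this->model = $model;
    }

    public function output(): string {
        return implode("\n", array_column($this->model->getUsers(), 'name'));
    }
}

$view = new UserTableView(new ArrayUserSource([
    ['name' => 'Ada'],
    ['name' => 'Grace'],
]));
echo $view->output(), "\n";

// Passing something that does not fulfil the contract fails immediately.
$rejected = false;
try {
    new UserTableView(new stdClass());
} catch (TypeError $e) {
    $rejected = true;
}
```

A template that is handed a plain array has no equivalent check; a missing or misnamed key only shows up when the template tries to print it.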

The idea here (and the one suggested by MVC in general) is putting all the business logic in the model and all the display logic in the view. All the controller should be doing is getting these talking. The less the controller is doing the better. Less controller code means more reusable code.

Better reusability

With all this theorising, it would certainly be worthwhile to include a basic real world example of why views are more than templates. I believe templates are worthwhile but they do not constitute a "View". Much in the same way the model layer is more than data access, the view is more than a template.

To demonstrate, here's a simple scenario: Show a list of users which can be filtered by searching for their name.

Some example controller code. This will now be as minimal as possible.

PHP Code:

class UserListController {
    private $model;

    public function __construct(UserListModel $model) {
        $this->model = $model;
    }

    public function search($criteria) {
        $this->model->filterByName($criteria);
    }
}

Clean, empty controller code. Absent of both the view's display logic and the model's domain logic. One thing to note here is that the controller does not require access to its view, only the model. This is because in MVC, the only reason a controller needs access to the view is for the observer pattern: to inform the view to refresh from the model. In PHP this is a non-issue because the view must have been refreshed anyway.

A simplistic model would look something like this:

PHP Code:

class UserListModel {
    private $searchName;
    private $db;

    public function __construct(Db $db) {
        $this->db = $db;
    }

    public function filterByName($name) {
        $this->searchName = $name;
    }

    public function getUsers() {
        // Find the selected users.
        return $this->db->query('SELECT * FROM users WHERE name LIKE :name', array(':name' => $this->searchName));
    }
}

The important part to note here is that the model has a state, in this example, the name which has been searched for. In most pseudo-MVC frameworks the model has no state, the state is retained in the controller. This seemingly trivial fact is actually the core of the problem: the view needs to know about this state. This is why the erroneous "controller as a mediator" architecture has developed.

The view then fetches the data from its model. In this example the usage of a basic template system has been assumed:

PHP Code:

class UserListView {
    private $model;
    private $template;

    public function __construct(UserListModel $model, Template $template) {
        $this->model = $model;
        $this->template = $template;
    }

    public function output() {
        // The view fetches its own data from the model.
        $this->template->assign('users', $this->model->getUsers());
        return $this->template->output();
    }
}

The view has become reusable anywhere a list of users is required. Substitute the model, which can supply any dataset from any source; it doesn't have to be a database. As long as the model supplies a list of users, the controller is agnostic about where they come from and needs no knowledge of them at all. This promotes robust code by making each component (the model, view and controller) entirely interchangeable. This is possible because it enforces the strict separation of concerns that MVC strives for. The controller has no knowledge of how the view works. From an OOP perspective this is ideal as there is complete encapsulation. The view can be substituted and it will never break the controller; the controller can be substituted and it will never break the view.
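To see the three classes working together, here is a self-contained sketch of how they might be wired up. The controller and view are condensed copies of the ones above. The array-backed UserListModel and the tiny Template class are hypothetical stand-ins (the article's model queries a database and assumes an existing template system), and the case-insensitive substring match is an assumption standing in for the SQL LIKE.

```php
<?php
// Hypothetical stand-in for the database-backed UserListModel.
class UserListModel {
    private $users;
    private $searchName;

    public function __construct(array $users) {
        $this->users = $users;
    }

    public function filterByName($name) {
        $this->searchName = $name;
    }

    public function getUsers() {
        if ($this->searchName === null || $this->searchName === '') {
            return $this->users;
        }
        // Case-insensitive substring match, standing in for SQL LIKE.
        $name = $this->searchName;
        return array_values(array_filter($this->users, function ($user) use ($name) {
            return stripos($user['name'], $name) !== false;
        }));
    }
}

// Hypothetical minimal template; a real template engine would replace this.
class Template {
    private $vars = [];

    public function assign($name, $value) {
        $this->vars[$name] = $value;
    }

    public function output() {
        $items = '';
        foreach ($this->vars['users'] as $user) {
            $items .= '<li>' . htmlspecialchars($user['name']) . "</li>\n";
        }
        return "<ul>\n" . $items . '</ul>';
    }
}

class UserListController {
    private $model;

    public function __construct(UserListModel $model) {
        $this->model = $model;
    }

    public function search($criteria) {
        $this->model->filterByName($criteria);
    }
}

class UserListView {
    private $model;
    private $template;

    public function __construct(UserListModel $model, Template $template) {
        $this->model = $model;
        $this->template = $template;
    }

    public function output() {
        $this->template->assign('users', $this->model->getUsers());
        return $this->template->output();
    }
}

// Wiring: the controller and the view share one model instance.
$model = new UserListModel([
    ['name' => 'Alice'],
    ['name' => 'Bob'],
    ['name' => 'Alicia'],
]);
$controller = new UserListController($model);
$view = new UserListView($model, new Template());

$controller->search('ali'); // in a web request this would come from the query string
$html = $view->output();
echo $html, "\n";
```

The controller never sees the list of users and the view never sees the search term; the shared model is the only thing connecting them.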

However, the goal here is reusability. Is the logic behind listing a set of users any different to listing a set of blogs, albums, products, etc.? No, so why can't that logic be reused? A few simple amends and it can:

Firstly, some interfaces are defined in order to enforce an API:

PHP Code:

interface Listable {
    public function getData();
}

interface Searchable {
    public function setCriteria($criteria);
}

It's now possible to create several models which use these interfaces:

PHP Code:

class UserModel implements Listable, Searchable {
    private $searchName;
    private $db;

    public function __construct(Db $db) {
        $this->db = $db;
    }

    public function setCriteria($criteria) {
        $this->searchName = $criteria;
    }

    public function getData() {
        // Find the selected users.
        return $this->db->query('SELECT * FROM users WHERE name LIKE :name', array(':name' => $this->searchName));
    }
}

class BlogModel implements Listable, Searchable {
    private $searchTitle;
    private $db;

    public function __construct(Db $db) {
        $this->db = $db;
    }

    public function setCriteria($criteria) {
        $this->searchTitle = $criteria;
    }

    public function getData() {
        // Find the selected blogs.
        return $this->db->query('SELECT * FROM blogs WHERE title LIKE :title', array(':title' => $this->searchTitle));
    }
}

The controller code looks like this:

PHP Code:

class SearchController {
    private $model;

    public function __construct(Searchable $model) {
        $this->model = $model;
    }

    public function search($criteria) {
        $this->model->setCriteria($criteria);
    }
}

Notice that it's not tied to any specific model. Because of the very strict separation of concerns which is achieved by the view getting its own data from the model rather than being fed it by the controller, the controller code is reusable. And so is the view:

PHP Code:

class ListView {
    private $model;
    private $template;

    public function __construct(Listable $model, Template $template) {
        $this->model = $model;
        $this->template = $template;
    }

    public function output() {
        // Works with any model that implements Listable.
        $this->template->assign('data', $this->model->getData());
        return $this->template->output();
    }
}

Again, it's not tied to a specific model. The view does not care where its data comes from or what the data is. But there's more. Any of the three components can be substituted and reused. The controller can be changed where needed. The view can be substituted for one which doesn't use a template engine while reusing the controller and model. Another controller can be added to the system which uses $_POST and it can use the exact same model and view.
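As an illustration of that kind of substitution, the sketch below reuses the Listable and Searchable interfaces with three hypothetical pieces that are not part of the article: an array-backed ArrayTitleModel, a JsonListView that needs no template engine, and a PostSearchController that reads its criteria from POST-style data. The POST data is passed in as an array rather than read from $_POST directly so that the example can run from the command line.

```php
<?php
interface Listable {
    public function getData();
}

interface Searchable {
    public function setCriteria($criteria);
}

// Hypothetical model backed by an array instead of a database.
class ArrayTitleModel implements Listable, Searchable {
    private $titles;
    private $criteria;

    public function __construct(array $titles) {
        $this->titles = $titles;
    }

    public function setCriteria($criteria) {
        $this->criteria = $criteria;
    }

    public function getData() {
        if ($this->criteria === null || $this->criteria === '') {
            return $this->titles;
        }
        // Case-insensitive substring match (an assumption for this sketch).
        return array_values(array_filter($this->titles, function ($title) {
            return stripos($title, (string) $this->criteria) !== false;
        }));
    }
}

// Hypothetical view with no template engine: it renders JSON.
class JsonListView {
    private $model;

    public function __construct(Listable $model) {
        $this->model = $model;
    }

    public function output(): string {
        return json_encode($this->model->getData());
    }
}

// Hypothetical controller driven by POST data.
class PostSearchController {
    private $model;

    public function __construct(Searchable $model) {
        $this->model = $model;
    }

    public function search(array $post): void {
        $this->model->setCriteria($post['criteria'] ?? '');
    }
}

$model = new ArrayTitleModel(['MVC in PHP', 'Templates', 'MVC views']);
$controller = new PostSearchController($model);
$view = new JsonListView($model);

$controller->search(['criteria' => 'mvc']); // would be $_POST in a web request
echo $view->output(), "\n";
```

Each piece depends only on an interface, so the JSON view would work equally well with the database-backed models above, and the POST controller could drive a template-based view without either needing to change.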

By enforcing the separation of concerns that MVC dictates, the real power of it shines through.

Conclusion

My main goal here is to show that just because an idea is widespread does not mean it is correct or best practice. Not everything you read is entirely accurate. Without valid reasoning, a programming practice, regardless of its popularity, is not necessarily a good practice. Take singletons as a perfect example. Five years ago they were widespread and the fad of the month. These days they are shunned and taught as an anti-pattern. Their inherent problems did not stop them being used. Valid reasoning is something many articles seem to miss. They tell you how but not why. Don't take statements such as the one I quoted in the introduction at face value without examining the reasons behind them. This is important: if someone doesn't include the reasons why they are doing something a specific way, including its benefits over other methods, question it.

Which brings me neatly to part 2. Did you question what I wrote here? There are obvious flaws in the solution I presented in the section "Better reusability". In part 2 I'll turn this on its head and show why passing the model to the view is better but in fact is not a particularly good idea either, and I'll show how the problem can be refactored away.


References

Brady, P (2009) Zend Framework: Surviving the deep end (http://www.survivethedeepend.com/zendframeworkbook/en/1.0/the.model#zfbook.the.model.the.fat.stupid.ugly.controller)

Brady, P (2008) The M in MVC: Why Models are Misunderstood and Unappreciated (http://blog.astrumfutura.com/2008/12/the-m-in-mvc-why-models-are-misunderstood-and-unappreciated/)

Burbeck, S (1992) Applications Programming in Smalltalk-80(TM): How to use Model-View-Controller (MVC) (http://st-www.cs.illinois.edu/users/smarch/st-docs/mvc.html)

Dalling, T (2009) Model-View-Controller explained (http://www.tomdalling.com/software-design/model-view-controller-explained)

Deacon, J (2000) Model-View-Controller (MVC) Architecture (http://www.jdl.co.uk/briefings/MVC.pdf)

Krasner & Pope (1988) A Description of the Model-View-Controller User Interface Paradigm in the Smalltalk-80 System (http://www.itu.dk/courses/VOP/E2005/VOP2005E/8_mvc_krasner_and_pope.pdf)

Sun Microsystems (2000) Java Blueprints Model View Controller (http://www.oracle.com/technetwork/java/mvc-detailed-136062.html)