Static Methods/Variables are bad practice

12 August 2012

In short: Yes. There are many disadvantages and static methods should almost never be used.

Static methods allow procedural/functional code to be shoe-horned into an Object Oriented world. Using static methods and variables breaks a lot of the power available to Object-Oriented code.

Why are they used?

Before any analysis of the benefits or problems of static methods, the question must be asked: "Why are static methods used?"

It turns out this is a difficult question to answer. For programmers there is plenty of information on how static methods can be used but little information on why.

Technically, static variables allow state to be shared across all instances of a class. The problem is that this is intrinsically not OOP because it disregards encapsulation. If a variable can be altered by any instance of a class then the fundamental principle behind encapsulation/information hiding is lost entirely: an object is no longer in complete control of its state. Its state now relies on variables which are essentially global, and global state is widely regarded as bad practice. Even private static variables maintain state at a global level and simply limit access to it. Any instance can alter the static variable, so state changes can happen without the knowledge of an object which relies on that state, and that object may not work correctly when this happens. Much as it's often said that "inheritance breaks encapsulation", statics do this in a far more severe way: not just by exposing internal implementation but also by exposing internal state.
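As a minimal sketch of the point above (the Counter class is hypothetical and exists only for illustration), here is how one instance can change what another instance reports without ever touching it:

```php
<?php
// Hypothetical class: every instance shares one static counter
class Counter {
    private static $total = 0;

    public function increment() {
        self::$total++;
    }

    public function getTotal() {
        return self::$total;
    }
}

$a = new Counter;
$b = new Counter;

$a->increment();
$a->increment();

// $b was never modified, yet its observable state has changed
echo $b->getTotal(); // 2
```

Even though the variable is private, neither object is in control of the value it reports.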

Static methods may be stateless. One of the thought processes behind using static methods is that if they don't use state and only work on local variables supplied as arguments, then they should be static. However, this raises the question: where do I put the function? If, indeed, the function is totally independent and works without state, then adding it to a class which is also designed to be constructed immediately violates the Single Responsibility Principle.

Static methods/variables are often used because they put people from a procedural background back into their comfort zone under the guise of being "object oriented" because the functions are wrapped in a class. While static methods do offer some advantages over procedural programming techniques, they don't offer all the power that Object-Oriented programming does.

db::query() is functionally identical to db_query(). A global function. Which is all static methods are, namespaced global functions. The problem with this is that all of the power that OOP provides is sacrificed with their use. Static methods make it impossible for your code to use powerful and vital OOP features: encapsulation, polymorphism and inheritance.

For instance, consider the following example:

class DB {
    public static $db;

    public static function connect($server, $database, $user, $password) {
        self::$db = new PDO('mysql:dbname=' . $database .';host=' . $server, $user, $password);
    }

    public static function query($query) {
        return self::$db->query($query);
    }

    public static function close() {
        // PDO has no close() method; dropping the reference closes the connection
        self::$db = null;
    }
}


class Blog {
    public $id;

    public function __construct($id) {
        $this->id = $id;
    }

    public function getData() {
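        // Use $this->id (a bare $id is undefined here) and fetch an object so ->title works
        return DB::query('SELECT * FROM blog WHERE id = ' . (int) $this->id)->fetch(PDO::FETCH_OBJ);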
    }
}

DB::connect('localhost', 'test', 'foo', 'bar');
$blog = new Blog(123);
echo $blog->getData()->title;

There are several immediate issues here:

Hidden Dependencies

The class Blog requires the class DB to be present in the system for it to work. Someone looking at the code can't immediately see that the DB class is required to be present and defined for the functionality of the Blog class.

This makes testing the class in isolation very difficult. The first test would need to exist solely to locate the dependencies which, at best is extra work but at worst almost impossible due to potential code branching. Perhaps a static method may be called on an error condition that only happens under certain conditions. It's not immediately obvious what dependencies the Blog class has.

What if the getData() method was rarely called? It is difficult to keep track of which classes are dependent on others. If the DB class is not present, an instance of Blog will construct properly and look like it will work, but will break as soon as getData() is called. This causes ambiguity: there is no way to know whether an object can fulfil its responsibility even after it's been constructed. Once an object has been constructed it should fulfil a contract with the code that created it: it's fully ready to be used, doesn't need anything else in order to function and will work as documented.

If a constructed class can guarantee to fulfil its responsibilities without ambiguity it makes the developer's life a lot easier. Testing code becomes less of a game of whack-a-mole and more structured and easier to accomplish. With arbitrary static method calls scattered throughout the class, however, the object has an unknown state. It may work successfully, it may not, depending on the presence of any static dependencies it has. It is tied to the global state: the state of the global (static) variables used by the static methods it calls.

The obvious downside to this is portability. The class cannot be used on its own, without the static functions being present. This makes reusability far more difficult and isolating the class near impossible.
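The hidden dependency problem can be sketched as follows. The Article class and the deliberately undefined MissingDb class are assumptions made for illustration; the point is only that construction succeeds while the failure is deferred until the static call is reached (in PHP 7 and later, referencing an undefined class throws an Error):

```php
<?php
// MissingDb is deliberately never defined: it stands in for a static
// dependency that was not shipped along with the class
class Article {
    private $id;

    public function __construct($id) {
        $this->id = $id;
    }

    public function getData() {
        return MissingDb::query('SELECT * FROM article WHERE id = ' . (int) $this->id);
    }
}

// Constructs without complaint: nothing reveals the missing dependency
$article = new Article(123);

try {
    $article->getData();
    $failed = false;
} catch (Error $e) {
    // The problem only surfaces when the static call is actually executed
    $failed = true;
    echo 'Broken at call time: ' . $e->getMessage();
}
```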

Unknown global state

In the example above, there is a global state. The DB::$db variable can be altered from anywhere and there is no guarantee it has been set. Blog::getData() makes the assumption that the database has been connected. There's no way to know whether it has been or not, resulting in code that will break when it shouldn't.

One common way around this is the singleton pattern. Using a singleton, it's impossible to query the database before it's initialised. While this does fix that problem, it also introduces another dependency in the singleton: it has to get the configuration options from somewhere. And because the singleton is global, this also has to be somewhere global. This tightly couples the DB class to where it gets its configuration from, e.g. a static config class. Now the Blog class has a dependency on both DB and the configuration that DB needs.

For reusability alone, it's far better to push configuration options into a constructor than to have the constructor pull them from elsewhere. The pull method adds dependencies which don't need to exist, meaning your code is reliant on external code just to run. This reduces both portability and reusability. Beyond that, it means only a single database configuration can exist for the entire system. While that may be enough in some cases, like the simple database class, it still adds a limitation. Imagine you had a class that created an FTP connection and provided a simple API for creating files. Just calling FTP::upload($file); would be quite clean, wouldn't it? The problem is, once the requirements change to "can the file now be uploaded to the backup server as well?" you have a problem: there's no way to connect to two different servers at once.
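A rough sketch of the push approach for the FTP scenario might look like this. The Uploader class and the host names are hypothetical, and the class only records what it would send rather than opening a real FTP connection, so the sketch runs anywhere:

```php
<?php
// Hypothetical uploader: records uploads instead of talking to a real server
class Uploader {
    private $host;
    private $sent = [];

    // Configuration is pushed in; the class never reaches out to find it
    public function __construct($host) {
        $this->host = $host;
    }

    public function upload($file) {
        $this->sent[] = $this->host . ':' . $file;
    }

    public function getSent() {
        return $this->sent;
    }
}

$live = new Uploader('ftp.example.com');
$backup = new Uploader('backup.example.com');

// The new "backup server" requirement is just a second instance
foreach ([$live, $backup] as $server) {
    $server->upload('report.csv');
}
```

Because each instance carries its own configuration, nothing in the class needs to change when a second server appears.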

No polymorphism

In the example above, because DB exists as a global entity, only one database connection can exist at any one time, with a shared state. Every object uses the same connection. Now it is unlikely you will want more than one database class, but it's not impossible. Imagine you have found a piece of code where there's a sporadic error, so you want to start logging every query for a specific section of your code. If you're calling DB::query() all over the place this is impossible. You could alter DB::query() to log every single query, but not a small subset. Of course there are ways around this using a global $log boolean value, but that has its own problems. One obvious one: somewhere in the object tree, a developer who was debugging a single query left in a DB::$log = false; which another developer won't know about, and they will find their log files empty or missing queries which have been executed. This is a trivial example, but by using static methods you completely forfeit the benefits of polymorphism, one of OOP's most powerful features.

The solution

class DB {
    private $db;

    public function __construct($server, $database, $user, $password) {
        $this->db = new PDO('mysql:dbname=' . $database .';host=' . $server, $user, $password);
    }

    public function query($query) {
        return $this->db->query($query);
    }

    public function close() {
        // PDO has no close() method; dropping the reference closes the connection
        $this->db = null;
    }
}


class Blog {
    protected $id;
    protected $db;

    public function __construct($id, DB $db) {
        $this->id = $id;
        $this->db = $db;
    }

    public function getData() {
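        // Use $this->id (a bare $id is undefined here) and fetch an object so ->title works
        return $this->db->query('SELECT * FROM blog WHERE id = ' . (int) $this->id)->fetch(PDO::FETCH_OBJ);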
    }
}

$db = new DB('localhost', 'test', 'foo', 'bar');
$blog = new Blog(123, $db);
echo $blog->getData()->title;
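Once the database is passed in, the logging scenario described earlier becomes straightforward. The following is only a sketch under some assumptions: the QueryRunner interface, the FakeDB stand-in (used so the example runs without a database) and the LoggingDB wrapper are all hypothetical names, and Blog is simplified to return whatever the query returns:

```php
<?php
// Hypothetical interface so a real DB and its stand-ins are interchangeable
interface QueryRunner {
    public function query($query);
}

// Stand-in for a PDO-backed DB class so the sketch runs without a database
class FakeDB implements QueryRunner {
    public function query($query) {
        return 'result of: ' . $query;
    }
}

// Wraps any QueryRunner and records the queries that pass through it
class LoggingDB implements QueryRunner {
    private $db;
    private $log = [];

    public function __construct(QueryRunner $db) {
        $this->db = $db;
    }

    public function query($query) {
        $this->log[] = $query;
        return $this->db->query($query);
    }

    public function getLog() {
        return $this->log;
    }
}

class Blog {
    private $id;
    private $db;

    public function __construct($id, QueryRunner $db) {
        $this->id = $id;
        $this->db = $db;
    }

    public function getData() {
        return $this->db->query('SELECT * FROM blog WHERE id = ' . (int) $this->id);
    }
}

$db = new FakeDB;
$logged = new LoggingDB($db);

$quiet = new Blog(1, $db);       // queries are not logged
$suspect = new Blog(2, $logged); // only this object's queries are logged

$quiet->getData();
$suspect->getData();
```

Only the object under suspicion receives the logging wrapper; the rest of the system is untouched and no global flag is needed.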

So why don't people do this? It's simple code and no more work. However, in complex systems the issue arises of passing instances around the whole system. This causes its own problems with dependency resolution and is the main reason people see static methods as the answer. They avoid the problem of locating dependencies altogether. And it is a huge problem, which I'll discuss in another article. The use of static methods to locate dependencies is simple to grasp and makes more sense than passing objects all over the place when you're just starting to learn. It's a very small leap from procedural programming, so it is quick and easy.

Which is another reason: it's easy. It's quick to implement and doesn't require any overall design or forethought. Locating dependencies using singletons/static methods can make hacking together working code a simple task, removing any need to make architectural decisions or plan how the system works. The problem is that although it works, it lacks scalability, and the time you save at the start is lost later on when you need to update the code or use it outside the very narrow environment it needs just to run.

Static methods are a quick fix and one that causes more issues than it solves. Accessing dependencies is one of the biggest hurdles in creating any OOP program and definitely one of the most difficult to solve. This is something I'll discuss in another article, but for the most part it's what static methods are used to work around in what I like to call "PseudOOP".

Static variables

I've seen it argued that static primitive variables don't break anything. A good recent example is someone who was using a variable called Layout::$title to define the <title> tag on their page. On the face of it, this seems acceptable: There is only ever going to be one title. However, once the system grows and expands problems start to quickly arise.

Firstly, whatever sets it (in this instance it was a MVC controller action), now has a dependency on the class which contains the variable. This presents issues if the code is ever moved outside the system or isolated.

Secondly, the code makes an assumption about the structure of the system it's used in: It has exclusive access to the Layout::$title variable. It wouldn't be unreasonable to convert the MVC structure into a hierarchical one which allowed the system to be modularised where multiple triads can exist at once. In this scenario, which triad's title should be used? It creates ambiguity and breaks encapsulation. In this instance, the Layout class should have the responsibility of locating the title, rather than being told what it is elsewhere.

All the problems associated with global state are still there and static variables are just as bad as global variables, if not worse, as they introduce hidden dependencies as well.

Static variables are global variables. They share all the problems that using $GLOBALS does.

Static leaf methods are also bad!

It's often stated that leaf methods (methods that are self-contained and call no other methods) and factory methods can remain static. However, this breaks encapsulation.

According to Wikipedia, encapsulation is "the packing of data and functions into a single component", and static methods break this by forcing the data to be passed into a different component.

Consider the seemingly innocuous abs() function. It takes a number and returns its absolute value. Surely having this as a static method is harmless? I'd never want to mock the method for testing, it's inbuilt in the language and never going to change?

Well, maybe but it still limits your flexibility. The following method looks like it would be reusable and non-problematic:

public function showAbs($num) {
    return $num . "'s absolute value is " . abs($num);
}

The problem? You're assuming that $num will always be a native number. The maximum size for an integer in PHP is 64 bits (on 64-bit platforms). To deal with numbers larger than that you must use a library such as GMP or BCMath.

If you try to pass in something else, such as a GMP instance, then the showAbs() function will break at the abs() call.

Whereas if PHP properly supported OOP on integers, integers and GMP/BCMath instances could each implement an abs() method and the showAbs() method would be usable with any numeric type!

class MyClass {
    public function showAbs($num) {
      return $num . "'s absolute value is " . abs($num);
    }
}

$myobj = new MyClass;

$num = 12445;

//This will work because $num is a standard integer
echo $myobj->showAbs($num);



// gmp_init() returns a GMP object (gmp_strval() would only return a string)
$gmp = gmp_init("124543543565764765");

//This will break because $gmp is a GMP object, not a standard integer
echo $myobj->showAbs($gmp);

This problem is very easily fixed by encapsulation. If integer and gmp were both classes that implemented an interface that had an abs() method this would not cause any problems and the showAbs method would be reusable!

interface Numeric {
    public function abs();
}

class Integer implements Numeric {
   public function abs() {
       //integer implementation of abs
   }
}

class GMP implements Numeric {
   public function abs() {
       //GMP implementation of abs
   }
}

Then we can use a polymorphic $num instance and work with any numeric type!

class MyClass {
    public function showAbs($num) {
      return $num . "'s absolute value is " . $num->abs();
    }
}

$myobj = new MyClass;



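// Hypothetical: assumes integers are objects of the Integer class above
$num = new Integer(12445);

//This will work because $num is an Integer with an ->abs() method
echo $myobj->showAbs($num);



// Hypothetical: the GMP class defined above, not PHP's built-in GMP class
$gmp = new GMP("124543543565764765");

//This will work because $gmp is a GMP instance with an ->abs() method
echo $myobj->showAbs($gmp);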

This now works for any numeric type. I can write a custom numeric type and reuse my showAbs() method. Another developer in ten years' time can do the same. The method is more flexible because the static abs() call has been removed.

Static methods always break encapsulation. By moving the method into the class that contains the data being acted on, the flexibility is greatly enhanced!
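Since PHP's integers are not objects, the idea can only be approximated with wrapper classes. Here is a runnable sketch, assuming two hypothetical wrappers: IntNumber around a native integer, and BigNumber, which keeps its digits in a string so that it needs no extension and is not limited to the native integer size. Neither is a real PHP or GMP API:

```php
<?php
// Hypothetical types: PHP has no built-in Numeric interface
interface Numeric {
    public function abs();
    public function __toString();
}

// Wraps a native integer
class IntNumber implements Numeric {
    private $value;

    public function __construct($value) {
        $this->value = (int) $value;
    }

    public function abs() {
        return new IntNumber(abs($this->value));
    }

    public function __toString() {
        return (string) $this->value;
    }
}

// Keeps the digits as a string, so it is not limited to the native integer size
class BigNumber implements Numeric {
    private $digits;

    public function __construct($digits) {
        $this->digits = (string) $digits;
    }

    public function abs() {
        // The absolute value of a decimal string is the string without its sign
        return new BigNumber(ltrim($this->digits, '-'));
    }

    public function __toString() {
        return $this->digits;
    }
}

class MyClass {
    public function showAbs(Numeric $num) {
        return $num . "'s absolute value is " . $num->abs();
    }
}

$myobj = new MyClass;
echo $myobj->showAbs(new IntNumber(-12445)), "\n";
echo $myobj->showAbs(new BigNumber('-99999999999999999999999999'));
```

showAbs() never needs to know which numeric type it was given; each type supplies its own abs().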

What about class constants?

Class constants use the double colon "static" operator as well. However, they should also only have meaning to the class which is using them.

Constants are a good method of configuring classes or sending options to methods.

For example

$ftp = new FTPConnection('host', 'user', 'password', FTPConnection::PASSIVE);

Here, the constant is used as an identifier and exists solely to add readability to the code. It's passed back into a method (in this case the constructor) of the class which defined it. Because constants are exactly that, constant, they don't have an issue with unknown state. They only have one state and it's always set and available. Any client code calling them can be guaranteed that they are fully initialised and populated any time they are used, unlike variables which do not have this guarantee.

In addition to that, because the constant only has meaning to the class which defined it, version two of the class can completely change the true value of the constant internally without breaking any backwards compatibility. This helps encourage encapsulation.
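As a small sketch of that idea (this FTPConnection class is hypothetical, takes fewer arguments than the call shown earlier and makes no real connection), the constant's value stays an internal detail while client code only ever uses its name:

```php
<?php
// Hypothetical class: no real connection is made
class FTPConnection {
    // The actual values are an internal detail; callers only use the names
    const ACTIVE = 1;
    const PASSIVE = 2;

    private $host;
    private $mode;

    public function __construct($host, $mode = self::ACTIVE) {
        $this->host = $host;
        $this->mode = $mode;
    }

    public function isPassive() {
        return $this->mode === self::PASSIVE;
    }
}

// The constant is passed straight back into the class that defined it
$ftp = new FTPConnection('host', FTPConnection::PASSIVE);
```

If a later version changed PASSIVE to a different value, this client code would carry on working unmodified.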

Constants are fine, and encouraged for readability and encapsulation. Provided they're only ever passed back into methods that belong to the class (or a subclass of) which they are defined in they are beneficial and should be encouraged. However, this would be bad practice:

echo 'You have ' . Locale::CURRENCY  . $amount . ' in your bank account';

In this example, the client code has a hidden dependency on the Locale class and with the exception of unknown state, suffers all the same problems as static variables.

What about "static factories"?

This technique is widespread and bizarrely always crops up as if it's an exception when discussing the merits of static methods. A static "factory method" is a method which creates another object for the purpose of delegating object creation to a third party and away from your client code. This is a fantastic idea: it means any dependencies that the object being created needs are not also dependencies in the client code. This is hugely beneficial in any non-trivial system. The problem is, static factories are one step forwards and two steps back. It's an odd solution to the problem of abstracting object creation. It recognises that a problem exists and attempts to solve it, but at the same time it shoots itself in the foot by adding little, if any, real benefit.

Like all static methods, they have the problem of introducing hidden dependencies. But even worse, they force all the dependencies which need to be passed into the objects they're creating to be accessible globally/statically. By making object creation static, it's irreplaceable. You can't substitute it, ever. For example, a simple static factory may be doing something like this:

class ModelFactory {
    //This is generally set earlier in the program.
    public static $db;
    public static $request;

    public static function create($name) {
        return new $name(self::$db, self::$request);
    }

}

By using a factory here, the dependencies $db and $request have been moved out of the client code. Any code which creates a model doesn't need to worry about locating the correct dependencies to do it. This is a very good thing. Any logic which needs to be placed here is in a central location. For example, some models may have additional dependencies, and logic in the create method can be added to account for this. Because of this, using factories for object creation is highly encouraged and something I plan to discuss in detail in another article.

However, the problem with static factory methods is that they share the same problems as any other static method: unknown state of the factory dependencies, hidden dependencies on the static factory class, and forcing all the dependencies to be accessible globally.

In the example above it's impossible for any model to use a $db variable other than the single one set statically. What if you want to instantiate two versions of the model using different database connections? It wouldn't be absurd to require two copies of the model to copy some data from one database to another. In this scenario, it's not possible without a significant amount of hacking around in the client code.

The solution, once again, is to make the factory a real object with a real constructor and pass the factory into the constructor of the code that requires it. What this achieves is that the code that needs to use factory-created objects is self-documenting: a developer looking at the constructor parameters will see the factory listed as part of the class API and instantly know that the class he or she is looking at will utilise the factory to create objects. If a static factory method was used, there would be no way to know this.

The second benefit is that ModelFactory::$db is not static meaning I could have two ModelFactory instances that were using different databases!
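A minimal sketch of such a non-static factory follows. The Connection, UserModel and Copier classes are hypothetical stand-ins, and the $request dependency from the earlier example is left out for brevity:

```php
<?php
// Hypothetical stand-in for a database connection
class Connection {
    public $name;

    public function __construct($name) {
        $this->name = $name;
    }
}

// Hypothetical model that receives its connection from the factory
class UserModel {
    public $db;

    public function __construct(Connection $db) {
        $this->db = $db;
    }
}

class ModelFactory {
    private $db;

    public function __construct(Connection $db) {
        $this->db = $db;
    }

    public function create($name) {
        return new $name($this->db);
    }
}

// The dependency on the factories is visible in the constructor signature
class Copier {
    private $source;
    private $target;

    public function __construct(ModelFactory $source, ModelFactory $target) {
        $this->source = $source;
        $this->target = $target;
    }

    public function getModels() {
        return [
            $this->source->create('UserModel'),
            $this->target->create('UserModel'),
        ];
    }
}

$copier = new Copier(
    new ModelFactory(new Connection('old_database')),
    new ModelFactory(new Connection('new_database'))
);

list($from, $to) = $copier->getModels();
echo $from->db->name . ' -> ' . $to->db->name;
```

Two factories, two connections, and the client code still never has to locate a database connection itself.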

So static is always bad?

Most of the time, if you're using static methods or variables then there is almost certainly a better way.

That said, there are a few instances where static methods and variables are useful.

However, all of these are accessed solely by methods in the class that also contains the statics, and do not have dependencies on anything outside them.

A good example is an auto increment type feature

<?php class Foo {
    private static $count = 0;
    public $id;

    public function __construct() {
        $this->id = self::$count++;
    }
}

for ($i = 0; $i < 10; $i++) {
    $foo = new Foo;
    echo $foo->id . '<br />';
}
?>

Although it is a very rare case, sometimes it is useful to store data across every class instance. Examples include file locking information for a file access class, performance-related tasks such as transparent caching of result sets across instances, and limiting the number of FTP connections which can be created at any one time. In these few edge cases (and others) static variables are an acceptable compromise. However, the implementation details of the static variables are hidden in the class itself, never accessed externally and only relied on when absolutely necessary.
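The transparent caching case could be sketched like this. The CountryLookup class is hypothetical, and the "expensive" lookup is simulated with a counter rather than a real query:

```php
<?php
// Hypothetical lookup class: the expensive work is simulated by a counter
class CountryLookup {
    // Shared by all instances, but invisible outside the class
    private static $cache = [];
    private static $loads = 0;

    public function find($code) {
        if (!isset(self::$cache[$code])) {
            // Pretend this is a slow database query or remote call
            self::$loads++;
            self::$cache[$code] = 'Country ' . strtoupper($code);
        }
        return self::$cache[$code];
    }

    public function getLoadCount() {
        return self::$loads;
    }
}

$first = new CountryLookup;
$second = new CountryLookup;

$first->find('gb');
$second->find('gb'); // served from the cache filled by $first

echo $first->getLoadCount(); // 1
```

Client code cannot read or alter the cache; it is purely an optimisation that every instance benefits from.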

Private static methods

What about private static methods that don't use global state? If they're private and don't rely on global state (so don't use static variables) are they a problem?

Well, no. They're not, but they are pointless! Take a look at the following method that uses regex to extract name="value" pairs from a string such as an HTML tag:

// takes html style attribute string such as: foo="bar" a="b" and returns the array ['foo' => 'bar', 'a' => 'b']
    private function getAttributes($html) {
        preg_match_all('/([^ ]+?)=([\'\"])([^\2]*?)\2/S', $html, $attr);
        $attributes = [];
        for ($i = 0, $c = count($attr[0]); $i < $c; $i ++) $attributes[trim($attr[1][$i])] = trim($attr[3][$i]);
        return $attributes;
    }

This method doesn't require any instance variables or static variables so could potentially be made static. However it's private (as public static methods break encapsulation!) so can only be called from a method in the same class e.g.:

class HTMLTagReader {
    // takes html style attribute string such as: foo="bar" a="b" and returns the array ['foo' => 'bar', 'a' => 'b']
    private function getAttributes($html) {
        preg_match_all('/([^ ]+?)=([\'\"])([^\2]*?)\2/S', $html, $attr);
        $attributes = [];
        for ($i = 0, $c = count($attr[0]); $i < $c; $i ++) $attributes[trim($attr[1][$i])] = trim($attr[3][$i]);
        return $attributes;
    }

    public function getData($html) {
        $attributes = $this->getAttributes($html);
        //do something with $attributes
        //...
    }

}

In this case, $this->getAttributes() is absolutely equivalent to self::getAttributes() as the method needs no variables apart from its arguments and can only be called from methods in the same class. This code is equivalent:

class HTMLTagReader {
    // takes html style attribute string such as: foo="bar" a="b" and returns the array ['foo' => 'bar', 'a' => 'b']
    private static function getAttributes($html) {
        preg_match_all('/([^ ]+?)=([\'\"])([^\2]*?)\2/S', $html, $attr);
        $attributes = array();
        for ($i = 0, $c = count($attr[0]); $i < $c; $i ++) $attributes[trim($attr[1][$i])] = trim($attr[3][$i]);
        return $attributes;
    }

    public function getData($html) {
        $attributes = self::getAttributes($html);
        //do something with $attributes
        //...
    }

}

So should this method be static or not? In this example it makes zero difference because in this case, the $this keyword is essentially making a static call as both self::getAttributes() and $this->getAttributes() will always use the implementation of getAttributes() defined in the HTMLTagReader class.

A word on Unit Testing

For the purpose of this article, I've ignored the elephant in the room, which is Unit Testing. The reason for this is that anyone who uses Test Driven Development (TDD) will already be familiar with the pitfalls of using static methods/variables, and it's more beneficial to show the issues from a practical perspective.

Conclusion

There are very few times when you should use static methods or variables and certainly never for locating dependencies. They should never be used by external classes and they cause far more problems than they solve. They result in poorly designed spaghetti code and try to introduce procedural code into an object-oriented world. They prevent objects from being able to guarantee they can fulfil their contracts and make code very difficult to debug and test. Code stops being self-documenting and a lot of the power of object-oriented programming is sacrificed entirely. Why would you ever want to impose those limitations on your code and anyone else who has to use it when there are almost always better methods that achieve the same thing?

Static variables always introduce global state (which is bad) and static methods always break encapsulation (which is also bad). Static should be avoided at all costs!