Test Driven Development: Are You Test-infected?

When given a set of requirements to develop an application, most programmers can knock out something that works eventually, right? But all too soon after come requests for changes, and with those changes come the bugs. Perhaps they start small initially. They may even go completely unnoticed, but over time these small bugs begin to plague your application, and make you want to run away from the project screaming with your arms in the air. Yet it’s not fixing the bugs that’s usually the problem – it’s finding them in the first place. And as we know, a programmer can spend hours, perhaps even days trying to track down a bug that requires one line of code to fix.

Aside from the accumulating bug problem, you’re also faced with the potential for feature-creep as you add a little extra code here and there. Before you know it, your application has all these fancy bells and whistles that nobody will ever use, and you have a lot more code to maintain than the original specification intended.

Three years ago this was an all-too-familiar scenario for me. I’d heard people advocating the use of test-driven development (TDD) but put simply, I just never bought into the idea. But when I eventually got around to giving it a go for myself, I never looked back! This seems to be a regular occurrence among software developers. Generally we say people who took the leap into TDD and subsequently never develop without it are “test-infected”. Apparently there’s still no known cure!

Don’t get me wrong … learning to use TDD correctly is not easy. I got off to an incredibly bumpy start and very nearly gave up. My tests were breaking whenever I made changes to my source code and I wasn’t sure how to write many of the tests at all. It took a lot of concentration and the end result wasn’t nearly as elegant as I had been led to expect. I wasn’t the first and I won’t be the last. But, as when you learn anything for the first time, making these mistakes is what you often need to do in order to understand why you must do things a particular way. Three years on, I’m still learning how to write better tests but the tests I do write are a lot more flexible and very maintainable.

Introducing TDD

Let me tell you a little about my experiences with TDD, how it’s added to the enjoyment I get from writing software, and how the quality of my source code has improved as a result.

To explain the basics of TDD to those who have never used it before, we should perhaps think about the traditional form of testing. Traditionally, the three stages of software development are Planning, Implementation, and Testing. A developer would spend roughly equal amounts of time on each of these steps. The problem here is that although the plans seem bulletproof and the code may not be too difficult to write, the testing just happens far too late in the game. Bug reports get back to the developer after the software is already built, but obviously, with nearing deadlines, this is far from an ideal situation.

In reality, the process has never been quite as clear-cut as those three steps, however. Developers usually test small portions of their code once they’ve finished writing them; we call those small portions of code “units”. Whether you’re using the system from an end-user perspective, or you’re adding debug output to the code and emulating particular scenarios the software might encounter, this sort of testing gets tedious very quickly. There are only so many times somebody can manually check values before they tire of repeating the process and simply stop. Traditionally, it’s unlikely that a developer would return to aspects of the system they’ve already written and perform that manual testing once again. As a result, small changes here and there are likely to introduce hidden bugs in other parts of the system.

Test-driven development, on the other hand, goes against the grain a little. The process looks more like Planning, Designing, Testing, Implementation, Refactoring, Designing, Testing, Implementation, Refactoring … and so on, with far less emphasis on the planning phase as compared with traditional development. The Designing step is important. In order to write a test, the developers need to know what they’re testing. To do this, they’re constantly designing very isolated components in the system. This encourages a great deal of flexibility with the code, since readily testable code and flexibility often go hand in hand.

TDD usually starts with a unit test framework. That is not to say that test-driven development is unit testing – it’s not. Because I develop primarily with PHP, I chose SimpleTest, written by Marcus Baker. PHP offers another popular unit test framework called PHPUnit, but I opted for SimpleTest because it was the more mature of the two, and had more accessible support (largely due to the fact the author is a regular visitor here on the SitePoint Forums). Almost all unit test frameworks are extremely similar. JUnit – Java’s most widely used unit test framework – brought this methodology into the mainstream. Various frameworks written in other programming languages subsequently emerged more-or-less following JUnit’s minimal API. We call this group of frameworks the xUnit frameworks.

How do we test the system before we’ve implemented it, though? The philosophy seems flawed at first glance, but once the concept begins to sink in, it makes perfect sense. The tests themselves are written in code and make various assertions about the behaviour of the system under test (SUT). Writing the test is done inside a unit test framework such as SimpleTest. The idea is to write a small, simple test that expresses a behavioural requirement. For example, the SUT may be expected to return a negative value when passed a particular set of arguments. The unit test framework provides the means to make these assertions, and gives useful feedback to the developer if one or more of those assertions fail.

In TDD, the developer will write a deliberately failing test (since the SUT won’t have any implementation for the requirement yet), then go on to write the most obvious and simplest code to make this test pass. Once the test passes, the developer writes the next (probably failing) test, implements a little code to make it pass, and then moves on. The end result is that over time you have two large sets of source code – one is the test code itself; the other is the source code of the software. It’s highly probable that there will be more lines of code providing test coverage than there will be actual source code. There are several benefits of keeping all that test code in your project directory:

Running the tests ensures that the behavioural requirements they specify are still met.
They provide support to developers who wish to modify the software without breaking or changing its existing behaviour.
They provide a form of documentation for developers who need to understand how the software works.

All the xUnit frameworks provide mechanisms to group these tests together into a test suite and to run them all in one go, automatically. This makes the process of retesting the software far less cumbersome and much more appealing than all that traditional manual testing.
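The idea of grouping tests and running them in one go can be sketched without any framework at all. The toy runner below is not SimpleTest's implementation; `TinyTestCase` and `runSuite()` are hypothetical names invented for this illustration. It only shows the general xUnit shape: methods whose names start with "test" are discovered and executed, and a suite totals the results.

```php
<?php
// Toy runner, for illustration only: this is NOT how SimpleTest is implemented.

/** Hypothetical minimal base class offering a single assertion. */
abstract class TinyTestCase {
  public $passes = 0;
  public $failures = array();

  /** Record a pass when the two values are equal (==), a failure otherwise. */
  protected function assertEqual($expected, $actual) {
    if ($expected == $actual) {
      $this->passes++;
    } else {
      $this->failures[] = 'Expected ' . var_export($expected, true)
        . ' but got ' . var_export($actual, true);
    }
  }

  /** Run every method visible from here whose name starts with "test". */
  public function run() {
    foreach (get_class_methods($this) as $method) {
      if (strpos($method, 'test') === 0) {
        $this->$method();
      }
    }
  }
}

/** Hypothetical suite: runs several test cases in one go and totals the results. */
function runSuite(array $testCases) {
  $passes = 0;
  $failures = 0;
  foreach ($testCases as $testCase) {
    $testCase->run();
    $passes += $testCase->passes;
    $failures += count($testCase->failures);
  }
  return array('passes' => $passes, 'failures' => $failures);
}

class AdditionTest extends TinyTestCase {
  public function testAddsTwoNumbers() {
    $this->assertEqual(4, 2 + 2);
  }
}

class StringTest extends TinyTestCase {
  public function testUppercases() {
    $this->assertEqual('FOO', strtoupper('foo'));
  }

  public function testDeliberateFailure() {
    // strrev('bar') is 'rab', so this assertion fails on purpose.
    $this->assertEqual('bar', strrev('bar'));
  }
}

$result = runSuite(array(new AdditionTest(), new StringTest()));
echo "Passes: {$result['passes']}, Failures: {$result['failures']}\n";
// prints "Passes: 2, Failures: 1"
```

A real framework adds reporting, set-up and tear-down hooks, and exception handling on top of this, but the loop at its core is much the same.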

TDD In Action

Let’s look at a simple example of how one would use TDD with the SimpleTest unit test framework in PHP. I’m afraid you’ll have to use your imagination here because this example is obviously incredibly minimal in comparison to the large-scale applications on which TDD is usually employed. It’s beyond the scope of this article to tackle a larger problem. This article is not intended to be a beginner’s introduction to TDD – there are plenty of other articles floating around the Internet for that. Here, we’ll take the example of a basic filter chain, which we’d like to use to generate hyperlinks within text and to filter out naughty words. You can download the code for the example in this article, along with the tests.

Our interface might look something like this:

...

/**
 * Performs a single filtering method on some input text.
 */
interface TextFilter {
  /** Process $text and return a filtered value */
  public function filter($text);
}

/**
 * Performs filtering on text using a series of filters.
 */
class TextFilterChain {
  /** Add a new filter to this chain */
  public function addFilter(TextFilter $filter) {
  }

  /** Pass $text through all filters and return the filtered value */
  public function filter($text) {
  }
}

...

We’d like an AutoHyperlinkFilter and a NaughtyWordFilter, so we pick one and create merely the skeleton code for it:

...

class NaughtyWordFilter implements TextFilter {
  public function filter($text) {
  }
}

...

Then we create a test case. A test case is a single class. Typically, you have one test case for each concrete class in your system. The “thing” that the test case is testing is, as we saw above, often referred to as the SUT (the system under test). Any method in the class whose name begins with “test” will be executed and reported on. We expect a set of naughty words to be replaced here:

...

class NaughtyWordFilterTest extends UnitTestCase {
  public function testNaughtyWordsAreReplaced() {
    $filter = $this->_createFilter(array('foo', 'bar'));
    $this->assertEqual(
      "smurf! There's no way I'm doing that smurf...",
      $filter->filter("foo! There's no way I'm doing that bar...")
      );
  }

  private function _createFilter($words = array()) {
    return new NaughtyWordFilter($words);
  }
}

...

Now we run it:

...

$test = new NaughtyWordFilterTest();
$test->run(new TextReporter());

...

The test fails because we haven’t implemented our NaughtyWordFilter yet. This is good because the failing test is our signal to go ahead and write some code. It’s a specification, if you like – a target to hit.

The failure looks something like this:

NaughtyWordFilterTest.php
1) Equal expectation fails at character 0 with [smurf! There's no way I'm doing that smurf...] and [] at [/Users/chris/word_filter/tests/unit/NaughtyWordFilterTest.php line 11]
  in testNaughtyWordsAreReplaced
  in NaughtyWordFilterTest
FAILURES!!!
Test cases run: 1/1, Passes: 0, Failures: 1, Exceptions: 0

All we do is implement enough code to make this test pass:

...

class NaughtyWordFilter implements TextFilter {
  private $_badWords = array();

  public function __construct($badWords = array()) {
    $this->_badWords = $badWords;
  }

  public function filter($text) {
    foreach ($this->_badWords as $badWord) {
      $text = str_replace($badWord, 'smurf', $text);
    }
    return $text;
  }
}

...

Now that this test is passing, we see a less intimidating output when we run the test:

NaughtyWordFilterTest.php
OK
Test cases run: 1/1, Passes: 1, Failures: 0, Exceptions: 0

Now we can move on to our next class, AutoHyperlinkFilter. Here’s the skeleton code:

...

class AutoHyperlinkFilter implements TextFilter {
  public function filter($text) {
  }
}

...

We write a failing test next. We expect URLs to be turned into hyperlinks:

...

class AutoHyperlinkFilterTest extends UnitTestCase {

  public function testURLsAreHyperlinked() {
    $filter = $this->_createFilter();
    $this->assertEqual(
      'Go to my web site at <a href="http://site.com/">http://site.com/</a> and see!',
      $filter->filter('Go to my web site at http://site.com/ and see!')
      );
  }

  private function _createFilter() {
    return new AutoHyperlinkFilter();
  }

}

...

Implementation is next:

...

class AutoHyperlinkFilter implements TextFilter {
  public function filter($text) {
    // \S+ matches the run of non-whitespace characters that follows "http://"
    return preg_replace('~(http://\S+)~i', '<a href="$1">$1</a>', $text);
  }
}

...

We work through all of our required components, writing really short, concise tests for the behaviour of each component until we’re finished.

So, why did I make some of those small (so small, in fact, that you may not even have noticed) design decisions along the way? Why, for example, did I choose to specify some “naughty words” using the constructor of NaughtyWordFilter? The answer is that this felt like the cleanest solution at an API level. It just came instinctively. This is the sort of constant design thought process that developers go through to produce clean, flexible, testable code. Writing the tests encourages you to think carefully about the interface of your code, otherwise you won’t be able to test it very easily.

Often, you’ll want to avoid using real components (which may have their own bugs, or may be awkward to set up) in a test so that you can focus entirely on the behaviour of the SUT. In this case, we use mock objects. Mock objects are objects that look and feel like real objects but are able to replace real components and play their roles in the SUT (they’re often referred to as “actors” or “stubs”). Mock objects are also able to set expectations about what the SUT will do with them, and to fail the test if those expectations aren’t met (often referred to as “critics”). This makes them an extremely powerful tool for use within any unit test framework.

SimpleTest provides its own mock object framework, but with other programming languages, you often have to download a separate tool if you want automated mock object generation.
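To make the two roles concrete, here is a hand-written test double in plain PHP, with no framework involved. It is only a sketch: `RecordingTextFilter` and `TrimmingFilter` are hypothetical classes invented for this illustration, and a generated mock does the same job with far less typing.

```php
<?php
/** Same interface as in the article. */
interface TextFilter {
  public function filter($text);
}

/**
 * Hand-written test double (hypothetical name).
 * It stands in for a real filter by returning a canned value, and it
 * records every argument it receives so the test can inspect them later.
 */
class RecordingTextFilter implements TextFilter {
  public $receivedArguments = array();
  private $_cannedReturnValue;

  public function __construct($cannedReturnValue) {
    $this->_cannedReturnValue = $cannedReturnValue;
  }

  public function filter($text) {
    $this->receivedArguments[] = $text;
    return $this->_cannedReturnValue;
  }
}

/** Hypothetical SUT: trims the input, then delegates to a collaborator. */
class TrimmingFilter implements TextFilter {
  private $_inner;

  public function __construct(TextFilter $inner) {
    $this->_inner = $inner;
  }

  public function filter($text) {
    return $this->_inner->filter(trim($text));
  }
}

$double = new RecordingTextFilter('***FOO***');
$sut = new TrimmingFilter($double);
$output = $sut->filter('  foo  ');

// "Actor" role: the SUT hands back whatever the double returned.
var_dump($output === '***FOO***');

// "Critic" role: the double was called exactly once, with the trimmed text.
var_dump($double->receivedArguments === array('foo'));
```

The difference with a true mock object is that the expectations are declared up front and checked by the framework, rather than inspected by hand after the call.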

Let’s quickly write our TextFilterChain class. Since this class has dependencies on instances of the TextFilter interface, it presents a prime opportunity for using mock objects.

First, we create a failing test case with a generated mock object. We expect each filter to be invoked:

...
Mock::generate('TextFilter', 'MockTextFilter');

class TextFilterChainTest extends UnitTestCase {
  private $_filterChain;

  public function setUp() {
    $this->_filterChain = new TextFilterChain();
  }

  public function testEachFilterIsInvoked() {
    $filter1 = $this->_createMockFilter();
    $filter2 = $this->_createMockFilter();

    $filter1->expectOnce('filter');
    $filter2->expectOnce('filter');

    $this->_filterChain->addFilter($filter1);
    $this->_filterChain->addFilter($filter2);

    $this->_filterChain->filter('foo');
  }

  private function _createMockFilter() {
    return new MockTextFilter();
  }
}

...

The failure looks something like this:

TextFilterChainTest.php
1) Expected call count for [filter] was [1] got [0] at [/Users/chris/word_filter/tests/unit/TextFilterChainTest.php line 20]
  in testEachFilterIsInvoked
  in TextFilterChainTest
2) Expected call count for [filter] was [1] got [0] at [/Users/chris/word_filter/tests/unit/TextFilterChainTest.php line 21]
  in testEachFilterIsInvoked
  in TextFilterChainTest
FAILURES!!!
Test cases run: 1/1, Passes: 0, Failures: 2, Exceptions: 0

The test specified that if two filters are added to the filter chain, the filter chain should call the filter() method on both of them. This hasn’t happened, because we haven’t implemented such a feature yet.

Implementation follows:

...

class TextFilterChain {
  private $_filters = array();

  public function addFilter(TextFilter $filter) {
    $this->_filters[] = $filter;
  }

  public function filter($text) {
    foreach ($this->_filters as $filter) {
      $filter->filter($text);
    }
  }
}

...

The test now passes, and we move on to specify what else should happen. Each filter should be given the text we pass in:

...

  public function testFilterInvocationReceivesTextInput() {
    $filter = $this->_createMockFilter();

    $filter->expectOnce('filter', array('foo'));

    $this->_filterChain->addFilter($filter);

    $this->_filterChain->filter('foo');
  }

...

This particular test already passes, so we move on. If one filter changes the text, the next should receive the changed value:

...

  public function testChangesAreChained() {
    $filter1 = $this->_createMockFilter();
    $filter2 = $this->_createMockFilter();

    $filter1->expectOnce('filter', array('foo'));
    $filter1->setReturnValue('filter', '***FOO***');
    $filter2->expectOnce('filter', array('***FOO***'));

    $this->_filterChain->addFilter($filter1);
    $this->_filterChain->addFilter($filter2);

    $this->_filterChain->filter('foo');
  }

...

The test fails, so we adjust our implementation to make it pass:

...

  public function filter($text) {
    foreach ($this->_filters as $filter) {
      $text = $filter->filter($text);
    }
  }

...

Finally, we expect the filtered value to be returned from the chain:

...

  public function testFilteredValueIsReturnedFromChain() {
    $filter = $this->_createMockFilter();

    $filter->expectOnce('filter', array('foo'));
    $filter->setReturnValue('filter', '***FOO***');

    $this->_filterChain->addFilter($filter);

    $this->assertEqual('***FOO***', $this->_filterChain->filter('foo'));
  }

...

Faced with the failing test, we adjust our code:

...

  public function filter($text) {
    foreach ($this->_filters as $filter) {
      $text = $filter->filter($text);
    }
    return $text;
  }

...

That’s it, we’re done! If all these tests are collected into a test suite, they can be executed in a single test run, automatically. Obviously, the small scale of this project doesn’t make an overwhelming case for using TDD, but when you extrapolate the same technique into a much larger project, the benefits soon begin to speak for themselves.
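For reference, here is how the finished pieces might be wired together outside the tests. The classes are the ones developed above, gathered into a single script; the sample sentence is made up for this illustration, and the hyperlink pattern uses `\S+` to match the non-whitespace characters of the URL.

```php
<?php
interface TextFilter {
  /** Process $text and return a filtered value */
  public function filter($text);
}

class NaughtyWordFilter implements TextFilter {
  private $_badWords = array();

  public function __construct($badWords = array()) {
    $this->_badWords = $badWords;
  }

  public function filter($text) {
    foreach ($this->_badWords as $badWord) {
      $text = str_replace($badWord, 'smurf', $text);
    }
    return $text;
  }
}

class AutoHyperlinkFilter implements TextFilter {
  public function filter($text) {
    return preg_replace('~(http://\S+)~i', '<a href="$1">$1</a>', $text);
  }
}

class TextFilterChain {
  private $_filters = array();

  public function addFilter(TextFilter $filter) {
    $this->_filters[] = $filter;
  }

  public function filter($text) {
    // Each filter receives the output of the one before it.
    foreach ($this->_filters as $filter) {
      $text = $filter->filter($text);
    }
    return $text;
  }
}

$chain = new TextFilterChain();
$chain->addFilter(new NaughtyWordFilter(array('foo', 'bar')));
$chain->addFilter(new AutoHyperlinkFilter());

$result = $chain->filter('foo! Go to http://site.com/ and see the bar.');
echo $result . "\n";
// prints: smurf! Go to <a href="http://site.com/">http://site.com/</a> and see the smurf.
```

Because the chain depends only on the TextFilter interface, new filters can be added later without touching TextFilterChain or its tests.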

Getting Test Infected

I’m currently in the process of rewriting a well-known open source software project of mine. Just a few weeks back, I decided that one of my earlier design decisions could have been a little better so I set about a massive refactoring exercise. Without the tests, I would never have dared attempt this exercise, given the scale of it, and even if I had braved it, I wouldn’t have believed that there were no hidden bugs. I had 77 test cases covering that several thousand-line code base at the time, and the comfort offered by those tests was significant, to say the least. Initially, the tests failed due to the scale of the work I was doing, but these failing tests acted as a guide while I changed interfaces, moved code around, and renamed methods. Eventually, one by one, the tests began to pass, until finally all 77 of them passed. I felt reassured that the huge changes I’d just deployed to my code base were non-breaking.

Back to a point I made earlier in this article, though. TDD wasn’t always this satisfying for me. In fact, my first few months with TDD were hell: half of the time, I was fixing up tests that were failing because I changed some implementation detail slightly. Although the system still worked, my tests said otherwise. We call these tests fragile tests. I also spent a lot of time tearing out my hair wondering how to even begin to test a particular part of the system. It’s all too common a story, and it’s a shame to hear that developers have decided not to use TDD as a result of these initial tough experiences. Let me share some tips with you that I’ve learned along the way:

Write extremely short, concise test methods – a good rule of thumb is one assertion per test method, but it’s only a rule of thumb!
Never – and I can’t stress this enough – test non-public parts of the system. Even if you think they play an important role, they’re extremely likely to change with refactoring. TDD focuses on behaviour, not on implementation.
Add some abstraction between the test and the SUT. Specifically, create the SUT in the setUp() method of your xUnit framework where possible, and/or create small factory methods for creating the SUT. These factory methods are often referred to as Creation Methods. Creation Methods make it easier to minimize the number of places in which you’ll need to edit your tests if you change the way you need to initialize the SUT.
If you feel that you’re repeating yourself when you’re testing a common set of components, consider whether you can refactor to provide an abstract superclass to test some of the common functionality.
Opt to use dependency injection when creating components. Doing this significantly increases the test-friendliness of your code – it lets you gain more control through the use of mock objects.
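Two of the tips above, Creation Methods and dependency injection, can be sketched together in plain PHP. Everything here apart from the TextFilter interface is hypothetical and invented for illustration: `WordListSource`, `ConfigurableWordFilter`, `FixedWordList` and `createFilter()` are not part of the article's download.

```php
<?php
interface TextFilter {
  public function filter($text);
}

/** Hypothetical collaborator: supplies the list of banned words. */
interface WordListSource {
  public function getWords();
}

/**
 * Hypothetical SUT. The word source is injected through the constructor
 * instead of being created inside the class, so a test can swap it out.
 */
class ConfigurableWordFilter implements TextFilter {
  private $_source;

  public function __construct(WordListSource $source) {
    $this->_source = $source;
  }

  public function filter($text) {
    // str_replace() accepts an array of search strings.
    return str_replace($this->_source->getWords(), 'smurf', $text);
  }
}

/** Test double: a fixed in-memory list stands in for a file or database. */
class FixedWordList implements WordListSource {
  private $_words;

  public function __construct(array $words) {
    $this->_words = $words;
  }

  public function getWords() {
    return $this->_words;
  }
}

/**
 * Creation Method: the one place the tests know how the SUT is built.
 * If the constructor changes, only this function needs editing.
 */
function createFilter(array $words) {
  return new ConfigurableWordFilter(new FixedWordList($words));
}

$filtered = createFilter(array('foo', 'bar'))->filter('foo and bar');
echo $filtered . "\n";
// prints "smurf and smurf"
```

In a real test case the Creation Method would usually be a private method on the test class, much like `_createFilter()` in the examples earlier in this article.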

Although these solutions became obvious to me over time, I must admit that they became obvious to many other developers years ago. Gerard Meszaros has an entire 800-page book dedicated to this subject area. xUnit Test Patterns – Refactoring Test Code demonstrates the common approaches developers have found to improve the way they automate tests. If you have some testing experience, it’s well worth a read.

I hope this article was insightful for those who have dabbled in TDD. I hope it’s encouraging for those who have tried TDD in the past, but gave up when faced with similar problems to the ones I dealt with in my early TDD days. More importantly, I hope it provided an interesting read about my experiences with TDD for everybody who made it this far!

And don’t forget you can download the code for the example in this article, along with the tests. Go away and have a play. Be warned, though – it’s contagious …