Weird shapesThe first time I read about serious testing was in The Pragmatic Programmer. The book explains the usual (boring) benefits of testing, but a twisted detail rolled my eyes up: also test your tests. Testing is a net that helps you to change the code without breaking the logic, and as a real life net, you should verify it works as expected. Tests should be in a tight relation with the code.

When is a test good? Trying to find the differences between a good test and a bad one is not obvious, however. Looking for lacks or anti-patterns in our tests is a good option to improve them.

Thanks to PHPUnit and Xdebug, the PHP community started to care about testing years ago. Since then, the easiest way to show the quality of a test suite is the code coverage, that is, the percentage of the code the test stresses. That worked until programmers started to focus on a 100% coverage, creating artificial tests that doesn’t stress the logic correctly, but instead get a fake 100% line coverage. If a line is executed once, even if the subject was a different test unit that uses that class, the line “is tested”.

Are you really testing each class? Following the logic? Even if you use proper unit tests, you may be missing things.

Let’s start with a stupid example, a function that does an “AND”, and a tests that gets a 100% coverage:

class MyOperator
{
    public function doAnd($param1, $param2)
    {
        if ($param1) {
            if ($param2) {
                return true;
            }
        }
        return false;
    }
}

class MyOperatorTest extends \PHPUnit_Framework_TestCase
{
    public function testDoAnd()
    {
        $operator = new MyOperator();
        $this->assertEquals(true,  $operator->doAnd(true, true));
        $this->assertEquals(false, $operator->doAnd(true, false));
    }
}

The test is only stressing 2 cases! Actually an “AND” has 4 possible cases, so the 2 missing cases (false-true, false-false) were totally ignored, despite you get a 100% line coverage.

This was a basic example to show the difference between line coverage and path coverage (in this case, 4 possible paths). The good news is that Derick Rethans is working on it. I wonder how many programmers will get surprised while seeing their code’s path coverage is low.

Another way to test your tests is to change the source code and see if the test fails (it should!) or not. This is called Mutation Testing, and helps to detect when a test is not working perfectly. For instance:

    public function biggerThan5($number)
    {
        if ($number > 5) {
            return true;
        }
        return false;
    }
/* ... */
    public function testBiggerThan5()
    {
        $operator = new MyOperator();
        $this->assertEquals(true, $operator->biggerThan5(8));
        $this->assertEquals(false, $operator->biggerThan5(3));
    }

This test looks complete, but there is no test for the bound case, biggerThat5(5). This test is not really accurate.

In the PHP ecosystem there are 2 only available Mutation Testing tools: Humbug and Mutatesting. The second one, despite the author is also the creator of the excellent PHP-metrics, seems abandoned.

So the only real option is Humbug, developed by the author of Mockery. Unluckily it only works with PHPUnit for the moment. It basically finds places where the code can be easily changed, like a true for a false, or a number N for N+1, and runs the tests to see if that mutation is killed (that is, the test fails). For instance, in the previous example it changes 5 to 6, and the tests still work, so the mutation was not killed.

I just hope these tools become more popular, in order to improve the quality of our industry. And let’s hope soon Humbug will work in PHPspec too, as many companies are moving from PHPUnit to Behat-PHPspec.

The code of this post can be find at its github repo.