Mutation Testing with Mutant

Key Takeaways

Mutant is a mutation tester for Ruby that modifies code in small ways to create ‘mutants’. If these changes cause a test to fail, the mutant is considered ‘killed’. Only if all mutants are killed can the code be considered to have 100% mutation coverage.
Mutation testing, while time-consuming, can be a powerful tool to ensure the effectiveness of test cases. It deliberately introduces faults into the code to see if the test cases can detect these changes, thereby improving the overall quality of the software.
Mutant has been used successfully in various open source and commercial projects. It is currently available for MRI and Rubinius, with JRuby support planned. However, it should be noted that support for Ruby 2.1.0 is unstable.
Mutation testing should not replace other testing methods, but rather complement them. It provides a unique perspective on the quality of test cases and should be used in conjunction with other methods to ensure comprehensive software testing.

As Rubyists we are no strangers to testing. Unit testing is not just best practice, it is dogma. Code is considered broken until we have tests to prove otherwise. But not all test suites are created equal. Writing good tests is an art, and bad tests can be worse than no tests, slowing development down without increasing the confidence we have in our code.

So who tests the tests? The question may seem frivolous, but the answer is simple: Mutant does!

Mutant is a mutation tester. To appreciate what it does, let’s imagine doing its job by hand. Appoint a person on your team to be the saboteur. Their job is to pick a piece of fully tested code and deliberately introduce defects. If this can be done without raising an alarm, in other words, without causing a test to fail, then a test is missing. It’s then up to the original author to add a test case to detect the sabotage.

Do this long enough and it will become very difficult to find code that can still be tampered with freely. Contrast this with traditional “line coverage” tools. Does 100% line coverage mean the code is impervious to sabotage? Certainly not! In fact, it’s possible to write tests that execute every single line of code without making a single useful assertion about them. The fun our saboteur will have!

Mutant automates this process. It changes your code in many small ways, creating hordes of mutants. If this freak code causes a test to fail, the mutant is considered killed. Only if, at the end of the line, not a single mutant is left alive have you achieved 100% mutation coverage.
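
To make the idea concrete, here is a minimal hand-rolled sketch of the kill-or-survive loop. It is not how Mutant works internally: the `original_max` subject, the two hand-written mutants, and both "suites" are invented purely for illustration. What it shows is that a suite can execute every line and still let every mutant live.

```ruby
# Illustration only: every name below is made up for this sketch.

# The subject under test: returns the larger of two values.
def original_max(a, b)
  a > b ? a : b
end

# Two hand-written mutants of original_max, keyed by a description.
def mutants
  {
    'replace > with <'    => ->(a, b) { a < b ? a : b },
    'always return first' => ->(a, _b) { a }
  }
end

# Executes the code (full line coverage) but never checks the result.
def weak_suite(max)
  max.call(1, 2)
  true
end

# Asserts on the return value for both orderings of the arguments.
def strong_suite(max)
  max.call(1, 2) == 2 && max.call(5, 3) == 5
end

# A mutant survives when the suite still passes against the mutated code.
def survivors(suite_name)
  mutants.select { |_description, mutant| send(suite_name, mutant) }.keys
end

puts "weak suite survivors:   #{survivors(:weak_suite).inspect}"
puts "strong suite survivors: #{survivors(:strong_suite).inspect}"
```

Both suites pass against the original code, yet only the second one kills the mutants. That gap is exactly what line coverage cannot see.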

We’ll explain Mutant with an example from the real world, demonstrating both the workings and the workflow. Our running example will be a tool that takes a local HTML file as its input, and bundles all local and remote assets together in a directory, so the document can be viewed afterwards without a network connection. Here’s how to use it:



AssetPackager.new('foo/bar.html').write_to('baz')

The result is a file baz.html, and a directory baz_assets containing all stylesheets, scripts and images. When encountering a reference like



<link rel="stylesheet" href="http://example.com/style.css" />

it will download the stylesheet, give it a unique file name based on its contents, and rewrite the reference to point at the local copy:

<link rel="stylesheet" href="baz_assets/48d6215903dff56238e52e8891380c8f.css" />
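
A content-based file name like this one can be derived with a digest. The sketch below is an assumption rather than the project's actual code: `local_asset_path` is a hypothetical helper, and MD5 is chosen only because it yields 32 hexadecimal characters, like the name in the example above.

```ruby
require 'digest/md5'
require 'uri'

# Hypothetical helper, not taken from the project: derives a local file name
# from an asset's contents, keeping the extension of the original URI.
# MD5 is an assumption; any stable digest would serve the same purpose.
def local_asset_path(contents, uri, assets_dir)
  extension = File.extname(URI(uri).path)
  File.join(assets_dir, Digest::MD5.hexdigest(contents) + extension)
end

css = 'section { color: blue; }'
puts local_asset_path(css, 'http://example.com/style.css', 'baz_assets')
```

Because the name depends only on the contents, the same stylesheet referenced from two different URLs ends up as a single file.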

I only have space to reproduce the interesting bits here. The full revision history can be found on GitHub.

As a first step, we’ll write a method that can handle the different types of URIs we care about. HTTP and HTTPS URIs need to be retrieved over the network; relative URIs, as well as URIs using the file:// scheme, will be looked up on the local file system.

This is the implementation:



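require 'net/http'
require 'pathname'
require 'uri'

module AssetPackager
  class Processor
    attr_reader :cwd

    # @param cwd [Pathname] The working directory for resolving relative paths
    def initialize(cwd)
      @cwd = cwd
    end

    # Fetches remote assets over HTTP(S) and reads everything else from disk.
    def retrieve_asset(uri)
      uri = URI(uri)
      case
      # http://, https://, or a scheme-less URI with a host (//example.com/foo)
      when %w[http https].include?(uri.scheme) || uri.scheme.nil? && uri.host
        Net::HTTP.get(uri)
      # relative or absolute local paths, and file:// URIs
      when uri.scheme.nil? || uri.scheme == 'file'
        File.read(cwd.join(uri.path))
      end
    end
  end
end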

And the first version of our tests. For the local URI’s, we’ll point to a fixture file. For the remote URI’s, we’ll mock out the call to Net::HTTP.get.



describe AssetPackager::Processor do
  let(:cwd) { AssetPackager::ROOT }
  let(:processor) { AssetPackager::Processor.new(cwd) }

  describe '#retrieve_asset' do
    subject(:asset) { processor.retrieve_asset(uri) }

    shared_examples 'local files' do |uri|
      it 'should load the file from the local file system' do
        expect(processor.retrieve_asset(uri)).to eq 'section { color: blue; }'
      end
    end

    shared_examples 'remote URIs' do |uri|
      it 'should retrieve the file through Net::HTTP' do
        expect(Net::HTTP).to receive(:get).with(URI(uri)).and_return('abc')
        expect(processor.retrieve_asset(uri)).to eq 'abc'
      end
    end

    fixture_pathname = AssetPackager::ROOT.join 'spec/fixtures/section.css'

    include_examples 'local files', fixture_pathname.to_s
    include_examples 'local files', "file://#{fixture_pathname}"

    include_examples 'remote URIs', 'http://foo.bar/baz'
    include_examples 'remote URIs', 'https://foo.bar/baz'
  end
end

According to RSpec all is green and good, and we’re certainly covering all lines of retrieve_asset. Let’s see what Mutant has to say.



mutant -I lib -r asset_packager --use rspec 'AssetPackager*'

That’s a mouthful. First, we tell Mutant how to load the code under test with -I (--include) and -r (--require), which mirror Ruby’s own -I and -r flags.

Then specify which “strategy” to use to “kill” mutants. Currently only the RSpec strategy is implemented, which makes for easy picking. Finally, hand Mutant one or more “patterns”. In this case, tell it to do its magic on the complete AssetPackager namespace (notice the *).

We could also pass it the name of a single class, module, class method (Foo::Bar.the_method), or instance method (Foo::Bar#an_instance_method).

Based on the pattern, Mutant will search for subjects to drag off to the lab and have their genes rearranged. Mutant can currently handle instance and class methods. Meta-programming constructs like attr_accessor or class-level DSLs are not supported, although there is talk of handling specific DSLs through plug-ins.



AssetPackager::Processor#initialize
........
(08/08) 100% - 0.45s
AssetPackager::Processor#retrieve_asset
...................F...........F.................
(47/49)  95% - 3.49s
evil:AssetPackager::Processor#retrieve_asset
@@ -1,10 +1,10 @@
 def retrieve_asset(uri)
   uri = URI(uri)
   case
-  when ["http", "https"].include?(uri.scheme) || (uri.scheme.nil? && uri.host)
+  when ["http", "https"].include?(uri.scheme)
     Net::HTTP.get(uri)
   when uri.scheme.nil? || (uri.scheme == "file")
     File.read(cwd.join(uri.path))
   end
 end
evil:AssetPackager::Processor#retrieve_asset
@@ -1,10 +1,10 @@
 def retrieve_asset(uri)
   uri = URI(uri)
   case
   when ["http", "https"].include?(uri.scheme) || (uri.scheme.nil? && uri.host)
     Net::HTTP.get(uri)
   when uri.scheme.nil? || (uri.scheme == "file")
-    File.read(cwd.join(uri.path))
+    File.read(uri.path)
   end
 end
(47/49)  95% - 3.49s
Subjects:  2
Mutations: 57
Kills:     55
Alive:     2
Overhead:  29.31%
Coverage:  96.49%
Expected:  100.00%

Having a closer look at Mutant’s output, it found two subjects to operate on, #initialize, and #retrieve_asset. For each, the output looks a lot like any old test runner, with green dots and red F’s indicating success or failure. In this case, though, a character doesn’t correspond with a single succeeding or failing test, but with a complete run of the test suite, exercised against a mutated version of the subject.

Our constructor is a simple enough method, but Mutant still managed to find 8 ways to change it. These include omitting the argument list, or assigning nil instead of a value. However, none of these freak versions made it past our defenses. The same can’t be said of #retrieve_asset. There, 49 mutants were created, and at the end of the run two are left alive! This means we have behavior in our code that is unspecified by our tests. Let’s fix that before the mutants come back to haunt us with production incidents.
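
The coverage figure in the report follows directly from these counts. As a quick sanity check, here is the arithmetic in Ruby; `mutation_coverage` is a helper made up for this sketch, not part of Mutant, and the numbers are copied from the run above.

```ruby
# Mutation coverage as a percentage: killed mutants over generated mutants.
# Hypothetical helper for illustration; Mutant computes this for you.
def mutation_coverage(kills, mutations)
  (kills.to_f / mutations * 100).round(2)
end

# Per-subject numbers from the run above: [killed, total].
subjects = {
  'AssetPackager::Processor#initialize'     => [8, 8],
  'AssetPackager::Processor#retrieve_asset' => [47, 49]
}

subjects.each do |name, (kills, mutations)|
  puts "#{name}: #{mutation_coverage(kills, mutations)}%"
end

total_kills     = subjects.values.sum { |kills, _| kills }
total_mutations = subjects.values.sum { |_, mutations| mutations }
puts "Overall: #{mutation_coverage(total_kills, total_mutations)}%" # 55 of 57
```

The overall result matches the 96.49% in the summary. The per-subject line for #retrieve_asset shows 95% where this calculation gives 95.92, so that line appears to truncate rather than round.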

To make life easier, also stick the Mutant invocation in a Rakefile, and tell Mutant to fail when mutation coverage is below 100%. This way we can run rake mutant from our CI to make sure everything stays fully covered.



require 'mutant'

desc 'Run mutation tests on the full AssetPackager namespace'
task :mutant do
  result = Mutant::CLI.run(%w[-Ilib -rasset_packager --use rspec --score 100 AssetPackager*])
  fail unless result == Mutant::CLI::EXIT_SUCCESS
end

Now to dissect the mutants that are left alive. For each altered version of the code that made it past our defenses, Mutant gives us an easy-to-read diff.



-  when ["http", "https"].include?(uri.scheme) || (uri.scheme.nil? && uri.host)
+  when ["http", "https"].include?(uri.scheme)

Here our sabotaging mutation tester deleted the second half of the conditional, which is supposed to recognize URIs of the form //example.com/foo/bar. This was indeed a case we forgot to cover in our tests, but that’s easy to fix.



include_examples 'remote URIs', '//foo.bar/baz'
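
To see why that second half of the conditional matters, it helps to look at how Ruby's URI library parses each kind of reference. In the sketch below, `remote?` is a hypothetical helper that mirrors the first `when` branch of retrieve_asset; it is not part of the project.

```ruby
require 'uri'

# Mirrors the first `when` branch of retrieve_asset. The name `remote?` is
# made up for this illustration.
def remote?(reference)
  uri = URI(reference)
  %w[http https].include?(uri.scheme) || (uri.scheme.nil? && !uri.host.nil?)
end

references = %w[
  http://foo.bar/baz
  https://foo.bar/baz
  //foo.bar/baz
  assets/stuff.js
  file:///tmp/style.css
]

references.each do |reference|
  uri = URI(reference)
  puts "#{reference}: scheme=#{uri.scheme.inspect} remote=#{remote?(reference)}"
end
```

A reference like //foo.bar/baz parses with no scheme but with a host, so only the `uri.scheme.nil? && uri.host` clause can recognize it as remote. Delete that clause, as the mutant did, and it silently falls through to the local file branch.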

The second diff initially leaves us a bit stumped though.



-    File.read(cwd.join(uri.path))
+    File.read(uri.path)

We need to be able to resolve both absolute (/foo/bar/style.css) and relative (assets/stuff.js) local files. For relative paths, we look them up starting from the “current working directory” or cwd, a Pathname instance. For absolute paths, join will simply pass through the absolute path. This code should cover both cases, and we cover both in our tests, but according to Mutant, removing the call to cwd.join doesn’t make a difference. The test for the relative path isn’t working properly.

On closer inspection, the path used in our test as the “working directory” is the same location from which we run the tests. In the mutated version, File.read gets the relative path, and resolves it for us. To make sure our path resolution works as expected we need to change the test to work off a different directory.



describe 'with a relative path' do
  let(:cwd) { super().join('spec/fixtures') }
  let(:uri) { fixture_pathname.relative_path_from(cwd).to_s }

  include_examples 'local files'
end
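
The reason the mutant survived can be reproduced outside the test suite. The following sketch uses a temporary directory and two stand-in methods, `read_original` and `read_mutant`, which are hypothetical names for the two versions of the lookup shown in the diff.

```ruby
require 'pathname'
require 'tmpdir'

# The original lookup and the surviving mutant, as standalone methods.
def read_original(cwd, path)
  File.read(cwd.join(path))
end

def read_mutant(_cwd, path)
  File.read(path) # resolved against the process working directory
end

# True when the mutant returns the same contents as the original.
def mutant_survives?(cwd, path)
  read_mutant(cwd, path) == read_original(cwd, path)
rescue Errno::ENOENT
  false
end

Dir.mktmpdir do |dir|
  root     = Pathname(dir).realpath
  fixtures = root.join('spec/fixtures')
  fixtures.mkpath
  fixtures.join('section.css').write('section { color: blue; }')

  Dir.chdir(root.to_s) do
    # cwd equals the process working directory: the mutant goes unnoticed.
    puts mutant_survives?(root, 'spec/fixtures/section.css') # => true
    # cwd differs from the process working directory: the mutant is exposed.
    puts mutant_survives?(fixtures, 'section.css')           # => false
  end
end

# Pathname#join passes an absolute argument through unchanged.
puts Pathname('/foo').join('/bar/style.css') # => /bar/style.css
```

As long as the configured cwd and the directory the tests run from coincide, the two versions are indistinguishable, which is why the fix is to point cwd somewhere else.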

It is possible that a test-first, watch-the-test-fail style of development would have caught this error. But through the life of a bigger project, some things are bound to be missed. Especially after refactoring, you’ll encounter lots of live mutants indicating untested behavior. By going back to full mutation coverage, you will also find any defects that slipped in while refactoring.

Mutation testing isn’t new. It has been around since the seventies, and an experimental gem called Heckle was the first to bring it to Ruby. Heckle had some significant shortcomings, however. It never supported all possible Ruby syntax, and the latest release dates from 2009, making newer Ruby versions completely off limits.

This led Markus Schirp, part of the ROM team (formerly DataMapper), to start working on Mutant, an ambitious effort to write a robust, production-ready mutation tester. Mutant is still pre-1.0, but is already used with success on various open source and commercial projects.

It’s no small feat to get a tool like Mutant right. A problem seen in the early days was that by altering the syntax tree, Mutant could generate code that isn’t syntactically valid Ruby, such as the following:



def foo(a = 1, b, c = 2) # default value of the second optional argument deleted
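
Whether a generated snippet is valid Ruby can be checked without running it. The sketch below is one way to do so on MRI and is not taken from Mutant's code base; `valid_ruby?` is a hypothetical helper, and RubyVM::InstructionSequence is specific to MRI.

```ruby
# Returns true when the source compiles, false when it is a syntax error.
# Hypothetical helper for illustration; the code is compiled, never executed.
def valid_ruby?(source)
  RubyVM::InstructionSequence.compile(source)
  true
rescue SyntaxError
  false
end

# Optional arguments must form one contiguous group, so the mutated
# signature is rejected while the untouched one is accepted.
puts valid_ruby?('def foo(a = 1, b = 3, c = 2); end')
puts valid_ruby?('def foo(a = 1, b, c = 2); end')
```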

These problems all seem to have been solved now. Under the hood, Mutant is powered by the excellent Parser and Unparser gems, which have been validated against RubySpec, the Rails code base, and more.

Mutant is currently available for MRI and Rubinius. JRuby support is planned, but stalled on the fact that JRuby does not support the fork system call. Support for Ruby 2.1.0 is unstable.

If your Ruby version supports it, you need to get Mutant into your workflow. It may save your app’s life.

Frequently Asked Questions (FAQs) about Mutation Testing

What is the main purpose of mutation testing?

Mutation testing is a type of software testing where we modify the software’s source code and check if the existing test cases can find the errors. It’s a way to measure the effectiveness of your test cases. It helps in validating the quality of your tests and ensures that they are robust enough to catch errors, thereby improving the overall quality of the software.

How does mutation testing differ from other testing methods?

Unlike other testing methods that focus on finding bugs in the code, mutation testing aims to assess the quality of the test cases themselves. It deliberately introduces faults or mutations in the code to see if the test cases can detect these changes. This makes it a unique and powerful tool for improving test case effectiveness.

What are the types of mutants in mutation testing?

Mutants are classified by what happens to them: killed or survived. A killed mutant is one that caused at least one test to fail, which shows that the tests detect that change. A survived mutant, on the other hand, is one where the change went undetected, suggesting that a test is missing or too weak.

What are the challenges in mutation testing?

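Mutation testing can be time-consuming and computationally expensive, especially for large codebases, because the test suite is run once per mutant. It also requires a good understanding of the code and the testing process. Additionally, interpreting the results can be complex, as not all surviving mutants indicate a problem with the test cases: some mutations produce code that behaves exactly like the original (so-called equivalent mutants) and cannot be killed by any test.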

How can I improve the effectiveness of my test cases using mutation testing?

By identifying the survived mutants, you can gain insights into potential weaknesses in your test cases. You can then focus on improving these areas, for example, by adding new test cases or modifying existing ones. This iterative process can significantly enhance the effectiveness of your test cases over time.

Is mutation testing applicable to all types of software?

While mutation testing can be applied to any software, it is particularly useful for critical systems where high test coverage is essential. It can also be beneficial for complex systems where traditional testing methods may not be sufficient to ensure the quality of the test cases.

What tools are available for mutation testing?

There are several tools available for mutation testing, including Mutant for Ruby, and PIT and Jumble for Java. These tools automate the process of introducing mutations and running the test cases, making mutation testing more manageable and efficient.

How does mutation testing contribute to software quality?

By improving the effectiveness of the test cases, mutation testing can significantly enhance the overall quality of the software. It helps ensure that the software behaves as expected under various conditions and can handle potential errors gracefully.

Can mutation testing replace other testing methods?

No, mutation testing is not meant to replace other testing methods. Instead, it complements them by providing a unique perspective on the quality of the test cases. It should be used in conjunction with other testing methods to ensure comprehensive software testing.

What is the future of mutation testing?

With the increasing complexity of software systems, the importance of effective test cases is growing. As such, mutation testing is likely to become an integral part of the software testing process. Advances in tooling and techniques are also expected to make mutation testing more accessible and efficient in the future.