Key Takeaways
- Golden Master Testing is a technique where the output of a test is compared to a previously accepted result, known as the ‘Golden Master’. It is particularly useful for testing complex systems where exact output prediction is challenging.
- Refactoring complicated views is crucial for maintaining software quality as it improves code readability, reduces complexity, and enhances source code maintainability and extensibility.
- The Golden Master Testing process can be automated using various testing tools and frameworks. It involves creating a ‘Golden Master’ or an accepted version of the output by running the system with a known set of inputs and saving the result.
- Golden Master Testing is best used for complex systems where predicting the exact output is difficult, and the system is stable with infrequent changes. It can be used in conjunction with other testing methods like unit testing and integration testing.
View templates should be simple. They should, in fact, be so simple that there is no point in calculating a complexity score for them. In successful applications, however, this is often not the case. Logic sneaks into the templates unnoticed while everyone’s attention is fixed on some emergency elsewhere.
Refactoring templates can be thorny when their test coverage is dismal, which is most of the time. And rightly so! Why bother testing something that is going to change on a regular basis and should not contain any logic?
The trick, then, is to find a way to put a wall at your back when you decide that you do need to move logic out of the templates and into more appropriate locations in your application.
A Real-World Example
Tracks is a time-management tool based on David Allen’s 2002 book Getting Things Done.
The application has been under continuous, active development since Rails was in Beta, keeping up to date with Rails releases. As with many applications that have been around for a while, some parts of the codebase have gotten a bit unwieldy and out of control.
This provides us with an opportunity to examine the Golden Master technique using real-world code rather than a tiny, synthetic example that leaves the reader wondering how to transfer the lesson to actual production code.
Exploring Tracks
Looking closely at the codebase, you’ll discover large models, large controllers, and complex views. The controllers and views, for example, share a huge number of instance variables.
All of these things suggest that there might be missing abstractions and unnamed concepts that could make it easier to reason about the codebase.
The StatsController exemplifies this. The file is over 450 lines long. It contains 37 methods, only 17 of which are controller actions.
The actions come in two flavors:
- Vanilla HTML pages
- Plain-text collections of key=value pairs
The key=value pairs are used to render charts within the HTML responses. The templates for these do some computation, set some local variables, and render a whole host of instance variables that were calculated in the controller.
Also, they are duplicated.
That is, they’re not perfectly duplicated; that would be too easy. They’re similar, but not identical.
In this exercise, we’ll take two very similar templates, actions_visible_running_time_data.html.erb and actions_running_time_data.html.erb, and pull all the differences up one level into the controller. Finally, we’ll delete one of the views.
Wait, what do you mean pull stuff into the controller? Isn’t the controller already too big?
Well, yes, it is.
Also, it contains even more duplication than the templates do.
This is a common quandary when refactoring. Very often the act of refactoring makes things temporarily worse. It adds duplication and complexity. It can sometimes feel as though you’re pushing messy code from one junk drawer to another. But then you suddenly identify a useful abstraction, allowing you to collapse the complexity.
Those abstractions are hard to identify when the logic is spread out. Once everything is in the controller, we can get a better idea of what we’re tackling.
A Wall at Your Back
And so, it begins…
Oh wait. It doesn’t. We need to make sure we have good tests.
Running rake test takes about 20 minutes. That’s too slow to be useful for refactoring. We can narrow it down to just the tests that cover the chart data endpoints.
The easiest way to figure out which tests are implicated is to raise 'hell' from within the erb templates we are looking to refactor:
<%-
raise "hell"
-%>
About 20 minutes later we can confirm that three tests blow up. All three tests are in the StatsController tests, which can be run in isolation:
$ ruby -Itest test/controllers/stats_controller_test.rb
That takes 5 minutes and 17 seconds. It’s not fast, but it will do.
How good are the tests? Let’s see what they complain about if we delete the entire contents of the two erb files.
24 runs, 126 assertions, 0 failures, 0 errors, 0 skips
We’re going to need better tests.
Blissful Ignorance
The actions in the StatsController look very complicated. If we write tests that make faulty or incomplete assumptions, we might introduce regression bugs while refactoring.
It would be less tedious and error-prone to just assume that whatever is happening right now is exactly what should be happening. If we could just capture the full key=value output for a chart endpoint, it could be copied and pasted into the test and used as the assertion.
We won’t even have to understand what is going on!
This type of test is dirty, brittle, and very, very handy. Michael Feathers calls it a characterization test in his book Working Effectively with Legacy Code.
A characterization test is a temporary measure. You may never commit it to source control; it’s just there to get you out of a bind. Sometimes you’ll be able to replace it with better tests. Other times, such as with these views, the test will just serve to simplify the views, and then it can be deleted altogether.
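At its crudest, a characterization test for one of these endpoints is nothing more than a hard-coded comparison. Something like this sketch, where the expected string is a placeholder for whatever the endpoint actually returns:

def test_chart_data_has_not_changed
  login_as(:admin_user)
  get :actions_running_time_data
  # paste the full captured response body in here
  expected = "&title=...entire key=value output...&"
  assert_equal expected, @response.body
end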
Characterizing the Views
The test must:
- Capture the entire HTML response that gets generated when we call the controller action.
- Compare it to the response we’ve previously defined as “good” (i.e. the Golden Master).
- If they’re the same, the test passes. Otherwise it fails.
While we could write the test, assert that the response is “LOLUNICORNS”, watch it fail, and then copy/paste the entire response from the failure message, we’re going to use Llewellyn Falco’s Approval Tests pattern to semi-automate the process.
It goes like this:
- Always capture the response, as received.html.
- Compare it to approved.html.
- Fail if received is different from approved.
In the case of a failure, eyeball the difference by opening both in browsers or using a diff tool. If you like received better than approved, manually rename the file.
This gives us a way to handle the very first Golden Master rule: Run the test, see it fail (because approved is empty), move the received file over to be approved.
Voila, the test is now passing.
Setup
We need to do a bit of setup. First, it’s nice to be able to run these tests completely independently of any other test, so create a new file for them in test/functional/lockdown_test.rb.
require File.expand_path(File.dirname(__FILE__) + '/../test_helper')

class LockdownTest < ActionController::TestCase
  # tell Rails which controller we're using
  tests StatsController

  def test_output_does_not_change
    # do some setup
    # call the endpoint
    # write the response to a file
    # compare the two files and make an assertion
  end
end
We actually need two tests, one for each endpoint. All but the “call the endpoint” step is exactly the same in both cases, so create a helper method for all the boilerplate:
def with_golden_master(key)
  FileUtils.mkdir_p('.lockdown')
  FileUtils.touch(".lockdown/#{key}-approved.txt")

  login_as(:admin_user)
  Timecop.freeze(Time.utc(2014, 6, 5, 4, 3, 2, 1)) do
    yield
  end

  received = @response.body
  File.open(".lockdown/#{key}-received.txt", 'w') do |f|
    f.puts received
  end

  approved = File.read(".lockdown/#{key}-approved.txt")
  unless approved == received
    assert false, approval_message(key)
  end
end
def approval_message(key)
  <<-MSG
  FAIL:
  The output changed. Eyeball the differences with:

    diff .lockdown/#{key}-received.txt .lockdown/#{key}-approved.txt

  If you like the #{key}-received.txt output, then

    cp .lockdown/#{key}-received.txt .lockdown/#{key}-approved.txt
  MSG
end
The fixture data gets generated relative to today’s date, so you will probably need to change the Timecop value to something a week or two in the future in order to get the tests to run.
Then, actually call the controller actions:
def test_visible_running_chart_stays_the_same
  with_golden_master('vrt') do
    get :actions_visible_running_time_data
  end
end

def test_all_running_chart_stays_the_same
  with_golden_master('art') do
    get :actions_running_time_data
  end
end
Run the tests:
$ ruby -Itest test/functional/lockdown_test.rb
Copy the received file over the approved file, and re-run the test to see it pass.
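With the keys used in the two tests above, that copy is just:

$ cp .lockdown/vrt-received.txt .lockdown/vrt-approved.txt
$ cp .lockdown/art-received.txt .lockdown/art-approved.txt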
How good are these tests?
That depends on how good the data is. Tracks has fixtures, and we end up with more than one datapoint in the charts, so let’s assume that it’s fine.
Refactoring (Finally!)
We’ll do these one at a time. Here’s the actions running time view:
<%-
url_labels = Array.new(@count){ |i| url_for(:controller => 'stats', :action => 'show_selected_actions_from_chart', :index => i, :id=> "art") }
url_labels[@count]=url_for(:controller => 'stats', :action => 'show_selected_actions_from_chart', :index => @count, :id=> "art_end")
time_labels = Array.new(@count){ |i| "#{i}-#{i+1}" }
time_labels[0] = "< 1"
time_labels[@count] = "> #{@count}"
-%>
&title=<%= t('stats.running_time_all') %>,{font-size:16},&
&y_legend=<%= t('stats.running_time_all_legend.actions') %>,10,0x736AFF&
&y2_legend=<%= t('stats.running_time_all_legend.percentage') %>,10,0xFF0000&
&x_legend=<%= t('stats.running_time_all_legend.running_time') %>,11,0x736AFF&
&y_ticks=5,10,5&
&filled_bar=50,0x9933CC,0x8010A0&
&values=<%= @actions_running_time_array.join(",") -%>&
&links=<%= url_labels.join(",") %>&
&line_2=2,0xFF0000&
&values_2=<%= @cumulative_percent_done.join(",") %>&
&x_labels=<%= time_labels.join(",") %> &
&y_min=0&
<%
# add one to @max for people who have no actions completed yet.
# OpenFlashChart cannot handle y_max=0
-%>
&y_max=<%=1+@max_actions+@max_actions/10-%>&
&x_label_style=9,,2,2&
&show_y2=true&
&y2_lines=2&
&y2_min=0&
&y2_max=100&
First, because the values will need to be shared between the view and the controller, turn url_labels into @url_labels, and see the tests pass. Do the same for time_labels.
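After the rename, the setup block at the top of the template ends up looking like this (the references to these variables further down change to match):

<%-
@url_labels = Array.new(@count){ |i| url_for(:controller => 'stats', :action => 'show_selected_actions_from_chart', :index => i, :id=> "art") }
@url_labels[@count] = url_for(:controller => 'stats', :action => 'show_selected_actions_from_chart', :index => @count, :id=> "art_end")
@time_labels = Array.new(@count){ |i| "#{i}-#{i+1}" }
@time_labels[0] = "< 1"
@time_labels[@count] = "> #{@count}"
-%>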
When we move @url_labels into the controller, the test fails.
# approved
/stats/show_selected_actions_from_chart/avrt?index=0
# received
http://test.host/stats/show_selected_actions_from_chart/avrt?index=0
The controller added http://test.host to the generated urls. We can force it to return just the path by passing the :only_path => true option to it:
url_for(:controller => 'stats', :action => 'show_selected_actions_from_chart', :index => i, :id=> "avrt", :only_path => true)
Moving time_labels into the controller works without a hitch.
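In the controller, that’s the same three lines lifted out of the template and assigned to the instance variable:

@time_labels = Array.new(@count){ |i| "#{i}-#{i+1}" }
@time_labels[0] = "< 1"
@time_labels[@count] = "> #{@count}"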
The next changes are even easier. For every single interpolated value, make sure that the controller sets an instance variable for the final value:
# Normalize all the variable names
@title = t('stats.running_time_all')
@y_legend = t('stats.running_time_legend.actions')
@y2_legend = t('stats.running_time_legend.percentage')
@x_legend = t('stats.running_time_legend.weeks')
@values = @actions_running_time_array.join(",")
@links = @url_labels.join(",")
@values_2 = @cumulative_percent_done.join(",")
@x_labels = @time_labels.join(",")
# add one to @max for people who have no actions completed yet.
# OpenFlashChart cannot handle y_max=0
@y_max = 1+@max_actions+@max_actions/10
We can make the same changes in the visible actions running time view. Now that both templates are identical, both endpoints can render the same template:
render :actions_running_time_data, :layout => false
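Sketched out (with the data setup elided), the two actions now differ only in which data they compute:

def actions_running_time_data
  # ... compute the chart data and set the normalized instance variables ...
  render :actions_running_time_data, :layout => false
end

def actions_visible_running_time_data
  # ... same normalization, but for visible actions only ...
  render :actions_running_time_data, :layout => false
end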
Delete the unused template, and celebrate.
Then get a shovel and some smelling salts, and get ready to attack that controller. That, however, is a topic for another post (coming soon, don’t fret).
Frequently Asked Questions (FAQs) about Golden Master Testing and Refactoring Complicated Views
What is Golden Master Testing?
Golden Master Testing is a software testing technique where the test’s output is compared against a previously accepted result, known as the ‘Golden Master’. This technique is particularly useful when dealing with complex systems where it’s difficult to predict the exact output. The Golden Master acts as the ‘truth’ against which the test results are compared. If the output matches the Golden Master, the test passes; otherwise, it fails.
How does Golden Master Testing differ from traditional testing methods?
Traditional testing methods often involve predicting the output and comparing it with the actual result. However, Golden Master Testing involves comparing the test’s output with a previously accepted result. This makes it a powerful tool for testing complex systems where predicting the exact output can be challenging.
What is refactoring in the context of software development?
Refactoring is the process of restructuring existing computer code without changing its external behavior. The goal is to improve the non-functional attributes of the software, such as readability, reduced complexity, improved source code maintainability, and extensibility.
Why is refactoring complicated views important?
Refactoring complicated views is crucial for maintaining the quality of the software. It helps in improving the readability of the code, making it easier for other developers to understand. It also reduces complexity, making the code easier to manage and less prone to errors.
How can I implement Golden Master Testing in my project?
Implementing Golden Master Testing involves creating a ‘Golden Master’ or an accepted version of the output. This can be done by running the system with a known set of inputs and saving the result. This saved result then serves as the ‘Golden Master’. For future tests, the system is run with the same set of inputs, and the output is compared with the ‘Golden Master’.
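As a rough sketch in Ruby, the whole cycle can be as small as this (run_the_system and known_inputs are placeholders for whatever your own system and inputs look like):

require 'minitest/autorun'

class GoldenMasterTest < Minitest::Test
  def test_output_matches_golden_master
    # run the system with a known, fixed set of inputs
    output = run_the_system(known_inputs)
    # the previously accepted output, saved from an earlier run
    golden = File.read('golden_master.txt')
    assert_equal golden, output
  end
end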
What are the challenges in Golden Master Testing?
One of the main challenges in Golden Master Testing is managing the ‘Golden Master’. If the system is complex and the output is large, storing and comparing the ‘Golden Master’ can be challenging. Also, if the system is non-deterministic, i.e., it gives different outputs for the same input, Golden Master Testing may not be suitable.
What are the benefits of Golden Master Testing?
Golden Master Testing is particularly useful for testing complex systems where predicting the exact output is difficult. It can also be used for regression testing, i.e., to check if a change in the system has inadvertently affected the existing functionality.
Can Golden Master Testing be automated?
Yes, Golden Master Testing can be automated. The process of running the system, capturing the output, and comparing it with the ‘Golden Master’ can be automated using various testing tools and frameworks.
When should I use Golden Master Testing?
Golden Master Testing is best used when dealing with complex systems where predicting the exact output is difficult. It’s also useful when the system is stable, and changes are infrequent.
Can Golden Master Testing be used in conjunction with other testing methods?
Yes, Golden Master Testing can be used in conjunction with other testing methods. It’s particularly useful as a complement to unit testing and integration testing. While these methods test individual components and their interactions, Golden Master Testing can be used to test the system as a whole.
Katrina is a developer working primarily in Ruby and Go, and formerly of Splice, Turing School of Software and Design and Bengler. She is the creator of exercism.io, an open source application that developers use to level up their programming skills. She is the co-author of 99 Bottles of OOP.