Golden Master: Discovering Abstractions
What started as a quest to tame some unruly views turned into a long and arduous slog through controllers, helper methods, duplication, and, finally, confusion.
The current state of affairs is that we have a model object that encapsulates all the logic that used to live in the views and controllers. We’ve extracted methods and renamed variables, finding some measure of understanding along the way. The code is no longer scary or intimidating, but it’s still not anything to boast about.
module Stats
  class RunningTimeData
    include Rails.application.routes.url_helpers

    attr_reader :title, :dates, :key, :today

    def initialize(title, timestamps, key, now)
      @title = title
      @dates = timestamps.map(&:to_date)
      @key = key
      @today = now.to_date
    end

    def y_legend
      I18n.t('stats.running_time_legend.actions')
    end

    def y2_legend
      I18n.t('stats.running_time_legend.percentage')
    end

    def x_legend
      I18n.t('stats.running_time_legend.weeks')
    end

    def values
      datapoints_per_week_in_chart.join(",")
    end

    def links
      url_labels.join(",")
    end

    def values_2
      cumulative_percentages.join(",")
    end

    def x_labels
      time_labels.join(",")
    end

    def y_max
      # add one to @max for people who have no actions completed yet.
      # OpenFlashChart cannot handle y_max=0
      1 + datapoints_per_week_in_chart.max + datapoints_per_week_in_chart.max/10
    end

    private

    def url_labels
      Array.new(total_weeks_in_chart) { |i| url(i, key) } << url(total_weeks_in_chart, "#{key}_end")
    end

    def url(index, id)
      options = {
        :controller => 'stats',
        :action => 'show_selected_actions_from_chart',
        :index => index,
        :id => id,
        :only_path => true
      }
      url_for(options)
    end

    def time_labels
      labels = Array.new(total_weeks_in_chart) { |i| "#{i}-#{i+1}" }
      labels[0] = "< 1"
      labels[total_weeks_in_chart] = "> #{total_weeks_in_chart}"
      labels
    end

    def total_weeks_in_chart
      [52, total_weeks_of_data].min
    end

    def total_weeks_of_data
      weeks_since(dates.last)
    end

    def weeks_since(date)
      (today - date).to_i / 7
    end

    def cumulative_percentages
      running_total = 0
      percentages_per_week.map {|count| running_total += count}
    end

    def percentages_per_week
      datapoints_per_week_in_chart.map(&percentage)
    end

    def percentage
      Proc.new {|count| (count * 100.0 / dates.count)}
    end

    def datapoints_per_week_in_chart
      frequencies = Array.new(total_weeks_in_chart) {|i|
        datapoints_per_week[i].to_i
      }
      frequencies << datapoints_per_week.inject(:+) - frequencies.inject(:+)
    end

    def datapoints_per_week
      frequencies = Array.new(total_weeks_of_data + 1, 0)
      dates.each {|date|
        frequencies[weeks_since(date)] += 1
      }
      frequencies
    end
  end
end
Shades of Good
Steve Freeman and Nat Pryce, authors of Growing Object-Oriented Software, Guided by Tests (GOOS), talk about four desirable characteristics of object-oriented code:
- Loosely coupled
- Highly cohesive
- Easily composable
- Context independent
The Stats::RunningTimeData object fails on all four counts.
Loosely Coupled
This code depends on the I18n library.
I18n.t('stats.running_time_legend.percentage')
Worse, though, you can’t instantiate it without Rails!
include Rails.application.routes.url_helpers
Imagine wanting to use the same functionality in a Sinatra application or a stand-alone email application that delivers reports on a regular basis. It wouldn’t work in its current form, because you’d need to drag Rails along.
Highly Cohesive
The code is certainly not cohesive.
It has some methods that are all about describing a chart: x_legend, y_legend, values. It also has a bunch of methods that have nothing to do with charts. They’re more about weeks: weeks_since(date), total_weeks_of_data, datapoints_per_week.
Cohesiveness is often related to the Single Responsibility Principle (SRP). Does this object do too many disparate things? Does it have many different reasons to change?
It seems like there are at least two different things here: weekly statistics and chart configuration.
Easily Composable
This is a big hulking object. You can use it or you can leave it, but you’re not going to combine it with other objects to make something useful.
Context Independent
This class can only be used in the context of the two specific charts that it was made to handle. If you wanted to email a plain text report of the same data in addition to showing the chart, you would be plain out of luck.
Make It So
The methods we need to extract out of the Stats::RunningTimeData model are:
- datapoints per week
- percentages per week
- cumulative percentages per week
Since "per week" repeats in all three, it seems reasonable to make that part of the class name.
But what if you want monthly statistics too?
Yeah, that’s a fair question. We might. But right now we don’t, and we don’t know.
Let’s pretend that we aren’t going to need it. Later, if we see duplication, then perhaps there will be an obvious abstraction. For now it’s easier to deal with what is actually here than with some nebulous and hypothetical future.
Introducing WeeklyHistogram
We’ve been relying heavily on the Golden Master tests. Now it’s time to start fresh:
mkdir test/models/stats
touch test/models/stats/weekly_histogram_test.rb
One of the nicest things about creating a completely context-independent, loosely coupled class is that you don’t have to wait for the entire framework to load when running the tests.
Up until now, we’ve had to wait 3 seconds for every test run, even though we’ve been executing a single file. The lockdown test goes through the database, controllers, views—everything.
3 seconds. That’s long enough to check Twitter.
In the new tests, all we need is date from the standard library, minitest, and the (as yet hypothetical) weekly_histogram class.
require 'date'
require 'minitest/autorun'
require './app/models/stats/weekly_histogram'
class StatsWeeklyHistogramTest < Minitest::Test
end
This runs in less than 300 ms. It feels almost instantaneous.
Implementing Histogram
In the spirit of programming by wishful thinking, here’s the API that I’d like to have:
histogram.counts
histogram.percentages
histogram.cumulative_percentages
The refactoring we did in the previous two articles uncovered some interesting characteristics about the stats:
- There is a cutoff: A maximum number of weeks to include as independent statistics.
- If the cutoff is older than the oldest datapoint, then you don’t fill in empty weeks up to the cutoff.
- If you have more weeks’ worth of data than the cutoff, then everything in the overflow will be combined into a single bucket.
- If we have no data for a given week in the middle of our dataset, it will correctly be counted as 0.
We’re going to need two tests to cover this: one for fewer weeks than the cutoff, and one for more weeks than the cutoff. Each test should include a week with no data, to make sure we don’t accidentally lose that behavior.
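All of these rules rest on one piece of date arithmetic, which the existing code calls weeks_since. As a quick sketch, written here as a standalone method that takes today as an argument (the original reads it from the object), this is how dates land in zero-based week buckets:

```ruby
require 'date'

# Standalone version of the weeks_since helper, for illustration only.
# Subtracting two Date objects gives the difference in days, so integer
# division by 7 yields a zero-based week index.
def weeks_since(today, date)
  (today - date).to_i / 7
end

today = Date.new(2014, 7, 1)

weeks_since(today, Date.new(2014, 6, 30)) # 1 day ago   => 0
weeks_since(today, Date.new(2014, 6, 24)) # 7 days ago  => 1
weeks_since(today, Date.new(2014, 6, 9))  # 22 days ago => 3
```

Week 0 therefore covers today and the six days before it, week 1 covers the seven days before that, and so on.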
Here’s the test suite that I ended up with covering the entire API of the new object:
def test_fewer_weeks_than_cutoff
  today = Date.new(2014, 7, 1)
  cutoff = 10 # weeks
  dates = [
    Date.new(2014, 6, 30),
    Date.new(2014, 6, 27),
    Date.new(2014, 6, 23),
    Date.new(2014, 6, 23),
    Date.new(2014, 6, 22),
    # no data for the third week back (14-20 days ago)
    Date.new(2014, 6, 9),
  ]
  histogram = Stats::WeeklyHistogram.new(dates, cutoff, today)
  assert_equal [2, 3, 0, 1], histogram.counts
end

def test_more_weeks_than_cutoff
  today = Date.new(2014, 7, 1)
  cutoff = 3 # weeks
  dates = [
    Date.new(2014, 6, 30),
    Date.new(2014, 6, 21),
    Date.new(2014, 6, 20),
    Date.new(2014, 6, 1), # June
    Date.new(2014, 5, 1), # May
    Date.new(2014, 4, 1), # April
    Date.new(2014, 3, 1), # March
  ]
  histogram = Stats::WeeklyHistogram.new(dates, cutoff, today)
  assert_equal [1, 2, 0, 4], histogram.counts
end

def test_percentages
  today = Date.new(2014, 7, 1)
  dates = [
    Date.new(2014, 6, 30),
    Date.new(2014, 6, 19),
    Date.new(2014, 6, 18),
    Date.new(2014, 6, 7),
  ]
  histogram = Stats::WeeklyHistogram.new(dates, 4, today)
  assert_equal [25.0, 50.0, 0.0, 25.0], histogram.percentages
end

def test_cumulative_percentages
  today = Date.new(2014, 7, 1)
  dates = [
    Date.new(2014, 6, 30),
    Date.new(2014, 6, 19),
    Date.new(2014, 6, 18),
    Date.new(2014, 6, 7),
  ]
  histogram = Stats::WeeklyHistogram.new(dates, 4, today)
  assert_equal [25.0, 75.0, 75.0, 100.0], histogram.cumulative_percentages
end
We can crib the implementation directly from the Stats::RunningTimeData class to make the tests pass. Purists might want to re-implement everything from scratch here. I don’t really care one way or the other, as long as the result is readable and does what it needs to.
module Stats
  class WeeklyHistogram
    attr_reader :dates, :cutoff, :today

    def initialize(dates, cutoff, today = Date.today)
      @dates = dates
      @cutoff = cutoff
      @today = today
    end

    def counts
      frequencies = Array.new(length) {|i|
        datapoints_per_week[i].to_i
      }
      frequencies << datapoints_per_week.inject(:+) - frequencies.inject(:+)
    end

    def percentages
      counts.map(&percentage)
    end

    def cumulative_percentages
      running_total = 0
      percentages.map {|count| running_total += count}
    end

    def length
      [cutoff, total_weeks_of_data].min
    end

    private

    def percentage
      Proc.new {|count| (count * 100.0 / dates.count)}
    end

    def weeks_since(date)
      (today - date).to_i / 7
    end

    def total_weeks_of_data
      weeks_since(dates.last)
    end

    def datapoints_per_week
      frequencies = Array.new(total_weeks_of_data + 1, 0)
      dates.each {|date|
        frequencies[weeks_since(date)] += 1
      }
      frequencies
    end
  end
end
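One thing this implementation takes for granted is that dates arrives sorted newest first, since total_weeks_of_data reads the oldest date from dates.last. The tests all pass their dates in that order. If a caller could not guarantee it, one option (a variation, not what the code above does) would be to find the oldest date with min instead. Here is a minimal sketch of the per-week tally written that way, as a standalone method with a made-up name:

```ruby
require 'date'

# Hypothetical standalone tally; not part of Stats::WeeklyHistogram.
# Returns an array where index i holds the number of dates that fall
# i weeks before `today`.
def tally_per_week(dates, today)
  weeks_since = ->(date) { (today - date).to_i / 7 }

  # Using min means the input does not have to be sorted.
  frequencies = Array.new(weeks_since.call(dates.min) + 1, 0)
  dates.each { |date| frequencies[weeks_since.call(date)] += 1 }
  frequencies
end

today = Date.new(2014, 7, 1)
shuffled = [
  Date.new(2014, 6, 9),  # 22 days ago, week 3
  Date.new(2014, 6, 30), # 1 day ago, week 0
  Date.new(2014, 6, 23), # 8 days ago, week 1
]

tally_per_week(shuffled, today) # => [1, 1, 0, 1]
```

Like the original, this sketch assumes there is at least one date and that none of them lie in the future.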
Is This Better?
Yep, it sure is.
It is loosely coupled: All it needs is the collection of dates to operate on.
It is highly cohesive: Everything in the class cares about those dates.
It is easily composable: The code can be used to create text-only reports in a stand-alone email program, or to generate stats for a Sinatra application or a command-line application.
It is context independent: It is not important where those dates come from. The context in the current application is TODOs. Unfinished TODOs, to be specific. But that’s incidental. Now we can use this in wildly different contexts: Cupcake sales. Zombie attacks. Celebrity sightings.
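To make the composability claim a little more concrete, here is a rough sketch of a plain-text report built on nothing but the histogram's public API. The plain_text_report method and the StubHistogram stand-in are invented for this example. The stub returns the same numbers as the percentages test so that the snippet runs on its own, and the week labels borrow the scheme from time_labels.

```ruby
# Hypothetical stand-in so the sketch runs without the real class.
# Anything that responds to #counts and #percentages would work here.
StubHistogram = Struct.new(:counts, :percentages)

# Builds one line per bucket: week label, count, percentage.
def plain_text_report(histogram)
  last = histogram.counts.length - 1
  histogram.counts.each_with_index.map { |count, i|
    label =
      if i.zero?
        "< 1"
      elsif i == last
        "> #{last}"
      else
        "#{i}-#{i + 1}"
      end
    format("%-5s %3d %6.1f%%", label, count, histogram.percentages[i])
  }.join("\n")
end

histogram = StubHistogram.new([1, 2, 0, 1], [25.0, 50.0, 0.0, 25.0])
puts plain_text_report(histogram)
```

Nothing in the report knows about Rails, charts, or TODOs. It only needs an object that answers counts and percentages.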
The Wall at Your Back
Over and over again, we make decisions. That’s our job, and most of the time we’re wrong. We can’t know the future. Our codebases reflect our guesses, our assumptions, and most of all, our mistakes.
Golden Master tests give us a way to change our mind when all the original reasons are long gone. When we no longer remember why we made the choices we did. When we’re no longer certain how the code does what it does. When we’re not even sure what the code does anymore.
It’s a path out when there is no obvious way forward, or when forward might actually be backwards, round-about, and convoluted.
The Golden Master technique is not a tool for design or architecture or beauty. Golden makes it sound so fancy, so noble.
It isn’t.
It’s dirty, temporary, and utterly practical.