What's in a Name? Anti-Patterns to a Hard Problem

[Image: Children's names written on a brick wall]

If you wish to make an apple pie from scratch, you must first invent the universe. —Carl Sagan

We name and name and name. And name. Naming is notoriously difficult, but it’s not like we’re starting from scratch every time. We have habits, conventions, and a personal style.

Often we don’t give a name much thought, and we still do a reasonable job of it. Of course, sometimes our first idea is terrible.

There is no formula for choosing a name. In some situations our habits are no good. Our strategies—unspoken or not—fall short. Naming is fraught with ambiguity. A good name answers important questions. What does it contain? Why does it exist? Why is it important? What does it mean? How would I use it? What role does it play? But it can hardly answer all the important questions at once. A bad name is confusing or unhelpful. It misinforms and misleads.

There are some common strategies that harm more than they help. Recognizing an anti-pattern makes it easier to choose a better strategy. A better strategy tends to lead to a better name.

An anti-pattern is a common response to a recurring problem that is usually ineffective and risks being highly counterproductive. —Wikipedia

As with all things naming, an anti-pattern isn’t always the wrong choice. The usual caveats apply: “usually”, “probably”, “maybe”, “use your judgement”, and so on.

Underlying Types and Data Structures

If you see a name that encodes an underlying type, such as word_string or new_hash, there’s almost always a better name waiting in the wings.

Type information is just not that compelling. It doesn’t answer any of the important questions. In most situations it’s irrelevant. The type is an implementation detail, and implementation details can change without fundamentally changing the solution.

def anagrams(string, string_array)
  string_array.select do |str|
    str != string && same_alphagram?(string, str)
  end
end

This code is simple. The names are correct but unhelpful.

One question you can ask yourself when faced with a bland collection of datatypes is:

What does it contain?

In the case of anagrams, it contains words.

def anagrams(word1, words)
  words.select do |word2|
    word1 != word2 && same_alphagram?(word1, word2)
  end
end

Now we have a different problem. We’re wording the words so that we can word. There’s no meaningful distinction between words, word1, and word2. We need to say something about how the words relate to each other in the context of detecting anagrams.

The original word or phrase is known as the subject of the anagram. —Wikipedia

So word1 is subject. The words that we’re looping through may or may not be anagrams. They’re potential_anagrams, but it’s a bit annoying to repeat anagram in the name. Another word for a potential match is a candidate.

def anagrams(subject, candidates)
  candidates.select do |candidate|
    subject != candidate && same_alphagram?(subject, candidate)
  end
end
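
The helper same_alphagram? is never shown, so here is a rough sketch of how the pieces might fit together. It assumes an alphagram is simply a word’s letters, downcased and sorted, and the alphagram helper is a made-up name. It uses select so that the matching candidates are what the method returns.

```ruby
# Hypothetical helpers: the article does not show same_alphagram?.
# Assumes an alphagram is the word's letters, downcased and sorted.
def alphagram(word)
  word.downcase.chars.sort
end

def same_alphagram?(subject, candidate)
  alphagram(subject) == alphagram(candidate)
end

# select keeps only the candidates for which the block is truthy.
def anagrams(subject, candidates)
  candidates.select do |candidate|
    subject != candidate && same_alphagram?(subject, candidate)
  end
end

anagrams("listen", %w[enlist google inlets listen banana])
# => ["enlist", "inlets"]
```

The call at the bottom reads almost like the definition of the problem: given a subject and some candidates, which candidates are anagrams?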

When computing Scrabble scores we run into the same thing.

def compute_score(chars)
  chars.inject(0) {|num, char|
    num + char_to_num[char]
  }
end

Again, ask yourself what the variables contain, this time in the context of a game of Scrabble. The num is the thing we’re computing, the score. The char is a letter or tile. The hash of characters to numbers contains the point value for each tile.

def compute_score(tiles)
  tiles.inject(0) {|score, tile|
    score + points[tile]
  }
end
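
To see the renamed method run, it needs a points table. The sketch below assumes points is a method returning a hash, and it fills in only a handful of tile values for illustration. A real scorer would need the full table.

```ruby
# Illustrative subset of tile values, not the full table.
POINTS = { "A" => 1, "C" => 3, "Q" => 10, "T" => 1, "Z" => 10 }.freeze

# Assumption: points is a method that returns the lookup table.
def points
  POINTS
end

def compute_score(tiles)
  tiles.inject(0) { |score, tile|
    score + points[tile]
  }
end

compute_score(%w[C A T])  # => 5
```

A tile missing from this partial table would raise an error, which is one more reason to treat it as a sketch.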

Using the data type in the name is not always an anti-pattern.

When the scope is small it can be redundant to give a variable a more expressive name. The context already answers the important questions about it. There’s no reason to bloat the code with extra descriptions. Just use the type, such as s for a string, or i for an int.

Sometimes the name of the data structure helps clarify important details. A queue is a concept that is familiar to programmers. The name jobs might communicate your intent. But maybe not. If the FIFO (first in, first out) aspect is crucial, then job_queue might be better. It expresses what the thing contains, as well as how to use it.
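
As a small illustration of why the FIFO detail can matter, here is a sketch that treats a plain array as a queue. The method name and the job names are invented for the example.

```ruby
# Hypothetical example: drains a queue strictly first in, first out.
# shift removes from the front, so jobs leave in the order they arrived.
def process_in_order(job_queue)
  processed = []
  processed << job_queue.shift until job_queue.empty?
  processed
end

process_in_order(%w[resize_image send_email update_index])
# => ["resize_image", "send_email", "update_index"]
```

With a parameter called jobs, a reader might reasonably wonder whether the order matters. The name job_queue settles that before they read the body.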

Structural

Another common strategy is to name things for their role in the program. It’s the input or the output. It’s the recurring phrase or the middle sentence. It’s a memo or sum or result.

Here’s some code that counts the differences between two sequences, a simplified take on what is known as the Hamming distance.

def self.compute(first, second)
  first.length.times.count { |i|
    first[i] != second[i]
  }
end

The algorithm is expressive enough, but the names first and second seem pretty arbitrary. They’re the first and second parameters, but does the order even matter? It’s unclear. And first and second what?

First and second DNA strand. Duh.

It turns out, order doesn’t matter. We only care how many mutations there are between two similar strands.

The name strand answers the question of what it is. A simple suffix to differentiate between the two is enough. We don’t need to tell more of a story than that. We could use A and B, which don’t emphasize order quite as much as 1 and 2.

def self.compute(strand_a, strand_b)
  strand_a.length.times.count { |i|
    strand_a[i] != strand_b[i]
  }
end
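
The claim that order doesn’t matter is easy to check. This sketch wraps the method in a module called Hamming, which is an assumption since the article shows only the method, and it assumes both strands have the same length.

```ruby
# Hypothetical wrapper: the article shows the method but not its owner.
module Hamming
  # Counts the positions at which the two strands differ.
  # Assumes both strands have the same length.
  def self.compute(strand_a, strand_b)
    strand_a.length.times.count { |i|
      strand_a[i] != strand_b[i]
    }
  end
end

Hamming.compute("GGACT", "GGTCA")  # => 2
Hamming.compute("GGTCA", "GGACT")  # => 2
```

Swapping the arguments gives the same count, which is why a neutral suffix fits better than first and second.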

Here’s the Scrabble scoring method from earlier, with structural names.

def compute_score(input)
  input.inject(0) {|sum, x|
    sum + lookup[x]
  }
end

The interesting thing about input isn’t that it happens to get passed to the method as an argument. The interesting bit is what it contains, which in this case is Scrabble tiles. Likewise, the sum isn’t any old sum, it’s someone’s score. It’s an undeniable fact that we’re looking something up, but lookup explains nothing essential. The question is what are you looking up? Points. There’s drama here if you look for it.

Idea Fragment

This is an alluring trap in Ruby, and once you see it you can’t unsee it. It’s everywhere.

The reason it’s so seductive is that it leads to many small methods.

“Wait, what?”

Yeah, sorry. It’s not that small methods are bad. It’s that everything is a trade-off. It turns out there are more important things than SLOC (source lines of code). Who knew?

Here’s a method from some code for scheduling meetups:

def prev_or_next_day(date, date_type)
  date_type == :last ? date.prev_day : date.next_day
end

The name of the method repeats the conditional that it contains.

There’s no good name for this, because the method doesn’t isolate an entire idea. It takes a small sliver of an idea and sticks it in a method. When each method represents a fragment of a concept, the solution becomes incomprehensible. You can fit all the individual pieces in your head, but they don’t form a coherent picture.

The solution here is to inline it back to where it came from along with all the other shards of ideas that are in arbitrarily defined methods throughout the code. Then—once everything is in the same place—you’re more likely to find and name the whole idea.
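
The calling code isn’t shown, so what follows is only a guess at what the inlined version might look like. The method name, its parameters, and the idea of searching for a weekday are all assumptions. Only :last, prev_day, and next_day come from the original fragment.

```ruby
require "date"

# Hypothetical reconstruction: finds the first or last occurrence of a
# weekday (0 = Sunday .. 6 = Saturday) in the given month.
def meetup_day(year, month, weekday, schedule)
  if schedule == :last
    # Start at the end of the month and walk backwards.
    date = Date.new(year, month, -1)
    date = date.prev_day until date.wday == weekday
  else
    # Start at the beginning of the month and walk forwards.
    date = Date.new(year, month, 1)
    date = date.next_day until date.wday == weekday
  end
  date
end

meetup_day(2024, 1, 1, :last)  # => the last Monday of January 2024
```

With the direction of travel sitting next to the starting point, the whole idea is visible in one place, and a name for it becomes easier to find.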

Implementation Fragment

Sometimes the method isolates the complete thought, but the method name misses the mark.

Some code that generates the lyrics to the song “99 Bottles of Beer” had this method in it.

def bottle_or_bottles(quantity)
  if quantity == 1
    "bottle"
  else
    "bottles"
  end
end

(The above is taken from an upcoming book on using this song to study OOP. Full disclosure: I am one of the collaborators on the book.)

This, too, repeats the conditional. Bottle and bottles are two different instances of a single concept. Other fragments of that same concept might be “growler” or “keg” or “six-pack”.

def container(quantity)
  if quantity == 1
    "bottle"
  else
    "bottles"
  end
end
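
The difference shows up at the call site. In this sketch the wall_line method and its wording are assumptions, added only to show how the name reads from the caller’s side.

```ruby
def container(quantity)
  if quantity == 1
    "bottle"
  else
    "bottles"
  end
end

# Hypothetical caller; the exact wording of the line is an assumption.
def wall_line(quantity)
  "#{quantity} #{container(quantity)} of beer on the wall"
end

wall_line(2)  # => "2 bottles of beer on the wall"
wall_line(1)  # => "1 bottle of beer on the wall"
```

The caller asks for a container and doesn’t need to know which variants exist.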

A good name doesn’t join the implementation in the weeds. It lifts its eyes a bit and sees a bigger picture.

Here’s a method found in code to generate the lyrics to the song that goes “I know an old lady who swallowed a fly”.

def swallowed
  "She swallowed the #{predator} to catch the #{prey}."
end

Predator and prey are great names. They explain what the variables contain, as well as how they’re related to each other. But swallowed doesn’t help the reader much.

The author took a small piece of the implementation, and echoed it for the name.

A method should name an idea, not a random little piece of an idea. The song is about a little old lady who inexplicably swallows a fly. She then compounds the problem by swallowing larger and larger creatures. This method has isolated the part of the song that tries to explain why someone would do such a thing. It explains the reasoning behind her choices. Her motivation.

def motivation(predator, prey)
  "She swallowed the #{predator} to catch the #{prey}."
end
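
Here is a short usage sketch. The list of pairs is an assumption about how a caller might feed the method, and the animals are taken from the traditional lyrics.

```ruby
def motivation(predator, prey)
  "She swallowed the #{predator} to catch the #{prey}."
end

# Hypothetical caller: each pair is [predator, prey].
pairs = [%w[spider fly], %w[bird spider], %w[cat bird]]
pairs.map { |predator, prey| motivation(predator, prey) }
# => ["She swallowed the spider to catch the fly.",
#     "She swallowed the bird to catch the spider.",
#     "She swallowed the cat to catch the bird."]
```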

Conclusion

Each example was problematic in a different way, but the strategy to fix them was similar. The first step was to describe the problem in English. The programming terms might end up being important later. For now, just find the words from that domain.

Scrabble has points and scores and tiles.

Anagrams are about words. But not just words. Words that relate to each other in a specific way. A subject and candidates.

The Hamming distance between two DNA strands is not any old sum, it’s a count of mutations.

Any song can have a first line and a last line. Many songs will have recurring sentences. There’s a difference between a song about drinking beer and one about swallowing critters. Those differences matter.

Make meaningful distinctions. Remove gratuitous or unnecessary details.

In short, tell a good story.
