Smelly Cucumbers

Dave Kennedy

It was such a cliche of a title, I just had to use it. I’m sure you have heard of the great BDD tool Cucumber, but what you may not know about is how smelly cukes can be.

I have recently been revising and refactoring my Cucumber features in absolute disgust. In my defense, I only started writing Cucumber in earnest 6-7 months ago, and I didn’t spend too much time on it. It was quick to get up and running with a couple of Rails applications, but I stopped using it when I found my features becoming clumsy and difficult to maintain.

This was by no means the fault of Cucumber. I neglected to learn so much of the tool that it’s no wonder I ran into trouble. In recent months, I have invested some time in Cucumber again and realized what a chump I had been.

Let’s look at one of the offending Cucumber tests, modified slightly. For what it’s worth, I don’t just make todo applications, but they do make a good example. In this case, it’s a good example of a poor cuke.

Feature: Todo item management

Scenario: Adding a todo item
    Given I have a todo list named "Mondays list"
    When I go to the home page
    And I fill in "username" with "dave"
    And I fill in "password" with "secret"
    And I press "Log In"
    And I go to the todo page
    And I click on link "Mondays list"
    And I fill in "todo" with "Grab some milk"
    And I press "Add todo"
    Then I should see "Todo item added successfully"

It’s pretty awful, yeah? It gets worse when you realize there are update and delete scenarios too.

An Aside – How Cucumbers Work

Before delving any deeper, it’s best we establish the gist of how Cucumber works. You have three component parts: the feature, the scenarios and the implementation.

The feature is a scenario, or more likely a collection of scenarios, written in Gherkin. I can only describe Gherkin as something like markdown. It’s human readable text with a specific structure and a handful of keywords. The most commonly used are Given, When and Then.

Behind these features, we have the implementations, which are written in Ruby and more commonly called steps. Each line of a scenario equates to a block of Ruby code. For example, Given I have a todo list named "Mondays list" could be implemented like so:

Given /^I have a todo list named "([^"]*)"$/ do |list_name|
  # The quoted text in the scenario line arrives as list_name
  List.create(:name => list_name)
end

Put simply, when Cucumber runs, it scoops up the lines in our scenarios and calls the corresponding step block.
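To make that matching idea concrete, here is a toy registry in plain Ruby. It is only a sketch of the concept, not how Cucumber is actually implemented: StepRegistry, define and run_line are names I made up for illustration, and unlike the real tool it simply takes the first pattern that matches.

```ruby
# A toy step registry, only to illustrate the matching idea.
# StepRegistry, define and run_line are hypothetical names, not Cucumber's API.
class StepRegistry
  # Leading Gherkin keyword; it is stripped before matching.
  KEYWORDS = /\A\s*(Given|When|Then|And|But)\s+/

  def initialize
    @steps = [] # pairs of [pattern, block]
  end

  # Register a block against a regular expression.
  def define(pattern, &block)
    @steps << [pattern, block]
  end

  # Strip the keyword, find the first matching pattern, and call its
  # block with the captured groups as arguments.
  def run_line(line)
    text = line.sub(KEYWORDS, "")
    @steps.each do |pattern, block|
      match = pattern.match(text)
      return block.call(*match.captures) if match
    end
    raise ArgumentError, "Undefined step: #{text}"
  end
end

registry = StepRegistry.new
created_lists = []

registry.define(/^I have a todo list named "([^"]*)"$/) do |list_name|
  created_lists << list_name
end

registry.run_line('    Given I have a todo list named "Mondays list"')
created_lists # ["Mondays list"]
```

The capture group in the regular expression is what turns the quoted text in the scenario into the block’s argument.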

A couple of things to remember: each scenario is treated in isolation, and nothing is carried over to the next one. If scenarios need shared setup or cleanup, you have to use Cucumber’s special “setup” and “teardown” features for that.
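The isolation can be pictured with another small plain Ruby model. Cucumber runs the steps of each scenario against a fresh object, so instance variables set in one scenario are not visible in the next. ToyRunner and its methods below are hypothetical names used only for this sketch.

```ruby
# Toy runner: every scenario gets a brand-new "world" object, so instance
# variables set by one scenario's steps are gone when the next one starts.
# ToyRunner, before and run_scenario are hypothetical names, not Cucumber's API.
class ToyRunner
  def initialize
    @before_hooks = []
  end

  # Register shared setup to run at the start of every scenario.
  def before(&block)
    @before_hooks << block
  end

  # Run one scenario body inside a fresh world and return that world.
  def run_scenario(&body)
    world = Object.new
    @before_hooks.each { |hook| world.instance_exec(&hook) }
    world.instance_exec(&body)
    world
  end
end

runner = ToyRunner.new
runner.before { @current_user = "dave" }

first  = runner.run_scenario { @todo_list = "Mondays list" }
second = runner.run_scenario { }

first.instance_variable_get(:@todo_list)     # "Mondays list"
second.instance_variable_get(:@todo_list)    # nil: nothing carried over
second.instance_variable_get(:@current_user) # "dave": the setup ran again
```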

Dissecting the Cucumber

Reading back the scenario at the start of the article, I get about a third into reading this test and lose the will to live. Does it describe the behavior of the application? Well, to me all it describes is the mechanics of what I want to do. It’s more a badly written integration test than a description of behavior.

Fortunately for me, scenarios like this are pretty common. I don’t feel so bad knowing that some poor developer out there has suffered the same pain I have.

Nothing is chicken soup to the soul like a bit of self critique. What is so wrong about the accused scenario?

It’s Confusing

We can definitely smell a couple of things wrong here. At first glance, it’s confusing. We have to imagine pages springing into view, populating text fields, pressing buttons. The behavior is lost in lots of pointless detail.

Cucumber was developed in order to involve clients and stakeholders in the development process. We could be dealing with people who may not be technically minded or whose experience of the web may only be lolcats. As great as that may be, asking them to imagine filling out a couple of forms online before making sure we are meeting their requirements is a bit much.

That said, I rarely present my Cucumber scenarios to clients, but I present them to other developers every day. My colleagues are all very familiar with web applications, and the Feature should describe the intended behavior. They do not have to inspect my code and work out what I’m doing. In the interest of time, I can just write them a little story. Also, the story proves to be a great reminder down the line when you are working on your own.

It’s Brittle

Secondly, the test is brittle. What if the log in process changes? Or the login URL? Even if the mechanics of adding a todo item have not changed, the test will fail. That cannot be good news.

It’s Lazy

The final smell is the assertion, or “Then”. What is that actually testing? The Rails flash. It’s a lazy assertion that made me shudder.

Fools Rush In

Over time, I have been bitten by Cucumber and had to get a bit smart before rushing in to refactoring steps.

Sure, the steps in the above scenario were a snap to write. I hardly had to code any steps in Ruby (the deprecated web_steps.rb handled that for me). However, if I have 20 or so scenarios that require a user to log in and the login URL changes, I’m in trouble. Sure, Vim macros can save some of the pain, but this is not even close to ideal.
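One way to contain that risk is to keep the login mechanics in a single helper, so a changed URL or form means one edit instead of twenty. The sketch below is an assumption on my part rather than code from the original project: LoginHelpers, log_in_as and LOGIN_PATH are made-up names, and visit, fill_in and click_button are the Capybara methods the including object is expected to provide.

```ruby
# Hypothetical helper module; LoginHelpers, log_in_as and LOGIN_PATH are
# illustrative names. visit, fill_in and click_button are Capybara DSL
# methods that the including object is assumed to provide.
module LoginHelpers
  LOGIN_PATH = "/" # assumption: the login form lives on the home page

  # The only place that knows how logging in works.
  def log_in_as(username, password)
    visit LOGIN_PATH
    fill_in "username", with: username
    fill_in "password", with: password
    click_button "Log In"
  end
end

# In a Cucumber support file this module would be mixed into the world with
#   World(LoginHelpers)
# after which any step definition can call log_in_as("dave", "secret").
```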

Looking at Cucumber in the grand scheme of things, if we slightly modify a scenario, we are really modifying the expected behavior of our application.

In our case a login is pretty slight. We only care if we are logged in, after all. What if we needed to log in with certain credentials to gain privileges? I find it really helps to keep your scenario items as generic as possible.

Doing it Right

It’s time to stop beating myself up and do something about the now infamous scenario. First of all, I want to describe the feature properly. Gherkin allows us to just write text after we declare the feature, and it won’t kick in until it reaches a keyword.

Feature: Todo item management

I want to track items I need to do in a list. That way I will never forget them. I want to add, edit, delete and mark todo items as finished.

There is also a common convention that removes decisions about what to write: the “In order/As a/I want” format. Gherkin treats it as description text like any other. I prefer to write a story, but I could just as easily have written:

Feature: Todo item management
  In order to remember things
  As a person with too much on his mind
  I want to maintain todos on a list

I hold my hands up here and say I do not know which is better. I feel the story format has advantages, as we have license to be more descriptive. However, I know more than anyone how daunting a blank screen with a blinking cursor can be. I recommend that you use whichever format works for you, as long as you put in some description of the feature.

Now that we know what we are describing, let’s get to the nitty gritty of refactoring the scenario. The first thing has to be getting rid of all those web steps.

Feature: Todo item management

I want to track items I need to do in a list. That way I will never forget them.
I want to add, edit, delete and mark todo items as finished.

Scenario: Adding a todo item
    Given I have a todo list named "Mondays list"
    And I am logged in as a normal user
    When I add a todo item "Grab some milk"
    Then it should be added to the todo list

That’s much better, but the log in thing still really bothers me. It is a detail I shouldn’t worry about here. I’m talking about adding a todo item, so why am I distracting myself with this log in nonsense? Luckily, we can handle this using the Gherkin keyword “Background:”.

Feature: Todo item management

I want to track items I need to do in a list. That way I will never forget them. I want to add, edit, delete and mark todo items as finished.

Background:
    Given I am logged in as a normal user

Scenario: Adding a todo item
    Given I have a todo list named "Mondays list"
    When I add a todo item "Grab some milk"
    Then it should be added to the todo list

The background will act as a setup and run before each scenario in the feature. Background is great, but there is another way, which I tend to use when dealing with logins: Cucumber hooks.

  @user_logged_in
  Scenario: Adding a todo item
    Given I have a todo list named "Mondays list"
    When I add a todo item "Grab some milk"
    Then it should be added to the todo list

I can add the implementation of this hook in a support file along with other roles. It’s especially helpful when using a third party authentication library (such as Warden), like so:

# Runs before every scenario tagged @user_logged_in
Before('@user_logged_in') do
  user = User.create(name: "test_user", admin: false)
  login_as user
end
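
If the tag mechanism feels like magic, this plain Ruby model shows the gist: a hook is stored against a tag and only runs for scenarios that carry that tag. TaggedHooks, before and start_scenario are hypothetical names for this sketch, not Cucumber’s API.

```ruby
# Toy model of tagged hooks: a hook runs only for scenarios carrying its tag.
# TaggedHooks, before and start_scenario are hypothetical names.
class TaggedHooks
  def initialize
    @hooks = [] # pairs of [tag, block]
  end

  # Register a block to run before scenarios that carry the given tag.
  def before(tag, &block)
    @hooks << [tag, block]
  end

  # Run every hook whose tag appears on the scenario and return the
  # fresh world object the hooks ran against.
  def start_scenario(tags)
    world = Object.new
    @hooks.each do |tag, block|
      world.instance_exec(&block) if tags.include?(tag)
    end
    world
  end
end

hooks = TaggedHooks.new
hooks.before("@user_logged_in") { @logged_in = true }

tagged   = hooks.start_scenario(["@user_logged_in"])
untagged = hooks.start_scenario([])

tagged.instance_variable_get(:@logged_in)   # true
untagged.instance_variable_get(:@logged_in) # nil: the hook was skipped
```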

Now for the step definitions. Written in Ruby they are a real breeze.

Given /^I have a todo list named "([^"]*)"$/ do |list_name|
  # Keep the list in an instance variable so later steps can use it
  @todo_list = Factory.create(:list, name: list_name)
end

When /^I add a todo item "([^"]*)"$/ do |item_description|
  visit todo_path(@todo_list)
  fill_in "Description", with: item_description
  click_button "Add Item"
end

# Assert on the list itself rather than on a flash message
Then /^it should be added to the todo list$/ do
  @todo_list.items.length.should eql 1
end

I have pointed out a few potential traps we could encounter using Cucumber. Sure, none are really Cucumber’s fault but, like any framework in the wrong hands, it can be nothing but pain. Hopefully, some of what I covered here will save you some trauma. I’m very prone to “am I doing it right” syndrome, but sometimes we just have to feel the pain to learn something new.

I am a firm believer in keeping the scenarios high level. I have read about some developers refusing to use regex-style expressions in the step definitions. If that works for them, great. However, I like to use these expressions for setup, sometimes for “When” steps, but never for assertions.

As a rule, try to keep the scenarios tight and don’t try to cover too much of the feature in a single scenario. It sounds like common sense, but I have often been distracted by complementary elements of a feature. Finally, don’t be scared to tear down and re-write “work in progress” scenarios. Cucumber also facilitates a great discovery process for your application. Allow the Cucumber tests to drive your development, always extracting as much testing as possible into your unit tests/specs. If you haven’t started, or have even quit Cucumber testing, I urge you to give it a go. The greatest benefit is knowing the system behaves in an expected way.
