Code Safari: Configuring Capybara

Welcome to Code Safari. In this series, Xavier Shay guides you through the source code of popular gems and libraries to discover new techniques and idioms.

Capybara provides a Ruby DSL for interacting with web pages. It is typically used to test Rack-based applications (such as Ruby on Rails or Sinatra), but can also be used stand-alone to drive any website.

I noticed Capybara’s configuration syntax, a fairly common idiom in the Ruby library world, and was intrigued as to how it worked.

Capybara.configure do |config|
  config.default_wait_time = 2
end

Let’s find out! Grab the Capybara source code so we can start spelunking.

git clone https://github.com/jnicklas/capybara.git

First of all, let’s try to find the configuration code that we are interested in. Searching for the method name is usually a good start.

$ ack configure lib
lib/capybara.rb
25:    #     Capybara.configure do |config|
46:    def configure
166:Capybara.configure do |config|

(ack is a nicer grep – see the documentation for more info)

Jackpot! We found the method we were after on the first go. Open up that file to find the method definition.

# lib/capybara.rb:46
def configure
  yield self
end

Curious: yield is passing self as the parameter of the configure block, but what is self in this context? You may be able to figure it out from scanning the rest of the file, but we can also harness the flexibility of Ruby by jumping in and executing code with irb. The -I flag adds a directory to the load path, ensuring that we require the file we are looking at rather than another version that may exist elsewhere on our system.

# In cloned capybara directory
$ irb -Ilib
irb> require 'capybara'
 => true
irb> Capybara.configure {|config| puts config.inspect }
Capybara
 => nil 
irb> Capybara.configure {|config| puts config.class.inspect }
Module
 => nil

We see here that self is actually the Capybara module. This is interesting: typically we think of self as being a reference to an instance of a class, but in this case it is the module object itself. That means that the following two lines are equivalent:

Capybara.default_wait_time = 2
Capybara.configure {|config| config.default_wait_time = 2 }

configure provides a level of abstraction away from directly setting the module accessors, which allows Capybara to potentially refactor how it stores preferences in the future.
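
To illustrate that flexibility, here is a minimal sketch of how a library might move its settings into a separate object while callers keep using the same configure block. The Widget module and its Settings struct are hypothetical names invented for this example; this is not how Capybara stores its preferences.

```ruby
# Hypothetical library, not Capybara's actual code.
module Widget
  # Settings now live in their own object rather than on the module itself.
  Settings = Struct.new(:default_wait_time)

  class << self
    # Lazily build the settings object with an assumed default of 2.
    def settings
      @settings ||= Settings.new(2)
    end

    # Callers still write Widget.configure { |config| ... } exactly as before.
    def configure
      yield settings
    end
  end
end

Widget.configure do |config|
  config.default_wait_time = 5
end

puts Widget.settings.default_wait_time # prints 5
```

Code that goes through configure is unaffected by the change. Code that set an accessor directly on the module would need a delegating method to keep working.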

It is starting to become clearer how this mechanism works, but there are still some unanswered questions. How has configure been defined on the module, and how have the module attributes been defined? The answer lies a little bit further up the file, where we see the following pattern:

# lib/capybara.rb, trimmed to size
module Capybara
  class << self
    attr_accessor :default_wait_time
 
    def configure
      yield self
    end
  end
end

The magic here is class << self, which effectively says “anything inside here belongs to the class, not instances of the class.” (It’s actually a little more subtle than that, but that definition is good enough for now.) Traditionally, attr_accessor is used to define attributes of an instance, such as:

class Person
  attr_accessor :name
end
 
p = Person.new
p.name = 'Don'
p.name # => 'Don'

We can see from the Capybara code, though, that we can in fact use it to define attributes on any object, even unusual ones like an instance of Module (which is exactly what the Capybara module is).
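
To make the class << self idea a little more concrete, the sketch below defines one method inside class << self and another with the more familiar def self. form, then asks Ruby where they ended up. Greeter is a hypothetical module made up for this example.

```ruby
# Hypothetical module for illustration only.
module Greeter
  class << self
    # Defined on Greeter itself (strictly, on its singleton class).
    def hello
      "hello from #{self}"
    end
  end

  # An equivalent way of defining a method on the module itself.
  def self.goodbye
    "goodbye from #{self}"
  end
end

puts Greeter.hello               # prints "hello from Greeter"
puts Greeter.goodbye             # prints "goodbye from Greeter"
p Greeter.singleton_methods.sort # prints [:goodbye, :hello]
```

Both methods show up as singleton methods of Greeter, which is the same place attr_accessor puts its reader and writer when it is called inside class << self.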

Review

In reading the Capybara source code to discover how the configuration block was implemented, we learned two techniques:

Yielding a builder object
Defining accessors on a Module

We can use this knowledge to create our own similar configuration system.

module MyApp
  class << self
    attr_accessor :special_number
 
    def configure
      yield self
    end
  end
end
 
MyApp.configure do |config|
  config.special_number = 42
end
 
puts "The special number is #{MyApp.special_number}."
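
One detail worth poking at is where the configured value ends up. The sketch below uses a hypothetical Preferences module (not part of Capybara) to show that a module-level attr_accessor stores its value in an instance variable on the module object itself.

```ruby
# Hypothetical module for illustration only.
module Preferences
  class << self
    attr_accessor :timeout

    def configure
      yield self
    end
  end
end

Preferences.configure do |config|
  config.timeout = 10
end

# The writer created by attr_accessor set @timeout on the module object.
p Preferences.instance_variables                  # prints [:@timeout]
p Preferences.instance_variable_get(:@timeout)    # prints 10
p Preferences.timeout                             # prints 10
```

Because the state lives on a single shared object, every part of the program that reads Preferences.timeout sees the same value.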

To practice your code reading, here are some other research tasks for you to try:

RSpec uses a similar-looking DSL for configuration. Does it work in the same way?
Machinist uses a totally different technique to allow configuration with a block without yielding a parameter. Figure out how it works.

Let us know how you go in the comments. Tune in next week for more exciting adventures in the code jungle.