The Ruby Ecosystem for New Rubyists

Ruby is more than just a language. It has a universe of tools and processes supporting the creation of the complex software it makes. This can be overwhelming to newcomers, so I’ve put together an article that will hopefully make things a little more clear.

Version Management

Let’s say you have two projects relying on two different versions of a gem. The project that uses the newer version is not compatible with the older gem. What do you do?

One option is to just install whichever version of the gem you need at the time. This isn’t a great idea, because either version of the gem might declare its own dependencies incorrectly or imprecisely in its Gemfile or gemspec. That kind of problem is hard to track down.

Instead, most Rubyists rely on some kind of version manager. Not only do version managers keep gems tidy, but they separate Ruby implementations as well. This makes it easy to test for differences between, say, Ruby 2.0 and Ruby 2.1.

Popular version management solutions include RVM, rbenv, and Uru (Windows).

Here is how you would get started managing Rubies with RVM:

Install RVM.

$ \curl -sSL https://get.rvm.io | bash

Source the RVM script whenever a bash-compatible shell starts up.

$ echo "source $HOME/.rvm/scripts/rvm" >> ~/.bash_profile

Reload the shell.

$ source ~/.bash_profile

Verify that the RVM script has been sourced. It should print “rvm is a function.”

$ type rvm | head -n 1

Install Ruby 2.0.0.

$ rvm install 2.0.0

Switch to 2.0.0.

$ rvm use 2.0.0

Create a new gemset for 2.0.0 named “experimental.”

$ rvm gemset create experimental

Switch to the new gemset.

$ rvm gemset use experimental
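
To double-check which interpreter and gem directories are active after switching, you can print a few values that Ruby and RubyGems expose. This is only a quick sanity check (the ruby_environment helper is a name I made up), and the paths you see will depend on your version manager and gemset.

```ruby
require "rubygems"

# Collect a few facts about the Ruby that is currently running.
# Output varies by machine, so no expected values are shown.
def ruby_environment
  {
    engine:   RUBY_ENGINE,   # for example "ruby" for MRI or "jruby" for JRuby
    version:  RUBY_VERSION,
    gem_home: Gem.dir,       # where `gem install` places new gems
    gem_path: Gem.path       # every directory searched for installed gems
  }
end

ruby_environment.each { |key, value| puts "#{key}: #{value.inspect}" }
```

With RVM, the gem_home line should mention the gemset you just switched to.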

Crafting Gems

Packaged code was not always a thing in Ruby. In 2003, rubyforge.org launched as a place for Ruby developers to share code. Although it improved the situation a bit, developers were still on their own when it came to figuring out how to run each other’s code. In November 2003, some Ruby developers got together and decided to solve the problem forever. In 2004 rubygems.org launched, and with it, the gem tool.

RubyGems is a package manager for Ruby libraries and programs. Here are a couple of reasons one might have for creating a Ruby gem:

Easily share code with other developers
Avoid duplicating code between projects

Installing a gem is easy:

$ gem install gem_name

The layout for a typical gem project looks like this:

- spec
  - gem_name_spec.rb
  - spec_helper.rb
- bin
  - gem_name
- lib
  - gem_name.rb
  - gem_name
    - source_file1.rb
    - source_file2.rb
    - source_file...
    - version.rb
- Gemfile
- gem_name.gemspec
- README.md
- LICENSE
- Rakefile

It isn’t necessary to memorize the gem directory structure. A simple way to generate a basic gem template is with the bundler tool.

$ bundle gem gem_name

Note: Bundler likely comes pre-installed with your Ruby installation, but if not, or if you want the latest version, you can install the gem:

$ gem install bundler

If you don’t like the template that Bundler creates, there are other tools available, including jeweler and hoe.

The Load Path

We need to examine a non-obvious issue that new gem developers face: Ruby’s load path. Let’s say you have a couple of files in the same directory (not making a gem, just in general). We’ll call them source1.rb and source2.rb.

# source1.rb
require 'source2'

# source2.rb
puts "hello world"

Looks great, but try using the file and see what happens.

$ ruby source1.rb
...'require': cannot load such file -- source2 (LoadError)...

Wait, what is going on? Even though they are in the same directory, source1.rb can’t see source2.rb. It turns out that Ruby does not automatically include the directory of a source file it executes in its load path. We can make this example work by telling it to do so with the command line option -I (include directory) and . (which directory – . is the current directory):

$ ruby -I . source1.rb
hello world

Alternatively, you can require the direct path to the file:

# source1.rb 
require './source2'

So do you need to do something like this when developing gems? No. For testing gems in development, source directories can be programmatically added to Ruby’s $LOAD_PATH global variable.

# source1.rb
$LOAD_PATH.unshift(".")
require "source2"
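
If you want to watch the load path at work without creating files by hand, here is a small self-contained sketch. It writes an empty throwaway file into a temporary directory and tries to require it before and after that directory is on $LOAD_PATH. The file name load_path_demo_lib and the helper method are made up for the demo.

```ruby
require "tmpdir"

# Create an empty throwaway library, then try to require it before and
# after its directory has been added to the load path.
def load_path_demo
  Dir.mktmpdir do |dir|
    File.write(File.join(dir, "load_path_demo_lib.rb"), "# empty demo library\n")

    before = begin
      require "load_path_demo_lib"
      :found
    rescue LoadError
      :load_error
    end

    $LOAD_PATH.unshift(dir)          # put the directory first in the search order
    after = require("load_path_demo_lib") ? :found : :already_loaded
    $LOAD_PATH.delete(dir)           # leave the load path as we found it

    [before, after]
  end
end

p load_path_demo
```

The first attempt ends in a LoadError and the second one succeeds, which is the same thing that happened with source1.rb and source2.rb.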

Gems use this technique to put their lib folder on the load path, so that any file beneath it can be required by its path relative to lib. You will typically see File.expand_path used to convert the relative path from the current file to the lib folder into an absolute path.

# spec/spec_helper.rb
lib = File.expand_path('../../lib', __FILE__)
$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)

# now we can require lib/gem_name.rb which can require other code as needed
require 'gem_name'

The $LOAD_PATH variable is just an array. $LOAD_PATH.unshift(lib) adds the directory to the beginning of the array, so that it is searched before anything else. Note that File.expand_path('../../lib', __FILE__) refers to the lib directory one directory up and not two as it looks like. This is a common trip-up. The second argument is the starting point for the relative path, and because __FILE__ names a file rather than a directory, the first .. only strips the file name. Without a second argument, the current working directory (the one Ruby was executed in) is used.
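
Since the File.expand_path behavior trips up so many people, here is a small sketch that makes it visible. The /project path is hypothetical and simply stands in for whatever __FILE__ would be inside spec/spec_helper.rb. The printed paths assume a Unix-like system (Windows would add a drive letter).

```ruby
# Hypothetical absolute path standing in for __FILE__ in spec/spec_helper.rb.
HELPER_FILE = "/project/spec/spec_helper.rb"

# The first ".." removes the file name, the second goes up one real directory.
def lib_dir_for(file)
  File.expand_path("../../lib", file)
end

puts File.expand_path("..", HELPER_FILE)                        # /project/spec
puts lib_dir_for(HELPER_FILE)                                   # /project/lib
puts File.expand_path("../lib", File.dirname(HELPER_FILE))      # /project/lib
```

On Ruby 2.0 and later you can also start from __dir__, which is already a directory, and write File.expand_path('../lib', __dir__) with a single .. instead.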

Note that you will not always see $LOAD_PATH in gems. A popular, if incredibly undescriptive, alias is $:.

$:.unshift(File.expand_path('../../lib', __FILE__))

Any supporting gem code should go in a folder of the same name as the gem.

- lib
  - gem_name.rb
  - gem_name
    - some_file.rb

# lib/gem_name.rb
require 'gem_name/some_file'

When a gem is installed, RubyGems takes care of making its lib directory available on Ruby’s load path. Therefore, the load path shouldn’t be modified anywhere within the actual gem code, and the name of the gem should be unique.

gemspec

The rubygems gemspec file is the place where the gem is actually defined. The gemspec is the one file that must exist in order to build a gem.

# gem_name.gemspec
lib = File.expand_path("../lib", __FILE__)
$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
require 'gem_name/version'

Gem::Specification.new do |s|    
  s.platform       = Gem::Platform::RUBY
  s.name           = 'gem_name'
  s.version        = GemName::VERSION
  s.authors        = ['Your Name']
  s.email          = ['your@email.com']
  s.homepage       = 'https://github.com/your_name/gem_name'
  s.summary        = 'Gem in a few words'
  s.description    = 'Longer description of gem'
  s.required_ruby_version = '>= 1.9.3'
  s.require_path   = 'lib'
  s.files          = Dir['LICENSE', 'README.md', 'bin/*', 'lib/**/*']
  s.executables    = ['gem_name']

  s.add_dependency('sqlite3')
  s.add_development_dependency("rspec", ["~> 2.0"])
  s.add_development_dependency("simplecov")
end

Not all of these fields are required. For example, not every gem has executables and some might prefer to place their dependencies in a Gemfile instead. Be sure to check out the rubygems.org specification reference.
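
The gemspec above reads its version from GemName::VERSION, which lives in lib/gem_name/version.rb. Here is a minimal sketch of what that file usually contains, along with a peek at how RubyGems interprets the string. The value "0.1.0" is just a placeholder I picked.

```ruby
require "rubygems"

# lib/gem_name/version.rb (sketch; "0.1.0" is a placeholder)
module GemName
  VERSION = "0.1.0"
end

# RubyGems turns the string into a Gem::Version, which compares
# segment by segment instead of character by character.
version = Gem::Version.new(GemName::VERSION)

puts version.segments.inspect              # [0, 1, 0]
puts version < Gem::Version.new("0.10.0")  # true
puts "0.9.0" < "0.10.0"                    # false, because plain strings compare as text
```

Keeping the version in its own tiny file means the gemspec, the Rakefile, and the library itself can all require it without loading the rest of the gem.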

Gemfile

The Bundler Gemfile contains a list of the gem’s dependencies, their versions, and where to get them. This makes it easy for other developers to sync their environment with yours to ensure the same behavior. It also makes it possible for users of the gem to have all of its dependencies installed without having to know what they are.

A gem dependency in a Gemfile looks like this:

gem <gem_name>, <version constraint>

Here is an example of a Gemfile:

# Gemfile
source 'https://rubygems.org'

gem 'nokogiri', '~> 1.4.2'

group :development do
  gem 'sqlite3'  
end

group :test do
  gem 'rspec'
  gem 'cucumber'
end

group :production do
  gem 'pg'
end

gemspec

A non-obvious piece of syntax in Gemfiles is the pessimistic version operator (~>), jokingly known as the spermy operator. It lets the last digit you give increase freely while holding the digits before it fixed. So gem 'nokogiri', '~> 1.4.2' is semantically equivalent to gem 'nokogiri', '>= 1.4.2', '< 1.5.0'.
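
If you are ever unsure what a constraint allows, you can ask RubyGems directly. The sketch below uses Gem::Requirement and Gem::Version, which ship with RubyGems. The allowed? helper is just a name I made up for the demo.

```ruby
require "rubygems"

# Does the given version string satisfy the given constraint?
def allowed?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

puts allowed?("~> 1.4.2", "1.4.2")  # true
puts allowed?("~> 1.4.2", "1.4.9")  # true
puts allowed?("~> 1.4.2", "1.5.0")  # false
puts allowed?("~> 1.4", "1.9.0")    # true, only two digits were given so the second may grow
puts allowed?("~> 1.4", "2.0.0")    # false
```

Notice that the number of digits matters: '~> 1.4' is much more permissive than '~> 1.4.2'.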

The gemspec line just tells Bundler to install the dependencies listed in the gemspec file. Many prefer to place all of their dependencies in a Gemfile, so this line is not required.

When you want to sync your environment with a gem’s dependencies, just navigate to the directory containing the Gemfile and run:

$ bundle install

Sometimes projects have dependencies that only make sense for particular environments. If you don’t plan on using the gem in production, you can avoid installing the gems listed in the production group.

$ bundle install --without production

When Bundler calculates dependencies, it creates a file called Gemfile.lock. The purpose of this file is to help avoid subtle differences between development environments. A common question is whether this should go in a repository. If the project is a gem, no, but if it’s an app, yes. The reason is that apps generally have their own separate gemsets, and it’s important that everyone developing an app has the exact same version of each dependency.

For more information, see the Bundler Gemfile reference and Bundler Rationale.

Rakefile

rake is the Ruby world’s equivalent to GNU make. Unlike Gemfiles and gemspecs, Rakefiles exist mostly for convenience. Having simple commands like rake test makes it easy for automated systems (or other developers) to test code without needing to know what kind of testing system to use.

# Rakefile
lib = File.expand_path("../lib", __FILE__)
$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
require "gem_name/version"

task :test do
  system "rspec"
end

task :build do
  system "gem build gem_name.gemspec"
end

task :release => :build do
  system "gem push gem_name-#{GemName::VERSION}.gem"
end

If you want to get a good feel for how Rakefiles work, there is a long tutorial here.
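
One detail worth seeing in action is the :release => :build syntax, which makes build a prerequisite of release. The sketch below defines two throwaway tasks from plain Ruby and records the order they run in. It assumes the rake gem is installed, uses Rake::Task.define_task (the class-level counterpart of the task method you write in a Rakefile), and the demo_ task names are made up.

```ruby
require "rake"

# Record the order in which Rake runs a task and its prerequisite.
def release_order
  Rake::Task.clear  # forget earlier definitions so the demo can be re-run
  order = []

  Rake::Task.define_task(:demo_build) { order << :build }
  Rake::Task.define_task(:demo_release => :demo_build) { order << :release }

  Rake::Task[:demo_release].invoke  # runs prerequisites first
  order
end

p release_order
```

Invoking the release task alone is enough for Rake to run the build task first, which is why rake release in the Rakefile above never pushes a stale gem.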

Building Gems

Once a gem is properly organized and tested, it can be built. The gem build command will produce a .gem file that can be installed locally or pushed to a gem repository. The filename will include the version set in the gemspec.

$ gem build gem_name.gemspec
$ gem install gem_name-version.gem
$ gem push gem_name-version.gem

Now you, or anybody else who installs the gem, can use it in a project with:

require 'gem_name'

Note: You will sometimes see require 'rubygems' in code. This is a holdover from when RubyGems was a separate project. Starting with Ruby 1.9, RubyGems is part of the standard library.

Bundle Exec

You will often see commands written like bundle exec rake db:migrate instead of rake db:migrate. This is because bundle exec guarantees that the command will be executed using the versions of the gems specified in the Gemfile. RVM creates gemsets with rubygems-bundler to avoid this issue, and it also shouldn’t be a problem if you are using Ruby 2.2, but it is worth knowing about.

Ruby Implementations

There are many implementations of the Ruby language, as well as Ruby-like languages. The most popular of these include:

MRI/CRuby – The reference implementation of Ruby written in C
JRuby – Ruby running on the Java Virtual Machine
Rubinius – Ruby written in Ruby, running on LLVM
RubyMotion – Commercial implementation for making native Apple / Android apps

There are many others.

There seems to be a common misconception that threads don’t work in Ruby. Threads are limited in CRuby (MRI), but they do provide the ability to work on two problems at once. It’s important to understand the difference between concurrency and parallelism:

concurrency – completing two or more tasks in overlapping time periods (e.g. downloading a file while waiting to see if the user interacts with the interface)
parallelism – completing two or more tasks simultaneously on multiple processor cores

The reference implementation of Ruby, MRI, has a global interpreter lock, so it can only use one native thread at a time. What this means is that Ruby threads can run concurrently but not in parallel. If executing code on multiple processors is needed in MRI, forking or some other variant of process coordination must be used (at least until the GIL is removed, if ever). There are Ruby implementations that lack a GIL, including all of the others listed above. On top of this, forking isn’t technically available to the JVM, so in JRuby you pretty much have to use threads, anyway.
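
To see the concurrency half of this for yourself, here is a small sketch that starts several threads which each wait for a moment, then times the whole batch. I’m using sleep as a stand-in for a blocking operation like a network request, and overlapped_wait is a made-up helper name.

```ruby
require "benchmark"

# Start `count` threads that each wait for `seconds`, and time the batch.
def overlapped_wait(count, seconds)
  Benchmark.realtime do
    Array.new(count) { Thread.new { sleep seconds } }.each(&:join)
  end
end

elapsed = overlapped_wait(4, 0.2)
puts format("4 threads, 0.2s each: %.2fs total", elapsed)
```

Even on MRI the total should land close to 0.2 seconds rather than 0.8, because the waits overlap. Swap the sleep for a CPU-heavy loop and MRI will no longer show that kind of speedup.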

I wrote an article that goes into more detail about forking in Ruby here.
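
As a small taste of the forking approach, the sketch below hands a calculation to a child process and reads the answer back through a pipe. It assumes MRI on a Unix-like system (Kernel#fork isn’t available on Windows or JRuby), and sum_in_child is a name I made up.

```ruby
# Do some work in a child process and send the result back over a pipe.
def sum_in_child(numbers)
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    writer.write(numbers.inject(0, :+).to_s)  # the work happens in the child
    writer.close
  end

  writer.close               # the parent only reads
  result = reader.read.to_i  # blocks until the child closes its end
  reader.close
  Process.wait(pid)          # reap the child so it doesn't linger as a zombie
  result
end

puts sum_in_child((1..1_000).to_a)  # 500500
```

Because the child is a separate process with its own interpreter lock, several children can keep several cores busy at once. The price is that results have to be passed back explicitly, here as a string over a pipe.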

Documentation

When a gem is installed, you will often notice a notification about rdoc and ri documentation being installed. RDoc is an embedded documentation generator for Ruby. ri is a man-like tool for reading offline documentation in the terminal.

$ ri Array
$ ri Array#length
$ ri Math::sqrt
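
The text that rdoc extracts and ri displays comes from ordinary comments placed directly above classes and methods. Here is a minimal sketch of what that looks like in source code. Greeter is a made-up class used only for illustration.

```ruby
# Greets people by name.
#
# rdoc treats this comment block as the documentation for the class.
class Greeter
  # Returns a greeting for +name+.
  #
  #   Greeter.new.greet("Matz")  #=> "Hello, Matz!"
  def greet(name)
    "Hello, #{name}!"
  end
end

puts Greeter.new.greet("Matz")
```

Running rdoc in a directory containing a file like this should generate HTML documentation from the comments. The indented line inside the method comment is rendered as a code sample.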

When using a version manager, it might be necessary to generate the core documentation manually. Otherwise, ri will just print “Nothing known about [whatever]”. For example, with rvm the core documentation can be generated with:

$ rvm docs generate

This can take a while. Likewise, when installing a gem, generating the documentation can take an inconvenient amount of time if the gem is complex enough. You can skip the documentation for the gem with the flags below (newer RubyGems releases replace both with a single --no-document flag):

$ gem install gem_name --no-rdoc --no-ri

Pry

Another way to explore the Ruby documentation is to install pry. Pry is an alternative to the irb console with various enhancements, including the ability to see the original C source code for methods.

$ gem install pry pry-doc
$ pry

[1] pry(main)> show-method Array#map

From: array.c (C Method):
Owner: Array
Visibility: public
Number of lines: 13

static VALUE
rb_ary_collect(VALUE ary)
{
    long i;
    VALUE collect;

    RETURN_SIZED_ENUMERATOR(ary, 0, 0, ary_enum_length);
    collect = rb_ary_new2(RARRAY_LEN(ary));
    for (i = 0; i < RARRAY_LEN(ary); i++) {
        rb_ary_push(collect, rb_yield(RARRAY_AREF(ary, i)));
    }
    return collect;
}

[1] pry(main)> cd Array
[2] pry(Array):1> show-doc map

From: array.c (C Method):
Owner: Array
Visibility: public
Signature: map()
Number of lines: 12

Invokes the given block once for each element of self.

Creates a new array containing the values returned by the block.

See also Enumerable#collect.

If no block is given, an Enumerator is returned instead.

  a = [ "a", "b", "c", "d" ]
  a.collect { |x| x + "!" }        #=> ["a!", "b!", "c!", "d!"]
  a.map.with_index{ |x, i| x * i } #=> ["", "b", "cc", "ddd"]
  a                                #=> ["a", "b", "c", "d"]

Testing

Let’s say you clone some random repository, run some code metrics, and fix some duplication. You aren’t familiar with the project other than the file(s) you worked on. How do you know you didn’t break anything?

Testing.

The Ruby community takes testing very seriously. Testing frameworks for Ruby include Test::Unit, RSpec, Minitest, and Cucumber. Test::Unit was the original standard unit testing framework for Ruby, but it has been deprecated in favor of Minitest.

Typically, tests go in the /test or /spec folder at the root level of the project. Usually, there is a /test/test_helper.rb or /spec/spec_helper.rb that is run before processing the test cases or specifications. This is a good place to configure the global testing environment.

Let’s look at an example of testing with Minitest. Minitest is capable of both unit test and specification-flavoured testing, as well as benchmarking.

Note: Although Ruby ships with Minitest, I got an error from using Minitest::Test instead of Minitest::Unit::TestCase. This was fixed by installing the latest version of the gem.

$ gem install minitest --version 5.4.2

We’ll start with a simple test file using the unit test format.

require "minitest/autorun"

class Foo
  def hello
    "goodbye"
  end
end

class TestFoo < Minitest::Test
  def setup
    @foo = Foo.new
  end

  def test_hello
    assert_equal "hello", @foo.hello
  end
end

If you run the file you should get one failure.

1) Failure:
TestFoo#test_hello [test.rb:15]:
Expected: "hello"
  Actual: "goodbye"

1 tests, 1 assertions, 1 failures, 0 errors, 0 skips
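
For comparison, here is a sketch of the specification flavour of Minitest. To keep it separate from the failing example, I’m using a made-up Greeter class whose expectation passes. The _() wrapper assumes a reasonably recent Minitest 5 release. On older versions you would call must_equal directly on the value instead.

```ruby
require "minitest/autorun"

# Greeter is a made-up class for this sketch.
class Greeter
  def hello
    "hello"
  end
end

describe Greeter do
  before do
    @greeter = Greeter.new
  end

  it "says hello" do
    _(@greeter.hello).must_equal "hello"
  end
end
```

Both styles run on the same engine, so the choice between test_ methods and describe/it blocks is mostly a matter of taste.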

Minitest is cool, but RSpec is arguably the most popular testing framework in the Ruby universe. This is largely thanks to a booming community and various integrations with other testing libraries, like Capybara. Unlike Minitest, RSpec includes a test runner command, rspec, which will check all specs in the spec/ directory hierarchy named *_spec.rb.

First, install the rspec gem.

$ gem install rspec --version 3.1.0

Typically the spec/ directory has a layout like this:

- spec
    - spec_helper.rb  
    - gem_or_app_name
      - models
        - some_model_spec.rb
      - controllers
        - some_controller_spec.rb

For a demo, just create an example spec and a blank spec_helper.rb.

# spec/example_spec.rb
require "spec_helper"

class Foo
  def hello
    "goodbye"
  end
end

describe Foo do
  before(:each) do
    @foo = Foo.new
  end

  it "says hello" do
    expect(@foo.hello).to eq "hello"
  end
end

To check the spec, simply run rspec in the root project directory.

$ rspec

Failures:

  1) Foo says hello
    Failure/Error: expect(@foo.hello).to eq "hello"

      expected: "hello"
            got: "goodbye"

      (compared using ==)
    # ./spec/example_spec.rb:13:in `block (2 levels) in <top (required)>'

Finished in 0.00079 seconds (files took 0.06974 seconds to load)
1 example, 1 failure

What about spec/spec_helper.rb? To see what it’s used for, we’ll add code coverage. SimpleCov is a popular code coverage tool that will generate coverage HTML in the coverage/ directory at the project root.

First, install the simplecov gem.

$ gem install simplecov --version 0.9.1

Now, just call SimpleCov.start in spec/spec_helper.rb.

# spec/spec_helper.rb
require "simplecov"

SimpleCov.start

Coverage will be calculated every time rspec is run. In this case, there’s nothing to cover, but the tool will be activated.

Keep in mind that rspec does not execute spec/spec_helper.rb automatically. It needs to be required within each spec by adding require 'spec_helper' to the top of the spec file.

Summary

Version Management:
– Using the system Ruby installation is fraught with peril – use a version manager instead.
– Use a different gemset for each major project. Isolation can help avoid conflicts as well as problems with dependencies that are specified incorrectly or imprecisely.

Crafting Gems:
– When developing gems, the library directory must be manually added to $LOAD_PATH for code to work as if the gem is installed. This can be done at the command line or programmatically.
– File.expand_path('../../lib', __FILE__) refers to a lib folder one directory up, not two directories up as it might seem.
– Only one file, gem_name.rb, should be in the lib directory. The rest of the source code should go in lib/gem_name/.
– If using Ruby 1.8.7, require 'rubygems' should go before requiring any gems.
– Gem dependencies can either go in a Bundler Gemfile or a RubyGems gemspec.
– In gemspecs, #executables= expects filenames, not paths. A folder named bin is assumed.
– Rakefiles are like Makefiles and define common tasks like testing, building, and releasing.
– It’s popular to use git ls-files to get arrays of files for the gemspec, but Ruby has Dir which is more portable.
– bundle exec [command] executes a command using the versions of the gems specified in the Gemfile. This shouldn’t be necessary with RVM or Ruby >= 2.2.0.

Ruby Implementations:
– The reference implementation of Ruby (CRuby / MRI) has a global interpreter lock, so while concurrency is possible with threads, parallelism (multiple processors at once) is not.
– If parallelism is needed in MRI, use forking with Kernel#fork.
– Running threads in parallel can be achieved with JRuby, Rubinius, and RubyMotion.

Documentation:
– The ri tool can be used to explore core documentation offline.
– rdoc is Ruby’s standard embedded documentation generator.
– Documentation for your Ruby implementation may need to be installed by your version manager like with rvm docs generate.
– pry is a sophisticated console that can explore documentation and programs at runtime.

Testing:
– Tests or specs go in the test/ or spec/ folder of a project.
– Setup and configuration go in test/test_helper.rb or spec/spec_helper.rb.
– Minitest is the standard testing library, but you may encounter legacy code that uses Test::Unit.
– All RSpec specifications can be checked by running rspec in the project root directory containing spec/.
– Spec files must be named *_spec.rb, or the rspec command will skip them unless they are passed to it directly.
– RSpec does not execute spec/spec_helper.rb automatically. require 'spec_helper' needs to go in each spec file.
– SimpleCov is a test coverage library and can be used by installing the gem and adding it to the test suite helper file.