8 Simple Steps for Contributing to Open Source

Glenn Goodrich

The Ruby community embraces open source like no other community. These days, any Ruby developer worth their salt is expected to have authored or contributed to an open source library. Git and Github have made sharing code easy and contributing simple.

However, making your first contribution to an open source library can be very daunting. If you’re like me, you were/are nagged by self-doubt and a fear that you would/will “do it wrong.” I worry about the mockery of other developers, all hardened by years of open source contributions. They will shred my code and use my avatar as a meme for terrible coding.

I did, eventually, get past this fear. I have contributed to a few open source projects (AngularJS, neography, jquery-ui, just to name a few) and I can tell you it’s not so bad. If you are stuck in the self-doubt phase, but want to jump in, you may be asking “What’s the first step?” or “How do I contribute?” Well, I aim to answer those kinds of questions by walking you through my attempt at contributing to the wonderful guard-jruby-rspec gem.

The Steps are:
1. Search
2. Fork
3. Prepare Your Local Environment
4. Write a Test
5. Fix It
6. Commit and Push It
7. Test It
8. Issue the Pull Request

The Gem

I am currently working on a JRuby project that uses RSpec. Running the tests continuously, automatically, and (most importantly) quickly is paramount. I pulled in the guard-jruby-rspec gem to do just that, and it runs my tests faster than any other solution I’ve seen on any Ruby project. Joe Kutner, author of Deploying with JRuby, has done a brilliant job with the gem. (Joe also reviewed this article, so thanks for that too!)

Briefly, let me explain what guard-jruby-rspec does. The gem is a plugin to Guard which, according to its Github page, is “a command line tool to easily handle events on file system modifications.” It is most often used to monitor files in a Ruby or Rails project and fire off the affected tests as files change. If a controller file is changed, the corresponding test file is automatically run. This is invaluable, as you get constant feedback as you code. Automate all the things.

Guard has a thriving plugin framework, and guard-jruby-rspec is one of the many plugins. guard-jruby-rspec, as you may have guessed, runs RSpec on files based on “watchers” in your Guardfile. It leverages JRuby to minimize the loading time for each test.

In a Rails app, for example, if the Guardfile looks like:

interactor :simple
guard 'jruby-rspec' do
  watch(%r{^spec/.+_spec\.rb$})
  watch(%r{^lib/(.+)\.rb$})     { |m| "spec/lib/#{m[1]}_spec.rb" }
  watch('spec/spec_helper.rb')  { "spec" }

  # Rails example
  watch(%r{^app/(.+)\.rb$})                           { |m| "spec/#{m[1]}_spec.rb" }
  watch(%r{^app/(.*)(\.erb|\.haml)$})                 { |m| "spec/#{m[1]}#{m[2]}_spec.rb" }
  # HEY THIS LINE IS IMPORTANT
  watch(%r{^app/controllers/(.+)_(controller)\.rb$})  { |m| ["spec/routing/#{m[1]}_routing_spec.rb", "spec/#{m[2]}s/#{m[1]}_#{m[2]}_spec.rb", "spec/acceptance/#{m[1]}_spec.rb"] }
  watch(%r{^spec/support/(.+)\.rb$})                  { "spec" }
  watch('app/controllers/application_controller.rb')  { "spec/controllers" }

  # Capybara features specs
  watch(%r{^app/views/(.+)/.*\.(erb|haml)$})          { |m| "spec/features/#{m[1]}_spec.rb" }
end

Anytime a file changes that matches a watcher, the specs run. If you pass in a block to the watcher, you can narrow down the specs that will run. In other words, when I change a model file it will run just the model spec.
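To make the block form concrete, here is a minimal sketch in plain Ruby of what a watcher does with a changed path. It does not use Guard itself: the SketchWatcher struct and its targets_for method are hypothetical stand-ins, and only the patterns and the block mirror lines from the Guardfile.

```ruby
# Hypothetical stand-in for a Guard watcher: a pattern plus an optional block.
SketchWatcher = Struct.new(:pattern, :action) do
  # Returns the spec targets for a changed path, or nil if the path does not match.
  def targets_for(path)
    match = pattern.match(path)
    return nil unless match
    # With no block, the changed file itself is the thing to run.
    action ? action.call(match) : path
  end
end

controller_watcher = SketchWatcher.new(
  %r{^app/controllers/(.+)_(controller)\.rb$},
  lambda do |m|
    ["spec/routing/#{m[1]}_routing_spec.rb",
     "spec/#{m[2]}s/#{m[1]}_#{m[2]}_spec.rb",
     "spec/acceptance/#{m[1]}_spec.rb"]
  end
)

spec_watcher = SketchWatcher.new(%r{^spec/.+_spec\.rb$}, nil)

controller_targets = controller_watcher.targets_for("app/controllers/users_controller.rb")
spec_targets = spec_watcher.targets_for("spec/models/user_spec.rb")
unmatched = controller_watcher.targets_for("README.md")

p controller_targets
p spec_targets
p unmatched
```

Note that the controller watcher hands back an array of three targets rather than a single path.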

The Problem

As I was working, I noticed that Guard would fail anytime I changed and saved a controller file. Here is the error:

16:29:43 - ERROR - Guard::JRubyRSpec failed to achieve its <run_on_changes>, exception was:
[# NoMethodError: undefined method `match' for #<Array:0x4816505b>
[#] /home/vagrant/ws/rental_express/ROOT/rails/.bundle/jruby/1.8/gems/guard-rspec-2.5.4/lib/guard/rspec/inspector.rb:50:in `spec_folder?'
[#] /home/vagrant/ws/rental_express/ROOT/rails/.bundle/jruby/1.8/gems/guard-rspec-2.5.4/lib/guard/rspec/inspector.rb:38:in `should_run_spec_file?'
[#] /home/vagrant/ws/rental_express/ROOT/rails/.bundle/jruby/1.8/gems/guard-rspec-2.5.4/lib/guard/rspec/inspector.rb:30:in `clean'
[#] org/jruby/RubyArray.java:2395:in `select'
[#] /home/vagrant/ws/rental_express/ROOT/rails/.bundle/jruby/1.8/gems/guard-rspec-2.5.4/lib/guard/rspec/inspector.rb:30:in `clean'
[#] /home/vagrant/ws/rental_express/ROOT/rails/.bundle/jruby/1.8/gems/guard-rspec-2.5.4/lib/guard/rspec/inspector.rb:62:in `clear_spec_files_list_after'
[#] /home/vagrant/ws/rental_express/ROOT/rails/.bundle/jruby/1.8/gems/guard-rspec-2.5.4/lib/guard/rspec/inspector.rb:29:in `clean'
[#] /home/vagrant/ws/rental_express/ROOT/rails/.bundle/jruby/1.8/gems/guard-rspec-2.5.4/lib/guard/rspec.rb:85:in `run_on_changes'
[#] /vagrant/git/guard-jruby-rspec/lib/guard/jruby-rspec.rb:74:in `run_on_changes'
[#] org/jruby/RubyKernel.java:2080:in `send'
...

I started looking around to see if I could figure out how to fix this error, and this is how I did it.

Step 1: Search

In many cases, you are not the first one to find an issue. Especially if the issue is with a well-used gem or framework, like Rails. Before you start frolicking through the code, it’s smart to ask Google to see if someone else has the issue.

Also, you should find the code base on Github (the vast majority of open source gems/code can be found on Github) and search its issues for your problem. Chances are someone else is also having the issue and/or working on a fix.

What do you search for? I find that using the meat of the error message (if you have one) has a high incidence of success. Take out the line numbers and your local computer/directory names and search for the error. In this example, I searched for the following:

  • NoMethodError: undefined method `match' for Array
  • JRuby Guard NoMethodError: undefined method `match' for Array

No dice.

If you don’t have an error or your error isn’t helping, try searching for the major gems involved. In this case, I might search for “Guard JRuby Rspec Rails Array match” and see what pops up. Most times, you’ll get a few StackOverflow hits and you’ll be on the road to recovery. This, however, was not the case for me. Sad panda.
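The trimming described above (dropping line numbers and machine-specific details before searching) can be sketched in a few lines of Ruby. This is only an illustration: search_query is a hypothetical helper, and the regular expressions are assumptions tuned to the error shown earlier, not a general-purpose cleaner.

```ruby
# Hypothetical helper: reduce a raw error line to something worth searching for.
def search_query(error_line)
  error_line
    .sub(/\A\[#\]?\s*/, "")            # drop the "[#" / "[#]" prefix from Guard's output
    .gsub(/#<(\w+):0x\h+>/, '\1')      # "#<Array:0x4816505b>" becomes "Array"
    .gsub(%r{(?:/[\w.-]+)+:\d+}, "")   # drop absolute paths and their line numbers
    .gsub(/[`']/, "")                  # drop the quoting around method names
    .squeeze(" ")
    .strip
end

raw = "[# NoMethodError: undefined method `match' for #<Array:0x4816505b>"
query = search_query(raw)

puts query
```

The object id (0x4816505b) is different on every run, so leaving it in the query would all but guarantee zero hits.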

I found the code on Github and searched the repository for “undefined method match for Array” and BINGO! Sure enough, someone else had the same issue. It was only a day old, as well, so there was still blood in the water.

I added an “I am getting this too” comment and let them know what version of the gem I am using. In this case, I also added “I plan to investigate further” to let them know that I was taking a crack at fixing it. On to Step 2…

Step 2: Fork

Since the code for the gem I am investigating is on Github, I can easily make my own copy of it. This is called “forking” and is a step in the process made possible by the fantastic-ness of Github. Some people think forking is a git thing, but it is not. All forking does is copy/clone the repository to a location under your user on Github. In my case, I now have a ruprict/guard-jruby-rspec repository that I can hack on to my heart’s content.

Once I have a fork, I like to add it to my Gemfile and make sure the issue is still happening. In order to add a gem to your Gemfile that is sourced from a github repository, you do this:

gem 'guard-jruby-rspec', github: 'ruprict/guard-jruby-rspec'

After a quick bundle, I run guard and make sure the issue is still there. It is.

Now, I need to get this code locally on my laptop so I can start working on it. This is a simple git clone git@github.com:ruprict/guard-jruby-rspec.git from my local terminal, and we’re cooking with gas. Step 3, you’re up.

Step 3: Prepare Your Local Environment

OK, I’ve cloned the repo and switched into my local directory. I like to wall off my work when I am working on something like this, so I use RVM because it is fabulous.

Since this is a JRuby gem, I need to make sure I have JRuby installed (I am using 1.7.4, so rvm install jruby) and then create a gemset. I named my gemset “guard-jruby-rspec” and added a .ruby-version and .ruby-gemset to the directory so that it switches to the right environment when I am in this directory. You can use an .rvmrc file for this too, but everything seems to be going the way of .ruby-version.

Once you’re in your new gemset (or other walled-off environment), run bundle install to install all the dependencies of the gem. Most gems will have a Gemfile that points to the dependencies listed in the gemspec file. You’ll see all your dependencies go by, like so:

[Image: Bundle install]

We are now at the most important step in preparing our development environment: running the tests/specs. This is Ruby, so all gems should have some tests. You should NOT start development until the existing tests run successfully in your environment. Most often it is simply a case of running rake test or rake spec or, in this case, rspec.

[Image: Tests passing]

OK, tests are passing. Excellent. Now we can hack.

I like to make a git branch for my work so the “pristine” environment is just a git checkout away.

git checkout -b array_error_in_custom_watcher
git push -u origin array_error_in_custom_watcher

The second command forces my local branch to track a branch on my github repository with the same name. I dig it.

Many gems/repositories will have a section in their README on how to contribute, so make sure you have read that and are following the requested process. guard-jruby-rspec does not have one that I can find (but Angular has a monster), so I am following the “most common” process as I know it.

Step 4: Write a Test

Test Driven Development tells us that you should write a test to reproduce any bug that you are fixing. Sometimes, this is very easy. Other times, not so much. In this case, it’s somewhere in the middle. I don’t know how Guard works, so I am going to have to do some investigating into how Guard is put together in order to figure out how to make a test that reproduces the error.

It’s a bit more difficult in this case, since the error actually happens in the guard-rspec code. It’s unlikely that guard-rspec has the issue, though, as no one seems to complain about this error on that repository.

What makes this a bit more frustrating is that I think I know what the fix is. In this case, when an array of test targets is passed from a watcher block, we’re getting a NoMethodError on Array. It has to be more than a coincidence that Array is issuing the error and we are returning an Array in our watcher. Go back and look at the Guardfile to see the line I am talking about.

Looking at the stack trace, the last frame inside the guard-jruby-rspec code is line 74 in the run_on_changes method, which looks like:

def run_on_changes(raw_paths)
  unload_previous_examples
  @reloaders.reload(raw_paths)

  unless @custom_watchers.nil? or @custom_watchers.empty?
    paths = []
    raw_paths.each do |p|
      @custom_watchers.each do |w|
        if (m = w.match(p))
          paths << (w.action.nil? ? p : w.call_action(m))
        end
      end
    end
    super(paths) # THIS IS LINE 74
  end
end

That paths variable is an array of files to run based on the changes passed in to the function in the raw_paths argument. By putting in various puts statements, I know that paths is an array of arrays when it fails. The question is, how do I write a test that reproduces the issue?
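Here is a stripped-down sketch of how that nesting happens, in plain Ruby with no Guard involved. The watcher_result variable is a hypothetical stand-in for the return value of w.call_action(m) in the method above, and the spec paths are made up for illustration.

```ruby
# Stand-in for what a watcher block returns when it lists several targets.
watcher_result = ["spec/routing/users_routing_spec.rb",
                  "spec/controllers/users_controller_spec.rb"]

paths = []
paths << watcher_result   # << appends the whole array as ONE element

p paths
# => [["spec/routing/users_routing_spec.rb", "spec/controllers/users_controller_spec.rb"]]

# Code further down the chain expects every element to be a String and calls
# match on it. An Array has no such method, hence the NoMethodError.
error = begin
  paths.first.match(/spec/)
  nil
rescue NoMethodError => e
  e
end

puts error.class
```

So the failure is not really in the code that calls match; it is in whoever handed that code an array where a string was expected.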

First, I look at the existing tests. I want to try and use the same style and I certainly want to leverage the same tricks (mocking, etc.), so I look through the test files to find a spot where I think the test fits best. My first few attempts either break “wrongly” or pass when they should fail. Some of the existing tests mock two objects, runner and inspector. By tracing the code through the guard-rspec gem (the super call calls the run_on_changes method for Guard::RSpec), I realize that the mocked call to inspector.clean is the key to what I want.

That call takes an array of paths and, when things are right, it should NOT have an array as an element in the outer array. With that knowledge, I can write an expectation test that ensures the mocked call to inspector has the appropriate arguments.

# guard-jruby-rspec/spec/guard/jruby-guard_spec.rb:226
it "works with watchers that have an array of test targets" do
  subject = described_class.new([Guard::Watcher.new(%r{^spec/(.+)$}, lambda { |m| ["spec/#{m[1]}_match", "spec/#{m[1]}_another.rb"]})])

  test_targets = ["spec/quack_spec_match", "spec/quack_spec_another.rb"]

  inspector.should_receive(:clean).with(test_targets).and_return(test_targets) # THIS IS THE TEST
  runner.should_receive(:run).with(test_targets) { true }
  subject.run_on_change(['spec/quack_spec'])
end

The test creates a watcher that returns an array of test locations. The test_targets variable is what the inspector SHOULD receive when things are working. When I run this test, I get a failure on the expectation, because the call to inspector.clean gets an array with an array as the first element.

If this seems daunting, it isn’t. The actual process took me around an hour, and I had a lot of failures, wild goose chasing, etc. Each time I failed, I learned a little bit more about the code until I figured out how I needed to structure my test. There are tools that make the investigation of an issue much easier.

For example, I recently discovered vim-bundler by Tim “I should be given a medal” Pope. This glorious plugin allows me to open gems in my bundle with a simple Btabedit <gem name>, where I can edit the gem in place and never leave my editor. I screamed like a rabid Beatles fan when I found vim-bundler. Really. Much of my investigation was placing puts statements, etc. into other gems (like Guard::RSpec), which allowed me to see what was happening as I messed about in the code.

Ahhh, we have a test showing the issue. Step 5 will be a breeze.

Step 5: Fix It

As I mentioned, writing a test to catch the issue is almost always harder than fixing the code. That is true here, as the fix is simply adding .flatten to the paths array we are passing up our inheritance chain.

def run_on_changes(raw_paths)
  unload_previous_examples
  @reloaders.reload(raw_paths)

  unless @custom_watchers.nil? or @custom_watchers.empty?
    paths = []

    raw_paths.each do |p|
      @custom_watchers.each do |w|
        if (m = w.match(p))
          paths << (w.action.nil? ? p : w.call_action(m))
        end
      end
    end
    super(paths.flatten) # Here be the change, mon.
  end
end

The tests run again and they pass. I sit back and soak in the good feeling.

It’s worth noting that this may or may not be the best solution. In this case, it’s not much code and I have a test. I am happy enough with the code to move on to Step 6.
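To see why flatten is enough, here is a small sketch in plain Ruby (again, no Guard involved, and the paths are made up for illustration). It mixes a watcher result that is a single string with one that is an array, which is the situation the loop above can produce. The second half shows one possible alternative, offered as an assumption about what might also work, not as what the gem does.

```ruby
paths = []
paths << "spec/models/user_spec.rb"                      # watcher returned a single path
paths << ["spec/routing/users_routing_spec.rb",
          "spec/controllers/users_controller_spec.rb"]   # watcher returned an array

flat = paths.flatten

p flat
# => ["spec/models/user_spec.rb", "spec/routing/users_routing_spec.rb", "spec/controllers/users_controller_spec.rb"]

# Alternative: never build the nested array in the first place.
# Array(x) wraps a string in an array and leaves an array unchanged.
concatenated = []
["spec/models/user_spec.rb", ["spec/a_spec.rb", "spec/b_spec.rb"]].each do |result|
  concatenated.concat(Array(result))
end

p concatenated
```

Either approach hands super a flat list of strings; flatten is simply the smaller change to the existing method.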

Step 6: Commit and Push It (Get Up on This!)

I can now commit the code. A word of caution here: Make sure that your changes only include what the fix NEEDS. In one of my earliest contributions, I had a bunch of whitespace changes left over from adding puts statements or other debug statements. This made my commit touch more files than necessary, and that is not cool.

In this case, here is my git diff:

[Image: Git Diff]

As you can see, my changes are just the test and the change. Nice and clean.

Also, make sure you don’t commit any new files accidentally. If you’re in the habit of doing a git add ., then you are in danger of committing temporary editor files or anything else that may have snuck into your path while you were working.

Finally, make your commit message informative. A good (if not maybe a little overdone) example is the AngularJS commit message conventions. That example is probably at the far end of the SuperDuper Commit Message Spectrum, but it’s unlikely that anyone will complain if you put too much info into the commit message.

Here is my commit message:

[Image: Commit message]

Step 7: Test It

I like to go back to my project where I originally found the bug and add my github repository into the Gemfile, bundle, then make sure the issue is no more. In this case, I add
gem "guard-jruby-rspec", github: "ruprict/guard-jruby-rspec", branch: "array_error_in_custom_watcher"
to the Gemfile and run bundle update guard-jruby-rspec.

Now, recreate or go back to the environment that exercises the bug. For me, I simply added the custom watcher that returns an array of test targets back to the Guardfile in my project, started guard (guard), and changed a controller file. I put in an obvious failure, then removed it.

[Image: Controller test]

Since Guard didn’t blow up all over the place, I am good. The error, she is vanquished!

Step 8: Issue the Pull Request

The moment has finally arrived. You are about to embark on a journey of open source contributions that will take you to faraway lands and introduce you to alien folk. Or something.
Go to the branch on your GH repo and click the Pull Request button.

In your message, mention any issues that you may have read that are fixed or affected by the PR. In my case, I mentioned the issue I found that had the same error. You can mention an issue simply by linking to it in your comment. If your commit message is OK, you shouldn’t have to type much more than the issue reference.
Your self-doubt may rise to the surface here, but just go for it. I have yet to meet an OSS author that isn’t appreciative of someone trying to help. Even if you mess it up, learn, fix, and resubmit it. It’s the only way you’ll become an OSS contributing machine.

[Image: Pull Request]

Congratulations! An optional Step 9 is to have a celebratory adult beverage!

Wrap Up

I hope this post helps someone get over their fear and apprehension of contributing to open source. Trust me, if I can do it, ANYONE can do it. If you are having a hard time finding a bug to work on, I would suggest signing up for Code Triage, which exists to facilitate finding places to help in popular open source libraries.

Good luck!
