Code Safari: Extending Ruby with C

GitHub have recently released a new Ruby library for parsing Markdown: Redcarpet. It is a wrapper around the C library Upskirt, which got me thinking: how do you write a C extension for Ruby anyway? For those of us used to the plush comfort of Ruby land, dropping down to C is a scary prospect. The spectre of segfaults haunts the boundary, forcing unprepared travellers to turn back at the first glimpse of a semi-colon.

What if it wasn’t that hard? Perhaps with a bit of courage, a bit of heart, and a bit of brains we can discover the ability to write a C extension for ourselves.

First Steps

My first investigative move is always the same: clone the repository, and run a default rake.

$ git clone git://github.com/tanoku/redcarpet.git
$ cd redcarpet
$ rake

In a well made gem (as this one is) this will run the full test suite. In this case the output also shows some C code being compiled first. We can dig into the Rakefile to find out more.

# excerpt from redcarpet/Rakefile
DLEXT = Config::MAKEFILE_CONFIG['DLEXT']
RUBYDIGEST = Digest::MD5.hexdigest(`#{RUBY} --version`)
 
file "ext/ruby-#{RUBYDIGEST}" do |f|
  rm_f FileList["ext/ruby-*"]
  touch f.name
end
CLEAN.include "ext/ruby-*"
 
file 'ext/Makefile' => FileList['ext/*.{c,h,rb}', "ext/ruby-#{RUBYDIGEST}"] do
  chdir('ext') { ruby 'extconf.rb' }
end
CLEAN.include 'ext/Makefile', 'ext/mkmf.log'
 
file "ext/redcarpet.#{DLEXT}" => FileList["ext/Makefile"] do |f|
  sh 'cd ext && make clean && make && rm -rf conftest.dSYM'
end
CLEAN.include 'ext/*.{o,bundle,so,dll}'
 
file "lib/redcarpet.#{DLEXT}" => "ext/redcarpet.#{DLEXT}" do |f|
  cp f.prerequisites, "lib/", :preserve => true
end
 
desc 'Build the redcarpet extension'
task :build => "lib/redcarpet.#{DLEXT}"

Rake is mostly used these days as a plain task runner, so it’s easy to forget that it was originally conceived as a Ruby version of make. As such, it provides methods other than task which come in handy for building projects. file is one such method: it creates a task that generates the named file, and that only runs when the file is missing or older than its prerequisites.

$ rake ext/Makefile
$ ls ext/Makefile
ext/Makefile # it exists!
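To see the make-like behaviour in isolation, here is a minimal sketch of a file task run from a plain Ruby script rather than a Rakefile. The file names source.txt and output.txt are made up for illustration, and Rake::FileTask.define_task is used directly as a stand-in for the file method you would normally call inside a Rakefile.

```ruby
require 'rake'
require 'tmpdir'

# Results captured for inspection after the temporary directory is gone.
build_runs = 0
output_contents = nil
needed_after_build = nil

Dir.mktmpdir do |dir|
  Dir.chdir(dir) do
    # Hypothetical input file, standing in for the C sources.
    File.write('source.txt', 'hello')

    # Same idea as `file 'output.txt' => 'source.txt' do ... end` in a Rakefile.
    output = Rake::FileTask.define_task('output.txt' => 'source.txt') do |t|
      build_runs += 1
      File.write(t.name, File.read(t.prerequisites.first).upcase)
    end

    # The file does not exist yet, so the task body runs.
    output.invoke
    output_contents = File.read('output.txt')

    # The file now exists and is not older than its prerequisite,
    # so rake no longer considers the task necessary.
    needed_after_build = output.needed?
  end
end

puts "built #{build_runs} time(s), still needed: #{needed_after_build}"
```

This is the same mechanism that lets the Redcarpet Rakefile skip recompiling when nothing in ext/ has changed.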

Following the chain of dependencies in the above code, we can extract the essential steps needed to compile a C extension:

cd ext
ruby extconf.rb
make
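The Rakefile names the compiled artifact using DLEXT, which varies by platform. As a rough sketch, you can ask your own Ruby what it resolves to. Note that this uses RbConfig, the name newer Rubies use for the module that appears as Config in the excerpt above. The artifact variable is just for illustration.

```ruby
require 'rbconfig'

# DLEXT is the file extension Ruby uses for compiled extensions on this
# platform, for example "so" on Linux or "bundle" on macOS.
dlext = RbConfig::CONFIG['DLEXT']

# The file name a build would produce for an extension called "redcarpet".
artifact = "redcarpet.#{dlext}"
puts artifact
```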

make is standard for building almost any C code, but extconf.rb is new.

# redcarpet/ext/extconf.rb
require 'mkmf'
 
dir_config('redcarpet')
create_makefile('redcarpet')

The documentation for mkmf confirms what we would expect this code to do:

module to create Makefile for extension modules

Presumably this compiles the similarly named redcarpet.c file, so let’s look at that and we should have the bare essentials to create our own extension.

// excerpt from redcarpet/ext/redcarpet.c
#include <stdio.h>
#include "ruby.h"

static VALUE rb_cRedcarpet;

// ...

static VALUE
rb_redcarpet_toc(int argc, VALUE *argv, VALUE self)
{
  // ...
}

static VALUE
rb_redcarpet_to_html(int argc, VALUE *argv, VALUE self)
{
  // ...
}

void Init_redcarpet()
{
    rb_cRedcarpet = rb_define_class("Redcarpet", rb_cObject);
    rb_define_method(rb_cRedcarpet, "to_html", rb_redcarpet_to_html, -1);
    rb_define_method(rb_cRedcarpet, "toc_content", rb_redcarpet_toc, -1);
}

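This doesn’t look too frightening. The Init_redcarpet function sets up a class, then adds some methods to it, each of which maps to another function defined in the file. (The trailing -1 tells Ruby that the C function takes a variable number of arguments, passed in as argc and argv.) We can do this. In fact, let’s!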
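For comparison, here is roughly what that Init function sets up, written as plain Ruby. This is only a sketch: RedcarpetSketch and the method bodies are placeholders, not Redcarpet’s real implementation. The point is that a method taking a splat reports the same arity of -1 that the C code passes to rb_define_method.

```ruby
# A pure-Ruby stand-in for the class that Init_redcarpet defines.
# The name and the method bodies are placeholders for illustration.
class RedcarpetSketch
  # Variable number of arguments, like registering a C function with -1.
  def to_html(*args)
    "to_html called with #{args.length} argument(s)"
  end

  def toc_content(*args)
    "toc_content called with #{args.length} argument(s)"
  end
end

to_html_arity = RedcarpetSketch.instance_method(:to_html).arity
puts to_html_arity                        # prints -1
puts RedcarpetSketch.new.to_html(:a, :b)  # prints "to_html called with 2 argument(s)"
```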

An Extension of Our Own

In the spirit of doing the simplest thing possible, let’s write a C extension that simply adds numbers together. Functionally useless, perhaps, but learning useful! From our investigation of Redcarpet, we know that there are only three steps needed to create our extension:

  1. Write an Init function in C code.
  2. Create an extconf.rb file that can create a Makefile to compile that C code.
  3. Run the Makefile to compile the extension.

Of course, there is a hidden fourth step as well: test that it all works!

First off, the C code:

// fastadd/fastadd.c
#include "ruby.h"

static VALUE rb_cFastadd;

void Init_fastadd()
{
  rb_cFastadd = rb_define_class("Fastadd", rb_cObject);
}

Then some extconf:

# fastadd/extconf.rb
require 'mkmf'

dir_config('fastadd')
create_makefile('fastadd')

Then put it together and test it:

$ ruby extconf.rb
Creating Makefile
$ make
gcc # ...
$ ruby -r./fastadd -e 'puts Fastadd.new'
#<Fastadd:0x0000010086dde0>

Bonanza! That’s a bona fide Ruby class, defined entirely in C. I’m starting to get excited. Let’s dive back into the Redcarpet code to see what else we can do.

Redcarpet Again

The first real logic to appear in redcarpet.c is the following line in the rb_redcarpet__render function:

VALUE text = rb_funcall(self, rb_intern("text"), 0);

Note that “intern” is a Ruby name for “convert to symbol”. In that context, the above code seems pretty simple: call an instance method named “text” with no arguments. But text isn’t defined anywhere else in the file! If you try running that code in our Fastadd extension, you will see that it isn’t provided by default (you get a NoMethodError). Redcarpet must be doing some other shenanigans, and if it’s not in this file it must be somewhere else. Return to the root directory of redcarpet and grep for “text”. Ignoring the many matches in the test directory, it reveals another definition of the Redcarpet class in lib/redcarpet.rb.
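If the C call is hard to picture, the nearest Ruby-level equivalent is probably send with an interned string. The sketch below uses made-up class names (WithText, WithoutText) and a Renderer module purely for illustration. It shows both the working case and the error you get when the receiver has no text method.

```ruby
# Ruby-level equivalent of rb_funcall(self, rb_intern("text"), 0):
# turn the string into a symbol, then call that method with no arguments.
module Renderer
  def render
    send("text".intern)
  end
end

# Hypothetical class that provides the method being called.
class WithText
  include Renderer
  attr_reader :text

  def initialize(text)
    @text = text
  end
end

# Hypothetical class that does not, like our bare Fastadd.
class WithoutText
  include Renderer
end

rendered = WithText.new("# Hello").render

missing_error = begin
  WithoutText.new.render
  nil
rescue NoMethodError => e
  e
end

puts rendered
puts missing_error.class
```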

class Redcarpet
  # Original Markdown formatted text.
  attr_reader :text

  def initialize(text, *extensions)
    @text  = text
    extensions.each { |e| send("#{e}=", true) }
  end
end


require 'redcarpet.so'

How cheeky! This isn’t a pure C exercise at all. Redcarpet is defining a class in Ruby, then reopening it and adding to it in C land. Open classes at work. This is really handy: not only can we set up the majority of our class easily in Ruby, we could even provide a default Ruby implementation and then override it later in our C code.
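The same trick can be sketched entirely in Ruby. Here the second class body plays the part of the C extension’s Init function. Shout and its render method are invented for this example.

```ruby
# First definition: plays the role of the plain Ruby file in lib/.
class Shout
  attr_reader :text

  def initialize(text)
    @text = text
  end

  # Default pure-Ruby implementation.
  def render
    text.upcase
  end
end

before = Shout.new("hi").render

# Second definition: reopens the same class, as an extension's Init
# function effectively does, and replaces one method.
class Shout
  def render
    "#{text.upcase}!"
  end
end

after = Shout.new("hi").render

puts before  # prints "HI"
puts after   # prints "HI!"
```

Everything from the first definition, including the constructor and the text reader, is still there after the class is reopened. Only render changes.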

Let’s add this to our Fastadd class, since it will need to know what number to add to.

# fastadd/fastadd.rb
class Fastadd
  attr_reader :n

  def initialize(n)
    @n = n
  end
end

require './fastadd.so'

Real Work

Let’s make our Fastadd class able to add one to the number. Lightning fast! You will have noticed an abundance of VALUE types in the Redcarpet C code: these represent Ruby objects. We need a way to convert these to native C types so that we can deal with them in a natural manner in C land. I wasn’t sure how to do this, but a search for “ruby c extension convert VALUE to int” led me to a pretty tops cheat sheet which lists some macros and functions for doing exactly this conversion. NUM2INT looks particularly handy for our purposes, with INT2NUM going back the other way.

// fastadd/fastadd.c
#include "ruby.h"

static VALUE rb_cFastadd;

static VALUE rb_fastadd_add_one(int argc, VALUE *argv, VALUE self)
{
  int n = NUM2INT(rb_funcall(self, rb_intern("n"), 0));

  return INT2NUM(n + 1);
}


void Init_fastadd()
{
  rb_cFastadd = rb_define_class("Fastadd", rb_cObject);
  rb_define_method(rb_cFastadd, "add_one", rb_fastadd_add_one, -1);
}

Trying it out:

$ make
# gcc output
$ ruby -r./fastadd -e 'puts Fastadd.new(3).add_one'
4
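A pure-Ruby reference implementation is a handy thing to check the extension against. The sketch below uses a separate, made-up class name (FastaddReference) so it can sit alongside the compiled Fastadd. One difference worth knowing: NUM2INT raises a RangeError when the number does not fit in a C int, whereas the Ruby version below happily handles big integers.

```ruby
# Pure-Ruby reference for the behaviour we expect from the C extension.
# The class name is hypothetical, chosen so it does not clash with Fastadd.
class FastaddReference
  attr_reader :n

  def initialize(n)
    @n = n
  end

  def add_one
    n + 1
  end
end

reference_result = FastaddReference.new(3).add_one
puts reference_result  # prints 4

# No C int involved here, so large values are fine.
big_result = FastaddReference.new(2**40).add_one
puts big_result
```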

Winner.

Wrapping Up

Of course, this is just the beginning of what you can do with C extensions. Here are some extra exercises you can tackle to sharpen your skills:

  • We cargo-culted the Redcarpet code and ended up defining #add_one as a method with a variable number of arguments. Correctly define it as a method that takes no arguments.
  • Access to the @n instance variable is done via the reader method. Instead, access it directly.
  • Investigate what extra work Redcarpet needs to do in its gemspec to include the C extension as part of the gem.
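As a hint for the first two exercises, here is a Ruby-level sketch of the difference you are aiming for. VariadicAdd and StrictAdd are invented names. The first mirrors what our extension does now, the second what it should do. On the C side the relevant pieces are, as far as I can tell, passing 0 instead of -1 to rb_define_method (with a function that takes only VALUE self) and reading the variable with rb_iv_get(self, "@n").

```ruby
# Mirrors the current extension: accepts and ignores any arguments.
class VariadicAdd
  def initialize(n)
    @n = n
  end

  def add_one(*_args)
    # Reads the instance variable directly, with no reader method involved.
    instance_variable_get(:@n) + 1
  end
end

# Mirrors the goal of the exercise: a method that takes no arguments.
class StrictAdd
  def initialize(n)
    @n = n
  end

  def add_one
    @n + 1
  end
end

variadic_arity = VariadicAdd.instance_method(:add_one).arity
strict_arity = StrictAdd.instance_method(:add_one).arity

# Passing an argument to the strict version is rejected by Ruby.
extra_arg_error = begin
  StrictAdd.new(3).add_one(99)
  nil
rescue ArgumentError => e
  e
end

puts variadic_arity  # prints -1
puts strict_arity    # prints 0
```

Checking instance_method(:add_one).arity on the compiled Fastadd class is a quick way to confirm the first exercise worked.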

Let us know how you go in the comments. Join us next week for more exciting adventures in the code jungle.

  • http://postmodern.github.com/ Postmodern

    You can also use FFI (http://github.com/ffi/ffi#readme) to wrap C libraries in Ruby, without having to write any C.

  • Ryan Lewis

    Rake’s default task can be anything, so saying that “In a well made gem (as this one is) this will run the full test suite.” is a false statement.

    Sure, rake’s default task in most projects is /usually/ :test but you can never be sure.
    The default task could also be set to generate RDoc, run specs, Rcov, Reek.

    My point is, if new coders read this, they’re sure to be confused.

    • http://xaviershay.com/ Xavier Shay

      With the possible exception of RDoc, I would argue that everything you listed could be part of a full test suite. I don’t quite understand your objection.

  • http://adamzaninovich.com Adam Zaninovich

    For anyone that runs rake and gets an error message like:

    no such file to load -- rake/extensiontask

    running

    gem install rake-compiler

    will resolve the issue.

  • Jeremy Voorhis

    FFI can be a real timesaver for these kinds of bindings. It automates most of the conversion between C and Ruby types and eliminates an entire build step. While there are some drawbacks (e.g. can’t bind to inline functions) it’s a much faster workflow when you can get away with it.