Code Safari: Extending Ruby with C
GitHub have recently released a new Ruby library for parsing Markdown: Redcarpet. It is a wrapper around the C library Upskirt, which got me thinking: how do you write a C extension for Ruby anyway? For those of us used to the plush comfort of Ruby land, dropping down to C is a scary prospect. The spectre of segfaults haunts the boundary, forcing unprepared travellers to turn back at the first glimpse of a semi-colon.
What if it wasn’t that hard? Perhaps with a bit of courage, a bit of heart, and a bit of brains we can discover the ability to write a C extension for ourselves.
First Steps
My first investigative move is always the same: clone the repository and run a default rake.
$ git clone git://github.com/tanoku/redcarpet.git
$ cd redcarpet
$ rake
In a well-made gem (as this one is) this will run the full test suite. In this case the output also shows some C code being compiled first. We can dig into the Rakefile to find out more.
# excerpt from redcarpet/Rakefile
DLEXT = Config::MAKEFILE_CONFIG['DLEXT']
RUBYDIGEST = Digest::MD5.hexdigest(`#{RUBY} --version`)

file "ext/ruby-#{RUBYDIGEST}" do |f|
  rm_f FileList["ext/ruby-*"]
  touch f.name
end
CLEAN.include "ext/ruby-*"

file 'ext/Makefile' => FileList['ext/*.{c,h,rb}', "ext/ruby-#{RUBYDIGEST}"] do
  chdir('ext') { ruby 'extconf.rb' }
end
CLEAN.include 'ext/Makefile', 'ext/mkmf.log'

file "ext/redcarpet.#{DLEXT}" => FileList["ext/Makefile"] do |f|
  sh 'cd ext && make clean && make && rm -rf conftest.dSYM'
end
CLEAN.include 'ext/*.{o,bundle,so,dll}'

file "lib/redcarpet.#{DLEXT}" => "ext/redcarpet.#{DLEXT}" do |f|
  cp f.prerequisites, "lib/", :preserve => true
end

desc 'Build the redcarpet extension'
task :build => "lib/redcarpet.#{DLEXT}"
With rake most commonly used just to run tasks with dependencies, it’s easy to forget that it was originally conceived as a Ruby version of make. As such, it provides methods other than task which come in handy for building projects. file is one such method: it creates a task that generates the named file, and that task only runs when the file is missing or older than its prerequisites.
$ rake ext/Makefile
$ ls ext/Makefile
ext/Makefile # it exists!
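To see the make-like behaviour in isolation, here is a minimal sketch of a file task run from a plain Ruby script rather than a Rakefile. The file names are made up for illustration and have nothing to do with Redcarpet; the point is that the block runs only when the target needs rebuilding.

```ruby
require 'rake'
require 'tmpdir'

runs = 0      # counts how often the task body actually executes
result = nil  # captures the generated file's content for inspection

Dir.mktmpdir do |dir|
  # Hypothetical file names, used only for this illustration.
  source = File.join(dir, 'input.txt')
  target = File.join(dir, 'output.txt')
  File.write(source, "hello\n")

  # A file task: the block runs only if `target` is missing
  # or older than its prerequisite `source`.
  file target => source do |t|
    runs += 1
    File.write(t.name, File.read(t.prerequisites.first).upcase)
  end

  Rake::Task[target].invoke    # target is missing, so the block runs
  Rake::Task[target].reenable  # let the task be invoked a second time
  Rake::Task[target].invoke    # target is up to date, so the block is skipped

  result = File.read(target)
end

puts result
puts "block ran #{runs} time(s)"
```

The second invoke does nothing because the target already exists and is no older than its source, which is the same check that stops Redcarpet's Rakefile from recompiling the extension on every run.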
Following the chain of dependencies in the above code, we can extract the essential steps needed to compile a C extension:
cd ext
ruby extconf.rb
make
make is standard for building almost any C code, but extconf.rb is new.
# redcarpet/ext/extconf.rb
require 'mkmf'
dir_config('redcarpet')
create_makefile('redcarpet')
The documentation for mkmf confirms what we would expect this code to do:
“module to create Makefile for extension modules”
Presumably this compiles the similarly named redcarpet.c file, so let’s look at that and we should have the bare essentials to create our own extension.
// excerpt from redcarpet/ext/redcarpet.c
#include <stdio.h>
#include "ruby.h"

static VALUE rb_cRedcarpet;

// ...

static VALUE
rb_redcarpet_toc(int argc, VALUE *argv, VALUE self)
{
    // ...
}

static VALUE
rb_redcarpet_to_html(int argc, VALUE *argv, VALUE self)
{
    // ...
}

void Init_redcarpet()
{
    rb_cRedcarpet = rb_define_class("Redcarpet", rb_cObject);
    rb_define_method(rb_cRedcarpet, "to_html", rb_redcarpet_to_html, -1);
    rb_define_method(rb_cRedcarpet, "toc_content", rb_redcarpet_toc, -1);
}
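This doesn’t look too frightening. The Init_redcarpet function sets up a class then adds some methods to it, which map to other functions defined in the file. Ruby calls it when the extension is required, which is why its name is Init_ followed by the extension’s name. We can do this. In fact, let’s!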
An Extension of Our Own
In the spirit of doing the simplest thing possible, let’s write a C extension that simply adds numbers together. Functionally useless, perhaps, but learning useful! From our investigation of Redcarpet, we know that there are only three steps needed to create our extension:
- Write an Init function in C code.
- Create an extconf.rb file that can create a Makefile to compile that C code.
- Run the Makefile to compile the extension.
Of course, there is a hidden fourth step as well: test that it all works!
First off, the C code:
// fastadd/fastadd.c
#include "ruby.h"
static VALUE rb_cFastadd;
void Init_fastadd()
{
    rb_cFastadd = rb_define_class("Fastadd", rb_cObject);
}
Then some extconf:
# fastadd/extconf.rb
require 'mkmf'
dir_config('fastadd')
create_makefile('fastadd')
Then put it together and test it:
$ ruby extconf.rb
Creating Makefile
$ make
gcc # ...
$ ruby -r./fastadd -e 'puts Fastadd.new'
#<Fastadd:0x0000010086dde0>
Bonanza! That’s a bona fide Ruby class, defined entirely in C. I’m starting to get excited. Let’s dive back into the Redcarpet code to see what else we can do.
Redcarpet Again
The first real logic to appear in redcarpet.c is the following in the rb_redcarpet__render function:
VALUE text = rb_funcall(self, rb_intern("text"), 0);
Note that “intern” is a Ruby name for “convert to symbol”. In that context, the above code seems pretty simple: call an instance method named “text” with no arguments. But text isn’t defined anywhere else in the file! If you try running that code in our Fastadd extension you will see that it isn’t provided by default (you get a NoMethodError). Redcarpet must be doing some other shenanigans, and if it’s not in this file it must be somewhere else. Return to the root directory of redcarpet and grep for “text”. Ignoring the many matches in the test directory, it reveals another definition of the Redcarpet class in lib/redcarpet.rb.
class Redcarpet
  # Original Markdown formatted text.
  attr_reader :text

  def initialize(text, *extensions)
    @text = text
    extensions.each { |e| send("#{e}=", true) }
  end
end

require 'redcarpet.so'
How cheeky! This isn’t a pure C exercise at all. Redcarpet defines a class in Ruby, then reopens it and adds to it in C land. Open classes at work. This is really handy: not only can we set up the majority of our class easily in Ruby, we could even provide a default Ruby implementation and then override it later in our C code.
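The open-class behaviour is easy to try without any C at all. The sketch below uses a made-up Greeter class: the first definition plays the role of lib/redcarpet.rb, and the second stands in for what the C extension does with rb_define_class and rb_define_method. It is only a Ruby simulation of that second step, not Redcarpet's code.

```ruby
# First definition: the plain-Ruby part of the class.
class Greeter
  attr_reader :name

  def initialize(name)
    @name = name
  end

  # Default implementation, written in Ruby.
  def greet
    "Hello, #{name}"
  end
end

# Second definition: this reopens the existing class instead of replacing it.
# In a C extension this step would be done from the Init function;
# here it is simulated in Ruby.
class Greeter
  def greet
    "HELLO, #{name.upcase}!"
  end
end

greeter = Greeter.new("safari")
puts greeter.name   # the reader from the first definition still works
puts greeter.greet  # the later definition of greet wins
```

Everything from the first definition survives the second one; only the method that was defined again changes.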
Let’s add this to our Fastadd class, since it will need to know what number to add to.
# fastadd/fastadd.rb
class Fastadd
  attr_reader :n

  def initialize(n)
    @n = n
  end
end

require './fastadd.so'
Real Work
Let’s make our Fastadd class able to add one to the number. Lightning fast! You will have noticed an abundance of VALUE types in the Redcarpet C code: these represent Ruby objects. We need a way to convert these to native C types so that we can deal with them in a natural manner in C land. I wasn’t sure how to do this, but a search for “ruby c extension convert VALUE to int” led me to this pretty tops cheat sheet which lists some macros and functions for doing exactly this conversion. NUM2INT looks particularly handy for our purposes.
// fastadd/fastadd.c
#include "ruby.h"

static VALUE rb_cFastadd;

static VALUE rb_fastadd_add_one(int argc, VALUE *argv, VALUE self)
{
    int n = NUM2INT(rb_funcall(self, rb_intern("n"), 0));
    return INT2NUM(n + 1);
}

void Init_fastadd()
{
    rb_cFastadd = rb_define_class("Fastadd", rb_cObject);
    rb_define_method(rb_cFastadd, "add_one", rb_fastadd_add_one, -1);
}
Trying it out:
$ make
# gcc output
$ ruby -r./fastadd -e 'puts Fastadd.new(3).add_one'
4
Winner.
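For comparison, here is roughly what that C function does, written back in Ruby. RubyAdd is a made-up name so it doesn't clash with the real Fastadd class. The rb_funcall and rb_intern pair corresponds loosely to send with a symbol; one difference is that the C version converts through a C int with NUM2INT, while this Ruby version accepts any Integer.

```ruby
# Hypothetical pure-Ruby counterpart of the C add_one, for comparison only.
class RubyAdd
  attr_reader :n

  def initialize(n)
    @n = n
  end

  # rb_funcall(self, rb_intern("n"), 0) corresponds roughly to send(:n):
  # turn the name into a symbol, then call that method with zero arguments.
  def add_one
    send("n".to_sym) + 1
  end
end

puts RubyAdd.new(3).add_one
```

A version like this is also handy as a reference implementation to check the extension's results against.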
Wrapping Up
Of course, this is just the beginning of what you can do with C extensions. Here are some extra exercises you can tackle to sharpen your skills:
- We cargo-culted the Redcarpet code and ended up defining #add_one as a method with a variable number of arguments. Correctly define it as a method that takes no arguments.
- Access to the @n instance variable is done via the reader method. Instead, access it directly.
- Investigate what extra work Redcarpet needs to do in its gemspec to include the C extension as part of the gem.
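As a hint for the last exercise, here is a sketch of a gemspec for a hypothetical fastadd gem. The name, version, author, and the ext/ and lib/ layout are all assumptions made for illustration, and this is not Redcarpet's actual gemspec. The part to look for is the extensions field, which RubyGems uses to find the extconf.rb to run at install time.

```ruby
require 'rubygems'

# Hypothetical gemspec for the Fastadd example; every value here is an assumption.
spec = Gem::Specification.new do |s|
  s.name    = 'fastadd'
  s.version = '0.0.1'
  s.summary = 'Adds one to a number, in C'
  s.authors = ['Example Author']
  s.files   = ['lib/fastadd.rb', 'ext/fastadd.c', 'ext/extconf.rb']

  # Listing extconf.rb here asks RubyGems to run it, then make, on install.
  s.extensions = ['ext/extconf.rb']
end

puts spec.extensions.inspect
```

Compare this with what you find in Redcarpet's own gemspec to see how a real gem fills in the same field.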
Let us know how you go in the comments. Join us next week for more exciting adventures in the code jungle.