Detecting Faces with Ruby: FFI in a Nutshell


In the last couple of years, I have become quite fond of Ruby. It was so refreshing to simply get things done without caring too much about typecasts or memory management. The magnificent expressiveness of the language, the ease of use of the whole Gem system, and a nice interactive shell to play around in really make you feel at home. And since you are reading this, chances are quite good that you already feel the same way.

But there are quite a few domains in which Ruby lags behind. Rails gave Ruby a lot of positive publicity (and some negative), but also kind of shaped the Ruby ecosystem, pushing it strongly towards web development. This can be seen as either a blessing or a curse.
On the one hand, Ruby really is awesome in the web domain. On the other hand, it is no serious competitor to languages like Python in domains like scientific programming or game development. I am quite positive that this will slowly change, but in the meantime there are some options for tapping into code written in other languages that you can choose from right now:

  • Swig to wrap C/C++ interfaces
  • RubyInline for inlining foreign code in your Ruby
  • C/C++ extension to extend Ruby
  • Spawn to call programs directly
  • FFI to load libraries and bind in Ruby. This is what I will cover in this article.

But first of all let me go through some pros and cons of FFI to help you decide if it is the right tool for your job.

Why you should consider using FFI

FFI has some benefits over writing extensions in C/C++ or calling programs in a subshell:

  • With FFI, you often do not even have to write C/C++ code. Calling methods that are already available in a well maintained library can be a real time saver.
  • FFI works pretty much the same in any Ruby environment, be it MRI, Rubinius or even JRuby, without modifications.
  • It is easier for your end users. They will not need development headers or a C/C++ compiler.

When FFI might not be the right tool

  • You are only developing a small script that needs to grep some output of a program on your system. By all means, simply spawn a subprocess and call that program directly as you would in a shell, then grep the output. Of course it is a bit hacky but when it saves you a lot of time and is not supposed to be distributed to thousands of users, why not? Be very careful with user submitted parameters though!
  • The library you are trying to wrap makes use of a lot of compile-time or preprocessor features. You may be able to implement a few of them in Ruby but sometimes this defeats the initial purpose.
  • If you need to write a lot of custom C/C++ code and every millisecond counts, you might want to go the extra mile and write a C/C++ extension, although creating your own library and interfacing with it via FFI might still be a better option. I will show you how to do that later on.
  • You have trouble using FFI with a fiddly, callback-heavy interface. Callbacks are supported by FFI but not always easy to implement.
  • Most FFI systems cannot see constants that were created using #define. You can redefine them in Ruby pretty easily, but it can be tedious work, and if they change in a newer version of the library you have to change your code as well.
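
For the first case in the list above, a minimal sketch of spawning a program and grepping its output might look like this. The helper name grep_output is made up for this illustration, and the child process is simply the current Ruby interpreter so that the example runs anywhere. Passing the command as separate arguments instead of one interpolated string keeps the shell out of the picture, which is the main defence against malicious user-submitted parameters.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical helper: runs a command without involving a shell and
# returns the lines of its standard output that match the pattern.
def grep_output(pattern, *command)
  # Each element of command reaches the program as exactly one argument,
  # so characters like ; or | are never interpreted by a shell.
  output, status = Open3.capture2(*command)
  raise "command failed: #{command.first}" unless status.success?
  output.lines.map(&:chomp).grep(pattern)
end

# The child process here is just Ruby printing three lines.
p grep_output(/\Ab/, RbConfig.ruby, '-e', 'puts %w[alpha beta gamma]')
# => ["beta"]

# A hostile looking parameter stays a single, harmless argument.
p grep_output(/./, RbConfig.ruby, '-e', 'puts ARGV', 'photo.jpg; echo hacked')
# => ["photo.jpg; echo hacked"]
```

In a real script you would replace the Ruby child process with the program you actually want to call.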

Basic Example

Nothing helps you more to get a feel for a technology than a few short but usable examples.

Maybe we would like to use a function from LIBC. One might want to retrieve the process ID of the current process.
If you are familiar with Ruby, you know that there is a handy global variable $$ that gives you exactly that. For the sake of this simple example, let’s take the detour and do it via FFI and LIBC’s “getpid” function:

require 'ffi'

module MyLibcWrapper
  extend FFI::Library
  ffi_lib FFI::Library::LIBC
  # getpid takes no arguments and returns an int
  attach_function :getpid, [], :int
end

puts MyLibcWrapper.getpid
puts $$

First of all, we have to require the FFI Gem (and maybe rubygems before that if you are running an old Ruby version). Then we create a module that extends the FFI::Library module. Now we can use the methods from this module to bind to the LIBC library and finally attach the getpid function to our module. Notice that the arguments to attach_function are a symbol (representing the name of the function inside the foreign library), then an array of argument types to that function (getpid takes no arguments), and finally the return type of that function (int).
Now the function can be called as a module-level method of our module. As you can hopefully see when you run that code (remember to gem install ffi), both calls return the same process ID.

Advanced Example

But how about a more complex example? There is a good chance that you will have to work with some kind of struct and pass pointers to memory addresses; after all, we are talking about C/C++ here. Maybe you need to find out what system your program is running on. Fortunately, there is another LIBC function, “uname”, which provides that kind of information, so let us wrap it:

require 'ffi'

module MyLibcWrapper
  extend FFI::Library
  ffi_lib FFI::Library::LIBC

  # define a FFI Struct to hold the data that we can retrieve
  class UTSName < FFI::Struct
    layout :sysname   , [:char, 65],
           :nodename  , [:char, 65],
           :release   , [:char, 65],
           :version   , [:char, 65],
           :machine   , [:char, 65],
           :domainname, [:char, 65]

    # convert the struct into a hash of field name => string
    def to_hash
      Hash[members.zip(values.map(&:to_s))]
    end
  end

  # takes a pointer, returns an integer
  attach_function :uname, [:pointer], :int

  def self.uname_info
    uts = UTSName.new # create a place in memory to hold the data
    raise 'uname info unavailable' if uname(uts) != 0 # retrieve data
    uts.to_hash
  end
end

puts MyLibcWrapper.uname_info
# => {:sysname=>"Linux", :nodename=>"picard", :release=>"3.0.0-32-generic", :version=>"#50-Ubuntu SMP Thu Feb 28 22:32:30 UTC 2013", :machine=>"x86_64", :domainname=>"(none)"}

I wrapped the call to the native function inside a Ruby method because, as Ruby developers, we are more used to hashes than structs, and I thought it would be nice to hide this conversion inside the wrapper module.

Simple Custom Example

By now you may remember that one cool C/C++ program you wrote ages ago that does this awesome thing in a split second and you really want to use it from within your newest Ruby web service to create the next disruptive killer app!

Let me walk you through the process of turning your own C/C++ program into a shared library.

Since this example is only meant to teach you how to turn your program into a shared library, I decided to keep it really simple. Let us assume that you want to calculate a factorial, so your program might look like this:

extern "C" unsigned long factorial(int n); //offer a C-compatible interface

unsigned long factorial(int n){
  unsigned long f = 1;
  for (int c=1; c<=n; c++) f *= c;
  return f;
}

Usually, you would have your method signatures in a separate header file, but for the sake of simplicity I skipped that. Also note that you do not need to create a main method because we are compiling a library.

Assuming you named your source file factorial.c, you can compile it into a shared library like this (libfactorial.so is used as the output name here):

g++ -shared -fPIC -Wall -o libfactorial.so factorial.c

If you like, you can still add a main method and compile the file twice: once with and once without the -shared flag. That is an easy way to check your library without the need to embed it somewhere.

Now for the Ruby part. We already know the pattern:

require 'ffi'

module MyAwesomeLib
  extend FFI::Library
  ffi_lib './libfactorial.so' # load library from the same folder (file name assumed)
  # this time we take an integer and return an unsigned long
  attach_function :factorial, [:int], :ulong
end

puts MyAwesomeLib.factorial(6)
# => 720

That was pretty easy, right? Now you have all the tools needed to outsource some parts of your Ruby application into a shared C/C++ library and benefit from the best parts of both languages.
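
One caveat worth knowing: factorials grow quickly, and the C function silently wraps around once the result no longer fits into an unsigned long. The following sketch, in plain Ruby and without the native library, finds the largest argument that is still safe. It assumes a 64-bit unsigned long, which holds on typical 64-bit Linux systems; the constant and method names are made up for this illustration.

```ruby
# Assumption: unsigned long is 64 bits wide (typical for 64-bit Linux).
ULONG_MAX = 2**64 - 1

# Pure Ruby reference implementation; Ruby integers grow as needed
# and therefore never overflow.
def reference_factorial(n)
  (1..n).reduce(1, :*)
end

# Largest n whose factorial still fits into an unsigned long:
# the first n for which the next factorial exceeds the limit.
largest_safe_n = (1..100).find { |n| reference_factorial(n + 1) > ULONG_MAX }

puts reference_factorial(6) # => 720
puts largest_safe_n         # => 20
```

A wrapper method around the native call could use such a limit to raise an error for larger arguments instead of returning a wrapped value.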

Detecting Faces in Milliseconds

Now that you know the basics of writing your own libraries and using them from Ruby, I want to deliver on the promise I made in the title. But why face detection? On the one hand, face detection is an interesting topic and there are tons of applications that make use of it, probably without you even knowing it. They range from the portrait wizard on your digital camera or smartphone, to tracking programs running behind security cameras, to augmented reality applications running in your browser.

On the other hand, I was working on a very similar problem recently. I had to detect certain patterns in uploaded images inside a Rails app, but since face detection is more accessible to a broader audience, I chose this as an example.

Unfortunately, face detection can be a very complex problem. Although a pure Ruby implementation may even be feasible with acceptable speed on newer Ruby versions (in some non-realtime setups), there is no doubt that the task becomes more of a walk in the park with the help of OpenCV, one of the most widely used computer vision libraries. So let’s get to it.

You will need to install OpenCV though, and there are packages and instructions for nearly every platform. Additionally, you will need the file lbpcascade_frontalface.xml. This file is part of the Debian package opencv-doc and can be found in compressed form at /usr/share/doc/opencv-doc/examples/lbpcascades/lbpcascade_frontalface.xml.gz (on Ubuntu), or with Google.

Once you have acquired those necessities, you can create your source file faces.cpp:

#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/objdetect/objdetect.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
using namespace std;
using namespace cv;
char buf[4096]; //holds the result string that is returned to the caller
extern "C" char* detect_faces(char* input_file, char* output_file);

int main(int argc, char** argv) {
  if(argc < 2){
    fprintf(stderr, "usage:\n%s <image>\n%s <image> <outimg>\n", argv[0], argv[0]);
    return -1;
  }
  printf("%s", detect_faces(argv[1], argc<3 ? NULL : argv[2]));
  return 0;
}

char* detect_faces(char* input_file, char* output_file) {
  buf[0] = '\0'; //start with an empty result on every call
  CascadeClassifier cascade;
  if(!cascade.load("lbpcascade_frontalface.xml")) exit(-2); //load classifier cascade
  Mat imgbw, image = imread((string)input_file); //read image
  if(image.empty()) exit(-3);
  cvtColor(image, imgbw, CV_BGR2GRAY); //create a grayscale copy
  equalizeHist(imgbw, imgbw); //apply histogram equalization
  vector<Rect> faces;
  cascade.detectMultiScale(imgbw, faces, 1.2, 2); //detect faces
  for(unsigned int i = 0; i < faces.size(); i++){
    Rect f = faces[i];
    //draw rectangles on the image where faces were detected
    rectangle(image, Point(f.x, f.y), Point(f.x + f.width, f.y + f.height), Scalar(255, 0, 0), 4, 8);
    //fill buffer with easy to parse face representation, never writing past its end
    size_t used = strlen(buf);
    snprintf(buf + used, sizeof(buf) - used, "%i;%i;%i;%i\n", f.x, f.y, f.width, f.height);
  }
  if(output_file) imwrite((string)output_file, image); //write output image
  return buf;
}

This time I provided a main method so you can play with the program directly without FFI as well. I decided to use a simple char buffer and fill it with strings representing the detected faces. Each line represents one detected face in the form of “x;y;width;height”. In my opinion this method is very versatile and allows you to return nearly anything you want without having to bother with the pleasures of dynamically sized multidimensional arrays from C/C++ functions.

Of course you will have to parse the returned string and it probably takes a few milliseconds more, but this little overhead is very acceptable. If you feel it matters to you, feel free to study the FFI project's GitHub repository and read through the additional examples. You will even find examples on how to handle callbacks and other more advanced topics.

But for now, let’s compile! You might want to create a small makefile or build script:

LIBS="-lopencv_imgproc -lopencv_highgui -lopencv_core -lopencv_objdetect"
# compile to an object file (faces.o)
g++ -I/usr/local/include/opencv -I/usr/local/include/opencv2 -L/usr/lib -L/usr/local/lib -fpic -Wall -c "faces.cpp" $LIBS
# create shared library (named libfaces.so here)
g++ -shared -I/usr/local/include/opencv -I/usr/local/include/opencv2 -o libfaces.so faces.o -L/usr/local/lib $LIBS
# create executable (in case you want to play with it directly)
g++ -I/usr/local/include/opencv -I/usr/local/include/opencv2 -o faces faces.o -L/usr/local/lib $LIBS

Once compiled, we can run it with ./faces image_with_faces.jpg detected_faces.jpg and hopefully it detects some faces for us.

Finally, the last step is to wrap it up in some Ruby code:

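require 'ffi'

module Faces
  extend FFI::Library
  # load the shared library from the folder this file lives in
  # (libfaces.so is the file name assumed for the compiled library)
  ffi_lib File.join(File.expand_path(File.dirname(__FILE__)), 'libfaces.so')
  attach_function :detect_faces, [:string, :string], :string

  # returns one hash per detected face
  def self.faces_in(image)
    keys = [:x,:y,:width,:height]
    detect_faces(image, nil).split("\n").map do |e|
      vals = e.split(';').map(&:to_i)
      Hash[keys.zip(vals)]
    end
  end
end

p Faces.faces_in('test.jpg')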

Now you can detect faces from your Ruby scripts, isn’t that cool?
For instance, you could now write a custom Rails validator that makes sure that your users’ profile images contain exactly one face and instructs them to upload a more genuine profile image upon validation failure.
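
Outside of Rails, the core of such a check could be sketched as follows. The class name SingleFaceCheck and its interface are hypothetical, and the detector is injected as any callable that returns one hash per face, so the stub below merely stands in for the native face detection.

```ruby
# Hypothetical check: an image is acceptable if exactly one face is found.
class SingleFaceCheck
  # detector: any callable that takes an image path and returns an array
  # with one hash per detected face.
  def initialize(detector)
    @detector = detector
  end

  # Returns an array of error messages; an empty array means valid.
  def errors_for(image_path)
    count = @detector.call(image_path).size
    return [] if count == 1
    ["must contain exactly one face (found #{count})"]
  end
end

# Stub detector with canned results, used instead of the native library.
canned = {
  'portrait.jpg'  => [{ x: 10, y: 20, width: 64, height: 64 }],
  'landscape.jpg' => []
}
check = SingleFaceCheck.new(->(path) { canned.fetch(path) })

p check.errors_for('portrait.jpg')  # => []
p check.errors_for('landscape.jpg') # => ["must contain exactly one face (found 0)"]
```

Injecting the detector keeps the check testable without OpenCV installed; a validator inside a Rails app could delegate to such an object and copy the messages into the model's errors.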

You could also study some more OpenCV and go from simple face detection to a more advanced topic like face recognition. Then you could write a script that sorts your vacation photos by the people in them or authenticates users via their webcam (please don’t do that!). Whatever is on your mind, I highly recommend that you go explore and build something new, and when you do, please let me know.

I sincerely hope I could give you some insights and inspire you to reach beyond the borders of Ruby alone for your next cool project! If you liked this article, feel free to tell your friends, blog, tweet and spread the word :)


  • Tom van Leeuwen

    Thanks for the interesting article Marc!

  • Mike Stephenson

    Thanks for the insightful article. How does one reach you to chat further?

    • Marc Berszick

      Thank you Mike. I just emailed you.