Ruby + Arduino + LEGO® = RuBot

By Marc Berszick

A while back I had a long weekend to kill and was in the mood for some just-for-fun hacking. The weather was predicted to be awful and I remembered that I always wanted to build a robot. Of course, I had built simple line follower robots and the like years ago, but I was aiming higher this time. My idea was to build a Linux controlled robot with a Ruby brain and the capability to provide the operator with a live video stream. Well, at least this is my first milestone. An artificial intelligence module and a doomsday weapon need to stay in the backlog for now.

At first I thought about getting some LEGO® Mindstorms for the task, but then I quickly remembered a few things:

  1. LEGO® Mindstorms are still pretty expensive, even though they are a cool toy.
  2. The NXT is programmed in a weird LabVIEW-esque language.
  3. I have an unused Arduino and a couple of LEGO® Power Functions Motors along with some generic LEGO® Technic. That should do the trick.


I did what every sane tinkerer would do and ordered a Motor Driver Shield from Adafruit. The specs of this nice little shield convinced me to just get it as a finished and tested product and not to start soldering L293D chips myself. After all, I am mostly a software developer, not an electronics engineer.

While I waited for the delivery I started to play around with some LEGO® blocks to build a chassis strong enough to carry a small laptop and the Arduino around. I used my trusty old Asus EeePC 701 netbook that I keep around mainly for hacking purposes and as a backup PC. This quirky little fellow would sit on top and do all the heavy lifting CPU-wise: running Ruby, providing live video data (640×480 pixels) and handling the Wi-Fi connection. Even though it weighs in at only 922g (~2 pounds) and measures 225×165mm (~8.86×6.5 inches), it is not a trivial task to build something out of LEGO® that is strong enough to hold it without looking like a giant block. It is kind of funny how similar elegant programming is to elegant mechanical engineering. The basic principles of “DRY” or “YAGNI” and the like can all be applied, along with the high-level essence of “less is more”.

In order to become alive and kicking, the robot's brain would talk via USB to the Arduino, which would be equipped with Adafruit's Motor Driver Shield. Connected to the Motor Driver Shield would be the LEGO® Power Functions Motors.

The finished robot


  1. Arduino equipped with Adafruit Motor Driver Shield.
  2. Two LEGO® Power Functions M-Motors for driving.
  3. Slightly modified LEGO® Power Functions Battery Box connected to the Motor Driver Shield used as external power supply (the power from the USB port alone is not enough to drive the 9V motors).
  4. LEGO® Power Functions M-Motor for turning the gripper.
  5. Tiny red LEGO® Micromotor for opening and closing the gripper.
  6. Caster wheel able to turn 360°.
  7. Modified LEGO® Power Functions wires connected to the Motor Driver Shield, providing connectors for the motors.
  8. USB connection to the Arduino.
  9. Camera (640×480 pixel, 30 fps).

No soldering is required at all, although I did a little soldering to create a smaller, thinner USB cable for a clutter-free appearance.


Knowing that I would not need a fancy frontend and that the application would turn out pretty small in general, I decided to use Sinatra as the base framework, accompanied by a small set of Gems and libraries.

I have been looking for a good reason to play around with websockets for quite some time now, and finally took the opportunity to use one to reduce the delay of the robot controls. I can tell you in cases like this, a websocket connection really gives you just the extra snappiness you need when compared to Ajax calls. All the magic is done by the wonderful Sinatra-Websocket Gem and all you have to do in Sinatra is something like this:

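get '/' do
  if request.websocket?
    request.websocket do |ws|
      ws.onopen do
        ws.send("CONNECTED #{}")
        settings.websockets << ws
      end
      ws.onmessage do |msg|
        handle_commands(msg) # pass the JSON command on to the motors
        EM.next_tick { settings.websockets.each{|s| s.send(msg) } }
      end
      ws.onclose do
        settings.websockets.delete(ws) # forget the closed connection
      end
    end
  else
    haml :index
  end
end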

Here we mount the websocket right under “/”, handling both a websocket request and a normal GET for the index page. The three callback functions onopen, onmessage and onclose are used to handle all the events dealing with websockets.

The handle_commands method takes the websocket message, which is a JSON string in this case, and proxies the valid commands to the Arduino via the Ruby-Serialport Gem. I wrote a thin wrapper that makes it a little easier to register several motors and associate them with a name and port (on the Motor Driver Shield) respectively. It also lets you define which direction is “forward” or “backward” for each motor in your model.

$arduino =
$motordriver =
  { :left    =>, $arduino, { :forward => Motor::BACKWARD, :backward => Motor::FORWARD }),
    :right   =>, $arduino, { :forward => Motor::BACKWARD, :backward => Motor::FORWARD }),
    :gripper =>, $arduino, { :forward => Motor::FORWARD, :backward => Motor::BACKWARD }),
    :rotator =>, $arduino, { :forward => Motor::FORWARD, :backward => Motor::BACKWARD }),

def handle_commands(params={})
  # Accept either an already parsed Hash or the raw JSON string from the websocket.
  params = (JSON.parse(params) rescue {}) unless params.is_a?(Hash)
  # Assumption: speech runs in a background thread so the motor commands are not delayed.
  Thread.new{`espeak '#{params['speak'].tr("'",'')}' 2> /dev/null`} if params['speak']
  $motordriver.left(*params['left'])       if params['left']
  $motordriver.right(*params['right'])     if params['right']
  $motordriver.gripper(*params['gripper']) if params['gripper']
  $motordriver.rotator(*params['rotator']) if params['rotator']
rescue
  p $!
end
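
The speech line in handle_commands builds a shell command by string interpolation and strips single quotes to keep it safe. As a hedged alternative, here is a minimal sketch that hands the text to eSpeak as a separate argument, so no shell is involved at all. The helper names espeak_argv and speak are made up for this illustration and are not part of the robot's code.

```ruby
# Builds the argument list for eSpeak. Passing the text as its own
# argument means it is never interpreted by a shell.
def espeak_argv(text)
  ['espeak', text.to_s]
end

# Starts eSpeak in the background and returns immediately.
def speak(text)
  pid = Process.spawn(*espeak_argv(text), err: File::NULL)
  Process.detach(pid) # reap the child process once it exits, without blocking
rescue Errno::ENOENT
  warn 'espeak is not installed'
end

p espeak_argv("it's alive") # => ["espeak", "it's alive"]
```

With this variant the apostrophe survives and is spoken as part of the text instead of being removed.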

Noticed the line that utilizes eSpeak? It is a primitively implemented but incredibly fun gimmick that adds voice output to the robot! The other parameters are for the actual controls and are handled by the $motordriver. It simply translates all the motor commands into 3-byte messages to the Arduino, conforming to the very simple binary protocol that I created. The first byte defines the motor (1-4), the second byte defines the direction (forward = 1, backward = 2, brake = 3, release = 4) and the third byte defines the speed (0-255). The complete Arduino sketch (a sketch is a program in Arduino-speak) looks like this:

#include <AFMotor.h> // Adafruit Motor shield library

AF_DCMotor motors[] = {AF_DCMotor(1), AF_DCMotor(2), AF_DCMotor(3), AF_DCMotor(4)};
int i = 0;

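void setup() {
  Serial.begin(9600); // baud rate is an assumption; it must match the Ruby side
  Serial.println("Motor test!\n");
  for(i=0;i<4;i++){ motors[i].run(RELEASE); }
}

void loop() {
  uint8_t motor;
  uint8_t direction;
  uint8_t speed;
  // Wait until a complete 3-byte message (motor, direction, speed) has arrived.
  while (Serial.available() > 2) {
    motor     = Serial.read();
    direction = Serial.read();
    speed     = Serial.read();
    if(motor > 0 && motor < 5) {
      // Unknown direction codes fall back to RELEASE (4).
      if(direction < 1 || direction > 4){ direction = 4; }
      motors[motor-1].setSpeed(speed);
      motors[motor-1].run(direction);
    }
  }
}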
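
To make the Ruby side of the protocol a bit more tangible, here is a minimal sketch of how a motor wrapper could map logical directions to physical ones and pack the three bytes. SketchMotor is a hypothetical stand-in written for this illustration, not the wrapper from the repository, and a StringIO buffer takes the place of the real serial port.

```ruby
require 'stringio'

# Hypothetical stand-in for a thin motor wrapper.
class SketchMotor
  # Direction codes from the protocol description above.
  FORWARD, BACKWARD, BRAKE, RELEASE = 1, 2, 3, 4

  # port:          motor terminal on the shield (1-4)
  # io:            anything that responds to #write, e.g. an open serial port
  # direction_map: overrides what "forward" and "backward" mean for this motor
  def initialize(port, io, direction_map = {})
    raise ArgumentError, 'port must be 1-4' unless (1..4).cover?(port)
    @port = port
    @io = io
    @direction_map = { forward: FORWARD, backward: BACKWARD,
                       brake: BRAKE, release: RELEASE }.merge(direction_map)
  end

  # Sends one 3-byte message: motor, direction, speed.
  def run(direction, speed = 255)
    code  = @direction_map.fetch(direction.to_sym)
    speed = [[speed.to_i, 0].max, 255].min # clamp to a single byte
    @io.write([@port, code, speed].pack('C3'))
  end
end

# Usage with an in-memory binary buffer instead of a real serial port.
# This motor is mounted mirrored, so "forward" is sent as BACKWARD (2).
port = StringIO.new(String.new)
left = SketchMotor.new(1, port, { forward: SketchMotor::BACKWARD,
                                  backward: SketchMotor::FORWARD })
left.run('forward', 200)
p port.string.bytes # => [1, 2, 200]
```

Keeping the direction mapping inside each motor object means the rest of the application can always say "forward" without knowing how a motor is mounted in the model.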

For the video connection, I wanted to use a real audio/video stream with some fancy codec like H.263 but gave up after several attempts with VLC, FFmpeg and MPlayer as the streamer. I just could not get the latency below 3-4 seconds. I know this does not sound like a lot, but when you are operating a remote robot that executes your commands nearly instantly but delivers the images with such a delay, you’re going to have a bad time and crash into objects a lot. I decided to try simple Motion JPEG streaming, which is basically just sending JPEG images to the browser without closing the connection. Somehow, despite the much higher bandwidth usage, the delay shrank to half a second or so. My theory is that the Asus EeePC 701 is just a little too slow for fancier codecs. But still, if any of you have had success with “real” low-latency streaming besides Red5 (since I’m not looking for Flash solutions), please feel free to leave a comment and point me in the right direction.

My code to handle the M-JPEG streaming is a bit more cluttered but after all it is a hack anyways ;)

get '/mjpgstream' do
  fps = (params['fps'] || 10).to_i
  ms = (1000 / fps.to_f).to_i
  headers('Cache-Control' => 'no-cache, private', 'Pragma' => 'no-cache',
          'Content-type'  => 'multipart/x-mixed-replace; boundary={{{NEXT}}}')
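  stream(:keep_open) do |out|
    # Start the capture process on demand; all clients share it.
    if !$mjpg_stream || $mjpg_stream.closed?
      puts "starting mjpg stream with #{fps} fps. #{ms} ms between frames."
      $mjpg_stream = IO.popen "./uvccapture/uvccapture -oSTDOUT -m -t#{ms} -DMJPG -x640 -y480 2> /dev/null"
    end
    settings.video_connections << out
    out.callback {
      # Client disconnected: stop capturing once nobody is watching.
      settings.video_connections.delete(out)
      if settings.video_connections.empty?
        puts 'closing mjpg stream'
        $mjpg_stream.close
      end
    }
    # Skip ahead to the next frame boundary so the client starts on a complete image.
    # Assumption: read(1) and readpartial stand in for the read calls on $mjpg_stream.
    buffer = ''
    buffer << $mjpg_stream.read(1) while !buffer.end_with?("{{{NEXT}}}\n")
    while !$mjpg_stream.closed?
      out << $mjpg_stream.readpartial(4096)
    end
  end
end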

Basically I modified/extended the Uvccapture program to …

  • … allow “STDOUT” for the -o option
  • … use milliseconds instead of seconds for the -t option
  • … have a -D option to specify the delimiter or use "\n\n--{{{NEXT}}}\n\nContent-type: image/jpeg\n\n" when -D is “MJPG”

and used it with an IO.popen call to get an easy-to-handle M-JPEG stream that can be transported to the browser with Sinatra’s built-in stream function (this requires Thin as the web server, as far as I know). The Uvccapture stream is buffered and reused in case more than one client requests the resource, and it is opened and closed on demand. I am aware that this is probably not the most elegant solution, but it worked well enough for me in this case.
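
To illustrate what actually travels over the wire, here is a minimal sketch that wraps a single JPEG image as one part of a multipart/x-mixed-replace response. The helper mjpeg_part is made up for this example. It uses the CRLF line endings and Content-Length header common in MIME multipart bodies, whereas my modified Uvccapture emits its own delimiter with plain newlines.

```ruby
BOUNDARY = '{{{NEXT}}}' # same boundary string as in the Content-type header

# Wraps one JPEG image as a part of a multipart/x-mixed-replace body.
def mjpeg_part(jpeg_bytes, boundary = BOUNDARY)
  data = jpeg_bytes.b # binary copy, so sizes and concatenation are encoding-safe
  header = "--#{boundary}\r\n" \
           "Content-Type: image/jpeg\r\n" \
           "Content-Length: #{data.bytesize}\r\n\r\n"
  header.b + data + "\r\n".b
end

# Usage with a fake "image" that consists only of the JPEG start and end markers:
fake_jpeg = "\xFF\xD8\xFF\xD9".b
part = mjpeg_part(fake_jpeg)
puts part.lines.first(3).map(&:inspect)
```

Inside a Sinatra stream block, each captured frame could then be sent with out << mjpeg_part(frame), and the browser replaces the previous image whenever the next part arrives.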

The web frontend for the robot is pretty simple as promised and lingers within the main file itself in the form of two tiny Haml templates:

@@ layout
!!! 5
%html{:lang => 'en'}
  %meta{:charset => 'utf-8'}
  %link{:rel=>'shortcut icon', :type=>'image/gif', :href=>'favicon.gif'}
  %title RuBot
  %body{:style=>'background-color:black; text-align:center;'}
    = yield(:layout)
    %script{:src => 'jquery.js'}
    %script{:src => 'app.js'}

@@ index
%img{:src=>'/mjpgstream'}
Use arrow keys to drive,
q/w to open/close gripper,
a/s to turn gripper left/right,
hold shift for slow-mode,
space to speak

All the frontend magic happens within “app.js”, where the robot controls are simply mapped to keyboard events. This is a bit unfortunate in hindsight because it makes it much harder to control the robot via a mobile phone’s browser, but I hope you will forgive me. The other notable thing within these few lines of Haml code is the line %img{:src=>'/mjpgstream'}. That is all it takes to display the M-JPEG stream. The content-type multipart/x-mixed-replace; boundary={{{NEXT}}} is enough for the browser to know what to do, and it handles the replacement of the images within the image tag by itself. Pretty cool if you ask me.

You can find all the code including the modified Uvccapture version in this repository.

As always I hope you enjoyed this little project and if so, feel free to tell your friends :)

