“Ruby is designed to make programmers happy.” – Yukihiro “Matz” Matsumoto
Not everyone might agree, but as a Rubyist I think Matz achieved his design goal. There’s something intangible about Ruby’s syntax that makes it fun, rewarding and easy to use – something that makes me happy. I thought it would be fun to compare Ruby with a few other languages by looking at how different open source developers implemented the same method or function in each language. How do the languages differ? Do they make you equally happy?
And what better example to look at than inside of Ruby itself! Today I’m going to look at how Ruby’s Hash#fetch method is implemented in Ruby (by Rubinius), Python (by Topaz), Java (by JRuby) and finally in C (by standard Ruby 2.0). Of course, there are many other programming languages out there, even other versions of Ruby, but looking at a small slice of Ruby internals gives us an interesting example and allows us to compare apples with apples.
For those of you not familiar with Ruby, or with the Hash#fetch method, let’s first review what Hash#fetch does. Fetch allows you to look up a value in a hash using a key, just like the [] method does. In addition, fetch allows you to specify a default value Ruby should return if it can’t find the requested key. Here’s the example from the Ruby documentation:
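In case it helps, here is a minimal sketch in the spirit of that example. The hash h and its values are illustrative assumptions, and the KeyError in the last step is what Ruby 1.9 and later raise for a missing key.

```ruby
h = { "a" => 100, "b" => 200 }

# Key present: fetch behaves just like []
found = h.fetch("a")                 # => 100

# Key missing, default supplied: the default comes back
fallback = h.fetch("z", "go fish")   # => "go fish"

# Key missing, no default: fetch raises, whereas h["z"] would return nil
begin
  h.fetch("z")
rescue KeyError => e
  missing = e.class                  # => KeyError
end
```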
Also, if you provide a block Ruby will call it to get a return value if it can’t find the requested key:
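A small sketch of the block form, again using an assumed two-key hash named h. The block receives the missing key, and whatever it returns becomes the result of fetch.

```ruby
h = { "a" => 100, "b" => 200 }

# Key missing: the block is called with the key and supplies the value
from_block = h.fetch("z") { |key| "go fish, #{key}" }   # => "go fish, z"

# Key present: the block is never called
untouched = h.fetch("a") { |key| raise "not reached" }  # => 100
```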
Hash#fetch in Rubinius
Let’s start our survey by looking at how Rubinius implements Hash#fetch. Since Rubinius uses Ruby to implement its kernel, reading the Rubinius source code is a great way to understand exactly what a given Ruby method does.
Here’s the Rubinius implementation of Hash#fetch from kernel/common/hash19.rb:
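To give a feel for the shape of such a method, here is a simplified sketch of how fetch could be written in plain Ruby. This is not the Rubinius source: the class TinyHash, its @store variable and the NO_DEFAULT sentinel are all hypothetical names I made up for illustration, and the sketch simply delegates storage to a built-in Hash.

```ruby
# Hypothetical stand-in for a hash class; NOT the Rubinius implementation.
class TinyHash
  # Sentinel object so that nil can still be passed as an explicit default
  NO_DEFAULT = Object.new

  def initialize(pairs = {})
    @store = pairs.dup   # assumption: storage is delegated to a built-in Hash
  end

  def fetch(key, default = NO_DEFAULT)
    return @store[key] if @store.key?(key)              # 1. key found
    return yield(key) if block_given?                   # 2. a block wins over a default
    return default unless default.equal?(NO_DEFAULT)    # 3. explicit default value
    raise KeyError, "key not found: #{key.inspect}"     # 4. nothing to fall back on
  end
end

tiny = TinyHash.new({ "a" => 100 })
tiny.fetch("a")                      # => 100
tiny.fetch("z", "go fish")           # => "go fish"
tiny.fetch("z") { |k| "no #{k}" }    # => "no z"
```

The four lines in the body of fetch read top to bottom in the same order you would explain the method out loud.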
As you can see here, there’s a lot to criticize in Ruby’s syntax. And certainly this isn’t the most elegant example of Ruby code in the world.
But I like it. It makes me happy. Why? Because it’s simple to understand what this code does. The Ruby language doesn’t get in the way of the meaning of the code, of what the code is trying to achieve.
Reading Ruby is almost like reading pseudocode – you know, the sort of code you write on a whiteboard while trying to explain some idea or algorithm. Why do you use pseudocode? Because you don’t have time to fuss with language syntax. You just want to get your point across. For me, writing Ruby is just like writing pseudocode.
Hash#fetch in Topaz
One of the more exciting events this year in the Ruby community was the news that Alex Gaynor had implemented Ruby using Python and the PyPy toolkit. PyPy allows Python developers to implement a compiler and virtual machine for their own language using a subset of Python called “RPython.” PyPy converts the developer’s custom VM into C using a sophisticated series of optimizations, and later compiles the C into fast, native machine language.
Let’s see how the same code looks in Python:
To me Python really seems like a sister language to Ruby. Aside from some minor differences around whitespace, colons and end keywords, the two languages are quite similar.
One important difference between the two is that Python doesn’t support closures in the same elegant way that Ruby does with blocks. The block parameter and invoke_block call here are part of the implementation of Ruby blocks for Topaz.
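For comparison, this is roughly what that elegance looks like on the Ruby side. In this sketch (the names h and misses are illustrative), the block passed to fetch is a closure: it can read and modify a local variable from the surrounding scope.

```ruby
h = { "a" => 100 }
misses = []   # local variable captured by the block below

value = h.fetch("z") do |key|
  misses << key   # the block modifies the enclosing scope's variable
  0               # and its last expression becomes fetch's return value
end

# value  => 0
# misses => ["z"]
```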
It’s almost as easy to read this code as it was to read the Rubinius implementation. Any additional code or verbosity here is just due to the details of the way Topaz works internally. I think I could become happy using Python if I spent some time learning to use the language – at least if my editor handled whitespace properly!
Hash#fetch in JRuby
The similarity between Ruby and Python becomes more obvious when you compare either of them with Java. To see what I mean, take a look at the JRuby implementation of Hash#fetch:
The same code is just a bit more verbose in Java than it was in Python or Ruby. Trying to read and understand this code takes a bit more effort than it did before. First you have to train your eyes not to see semicolons, parentheses or braces.
More importantly, the way Java works begins to get in the way of the algorithm. The clearest example of this is the use of an interface IRubyObject to handle the method’s arguments and return value. Since Java is a statically typed language, you have to worry about exactly what type of value each object is.
Depending on how you look at them, Java interfaces are either an awkward workaround for this problem, or an elegant way to ensure each object provides the API you require, allowing the Java compiler to give you helpful error messages. But for me, they are unnecessary typing and thinking. I’m starting to forget what I was trying to achieve and worrying more about how the Java compiler works. The Java compiler should be making me happy, not the other way around!
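By contrast, here is a small Ruby sketch of the same method used with several kinds of objects at once. The hash contents are made up for illustration. Nothing declares what type a key, a value or a default has to be.

```ruby
# Keys of three different classes live in one hash, with no declarations
h = { 1 => "integer key", :sym => "symbol key", [1, 2] => "array key" }

by_array  = h.fetch([1, 2])        # => "array key"
by_symbol = h.fetch(:sym)          # => "symbol key"

# The default can be yet another type
a_float   = h.fetch("nope", 3.14)  # => 3.14
```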
Hash#fetch in C
Now let’s take a look at the official version of Hash#fetch. This is the C code that your Ruby program actually uses if you’re running Ruby 2.0:
Now the level of verbosity jumps even more. In C, not only do we have the same static types we saw in Java, but now I have to worry about pointers, memory management, and low-level compiler details such as the “volatile” keyword.
If you’re familiar with the idiomatic style of Ruby’s source code, with constructs like “RHASH,” “st_lookup” and “rb_scan_args,” then it’s not that hard to follow this. But there are still very confusing details here, such as the use of rb_protect to handle exceptions that might occur while generating the “key not found” error message.
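The idea behind that protection is easier to see in Ruby. The sketch below is a rough analogue, not a translation of the C source: the helper names describe_key and key_not_found_message and the Grumpy class are hypothetical, and I rescue only StandardError here. If calling inspect on the key raises, the helper falls back to the plain Object#to_s description so that building the error message cannot itself fail.

```ruby
# A key whose inspect method misbehaves (hypothetical, for illustration)
class Grumpy
  def inspect
    raise "no inspecting me"
  end
end

# Describe a key for an error message without letting inspect blow up
def describe_key(key)
  key.inspect
rescue StandardError
  # Fall back to the default Object#to_s, which ignores any overrides
  Object.instance_method(:to_s).bind(key).call
end

def key_not_found_message(key)
  "key not found: #{describe_key(key)}"
end

key_not_found_message("z")          # => "key not found: \"z\""
key_not_found_message(Grumpy.new)   # starts with "key not found: #<Grumpy"
```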
Writing C code like this, I have to be completely aware of how each piece of data is represented by my computer’s hardware. How many bytes does it use? Does this API expect a value or a pointer? Do I need to free the memory this pointer references?
Of course, all of these languages are very different, intended for different purposes. C is really shorthand for writing assembly language, and, if used well, gives you tremendous control, flexibility and speed. Java allows you to write elegant, clean, and object-oriented code without worrying about hardware details or portability issues. And it still runs quite fast on the JVM. JRuby, in fact, is one of the fastest versions of Ruby around, faster than MRI Ruby in some cases.
My reason for comparing these languages to each other is to remind you of how happy you are writing Ruby. Don’t take it for granted! Matz and the Ruby core team have done a tremendous amount of work for the past 20 years to bring you programmer happiness.
And why have other talented open source developers worked so hard to reimplement Ruby using different languages, such as Java, Python or Ruby itself? Because they love Ruby, too. They want to bring that happiness to their own platform or technology.