Ruby, Python, Java, C and Programmer Happiness

 

“Ruby is designed to make programmers happy.” – Yukihiro “Matz” Matsumoto

Not everyone might agree, but as a Rubyist I think Matz achieved his design goal. There’s something intangible about Ruby’s syntax that makes it fun, rewarding and easy to use – something that makes me happy. I thought it would be fun to compare Ruby with a few other languages by looking at how different open source developers implemented the same method or function in each language. How do the languages differ? Do they make you equally happy?

And what better example to look at than inside of Ruby itself! Today I’m going to look at how Ruby’s Hash#fetch method is implemented in Ruby (by Rubinius), Python (by Topaz), Java (by JRuby) and finally in C (by standard Ruby 2.0). Of course, there are many other programming languages out there, even other versions of Ruby, but looking at a small slice of Ruby internals gives us an interesting example and allows us to compare apples with apples.

Hash#fetch

For those of you not familiar with Ruby, or with the Hash#fetch method, let’s first review what Hash#fetch does. Fetch allows you to look up a value in a hash using a key, just like the [] method does. In addition, fetch allows you to specify a default value that Ruby should return if it can’t find the requested key. Here’s the example from the Ruby documentation:

[Image: fetch1]
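In text form, the lookup-with-default behavior looks roughly like this. The hash contents and the "go fish" string are just sample data in the spirit of the Ruby documentation's example, and the rescue at the end is my own addition to show what happens when no default is given.

```ruby
h = { "a" => 100, "b" => 200 }

# The key exists, so fetch behaves like h["a"].
puts h.fetch("a")             # prints 100

# The key is missing, so fetch returns the supplied default instead.
puts h.fetch("z", "go fish")  # prints go fish

# With no default (and no block), a missing key raises KeyError.
begin
  h.fetch("z")
rescue KeyError => e
  puts "KeyError: #{e.message}"
end
```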

Also, if you provide a block Ruby will call it to get a return value if it can’t find the requested key:

[Image: fetch2]
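Here's a small text version of the block form, again with made-up sample data. The block receives the missing key, and whatever it returns becomes the result of fetch.

```ruby
h = { "a" => 100, "b" => 200 }

# The key is missing, so Ruby calls the block with the key.
missing = h.fetch("z") { |key| "go fish, #{key}" }
puts missing  # prints go fish, z

# The key is present, so the block is never called.
present = h.fetch("a") { |key| raise "never reached for #{key}" }
puts present  # prints 100
```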

Hash#fetch in Rubinius

Let’s start our survey by looking at how Rubinius implements Hash#fetch. Since Rubinius uses Ruby to implement its kernel, reading the Rubinius source code is a great way to understand exactly what a given Ruby method does.

Here’s the Rubinius implementation of Hash#fetch from kernel/common/hash19.rb:

[Image: rbx]

As you can see here, there’s a lot to criticize in Ruby’s syntax. And certainly this isn’t the most elegant example of Ruby code in the world.

But I like it. It makes me happy. Why? Because it’s simple to understand what this code does. The Ruby language doesn’t get in the way of the meaning of the code, of what the code is trying to achieve.

Reading Ruby is almost like reading pseudocode – you know, the sort of code you write on a whiteboard while trying to explain some idea or algorithm. Why do you use pseudocode? Because you don’t have time to fuss with language syntax. You just want to get your point across. For me, writing Ruby is just like writing pseudocode.
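To make the pseudocode point concrete, here's a minimal sketch of the logic a fetch method has to express, written in plain Ruby. This is not the Rubinius source: SketchHash, NOT_GIVEN and the wrapped @store hash are hypothetical names I made up for illustration, and the sketch leans on a built-in Hash for storage instead of implementing one.

```ruby
# Hypothetical stand-in for illustration only; not the Rubinius source.
class SketchHash
  # Sentinel object, so that nil can still be passed as a real default.
  NOT_GIVEN = Object.new

  def initialize(pairs = {})
    @store = pairs.dup
  end

  def fetch(key, default = NOT_GIVEN)
    return @store[key] if @store.key?(key)           # found: return the value
    return yield(key) if block_given?                # missing: ask the block
    return default unless default.equal?(NOT_GIVEN)  # missing: use the default
    raise KeyError, "key not found: #{key.inspect}"  # nothing else to try
  end
end

h = SketchHash.new({ "a" => 100, "b" => 200 })
puts h.fetch("a")             # prints 100
puts h.fetch("z", "go fish")  # prints go fish
```

Read top to bottom, the method body is close to the sentence you would say out loud: return the value if it's there, otherwise try the block, otherwise try the default, otherwise raise.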

Hash#fetch in Topaz

One of the more exciting events this year in the Ruby community was the news that Alex Gaynor had implemented Ruby using Python and the PyPy toolkit. PyPy allows Python developers to implement a compiler and virtual machine for their own language using a subset of Python called “RPython.” PyPy converts the developer’s custom VM into C using a sophisticated series of optimizations, and later compiles the C into fast, native machine language.

Let’s see how the same code looks in Python:

[Image: topaz]

To me Python really seems like a sister language to Ruby. Aside from some minor differences around whitespace, colons and end keywords, the two languages are quite similar.

One important difference between the two is that Python doesn’t support closures in the same elegant way that Ruby does with blocks. The block parameter and invoke_block call here are part of the implementation of Ruby blocks for Topaz.
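For comparison, here's how the two styles of calling a block look in Ruby itself. This is only a loose analogy to the explicit block parameter and invoke_block call, not a description of Topaz internals, and the method names are hypothetical.

```ruby
# Hypothetical helper names, for illustration only.

# Implicit style: the block never appears in the signature and runs via yield.
def lookup_with_yield(table, key)
  return table[key] if table.key?(key)
  yield(key)
end

# Explicit style: &block captures the block as a Proc object, which the
# method then calls by hand.
def lookup_with_block(table, key, &block)
  return table[key] if table.key?(key)
  block.call(key)
end

prefix = "missing: "  # local variable that the blocks below close over
table = { a: 1 }

implicit = lookup_with_yield(table, :z) { |k| prefix + k.to_s }
explicit = lookup_with_block(table, :z) { |k| prefix + k.to_s }

puts implicit  # prints missing: z
puts explicit  # prints missing: z
```

Both blocks can read prefix even though it is defined outside them, which is the closure behavior that makes Ruby blocks so convenient.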

It’s almost as easy to read this code as it was to read the Rubinius implementation. Any additional code or verbosity here is just due to the details of the way Topaz works internally. I think I could become happy using Python if I spent some time learning to use the language – at least if my editor handled whitespace properly!

Hash#fetch in JRuby

The similarity between Ruby and Python becomes more obvious when you compare either of them with Java. To see what I mean, take a look at the JRuby implementation of Hash#fetch:

[Image: jruby]

The same code is just a bit more verbose in Java than it was in Python or Ruby. Trying to read and understand this code takes a bit more effort than it did before. First you have to train your eyes not to see semicolons, parentheses or braces.

More importantly, the way Java works begins to get in the way of the algorithm. The clearest example of this is the use of an interface IRubyObject to handle the method’s arguments and return value. Since Java is a statically typed language, you have to worry about exactly what type of value each object is.

Depending on how you look at them, Java interfaces are either an awkward workaround for this problem, or an elegant way to ensure each object provides the API you require, allowing the Java compiler to give you helpful error messages. But for me, they are unnecessary typing and thinking. I’m starting to forget what I was trying to achieve and worrying more about how the Java compiler works. The Java compiler should be making me happy, not the other way around!

Hash#fetch in C

Now let’s take a look at the official version of Hash#fetch. This is the C code that your Ruby program actually uses if you’re running Ruby 2.0:

[Image: mri]

Now the level of verbosity jumps even more. In C, not only do we have the same static types we saw in Java, but now I have to worry about pointers, memory management, and compiler optimization details signaled by keywords such as “volatile.”

If you’re familiar with the idiomatic style of Ruby’s source code, with constructs like “RHASH,” “st_lookup” and “rb_scan_args,” then it’s not that hard to follow this. But there are still very confusing details here, such as the use of rb_protect to handle exceptions that might occur while generating the “key not found” error message.
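In Ruby terms, the idea behind guarding the error message can be sketched with a rescue around inspect. This is an assumption-laden illustration, not a translation of the C source: describe_key, fetch_or_raise and GrumpyKey are hypothetical names, and the fallback description is my own choice.

```ruby
# Hypothetical helpers, for illustration only; not a translation of the C code.

# Build a printable description of the key without letting a broken
# inspect method hide the real problem (the missing key).
def describe_key(key)
  key.inspect
rescue StandardError
  # inspect can be overridden by user code and may itself raise,
  # so fall back to a plainer description based on the class name.
  "#<#{key.class}>"
end

def fetch_or_raise(hash, key)
  return hash[key] if hash.key?(key)
  raise KeyError, "key not found: #{describe_key(key)}"
end

# A key whose inspect method always fails.
class GrumpyKey
  def inspect
    raise "no inspecting me"
  end
end

begin
  fetch_or_raise({}, GrumpyKey.new)
rescue KeyError => e
  puts e.message  # prints key not found: #<GrumpyKey>
end
```

The caller still gets a KeyError about the missing key, rather than an unrelated exception thrown while the message was being assembled.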

Writing C code like this, I have to be completely aware of how each piece of data is represented by my computer’s hardware. How many bytes does it use? Does this API expect a value or a pointer? Do I need to free the memory this pointer references?

Unfair Comparisons

Of course, all of these languages are very different and intended for different purposes. C is really shorthand for writing assembly language, and, if used well, gives you tremendous control, flexibility and speed. Java allows you to write elegant, clean, object-oriented code without worrying about hardware details or portability issues, and it still runs quite fast on the JVM. JRuby, in fact, is one of the fastest versions of Ruby around, faster than MRI Ruby in some cases.

My reason for comparing these languages to each other is to remind you of how happy you are writing Ruby. Don’t take it for granted! Matz and the Ruby core team have done a tremendous amount of work for the past 20 years to bring you programmer happiness.

And why have other talented open source developers worked so hard to reimplement Ruby using different languages, such as Java, Python or Ruby itself? Because they love Ruby also. They want to bring that happiness to their own platform or technology.

  • http://blog.headius.com Charles Oliver Nutter

    Unfair indeed :-)

    The Java code for JRuby’s Hash#fetch suffers from a number of issues:

    * It’s roughly just a port of the C code, originally intended to mimic the C code for maintenance reasons. We don’t do that anymore.
    * It’s an old piece of code, not utilizing our current coding standard or other JRuby features to clean it up.

    Comparing C and Java to Ruby and Python is also unfair because we’re talking about languages great for systems-level development against languages great for application development.

    And in the end, the Ruby version would work fine on JRuby, too, with about the same performance it has in Rubinius, but the Java and C versions of this code are faster.

    Just for sake of argument…I did do a “proper” impl of Hash#fetch using appropriate JRuby features and coding standards. To my eyes it’s not much worse than the Ruby or Python code.

    https://gist.github.com/headius/5709175

  • Erik

    In Scala it would look something like this:

    def fetch(key:A, default: => B = {throw new Exception("Key not found")}) =
    map.getOrElse(key, default)

    • ApuX

      Erik > map.getOrElse in scala seems to do exactly what map.fetch does in ruby, so what will be interesting is knowing how getOrElse is implemented in scala.

  • David

    How about Perl?

    # define a hash
    my %h = ( a => 100, b => 200 );

    # print the value for a
    print $h{'a'};

    # print the value for z or the string default
    print ($h{'z'} or 'default');

    Doesn’t get much easier than that!

    • ApuX

      If my %h = ( z => false );, how will print ($h{'z'} or 'default'); work?

      • kovnsk

        The whole reason he needs parentheses on the second print is because of the very low precedence of ‘or’… just use || unless you need the low precedence, and use the defined-or operator to check for definedness.

        use v5.14;

        my %h = (a => 100, b => 200);

        say $h{a}; # => 100
        say $h{z} // 'default'; # => default
        $h{z} = 0;
        say $h{z} || 'default'; # => default
        say $h{z} // 'default'; # => 0

        # of course, this doesn’t protect you against:
        $h{z} = undef;
        say $h{z} // 'default'; # => default

  • Olivier

    In ruby you can do that too :

    h = { :a => 100 }
    puts h[:a] || "default"

  • Steve Single

    Nice example of different languages implementing the same function. Thanks for the article. I especially liked the tidbit about PyPy.

    In this example the Java code was a bit tricky, though not as mind bending as the C. If all my functions were going to be this kind I would prefer Ruby over the others. Mostly functions don’t try to cover so many options at once in the programs I write. This may be why Ruby programs tend to need fewer lines of source than languages like Java and especially C.

  • Francesco Belladonna

    I think your opinions about the Ruby code syntax are personal. I DO like and prefer “unless” over “if not” (as also suggested by the Beautiful Ruby Code guide: https://github.com/dreamr/beautiful-ruby-code ). Multiple returns are OK too: a return with a trailing if is much cleaner than an if/else or similar. I really like the way it’s coded there.

    • http://patshaughnessy.net Pat Shaughnessy

      Of course – completely personal. I actually like “unless” myself. I mentioned that in the diagram only because it’s a common source of confusion and sometimes draws criticism from non-Rubyists.

      I’m torn about the multiple return statements. In a simple method like this it’s no problem at all, and is quite readable. But if you had a more complex method with a return buried in the middle it could be quite confusing.

      • Francesco Belladonna

        Speaking about your second statement, well it depends. If you use returns only at top and bottom of the method (on top to exit the method instantly, bottom to return the correct value or raise an exception) I think it’s ok. If there are returns in the middle of the code it will create confusion.

        But in that method having
        return
        return
        raise

        Shows quite well what’s going to happen, at least for me.
        I won’t comment on the other languages because I’m very addicted to syntax, and with Ruby I finally found my ideal syntax.

  • http://upperroom.org Dewayne VanHoozer

    Working backward from ASM -> FORTRAN -> C -> APL -> COBOL -> PL/I -> Ada -> Java -> Python -> Ruby !! yes I am a MUCH HAPPIER Programmer !

  • http://www.speakeasy.org/~lion/ LionKimbro

    In Python, it’s:

    h.get("z") or "go fish, {}".format("z")

    If it is possible that you’ll actually have a value None in there, it’s:

    h["z"] if "z" in h else "go fish, {}".format("z")

    You’re welcome.