Ruby Can Be Faster with a Bit of Rust

By Robert Qualls

In spring of 2013, Sam Saffron committed fast_blank to GitHub. It was a C extension that provided a fast drop-in alternative to ActiveSupport’s String#blank? method, which returns true when a string is empty or contains nothing but whitespace.

At Full Stack Fest 2015, Yehuda Katz presented a talk where he described his investigation into rewriting fast_blank in Rust. His initial goal was to try a naive one-liner (excluding his implementation of Buf) and see how much slower it was than the original C extension.

extern "C" fn fast_blank(buf: Buf) -> bool {
  buf.as_slice().chars().all(|c| c.is_whitespace())
}

To his surprise, it was faster.
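The article does not show Yehuda's Buf type, so the one-liner above cannot be compiled on its own. As a rough sketch of the same idea, the version below works on a plain &str instead; the name is_blank and the choice of &str are assumptions made for illustration, not part of the original extension.

```rust
/// Returns true when `s` is empty or contains only whitespace.
/// Hypothetical stand-in for the article's `fast_blank`, which takes a `Buf`.
fn is_blank(s: &str) -> bool {
    // `all` returns true for an empty iterator, so "" counts as blank.
    s.chars().all(|c| c.is_whitespace())
}

fn main() {
    println!("{}", is_blank("  \t\n"));
    println!("{}", is_blank(" ruby "));
}
```

The iterator stops at the first non-whitespace character, so a long string that starts with text is rejected without scanning the rest.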

What Is ‘Rust’?

Rust is one of the most solid attempts to date to create a modern, cross-platform, systems programming language.

Mozilla designed Rust to be a language that could provide the performance and safety necessary for modern web browsers while being more accessible than previous static languages (at least in some ways). And it comes with glorious 1st-party package-management! Instead of gems, we have crates.

But these benefits come with a price: there is a steep learning curve. Make no mistake, the masochists on the Rust team have designed a very hard language. There are a lot of syntactic edge cases, including the fact that the range syntax for expressions is different from the syntax for patterns. High-ranking tutorials often refer to deprecated idioms, and it isn’t always obvious why a particular line is throwing an error.

That isn’t to say Rust is buggy or poorly designed – just adolescent with a chaotic upbringing. And for those who can survive its arcane depths, Rust promises security, speed, and flexibility.

Getting Started

If you are on OS X, you can install Rust with Homebrew. Make sure your Homebrew is up to date, because previous versions of the Rust formula did not include cargo. Rust 1.4.0 is being used here.

$ brew install rust

cargo is Rust’s project/package manager. Let’s start by creating an executable project:

$ cargo new hello_world --bin

The Rust equivalent of Gemfile is Cargo.toml:

[package]
name = "hello_world"
version = "0.1.0"
authors = ["Robert Qualls <robert@example.com>"]

If we take a look at src/main.rs we can see the source code:

fn main() {
    println!("Hello, world!");
}

Let’s go ahead and compile the project:

$ cargo build

This will generate Cargo.lock which should be treated in the same manner as Gemfile.lock.

$ cargo run
Hello, world!

Static Programming

In Ruby, just about anything can happen at runtime. This provides a lot of flexibility, but means very little will be caught when compiling to bytecode. Static programming is a different world, mainly because you now have to deal with memory management. This includes managing pointers, data types, and (de)allocation.

A basic glossary:

  • Data Type – a description of a value’s size and layout in memory, which tells the compiler how much space an identifier needs and which operations are valid on it
  • Pointer – identifier associated with a memory address
  • Aliasing – when two pointers point to the same memory
  • Dangling Pointer – when a pointer points to freed memory
  • Memory Leak – when unused memory is never freed during the lifetime of a program
  • Mutability – the ability to associate different values with an identifier over the course of its lifetime
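To make a couple of the glossary terms concrete, here is a small sketch of aliasing in Rust. The helper name is_alias is hypothetical; it relies on std::ptr::eq from the standard library, which compares addresses rather than values.

```rust
/// Returns true when both references point at the same memory address.
/// Hypothetical helper used only for this illustration.
fn is_alias(a: &i32, b: &i32) -> bool {
    std::ptr::eq(a, b)
}

fn main() {
    let x = 7;
    let y = 7;
    let first = &x;  // a pointer to x
    let second = &x; // a second pointer to the same memory: an alias
    println!("{}", is_alias(first, second)); // same address
    println!("{}", is_alias(first, &y)); // equal values, different memory
}
```

Both references here are immutable, which is why Rust accepts the aliasing without complaint.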

The traditional answer for dealing with dangling pointers and memory leaks is to bring in a garbage collector like Ruby’s. When variables are no longer being used, their memory is deallocated automatically when the collector performs a sweep. The problem with garbage collectors is they can cause unpredictable lag spikes and they place additional restrictions on the language’s host environment.

Rust does not have a built-in garbage collector. At the same time, you never free memory manually. Deallocation occurs by design. Once you adjust to the patterns needed to make code compile, you may forget there is no garbage collector involved.
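One way to watch this happen is to implement the standard Drop trait, whose drop method runs at the moment a value is deallocated. The sketch below is only an illustration: the Tracked type, its fields, and run_scopes are made-up names, and the log exists purely so the order of events can be inspected.

```rust
use std::cell::RefCell;

// Hypothetical type that records when it is dropped.
struct Tracked<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl<'a> Drop for Tracked<'a> {
    // Called automatically when a Tracked value goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

fn run_scopes() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Tracked { name: "outer", log: &log };
        {
            let _inner = Tracked { name: "inner", log: &log };
            log.borrow_mut().push("end of inner scope".to_string());
        } // _inner is dropped here, with no explicit free
        log.borrow_mut().push("end of outer scope".to_string());
    } // _outer is dropped here
    log.into_inner()
}

fn main() {
    for line in run_scopes() {
        println!("{}", line);
    }
}
```

Each value is released at the closing brace of the scope that owns it, so the timing is fixed at compile time rather than decided by a collector.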

Odds are if you try to do something in Rust that is not memory safe, it will simply not compile. In fact, no matter what you do, your Rust code will probably not compile.

Ownership and Borrowing

So, how does Rust get by without expecting you to manually free memory?

When it comes to managing memory, Rust is primarily concerned with two things:

  1. Whether a function is taking ownership of something or borrowing it.
  2. Whether a resource is mutable (chaotic neutral) or immutable (lawful good).

In unmanaged programming languages, the programmer is responsible for allocating and deallocating memory. For example, in C this is done with the malloc and free functions. Unfortunately, this opens up the programmer to the possibility that they may free memory before the program is done using it.

In Rust, ownership is the right to deallocate. When a function finishes executing, anything it owns is automatically freed from memory.

For example, this code will work:

fn print_vec(v: Vec<i32>) {
    println!("{:?}", v);
}

fn main() {
    let v = vec![1,2,3];
    print_vec(v);
}

However, the call to print_vec gives it ownership of (and thus the task of deallocating) our vector. So if we try to call it a second time…

fn print_vec(v: Vec<i32>) {
    println!("{:?}", v);
}

fn main() {
    let v = vec![1,2,3];
    print_vec(v);
    print_vec(v);
}

We get a compile error:

src/main.rs:7:15: 7:16 note: `v` moved here because it has type
`collections::vec::Vec<i32>`, which is non-copyable
src/main.rs:7     print_vec(v);

This can be solved by lending the vector to the function call with &:

fn print_vec(v: &Vec<i32>) {
    println!("{:?}", v);
}

fn main() {
    let v = vec![1,2,3];
    print_vec(&v);
    print_vec(&v);
}

At present the code is not modifying the vector. What happens when we try to do that?

fn print_vec(v: &Vec<i32>) {
    println!("{:?}", v);
}

fn main() {
    let v = vec![1,2,3];
    v[0] = 0;
    print_vec(&v);
    print_vec(&v);
}

We get another error!

src/main.rs:7:5: 7:6 error: cannot borrow immutable local variable `v` as
mutable
src/main.rs:7     v[0] = 0;

Anytime we want to modify a variable in Rust, it must be specified as mutable:

fn print_vec(v: &Vec<i32>) {
    println!("{:?}", v);
}

fn main() {
    let mut v = vec![1,2,3];
    v[0] = 0;
    print_vec(&v);
    print_vec(&v);
}
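Borrows can be mutable too. The examples so far only lend the vector immutably, so as a hedged sketch, here is a function that borrows it with &mut and changes it in place. The name zero_first is hypothetical.

```rust
// Hypothetical helper: borrows the vector mutably and zeroes its first element.
fn zero_first(v: &mut Vec<i32>) {
    // first_mut returns None for an empty vector, so nothing happens then.
    if let Some(first) = v.first_mut() {
        *first = 0;
    }
}

fn main() {
    let mut v = vec![1, 2, 3];
    zero_first(&mut v); // lend v mutably; main still owns it
    println!("{:?}", v);
}
```

Note that both the variable (let mut) and the borrow (&mut) have to be marked mutable; the compiler rejects the call if either is missing.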

Don’t fret if you find this paradigm challenging. Although this is the default approach to memory management in Rust, you can also use reference counting (std::rc::Rc in the standard library), which deallocates a value when its last owner is gone. Think of it as garbage-collection lite.
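As a minimal sketch of reference counting, the example below uses Rc from the standard library. The function name share_counts is made up for illustration; Rc::clone and Rc::strong_count are the standard calls for adding an owner and reading the current owner count.

```rust
use std::rc::Rc;

// Hypothetical helper: returns the owner count while shared and after one owner leaves.
fn share_counts() -> (usize, usize) {
    let first = Rc::new(vec![1, 2, 3]);
    let second = Rc::clone(&first); // a second owner of the same vector
    let while_shared = Rc::strong_count(&first);
    drop(second); // one owner goes away; the vector stays alive
    let after_drop = Rc::strong_count(&first);
    (while_shared, after_drop)
} // the last owner goes out of scope here and the vector is freed

fn main() {
    let (while_shared, after_drop) = share_counts();
    println!("{} owners, then {}", while_shared, after_drop);
}
```

Rc is meant for single-threaded code, and values that refer to each other in a cycle are never freed by counting alone.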

If you want a tracing garbage collector, there’s Manishearth/rust-gc. Last, for a cycle collector you have fitzgen/bacon-rajan-cc.

Ruby to Rust

A great way to see how Rust works is to look at it side-by-side with Ruby.

Function Definition

Ruby:

def add(x, y)
  x + y
end

puts add(1, 2)

Rust:

fn add(x: i32, y: i32) -> i32 { x + y }

fn main() {
    println!("{}", add(1,2));
}

Here, -> i32 sets the function return type, and i32 means 32-bit signed integer. {} is the default placeholder in Rust’s format strings; each one is replaced, in order, by the next argument passed to the println! macro.

Dynamic Arrays

Rust includes Vec which can grow at run-time.

Ruby:

names = ["bobby", "harry", "sally"]
names << "thor"

Rust:

fn main() {
    let mut names: Vec<&str> = vec!["bobby","harry","sally"];
    // Alternatively
    // let mut names = vec!["bobby","harry","sally"];
    names.push("thor");
}

vec! is a macro that creates a Vec from a list of elements.

Closures

Ruby:

arr = [1, 2, 3]
puts arr.map { |i| i * 2 }.inspect

Rust:

fn main() {
    let arr = [1, 2, 3];
    let mapped = arr.iter().map(|&x| x * 2);
    let output = mapped.collect::<Vec<i32>>();
    println!("{:?}", output);
}

{:?} tells the interpolator to use the type’s implementation of the std::fmt::Debug trait (think Java interface or Ruby mixin module) for formatting.
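To see the two placeholders side by side, here is a short sketch with a made-up Point type. Debug is derived automatically, while Display (used by {}) is written by hand.

```rust
use std::fmt;

// Hypothetical type for illustration; derive generates the Debug implementation.
#[derive(Debug)]
struct Point {
    x: i32,
    y: i32,
}

// Display is what the plain {} placeholder uses.
impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "({}, {})", self.x, self.y)
    }
}

fn main() {
    let p = Point { x: 1, y: 2 };
    println!("{}", p);   // prints (1, 2)
    println!("{:?}", p); // prints Point { x: 1, y: 2 }
}
```

Types such as Vec implement Debug but not Display, which is why the earlier examples print vectors with {:?}.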

Lambdas

Ruby:

is_even = ->(n) { n.even? }
puts is_even.call(5)

Rust:

fn main() {
    let is_even = |n: i32| n % 2 == 0;
    println!("{:?}", is_even(5));
}

Classes and Instance Methods

Ruby:

class Triangle
  attr_accessor :base, :height

  def initialize(base, height)
    @base, @height = base, height
  end

  def area
    (@base * @height) / 2.0
  end
end

triangle = Triangle.new(7, 5)
puts triangle.area

Rust:

struct Triangle {
    base: f32,
    height: f32
}

impl Triangle {
    fn area(&self) -> f32 {
        (self.base * self.height) / 2f32
    }
}

fn main() {
    let triangle = Triangle { base: 7f32, height: 5f32 };
    println!("{}", triangle.area());
}

In Rust, a struct represents the data, and an implementation represents the behavior.

The first parameter of an instance method is self, &self, or &mut self, depending on whether the method takes ownership of the value, borrows it, or borrows it mutably. You may recognize passing self to instance methods from Python.

Note that the final expression in area has no trailing semicolon, so its value is returned implicitly.
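The three forms of self can be sketched on the same struct. Everything apart from area is an assumption added for illustration: new is only a conventional constructor name, and scale and into_dimensions are hypothetical methods.

```rust
struct Triangle {
    base: f32,
    height: f32,
}

impl Triangle {
    // No self parameter: called as Triangle::new, much like Ruby's Triangle.new.
    fn new(base: f32, height: f32) -> Triangle {
        Triangle { base: base, height: height }
    }

    // &self borrows the triangle immutably.
    fn area(&self) -> f32 {
        (self.base * self.height) / 2.0
    }

    // &mut self borrows it mutably so the fields can change.
    fn scale(&mut self, factor: f32) {
        self.base *= factor;
        self.height *= factor;
    }

    // self takes ownership; the triangle cannot be used after this call.
    fn into_dimensions(self) -> (f32, f32) {
        (self.base, self.height)
    }
}

fn main() {
    let mut triangle = Triangle::new(7.0, 5.0);
    println!("{}", triangle.area());
    triangle.scale(2.0);
    println!("{}", triangle.area());
    println!("{:?}", triangle.into_dimensions());
}
```

The variable has to be declared with let mut before scale can be called, mirroring the rule for vectors shown earlier.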

Static Methods

Ruby:

module StaticMath
  def self.add(x, y)
    x + y
  end
end

puts StaticMath::add(1, 2)

Rust:

struct StaticMath;
impl StaticMath {
    fn add(x: i32, y: i32) -> i32 {
        x + y
    }
}

fn main() {
    println!("{}", StaticMath::add(1,2));
}

Alternatively:

struct StaticMath;
impl StaticMath {
    fn add(&self, x: i32, y: i32) -> i32 {
        x + y
    }
}

fn main() {
    println!("{}", StaticMath.add(1,2));
}

Static methods (“associated functions” in Rust) are created by leaving self out of the parameter list, and they are called with :: instead of a dot. The alternative example works because StaticMath is a unit struct, so the expression StaticMath is itself a value of that type; add is then an ordinary instance method called on that value rather than a true static method.

Note that, while in Ruby, either :: or . can be used to call any method, the difference is enforced in Rust.

Making a Library

Let’s use our knowledge of Rust to make a quick library and use it from Ruby.

$ cargo new libadd

We need to add some lines to Cargo.toml so that Rust knows we are making a dynamic library.

[lib]
name = "add"
crate-type = ["dylib"]

Note that dylib in crate-type is not OS X-specific. It’s just telling Rust we want a dynamic library as opposed to an rlib (a Rust format), which is the default.

We need to add the function we wish to expose to src/lib.rs:

#[no_mangle]
pub extern "C" fn add(x: i32, y: i32) -> i32 {
  x + y
}

Now the library can be compiled:

$ cargo build --release

It’s possible for Ruby to communicate with Rust via a foreign-function interface. Fiddle and FFI are a couple of libraries that let us do this.

Fiddle

fiddle ships with Ruby, so you don’t need to install anything.

require "fiddle"
require "fiddle/import"

module Rust
  extend Fiddle::Importer
  lib_ext = "dylib" if `uname` =~ /Darwin/
  lib_ext = "so" if `uname` =~ /Linux/
  dlload "./libadd/target/release/libadd.#{lib_ext}"
  extern 'int add(int, int)'
end

puts Rust.add(1, 2)

FFI

ffi is a gem that needs to be installed.

$ gem install ffi

It’s used very similarly to fiddle.

require "ffi"

module Rust
  extend FFI::Library
  lib_ext = "dylib" if `uname` =~ /Darwin/
  lib_ext = "so" if `uname` =~ /Linux/
  ffi_lib "./libadd/target/release/libadd.#{lib_ext}"
  attach_function :add, [:int, :int], :int
end

puts Rust.add(1,2)

Conclusion

What was covered in this article barely scratches the surface of the Rust language. At this point, your biggest challenge will be learning how to safely integrate Rust into your Ruby projects. If you leave Rust in charge of deallocation, it may free your pointers before Ruby is done using them. If you leave deallocation to Ruby, the project becomes open to memory leaks. This seems to be what the community is hashing over at the moment.
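One common pattern for this hand-off, sketched here under assumptions rather than taken from any particular project, is for Rust to give Ruby an opaque pointer and to expose a matching free function that Ruby must call exactly once. The Counter type and the counter_* function names are hypothetical. In a real library each function would also carry the #[no_mangle] attribute, as add does above; it is left out here so the sketch compiles as a plain program.

```rust
// Hypothetical type owned by Rust but handed to the caller as a raw pointer.
pub struct Counter {
    value: i32,
}

// Allocates a Counter and gives up ownership; Rust will not free it on its own.
pub extern "C" fn counter_new() -> *mut Counter {
    Box::into_raw(Box::new(Counter { value: 0 }))
}

// Borrows the Counter behind the pointer; returns -1 for a null pointer.
pub extern "C" fn counter_increment(ptr: *mut Counter) -> i32 {
    if ptr.is_null() {
        return -1;
    }
    // Assumes ptr came from counter_new and has not been freed yet.
    let counter = unsafe { &mut *ptr };
    counter.value += 1;
    counter.value
}

// Takes ownership back and frees the Counter; the caller must call this once.
pub extern "C" fn counter_free(ptr: *mut Counter) {
    if ptr.is_null() {
        return;
    }
    unsafe {
        drop(Box::from_raw(ptr));
    }
}

fn main() {
    let ptr = counter_new();
    println!("{}", counter_increment(ptr));
    println!("{}", counter_increment(ptr));
    counter_free(ptr);
}
```

The compiler cannot check that the foreign caller uses the pointer correctly: forgetting counter_free leaks the value, and using the pointer after freeing it is undefined behavior.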

There is a possibility Rust has been made too arcane for wide adoption. It remains to be seen whether Swift – now open source – will take the stage in systems programming. But for now, if you want something fast and safe, you may find a friend in this not-so-corrosive language.
