Ruby Can Be Faster with a Bit of Rust

In the spring of 2013, Sam Saffron committed fast_blank to GitHub. It was a C extension that provided a fast drop-in alternative to ActiveSupport’s String#blank? method, which returns whether a string is empty or contains nothing but whitespace.

At Full Stack Fest 2015, Yehuda Katz presented a talk where he described his investigation into rewriting fast_blank in Rust. His initial goal was to try a naive one-liner (excluding his implementation of Buf) and see how much slower it was than the original C extension.

extern "C" fn fast_blank(buf: Buf) -> bool {
  buf.as_slice().chars().all(|c| c.is_whitespace())
}

To his surprise, it was faster.
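
The same one-liner can be tried without the Buf wrapper by running it against an ordinary &str. This is only a minimal sketch, not the code from the talk, and is_blank is a name chosen here for illustration.

```rust
// Hypothetical stand-in for the extension: takes a plain &str instead of Buf.
fn is_blank(s: &str) -> bool {
    // all() returns true for an empty iterator, so "" counts as blank.
    s.chars().all(|c| c.is_whitespace())
}

fn main() {
    println!("{}", is_blank(" \t\n")); // true
    println!("{}", is_blank(" ruby ")); // false
    println!("{}", is_blank("")); // true
}
```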

Key Takeaways

Rust is a modern, cross-platform systems programming language designed by Mozilla to provide the performance and safety needed for modern web browsers, and it can be used to optimize Ruby code.
Rust does not have a built-in garbage collector, and instead uses a system of ownership and borrowing to manage memory. This system can prevent common programming errors like dangling pointers, double frees, and data races.
Ruby code can be optimized using Rust by writing critical performance sections of the code in Rust and then using FFI (Foreign Function Interface) to call that code from Ruby. This allows you to leverage the performance benefits of Rust while maintaining the simplicity and elegance of Ruby.
Integrating Rust into a Ruby project can be challenging due to differences in syntax and concepts. However, libraries like Rutie provide a bridge between Ruby and Rust, making the integration easier. To effectively optimize Ruby code with Rust, a good understanding of both languages is necessary.

What is ‘Rust’?

Rust is one of the most solid attempts to date to create a modern, cross-platform, systems programming language.

Mozilla designed Rust to be a language that could provide the performance and safety necessary for modern web browsers while being more accessible than previous static languages (at least in some ways). And it comes with glorious 1st-party package-management! Instead of gems, we have crates.

But these benefits come with a price: there is a steep learning curve. Make no mistake, the masochists on the Rust team have designed a very hard language. There are a lot of syntactic edge cases, including the fact that the range syntax for expressions is different from the syntax for patterns. High-ranking tutorials often refer to deprecated idioms, and it isn’t always obvious why a particular line is throwing an error.

That isn’t to say Rust is buggy or poorly designed – just adolescent with a chaotic upbringing. And for those who can survive its arcane depths, Rust promises security, speed, and flexibility.

Getting Started

If you are on OSX, you can install Rust with Homebrew. Make sure your Homebrew is up to date because previous versions of the Rust formula did not include cargo. Rust 1.4.0 is being used here.

$ brew install rust

cargo is Rust’s project/package manager. Let’s start by creating an executable project:

$ cargo new hello_world --bin

The Rust equivalent of Gemfile is Cargo.toml:

[package]
name = "hello_world"
version = "0.1.0"
authors = ["Robert Qualls <robert@example.com>"]

If we take a look at src/main.rs we can see the source code:

fn main() {
    println!("Hello, world!");
}

Let’s go ahead and compile the project:

$ cargo build

This will generate Cargo.lock which should be treated in the same manner as Gemfile.lock.

$ cargo run
Hello, world!

Static Programming

In Ruby, just about anything can happen at runtime. This provides a lot of flexibility, but means very little will be caught when compiling to bytecode. Static programming is a different world, mainly because you now have to deal with memory management. This includes managing pointers, data types, and (de)allocation.

A basic glossary:

Data Type – a way for the compiler to determine the address range of an identifier
Pointer – identifier associated with a memory address
Aliasing – when two pointers point to the same memory
Dangling Pointer – when a pointer points to freed memory
Memory Leak – when unused memory is never freed during the lifetime of a program
Mutability – the ability to associate different values with an identifier over the course of its lifetime

The traditional answer for dealing with dangling pointers and memory leaks is to bring in a garbage collector like Ruby’s. When variables are no longer being used, their memory is deallocated automatically when the collector performs a sweep. The problem with garbage collectors is they can cause unpredictable lag spikes and they place additional restrictions on the language’s host environment.

Rust does not have a built-in garbage collector. At the same time, you never free memory manually. Deallocation occurs by design. Once you adjust to the patterns needed to make code compile, you may forget there is no garbage collector involved.
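
Deallocation by design can be observed by giving a type a Drop implementation and watching when it runs. The sketch below is illustrative only: Noisy and drops_after_inner_scope are hypothetical names, and the counter exists just to make the timing visible.

```rust
use std::cell::Cell;

// Hypothetical type that records when it is dropped.
struct Noisy<'a> {
    name: &'static str,
    drops: &'a Cell<u32>,
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        // Runs automatically when the owning variable goes out of scope.
        self.drops.set(self.drops.get() + 1);
        println!("dropping {}", self.name);
    }
}

// Returns how many values had been dropped once the inner scope ended.
fn drops_after_inner_scope() -> u32 {
    let drops = Cell::new(0);
    let _outer = Noisy { name: "outer", drops: &drops };
    {
        let _inner = Noisy { name: "inner", drops: &drops };
    } // _inner is freed here, with no explicit free call
    drops.get()
} // _outer is freed here, after the count has been read

fn main() {
    println!("{}", drops_after_inner_scope());
}
```

Only the inner value has been dropped when the count is read, so the function returns 1; the outer value is released as the function exits.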

Odds are if you try to do something in Rust that is not memory safe, it will simply not compile. In fact, no matter what you do, your Rust code will probably not compile.

Ownership and Borrowing

So, how does Rust get by without expecting you to manually free memory?

When it comes to managing memory, Rust is primarily concerned with two things:

Whether a function is taking ownership of something or borrowing it.
Whether a resource is mutable (chaotic neutral) or immutable (lawful good).

In unmanaged programming languages, the programmer is responsible for allocating and deallocating memory. For example, in C this is done with the malloc and free functions. Unfortunately, this opens up the programmer to the possibility that they may free memory before the program is done using it.

In Rust, ownership is the right to deallocate. When a function finishes executing, anything it owns is automatically freed from memory.

For example, this code will work:

fn print_vec(v: Vec<i32>) {
    println!("{:?}", v);
}

fn main() {
    let v = vec![1,2,3];
    print_vec(v);
}

However, the call to print_vec gives it ownership of (and thus the task of deallocating) our vector. So if we try to call it a second time…

fn print_vec(v: Vec<i32>) {
    println!("{:?}", v);
}

fn main() {
    let v = vec![1,2,3];
    print_vec(v);
    print_vec(v);
}

We get a compile error:

src/main.rs:7:15: 7:16 note: `v` moved here because it has type
`collections::vec::Vec<i32>`, which is non-copyable
src/main.rs:7     print_vec(v);

This can be solved by lending the vector to the function call with &:

fn print_vec(v: &Vec<i32>) {
    println!("{:?}", v);
}

fn main() {
    let v = vec![1,2,3];
    print_vec(&v);
    print_vec(&v);
}

At present the code is not modifying the vector. What happens when we try to do that?

fn print_vec(v: &Vec<i32>) {
    println!("{:?}", v);
}

fn main() {
    let v = vec![1,2,3];
    v[0] = 0;
    print_vec(&v);
    print_vec(&v);
}

We get another error!

src/main.rs:7:5: 7:6 error: cannot borrow immutable local variable `v` as
mutable
src/main.rs:7     v[0] = 0;

Anytime we want to modify a variable in Rust, it must be specified as mutable:

fn print_vec(v: &Vec<i32>) {
    println!("{:?}", v);
}

fn main() {
    let mut v = vec![1,2,3];
    v[0] = 0;
    print_vec(&v);
    print_vec(&v);
}
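
Borrowing and mutability can also be combined. As a small sketch, a function can borrow the vector mutably with &mut, changing it while the caller keeps ownership. The zero_first function is a hypothetical helper added for illustration.

```rust
// Borrows the vector mutably: the caller keeps ownership,
// but this function is allowed to change the contents.
fn zero_first(v: &mut Vec<i32>) {
    // first_mut() returns None for an empty vector, so nothing panics.
    if let Some(first) = v.first_mut() {
        *first = 0;
    }
}

fn print_vec(v: &Vec<i32>) {
    println!("{:?}", v);
}

fn main() {
    let mut v = vec![1, 2, 3];
    zero_first(&mut v);
    print_vec(&v); // [0, 2, 3]
}
```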

Don’t fret if you find this paradigm challenging. Although this is the default approach to memory management in Rust, you can also use reference counting, which deallocates a value when its last owner is gone. Think of it as garbage-collection lite.
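
A short sketch of reference counting with the standard library's std::rc::Rc follows. The rc_counts function is a hypothetical helper that reports the owner count at two points in time.

```rust
use std::rc::Rc;

// Returns the owner count while two owners exist and after one is gone.
fn rc_counts() -> (usize, usize) {
    let first = Rc::new(vec![1, 2, 3]);
    let second = Rc::clone(&first); // a second owner, not a copy of the vector
    let while_shared = Rc::strong_count(&first);
    drop(second); // one owner leaves
    let after_drop = Rc::strong_count(&first);
    (while_shared, after_drop)
} // the last owner goes out of scope here, so the vector is freed

fn main() {
    println!("{:?}", rc_counts()); // (2, 1)
}
```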

If you want a tracing garbage collector, there’s Manishearth/rust-gc. Last, for a cycle collector you have fitzgen/bacon-rajan-cc.

Ruby to Rust

A great way to see how Rust works is to look at it side-by-side with Ruby.

Function Definition

Ruby:

def add(x, y)
  x + y
end

puts add(1, 2)

Rust:

fn add(x: i32, y: i32) -> i32 { x + y }

fn main() {
    println!("{}", add(1,2));
}

Here, -> i32 sets the function return type, and i32 means 32-bit signed integer. {} is the universal string interpolator in Rust and will interpolate the rest of the arguments to the println! macro in order.

Dynamic Arrays

Rust includes Vec which can grow at run-time.

Ruby:

names = ["bobby", "harry", "sally"]
names << "thor"

Rust:

fn main() {
    let mut names: Vec<&str> = vec!["bobby","harry","sally"];
    // Alternatively
    // let mut names = vec!["bobby","harry","sally"];
    names.push("thor");
}

vec! is a macro for creating a Vec.

Closures

Ruby:

arr = [1, 2, 3]
puts arr.map { |i| i * 2 }.inspect

Rust:

fn main() {
    let arr = [1, 2, 3];
    let mapped = arr.iter().map(|&x| x * 2);
    let output = mapped.collect::<Vec<i32>>();
    println!("{:?}", output);
}

{:?} tells the interpolator to use the type’s implementation of the std::fmt::Debug trait (think Java interface or Ruby mixin module) for formatting.
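
To make the trait idea concrete, here is a sketch of a type that supports both {:?} and {}. Point is a hypothetical type; Debug is derived automatically, while Display (used by {}) is written by hand.

```rust
use std::fmt;

// Hypothetical type used only to show the two formatting traits.
#[derive(Debug)] // generates the std::fmt::Debug implementation used by {:?}
struct Point {
    x: i32,
    y: i32,
}

// Display is what plain {} uses; it has to be implemented manually.
impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "({}, {})", self.x, self.y)
    }
}

fn main() {
    let p = Point { x: 1, y: 2 };
    println!("{:?}", p); // Point { x: 1, y: 2 }
    println!("{}", p); // (1, 2)
}
```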

Lambdas

Ruby:

is_even = ->(n) { n.even? }
puts is_even.call(5)

Rust:

fn main() {
    let is_even = |n: i32| n % 2 == 0;
    println!("{:?}", is_even(5));
}

Classes and Instance Methods

Ruby:

class Triangle
  attr_accessor :base, :height

  def initialize(base, height)
    @base, @height = base, height
  end

  def area
    (@base * @height) / 2.0
  end
end

triangle = Triangle.new(7, 5)
puts triangle.area

Rust:

struct Triangle {
    base: f32,
    height: f32
}

impl Triangle {
    fn area(&self) -> f32 {
        (self.base * self.height) / 2f32
    }
}

fn main() {
    let triangle = Triangle { base: 7f32, height: 5f32 };
    println!("{}", triangle.area());
}

In Rust, a struct represents the data, and an implementation represents the behavior.

The first argument of an instance method in an impl block is one of self, &self, or &mut self, depending on the level of ownership needed: self takes ownership of the value, &self borrows it immutably, and &mut self borrows it mutably. You may recognize passing self to instance methods from Python.

Note that the lack of a semicolon in area is an implicit return.
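
The three receiver forms can be compared side by side. In this sketch, new, scale, and into_dimensions are hypothetical additions to the Triangle example; new takes no self at all and plays roughly the role of Ruby's initialize.

```rust
struct Triangle {
    base: f32,
    height: f32,
}

impl Triangle {
    // No self: called as Triangle::new, similar in spirit to Ruby's initialize.
    fn new(base: f32, height: f32) -> Triangle {
        Triangle { base, height }
    }

    // &self: read-only borrow.
    fn area(&self) -> f32 {
        (self.base * self.height) / 2.0
    }

    // &mut self: mutable borrow, allowed to change fields.
    fn scale(&mut self, factor: f32) {
        self.base *= factor;
        self.height *= factor;
    }

    // self: takes ownership, so the triangle cannot be used afterwards.
    fn into_dimensions(self) -> (f32, f32) {
        (self.base, self.height)
    }
}

fn main() {
    let mut triangle = Triangle::new(7.0, 5.0);
    println!("{}", triangle.area()); // 17.5
    triangle.scale(2.0);
    println!("{}", triangle.area()); // 70
    let (base, height) = triangle.into_dimensions();
    println!("{} {}", base, height); // 14 10
}
```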

Static Methods

Ruby:

module StaticMath
  def self.add(x, y)
    x + y
  end
end

puts StaticMath::add(1, 2)

Rust:

struct StaticMath;
impl StaticMath {
    fn add(x: i32, y: i32) -> i32 {
        x + y
    }
}

fn main() {
    println!("{}", StaticMath::add(1,2));
}

Alternatively:

struct StaticMath;
impl StaticMath {
    fn add(&self, x: i32, y: i32) -> i32 {
        x + y
    }
}

fn main() {
    println!("{}", StaticMath.add(1,2));
}

Static methods (“associated functions” in Rust) are created by leaving out the self argument entirely, and are called with :: instead of . on the type. Although the alternative example appears to work, it’s possible the Rust team did not intend for static methods to be created in that manner.

Note that, while in Ruby, either :: or . can be used to call any method, the difference is enforced in Rust.

Making a Library

Let’s use our knowledge of Rust to make a quick library and use it from Ruby.

$ cargo new libadd

We need to add some lines to Cargo.toml so that Rust knows we are making a dynamic library.

[lib]
name = "add"
crate-type = ["dylib"]

Note that dylib in crate-type is not OSX-specific. It’s just telling Rust we want a dynamic library as opposed to an rlib (a Rust format) which is the default.

We need to add the function we wish to expose to src/lib.rs:

#[no_mangle]
pub extern "C" fn add(x: i32, y: i32) -> i32 {
  x + y
}

Now the library can be compiled:

$ cargo build --release

It’s possible for Ruby to communicate with Rust via a foreign-function interface. Fiddle and FFI are a couple of libraries that let us do this.

Fiddle

fiddle ships with Ruby, so you don’t need to install anything.

require "fiddle"
require "fiddle/import"

module Rust
  extend Fiddle::Importer
  lib_ext = "dylib" if `uname` =~ /Darwin/
  lib_ext = "so" if `uname` =~ /Linux/
  dlload "./libadd/target/release/libadd.#{lib_ext}"
  extern 'int add(int, int)'
end

puts Rust.add(1, 2)

FFI

ffi is a gem that needs to be installed.

$ gem install ffi

It’s used very similarly to fiddle.

require "ffi"

module Rust
  extend FFI::Library
  lib_ext = "dylib" if `uname` =~ /Darwin/
  lib_ext = "so" if `uname` =~ /Linux/
  ffi_lib "./libadd/target/release/libadd.#{lib_ext}"
  attach_function :add, [:int, :int], :int
end

puts Rust.add(1,2)

Conclusion

What was covered in this article barely scratches the surface of the Rust language. At this point, your biggest challenge will be learning how to safely integrate Rust into your Ruby projects. If you leave Rust in charge of deallocation, it may free your pointers before Ruby is done using them. If you leave deallocation to Ruby, the project becomes open to memory leaks. This seems to be what the community is hashing over at the moment.
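
One common way to handle this hand-off is to have Rust explicitly give up ownership of a value and expose a second function that takes it back. The sketch below assumes that approach; greeting_new and greeting_free are hypothetical names, and main only stands in for the Ruby caller. In a real library both functions would also carry #[no_mangle], as add does above.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

// Hypothetical export: hands a heap-allocated C string to the caller.
// into_raw() makes Rust give up ownership, so nothing is freed here.
pub extern "C" fn greeting_new() -> *mut c_char {
    CString::new("hello from Rust")
        .expect("literal has no interior NUL bytes")
        .into_raw()
}

// Hypothetical export: the caller passes the pointer back when finished.
// from_raw() retakes ownership, and the string is freed by the drop.
pub extern "C" fn greeting_free(ptr: *mut c_char) {
    if ptr.is_null() {
        return;
    }
    // Sound only if ptr came from greeting_new and has not been freed yet.
    unsafe {
        drop(CString::from_raw(ptr));
    }
}

fn main() {
    // Stand-in for the Ruby side: read the string, then give it back.
    let ptr = greeting_new();
    let text = unsafe { CStr::from_ptr(ptr) }
        .to_str()
        .expect("valid UTF-8")
        .to_owned();
    println!("{}", text);
    greeting_free(ptr);
}
```

With this shape, the Ruby side decides when the memory is released but never frees it itself; forgetting to call the free function still leaks, which is the trade-off described above.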

There is a possibility Rust has been made too arcane for wide adoption. It remains to be seen whether Swift – now open source – will take the stage in systems programming. But for now, if you want something fast and safe, you may find a friend in this not-so-corrosive language.

Frequently Asked Questions (FAQs) about Ruby and Rust

What are the key differences between Ruby and Rust in terms of performance?

Ruby is a dynamic, interpreted language, so it is generally slower than compiled languages like Rust. Rust, on the other hand, is a statically-typed, compiled language that is designed for performance and safety, particularly safe concurrency. It has a more complex syntax than Ruby but offers better performance, especially in tasks that require heavy computation or system-level programming.

Can Ruby code be optimized using Rust?

Yes, Ruby code can be optimized using Rust. Rust’s performance benefits can be leveraged in Ruby by writing critical performance sections of the code in Rust and then using FFI (Foreign Function Interface) to call that code from Ruby. This allows you to get the best of both worlds – the simplicity and elegance of Ruby and the performance of Rust.

How does Rust ensure memory safety without a garbage collector?

Rust ensures memory safety through its ownership system. In Rust, each value has a variable that’s called its owner. There can only be one owner at a time, and when the owner goes out of scope, the value will be dropped. This system prevents common programming errors like dangling pointers, double frees, and data races.

What are the use cases where Rust would be a better choice than Ruby?

Rust would be a better choice in scenarios where performance is critical. This includes system-level programming, game development, and other areas where you need to have direct control over the system resources. Rust’s memory safety features also make it a good choice for concurrent programming where data races can be a concern.

How difficult is it to integrate Rust into a Ruby project?

Integrating Rust into a Ruby project can be challenging if you’re not familiar with Rust’s syntax and concepts. However, there are libraries like Rutie that provide a bridge between Ruby and Rust and make the integration easier. It’s also important to note that you’ll need to have a good understanding of both languages to effectively optimize your Ruby code with Rust.

Can Rust replace Ruby in web development?

While Rust has many advantages in terms of performance and safety, it’s not typically used for web development. Ruby, with frameworks like Ruby on Rails, provides a more productive environment for building web applications. However, Rust can be used to write performance-critical parts of a web application, such as database interactions or computations.

What are the challenges in porting Ruby code to Rust?

Porting Ruby code to Rust can be challenging due to the differences in the languages. Ruby is a dynamic, interpreted language with a focus on developer productivity and happiness, while Rust is a statically-typed, compiled language with a focus on performance and safety. This means that idiomatic Ruby code may not translate directly into idiomatic Rust code, and you’ll need to carefully consider how to best structure your Rust code to achieve the desired performance improvements.

How does Rust handle error handling compared to Ruby?

Rust uses a system of return values for error handling, which is different from Ruby’s exception-based error handling. In Rust, functions that can fail return a Result type, which is an enum that can be either Ok(value) or Err(error). This forces you to explicitly handle errors, which can lead to more robust code.
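
As a brief sketch of what this looks like in practice, the function below returns a Result and the caller has to match on both outcomes. parse_port is a hypothetical helper; the parsing itself uses the standard library's str::parse.

```rust
use std::num::ParseIntError;

// Hypothetical helper: parsing can fail, so failure is part of the return type.
fn parse_port(input: &str) -> Result<u16, ParseIntError> {
    input.trim().parse::<u16>()
}

fn main() {
    // The caller must deal with both variants before using the value.
    match parse_port("8080") {
        Ok(port) => println!("port {}", port),
        Err(e) => println!("bad port: {}", e),
    }
    match parse_port("eighty") {
        Ok(port) => println!("port {}", port),
        Err(e) => println!("bad port: {}", e),
    }
}
```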

Can I use Ruby libraries in Rust?

While it’s technically possible to use Ruby libraries in Rust through FFI, it’s generally not recommended. The differences in the languages and their ecosystems mean that it’s usually more effective to use libraries written in the same language as your code. However, you can use Rust libraries in your Ruby code, which can be a good way to leverage Rust’s performance benefits.

What resources are available for learning Rust?

There are many resources available for learning Rust. The Rust Programming Language book, often referred to as “The Book,” is a comprehensive resource that’s available for free online. There are also many tutorials, blog posts, and online courses available. The Rust community is also very active and welcoming, so don’t hesitate to ask questions if you’re stuck.