Comparing Background Processing Libraries: Sidekiq

Dhaivat Pandya
This entry is part 3 of 3 in the series Comparing Ruby Background Processing Libraries



This is the final part in the background processing series. You can check out the previous posts from the links above. I’ll summarize what we have covered so far.

  • The principles behind background processing (queues, processes etc.).
  • Using delayed_job, which is amazing for getting background processing rolled into your app really quickly. However, it uses ActiveRecord, so it is kind of slow, and you end up mixing background and foreground code throughout your app.
  • Using Resque, which hopes to solve the problems with delayed_job by using Redis and separate worker classes.

However, we can still improve on the performance of Resque. This is where Sidekiq comes in.

Sidekiq

If you read through the theory section in the previous articles (which you really should), you might have noticed that I talked about processes as the only concurrency option. It turns out that’s not really true, and Sidekiq takes full advantage of something else: threads.

Threads are a much lighter-weight unit of concurrency than processes: they live inside a single process and share its memory, so you can run many of them with far less overhead. The headline on the Sidekiq page puts it squarely:

“What if one Sidekiq process could do the work of 20 Resque or DelayedJob processes?”
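To make the idea a bit more concrete, here is a rough sketch in plain Ruby (no Sidekiq involved) of several threads inside one process pulling jobs off a shared queue. The method name run_jobs and its arguments are made up for this illustration, and this is not how Sidekiq is implemented internally; it only shows the general "many threads, one process" shape.

```ruby
# Plain-Ruby illustration of several worker threads sharing one queue.
# run_jobs is a hypothetical helper, not part of Sidekiq.
def run_jobs(jobs, thread_count)
  queue = Queue.new            # thread-safe FIFO from Ruby's core library
  jobs.each { |job| queue << job }
  results = Queue.new          # also thread-safe, so workers can share it

  workers = Array.new(thread_count) do
    Thread.new do
      loop do
        job = begin
          queue.pop(true)      # non-blocking pop; raises ThreadError when empty
        rescue ThreadError
          break                # nothing left to do, so this thread exits
        end
        results << job.upcase  # "process" the job
      end
    end
  end

  workers.each(&:join)         # wait for every worker thread to finish
  Array.new(results.size) { results.pop }
end

# Completion order is not guaranteed, so sort before printing.
puts run_jobs(%w[a b c d e], 3).sort.inspect
```

All three threads here live in a single Ruby process, which is why this approach is so much cheaper than starting three separate processes.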

Let’s check out how to use Sidekiq. If you read through the delayed_job and Resque introductions, Sidekiq is somewhat of a combination of the two. Sidekiq does use Redis, so you will need to have that installed and running (check out the Resque section in the previous articles to see how to do this).

To get started, add the following line to your Gemfile:

gem 'sidekiq'

Install ‘er up:

bundle install

Fortunately, that’s about it for the setup we need. Sidekiq doesn’t hook into Rake the way Resque does, so we don’t have to fiddle around with “lib/tasks”. Let’s get onto writing our print worker in “app/workers/print_worker.rb”. Here it is:

class PrintWorker
  include Sidekiq::Worker

  def perform(str)
    puts str
  end
end

Once again, we have to queue up a job somewhere (we’ll do it in the index controller):

class IndexController < ApplicationController
  def index
    PrintWorker.perform_async(params[:to_print])
  end
end

Finally, we have to get the Sidekiq process(es) fired up. Type this into your shell in the root directory of your Rails app:

bundle exec sidekiq

This starts a process that is waiting for jobs. Let’s put one in the queue.

If you go to “/index/index?to_print=sidekiqisgreat”, you should get “sidekiqisgreat” somewhere in the output of your Sidekiq process. (The Sidekiq runner includes some other information that you can safely ignore for the sake of the example.)
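If you’re wondering what actually travels through Redis when a job is queued, the job is stored as JSON. Here is a simplified sketch of that round trip in plain Ruby. The helpers build_payload and read_payload are hypothetical names for this example, and a real Sidekiq payload carries more fields than the two shown here.

```ruby
require "json"

# Simplified stand-in for the hash that gets pushed onto the queue.
# Only the worker class name and its arguments are kept in this sketch.
def build_payload(worker_name, *args)
  JSON.generate("class" => worker_name, "args" => args)
end

# What a worker process would see after reading the payload back.
def read_payload(json)
  JSON.parse(json)
end

payload = build_payload("PrintWorker", "sidekiqisgreat")
puts payload
puts read_payload(payload)["args"].inspect

# Symbols do not survive the trip through JSON; they come back as strings.
puts read_payload(build_payload("PrintWorker", :hello))["args"].inspect
```

The practical upshot is that job arguments are best kept to simple values (strings, numbers, arrays, hashes), since anything richer will not come out the other side in the same shape.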

Sidekiq is easy enough if you learned Resque, but it has a pretty big “problem” that we haven’t discussed yet: thread safety. Since Sidekiq uses threads, you can only use libraries in Sidekiq if they are thread safe – anything else will likely cause a ton of problems that are difficult to track down. This really limits what you can do with Sidekiq.

Secondly, you must ensure that your own code is thread safe (e.g. global variables are a no-go). Ruby (or rather, MRI, the “default” Ruby interpreter) also has something called a global interpreter lock, which means that only one thread can execute Ruby code at a time. Threads that are waiting on I/O release the lock, so I/O-heavy jobs still benefit on MRI, but for CPU-heavy work Sidekiq will do much better on alternative implementations of Ruby, such as Rubinius and JRuby.
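As a small illustration of what “thread safe” means for your own code, here is a sketch of shared state guarded by a Mutex. SafeCounter is a made-up class for this example; the point is only that any state touched by more than one thread needs some form of protection like this.

```ruby
# Hypothetical example class: a counter that several threads can update.
class SafeCounter
  attr_reader :count

  def initialize
    @count = 0
    @lock = Mutex.new
  end

  # "@count += 1" is a read followed by a write. The mutex makes sure
  # no other thread can sneak in between those two steps.
  def increment
    @lock.synchronize { @count += 1 }
  end
end

counter = SafeCounter.new
threads = Array.new(10) do
  Thread.new { 1_000.times { counter.increment } }
end
threads.each(&:join)

puts counter.count
```

Without the lock, updates can be lost on interpreters that run threads truly in parallel, and that is exactly the kind of bug that is hard to track down.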

What’s the point of going through all that hassle? Why not just use Resque? The biggest reason is performance; the difference is pretty big if you’re processing lots of jobs that would benefit from concurrency.
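Here is a rough way to see that benefit for yourself in plain Ruby. The job below just sleeps, standing in for work that mostly waits (an HTTP call, a database query). All the method names are invented for this sketch, and the timings will vary from machine to machine.

```ruby
# fake_io_job stands in for a job that spends its time waiting.
def fake_io_job(seconds)
  sleep(seconds)
end

# Returns how many seconds the given block took to run.
def time_it
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Run the jobs one after another in the current thread.
def run_sequentially(count, seconds)
  time_it { count.times { fake_io_job(seconds) } }
end

# Run each job in its own thread and wait for all of them.
def run_in_threads(count, seconds)
  time_it do
    Array.new(count) { Thread.new { fake_io_job(seconds) } }.each(&:join)
  end
end

puts format("sequential: %.2fs", run_sequentially(5, 0.1))
puts format("threaded:   %.2fs", run_in_threads(5, 0.1))
```

The threaded run should finish in roughly the time of a single job, even on MRI, because a sleeping (or otherwise waiting) thread lets the others run. If the jobs were busy crunching numbers instead, you would not see this kind of gain on MRI.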

The Final Comparison

Over this three part series, we’ve covered the theory behind background processing and three background processing frameworks: delayed_job, Resque and Sidekiq. Each has its ups and downs.

With delayed_job:
* Pros
– Incredibly quick and easy to get rolling
– no addition to your “stack”; it can run just fine with ActiveRecord
– a fantastic choice for beginners or migrating code from the foreground to the background
* Cons
– Runs on ActiveRecord, so it will probably run slower than something that runs on Redis
– Makes it very easy to mix async and sync code in the same file, which, in my opinion, is a bad thing
– The “.delay” calls scattered across your codebase will make the code difficult to reason about six months later

With Resque:
* Pros
– It runs on Redis, which is fast
– Great separation of background code with worker classes
– It has a fantastic web dashboard
– It makes it fairly easy to do background processing
– It is my favorite!
* Cons
– More difficult to get running than delayed_job
– Still isn’t the fastest!
– Doing prioritized, time-based jobs is not as easy as delayed_job

With Sidekiq:
* Pros
– Pretty darn fast and workers are lightweight
– You can port over code really easily from Resque
– Great separation of code
* Cons
– More difficult than delayed_job
– (this is the biggie) You must use thread-safe libraries and write thread-safe code

As I’ve mentioned, I like Resque the most out of the three. My primary reason to avoid delayed_job is the mixing of sync and async code. When considering Sidekiq, it is the requirement of thread-safe libraries that scares me off.

If your requirements are different, your choice could be different. Just take the time to understand the consequences of your choice! Make sure you don’t try to optimize too early, or you’ll spend time on something that has no relevance to the real bottleneck in your app!

Example app

I’ve built a small example application using Sidekiq. It performs the same things as the delayed_job and Resque examples, but using Sidekiq. It saves and displays page counts on an uploaded PDF synchronously and asynchronously. The most important thing to note is that I had to change about three lines of code in order to switch from Resque to Sidekiq, because the library I was using to count pages in PDFs happened to be thread safe. However, if it hadn’t been, writing a Sidekiq port would have been much more difficult since I would have had to either roll my own library or use a C extension.

The code for the example app can be found here.

Wrapping it up

As you can tell, the Ruby community has presented three unique solutions to a single problem. Each has its own benefits and drawbacks. Pick the one that you like the most and use it in your next amazing app!

If you have any questions or suggestions, do drop them in the comments below!


  • Richard

    I don’t think writing a C extension would make your Sidekiq code thread safe. I think it is quite the opposite actually. C extensions are not thread safe due to not having the GIL. That said, maybe there is a way to access the GIL from a C extension, I’m not sure though.

    • Dhaivat Pandya

      Sorry, I wasn’t too clear in that sentence. What I meant was that I could use a C library that *was* thread-safe with a Ruby wrapper (also thread safe). However, you are right. This post (http://burgestrand.se/articles/asynchronous-callbacks-in-ruby-c-extensions.html) goes into a lot more detail about this, but C extensions will not let you call the Ruby API from a different thread.

  • Anonymous

    Have you tried Torquebox?

    • Dhaivat Pandya

      Rubem,

      I actually haven’t tried TB before; I’ll definitely be checking that out. I like the idea of using a reasonably solid (in my opinion) VM instead of starting from scratch.

  • Anonymous

    Rubem, we are in the midst of developing a large JRuby Rails app that we are most likely gonna deploy on TB. As such, the backgroundable jobs piece will come into play. I plan to blog about that when we get there.