Simple, Organized Queueing with Resque
After having used many different queueing systems, from the venerable BackgrounDRb, to DelayedJob, to roll-your-own solutions, I’ve settled on the excellent Resque.
The now-famous blog post from GitHub’s Chris Wanstrath says it all, and the README has everything you could ever hope for in terms of getting up and running. However, Resque leaves a lot up to the imagination for exactly how to integrate it into your app. This is, of course, one of its strengths, as it allows for a high degree of flexibility.
In this article, I am going to introduce one such integration strategy I’ve found particularly useful, clean, and most importantly, simple.
A Namespace of Its Own
To avoid depending on your database, which can cause all kinds of performance problems down the road, Resque uses Redis as its data store (note: if you’d like to read more about getting up and running with Redis, read my previous article here).
Assuming you have already set up Redis in your app, you probably already have a $redis global variable to represent the connection. Resque uses the redis-namespace gem by default to avoid polluting the key space of your Redis server, but I personally like to have control over important details like this.
Fortunately, Resque allows this, so initializing the connection is as simple as adding the following to config/initializers/resque.rb:
[gist id=”2731334″]
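The embedded gist is not reproduced here, so the following is only a minimal sketch of what such an initializer might look like. It assumes the $redis global mentioned above and that the redis-namespace gem is available; the namespace name "myapp:resque" is a hypothetical placeholder, not something taken from the original code.

```ruby
# config/initializers/resque.rb
# Sketch only: assumes $redis is an existing Redis connection and that the
# resque and redis-namespace gems are installed (Bundler normally loads them).
require "resque"
require "redis/namespace"

# "myapp:resque" is a hypothetical namespace; choose whatever fits your key
# layout. Handing Resque a Redis::Namespace means the key prefix is the one
# you picked rather than Resque's default.
Resque.redis = Redis::Namespace.new("myapp:resque", redis: $redis)
```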
Enqueueing Jobs
Like Delayed Job, which inspired it, Resque needs only one thing to run code in the background: a class or module that responds to the perform method. These objects also specify the name of the queue that processes them.
The most evident solution to integrating Resque, therefore, is to have one of these objects per background task. For example, you might have one class for processing user-uploaded images, another class for sending your monthly newsletter to all users, and yet another for updating search indices.
As you might imagine, the number of these workers is going to increase over time. Furthermore, Resque prioritizes queues based solely on the order in which they are specified to a worker, so you’re going to need to remember which workers act on which queues. If you have an app where new background tasks are constantly being added, removed, or re-prioritized, this is not only going to be confusing, but will also necessitate a lot of upkeep.
A Cleaner Approach
Rather than focusing on what needs to happen in the background, let’s focus on when. That is, my approach is to let priority be the guiding factor on how to design an interface to Resque.
The first step is to note that 99% of the time, background jobs are going to fall into one of three possible priorities: high, normal, and low. While it will be easy to add more priorities later, it will only happen in very rare circumstances.
Having only three priorities also makes worker configuration much simpler. Every worker is always assigned to these three queues, so except for the count of workers, the command is always the same:
[gist id=”2731496″]
If necessary, to throw more processing power at the queue, simply spawn more workers:
[gist id=”2731506″]
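To see why one fixed worker command is enough, it helps to picture how a worker chooses its next job. The sketch below is a simplified simulation, not Resque's actual implementation. It assumes the worker was started with the queues listed in the order high, normal, low (for example via QUEUE=high,normal,low), and the job names are made up for illustration.

```ruby
# Simplified model of a worker picking its next job: it walks the queue names
# in the order they were given and takes from the first queue that has work.
# Illustration only; the reserve method and job names are hypothetical.
def reserve(queues, order)
  order.each do |name|
    job = queues[name].shift
    return [name, job] if job
  end
  nil # every queue is empty
end

queues = {
  "high"   => ["send_password_reset"],
  "normal" => ["resize_image", "update_index"],
  "low"    => ["send_newsletter"]
}
order = %w[high normal low] # the order the worker was started with

# Drain the queues the way a single worker would.
while (picked = reserve(queues, order))
  name, job = picked
  puts "#{name}: #{job}"
end
```

Nothing in the low queue runs until high and normal are empty, which is why every worker can share the same queue list.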
Enqueue Methods Instead
The next step in this approach is to make it easier to throw any instance or class method onto one of these queues. This is where the dynamic nature of Ruby is going to help out a lot. To begin at the end, we’re going to be able to enqueue anything like this:
[gist id=”2731551″]
In addition, it’s not going to matter if some_object is a class, module, or instance. We can change the priority of a method simply by changing between Queue::Normal, Queue::High, Queue::Low.
Let’s look at how we can code this. The above interface clearly dictates that each class will need to specify an enqueue method. We also know that each class needs to specify a queue name and a perform method, so that’s a good place to start:
[gist id=”2731581″]
We can already see that these classes have the same interface, so let’s refactor it into a superclass:
[gist id=”2731680″]
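Since the gist is not reproduced here, this is a rough sketch of what such a superclass could look like. Deriving the queue name from the class name is my assumption, not necessarily what the original code does. It relies on Resque reading the queue name from a class-level queue method when no @queue instance variable is set.

```ruby
# In current Ruby versions Queue already exists as the built-in thread-safe
# queue class, so this sketch reopens it as a namespace instead of declaring
# `module Queue`. The Queue::High style of naming still works.
class Queue
  class Base
    # Assumption of this sketch: the queue name is the last part of the class
    # name, so Queue::High -> :high, Queue::Normal -> :normal, and so on.
    def self.queue
      name.split("::").last.downcase.to_sym
    end
  end

  # Each priority is just an empty subclass.
  class High   < Base; end
  class Normal < Base; end
  class Low    < Base; end
end

puts Queue::High.queue # prints "high"
```

A class-level method is used rather than @queue because class-level instance variables are not inherited by subclasses.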
The enqueue method’s only job is to supply data to the perform method, which will be called by Resque. The perform method needs to find the object by ID, invoke the enqueued method, and pass the enqueued arguments to it. Since everything passed to Resque needs to be serializable to JSON, we need to pass the class name of the object, the method, and the object’s ID as a special “meta” argument:
[gist id=”2757896″]
Then, in the perform method, we can make use of the constantize Rails extension to get the class from its name, find the object, and send the method along with its arguments:
[gist id=”2757909″]
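Putting enqueue and perform together, a sketch under a few stated assumptions might look like the following. So that it runs without Redis or Rails, it defines a tiny stand-in for Resque when the real gem is not loaded, uses Object.const_get where a Rails app would use constantize, and invents a Photo class with a find_by_id method in place of a real model. The meta key names are also my choice.

```ruby
require "json"

# Stand-in used only when the real resque gem is not loaded: it round-trips
# the arguments through JSON and calls perform right away. Real Resque stores
# the job in Redis and a worker calls perform later.
unless defined?(Resque)
  module Resque
    def self.enqueue(klass, *args)
      klass.perform(*JSON.parse(JSON.generate(args)))
    end
  end
end

# Hypothetical stand-in for an ActiveRecord model.
class Photo
  STORE = {}
  attr_reader :id, :sizes

  def initialize(id)
    @id = id
    @sizes = []
    STORE[id] = self
  end

  def self.find_by_id(id)
    STORE[id]
  end

  def resize(width, height)
    @sizes << [width, height]
  end
end

class Queue # reopens Ruby's built-in Queue class as a namespace
  class Base
    def self.queue
      name.split("::").last.downcase.to_sym
    end

    # Only JSON-friendly values are sent: class name, method name, and ID.
    def self.enqueue(object, method, *args)
      meta = { "class" => object.class.name, "method" => method.to_s, "id" => object.id }
      Resque.enqueue(self, meta, *args)
    end

    # Called by the worker with the deserialized meta hash and arguments.
    def self.perform(meta, *args)
      # In a Rails app, meta["class"].constantize does the same lookup.
      model = Object.const_get(meta["class"]).find_by_id(meta["id"])
      model.send(meta["method"], *args)
    end
  end

  class Normal < Base; end
end

photo = Photo.new(42)
Queue::Normal.enqueue(photo, :resize, 800, 600)
p photo.sizes # prints [[800, 600]]
```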
And with that, we are ready to enqueue any instance method. Here is an example of how the big picture might look:
[gist id=”2863154″]
That’s all there is to it. The only caveat is that all arguments passed to background_method are going to be serialized to JSON, and then deserialized back into Ruby. Usually this will not cause any problems, but one big difference is that all hashes with symbol keys will have string keys in background_method.
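The effect of that round trip is easy to reproduce with Ruby's standard json library. The round_trip helper and the sample options hash below are hypothetical and only mimic what happens to arguments between enqueue and perform.

```ruby
require "json"

# Mimics what happens to job arguments between enqueue and perform.
def round_trip(args)
  JSON.parse(JSON.generate(args))
end

original = [{ size: :large, crop: true }]
received = round_trip(original)

puts received.first.key?(:size)   # false: the symbol key is gone
puts received.first.key?("size")  # true
puts received.first["size"].class # String: symbol values become strings too

# One possible fix inside the background method
# (Rails' symbolize_keys does the same job):
options = received.first.transform_keys(&:to_sym)
puts options[:size]               # large
```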
Enqueueing Class or Module Methods
Only one final step remains. We also want to enqueue class or module-level methods. That is, we don’t always need to find an instance by ID to accomplish certain background tasks. For example, we might want to send an email to every registered user. The code would look something like this:
[gist id=”2863149″]
This is going to require some slight modification to the enqueue and perform methods, since we need to be able to tell the difference between an enqueued class or module and an enqueued object.
To do this, we check whether the object’s class responds to :find_by_id. This works because if object is a class or module, its class is Class or Module, neither of which responds to :find_by_id. If object is not a model instance, we do not add the 'id' key to the meta information.
As such, the perform method has only to check for the existence of this key to determine whether to invoke the method directly on the object, or to find an instance by ID first:
[gist id=”2758040″]
Note: this article assumes the ActiveRecord ORM. Depending on your application, you may need to modify the definition of is_model? to more accurately specify what constitutes a model instance.
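The distinction between model instances and classes or modules can be sketched on its own, outside of Queue::Base. The helper names build_meta and dispatch, the User class, and the Newsletter module are all hypothetical; only is_model? and the find_by_id check come from the text above.

```ruby
# Hypothetical model-like class; ActiveRecord classes respond to find_by_id.
class User
  ALL = {}
  attr_reader :id, :welcomed

  def initialize(id)
    @id = id
    ALL[id] = self
  end

  def self.find_by_id(id)
    ALL[id]
  end

  def welcome!
    @welcomed = true
  end
end

# Hypothetical module with a module-level background task.
module Newsletter
  def self.sent
    @sent ||= []
  end

  def self.deliver(edition)
    sent << edition
  end
end

# A model instance is anything whose class can look records up by ID.
def is_model?(object)
  object.class.respond_to?(:find_by_id)
end

# What enqueue would send along: the "id" key is present only for instances.
def build_meta(object, method)
  if is_model?(object)
    { "class" => object.class.name, "method" => method.to_s, "id" => object.id }
  else
    { "class" => object.name, "method" => method.to_s }
  end
end

# What perform would do with that meta hash.
def dispatch(meta, *args)
  target = Object.const_get(meta["class"])
  target = target.find_by_id(meta["id"]) if meta.key?("id")
  target.send(meta["method"], *args)
end

user = User.new(1)
dispatch(build_meta(user, :welcome!))
dispatch(build_meta(Newsletter, :deliver), "June")
p user.welcomed   # prints true
p Newsletter.sent # prints ["June"]
```

Note that passing the User class itself (rather than an instance) is also treated as a class-level call, because its class is Class.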
Robustness
We have a working queue interface, but there’s still plenty of room for programmer error. The enqueue method should ideally raise exceptions when the programmer attempts to enqueue a non-existent method, or tries to pass too many or too few arguments to that method.
By adding a few quick checks, the number of queue-side failures can be reduced dramatically. Let’s add a method, ensure_queueable!, to Queue::Base that raises an exception unless the method exists and the appropriate number of arguments has been passed. With these changes in place, the entire Queue::Base class looks like this:
[gist id=”2758127″]
Note: checking method arity is somewhat complex, as methods that accept optional or variable arguments report a negative arity (-n - 1, where n is the number of required arguments). See the Ruby documentation of Method for more information.
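As a rough illustration of the arity check, here is a standalone sketch of ensure_queueable!. In the article's design it would be a class method on Queue::Base, called at the top of enqueue. The choice of ArgumentError, the messages, and the Report class are my assumptions, and the check ignores keyword arguments and the upper bound on optional arguments.

```ruby
# Raises ArgumentError unless `object` responds to `method` and the number of
# arguments is compatible with the method's arity. Sketch only: keyword
# arguments and the maximum number of optional arguments are not checked.
def ensure_queueable!(object, method, *args)
  unless object.respond_to?(method)
    raise ArgumentError, "#{object.inspect} does not respond to #{method}"
  end

  arity = object.method(method).arity
  ok = if arity >= 0
         args.size == arity      # fixed number of arguments
       else
         args.size >= -arity - 1 # at least the required arguments
       end
  unless ok
    raise ArgumentError, "wrong number of arguments for #{method} (#{args.size} given)"
  end

  true
end

# Hypothetical class used to exercise the check.
class Report
  def publish(year); end               # arity 1
  def build(year, format = "pdf"); end # arity -2: one required, one optional
end

report = Report.new
ensure_queueable!(report, :publish, 2012)      # passes
ensure_queueable!(report, :build, 2012)        # passes, format is optional
ensure_queueable!(report, :build, 2012, "csv") # passes

begin
  ensure_queueable!(report, :publish)          # too few arguments
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```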
That’s All
With that, we have a simple and clean interface to Resque which allows us to enqueue any instance or class method with minimum effort. We can also add new priorities in a few lines of code by simply defining a new subclass of Queue::Base.
Furthermore, we have provided a single entry point into the queue. It uses Resque as its implementation, but should we decide to swap out Resque for another solution in the future, we should only need to make changes to Queue::Base.
I hope this article has been useful. Have fun with Resque, and happy queueing!