Put Your Work in a Qu

If you’ve ever written an application that had to stay responsive while also performing slow or long-running tasks, then you’ve probably read about using background jobs. In the wonderful world of Ruby there are quite a few options for handling background jobs. Today, you’ll get to meet Qu.

Background on Qu

What is Qu? As with all good projects, the README provides a nice description:

Qu is a Ruby library for queuing and processing background jobs. It is heavily inspired by delayed_job and Resque.

Brandon Keepers created Qu to overcome shortcomings of existing queueing libraries while helping develop Harmony, Gaug.es and SpeakerDeck at Ordered List.

A few of the shortcomings that Qu attempts to overcome include:

Silent fail
- Some existing solutions would fail, and you’d never know about it.
Single persistence backend
- Being designed for only a single persistence backend (such as the Resque/Redis combination) limits the environments in which a queueing library can be used.
Jobs being performed multiple times
- Some queueing libraries showed contention in ActiveRecord that allowed one job to be run by multiple workers.

There are several other issues outlined in the README and, more recently, on Brandon’s blog.

Besides aiming to resolve those issues, Qu will requeue jobs when a worker is killed so no jobs are ever lost. Also, if you’re familiar with Resque you’ll notice Qu has a “Resque-like” API, which might help you transition.

Silent Fail

When a job runs, how do you handle the failures? Where do exceptions go? Some libraries simply do nothing, while others will log an error. When Qu performs a job, it will catch any type of exception and log it to a “failure” collection in the database. This is a handy feature, but it shouldn’t be your only method of handling errors.

Besides logging the exception, your own job code should catch exceptions and react accordingly. Naturally that sort of code can suffer from boiler-plate-itis, a disease where you’re writing the same sort of code over and over.
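To make that boilerplate concrete, here is a minimal sketch of a job that rescues its own errors. The FetchPage class and its notify helper are hypothetical names made up for illustration; they are not part of Qu. The job reacts to the error and then re-raises it, on the assumption that the queueing library should still get the chance to record the failure.

```ruby
# Hypothetical job, not taken from the Qu README. It rescues its own
# errors so it can react before handing the exception back to the caller.
class FetchPage
  # Stand-in for a real reaction (email, error tracker, cleanup, ...).
  def self.notify(error, url)
    warn "FetchPage failed for #{url}: #{error.class}: #{error.message}"
  end

  def self.perform(url)
    raise ArgumentError, "url must not be empty" if url.to_s.empty?
    # ... the real work would go here ...
    :ok
  rescue StandardError => e
    notify(e, url)
    raise # re-raise so the caller (e.g. a worker) still sees the failure
  end
end
```

Every job that wants this behavior ends up repeating the same rescue/notify/raise block, which is exactly the repetition described above.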

Once Qu’s planned hooks feature (discussed later in this article) is ready, handling errors that occur while performing a job will be simpler and cleaner to implement.

Single Persistence Backend

Using a queueing library that’s chained to a single persistence backend might not be a problem for many projects, but it can be a constraint for some. Perhaps you’re not sure if you want to use Redis or MongoDB for a project you’re working on. Or perhaps you’ve been using Redis but have found it’s not meeting your needs. In most cases this would require that you change your queuing library.

Qu has an API that abstracts the persistence backend which allows for many persistence solutions to be used. As of version 0.1.3 Qu has “out of the box” support for Redis and MongoDB. Later in this article you’ll be able to read more about what Brandon, the author of Qu, thinks about the future of supporting additional persistence backends.
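The idea of abstracting the backend can be sketched in a few lines of plain Ruby. Everything here (MemoryBackend, TinyQueue, Shout, and the enqueue/reserve method names) is hypothetical and is not Qu’s real backend interface; it only shows how queueing code can depend on a small contract rather than on a particular database.

```ruby
# Illustrative only: these names are made up and are NOT Qu's actual API.
class MemoryBackend
  def initialize
    @queues = Hash.new { |hash, name| hash[name] = [] }
  end

  # Store a job payload at the tail of the named queue.
  def enqueue(queue, payload)
    @queues[queue].push(payload)
    payload
  end

  # Remove and return the oldest payload, or nil if the queue is empty.
  def reserve(queue)
    @queues[queue].shift
  end
end

# The queueing code only relies on enqueue/reserve, so a Redis- or
# MongoDB-backed class with the same two methods could be swapped in.
class TinyQueue
  def initialize(backend)
    @backend = backend
  end

  def enqueue(job_class, *args)
    @backend.enqueue(:default, { "class" => job_class.name, "args" => args })
  end

  # Perform every waiting job and return how many were run.
  def work_off
    performed = 0
    while (payload = @backend.reserve(:default))
      Object.const_get(payload["class"]).perform(*payload["args"])
      performed += 1
    end
    performed
  end
end

# A trivial job that records its output so the result can be inspected.
class Shout
  def self.results
    @results ||= []
  end

  def self.perform(word)
    results << word.upcase
  end
end

queue = TinyQueue.new(MemoryBackend.new)
queue.enqueue(Shout, "hello")
queue.enqueue(Shout, "qu")
queue.work_off
```

Switching persistence in this sketch means passing a different object to TinyQueue.new; the job classes and the enqueueing code stay untouched.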

Jobs Performed Once

One of the issues Qu tries to solve is the possibility of contention that might arise when multiple workers are accessing the database. Brandon observed this issue with ActiveRecord and set out to fix it in Qu. The LPOP and BLPOP commands are used with the Redis backend when workers grab jobs. The same sort of behavior is achieved with MongoDB by way of its findAndModify operation.
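Why an atomic “pop” prevents duplicate work can be shown without any database at all. In the sketch below, Ruby’s thread-safe Queue stands in for the backend, and its non-blocking pop plays the role of an atomic operation such as LPOP. This is an analogy only, not Qu’s implementation, and run_workers is a hypothetical helper name.

```ruby
require "thread" # Queue is built in on modern Rubies; the require is harmless

# Start worker_count threads that compete for job_count jobs and return
# a log of [worker_id, job] pairs showing who performed what.
def run_workers(job_count, worker_count)
  jobs = Queue.new
  job_count.times { |i| jobs << i }
  performed = Queue.new # thread-safe log

  workers = Array.new(worker_count) do |worker_id|
    Thread.new do
      loop do
        begin
          # Non-blocking pop: removing and returning the job is one atomic
          # step, so two workers can never receive the same job.
          job = jobs.pop(true)
        rescue ThreadError
          break # queue is empty, this worker is done
        end
        performed << [worker_id, job]
      end
    end
  end
  workers.each(&:join)

  log = []
  log << performed.pop until performed.empty?
  log
end

log = run_workers(20, 4)
job_ids = log.map { |_worker, job| job }.sort
puts "each job performed exactly once: #{job_ids == (0...20).to_a}"
```

The contention problem appears when fetching a job and marking it as taken are two separate steps, because another worker can slip in between them.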

Sample

Below is a simple example which demonstrates creating a job and adding it to the queue. Afterwards a worker is started to process the job.

class ScrapeWebsite
  @queue = :data    # not required, defaults to :default

  def self.perform(url)
    results = url.scrape   # placeholder: assumes your app provides a scrape method
    results.save
  end
end

# queue the job

job = Qu.enqueue ScrapeWebsite, "http://github.com"

From the command-line you can start up a worker to process any jobs in the “data” queue.

$ bundle exec rake qu:work QUEUES=data

Pretty simple, right? One thing you’ll notice is that your “job” only needs to have a class method named perform (similar to Resque). This is likely going to change in a future version of Qu (the current release is 0.1.3) to help support hooks and other features. The goal of hooks is to allow developers to “hook in” to the job lifecycle (before, after, etc.).

Also, if you deploy apps to Heroku, Qu is ready for that. It can easily connect to MongoDB by using the MONGOHQ_URL environment variable. Similarly, Qu can connect to Redis by using the REDISTOGO_URL environment variable!

Interview with Brandon

Brandon was nice enough to do a little Q&A with me via GitHub.

Any plans to add other persistence backends?

I’d at least like to see a generic SQL backend. A PostgreSQL-specific backend based on Ryan Smith’s queue_classic would be pretty awesome. I don’t have any others on the radar, but I think there is a lot of potential for other creative uses.

Any plans to add scheduling? (i.e. run this job every hour)

No, but I wouldn’t be opposed to it. The implementation of this will be very different across multiple backends, so it will take a lot of forethought to get the API right. If someone wanted to work on it, I’d be happy to help them.

What’s the roadmap for Qu (e.g. getting to 1.0)?

There’s still a lot of work to do on some basic features that we’ve come to know and love from other queuing systems (hooks, plugins, stats), and of course a gorgeous web interface. The things that are on my immediate radar are listed in GitHub Issues: https://github.com/bkeepers/qu/issues.

I’ve started the work on hooks, which will be an essential step to a good plugin API. To support hooks, I made some changes to how jobs are defined. I’m really excited about the idea of jobs being a richer object than the class method approach that Resque takes.

Anything else you’d like to mention?

There have already been a few forks and pull requests, but I would love to see more people get involved.

Conclusion

Qu is another tool for processing jobs in the background, with some interesting features and a clean API. It’s still early days for the project, which is both good and bad. Good because it’s easy to get help, provide feedback and contribute! Of course, since it’s “young”, there are certainly going to be bugs (which all software has), a changing API and potentially missing features.

Have you used other queueing and job processing libraries? If so, which ones? Which did you choose and why?