The (Silver) Bullet for the N+1 Problem

Rails is one of the popular choices when it comes to building your Minimum Viable Products(MVP). You can bootstrap, prototype, and release an application with such ease that Rails is many developers default choce for an MVP.

Usually while developing these prototypes, most developers don’t consider the performance indexes, which really shouldn’t be a concern at that time, anyway. But when the application needs to be scaled and optimized, these performance indexes comes into play. Then the same developersn need to focus on how the application can be refactored for speed and performance.

One potential area of improvement is the queries your application sends to the database. Reduce the number of queries, increase your application’s performance.

Most Rails applications have data distributed over many models with associations between them, using ORMs for access. ORMs can help you to address the impedance mismatch between relational databases and object oriented models, hopefully making your life simpler. But not knowing about some of their pitfalls can decrease your application performance dramatically. One such pitfall is the select N+1 problem.

What is the N+1 Query Problem ?

This problem occurs when the code needs to load the children of a parent-child relationship (the “many” in the “one-to-many”). Most ORMs have lazy-loading enabled by default, so queries are issued for the parent record, and then one query for EACH child record. As you can expect, doing N+1 queries instead of a single query will flood your database with queries, which is something we can and should avoid.

Consider a simple blog application which has many articles published by different authors:

#Articles model
class Article < ActiveRecord::Base
  belongs_to :author
end

#Authors model
class Author < ActiveRecord::Base
  has_many :posts
end

We want to list the 5 most recent articles on the sidebar of the article, along with their title and author’s name.

This could be achieved using the following

#In our controller
@recent_articles = Article.order(published_at: :desc).limit(5)

#in our view file
@recent_articles.each do |article|
    Title: <%= article.title %> 
    Author:<%= article.author.name %>
end

The above code will send 6 (5+1) queries to the database, 1 to fetch the 5 recent articles and then 5 for their corresponding authors. In the above case, since we are limiting the number of queries to 5, we won’t see this issue affecting our application performance much. However, for queries with larger limit, this could be fatal.

Each query has quite a bit of overhead. It is much faster to issue 1 query which returns 100 results than to issue 100 queries which each return 1 result. This is particularly true if your database is on a different machine which is, say, 1-2ms away on the network. Here, issuing 100 queries serially has a minimum cost of 100-200ms, even if they can be satisfied instantly by MySQL.

Solution – Eager Loading

Eager loading is the mechanism for loading the associated records of the objects returned by Model.find using as few queries as possible. In the above example, if we use eager loading to fetch the data, the details of the author will be loaded along with the articles.

In Rails, ActiveRecord has a method called includes, which ensures that all the associated datas specified are loaded with the minimum number of queries.

Let’s refactor our code to eager load associations.

#In our controller
#Using includes(:authors) will include authors model.
@recent_articles = Article.order(published_at: :desc).includes(:authors).limit(5)

#in our view file
@recent_articles.each do |article|
    Title: <%= article.title %> 
    Author:<%= article.author.name %>
end

How Eager Loading Can Prevent the N+1 Query Problem ?

Eager loading is the solution to the N+1 query problem, making sure you don’t end up running unnecessary queries while looping through an object.

The queries in our example go from

Article Load (0.9ms) SELECT 'articles'.* FROM 'articles'
Author Load (0.4ms) SELECT 'authors'.* FROM 'authors' WHERE 'authors'.'id' = ? ORDER BY 'authors'.'id' ASC LIMIT 1 [["id", 1]]
Author Load (0.3ms) SELECT 'authors'.* FROM 'authors' WHERE 'authors'.'id' = ? ORDER BY 'authors'.'id' ASC LIMIT 1 [["id", 2]]
Author Load (0.4ms) SELECT 'authors'.* FROM 'authors' WHERE 'authors'.'id' = ? ORDER BY 'authors'.'id' ASC LIMIT 1 [["id", 3]]
Author Load (0.3ms) SELECT 'authors'.* FROM 'authors' WHERE 'authors'.'id' = ? ORDER BY 'authors'.'id' ASC LIMIT 1 [["id", 4]]    
Author Load (0.4ms) SELECT 'authors'.* FROM 'authors' WHERE 'authors'.'id' = ? ORDER BY 'authors'.'id' ASC LIMIT 1 [["id", 5]]

Article Load (0.4ms) SELECT 'articles'.* FROM 'articles'
Author Load (0.4ms) SELECT 'authors'.* FROM 'authors' WHERE 'authors'.'id' IN (1,2,3,4,5)

Less queries for great good. :)

The Bullet Gem

Bullet is a gem, written by Richard Huang, that helps reduce the number of queries an application makes. It was first posted in 2009 as a plugin, an it is still a pretty useful gem to monitor your application for performance improvements. Bullet helps you monitor the queries your application is making to the database and notifies you of any N+1 scenarios. Interestingly, it also notifies you of any unused eager loads.

Bullet has many ways to notify you of N+1 query problems: Growl notifications, JavaScript alerts by default, and even using XMPP. Additionally, it saves out to bullet.log the exact line and stack trace of what caused the alert. If you want it to, it can also write to the application log.

Usage and Configuration

Add Bullet to your Gemfile and run bundle install.

group :development do
  gem 'bullet'
end

The gem should only be used in the development environment, as you won’t want the users of your application getting alerts about the N+1 querry problems.

The next thing to do is configure the way Bullet notifies you.

General Configuration

Bullet should be enabled in the application, just adding the bullet gem won’t notify you of bad queries. Configuration is done in the config/environments/development.rb.

config.after_initialize do
  #Enable bullet in your application
  Bullet.enable = true
end

Notify Via Javascript Alerts

Bullet can be configured to notify the developer using a simple javascript alert. An alert box pops up while loading the pages that run the N+1 queries. To configure javascript alerts, add the following code to the configuration block above

Bullet.alert = true

Notify via Browser Console

If you don’t like alert boxes popping up all over your application, you could use the browsers console to notify you of the N+1 queries by adding

Bullet.console = true

Notify via Rails Logs

Bullet can also append a notification to your Rails log about your bad queries. Thus, if you are using some analyzer tools for your logs, you could add the following

Bullet.rails_logger = true

Log to a File

If you need the queries to be logged to a file, Bullet lets you do that by

Bullet.bullet_logger = true

which creates a log file named bullet.log with all your bad queries.

Notify via Growl Notifications

If you prefer growl notifications, you can enable the growl support using

Bullet.growl = true

Conclusion

That wraps up my whirlwind tour of the N+1 query problem and the Bullet gem. If you have association in your Rails app, you should start using these tricks today.

Resources

Rubygems : Bullet gem Github : Source Code RailsCast : Tutorial

Frequently Asked Questions (FAQs) about N+1 Problem

What exactly is the N+1 problem in ORM?

The N+1 problem is a common issue in Object-Relational Mapping (ORM) where an application executes N+1 queries to fetch data from the database, where N is the number of records. This happens when the application fetches related data in a separate query for each record, leading to a significant performance hit. It’s called the N+1 problem because for N objects, you make 1 query to fetch the objects and N more queries to fetch related data for each object.

How does the N+1 problem affect application performance?

The N+1 problem can significantly degrade application performance. Each additional query to the database consumes resources and adds latency. If you’re dealing with a large number of records, the cumulative effect of these additional queries can lead to noticeable slowdowns, increased server load, and potentially even timeouts or crashes in extreme cases.

How can I identify the N+1 problem in my application?

Identifying the N+1 problem can be challenging, as it may not be immediately obvious. One way to detect it is by monitoring your application’s database queries. If you notice a pattern of repeated queries fetching related data for each record, that’s a sign of the N+1 problem. Tools like database profilers or ORM debug logs can help you spot these patterns.

What are some strategies to solve the N+1 problem?

There are several strategies to solve the N+1 problem. One common approach is eager loading, where you fetch all related data in a single query instead of separate queries for each record. Another approach is batch loading, where you fetch related data in batches to reduce the number of queries. The right solution depends on your specific use case and the capabilities of your ORM framework.

What is eager loading and how can it help solve the N+1 problem?

Eager loading is a strategy where you fetch all related data in a single query when you fetch the main data. This can significantly reduce the number of queries and solve the N+1 problem. However, it can also lead to fetching more data than necessary, so it’s important to use it judiciously.

What is batch loading and how can it help solve the N+1 problem?

Batch loading is a strategy where you fetch related data in batches, reducing the number of queries. Instead of making a separate query for each record, you make a single query for a batch of records. This can be more efficient than eager loading if you’re dealing with a large number of records.

Can the N+1 problem occur in non-ORM contexts?

Yes, the N+1 problem can occur in any context where you’re fetching related data in separate queries for each record. While it’s commonly associated with ORM, it’s not exclusive to it. Any application that interacts with a database can potentially run into the N+1 problem.

How can I prevent the N+1 problem in my application?

Preventing the N+1 problem requires careful design and planning. When fetching data, consider how you can fetch related data efficiently. Use strategies like eager loading or batch loading where appropriate. Monitor your application’s database queries to spot potential N+1 problems early.

Are there any tools that can help me solve the N+1 problem?

Yes, many ORM frameworks provide tools to help solve the N+1 problem. For example, they may provide options for eager loading or batch loading. There are also third-party tools and libraries that can help you monitor your database queries and detect potential N+1 problems.

Can the N+1 problem affect the scalability of my application?

Yes, the N+1 problem can significantly affect the scalability of your application. As your data grows, the impact of the N+1 problem can become more severe. Solving the N+1 problem is crucial for ensuring your application can scale effectively.