N + 1: When More Queries Is a Good Thing

Share this article

N + 1: When More Queries Is a Good Thing

Over the last week I have been trying to understand how eager loading works in Rails to eliminate the infamous N+1 query problem by reducing the number of queries fired. My initial hypothesis was that reducing the number of queries as much as possible was the goal. However, I was surprised by what I discovered.

Using includes to Reduce Queries

Most posts that you read about the infamous N + 1 Query Problem cite the includes method to address the issue. includes is used to eager load associations related to the model by using the minimum number of queries possible. For this, under the hood, it uses a preload or left outer join, depending on the situation. I will explain both situations in subsequent sections.

This is nicely explained with examples in Active Record Query Methods documentation.

When and How to Use includes?

Suppose that our user can have many posts and can comment on any post. Each post can have many comments. The basic structure is shown in snippet shown below:

# models/users.rb
class User < ApplicationRecord
  has_many :posts
  has_many :comments
end

# models/posts.rb
class Post < ApplicationRecord
  has_many :comments
  belongs_to :user
end

# models/comments.rb
class Comment < ApplicationRecord
  belongs_to :user
  belongs_to :post
end

Now, if we want a user’s information with posts made by the user along with their comments, simply calling User.all will first load users, then it will fetch the posts by each user. After fetching the posts, it will fetch the comments made by the user for those posts. If we have 10 users each having 5 posts, and on average 2 comments on each post, one User.all will end up doing around 1 + 5 + 10 queries.

# users_controller.rb
def index
  @users = User.all
  render json: @users
end

A simple solution is to use includes to tell Active Record that we want to fetch users and all related posts:

@users = User.all.includes(:posts)
posts are preloaded, comments are not preloaded
posts are preloaded, comments are not preloaded

This improvises performance a bit, as it fetches users first, and then in the subsequent query it fetches posts related to those users. Now the previous 1 + 5 + 10 queries are reduced to 1 + 1 + 10 queries. But this will be a lot better if comments related to posts are loaded in advance, as well. This will reduce it all down to 1 + 1 + 1, totaling 3 queries to fetch all the data. Look at the snippet shown below to understand:

# users_controller.rb
def index
  @users = User.all.includes(:posts => [:comments])
  render json: @users
end
All data is loaded in just 3 queries, one for users, one for posts, and one for comments related to posts
All data is loaded in just 3 queries, one for users, one for posts, and one for comments related to posts

All data is loaded in just 3 queries, one for users, one for posts, and one for comments related to posts

Passing comments in an array tells active record to preload comments related to the posts as well. If some relationship of comments needs to preloaded, we can change arguments passed to the includes methods, like so:

User.all.includes(:posts => [:comments => [:another_relationship]])

This way, any number of nested relationships can be preloaded. For all of the above queries includes uses preload.

Fetching Posts with a Specific Title

User.all.includes(:posts => [:comments]).where('posts.title = ?', some_title)

This will raise an error. Whereas,

User.all.includes(:posts => [:comments]).where(posts: {    title: some_title })

will give us the expected result. This happens because when hash conditions are passed, a left outer join of users and posts is performed to fetch users with posts having the specific title.

Specifying hash conditions on  included relations
Specifying hash conditions on included relations

But, what if we want to use a pure string or array conditions instead of hash conditions to specify conditions on included relations? Look at the following example:

User.all.includes(:posts => [:comments]).where('posts.title = ?', some_title).references(:posts)

Notice the references(:posts) part? references tells includes to force join the posts relation with a left outer join. To understand this, see the example query generated by the above line of code:

Example query when posts relation is force joined with includes through references
Example query when posts relation is force joined with includes through references

We reduced the number of queries from 1 + 5 + 10 to 1 query. That’s great!

But, Less is NOT Always More

Look at the last two example queries. Both are 3 to 4 lines long, and are have substrings like t0_r1, t0_r2, … , t2_r5. This seems unusual. I am not an SQL expert and didn’t know what this means. These are known as a CROSS JOIN or CARTESIAN join.

So, using references or hash conditions to specify conditions for included relations can cause very long queries and unnecessary outer joins, which may affect performance and memory adversely. Instead, splitting a large single query into a few queries would be more beneficial.

The Active Record documentation clearly states that when you need to query associations, you should use join queries with includes instead of references.

Even though Active Record lets you specify conditions on the eager loaded associations, just like joins, the recommended way is to use joins instead.

A better way to specify conditions on eager loaded associations:

User.all.joins(:posts).where('posts.title = ? ', some_title).includes(:posts => [:comments])

This generates 1 + 1 + 1 queries and loads only the users having posts matching the given conditions, such as a specific title, etc.

Conclusion

Eager loading associations can be very useful and improve performance to a great extent, but can also cause serious harm when many nested associations are loaded. I for one was a bit surprised that reducing trips to the database can actually make things worse.

This post at Engine Yard explains issues related to eager loading associations very well. For example, do not forget to add pagination and limit to records being fetched.

I hope you enjoyed this quick post on N + 1 queries.

Frequently Asked Questions (FAQs) about N+1 Query Problem and Its Solutions

What exactly is the N+1 query problem in Rails?

The N+1 query problem is a common issue in Rails that occurs when the code needs to load the data of associated records from the database. For each record, an additional query is executed, leading to a total of N+1 queries, where N is the number of associated records. This can significantly slow down the performance of your application, especially when dealing with large datasets.

How can I identify an N+1 query problem in my Rails application?

Identifying an N+1 query problem can be done by observing your application’s log file. If you notice that your application is making a separate database query for each associated record, then you likely have an N+1 query problem. Tools like the Bullet gem can also help identify these issues by alerting you when your code is making unnecessary queries.

What is eager loading and how can it help solve the N+1 query problem?

Eager loading is a technique in Rails that allows you to load all the associated records of a model in a single database query. This can significantly reduce the number of queries your application needs to make, thereby solving the N+1 query problem. Eager loading can be implemented using the ‘includes’, ‘joins’, or ‘preload’ methods in Rails.

What is the difference between ‘includes’, ‘joins’, and ‘preload’ in Rails?

These are all methods used to implement eager loading in Rails, but they work in slightly different ways. ‘Includes’ loads the associated records in one or two queries, depending on the need. ‘Joins’ combines the data from two tables into one, but only loads the data from the main table. ‘Preload’ always performs two queries, one for the main table and one for the associated table.

Can the N+1 query problem occur in other programming languages or frameworks?

Yes, the N+1 query problem is not exclusive to Rails. It can occur in any programming language or framework that interacts with a database. The solution will depend on the specific language or framework, but the general principle of reducing the number of queries by loading associated data in advance remains the same.

How can I prevent N+1 query problems from occurring in my Rails application?

The best way to prevent N+1 query problems is to always be mindful of how your code interacts with the database. Make use of eager loading techniques whenever you need to load associated records. Tools like the Bullet gem can also help by alerting you when your code is making unnecessary queries.

Are there any drawbacks to using eager loading in Rails?

While eager loading can significantly improve the performance of your application by reducing the number of queries, it can also lead to increased memory usage as all the associated records are loaded at once. Therefore, it’s important to use eager loading judiciously and only when necessary.

What is the impact of the N+1 query problem on the performance of a Rails application?

The N+1 query problem can significantly slow down the performance of a Rails application. Each additional database query takes time, and when dealing with large datasets, these extra queries can add up quickly. This can lead to slow page load times and a poor user experience.

Can the N+1 query problem affect the scalability of my Rails application?

Yes, the N+1 query problem can affect the scalability of your Rails application. As your dataset grows, the number of queries required to load associated records can increase exponentially, leading to slower performance. By addressing the N+1 query problem, you can ensure that your application scales effectively as your data grows.

Are there any tools or gems that can help me solve the N+1 query problem in Rails?

Yes, there are several tools and gems available that can help you identify and solve the N+1 query problem in Rails. The Bullet gem, for example, can alert you when your code is making unnecessary queries. The Goldiloader gem can automatically optimize your code to use eager loading where appropriate.

Parth ModiParth Modi
View Author

Loves to work in Ruby on Rails, and currently works as Software Engineer at SoluteLabs. Shares his journey of learning at his blog parthtalks and at medium.

ActiveRecordGlennGRuby on Railssql
Share this article
Read Next
Get the freshest news and resources for developers, designers and digital creators in your inbox each week