Post

Conquering the N+1 Query Problem in Rails: A Performance Deep Dive

If you’ve been developing Rails applications for any length of time, you’ve likely encountered the infamous N+1 query problem—perhaps without even realizing it. This performance bottleneck can silently slow your application to a crawl as your data grows. In this article, we’ll dive deep into understanding, identifying, and solving the N+1 problem with practical, real-world examples and benchmarks.

What Is the N+1 Query Problem?

The N+1 query problem is a database performance anti-pattern where your application executes one query to retrieve a collection of records (the “1”), followed by N additional queries (one for each record in the collection) to retrieve related data. This approach can significantly degrade performance, especially as your dataset grows.

Let’s illustrate this with a simple example:

1
2
3
4
5
# This innocent-looking code hides a serious performance issue
posts = Post.all
posts.each do |post|
  puts post.user.name  # Each access to post.user triggers a separate database query
end

When executed, this code generates SQL that looks something like:

1
2
3
4
SELECT * FROM posts;                           -- The "1" query
SELECT * FROM users WHERE id = 1 LIMIT 1;      -- First of the "N" queries
SELECT * FROM users WHERE id = 2 LIMIT 1;      -- Second of the "N" queries
SELECT * FROM users WHERE id = 3 LIMIT 1;      -- And so on...

Benchmarking the Impact

Let’s first understand the magnitude of the problem using Ruby’s Benchmark module. We’ll compare the performance of code with and without the N+1 problem:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
require 'benchmark'

# Setup test data (in a real application you'd have this data already)
10.times do |i|
  user = User.create!(name: "User #{i}", email: "user#{i}@example.com")
  5.times do |j|
    Post.create!(title: "Post #{j} by User #{i}", content: "Content...", user: user)
  end
end

puts "Benchmarking with N+1 problem:"
time_with_n_plus_one = Benchmark.measure do
  posts = Post.all
  posts.each do |post|
    puts "#{post.title} by #{post.user.name}"
  end
end

puts "Benchmarking with eager loading (solution to N+1):"
time_with_eager_loading = Benchmark.measure do
  posts = Post.includes(:user).all
  posts.each do |post|
    puts "#{post.title} by #{post.user.name}"
  end
end

puts "Time with N+1 problem: #{time_with_n_plus_one.real} seconds"
puts "Time with eager loading: #{time_with_eager_loading.real} seconds"
puts "Performance improvement: #{(time_with_n_plus_one.real / time_with_eager_loading.real).round(2)}x faster"

For a modest dataset with just 50 posts across 10 users, you might see results like:

1
2
3
Time with N+1 problem: 0.2812 seconds
Time with eager loading: 0.0431 seconds
Performance improvement: 6.52x faster

As your dataset grows, this performance gap widens dramatically.

Spotting N+1 Problems in Your Rails App

Telltale Signs in Your Logs

The most straightforward way to identify N+1 issues is by reviewing your development or production logs. Look for patterns of repeated, similar queries occurring in succession:

1
2
3
4
Post Load (0.5ms)  SELECT "posts".* FROM "posts"
User Load (0.3ms)  SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2  [["id", 1], ["LIMIT", 1]]
User Load (0.2ms)  SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2  [["id", 2], ["LIMIT", 1]]
User Load (0.2ms)  SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2  [["id", 3], ["LIMIT", 1]]

This pattern—one query followed by many similar queries with different parameters—is the classic signature of an N+1 problem.

Using the Bullet Gem

The Bullet gem is a fantastic tool for automatically detecting N+1 queries. Here’s how to set it up:

1
2
3
4
5
6
7
8
9
10
11
# Gemfile
gem 'bullet', group: [:development, :test]

# config/environments/development.rb
config.after_initialize do
  Bullet.enable = true
  Bullet.alert = true  # JavaScript alerts in the browser
  Bullet.console = true  # Logs to browser console
  Bullet.rails_logger = true  # Logs to Rails logger
  Bullet.add_footer = true  # Adds details to HTML footer
end

Bullet will now notify you whenever it detects an N+1 query in your application, suggesting exactly where to add eager loading.

Common Solutions to the N+1 Problem

1. Eager Loading with includes

The most common solution is to use Rails’ includes method, which tells Rails to load the associated records in as few queries as possible:

1
2
3
4
5
6
7
8
9
10
11
# Instead of:
posts = Post.all

# Use:
posts = Post.includes(:user)

# For multiple associations:
posts = Post.includes(:user, :comments, :tags)

# For nested associations:
posts = Post.includes(user: :profile, comments: [:user, :likes])

With includes, Rails will typically execute just two queries regardless of how many posts you have:

1
2
SELECT * FROM posts;
SELECT * FROM users WHERE id IN (1, 2, 3, ...);

2. Using preload for Specific Loading Strategies

Sometimes you need finer control over how associations are loaded. Rails provides preload:

1
2
# This forces separate queries
posts = Post.preload(:user)

This will always use separate queries for each association (never a JOIN), which can be beneficial when retrieving large result sets.

3. Using eager_load for JOIN-based Loading

When you need to filter based on associated records, eager_load is your friend:

1
2
# This will use a LEFT OUTER JOIN
posts = Post.eager_load(:user).where(users: { role: 'admin' })

This generates a query with a JOIN, allowing you to filter the primary collection based on conditions on the associated records.

4. Using joins for More Complex Filtering

For more complex filtering without loading associated records:

1
2
# Find all posts written by users with a specific email domain
posts = Post.joins(:user).where("users.email LIKE ?", "%@example.com")

This uses a JOIN but doesn’t load the associated records into memory—useful when you need to filter but don’t need the associated data.

5. Batching with find_each and in_batches

For processing large collections efficiently:

1
2
3
4
5
# Process records in batches of 1000
Post.find_each(batch_size: 1000) do |post|
  # This automatically includes batch finding to reduce memory consumption
  process_post(post)
end

Advanced Techniques

1. Optimizing with select

Sometimes you don’t need all attributes of your records:

1
2
3
4
5
# Only select the fields you need
posts = Post.select(:id, :title).includes(:user)

# You can also select specific fields from associations
posts = Post.includes(:user).references(:user).select('posts.*, users.name as author_name')

2. Counter Cache Columns

For situations where you frequently count associations, use counter caches:

1
2
3
4
5
6
7
8
9
10
11
12
13
# In your migration
add_column :users, :posts_count, :integer, default: 0

# In your Post model
class Post < ApplicationRecord
  belongs_to :user, counter_cache: true
end

# Now instead of:
user.posts.count  # Executes a COUNT query

# You can use:
user.posts_count  # Uses the cached value

3. Custom Benchmarking Class for N+1 Detection

Create a custom class to help identify potential N+1 problems in your codebase:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
class QueryCounter
  attr_reader :count
  
  def initialize
    @count = 0
  end
  
  def self.track
    counter = new
    subscription = ActiveSupport::Notifications.subscribe('sql.active_record') do |*args|
      event = ActiveSupport::Notifications::Event.new(*args)
      counter.count += 1 unless event.payload[:name] == 'SCHEMA' || event.payload[:sql].include?('BEGIN') || event.payload[:sql].include?('COMMIT')
    end
    
    yield
    
    ActiveSupport::Notifications.unsubscribe(subscription)
    counter
  end
end

# Usage
counter = QueryCounter.track do
  # Code that might have N+1 issues
  Post.all.each { |post| puts post.user.name }
end

puts "Executed #{counter.count} queries."

Real-World Example: Beyond Simple Associations

Let’s tackle a more complex example involving multiple levels of associations:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
class Blog < ApplicationRecord
  has_many :posts
end

class Post < ApplicationRecord
  belongs_to :blog
  belongs_to :user
  has_many :comments
  has_many :tags, through: :taggings
end

class User < ApplicationRecord
  has_many :posts
  has_one :profile
end

class Comment < ApplicationRecord
  belongs_to :post
  belongs_to :user
end

class Tag < ApplicationRecord
  has_many :taggings
  has_many :posts, through: :taggings
end

class Tagging < ApplicationRecord
  belongs_to :post
  belongs_to :tag
end

class Profile < ApplicationRecord
  belongs_to :user
end

Now, imagine we want to display blogs with their posts, each post’s author details, and the post’s comments and tags:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
# Inefficient approach with N+1 problems:
blogs = Blog.all
blogs.each do |blog|
  puts "Blog: #{blog.title}"
  
  blog.posts.each do |post|
    puts "  Post: #{post.title} by #{post.user.name} (#{post.user.profile.bio})"
    
    post.comments.each do |comment|
      puts "    Comment by: #{comment.user.name}"
    end
    
    post.tags.each do |tag|
      puts "    Tagged with: #{tag.name}"
    end
  end
end

This seemingly innocent code could generate hundreds of queries! Let’s fix it:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
# Efficient approach with proper eager loading:
blogs = Blog.includes(
  posts: [
    {user: :profile},
    {comments: :user},
    :tags
  ]
)

# Same output loop, but now with drastically fewer queries
blogs.each do |blog|
  puts "Blog: #{blog.title}"
  
  blog.posts.each do |post|
    puts "  Post: #{post.title} by #{post.user.name} (#{post.user.profile.bio})"
    
    post.comments.each do |comment|
      puts "    Comment by: #{comment.user.name}"
    end
    
    post.tags.each do |tag|
      puts "    Tagged with: #{tag.name}"
    end
  end
end

Let’s benchmark this complex scenario:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
require 'benchmark'

puts "Benchmarking complex N+1 scenario:"
time_with_n_plus_one = Benchmark.measure do
  blogs = Blog.all
  # ... (inefficient loop from above)
end

puts "Benchmarking with comprehensive eager loading:"
time_with_eager_loading = Benchmark.measure do
  blogs = Blog.includes(posts: [{user: :profile}, {comments: :user}, :tags])
  # ... (same loop)
end

puts "Time with N+1 problem: #{time_with_n_plus_one.real} seconds"
puts "Time with eager loading: #{time_with_eager_loading.real} seconds"
puts "Performance improvement: #{(time_with_n_plus_one.real / time_with_eager_loading.real).round(2)}x faster"

With a moderate dataset, you might see a performance improvement of 20x or more!

Caveats and Considerations

While eager loading is powerful, it’s not always the right solution:

  1. Memory usage: Eager loading loads all associated records into memory. For very large datasets, this can consume significant RAM.

  2. Unused data: If you’re not actually using all the eager loaded associations in your code, you’re wasting resources.

  3. JOINs complexity: Complex eager loading with many nested associations can result in inefficient JOINs. In such cases, multiple targeted queries might be faster.

  4. Database-specific optimization: Different databases have different query optimization capabilities. PostgreSQL might handle certain complex JOINs better than MySQL, for example.

Conclusion

The N+1 query problem is one of the most common performance issues in Rails applications, but also one of the most solvable. By understanding the problem, learning to identify it in your own code, and applying the right solutions, you can dramatically improve your application’s performance.

Remember to:

  1. Use Rails’ includes, preload, and eager_load methods appropriately
  2. Monitor your application logs for signs of N+1 queries
  3. Use tools like the Bullet gem for automated detection
  4. Benchmark your improvements to ensure they’re having the desired effect
  5. Consider both performance and memory usage when optimizing

By keeping these practices in mind, you’ll be well on your way to faster, more efficient Rails applications that can handle larger datasets with ease.

Additional Resources

This post is licensed under CC BY 4.0 by the author.