Manually preloading associations in Rails using custom scopes/SQL

In rare cases, you may need to manually preload associations in Rails for performance reasons. For example, you might have a situation where you can fetch multiple associations from the database in one query, or you need to do something that can’t be accomplished with Rails’ built-in support for eager loading.

First, let’s look at some situations where you might not know that it can be done with Rails’ built-in functionality.

Preloading / eager loading only one record from an association

For example you might have a Person that has_many photos, but when listing these people you only want to display the first photo. This is the naive and wrong way:


class Person < ActiveRecord::Base
has_many :photos
has_one :display_photo, class_name: "Photo"
end
# WARNING: This only appears to work properly!
people = Person.all.includes(:display_photo)
people.each { |person| puts person.display_photo.id }

view raw

gistfile1.rb

hosted with ❤ by GitHub

It will appear as if it’s working, but the problem is that Rails will still preload all photos for each Person, not only the first one. You might think that a limit condition on the association would help, but it cannot, since Rails uses a query like the following for preloading:

SELECT “photos”.* FROM “photos” WHERE “photos”.”person_idIN (1,2,3,4,5)

So, if you would stick a LIMIT at the end of that, it would not be limiting on a per-person basis. Here is what we can do to preload only the first photo:


class Person < ActiveRecord::Base
has_many :photos
has_one :display_photo, -> {
self.select_values = ["DISTINCT ON(photos.person_id) photos.*"]
order('photos.person_id')
}, class_name: "Photo"
end
# Polymorphic association example
class Person < ActiveRecord::Base
has_many :photos, as: :photo_parent
has_one :display_photo, -> {
fields = "photos.photo_parent_id, photos.photo_parent_type"
self.select_values = ["DISTINCT ON(#{fields}) photos.*"]
order(fields)
}, as: :photo_parent, class_name: "Photo"
end
people = Person.all.includes(:display_photo)
people.each { |person| puts person.display_photo.id }

view raw

gistfile1.rb

hosted with ❤ by GitHub

We use PostgreSQL’s DISTINCT ON to only select one photo for each person_id. Because of this we also need to add person_id as the first column to order by (this is required by postgres, and in Rails 4 orders are prepended to existing ones so this will be the first one). You can order descending if you need it to be the last photo. If you need to use more specific conditions, you can use additional scopes, however note that Rails will ignore everything except for: where, select, references, and includes.

We need to set select_values directly because DISTINCT ON must appear before specifying which columns to select.

Also, if you want to use a scope that you have defined on Image, you might have to do a little hack to be able to use that scope:


class Photo < ActiveRecord::Base
scope :enabled, -> { where(photo: { enabled: true }) }
end
class Person < ActiveRecord::Base
has_one :display_photo, -> {
self.select_values = ["DISTINCT ON(photos.person_id) photos.*"]
instance_exec(&Photo.method(:enabled)).order('photos.person_id')
}, class_name: "Photo"
end

view raw

gistfile1.rb

hosted with ❤ by GitHub

Note that the same limitations apply on the types of scopes that you can use (so no group scope for example).

Manually preloading

You can look into code in active_record/associations/preloader/ to see more on how Rails does preloading, but basically, you need to do this:


# collection association e.g. has_many
owners = People.all
association_name = :photos
owners.each do |owner|
records = Array(whatever_you_want)
association = owner.association(association_name)
association.loaded!
association.target.concat(records)
records.each { |record| association.set_inverse_instance(record) }
end
# singular association e.g. has_one
owners = People.all
association_name = :photos
owners.each do |owner|
record = whatever_you_want
association = owner.association(association_name)
association.target = record
association.set_inverse_instance(record)
end

view raw

gistfile1.rb

hosted with ❤ by GitHub

Gem

Some time later after writing this post, I came across a project where we needed to do a lot of pretty complex record preloading. I first created some preloading classes that basically added includes to the scope and later loaded stuff based on results. However soon there was some code duplication and it was also getting hard to understand because the preloaders were calling each other. There was a lot of boilerplate code so I needed some kind of abstraction to clean it up.

So I wrote a gem specifically for handling more complex preloading logic called datasource and another one for preloading specifically when using active_model_serializers called active_loaders (this one has documentation). These two gems are only intended for people who have a good understanding of SQL, especially the N+1 problem and how to efficiently fetch data in a single query. If you are a beginner the gems will not be of much help, even though there is some auto-magic.

Appendix

For my reference, here is another more complex way to do manual preloading by creating a custom preloader and calling it:


has_one :display_gallery_item, as: :imageable, class_name: "GalleryItem"
class ServiceCenterPreloader
class GalleryItemPreloader < ActiveRecord::Associations::Preloader::HasOne
def initialize(owners, reflection)
super(GalleryItem, owners, reflection, ActiveRecord::Relation.new(nil, nil))
end
private
def build_scope
t = klass.table_name
distinct = "#{t}.imageable_id, #{t}.imageable_type"
scope = super
scope.select_values = ["DISTINCT ON(#{distinct}) #{t}.*"]
scope.order(distinct)
end
end
def self.preload(service_centers)
records = Array.wrap(service_centers.to_a).compact.uniq
reflection = ServiceCenter.reflections[:display_gallery_item]
GalleryItemPreloader.new(records, reflection).run
end
end
service_centers = ServiceCenter.all.to_a
ServiceCenterPreloader.preload(service_centers)

view raw

gistfile1.rb

hosted with ❤ by GitHub

9 thoughts on “Manually preloading associations in Rails using custom scopes/SQL

  1. Mmmmmm maybe this is not a real problem you are discussing, but imho there is no way ever to determine a display_photo uniquely. I would write this differently using a belongs_to relation, adding the display_photo_id to the Person model.

    • You can determine it uniquely, you just need to add orders to get what you need (i.e. you might want to order by photos created_at). DISTINCT ON will take the first row according to how you ordered them.

Leave a reply to Nathan Cancel reply