Master Many-to-Many Associations with ActiveRecord

By Fred Heath

Modeling many-to-many relationships between data entities in the ActiveRecord world isn’t always a straightforward task. Even if we have a well-defined ER diagram to work with, it’s not always clear which ActiveRecord associations we should be using and what the implications of our decision will be. There are two types of many-to-many relationships: transitive and intransitive. In mathematics,

A binary relation R is transitive if, whenever an element A is related to an element B and B is in turn related to an element C, A is also related to C.
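To make the mathematical definition concrete, here is a minimal plain-Ruby sketch that checks whether a relation, given as a list of pairs, is transitive. The helper name transitive? and the sample relations are assumptions made for illustration only; they are not part of ActiveRecord.

```ruby
require "set"

# Hypothetical helper: true when, for every (a, b) and (b, c) in the
# relation, the pair (a, c) is present as well.
def transitive?(pairs)
  relation = pairs.to_set
  relation.all? do |a, b|
    relation.all? { |b2, c| b != b2 || relation.include?([a, c]) }
  end
end

# Illustrative relations, not taken from the article.
ANCESTOR_OF = [[:grandparent, :parent], [:parent, :child], [:grandparent, :child]]
PARENT_OF   = [[:grandparent, :parent], [:parent, :child]]

puts transitive?(ANCESTOR_OF) # prints true
puts transitive?(PARENT_OF)   # prints false
```

The second relation fails the check because the pair (grandparent, child) is missing, even though both intermediate pairs exist.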

To put this into a data modeling context, a relationship between two entities is transitive if it can be best expressed through the introduction of one or more other entities. So, for instance, it’s easy to see that a Buyer buys from many Sellers while a Seller sells to many Buyers. However, the relationship is not fully expressed until we start adding entities such as Product, Payment, Marketplace and so on. Such relationships can be called transitive many-to-many as we rely on the presence of other entities to fully capture the semantics of the relationship. Luckily, ActiveRecord allows us to model such relationships with ease. Let’s start by looking at the simplest ActiveRecord many-to-many associations and work our way up.

Intransitive Associations

This is the simplest many-to-many association. Two models are associated by simple virtue of their existence. A Book can be written by many authors and an Author may write many books. It is a direct association and there is a direct dependency between the two models. We can’t really have one without the other. In ActiveRecord this can easily be modeled with the has_and_belongs_to_many (HABTM) association. We can create the models and migrations for this relationship in Rails by running the following commands:

rails g model Author name:string
rails g model Book title:string
rails g migration CreateJoinTableAuthorsBooks authors books
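The last command generates a migration that calls create_join_table :authors, :books, which creates a table named authors_books. By default, a has_and_belongs_to_many association looks for a join table whose name combines the two table names in lexical order. The sketch below approximates that convention in plain Ruby; default_join_table_name is a hypothetical helper, not a Rails method, and it ignores the edge cases Rails handles for names sharing a common prefix.

```ruby
# Hypothetical helper (not part of Rails): approximates the default
# join table name that has_and_belongs_to_many expects.
def default_join_table_name(table_a, table_b)
  [table_a, table_b].map(&:to_s).sort.join("_")
end

puts default_join_table_name(:books, :authors) # prints authors_books
puts default_join_table_name(:authors, :books) # prints authors_books
```

Whichever side we start from, the name is the same, which is why both models can declare the association without naming the table.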

We need to define the HABTM association in our models like this:

class Book < ApplicationRecord
  has_and_belongs_to_many :authors
end
class Author < ApplicationRecord
  has_and_belongs_to_many :books
end

Then, we can create our database tables by running:

rails db:migrate

Finally, we can populate our database:

herman = Author.create name: 'Herman Melville'
moby = Book.create title: 'Moby Dick'
herman.books << moby

We can now, among other things, access a book’s authors, all books written by an author, and an author’s books filtered by title:

moby.authors
herman.books
herman.books.where(title: 'Moby Dick')

Nice and simple.
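Behind these calls sits the authors_books join table, which holds nothing but pairs of foreign keys. The following plain-Ruby sketch imitates that lookup with hashes and arrays; it is not how ActiveRecord is implemented. The ids, the second author and the second book are assumptions added for illustration.

```ruby
# Plain-Ruby stand-ins for the three tables. The ids and the second
# author/book are made up for illustration.
AUTHORS = { 1 => "Herman Melville", 2 => "Nathaniel Hawthorne" }
BOOKS   = { 10 => "Moby Dick", 11 => "The Scarlet Letter" }

# The join table carries only the two foreign keys, no extra attributes.
AUTHORS_BOOKS = [
  { author_id: 1, book_id: 10 },
  { author_id: 2, book_id: 11 }
]

# Roughly what moby.authors resolves to: follow join rows from book to authors.
def authors_of(book_id)
  AUTHORS_BOOKS.select { |row| row[:book_id] == book_id }
               .map { |row| AUTHORS.fetch(row[:author_id]) }
end

# Roughly what herman.books resolves to: the same walk in the other direction.
def books_of(author_id)
  AUTHORS_BOOKS.select { |row| row[:author_id] == author_id }
               .map { |row| BOOKS.fetch(row[:book_id]) }
end

p authors_of(10) # prints ["Herman Melville"]
p books_of(1)    # prints ["Moby Dick"]
```

Because the join rows have no attributes of their own, there is nowhere to record anything about the relationship itself, which is the limitation the transitive associations address.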

Mono-transitive Associations

A mono-transitive association is a transitive association that can be best described with the addition of a single extra model. Take the example of a Student. A Student can be taught by many Tutors and a Tutor can teach many Students, but we can’t fully express the relationship unless we include another entity: Class (to avoid Ruby reserved-name conflicts, let’s name it Klass):

rails g model Student name:string
rails g model Tutor name:string
rails g model Klass subject:string student:references tutor:references

We can say that a Student is taught through attending Klasses and that a Tutor teaches Students through giving Klasses. The word through is important here, as we use the same term in ActiveRecord to define the association:

class Student < ApplicationRecord
  has_many :klasses
  has_many :tutors, through: :klasses
end
class Tutor < ApplicationRecord
  has_many :klasses
  has_many :students, through: :klasses
end
class Klass < ApplicationRecord
  belongs_to :student
  belongs_to :tutor
end
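What through: :klasses asks for can be pictured as a two-step walk: first collect the Klass rows belonging to one side, then read the other side off each row. Here is a plain-Ruby sketch of that walk, under the assumption that each Klass row pairs one student with one tutor as in the generator command above. KlassRow, tutors_of and students_of are hypothetical names, and the Music row is invented for illustration.

```ruby
# Hypothetical stand-in for rows of the klasses table.
KlassRow = Struct.new(:subject, :student, :tutor)

KLASS_ROWS = [
  KlassRow.new("Maths", "Bart Simpson", "Mrs Krabapple"),
  KlassRow.new("Music", "Bart Simpson", "Mr Largo") # extra row, for illustration only
]

# Like Student#tutors: the tutor of every klass the student attends.
def tutors_of(student)
  KLASS_ROWS.select { |klass| klass.student == student }.map(&:tutor)
end

# Like Tutor#students: the same walk in the opposite direction.
def students_of(tutor)
  KLASS_ROWS.select { |klass| klass.tutor == tutor }.map(&:student)
end

p tutors_of("Bart Simpson")    # prints ["Mrs Krabapple", "Mr Largo"]
p students_of("Mrs Krabapple") # prints ["Bart Simpson"]
```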

Now we can create our database tables by running:

rails db:migrate

We can then populate the database:

bart = Student.create name: 'Bart Simpson'
edna = Tutor.create name: 'Mrs Krabapple'
Klass.create subject: 'Maths', student: bart, tutor: edna

As well as the usual simple finds we can also create some more complex queries:

Student.find_by(name: 'Bart Simpson').tutors  # find all Bart's tutors
Student.joins(:klasses).where(klasses: {subject: 'Maths'}).distinct.pluck(:name) # get all students who attend the Maths class
Student.joins(:tutors).joins(:klasses).where(klasses: {subject: 'Maths'}, tutors: {name: 'Mrs Krabapple'}).distinct.each {|student| puts student.name} # print all students who attend Maths taught by Mrs Krabapple
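The .distinct call in these queries matters because a join returns one result per matching Klass row, so a student with several matching rows would otherwise be listed several times. A small plain-Ruby sketch of the effect follows; Enrolment and the two helper methods are hypothetical names, and the second tutor and second student are invented for illustration.

```ruby
# Hypothetical stand-in for rows produced by joining students to klasses.
Enrolment = Struct.new(:subject, :student, :tutor)

ENROLMENTS = [
  Enrolment.new("Maths", "Bart Simpson", "Mrs Krabapple"),
  Enrolment.new("Maths", "Bart Simpson", "Mr Bergstrom"), # hypothetical second Maths tutor
  Enrolment.new("Maths", "Lisa Simpson", "Mrs Krabapple") # hypothetical second student
]

# One result per matching join row, like the query before .distinct is applied.
def maths_students_raw
  ENROLMENTS.select { |enrolment| enrolment.subject == "Maths" }.map(&:student)
end

# Duplicates collapsed, like the query with .distinct.
def maths_students
  maths_students_raw.uniq
end

p maths_students_raw # prints ["Bart Simpson", "Bart Simpson", "Lisa Simpson"]
p maths_students     # prints ["Bart Simpson", "Lisa Simpson"]
```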

As in most mono-transitive associations, the existing model names reflect the association implicitly (i.e. X has_many Z through Y), so there is no need for us to do anything more: ActiveRecord will model our association as declared.

Multi-transitive Associations

A multi-transitive association is one that can be expressed best through many other models. We as Developers, for example, are associated with many software Communities. Our association, though, takes many forms: we may contribute code, post at forums, attend events, and many others. Each Developer is associated with a Community in their own way through specific actions. Let’s pick three of these actions for our example:

  • Contributing code
  • Posting on forums
  • Attending events

The next step in our modeling process is to define the data entities (models) which help realize these actions (associations). For our example, we can safely come up with:

Association          Through model
contributing code    Repository
posting at forums    Forum
attending events     Event

Now let’s go ahead and create the models we need:

rails g model Community name:string
rails g model Developer name:string
rails g model Repo url:string comment:string developer:references community:references
rails g model Forum url:string post:text developer:references community:references
rails g model Event location:string name:string developer:references community:references
rails db:migrate

Let’s also create some Developers and Communities:

devs = %w(joe sue fred mary).map {|dev| Developer.create name: dev}
comms = %w(rails nosql javascript postgres).map {|comm| Community.create name: comm}

We then can define the associations between our models. At this point we may be tempted to use the same technique we used in the mono-transitive example and repeat the has_many..through invocation for each association:

class Developer < ApplicationRecord
  has_many :events
  has_many :forums
  has_many :repos
  has_many :appearances, through: :events  #FAIL
  has_many :postings, through: :forums #FAIL
  has_many :contributions, through: :repos #FAIL
end

However, this won’t work: ActiveRecord tries to infer the name of the source association on the join model from the association name (e.g. appearance or appearances on Event), finds no match, and fails. For this reason we need to name the source association explicitly using the :source option:

class Developer < ApplicationRecord
  has_many :events
  has_many :forums
  has_many :repos
  has_many :appearances, through: :events, source: :community
  has_many :postings, through: :forums, source: :community
  has_many :contributions, through: :repos, source: :community
end
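The inference step that :source overrides can be pictured with a short plain-Ruby sketch. It is a simplified model, not ActiveRecord's code: resolve_source is a hypothetical helper, and dropping a trailing "s" stands in for Rails' real singularization rules.

```ruby
# Associations declared on the Event join model in this article.
EVENT_ASSOCIATIONS = [:developer, :community]

# Hypothetical helper: a simplified picture of how has_many :through picks
# the association to follow on the join model. Without :source it tries the
# singular and plural forms of the association name.
def resolve_source(name, join_model_associations, source: nil)
  singular = name.to_s.chomp("s").to_sym # naive singularization, an assumption
  candidates = source ? [source] : [singular, name]
  candidates.find { |candidate| join_model_associations.include?(candidate) }
end

p resolve_source(:appearances, EVENT_ASSOCIATIONS)                     # prints nil
p resolve_source(:appearances, EVENT_ASSOCIATIONS, source: :community) # prints :community
p resolve_source(:tutors, [:student, :tutor])                          # prints :tutor
```

The last call shows why the Student and Tutor example needed no :source: the association name already matches an association on the join model.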

Similarly, we do the same for Communities:

class Community < ApplicationRecord
  has_many :events
  has_many :forums
  has_many :repos
  has_many :hostings, through: :events, source: :developer
  has_many :discussions, through: :forums, source: :developer
  has_many :contributions, through: :repos, source: :developer
end

As you may have noticed, on the Community model we are changing the names of some associations to reflect their nature from this side of the relationship. For instance, a Developer makes appearances at events, while a Community hosts events. A Developer posts at forums, while a Community fosters discussions at forums. This way, we are ensuring that our method names (that AR will dynamically create based on our associations) are meaningful and clear.

We can now create some events, forums, and repos:

Repo.create url: 'www.gitlab.com/342', comment: 'ruby code', developer: devs[0], community: comms[0]
Repo.create url: 'www.gitlab.com/662', comment: 'callbacks sample', developer: devs[0], community: comms[2]
Repo.create url: 'www.jsfiddle.com/abcg3', comment: 'reactive sample', developer: devs[1], community: comms[3]
Repo.create url: 'www.jsfiddle.com/563', comment: 'promises sample', developer: devs[2], community: comms[3]
Forum.create url: 'www.stackoverflow.com/mongodb', post: 'this is what I think...', developer: devs[2], community: comms[1]
Forum.create url: 'www.redis.com/563', post: 'my opinion is...', developer: devs[3], community: comms[1]
Event.create location: 'Bath, UK', name: 'Bath Ruby', developer: devs[2], community: comms[0]
Event.create location: 'Tech Institute', name: 'London NoSQL Meetup', developer: devs[2], community: comms[1]

We can then start extracting useful information from our models:

Developer.find_by(name: 'fred').appearances # communities whose events a developer has appeared at
Event.where(community: comms[0]) # all events for the Rails community
Forum.where(developer: Developer.find_by(name: 'fred')) # all forums where a specific developer has posted
Community.find_by(name: 'rails').hostings + Community.find_by(name: 'rails').discussions + Community.find_by(name: 'rails').contributions # all developers linked to a specific community through its events, forums and repositories
Developer.select('distinct developers.name').joins(:repos).joins(:events).joins(:forums) # find developers who have appeared in Events, contributed to Repos and chatted on Forums, for any Community

We can use the associations directly and/or join the through models in an endless variety of permutations to retrieve the data we need.
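To see what these associations return for the sample records created earlier, here is a plain-Ruby sketch that reduces each join model to [developer, community] pairs. The constants and helper methods are hypothetical names used only for this illustration; the rows mirror the Repo, Forum and Event records from the article.

```ruby
# Sample rows from the article, reduced to [developer, community] pairs.
REPO_ROWS  = [%w[joe rails], %w[joe javascript], %w[sue postgres], %w[fred postgres]]
FORUM_ROWS = [%w[fred nosql], %w[mary nosql]]
EVENT_ROWS = [%w[fred rails], %w[fred nosql]]

# Like Developer#appearances: communities reached through one join model.
def communities_via(rows, developer)
  rows.select { |dev, _| dev == developer }.map { |_, community| community }
end

# Like Community#hostings, #discussions or #contributions:
# developers reached through one join model.
def developers_via(rows, community)
  rows.select { |_, comm| comm == community }.map { |dev, _| dev }
end

# Everyone linked to a community through any of the three join models,
# with duplicates removed (the + in the query above keeps duplicates).
def developers_linked_to(community)
  [EVENT_ROWS, FORUM_ROWS, REPO_ROWS].flat_map { |rows| developers_via(rows, community) }.uniq
end

# Developers present in all three join tables, like the triple joins query.
def active_everywhere
  [REPO_ROWS, EVENT_ROWS, FORUM_ROWS].map { |rows| rows.map(&:first).uniq }.reduce(:&)
end

p communities_via(EVENT_ROWS, "fred") # prints ["rails", "nosql"]
p developers_linked_to("rails")       # prints ["fred", "joe"]
p active_everywhere                   # prints ["fred"]
```

Note that appearances, hostings and the other through associations return records from the far side of the join (communities or developers), not the events, forums or repos themselves.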

The Gist

  • If you have a direct, many-to-many relationship between two models, where no further semantic clarification is needed to describe the relationship, use a has_and_belongs_to_many association.
  • If the many-to-many relationship is indirect or needs a single extra entity in order to be described fully and the relationship name can be captured by the extra model name, use a has_many :through association.
  • If the many-to-many relationship has nuances that require multiple other entities in order to describe it, then use a has_many :through :source association.

Modeling many-to-many relationships with ActiveRecord can be challenging to get right. Once you understand the nature of each association and the options ActiveRecord offers, the task becomes much easier.

More:
  • vickris

    Well explained. Bravo!!

    • Fred@Bootstrap

      thank you vickris!

  • Maude Bailly-Otis

    Really clear, thanks!

    • Fred@Bootstrap

      thank you Maude!

  • Josh Arnold

    It would also be great to show how to properly index these!

    • Fred@Bootstrap

      Hi Josh. Yes it would, but that would probably require a whole new article for such a nuanced subject as indexing. I wanted to keep this one focused on the association modelling side.

      Hold on, you just gave me an idea for the next article :)

  • Pito Salas

    I thought Rails experts have long discouraged the use of HABTM.

    • Fred@Bootstrap

      Hi Pito. Some Rails experts discourage HABTM associations on the basis of some devs seriously underestimating / misunderstanding their modelling needs, so they create HABTM associations that later turn out to need to be modified as a has_many :through, as the nature of the relationship becomes clearer. Part of the motivation for this article was to clarify when you should be using either of these associations.

  • Fred@Bootstrap

    Hi Tom. The :source option doesn’t affect the underlying tables, only the way AR creates SQL statements. For instance, when we run devs[0].appearances, if we haven’t specified the source, AR will try to join the Events table with Appearances, because Appearances is the table name it infers from the association name ‘appearances’. Of course no such table exists. By specifying source: :community, we’re telling AR to join Events with Communities in order to implement the association we have defined as ‘appearances’. So, without the source option, running devs[0].appearances will fail. With the source option it will create and execute:

    SELECT "communities".* FROM "communities" INNER JOIN "events" ON "communities"."id" = "events"."community_id" WHERE "events"."developer_id" = ? [["developer_id", 3]]

    Which is what we want.

    Hope that makes it clearer.

  • Strategiusz

    Those klasses are very strange: each can have only one student. The same goes for forums, events and repos. Not intuitive.

    • Fred@Bootstrap

      Hi Strategiusz. This is to just keep the examples here short and to the point. To add more students to a class we create a new Klass record, e.g. [Klass.create subject: 'Maths', student: chris, tutor: edna] and we can query how many students take a class by the subject name, e.g. [Klass.where(subject: 'Maths')]. In a real-world app we’d probably need an association table between Student and Klass, but doing that here would only double the length of this article and detract from the main focus, which is the transitive relationship between two domain classes.
      Hope this makes sense.

      • Strategiusz

        Makes sense for me. Thanks.

  • Aestimo Njuguna

    Hello Fred, Great tutorial – really enjoyed it! Quick questions:
    – Assuming I wanted to add several students to the same klass e.g. the Math klass, how would I go about it?
    – Let’s say I was building up an index view of all students who belong to the Math klass (under a particular tutor, say Mrs. Krabapple), how would this be done?

    Thanks, and good job!

  • Aestimo Njuguna

    Awesome! Thanks..is the kind of association table you describe in the “real world scenario” like a has_many through kind?

    • Fred@Bootstrap

      You wouldn’t even need to do that! Just create your Klass model without any references, i.e.
      rails g model Klass subject:string

      Then create the associative table:
      rails g model KlassStudentTutor student:references tutor:references klass:references

      run migrations:
      rails db:migrate

      Now, to create a new student for the Maths class under Mrs Krabapple:
      KlassStudentTutor.create student: Student.create(name: 'New Student'), klass: Klass.find_by(subject: 'Maths'), tutor: Tutor.find_by(name: 'Mrs Krabapple')

      HTH

      • Aestimo Njuguna

        Simple and sweet…really nice. Thanks. Heading over to your blog for some more good stuff!
