Rails: How to Select Records Which Don't Have a Specific Related (Associated) Object (SQL EXISTS Brief How-To)

Such questions are pretty common among beginner to mid-level Rails developers. You know the ActiveRecord interface and basic SQL operations, but you get stuck on tasks like the one in the title. (A couple of examples of such questions: 1, 2).

The answer is simple: use the SQL EXISTS condition. A quick reference from the given URL:

Syntax

The syntax for the SQL EXISTS condition is:

WHERE EXISTS ( subquery );

Parameters or Arguments

subquery

The subquery is a SELECT statement. If the subquery returns at least one record in its result set, the EXISTS clause will evaluate to true and the EXISTS condition will be met. If the subquery does not return any records, the EXISTS clause will evaluate to false and the EXISTS condition will not be met.

It is also mentioned that EXISTS might be slower than JOIN, but that is usually not true. From the Exists v. Join question on SO:

EXISTS is only used to test if a subquery returns results, and short circuits as soon as it does. JOIN is used to extend a result set by combining it with additional fields from another table to which there is a relation. [...] If you have proper indexes, most of the time the EXISTS will perform identically to the JOIN. The exception is on very complicated subqueries, where it is normally quicker to use EXISTS.

So, the database doesn't need to look through all the connections (with EXISTS it stops 'joining' records as soon as it finds a matching one), and it doesn't need to return all the fields from the joined table (it just checks that the corresponding row, well, does exist).
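The short-circuit behaviour can be modelled in plain Ruby. This is only an analogy, not a description of how a real query planner works: the `connections` array and the two counters are made up for illustration. `any?` plays the role of EXISTS and stops at the first match, while collecting the joined rows has to visit every connection.

```ruby
# Hypothetical in-memory stand-in for a `connections` table.
connections = [
  { user_id: 1, group_id: 4 },
  { user_id: 1, group_id: 5 },
  { user_id: 1, group_id: 6 },
  { user_id: 2, group_id: 1 }
]

# EXISTS-like check: Enumerable#any? returns as soon as the block is truthy.
exists_checks = 0
user_one_exists = connections.any? do |row|
  exists_checks += 1
  row[:user_id] == 1
end

# JOIN-like lookup: collect every matching row, visiting the whole table.
join_checks = 0
user_one_rows = connections.select do |row|
  join_checks += 1
  row[:user_id] == 1
end

puts "exists: #{user_one_exists} after #{exists_checks} check(s)"
puts "join:   #{user_one_rows.size} row(s) after #{join_checks} check(s)"
```

With this sample data the EXISTS-like check looks at one row, while the JOIN-like lookup looks at all four.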

Answering the specific questions:

Select only users who don't belong to a given set of groups (groups with ids [4,5,6])

not_four_to_six = User.where("NOT EXISTS (
SELECT 1 FROM connections
WHERE connections.user_id = users.id
AND connections.group_id IN (?)
)", [4,5,6])

Select only users who belong to one set of groups ([1,2,3]) and don't belong to another ([4,5,6])

one_two_three = not_four_to_six.where("EXISTS (
SELECT 1 FROM connections
WHERE connections.user_id = users.id
AND connections.group_id IN (?)
)", [1,2,3])

Select only users who don't belong to any group

User.where("NOT EXISTS (
SELECT 1 FROM connections
WHERE connections.user_id = users.id
)")
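The three queries above repeat almost the same SQL fragment, so it may be convenient to build it in one place. The sketch below is plain Ruby string building; `connection_exists_sql` and its keyword arguments are hypothetical names, and the table and column names are simply the ones used in the examples above.

```ruby
# Hypothetical helper: builds the (NOT) EXISTS fragment used in the queries above.
# Pass `with_groups: false` for the "belongs to no group at all" case.
def connection_exists_sql(negate: false, with_groups: true)
  sql = +"EXISTS (SELECT 1 FROM connections WHERE connections.user_id = users.id"
  sql << " AND connections.group_id IN (?)" if with_groups
  sql << ")"
  negate ? "NOT #{sql}" : sql
end

puts connection_exists_sql(negate: true)
puts connection_exists_sql(negate: true, with_groups: false)

# With a Rails app loaded, the fragments would be passed to `where` as before, e.g.:
#   User.where(connection_exists_sql(negate: true), [4, 5, 6])
#       .where(connection_exists_sql, [1, 2, 3])
```

The group ids still go through the `?` placeholder, so ActiveRecord handles the quoting exactly as in the hand-written versions.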

Want to find records with no associated records in Rails

This is still pretty close to SQL, but it should get everyone with no friends in the first case:

Person.where('id NOT IN (SELECT DISTINCT(person_id) FROM friends)')

Rails: Find the has_one record that doesn't have one

I'm not sure what the Ruby code would be, but I think the SQL should be something like:

SELECT * FROM Foo WHERE id NOT IN (SELECT foo_id FROM Bar)

Make a request with activerecord to get only the users from groups and not from others

The correct way to do such things is to use the SQL EXISTS condition. I wish there was a specific ActiveRecord helper method for that, but there isn't at the moment.

Well, using pure SQL is just fine:

User.where("EXISTS (SELECT 1 FROM groups_users WHERE groups_users.user_id = users.id AND groups_users.group_id IN (?))", [8939, 8950]).
where("NOT EXISTS (SELECT 1 FROM groups_users WHERE groups_users.user_id = users.id AND groups_users.group_id IN (?))", [8942])

Your original query asks to join only groups with ids [8939, 8950] and not to join groups with id [8942]. You can see that this doesn't make much sense: it's like asking to select every user whose name is bob and NOT charlie. The second condition doesn't add anything to the first one.

A join query multiplies rows, so if your user is in every group, the result set would be:

user_id | group_id
1 | 8939
1 | 8950
1 | 8942

Then you filter out the last row: 1 | 8942. Still, user 1 remains in the result set and is returned.

To ask the database to return only records which are not connected to another relation, you should explicitly use NOT EXISTS, which exists explicitly for that purpose :)
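The difference between filtering joined rows and asking a per-user EXISTS / NOT EXISTS question can be reproduced with plain arrays. This is a minimal sketch with made-up data: user 1 is in all three groups (as in the table above), and a hypothetical user 2 is only in the two wanted ones.

```ruby
# Hypothetical rows of the `groups_users` join table.
groups_users = [
  { user_id: 1, group_id: 8939 },
  { user_id: 1, group_id: 8950 },
  { user_id: 1, group_id: 8942 },
  { user_id: 2, group_id: 8939 },
  { user_id: 2, group_id: 8950 }
]
wanted   = [8939, 8950]
unwanted = [8942]

# JOIN + WHERE style: filter individual rows, then see which users remain.
filtered_rows = groups_users.select do |row|
  wanted.include?(row[:group_id]) && !unwanted.include?(row[:group_id])
end
join_result = filtered_rows.map { |row| row[:user_id] }.uniq

# EXISTS / NOT EXISTS style: ask one yes/no question per user.
user_ids = groups_users.map { |row| row[:user_id] }.uniq
exists_result = user_ids.select do |id|
  rows = groups_users.select { |row| row[:user_id] == id }
  rows.any? { |row| wanted.include?(row[:group_id]) } &&
    rows.none? { |row| unwanted.include?(row[:group_id]) }
end

puts "join + filter: #{join_result.inspect}"
puts "exists checks: #{exists_result.inspect}"
```

Row filtering only drops the 8942 row and keeps user 1, while the per-user check excludes user 1 entirely.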

Find all records which have a count of an association greater than zero

joins uses an inner join by default so using Project.joins(:vacancies) will in effect only return projects that have an associated vacancy.

UPDATE:

As pointed out by @mackskatz in the comments, without a group clause the code above will return duplicate projects for projects with more than one vacancy. To remove the duplicates, use

Project.joins(:vacancies).group('projects.id')

UPDATE:

As pointed out by @Tolsee, you can also use distinct.

Project.joins(:vacancies).distinct

As an example

[10] pry(main)> Comment.distinct.pluck :article_id
=> [43, 34, 45, 55, 17, 19, 1, 3, 4, 18, 44, 5, 13, 22, 16, 6, 53]
[11] pry(main)> _.size
=> 17
[12] pry(main)> Article.joins(:comments).size
=> 45
[13] pry(main)> Article.joins(:comments).distinct.size
=> 17
[14] pry(main)> Article.joins(:comments).distinct.to_sql
=> "SELECT DISTINCT \"articles\".* FROM \"articles\" INNER JOIN \"comments\" ON \"comments\".\"article_id\" = \"articles\".\"id\""
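Why the plain join returns more rows than there are articles can be shown with a small in-memory model. The data below is invented for illustration and much smaller than in the console session above; `flat_map` stands in for the inner join and `uniq` for DISTINCT.

```ruby
# Hypothetical stand-ins for the `articles` and `comments` tables.
articles = [{ id: 1 }, { id: 2 }, { id: 3 }]
comments = [
  { id: 10, article_id: 1 },
  { id: 11, article_id: 1 },
  { id: 12, article_id: 3 }
]

# INNER JOIN: one output row per matching (article, comment) pair,
# so article 1 appears twice and article 2 not at all.
joined = articles.flat_map do |article|
  comments.select { |comment| comment[:article_id] == article[:id] }
          .map { article }
end

# DISTINCT: collapse identical article rows.
distinct = joined.uniq

puts "joined:   #{joined.map { |a| a[:id] }.inspect}"
puts "distinct: #{distinct.map { |a| a[:id] }.inspect}"
```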

How to select subset of users based on many-to-many relationship?

I would do one of these:

  1. User.joins("LEFT JOIN relationships ON relationships.user_id = users.id").where('relationships.user_id IS NULL').offset(rand(0..100)).first

  2. Something like:

    member_ids = Relationship.where(member: true).pluck(:user_id).uniq
    users = User.where.not(id: member_ids) # or User.where('id NOT in (?)', member_ids) on Rails < 4
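Both options above are forms of an anti-join. A rough in-memory model, with invented `users` and `relationships` data, shows what each one computes: the first pads unmatched users with a NULL row and keeps only those, the second collects the ids to exclude and rejects them.

```ruby
# Hypothetical stand-ins for the `users` and `relationships` tables.
users = [{ id: 1 }, { id: 2 }, { id: 3 }]
relationships = [
  { user_id: 1, member: true },
  { user_id: 1, member: true },
  { user_id: 3, member: true }
]

# Option 1: LEFT JOIN ... WHERE relationships.user_id IS NULL
left_joined = users.flat_map do |user|
  matches = relationships.select { |rel| rel[:user_id] == user[:id] }
  matches = [nil] if matches.empty? # unmatched users keep one NULL-padded row
  matches.map { |rel| { user: user, relationship: rel } }
end
without_relationship = left_joined.select { |row| row[:relationship].nil? }
                                  .map { |row| row[:user][:id] }

# Option 2: pluck the ids first, then exclude them.
member_ids = relationships.select { |rel| rel[:member] }
                          .map { |rel| rel[:user_id] }
                          .uniq
non_members = users.reject { |user| member_ids.include?(user[:id]) }
                   .map { |user| user[:id] }

puts "left join / IS NULL: #{without_relationship.inspect}"
puts "pluck / where.not:   #{non_members.inspect}"
```

In this sample every relationship has `member: true`, so both options agree; with mixed values they would answer slightly different questions, since only the second one filters on `member`.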

Including an association if it exists in a rails query

Answering my own question twice... bit awkward but anyway.

Rails doesn't seem to let you specify additional conditions for an includes() statement. If it did, my previous answer would work - you could put an additional condition on the includes() statement that would let the where conditions work correctly. To solve this we'd need to get includes() to use something like the following SQL (Getting the 'AND' condition is the problem):

LEFT JOIN user_thing_states as uts ON things.id = uts.thing_id AND uts.user_id = :user_id

I'm resorting to this for now which is a bit awful.

class User
  # ...

  def subscribed_things
    subscribed_things_with_state + subscribed_things_with_no_state
  end

  def subscribed_things_with_state
    things.includes(:user_thing_states).by_subscribed_collections(self).all
  end

  def subscribed_things_with_no_state
    Thing.with_no_state.by_subscribed_collections(self).all
  end
end
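Why the extra condition has to live in the ON clause rather than in WHERE can be seen in a small in-memory model of a left join. Everything here is an illustrative assumption: the sample rows, the `left_join` helper, and the `current_user_id` value are not part of the original code.

```ruby
# Hypothetical stand-ins for the `things` and `user_thing_states` tables.
things = [{ id: 1 }, { id: 2 }]
user_thing_states = [
  { thing_id: 1, user_id: 7, state: "read" },
  { thing_id: 2, user_id: 9, state: "starred" }
]
current_user_id = 7

# Minimal LEFT JOIN: the block is the ON condition; things without a
# matching state still produce one row, padded with nil (NULL).
def left_join(things, states)
  things.flat_map do |thing|
    matches = states.select { |state| yield(thing, state) }
    matches = [nil] if matches.empty?
    matches.map { |state| { thing: thing, state: state } }
  end
end

# Condition inside ON: every thing survives, state is nil when not ours.
on_clause = left_join(things, user_thing_states) do |thing, state|
  state[:thing_id] == thing[:id] && state[:user_id] == current_user_id
end

# Condition in WHERE: join on thing_id only, then filter the joined rows.
# Rows whose state belongs to someone else (or is NULL) are dropped.
where_clause = left_join(things, user_thing_states) do |thing, state|
  state[:thing_id] == thing[:id]
end.select { |row| row[:state] && row[:state][:user_id] == current_user_id }

puts "ON:    #{on_clause.map { |r| [r[:thing][:id], r[:state]&.fetch(:state)] }.inspect}"
puts "WHERE: #{where_clause.map { |r| [r[:thing][:id], r[:state]&.fetch(:state)] }.inspect}"
```

With the condition in ON, thing 2 is kept with an empty state; with the condition in WHERE it disappears, which is the behaviour the workaround above is trying to avoid.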

Rails: How to get objects with at least one child?

Parent.joins(:children).distinct # on older Rails versions: Parent.joins(:children).uniq.all