Rails: How to Select Records Which Don't Have a Specific Related (associated) Object (SQL EXISTS brief how-to)
Such questions are pretty common among beginner to mid-level Rails developers. You know the ActiveRecord interface and basic SQL operations, but you get stuck on the kind of task outlined in the title. (A couple of examples of such questions: 1, 2.)
The answer is simple: use the SQL EXISTS condition. Quick reference from the given URL:
Syntax
The syntax for the SQL EXISTS condition is:
WHERE EXISTS ( subquery );
Parameters or Arguments
subquery
The subquery is a SELECT statement. If the subquery returns at least one record in its result set, the EXISTS clause will evaluate to true and the EXISTS condition will be met. If the subquery does not return any records, the EXISTS clause will evaluate to false and the EXISTS condition will not be met.
It is also mentioned that EXISTS might be slower than JOIN, but that is usually not true. From the Exists v. Join question on SO:
EXISTS is only used to test if a subquery returns results, and short circuits as soon as it does. JOIN is used to extend a result set by combining it with additional fields from another table to which there is a relation. [...] If you have proper indexes, most of the time the EXISTS will perform identically to the JOIN. The exception is on very complicated subqueries, where it is normally quicker to use EXISTS.
So the database doesn't need to look through all the connections (it stops 'joining' records with EXISTS as soon as it finds a matching one), and it doesn't need to return all the fields from the joined table (it just checks that the corresponding row, well, does exist).
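The difference can be pictured with a small plain-Ruby model. This is only a sketch: the connections array is made-up data standing in for the table, and Ruby's any? plays the role of EXISTS while select plays the role of a join that materializes every matching row. A real database planner may behave differently.

```ruby
# Plain-Ruby model of EXISTS vs JOIN; the rows below are hypothetical.
connections = [
  { user_id: 1, group_id: 4 },
  { user_id: 1, group_id: 5 },
  { user_id: 1, group_id: 6 },
  { user_id: 2, group_id: 4 }
]

# EXISTS-like check: any? stops at the first matching row and yields a boolean.
rows_examined = 0
exists = connections.any? do |c|
  rows_examined += 1
  c[:user_id] == 1
end

# JOIN-like result: every matching row is collected and returned.
joined = connections.select { |c| c[:user_id] == 1 }

puts "exists=#{exists} rows_examined=#{rows_examined} joined_rows=#{joined.size}"
```

With this data the existence check looks at a single row, while the join-like version returns three rows for the same user.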
Answering the specific questions:
Select only those users who don't belong to a given set of Groups (groups with ids [4,5,6]):
not_four_to_six = User.where("NOT EXISTS (
SELECT 1 FROM connections
WHERE connections.user_id = users.id
AND connections.group_id IN (?)
)", [4,5,6])
Select only those users who belong to one set of Groups ([1,2,3]) and don't belong to another ([4,5,6]):
one_two_three = not_four_to_six.where("EXISTS (
SELECT 1 FROM connections
WHERE connections.user_id = users.id
AND connections.group_id IN (?)
)", [1,2,3])
Select only those users who don't belong to any Group:
User.where("NOT EXISTS (
SELECT 1 FROM connections
WHERE connections.user_id = users.id
)")
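To check what these three queries are expected to return, here is a plain-Ruby model of the same logic. It is a sketch under assumptions: the user ids and connection rows are invented, and the in_any lambda is a hypothetical helper that mimics the correlated subquery; it is not ActiveRecord code.

```ruby
# Hypothetical data: four users, user 4 has no connections at all.
user_ids = [1, 2, 3, 4]
connections = [
  { user_id: 1, group_id: 1 },
  { user_id: 1, group_id: 4 },
  { user_id: 2, group_id: 2 },
  { user_id: 3, group_id: 5 }
]

# Mimics: EXISTS (SELECT 1 FROM connections
#                 WHERE user_id = users.id AND group_id IN (...))
in_any = lambda do |user_id, group_ids|
  connections.any? do |c|
    c[:user_id] == user_id && group_ids.include?(c[:group_id])
  end
end

# NOT EXISTS for groups 4..6
not_four_to_six = user_ids.reject { |id| in_any.call(id, [4, 5, 6]) }

# ...chained with EXISTS for groups 1..3
one_two_three = not_four_to_six.select { |id| in_any.call(id, [1, 2, 3]) }

# NOT EXISTS with no group filter: users without any connection
no_group = user_ids.reject { |id| connections.any? { |c| c[:user_id] == id } }

p not_four_to_six # => [2, 4]
p one_two_three   # => [2]
p no_group        # => [4]
```

User 1 is in group 1 but also in group 4, so it is excluded by the first condition and never reaches the second.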
Want to find records with no associated records in Rails
This is still pretty close to SQL, but it should get everyone with no friends in the first case:
Person.where('id NOT IN (SELECT DISTINCT(person_id) FROM friends)')
Rails: Find the has_one record that doesn't have one
I'm not sure what the Ruby code would be, but I think the SQL should be something like:
SELECT * FROM Foo WHERE id NOT IN (SELECT foo_id FROM Bar)
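One caveat with the NOT IN form, assuming the foo_id column is nullable: if the subquery returns even one NULL, SQL's three-valued logic makes NOT IN evaluate to UNKNOWN for every non-matching row, and the query returns nothing. NOT EXISTS does not have this problem. The sketch below models that in plain Ruby, with nil standing in for both NULL and UNKNOWN; sql_not_in and sql_not_exists are hypothetical helper names, not part of any library.

```ruby
# Models `value NOT IN (list)` under SQL three-valued logic.
# Returns true, false, or nil (nil stands in for UNKNOWN).
def sql_not_in(value, list)
  return false if list.compact.include?(value) # definite match
  return nil if list.include?(nil)             # no match, but a NULL is present
  true
end

# Models `NOT EXISTS (SELECT 1 FROM bar WHERE bar.foo_id = value)`.
# A NULL foo_id never satisfies the equality, so it is simply skipped.
def sql_not_exists(value, list)
  list.none? { |v| !v.nil? && v == value }
end

foo_ids     = [1, 2, 3]
bar_foo_ids = [1, nil] # hypothetical: one Bar row has a NULL foo_id

# WHERE keeps a row only when the condition is true, so nil drops the row.
via_not_in     = foo_ids.select { |id| sql_not_in(id, bar_foo_ids) }
via_not_exists = foo_ids.select { |id| sql_not_exists(id, bar_foo_ids) }

p via_not_in     # => []
p via_not_exists # => [2, 3]
```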
Make a request with activerecord to get only the users from groups and not from others
The correct way to do such things is to use the SQL EXISTS condition. I wish there was a specific ActiveRecord helper method for that, but there isn't at the moment.
Well, using pure SQL is just fine:
User.where("EXISTS (SELECT 1 FROM groups_users WHERE groups_users.user_id = users.id AND groups_users.group_id IN (?))", [8939, 8950]).
where("NOT EXISTS (SELECT 1 FROM groups_users WHERE groups_users.user_id = users.id AND groups_users.group_id IN (?))", [8942])
What you were doing with your original query was asking not to join groups with ids [8942], and to join only groups with ids [8939, 8950]. You can see now that this doesn't make sense: it's like asking to select every user whose name is bob and NOT charlie. The second condition doesn't add anything to the first one.
A join query multiplies rows, so if your user is in every group, the result set would be:
user_id | group_id
1 | 8939
1 | 8950
1 | 8942
Then you filter out the last row: 1 | 8942. Still, user 1 is in the result set and is returned.
And to ask the database to return only records which don't connect with another relation, you should explicitly use NOT EXISTS, which exists explicitly for that purpose :)
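The same reasoning can be replayed in plain Ruby. This is a sketch with invented membership rows (user 2 is added for contrast); filtering joined rows corresponds to the original JOIN + WHERE query, and the per-user check corresponds to the EXISTS / NOT EXISTS pair.

```ruby
# Hypothetical rows of groups_users.
memberships = [
  { user_id: 1, group_id: 8939 },
  { user_id: 1, group_id: 8950 },
  { user_id: 1, group_id: 8942 },
  { user_id: 2, group_id: 8939 }
]
wanted   = [8939, 8950]
unwanted = [8942]

# JOIN + WHERE: the filter is applied row by row, so user 1 survives
# through the two rows that are not the 8942 row.
join_rows = memberships.select do |m|
  wanted.include?(m[:group_id]) && !unwanted.include?(m[:group_id])
end
join_user_ids = join_rows.map { |m| m[:user_id] }.uniq

# EXISTS + NOT EXISTS: the conditions are evaluated once per user,
# against all of that user's memberships.
user_ids = memberships.map { |m| m[:user_id] }.uniq
exists_user_ids = user_ids.select do |uid|
  groups = memberships.select { |m| m[:user_id] == uid }.map { |m| m[:group_id] }
  groups.any? { |g| wanted.include?(g) } && groups.none? { |g| unwanted.include?(g) }
end

p join_user_ids   # => [1, 2]
p exists_user_ids # => [2]
```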
Find all records which have a count of an association greater than zero
joins uses an inner join by default, so using Project.joins(:vacancies) will in effect only return projects that have an associated vacancy.
UPDATE:
As pointed out by @mackskatz in the comment, without a group clause, the code above will return duplicate projects for projects with more than one vacancy. To remove the duplicates, use
Project.joins(:vacancies).group('projects.id')
UPDATE:
As pointed out by @Tolsee, you can also use distinct.
Project.joins(:vacancies).distinct
As an example
[10] pry(main)> Comment.distinct.pluck :article_id
=> [43, 34, 45, 55, 17, 19, 1, 3, 4, 18, 44, 5, 13, 22, 16, 6, 53]
[11] pry(main)> _.size
=> 17
[12] pry(main)> Article.joins(:comments).size
=> 45
[13] pry(main)> Article.joins(:comments).distinct.size
=> 17
[14] pry(main)> Article.joins(:comments).distinct.to_sql
=> "SELECT DISTINCT \"articles\".* FROM \"articles\" INNER JOIN \"comments\" ON \"comments\".\"article_id\" = \"articles\".\"id\""
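Why the duplicates appear can be seen in a plain-Ruby model of an inner join. This is a sketch with made-up projects and vacancies, not ActiveRecord code: the join emits one row per matching (project, vacancy) pair, and uniq plays the role of DISTINCT.

```ruby
# Hypothetical data: project 1 has two vacancies, project 3 has none.
projects = [
  { id: 1, name: "Alpha" },
  { id: 2, name: "Beta" },
  { id: 3, name: "Gamma" }
]
vacancies = [
  { id: 10, project_id: 1 },
  { id: 11, project_id: 1 },
  { id: 12, project_id: 2 }
]

# INNER JOIN: one output row per matching pair, so a project with
# two vacancies shows up twice and a project with none disappears.
joined = projects.flat_map do |project|
  vacancies.select { |v| v[:project_id] == project[:id] }.map { project }
end

joined_ids   = joined.map { |p| p[:id] }
distinct_ids = joined.uniq.map { |p| p[:id] }

p joined_ids   # => [1, 1, 2]
p distinct_ids # => [1, 2]
```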
How to select subset of users based on many-to-many relationship?
I would do this:
User.joins("LEFT JOIN relationships ON relationships.user_id = users.id").where('relationships.user_id IS NULL').offset(rand(0..100)).first
Something like:
member_ids = Relationship.where(member: true).pluck(:user_id).uniq
users = User.where.not(id: member_ids) # or User.where('id NOT in (?)', member_ids) on Rails < 4
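Both answers boil down to an anti-join. The plain-Ruby sketch below models the two approaches on invented data: a LEFT JOIN that keeps only rows where the right side is NULL, and a two-step version that collects ids first and then excludes them. The variable names are illustrative only.

```ruby
# Hypothetical data: user 2 has no relationships.
user_ids = [1, 2, 3]
relationships = [{ user_id: 1 }, { user_id: 1 }, { user_id: 3 }]

# LEFT JOIN: every user is kept; a user without matches gets a single
# row whose right-hand side is nil (NULL).
left_joined = user_ids.flat_map do |uid|
  matches = relationships.select { |r| r[:user_id] == uid }
  matches.empty? ? [[uid, nil]] : matches.map { |r| [uid, r[:user_id]] }
end

# WHERE relationships.user_id IS NULL
via_left_join = left_joined.select { |_, rel_user_id| rel_user_id.nil? }.map(&:first)

# pluck + where.not: collect the ids, then exclude them.
member_ids = relationships.map { |r| r[:user_id] }.uniq
via_exclusion = user_ids - member_ids

p via_left_join # => [2]
p via_exclusion # => [2]
```

The two-step version makes two passes (two queries in Rails) and moves the id list through application memory, which may matter for large tables.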
Including an association if it exists in a rails query
Answering my own question twice... bit awkward but anyway.
Rails doesn't seem to let you specify additional conditions for an includes() statement. If it did, my previous answer would work - you could put an additional condition on the includes() statement that would let the where conditions work correctly. To solve this we'd need to get includes() to use something like the following SQL (Getting the 'AND' condition is the problem):
LEFT JOIN user_thing_states as uts ON things.id = uts.thing_id AND uts.user_id = :user_id
I'm resorting to this for now which is a bit awful.
class User
  ...
  def subscribed_things
    self.subscribed_things_with_state + self.subscribed_things_with_no_state
  end

  def subscribed_things_with_state
    self.things.includes(:user_thing_states).by_subscribed_collections(self).all
  end

  def subscribed_things_with_no_state
    Thing.with_no_state().by_subscribed_collections(self).all
  end
end
Rails: How to get objects with at least one child?
Parent.joins(:children).uniq.all