How to Build a Query in Ruby on Rails That Joins on the Max of a Has_Many Relation Only and Includes a Select Filter on That Relation

How do I build a query in Ruby on Rails that joins on the max of a has_many relation only and includes a select filter on that relation?

The simplest solution (in terms of code complexity) I can think of is first fetching the employee ids with their maximum created_at values, then composing a new query from the result.

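attributes = %i[employee_id created_at]
employments = Employment.group(:employee_id).maximum(:created_at)
  # each pair is [employee_id, max created_at]; the conditions must be on Employment so they can be OR-ed with Employment.none
  .map { |values| Employment.where(attributes.zip(values).to_h) }
  .reduce(Employment.none, :or)
  .where(status: :inactive)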

employees = Employee.where(id: employments.select(:employee_id))

This should produce the following SQL:

SELECT employments.employee_id, MAX(employments.created_at)
FROM employments
GROUP BY employments.employee_id

With the result, the following query is built:

SELECT employees.*
FROM employees
WHERE employees.id IN (
  SELECT employments.employee_id
  FROM employments
  WHERE (
    employments.employee_id = ? AND employments.created_at = ?
    OR employments.employee_id = ? AND employments.created_at = ?
    OR employments.employee_id = ? AND employments.created_at = ?
    -- ...
  ) AND employments.status = 'inactive'
)
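To make the selection logic concrete, here is a minimal plain-Ruby sketch of what the two steps compute. It works on in-memory hashes rather than ActiveRecord, and the sample rows are hypothetical.

```ruby
# Hypothetical sample rows standing in for the employments table.
employments = [
  { employee_id: 1, created_at: Time.utc(2020, 1, 1), status: "active" },
  { employee_id: 1, created_at: Time.utc(2021, 1, 1), status: "inactive" },
  { employee_id: 2, created_at: Time.utc(2020, 6, 1), status: "inactive" },
  { employee_id: 2, created_at: Time.utc(2021, 6, 1), status: "active" },
]

# Step 1: latest created_at per employee (mirrors GROUP BY ... MAX(created_at)).
latest = employments
  .group_by { |e| e[:employee_id] }
  .transform_values { |rows| rows.map { |e| e[:created_at] }.max }

# Step 2: keep rows that are the latest for their employee AND inactive.
inactive_employee_ids = employments
  .select { |e| e[:created_at] == latest[e[:employee_id]] && e[:status] == "inactive" }
  .map { |e| e[:employee_id] }

p inactive_employee_ids # => [1]
```

Employee 2 has an inactive employment, but it is not the most recent one, so only employee 1 is selected.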

The query above doesn't hold up well for large numbers of records, since it grows with each additional employee. Things become a lot easier if we can assume that the employment with the highest id was created last. In that scenario the following would do the trick:

employment_ids = Employment.select(Employment.arel_table[:id].maximum).group(:employee_id)
employee_ids = Employment.select(:employee_id).where(id: employment_ids, status: :inactive)
employees = Employee.where(id: employee_ids)

This should produce a single query when employees is loaded.

SELECT employees.*
FROM employees
WHERE employees.id IN (
  SELECT employments.employee_id
  FROM employments
  WHERE employments.id IN (
    SELECT MAX(employments.id)
    FROM employments
    GROUP BY employments.employee_id
  ) AND employments.status = 'inactive'
)

This solution works a lot better with larger datasets, but you might want to look into max's answer for better lookup performance.
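The second approach depends on the assumption that the highest id per employee is also the most recently created row. A rough plain-Ruby sketch of how one might check that assumption on a sample of rows follows; the data is hypothetical and deliberately contains one violation.

```ruby
# Hypothetical rows; employee 2's newest row (by created_at) has the lower id.
employments = [
  { id: 1, employee_id: 1, created_at: Time.utc(2020, 1, 1) },
  { id: 2, employee_id: 1, created_at: Time.utc(2021, 1, 1) },
  { id: 3, employee_id: 2, created_at: Time.utc(2021, 6, 1) },
  { id: 4, employee_id: 2, created_at: Time.utc(2020, 6, 1) },
]

# Employees whose highest-id row is not their most recently created row.
violations = employments
  .group_by { |e| e[:employee_id] }
  .reject { |_, rows| rows.max_by { |e| e[:id] }.equal?(rows.max_by { |e| e[:created_at] }) }
  .keys

p violations # => [2]
```

If such a check turns up violations in real data, the MAX(id) query would pick a different "latest" employment than the created_at-based one.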

Rails: query for 2-step relation through join-table + has many relation

@shippingservices = @cart.available_shipping_services.joins(:lands => :zones).where('lands_zones_join.land_id = ?', params[:id])

The lands_zones_join table in the where clause is incorrect. Rails is seeing this as a derived table name and applying two extra joins seen here:

INNER JOIN `lands_zones` `zones_lands_join` ON `zones_lands_join`.`land_id` = `lands`.`id` 
INNER JOIN `zones` `zones_lands` ON `zones_lands`.`id` = `zones_lands_join`.`zone_id`

This is leading to the duplication in the query results.

The relation from Shippingservice to Land has been set up correctly in the models and schema. We can therefore join the tables directly and query the lands table itself for the id:

@cart.available_shipping_services.joins(:lands).where(lands: {id: params[:id]})
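As a general illustration of why an unneeded inner join can duplicate results, here is a small plain-Ruby sketch. The rows are hypothetical and only model the shape of the problem, not the asker's actual schema.

```ruby
# Hypothetical data: one shipping service linked to land 1, which sits in two zones.
service_lands = [{ service: "express", land_id: 1 }]
lands_zones   = [{ land_id: 1, zone_id: 10 }, { land_id: 1, zone_id: 11 }]

# Joining through zones: one output row per matching (land, zone) pair.
with_zones = service_lands.flat_map do |sl|
  lands_zones.select { |lz| lz[:land_id] == sl[:land_id] }.map { sl[:service] }
end

# Joining lands only: one row per service/land link.
lands_only = service_lands.select { |sl| sl[:land_id] == 1 }.map { |sl| sl[:service] }

p with_zones # => ["express", "express"]
p lands_only # => ["express"]
```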

How to filter based on has_many association?

You're looking for a "contains" query:

SELECT p.*
FROM projects p
INNER JOIN taggings t ON p.id = t.project_id
GROUP BY p.id
HAVING array_agg(t.tag_id ORDER BY t.tag_id) @> ARRAY [1, 3, 5];

This will return all the projects that have all of the given tags, but it is not limited to those tags; i.e., if a project has tags 1, 3, 5, 7, it will be returned too, but not a project that is missing any of the given tags.

A couple of conditions:

  1. ARRAY [1, 3, 5] must be sorted
  2. p.id (which is really projects.id) must: a) be the primary key or b) have a uniqueness constraint attached to it.

The advantage of doing it this way is that the query is flexible: you can swap the operator to quickly change its meaning. Say, instead of "return a project with all of these tags", you could now write "return a project with only these tags".

Consider this data set:

projects:

id  name
1   guttenberg
2   x
3   aristotle

tags:

id  name
1   books
2   teams
3   management
4   library
5   movie

taggings:

id  project_id  tag_id
1   1           1
2   1           3
3   1           5
4   2           1
5   2           3
6   3           4
7   3           5

If you were to query for 1, 3, you should get projects 1 and 2.

A SQL fiddle to play with: http://sqlfiddle.com/#!17/345dd0/9/1

Equivalent ActiveRecord:

tag_ids = [1, 5, 3].sort # condition 1
projects =
  Project.joins(:taggings) # don't need tags
    .group(:id)
    .having("array_agg(taggings.tag_id ORDER BY taggings.tag_id) @> ARRAY[?]", tag_ids)
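The "contains" versus "only these" distinction can be sketched in plain Ruby with sets, using the taggings rows from the data set above. This is an in-memory illustration of the semantics, not a replacement for the SQL.

```ruby
require "set"

# The taggings rows from the data set, as [project_id, tag_id] pairs.
taggings = [[1, 1], [1, 3], [1, 5], [2, 1], [2, 3], [3, 4], [3, 5]]

# Collect each project's tag ids (mirrors GROUP BY p.id with array_agg).
tags_by_project = taggings
  .group_by(&:first)
  .transform_values { |pairs| pairs.map(&:last).to_set }

wanted = Set[1, 3]

# "Contains": the project's tags are a superset of the wanted tags.
containing = tags_by_project.select { |_, tags| tags.superset?(wanted) }.keys
# "Only these": the project's tags equal the wanted tags exactly.
exact = tags_by_project.select { |_, tags| tags == wanted }.keys

p containing # => [1, 2]
p exact      # => [2]
```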

select all records holding some condition in has_many association - Ruby On Rails

You can do this to get all profile_ids that have both the accounting and administration skills:

Skill.where(name: ["accounting", "administration"]).group(:profile_id).having("count('id') = 2").pluck(:profile_id)

If you need the profile details, you can pass the resulting ids to a where clause on Profile.

Note the number 2 in the query: it is the length of the array used in the where clause, in this case ["accounting", "administration"].length.
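Here is a minimal plain-Ruby sketch of the same count-based filter, with the count derived from the array instead of being hard-coded. The rows are hypothetical, and it assumes each (profile_id, name) pair appears at most once.

```ruby
# Hypothetical skills rows as [profile_id, name] pairs.
skills = [
  [1, "accounting"], [1, "administration"],
  [2, "accounting"],
  [3, "administration"], [3, "accounting"], [3, "sales"],
]

names = ["accounting", "administration"]

# Filter to the wanted names, group per profile, keep profiles matching every name.
profile_ids = skills
  .select { |_, name| names.include?(name) }
  .group_by(&:first)
  .select { |_, rows| rows.size == names.length }
  .keys

p profile_ids # => [1, 3]
```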

UPDATE:

Based on the updated question, you can use select instead of pluck and pass the relation as a subquery, so that everything happens in one query.

Profile.where(id: Skill.where(name: ["accounting", "administration"]).group(:profile_id).having("count('id') = 2").select(:profile_id))

Moreover, you keep control over sorting, pagination, and additional where clauses, so I don't see the concerns mentioned in the question edit applying here.

UPDATE 2:

Another way to get the intersection of profiles that have both skills (likely to be less efficient than the solution above):

profiles = Profile

["accounting", "administration"].each do |name|
  profiles = profiles.where(id: Skill.where(name: name).select(:profile_id))
end
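The chained where clauses amount to intersecting one id list per skill name. A rough in-memory sketch of that idea, with hypothetical rows:

```ruby
# Hypothetical skills rows as [profile_id, name] pairs.
skills = [
  [1, "accounting"], [1, "administration"],
  [2, "accounting"],
  [3, "administration"], [3, "accounting"], [3, "sales"],
]

names = ["accounting", "administration"]

# One list of profile ids per skill name, then intersect them all.
profile_ids = names
  .map { |name| skills.select { |_, n| n == name }.map(&:first) }
  .reduce(:&)

p profile_ids # => [1, 3]
```

Unlike the count-based version, this does not depend on each (profile_id, name) pair being unique.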

How to select subset of users based on many-to-many relationship?

I would do one of the following:

  1. User.joins("LEFT JOIN relationships ON relationships.user_id = users.id").where('relationships.user_id IS NULL').offset(rand(0..100)).first

  2. Something like:

    member_ids = Relationship.where(member: true).pluck(:user_id).uniq
    users = User.where.not(id: member_ids) # or User.where('id NOT in (?)', member_ids) on Rails < 4
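The two options select different subsets, which a small plain-Ruby sketch can make visible. The rows below are hypothetical.

```ruby
# Hypothetical rows.
users = [{ id: 1 }, { id: 2 }, { id: 3 }]
relationships = [{ user_id: 1, member: true }, { user_id: 2, member: false }]

# Option 1: users with no relationship row at all (LEFT JOIN ... IS NULL).
related_ids = relationships.map { |r| r[:user_id] }.uniq
without_relationship = users.reject { |u| related_ids.include?(u[:id]) }.map { |u| u[:id] }

# Option 2: users who are not members (NOT IN over member user ids).
member_ids = relationships.select { |r| r[:member] }.map { |r| r[:user_id] }.uniq
non_members = users.reject { |u| member_ids.include?(u[:id]) }.map { |u| u[:id] }

p without_relationship # => [3]
p non_members          # => [2, 3]
```

User 2 has a relationship row but is not a member, so only the second option returns that user.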

How to make query conditional on associated table in ActiveRecord

Here is how I would recommend going about this.

We will create inverted scopes for busy and available, like so:

class Act < ApplicationRecord
  has_many :events

  scope :busy_on, ->(date) { joins(:events).where(events: { date: date }) }
  scope :available_on, ->(date) { where.not(id: busy_on(date).select(:id)) }
end

Here we create one scope for the days that an Act is busy and then we use that scope as a counter filter to determine if the act is available.
The resulting SQL for busy_on scope will be:

SELECT
  acts.*
FROM
  acts
  INNER JOIN events ON acts.id = events.act_id
WHERE
  events.date = [THE DATE YOU PASS INTO THE SCOPE]

Thus the resulting SQL for the available_on scope will be:

SELECT
  acts.*
FROM
  acts
WHERE
  acts.id NOT IN (
    SELECT
      acts.id
    FROM
      acts
      INNER JOIN events ON acts.id = events.act_id
    WHERE
      events.date = [THE DATE YOU PASS INTO THE SCOPE]
  )
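The relationship between the two scopes can be sketched in plain Ruby with lambdas over in-memory rows. The data is hypothetical; the point is that "available" is defined as "everything minus busy".

```ruby
require "date"

# Hypothetical rows.
acts   = [{ id: 1, name: "Band" }, { id: 2, name: "DJ" }]
events = [{ act_id: 1, date: Date.new(2024, 5, 1) }]

# Ids of acts that have an event on the given date.
busy_on = ->(date) { events.select { |e| e[:date] == date }.map { |e| e[:act_id] }.uniq }
# All act ids except the busy ones (mirrors NOT IN over the busy subquery).
available_on = ->(date) { acts.map { |a| a[:id] } - busy_on.call(date) }

p busy_on.call(Date.new(2024, 5, 1))      # => [1]
p available_on.call(Date.new(2024, 5, 1)) # => [2]
```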

SQL where joined set must contain all values but may contain more

Group by offer.id, not by sports.name (or sports.id):

SELECT o.*
FROM sports s
JOIN offers_sports os ON os.sport_id = s.id
JOIN offers o ON os.offer_id = o.id
WHERE s.name IN ('Bodyboarding', 'Surfing')
GROUP BY o.id -- !!
HAVING count(*) = 2;

Assuming the typical implementation:

  • offer.id and sports.id are defined as primary key.
  • sports.name is defined unique.
  • (sport_id, offer_id) in offers_sports is defined unique (or PK).

You don't need DISTINCT in the count. And count(*) is even a bit cheaper, yet.
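To see why the uniqueness assumptions matter for count(*), consider this plain-Ruby sketch. The join rows are hypothetical and deliberately include a duplicate, which a unique constraint on (sport_id, offer_id) would prevent.

```ruby
# Hypothetical join rows as [offer_id, sport_name]; offer 2 has a duplicate row.
offers_sports = [
  [1, "Bodyboarding"], [1, "Surfing"],
  [2, "Surfing"], [2, "Surfing"],
]
wanted = ["Bodyboarding", "Surfing"]

rows_by_offer = offers_sports.select { |_, s| wanted.include?(s) }.group_by(&:first)

# Like count(*) = 2: offer 2 slips through when duplicates are possible.
by_count = rows_by_offer.select { |_, rows| rows.size == wanted.size }.keys
# Like count(DISTINCT ...) = 2: robust to duplicates.
by_distinct = rows_by_offer.select { |_, rows| rows.map(&:last).uniq.size == wanted.size }.keys

p by_count    # => [1, 2]
p by_distinct # => [1]
```

With the unique constraint in place the duplicate row cannot exist, and both variants agree.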

Related answer with an arsenal of possible techniques:

  • How to filter SQL results in a has-many-through relation

Added by @max (the OP) - this is the above query rolled into ActiveRecord:

class Offer < ActiveRecord::Base
  has_and_belongs_to_many :sports

  def self.includes_sports(*sport_names)
    joins(:sports)
      .where(sports: { name: sport_names })
      .group('offers.id')
      .having("count(*) = ?", sport_names.size)
  end
end

