How to Build a Query in Ruby on Rails That Joins on the Max of a Has_Many Relation Only and Includes a Select Filter on That Relation

How do I build a query in Ruby on Rails that joins on the max of a has_many relation only and includes a select filter on that relation?

The simplest solution (in terms of code complexity) I can think of is to first fetch the employee ids with their maximum created_at values, then compose a new query from the result.

attributes = %i[employee_id created_at]
employments = Employment.group(:employee_id).maximum(:created_at)
              .map { |values| Employment.where(attributes.zip(values).to_h) }
              .reduce(Employment.none, :or)
              .where(status: :inactive)

employees = Employee.where(id: employments.select(:employee_id))

This should produce the following SQL:

SELECT employments.employee_id, MAX(employments.created_at)
FROM employments
GROUP BY employments.employee_id

With the result, the following query is built:

SELECT employees.*
FROM employees
WHERE employees.id IN (
  SELECT employments.employee_id 
  FROM employments
  WHERE (
    employments.employee_id = ? AND employments.created_at = ?
    OR employments.employee_id = ? AND employments.created_at = ?
    OR employments.employee_id = ? AND employments.created_at = ?
    -- ...
  ) AND employments.status = 'inactive'
)
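
To make the intermediate step concrete, here is a minimal plain-Ruby sketch (no database involved) of how the grouped maxima turn into one condition hash per employee. It relies on group(...).maximum(...) returning a Hash keyed by the grouped column; the constant names, ids and timestamps are made up for illustration.

```ruby
ATTRIBUTES = %i[employee_id created_at]

# Stand-in for Employment.group(:employee_id).maximum(:created_at),
# which returns a Hash of { employee_id => latest created_at }.
# The ids and timestamps are made-up sample data.
LATEST_BY_EMPLOYEE = {
  1 => Time.utc(2021, 3, 1),
  2 => Time.utc(2021, 5, 9)
}

# Iterating a Hash yields [key, value] pairs, so zipping each pair with
# the attribute names gives one condition hash per employee.
def condition_hashes(maxima)
  maxima.map { |values| ATTRIBUTES.zip(values).to_h }
end

condition_hashes(LATEST_BY_EMPLOYEE).each { |conditions| p conditions }
```

Each of these hashes becomes one `employee_id = ? AND created_at = ?` branch of the OR chain, which is why the query grows with the number of employees.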

The above method doesn't hold up well for large numbers of records, since the query grows with each additional employee. It becomes a lot easier if we can assume that a higher id means the record was created later. In that scenario the following would do the trick:

employment_ids = Employment.select(Employment.arel_table[:id].maximum).group(:employee_id)
employee_ids = Employment.select(:employee_id).where(id: employment_ids, status: :inactive)
employees = Employee.where(id: employee_ids)

This should produce a single query when employees is loaded.

SELECT employees.*
FROM employees
WHERE employees.id IN (
  SELECT employments.employee_id 
  FROM employments
  WHERE employments.id IN (
    SELECT MAX(employments.id)
    FROM employments
    GROUP BY employments.employee_id
  ) AND employments.status = 'inactive'
)

This solution works a lot better with larger datasets, but you might want to look into max's answer for better lookup performance.
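
As a sanity check on the logic, here is a plain-Ruby sketch that models the nested subqueries on in-memory rows. The struct, the method names and the sample rows are hypothetical and only mirror the columns used in the query.

```ruby
# Hypothetical sample rows standing in for the employments table.
EmploymentRow = Struct.new(:id, :employee_id, :status)

ROWS = [
  EmploymentRow.new(1, 10, :inactive),
  EmploymentRow.new(2, 10, :active),   # employee 10's latest row is active
  EmploymentRow.new(3, 20, :active),
  EmploymentRow.new(4, 20, :inactive)  # employee 20's latest row is inactive
]

# Innermost subquery: MAX(id) per employee_id.
def latest_employment_ids(rows)
  rows.group_by(&:employee_id).map { |_employee_id, group| group.map(&:id).max }
end

# Middle subquery: keep rows whose id is a latest id AND whose status matches.
def employee_ids_with_latest_status(rows, status)
  latest = latest_employment_ids(rows)
  rows.select { |row| latest.include?(row.id) && row.status == status }
      .map(&:employee_id)
end

p employee_ids_with_latest_status(ROWS, :inactive) # => [20]
```

Employee 10 has an older inactive row but is left out, because the status filter is applied only to the latest row per employee. Filtering on status before taking the maximum would include employee 10 as well.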

Rails: query for 2-step relation through join-table + has many relation

@shippingservices = @cart.available_shipping_services.joins(:lands => :zones).where('lands_zones_join.land_id = ?', params[:id])

The lands_zones_join table in the where clause is incorrect. Rails is seeing this as a derived table name and applying two extra joins seen here:

INNER JOIN `lands_zones` `zones_lands_join` ON `zones_lands_join`.`land_id` = `lands`.`id` 
INNER JOIN `zones` `zones_lands` ON `zones_lands`.`id` = `zones_lands_join`.`zone_id`

This is leading to the duplication in the query results.

The relations from Shippingservice to Land have been set up correctly in the models and schema. We can therefore join the tables directly and query the lands table itself for the id:

@cart.available_shipping_services.joins(:lands).where(lands: {id: params[:id]})
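
To illustrate why the extra joins duplicate results, here is a small plain-Ruby sketch of the row multiplication. The service names, zone names and both arrays are invented sample data, not taken from the question's schema.

```ruby
# Hypothetical join-table rows: [shipping_service, land_id] and [land_id, zone].
SERVICES_LANDS = [[:dhl, 1], [:ups, 1], [:ups, 2]]
LANDS_ZONES    = [[1, :eu], [1, :schengen], [2, :asia]]

# Joining through lands to zones: one output row per matching zone.
def services_via_zones(land_id)
  SERVICES_LANDS.flat_map do |service, land|
    LANDS_ZONES.select { |l, _zone| l == land && l == land_id }
               .map { service }
  end
end

# Joining lands only: one output row per service/land pair.
def services_via_lands(land_id)
  SERVICES_LANDS.select { |_service, land| land == land_id }.map(&:first)
end

p services_via_zones(1) # => [:dhl, :dhl, :ups, :ups]
p services_via_lands(1) # => [:dhl, :ups]
```

Because land 1 belongs to two zones in this sample, every service for that land shows up twice once zones are joined in.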

How to filter based on has_many association?

You're looking for a "contains" query:

SELECT p.*
FROM projects p
         INNER JOIN taggings t ON p.id = t.project_id
GROUP BY p.id
HAVING array_agg(t.tag_id ORDER BY t.tag_id) @> ARRAY [1, 3, 5];

This will return all the projects that have all of the given tags, but are not limited to them, i.e. if a project has tags 1, 3, 5, 7, it will be returned too. But not a project that is missing any of the given tags.

A couple of conditions:

ARRAY [1, 3, 5] must be sorted
p.id (which is really projects.id) must: a) be the primary key or b) have a uniqueness constraint attached to it.

The advantage of doing it this way is that the query is flexible: you can change the operator to quickly change the meaning. Say, instead of "return a project with all of these tags", you could now write "return a project with only these tags".

Consider this data set:

projects:

id  name
1   guttenberg
2   x
3   aristotle

tags:

id  name
1   books
2   teams
3   management
4   library
5   movie

taggings:

id  project_id  tag_id
1   1   1
2   1   3
3   1   5
4   2   1
5   2   3
6   3   4
7   3   5

If you were to query for 1, 3, you should get projects 1 and 2.
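
The same check can be modelled in plain Ruby on the taggings rows above, which also shows the difference between "contains" and "only these tags". This is just a sketch of the set logic, and the method names are made up.

```ruby
# taggings rows from the data set above as [project_id, tag_id].
TAGGINGS = [[1, 1], [1, 3], [1, 5], [2, 1], [2, 3], [3, 4], [3, 5]]

# Equivalent of array_agg(tag_id ORDER BY tag_id) grouped by project.
def tags_by_project
  TAGGINGS.group_by(&:first).transform_values { |rows| rows.map(&:last).sort }
end

# "Contains" (@>): the project has every wanted tag, and possibly more.
def projects_with_all(wanted)
  tags_by_project.select { |_id, tags| (wanted - tags).empty? }.keys
end

# Equality (=): the project has exactly the wanted tags.
def projects_with_only(wanted)
  tags_by_project.select { |_id, tags| tags == wanted.sort }.keys
end

p projects_with_all([1, 3])  # => [1, 2]
p projects_with_only([1, 3]) # => [2]
```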

A SQL fiddle to play with: http://sqlfiddle.com/#!17/345dd0/9/1

Equivalent ActiveRecord:

tag_ids = [1, 5, 3].sort # condition 1
projects = 
  Project.joins(:taggings) # don't need tags
    .group(:id)
    .having("array_agg(taggings.tag_id ORDER BY taggings.tag_id) @> ARRAY[?]", tag_ids)

select all records holding some condition in has_many association - Ruby On Rails

You can do this to get all profile_ids which have both the accounting and administration skills:

Skill.where(name: ["accounting", "administration"]).group(:profile_id).having("count('id') = 2").pluck(:profile_id)

If you need the profile details, you can use this query in the where clause of Profile, matching on id.

Note the number 2 in the query: it is the length of the array used in the where clause, in this case ["accounting", "administration"].length.
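
Here is a plain-Ruby model of that group/having logic, with the count derived from the array instead of hard-coded. The rows and the method name are hypothetical, and the sketch assumes a profile never has the same skill name twice.

```ruby
# Hypothetical skills rows as [profile_id, name].
SKILLS = [
  [1, "accounting"], [1, "administration"],
  [2, "accounting"],
  [3, "administration"], [3, "marketing"]
]

# Model of: WHERE name IN (...) GROUP BY profile_id HAVING count(*) = names.size
def profile_ids_with_all_skills(names)
  SKILLS.select { |_profile_id, name| names.include?(name) }
        .group_by(&:first)
        .select { |_profile_id, rows| rows.size == names.size }
        .keys
end

p profile_ids_with_all_skills(["accounting", "administration"]) # => [1]
```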

UPDATE:

Based on the updated question description, instead of pluck you can use select and pass the relation as a subquery, so that everything happens in one query.

Profile.where(id: Skill.where(name: ["accounting", "administration"]).group(:profile_id).having("count('id') = 2").select(:profile_id))

Moreover, you keep control over sorting, pagination and additional where clauses. I don't see any of the concerns mentioned in the question edit applying here.

UPDATE 2:

Another way to get the intersection of profiles with both skills (likely to be less efficient than the solution above):

profiles = Profile

["accounting", "administration"].each do |name|
  profiles = profiles.where(id: Skill.where(name: name).select(:profile_id))
end
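
Each pass through the loop narrows the result to profiles that also have the next skill, which amounts to a set intersection. A plain-Ruby sketch of that idea, with made-up rows and method names, and assuming the list of names is not empty:

```ruby
# Hypothetical skills rows as [profile_id, name].
SKILL_ROWS = [
  [1, "accounting"], [1, "administration"],
  [2, "accounting"],
  [3, "administration"]
]

# Model of Skill.where(name: name).select(:profile_id)
def profile_ids_for(name)
  SKILL_ROWS.select { |_profile_id, skill| skill == name }.map(&:first)
end

# Intersect the id lists, one skill at a time (names must not be empty).
def profile_ids_with_every(names)
  names.map { |name| profile_ids_for(name) }.reduce(:&)
end

p profile_ids_with_every(["accounting", "administration"]) # => [1]
```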

How to select subset of users based on many-to-many relationship?

I would do this:

User.joins("LEFT JOIN relationships ON relationships.user_id = users.id").where('relationships.user_id IS NULL').offset(rand(0..100)).first

Something like:

member_ids = Relationship.where(member: true).pluck(:user_id).uniq
users = User.where.not(id: member_ids) # or User.where('id NOT in (?)', member_ids) on Rails < 4

How to make query conditional on associated table in ActiveRecord

Here is how I would recommend going about this.

We will create inverse scopes for busy and available like so:

class Act < ApplicationRecord
   has_many :events 

   scope :busy_on, ->(date) { joins(:events).where(events: {date: date}) }
   scope :available_on, ->(date) {where.not(id: busy_on(date).select(:id))}
end

Here we create one scope for the days on which an Act is busy, and then use that scope as a negated filter to determine whether the act is available.
The resulting SQL for the busy_on scope will be:

  SELECT 
     acts.* 
  FROM 
     acts
     INNER JOIN events ON acts.id = events.act_id
  WHERE 
     events.date = [THE DATE YOU PASS INTO THE SCOPE]

Thus the resulting SQL for the available_on scope will be:

 SELECT 
   acts.* 
 FROM 
   acts 
 WHERE 
   acts.id NOT IN ( 
      SELECT 
         acts.id
      FROM 
         acts
         INNER JOIN events ON acts.id = events.act_id
      WHERE 
         events.date = [THE DATE YOU PASS INTO THE SCOPE]
   )
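
The relationship between the two scopes can be modelled in plain Ruby as a set difference. This is only a sketch: the act ids, the event dates and the constant names are invented for illustration.

```ruby
require "date"

ACT_IDS = [1, 2, 3]

# Hypothetical events as [act_id, date]; act 3 has no events at all.
EVENTS = [
  [1, Date.new(2024, 5, 1)],
  [1, Date.new(2024, 5, 2)],
  [2, Date.new(2024, 5, 1)]
]

# Model of busy_on: acts with at least one event on the given date.
def busy_on(date)
  EVENTS.select { |_act_id, event_date| event_date == date }.map(&:first).uniq
end

# Model of available_on: every act that is not in busy_on.
def available_on(date)
  ACT_IDS - busy_on(date)
end

p available_on(Date.new(2024, 5, 1)) # => [3]
p available_on(Date.new(2024, 5, 2)) # => [2, 3]
```

Act 3 has no events and still counts as available. A plain join with a "date is not X" condition would drop it, and would also wrongly keep act 1 on May 1 because of its May 2 event, which is why the NOT IN form is used.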

SQL where joined set must contain all values but may contain more

Group by offer.id, not by sports.name (or sports.id):

SELECT o.*
FROM   sports        s
JOIN   offers_sports os ON os.sport_id = s.id
JOIN   offers        o  ON os.offer_id = o.id
WHERE  s.name IN ('Bodyboarding', 'Surfing') 
GROUP  BY o.id  -- !!
HAVING count(*) = 2;

Assuming the typical implementation:

offers.id and sports.id are defined as primary keys.
sports.name is defined unique.
(sport_id, offer_id) in offers_sports is defined unique (or PK).

With these constraints you don't need DISTINCT in the count, and count(*) is even a bit cheaper.
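
To see why the uniqueness assumption on offers_sports matters, here is a plain-Ruby sketch that compares counting rows with counting distinct sports. The rows are made up, and offer 2 deliberately has a duplicate row that a unique constraint on the pair would prevent.

```ruby
# Hypothetical offers_sports rows as [offer_id, sport_name].
OFFERS_SPORTS = [
  [1, "Bodyboarding"], [1, "Surfing"],
  [2, "Surfing"], [2, "Surfing"] # duplicate row, only possible without the constraint
]

def matching_rows_by_offer(names)
  OFFERS_SPORTS.select { |_offer_id, sport| names.include?(sport) }.group_by(&:first)
end

# Model of HAVING count(*) = names.size
def offers_by_row_count(names)
  matching_rows_by_offer(names).select { |_id, rows| rows.size == names.size }.keys
end

# Model of HAVING count(DISTINCT sport) = names.size
def offers_by_distinct_count(names)
  matching_rows_by_offer(names)
    .select { |_id, rows| rows.map(&:last).uniq.size == names.size }
    .keys
end

p offers_by_row_count(["Bodyboarding", "Surfing"])      # => [1, 2]
p offers_by_distinct_count(["Bodyboarding", "Surfing"]) # => [1]
```

With the unique constraint in place the duplicate row cannot exist, so both counts agree and the cheaper count(*) is enough.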

Related answer with an arsenal of possible techniques:

How to filter SQL results in a has-many-through relation

Added by @max (the OP) - this is the above query rolled into ActiveRecord:

class Offer < ActiveRecord::Base
  has_and_belongs_to_many :sports
  def self.includes_sports(*sport_names)
    joins(:sports)
      .where(sports: { name: sport_names })
      .group('offers.id')
      .having("count(*) = ?", sport_names.size)
  end
end