Rails group a table by :created_at, returning a count of the :status column, and then sub-group the :status column with count of each unique value

I'm adding a second answer with a different approach that I believe to be much better in that it is efficient and can be translated into a DB view.

Any time I end up with lots of repeated hits on the DB, or with large, complex queries that don't translate well, I look to use pure SQL, since that can then be used as a view in the DB. I asked this question because my SQL is poor. I think this can be adapted to your needs, especially if the "status" field is a known set of possible values. Here's how I would try it initially:

Construct a SQL query that works. You can test this in psql.

SELECT created_at, count(status) AS total,
sum(case when status = 'error' then 1 end) AS errors,
sum(case when status = 'pending' then 1 end) AS pending,
sum(case when status = 'sent' then 1 end) AS sent
FROM notifications
GROUP BY created_at;

This should return a pivot table like:

| created_at       | total | errors | pending | sent |
|------------------|-------|--------|---------|------|
| Mon, 05 Oct 2015 | 2572  | 500    | 12      | null |
| Tue, 06 Oct 2015 | 555   | null   | 12      | 50   |
Great. Any single-table result like this is easy to load in Rails as an array of objects. Each of those objects will have a method that corresponds to each column. If a column is null for a row, Rails will return nil as the value.

Test it in Rails

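# find_by_sql runs the SQL as given and ignores a preceding where(...),
# so the user filter goes into the SQL itself.
# Assumes users is an array of user ids and the column is named user_id.
@stats = Notification.find_by_sql([
"SELECT created_at, count(status) AS total,
sum(case when status = 'error' then 1 end) AS errors,
sum(case when status = 'pending' then 1 end) AS pending,
sum(case when status = 'sent' then 1 end) AS sent
FROM notifications
WHERE user_id IN (?)
GROUP BY created_at", users])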

Which will return an array of Notification objects...

=> [#<Notification id: nil, created_at: "2014-02-07 22:36:30">,
#<Notification id: nil, created_at: "2014-06-26 02:07:51">,
#<Notification id: nil, created_at: "2015-04-26 21:37:09">,
#<Notification id: nil, created_at: "2014-02-07 22:48:29">,
#<Notification id: nil, created_at: "2014-11-04 23:39:07">,
#<Notification id: nil, created_at: "2015-01-27 17:46:50">,...]

Note that the Notification id: is nil. That's because these objects do not represent the actual objects in the DB, but a row in the table produced by your query. But now you can do something like:

@stats.each do |daily_stats|
  puts daily_stats.attributes
end

#{"created_at" => "Mon, 05 Oct 2015", "total" => 2572, "errors" => 500, "pending" => 12, "sent" => nil}
#{"created_at" => "Tue, 06 Oct 2015", "total" => 555, "errors" => nil, "pending" => 12, "sent" => 50}

And so on. Your @stats variable is easily passed to a view, where it can be printed as a table in an .html.erb file. You can access the attributes of any Notification object in the array like this:

@stats[0].created_at
#=> "Mon, 05 Oct 2015"

@stats[1].pending
#=> 12

The overall point is you have used one query to get your entire dataset.
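Since some of the counts come back as nil, a view usually wants to show them as 0. Below is a small sketch of that idea in plain Ruby. StatRow is a hypothetical stand-in for the objects find_by_sql returns, and format_stat is an illustrative helper, not part of Rails.

```ruby
# Stand-in for the objects returned by find_by_sql (hypothetical Struct).
StatRow = Struct.new(:created_at, :total, :errors, :pending, :sent)

# Formats one row as a fixed-width line; nil counts are shown as 0
# because nil.to_i is 0.
def format_stat(row)
  counts = [row.total, row.errors, row.pending, row.sent].map(&:to_i)
  format("%-12s %6d %6d %7d %5d", row.created_at, *counts)
end

stats = [
  StatRow.new("2015-10-05", 2572, 500, 12, nil),
  StatRow.new("2015-10-06", 555, nil, 12, 50)
]

puts format("%-12s %6s %6s %7s %5s", "created_at", "total", "errors", "pending", "sent")
stats.each { |row| puts format_stat(row) }
```

In an .html.erb template the same to_i call on each attribute would do the job inside the table cells.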

Store it as a view
Log into the SQL console on your DB and do

CREATE VIEW daily_stats AS
SELECT user_id, created_at, count(status) AS total,
sum(case when status = 'error' then 1 end) AS errors,
sum(case when status = 'pending' then 1 end) AS pending,
sum(case when status = 'sent' then 1 end) AS sent
FROM notifications
GROUP BY user_id, created_at;

Now you can get the results with

SELECT * FROM daily_stats;

Note that I have deliberately not limited this by user, as you did in your original question, and have added user_id to the SELECT. We are working in the DB directly, and it should easily handle generating a table from this view with ALL users' stats for every date. This is a very powerful dataset for what you are doing. Now you can set up a dummy model in Rails and have all of your data available without contorted Rails queries.

Add a dummy model
app/models/daily_stat.rb:

class DailyStat < ActiveRecord::Base
  belongs_to :user
  # This is a model for the DB view called daily_stats.
  # The class name is singular; Rails automatically looks for the table "daily_stats", which is snake_case and plural.
end

Add the corresponding relation to your User model:

class User < ActiveRecord::Base
has_many :daily_stats
end

Now you have access to your stats by user in a very Rails-like way.

users = [2]
DailyStat.where(user: users)
=> DailyStat Load (2.8ms) SELECT "daily_stats".* FROM "daily_stats" WHERE "daily_stats"."user_id" = 2
=> [#<DailyStat user_id: 2, created_at: "2014-02-14 00:30:24", total: 300, errors: 23, pending: nil, sent: 3>,
#<DailyStat user_id: 2, created_at: "2014-11-29 00:18:28", total: 2454, errors: 3, pending: 45, sent: 323>,
#<DailyStat user_id: 2, created_at: "2014-02-07 22:46:59", total: 589, errors: 33, pending: 240, sent: 68>...]

and in the other direction:

user = User.first
user.daily_stats
# returns that user's DailyStat records

The key is to "solve things at the lowest level". Solve a data query problem in the database, then use Rails to manipulate and present it.

SQL query to return a grouped result as a single row

The following should work in any RDBMS:

SELECT created_at, count(status) AS total,
sum(case when status = 'error' then 1 end) as errors,
sum(case when status = 'complete' then 1 end) as completed,
sum(case when status = 'on hold' then 1 end) as on_hold
FROM jobs
GROUP BY created_at;

The query uses conditional aggregation to pivot the grouped data. It assumes that the status values are known beforehand. If you have additional status values, just add a corresponding sum(case ...) expression for each.
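Because every extra status value only adds one more sum(case ...) line, the query can be generated from a list. The following is a rough sketch of such a generator; pivot_sql is a hypothetical helper, the column aliases are derived mechanically from the status text (so they come out as error, complete, on_hold rather than the hand-picked names above), and the table name is interpolated as-is, so it must come from trusted code.

```ruby
# Hypothetical helper: builds the conditional-aggregation query for a list of
# known status values. Only pass trusted table names; they are not escaped.
def pivot_sql(table, statuses)
  cases = statuses.map do |status|
    literal = status.gsub("'", "''")            # escape single quotes for SQL
    column  = status.downcase.gsub(/\W+/, "_")  # "on hold" -> "on_hold"
    "sum(case when status = '#{literal}' then 1 end) AS #{column}"
  end
  <<~SQL
    SELECT created_at, count(status) AS total,
    #{cases.join(",\n")}
    FROM #{table}
    GROUP BY created_at;
  SQL
end

puts pivot_sql("jobs", ["error", "complete", "on hold"])
```

The generated string can be tested in a SQL console first, exactly like the hand-written version.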


group by in active records

I finally solved this problem as follows.

In the controller:

@bill_details = @bill_details.group_by{|bd| bd.item_with_detail_id}.values

It gives the following result:

# => [[#<BillDetail id: 100, item_with_detail_id: 205, quantity: 3>, #<BillDetail id: 101, item_with_detail_id: 205, quantity: 5>]]

And in the view page:

<% for bd in @bill_details %>
  <% quantity = 0 %>
  <% for bill_detail in bd %>
    <% quantity += bill_detail.quantity %>
  <% end %>
  <% bd = bd.first %>
<% end %>
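The grouping and the quantity loop can also be combined outside the view. Here is a minimal sketch in plain Ruby, using a Struct as a stand-in for the BillDetail model; total_quantities is a hypothetical helper name.

```ruby
# Stand-in for the BillDetail model (hypothetical Struct with the same fields).
BillDetail = Struct.new(:id, :item_with_detail_id, :quantity)

# Groups details by item and sums the quantity of each group.
def total_quantities(bill_details)
  bill_details
    .group_by(&:item_with_detail_id)
    .transform_values { |group| group.sum(&:quantity) }
end

details = [
  BillDetail.new(100, 205, 3),
  BillDetail.new(101, 205, 5),
  BillDetail.new(102, 300, 2)
]

p total_quantities(details)
```

This keeps the arithmetic out of the template, which then only has to iterate over item ids and totals.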

How do I build a query in Ruby on Rails that joins on the max of a has_many relation only and includes a select filter on that relation?

The simplest solution (based on code complexity) I can think of is first fetching each employee id with its maximum created_at value, then composing a new query from the result.

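attributes = %i[employee_id created_at]
employments = Employment.group(:employee_id).maximum(:created_at)
  # each pair is [employee_id, max_created_at]; the relations must all be
  # Employment relations so they can be combined with .or
  .map { |values| Employment.where(attributes.zip(values).to_h) }
  .reduce(Employment.none, :or)
  .where(status: :inactive)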

employees = Employee.where(id: employments.select(:employee_id))

This should produce the following SQL:

SELECT employments.employee_id, MAX(employments.created_at)
FROM employments
GROUP BY employments.employee_id

With the result, the following query is built:

SELECT employees.*
FROM employees
WHERE employees.id IN (
SELECT employments.employee_id
FROM employments
WHERE (
employments.employee_id = ? AND employments.created_at = ?
OR employments.employee_id = ? AND employments.created_at = ?
OR employments.employee_id = ? AND employments.created_at = ?
-- ...
) AND employments.status = 'inactive'
)

The above method doesn't hold up well for large numbers of records, since the query grows with each additional employee. It becomes a lot easier if we can assume that a higher id always means a more recently created record. In that scenario the following would do the trick:

employment_ids = Employment.select(Employment.arel_table[:id].maximum).group(:employee_id)
employee_ids = Employment.select(:employee_id).where(id: employment_ids, status: :inactive)
employees = Employee.where(id: employee_ids)

This should produce a single query when employees is loaded.

SELECT employees.*
FROM employees
WHERE employees.id IN (
SELECT employments.employee_id
FROM employments
WHERE employments.id IN (
SELECT MAX(employments.id)
FROM employments
GROUP BY employments.employee_id
) AND employments.status = 'inactive'
)

This solution works a lot better with larger datasets, but you might want to look into the answer by max for better lookup performance.
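To check what both queries are meant to select, the same logic can be written against an in-memory list. This is only a sketch: EmploymentRow is a hypothetical Struct standing in for the Employment model, and ISO date strings are used so that string comparison matches chronological order.

```ruby
# Stand-in for an employments row (hypothetical Struct).
EmploymentRow = Struct.new(:id, :employee_id, :status, :created_at)

# Returns the ids of employees whose most recent employment has the given status.
def employees_whose_latest_is(employments, status)
  employments
    .group_by(&:employee_id)
    .values
    .map { |group| group.max_by(&:created_at) }   # latest employment per employee
    .select { |latest| latest.status == status }  # filter AFTER picking the latest
    .map(&:employee_id)
end

SAMPLE_EMPLOYMENTS = [
  EmploymentRow.new(1, 1, "active",   "2020-01-01"),
  EmploymentRow.new(2, 1, "inactive", "2021-01-01"),
  EmploymentRow.new(3, 2, "inactive", "2020-01-01"),
  EmploymentRow.new(4, 2, "active",   "2021-01-01"),
  EmploymentRow.new(5, 3, "inactive", "2019-06-01")
].freeze

p employees_whose_latest_is(SAMPLE_EMPLOYMENTS, "inactive")
```

Employee 2 is excluded even though an inactive employment exists, because the status filter is applied only to the latest row per employee. That ordering is the point of both SQL variants above.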

select all records holding some condition in has_many association - Ruby On Rails

You should do this to get all profile_ids which have both accounting and administration skills :

Skill.where(name: ["accounting", "administration"]).group(:profile_id).having("count('id') = 2").pluck(:profile_id)

If you need the profile details, you can use this query in the where clause of Profile, matching on id.

Note the number 2 in the query: it is the length of the array used in the where clause, in this case ["accounting", "administration"].length.

UPDATE::

Based on the updated question description: instead of pluck, you can use select and pass the result as a subquery, so that everything happens in one query.

Profile.where(id: Skill.where(name: ["accounting", "administration"]).group(:profile_id).having("count('id') = 2").select(:profile_id))

Moreover, you keep control over sorting, pagination and additional where clauses. I don't see any of the concerns mentioned in the question edit applying here.

UPDATE 2::

Another way to get intersect of profiles with both the skills (likely to be less efficient than above solution):

profiles = Profile

["accounting", "administration"].each do |name|
profiles = profiles.where(id: Skill.where(name: name).select(:profile_id))
end
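The "group, then require a count equal to the number of wanted skills" idea can be checked on plain data. The sketch below uses a hypothetical SkillRow Struct in place of the Skill model. One deliberate difference is that it counts distinct names per profile, whereas the having clause above counts rows, so the SQL version relies on a profile not having the same skill stored twice.

```ruby
# Stand-in for a skills row (hypothetical Struct).
SkillRow = Struct.new(:profile_id, :name)

# Returns the profile ids that have every skill in `required`.
def profiles_with_all_skills(skills, required)
  skills
    .select { |s| required.include?(s.name) }   # WHERE name IN (...)
    .group_by(&:profile_id)                     # GROUP BY profile_id
    .select { |_id, group| group.map(&:name).uniq.size == required.size }
    .keys
end

SAMPLE_SKILLS = [
  SkillRow.new(1, "accounting"),
  SkillRow.new(1, "administration"),
  SkillRow.new(2, "accounting"),
  SkillRow.new(3, "administration"),
  SkillRow.new(3, "marketing"),
  SkillRow.new(4, "accounting"),
  SkillRow.new(4, "accounting")   # duplicate row: still only one distinct skill
].freeze

p profiles_with_all_skills(SAMPLE_SKILLS, ["accounting", "administration"])
```

Deriving the required count from the array, rather than hard-coding 2, keeps the comparison correct when the list of skills changes.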

Query with LEFT JOIN not returning rows for count of 0

Fix the LEFT JOIN

This should work:

SELECT o.name AS organisation_name, count(e.id) AS total_used
FROM organisations o
LEFT JOIN exam_items e ON e.organisation_id = o.id
AND e.item_template_id = #{sanitize(item_template_id)}
AND e.used
GROUP BY o.name
ORDER BY o.name;

You had a LEFT [OUTER] JOIN but the later WHERE conditions made it act like a plain [INNER] JOIN.

Move the condition(s) to the JOIN clause to make it work as intended. This way, only rows that fulfill all these conditions are joined in the first place (or columns from the right table are filled with NULL). Like you had it, joined rows are tested for additional conditions virtually after the LEFT JOIN and removed if they don't pass, just like with a plain JOIN.
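The difference between filtering in the JOIN clause and filtering in WHERE can be simulated on in-memory rows. The following is an illustrative sketch only: OrgRow and ItemRow are hypothetical Structs, and the two methods imitate the two query shapes rather than run any SQL.

```ruby
OrgRow  = Struct.new(:id, :name)
ItemRow = Struct.new(:id, :organisation_id, :item_template_id, :used)

# Filter inside the join condition: every organisation is kept,
# those without matching items get a count of 0.
def counts_filter_in_join(orgs, items, template_id)
  orgs.map do |o|
    matched = items.select do |e|
      e.organisation_id == o.id && e.item_template_id == template_id && e.used
    end
    [o.name, matched.size]
  end.to_h
end

# Filter after the join (like WHERE): NULL-padded rows fail the test,
# so organisations without matching items vanish from the result.
def counts_filter_in_where(orgs, items, template_id)
  joined = orgs.flat_map do |o|
    rows = items.select { |e| e.organisation_id == o.id }
    rows.empty? ? [[o, nil]] : rows.map { |e| [o, e] }
  end
  kept = joined.select { |_o, e| e && e.item_template_id == template_id && e.used }
  kept.group_by { |o, _e| o.name }.transform_values(&:size)
end

JOIN_ORGS  = [OrgRow.new(1, "A"), OrgRow.new(2, "B")].freeze
JOIN_ITEMS = [
  ItemRow.new(1, 1, 7, true),
  ItemRow.new(2, 1, 7, false),
  ItemRow.new(3, 2, 8, true)
].freeze

p counts_filter_in_join(JOIN_ORGS, JOIN_ITEMS, 7)
p counts_filter_in_where(JOIN_ORGS, JOIN_ITEMS, 7)
```

Organisation "B" appears with 0 in the first result and is missing from the second, which is the behaviour described above.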

count() never returns NULL to begin with. It's an exception among aggregate functions in this respect. Therefore, COALESCE(COUNT(col)) never makes sense, even with additional parameters. The manual:

It should be noted that except for count, these functions return a null value when no rows are selected.

Bold emphasis mine. See:

  • Count the number of attributes that are NULL for a row

count() must be on a column defined NOT NULL (like e.id), or where the join condition guarantees NOT NULL (e.organisation_id, e.item_template_id, or e.used) in the example.

Since used is type boolean, the expression e.used = true is noise that boils down to just e.used.

Since o.name is not defined UNIQUE NOT NULL, you may want to GROUP BY o.id instead (id being the PK) - unless you intend to fold rows with the same name (including NULL).

Aggregate first, join later

If most or all rows of exam_items are counted in the process, this equivalent query is typically considerably faster / cheaper:

SELECT o.id, o.name AS organisation_name, e.total_used
FROM organisations o
LEFT JOIN (
SELECT organisation_id AS id -- alias to simplify join syntax
, count(*) AS total_used -- count(*) = fastest to count all
FROM exam_items
WHERE item_template_id = #{sanitize(item_template_id)}
AND used
GROUP BY 1
) e USING (id)
ORDER BY o.name, o.id;

(This is assuming that you don't want to fold rows with the same name like mentioned above - the typical case.)

Now we can use the faster / simpler count(*) in the subquery, and we need no GROUP BY in the outer SELECT.
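The "aggregate first, join later" shape amounts to building a lookup table of counts once and then attaching it to each organisation. Here is a rough plain-Ruby sketch under the same assumptions as the query; Organisation and ExamItem are hypothetical Structs here, not the real models. As in the SQL, an organisation without matching items gets nil, since the subquery produces no row for it.

```ruby
Organisation = Struct.new(:id, :name)
ExamItem     = Struct.new(:id, :organisation_id, :item_template_id, :used)

def counts_aggregate_first(orgs, items, template_id)
  # Step 1: aggregate exam items on their own (the subquery).
  totals = {}
  items.each do |e|
    next unless e.item_template_id == template_id && e.used
    totals[e.organisation_id] = (totals[e.organisation_id] || 0) + 1
  end
  # Step 2: left-join the totals to the organisations, ordered by name, id.
  orgs.sort_by { |o| [o.name, o.id] }
      .map { |o| [o.id, o.name, totals[o.id]] }
end

AGG_ORGS  = [Organisation.new(2, "B"), Organisation.new(1, "A")].freeze
AGG_ITEMS = [
  ExamItem.new(1, 1, 7, true),
  ExamItem.new(2, 1, 7, true),
  ExamItem.new(3, 2, 8, true)
].freeze

counts_aggregate_first(AGG_ORGS, AGG_ITEMS, 7).each { |row| p row }
```

If a 0 is preferred over nil, the equivalent in SQL would be wrapping e.total_used in COALESCE(..., 0) in the outer SELECT.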

See:

  • Multiple array_agg() calls in a single query

Rails 4 JOIN GROUP BY and SELECT

Would the following work for you?

User.joins(:orders)
.select("users.*, max(orders.created_at) as most_recent, count(orders.id) as orders_count")
.group('users.id')

Taking the max of orders.created_at should give you the date of the most recent order. I don't think you want to have your select as part of has_many orders, since you're looking for a list of users, not a list of orders. If you'd like a method that returns this Active Record query, assuming you'll use it more than once, you can add the following to your User model.

def self.with_order_info
  joins(:orders)
    .select("users.*, max(orders.created_at) as most_recent, count(orders.id) as orders_count")
    .group('users.id')
end

And then, you can call that method anywhere using:

@users = User.with_order_info

As a further note (to be 100% clear), you should keep your association to orders as:

has_many :orders
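For reference, the two aggregates that query attaches to each user can be reproduced on plain data. This is only a sketch: OrderRow is a hypothetical Struct standing in for orders rows, and the dates are ISO strings so that max picks the latest one.

```ruby
# Stand-in for an orders row (hypothetical Struct).
OrderRow = Struct.new(:id, :user_id, :created_at)

# For each user id, returns the most recent order date and the order count.
def order_info_by_user(orders)
  orders.group_by(&:user_id).transform_values do |group|
    { most_recent: group.map(&:created_at).max, orders_count: group.size }
  end
end

SAMPLE_ORDERS = [
  OrderRow.new(1, 10, "2015-01-03"),
  OrderRow.new(2, 10, "2015-02-14"),
  OrderRow.new(3, 11, "2014-12-25")
].freeze

p order_info_by_user(SAMPLE_ORDERS)
```

As with joins(:orders), which produces an inner join, users without any orders do not appear in the result at all.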

SELECT list is not in GROUP BY clause and contains nonaggregated column .... incompatible with sql_mode=only_full_group_by

The error

Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'returntr_prod.tbl_customer_pod_uploads.id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

can be resolved by changing the SQL mode in MySQL with this command:

SET GLOBAL sql_mode=(SELECT REPLACE(@@sql_mode,'ONLY_FULL_GROUP_BY',''));

This worked for me. I used it because my project has many queries like this, so I simply removed ONLY_FULL_GROUP_BY from the SQL mode.

Alternatively, list every non-aggregated column from the SELECT list in the GROUP BY clause. In that case ONLY_FULL_GROUP_BY can be left enabled.


Rails: select unique values from a column

Model.select(:rating)

The result of this is a collection of Model objects, not plain ratings. From uniq's point of view, they are all different objects. You can use this:

Model.select(:rating).map(&:rating).uniq

or this (most efficient):

Model.uniq.pluck(:rating)

Rails 5+

Model.distinct.pluck(:rating)

Update

Apparently, as of rails 5.0.0.1, it works only on "top level" queries, like above. Doesn't work on collection proxies ("has_many" relations, for example).

Address.distinct.pluck(:city) # => ['Moscow']
user.addresses.distinct.pluck(:city) # => ['Moscow', 'Moscow', 'Moscow']

In this case, deduplicate after the query

user.addresses.pluck(:city).uniq # => ['Moscow']
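The reason uniq has to run on the plain values rather than on the loaded objects can be shown without Rails. In the sketch below, RatingRecord is a hypothetical plain class whose instances, like records loaded without an id, are only equal to themselves.

```ruby
# Hypothetical stand-in for a record loaded with only its rating column.
class RatingRecord
  attr_reader :rating

  def initialize(rating)
    @rating = rating
  end
end

# Extracts the plain values first, then deduplicates them.
def unique_ratings(records)
  records.map(&:rating).uniq
end

records = [RatingRecord.new(5), RatingRecord.new(5), RatingRecord.new(3)]

p records.uniq.size        # objects are all distinct, nothing is removed
p unique_ratings(records)  # plain values are deduplicated
```

Doing the deduplication in the database with distinct is still preferable where it works, since fewer rows are transferred.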

