Rails group a table by :created_at, returning a count of the :status column, and then sub-group the :status column with count of each unique value
I'm adding a second answer with a different approach that I believe to be much better in that it is efficient and can be translated into a DB view.
Any time I end up with lots of repeated hits on the DB or large, complex queries that don't translate well, I look to use pure SQL, as that can then be used as a view in the DB. I asked this question because my SQL is poor. I think this can be adapted to your needs, especially if the "status" field is a known set of possible values. Here's how I would try it initially:
Construct a SQL query that works. You can test this in psql.
SELECT created_at, count(status) AS total,
sum(case when status = 'error' then 1 end) AS errors,
sum(case when status = 'pending' then 1 end) AS pending,
sum(case when status = 'sent' then 1 end) AS sent
FROM notifications
GROUP BY created_at;
This should return a pivot table like:
| created_at |total|errors|pending|sent|
----------------------------------------------
| Mon, 05 Oct 2015 |2572 |500 |12 |null|
| Tue, 06 Oct 2015 |555 |null |12 |50 |
Great, any single table is an easy query in Rails that will load it up as an array of objects. Each of those objects will have a method that corresponds to each column. If the column is null for that row, Rails will pass nil as the value.
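To make the pivot concrete, here is a small plain-Ruby sketch of what the conditional aggregation computes, run against made-up in-memory rows instead of the database. The ROWS data and the pivot_by_day name are assumptions for illustration only; in the app the SQL does this work.

```ruby
# Hypothetical sample rows standing in for the notifications table.
ROWS = [
  { created_at: "2015-10-05", status: "error" },
  { created_at: "2015-10-05", status: "pending" },
  { created_at: "2015-10-05", status: "error" },
  { created_at: "2015-10-06", status: "sent" },
  { created_at: "2015-10-06", status: nil }
].freeze

# Mirrors: count(status) AS total, sum(case when status = X then 1 end) AS ...
def pivot_by_day(rows)
  rows.group_by { |r| r[:created_at] }.map do |day, group|
    statuses = group.map { |r| r[:status] }
    # SUM over zero matching rows is NULL in SQL, so return nil rather than 0.
    tally = ->(s) { n = statuses.count(s); n.zero? ? nil : n }
    {
      created_at: day,
      total: statuses.compact.size, # count(status) ignores NULLs
      errors: tally.call("error"),
      pending: tally.call("pending"),
      sent: tally.call("sent")
    }
  end
end

pivot_by_day(ROWS).each { |row| puts row }
```

Each resulting hash has the same keys as a row of the pivot table, with nil wherever a status never occurs on that day.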
Test it in Rails
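# find_by_sql ignores a preceding where(...), so the user filter goes into the SQL itself.
# Assumes users is an array of user ids.
@stats = Notification.find_by_sql([<<~SQL, users])
  SELECT created_at, count(status) AS total,
  sum(case when status = 'error' then 1 end) AS errors,
  sum(case when status = 'pending' then 1 end) AS pending,
  sum(case when status = 'sent' then 1 end) AS sent
  FROM notifications
  WHERE user_id IN (?)
  GROUP BY created_at
SQL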
This returns an array of Notification objects:
=> [#<Notification id: nil, created_at: "2014-02-07 22:36:30">,
#<Notification id: nil, created_at: "2014-06-26 02:07:51">,
#<Notification id: nil, created_at: "2015-04-26 21:37:09">,
#<Notification id: nil, created_at: "2014-02-07 22:48:29">,
#<Notification id: nil, created_at: "2014-11-04 23:39:07">,
#<Notification id: nil, created_at: "2015-01-27 17:46:50">,...]
Note that the Notification id: is nil. That's because these objects do not represent actual records in the DB, but rows in the table produced by your query. Now you can do something like:
@stats.each do |daily_stats|
puts daily_stats.attributes
end
#{"created_at" => "Mon, 05 Oct 2015", "total" => 2572, "errors" => 500, "pending" => 12, "sent" => nil}
#{"created_at" => "Tue, 06 Oct 2015", "total" => 555, "errors" => nil, "pending" => 12, "sent" => 50}
and so on. Your @stats variable can be passed to a view and printed as a table in an .html.erb file. You can access the attributes of any Notification object in the array like:
@stats[0].created_at
#=> "Mon, 05 Oct 2015"
@stats[1].pending
#=> 12
The overall point is you have used one query to get your entire dataset.
Store it as a view
Log into the SQL console on your DB and do
CREATE VIEW daily_stats AS
SELECT user_id, created_at, count(status) AS total,
sum(case when status = 'error' then 1 end) AS errors,
sum(case when status = 'pending' then 1 end) AS pending,
sum(case when status = 'sent' then 1 end) AS sent
FROM notifications
GROUP BY user_id, created_at;
Now you can get the results with
SELECT * FROM daily_stats;
Note that I have purposefully not limited this by user as you did in your original question, and I have added user_id to the SELECT. We are working in the DB directly, and it should easily handle generating a table from this view with ALL users' stats for every date. This is a very powerful dataset for what you are doing. Now you can set up a dummy model in Rails and have all of your data available without contorted Rails queries.
Add a dummy model
app/models/daily_stat.rb:
class DailyStat < ActiveRecord::Base
belongs_to :user
# This is a model for a view in the DB called daily_stats.
# The class name is singular and will automatically look for the table "daily_stats", which is snake_case and plural.
end
Add the corresponding relation to your User model:
class User < ActiveRecord::Base
has_many :daily_stats
end
Now you have access to your stats by user in a very Rails-ish way.
users = [2]
DailyStat.where(user: users)
=> DailyStat Load (2.8ms) SELECT "daily_stats".* FROM "daily_stats" WHERE "daily_stats"."user_id" = 2
=> [ #<DailyStat user_id: 2, created_at: "2014-02-14 00:30:24", total: 300, errors: 23, pending: nil, sent: 3>,
#<DailyStat user_id: 2, created_at: "2014-11-29 00:18:28", total: 2454, errors: 3, pending: 45, sent: 323>,
#<DailyStat user_id: 2, created_at: "2014-02-07 22:46:59", total: 589, errors: 33, pending: 240, sent: 68>...]
and in the other direction:
user = User.first
user.daily_stats
# returns an array of that user's DailyStat objects.
The key is to "solve things at the lowest level". Solve a data query problem in the database, then use Rails to manipulate and present it.
SQL query to return a grouped result as a single row
The following should work in any RDBMS:
SELECT created_at, count(status) AS total,
sum(case when status = 'error' then 1 end) as errors,
sum(case when status = 'complete' then 1 end) as completed,
sum(case when status = 'on hold' then 1 end) as on_hold
FROM jobs
GROUP BY created_at;
The query uses conditional aggregation to pivot grouped data. It assumes that status values are known beforehand. If you have additional status values, just add the corresponding sum(case ... expression.
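Since every extra status only adds one more sum(case ...) expression, the query can also be generated from a list of known values. The following is a minimal sketch, not part of the original answer: the pivot_sql helper, its alias rule, and the quote escaping are assumptions for illustration, and both arguments should come from trusted code, never from user input.

```ruby
# Builds the pivot query for a known list of status values.
# The table name is interpolated as-is, so it must be a trusted identifier.
def pivot_sql(table, statuses)
  columns = statuses.map do |status|
    literal = status.gsub("'", "''") # escape single quotes for a SQL string literal
    column_alias = status.downcase.gsub(/[^a-z0-9]+/, "_") # "on hold" -> "on_hold"
    "sum(case when status = '#{literal}' then 1 end) AS #{column_alias}"
  end
  <<~SQL
    SELECT created_at, count(status) AS total,
    #{columns.join(",\n")}
    FROM #{table}
    GROUP BY created_at;
  SQL
end

puts pivot_sql("jobs", ["error", "complete", "on hold"])
```

The aliases here are derived mechanically from the status values, so they come out as error and complete instead of the hand-picked errors and completed used above.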
group by in active records
I finally solved this problem, as follows.
In the controller =>
@bill_details = @bill_details.group_by{|bd| bd.item_with_detail_id}.values
it will give the following result
#[[#<BillDetail id: 100, item_with_detail_id: 205, quantity: 3>, #<BillDetail id: 101, item_with_detail_id: 205, quantity: 5>]]
And in the view page:
<% for bd in @bill_details%>
<%quantity=0%>
<%for bill_detail in bd%>
<%quantity+=bill_detail.quantity%>
<%end%>
<%bd=bd.first%>
<%end%>
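The summing done in the view loop can also be done in plain Ruby before rendering. This is only a sketch: BillDetail is a Struct standing in for the ActiveRecord model, and the sample rows and the quantity_by_item name are made up for illustration.

```ruby
# A plain Struct standing in for the BillDetail model.
BillDetail = Struct.new(:id, :item_with_detail_id, :quantity)

BILL_DETAILS = [
  BillDetail.new(100, 205, 3),
  BillDetail.new(101, 205, 5),
  BillDetail.new(102, 310, 2)
].freeze

# Returns { item_with_detail_id => total quantity }.
def quantity_by_item(details)
  details.group_by(&:item_with_detail_id)
         .transform_values { |group| group.sum(&:quantity) }
end

quantity_by_item(BILL_DETAILS).each do |item_id, quantity|
  puts "item #{item_id}: #{quantity}"
end
```

With the totals computed up front, the view only has to iterate over the resulting hash.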
How do I build a query in Ruby on Rails that joins on the max of a has_many relation only and includes a select filter on that relation?
The simplest solution (in terms of code complexity) I can think of is to first fetch the employee ids with their maximum created_at values, then compose a new query from the result.
attributes = %i[employee_id created_at]
employments = Employment.group(:employee_id).maximum(:created_at)
.map { |values| Employment.where(attributes.zip(values).to_h) }
.reduce(Employment.none, :or)
.where(status: :inactive)
employees = Employee.where(id: employments.select(:employee_id))
This should produce the following SQL:
SELECT employments.employee_id, MAX(employments.created_at)
FROM employments
GROUP BY employments.employee_id
With that result, the following query is built:
SELECT employees.*
FROM employees
WHERE employees.id IN (
SELECT employments.employee_id
FROM employments
WHERE (
employments.employee_id = ? AND employments.created_at = ?
OR employments.employee_id = ? AND employments.created_at = ?
OR employments.employee_id = ? AND employments.created_at = ?
-- ...
) AND employments.status = 'inactive'
)
The above method doesn't hold up well for large amounts of records, since the query grows for each additional employee. It becomes a lot easier when we can assume the higher id is made last. In that scenario the following would do the trick:
employment_ids = Employment.select(Employment.arel_table[:id].maximum).group(:employee_id)
employee_ids = Employment.select(:employee_id).where(id: employment_ids, status: :inactive)
employees = Employee.where(id: employee_ids)
This should produce a single query when employees is loaded.
SELECT employees.*
FROM employees
WHERE employees.id IN (
SELECT employments.employee_id
FROM employments
WHERE employments.id IN (
SELECT MAX(employments.id)
FROM employments
GROUP BY employments.employee_id
) AND employments.status = 'inactive'
)
This solution works a lot better with larger datasets but you might want to look into the answer of max for better lookup performance.
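The order of the two steps matters: the latest employment is picked first, and only then is the status checked. A minimal in-memory sketch of that logic follows; the Employment Struct, the sample rows, and the method name are assumptions for illustration, and it relies on the same "higher id is newer" assumption as the query above.

```ruby
# A plain Struct standing in for the Employment model.
Employment = Struct.new(:id, :employee_id, :status)

EMPLOYMENTS = [
  Employment.new(1, 10, "active"),
  Employment.new(2, 10, "inactive"), # latest for employee 10
  Employment.new(3, 20, "inactive"),
  Employment.new(4, 20, "active"),   # latest for employee 20
  Employment.new(5, 30, "inactive")  # only row for employee 30
].freeze

# Employee ids whose most recent employment (highest id) is inactive.
def employees_with_latest_inactive(employments)
  employments.group_by(&:employee_id)
             .transform_values { |rows| rows.max_by(&:id) }
             .select { |_employee_id, latest| latest.status == "inactive" }
             .keys
end

p employees_with_latest_inactive(EMPLOYMENTS)
```

Filtering on status before taking the maximum would wrongly include employee 20, whose older employment is inactive but whose latest one is active.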
select all records holding some condition in has_many association - Ruby On Rails
You should do this to get all profile_ids which have both accounting and administration skills:
Skill.where(name: ["accounting", "administration"]).group(:profile_id).having("count('id') = 2").pluck(:profile_id)
If you need profile details, you can put this query in the where clause of Profile for id.
Note the number 2 in the query: it is the length of the array used in the where clause, in this case ["accounting", "administration"].length.
UPDATE::
Based on the updated question description, instead of pluck you can use select and add a subquery to make sure it happens in one query.
Profile.where(id: Skill.where(name: ["accounting", "administration"]).group(:profile_id).having("count('id') = 2").select(:profile_id))
Moreover, you keep control over sorting, pagination and additional where clauses. I don't see any of the concerns mentioned in the question edit applying here.
UPDATE 2::
Another way to get intersect of profiles with both the skills (likely to be less efficient than above solution):
profiles = Profile
["accounting", "administration"].each do |name|
profiles = profiles.where(id: Skill.where(name: name).select(:profile_id))
end
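The group/having approach boils down to "keep the profiles whose matching skills cover the whole list". Here is a plain-Ruby sketch of that check on in-memory data; the Skill Struct, the sample rows, and the method name are made up for illustration.

```ruby
# A plain Struct standing in for the Skill model.
Skill = Struct.new(:profile_id, :name)

SKILLS = [
  Skill.new(1, "accounting"),
  Skill.new(1, "administration"),
  Skill.new(2, "accounting"),
  Skill.new(3, "administration"),
  Skill.new(3, "sales")
].freeze

# Profile ids that have every skill named in `wanted`.
def profiles_with_all_skills(skills, wanted)
  skills.select { |s| wanted.include?(s.name) }
        .group_by(&:profile_id)
        .select { |_id, rows| rows.map(&:name).uniq.size == wanted.size }
        .keys
end

p profiles_with_all_skills(SKILLS, ["accounting", "administration"])
```

The uniq call guards against duplicate skill rows. The SQL version with a plain count assumes that a profile cannot have the same skill name twice.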
Query with LEFT JOIN not returning rows for count of 0
Fix the LEFT JOIN
This should work:
SELECT o.name AS organisation_name, count(e.id) AS total_used
FROM organisations o
LEFT JOIN exam_items e ON e.organisation_id = o.id
AND e.item_template_id = #{sanitize(item_template_id)}
AND e.used
GROUP BY o.name
ORDER BY o.name;
You had a LEFT [OUTER] JOIN but the later WHERE conditions made it act like a plain [INNER] JOIN.
Move the condition(s) to the JOIN clause to make it work as intended. This way, only rows that fulfill all these conditions are joined in the first place (or columns from the right table are filled with NULL). Like you had it, joined rows are tested for additional conditions virtually after the LEFT JOIN and removed if they don't pass, just like with a plain JOIN.
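The difference between the two placements can be sketched with a toy left join over in-memory hashes. This is only an illustration in plain Ruby, not how the database executes the query; the data and all method names are assumptions.

```ruby
ORGANISATIONS = [{ id: 1, name: "Alpha" }, { id: 2, name: "Beta" }].freeze
EXAM_ITEMS = [
  { id: 1, organisation_id: 1, used: true },
  { id: 2, organisation_id: 1, used: true },
  { id: 3, organisation_id: 2, used: false }
].freeze

# LEFT JOIN: every organisation appears at least once; unmatched ones pair with nil.
def left_join(orgs, items)
  orgs.flat_map do |org|
    matches = items.select { |item| yield(org, item) }
    matches.empty? ? [[org, nil]] : matches.map { |item| [org, item] }
  end
end

# count(e.id) per organisation name: nil items are not counted.
def count_by_name(pairs)
  pairs.group_by { |org, _item| org[:name] }
       .transform_values { |rows| rows.count { |_org, item| item } }
end

# Condition inside the join (ON ... AND e.used).
def counts_with_condition_in_join
  pairs = left_join(ORGANISATIONS, EXAM_ITEMS) do |org, item|
    item[:organisation_id] == org[:id] && item[:used]
  end
  count_by_name(pairs)
end

# Condition applied after the join (WHERE e.used).
def counts_with_condition_in_where
  pairs = left_join(ORGANISATIONS, EXAM_ITEMS) do |org, item|
    item[:organisation_id] == org[:id]
  end
  count_by_name(pairs.select { |_org, item| item && item[:used] })
end

p counts_with_condition_in_join
p counts_with_condition_in_where
```

With the condition inside the join, Beta survives with a count of 0. With the condition applied afterwards, Beta's only joined row is filtered out and the organisation disappears from the result.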
count() never returns NULL to begin with. It's an exception among aggregate functions in this respect. Therefore, COALESCE(COUNT(col)) never makes sense, even with additional parameters. The manual:
It should be noted that except for count, these functions return a null value when no rows are selected.
Bold emphasis mine. See:
- Count the number of attributes that are NULL for a row
count() must be on a column defined NOT NULL (like e.id), or where the join condition guarantees NOT NULL (e.organisation_id, e.item_template_id, or e.used) in the example.
Since used is type boolean, the expression e.used = true is noise that burns down to just e.used.
Since o.name is not defined UNIQUE NOT NULL, you may want to GROUP BY o.id instead (id being the PK) - unless you intend to fold rows with the same name (including NULL).
Aggregate first, join later
If most or all rows of exam_items are counted in the process, this equivalent query is typically considerably faster / cheaper:
SELECT o.id, o.name AS organisation_name, e.total_used
FROM organisations o
LEFT JOIN (
SELECT organisation_id AS id -- alias to simplify join syntax
, count(*) AS total_used -- count(*) = fastest to count all
FROM exam_items
WHERE item_template_id = #{sanitize(item_template_id)}
AND used
GROUP BY 1
) e USING (id)
ORDER BY o.name, o.id;
(This is assuming that you don't want to fold rows with the same name like mentioned above - the typical case.)
Now we can use the faster / simpler count(*) in the subquery, and we need no GROUP BY in the outer SELECT.
See:
- Multiple array_agg() calls in a single query
Rails 4 JOIN GROUP BY and SELECT
Would the following work for you?
User.joins(:orders)
.select("users.*, max(orders.created_at) as most_recent, count(orders.id) as orders_count")
.group('users.id')
Taking the max of orders.created_at should give you the date of the most recent order. I don't think you want to have your select as part of has_many orders, since you're looking for a list of users, not a list of orders. If you'd like a method that returns this Active Record query, assuming you'll use it more than once, you can add the following to your User model.
def self.with_order_info
self.joins(:orders)
.select("users.*, max(orders.created_at) as most_recent, count(orders.id) as orders_count")
.group('users.id')
end
And then, you can call that method anywhere using:
@users = User.with_order_info
As a further note (to be 100% clear), you should keep your association to orders as:
has_many :orders
SELECT list is not in GROUP BY clause and contains nonaggregated column .... incompatible with sql_mode=only_full_group_by
This error
Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'returntr_prod.tbl_customer_pod_uploads.id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
can be resolved by changing the SQL mode in MySQL with this command:
SET GLOBAL sql_mode=(SELECT REPLACE(@@sql_mode,'ONLY_FULL_GROUP_BY',''));
This worked for me. I used it because my project has many queries like this, so I simply removed ONLY_FULL_GROUP_BY from the SQL mode.
Alternatively, include every non-aggregated column from the SELECT list in the GROUP BY clause. In that case ONLY_FULL_GROUP_BY can be left enabled.
Rails: select unique values from a column
Model.select(:rating)
The result of this is a collection of Model objects, not plain ratings. And from uniq's point of view, they are completely different. You can use this:
Model.select(:rating).map(&:rating).uniq
or this (most efficient):
Model.uniq.pluck(:rating)
Rails 5+
Model.distinct.pluck(:rating)
Update
Apparently, as of rails 5.0.0.1, it works only on "top level" queries, like above. Doesn't work on collection proxies ("has_many" relations, for example).
Address.distinct.pluck(:city) # => ['Moscow']
user.addresses.distinct.pluck(:city) # => ['Moscow', 'Moscow', 'Moscow']
In this case, deduplicate after the query:
user.addresses.pluck(:city).uniq # => ['Moscow']
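The reason uniq behaves differently on objects and on plain values can be shown without a database. The sketch below is only an analogy: Review is a bare Ruby class made up for illustration, assumed to behave like model instances loaded without an id, where two instances are never equal even when their attributes match.

```ruby
# A bare class standing in for a model; instances are compared by identity.
class Review
  attr_reader :rating

  def initialize(rating)
    @rating = rating
  end
end

REVIEWS = [Review.new(5), Review.new(3), Review.new(5)].freeze

puts REVIEWS.uniq.size                  # 3: every object is distinct
puts REVIEWS.map(&:rating).uniq.inspect # [5, 3]: values are compared by content
```

Extracting the values first, as pluck does, is what makes deduplication meaningful.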