Return a grouped list with occurrences using Rails and PostgreSQL
Your problem:
Unfortunately the strictness of Postgres breaks that query because it requires all fields to be specified in the group by clause.
Now, that has changed somewhat with PostgreSQL 9.1 (quoting release notes of 9.1):
Allow non-GROUP BY columns in the query target list when the primary key is specified in the GROUP BY clause (Peter Eisentraut)
What's more, the basic query you describe would not even run into this:
Show a list of the 5 most commonly used tags, together
with the times they have been tagged.
SELECT tag_id, count(*) AS times
FROM taggings
GROUP BY tag_id
ORDER BY times DESC
LIMIT 5;
Works in any case.
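A minimal sketch for trying the query without a Rails app. It assumes an in-memory SQLite database as a stand-in for PostgreSQL, a cut-down taggings table, and made-up sample rows; only the SELECT itself comes from the answer above.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL (assumption); the query text is unchanged.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE taggings (id INTEGER PRIMARY KEY, tag_id INTEGER NOT NULL)")

# Hypothetical sample data: tag 1 is used 6 times, tag 2 five times, ... tag 6 once.
usage = [(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)]
sample = [(tag_id,) for tag_id, n in usage for _ in range(n)]
conn.executemany("INSERT INTO taggings (tag_id) VALUES (?)", sample)

# Only tag_id is selected and grouped, so no non-grouped column is involved.
top_tags = conn.execute("""
    SELECT tag_id, count(*) AS times
    FROM taggings
    GROUP BY tag_id
    ORDER BY times DESC
    LIMIT 5
""").fetchall()

print(top_tags)
```

The sample counts are all different on purpose: with ties, which of the tied tags make it under the LIMIT is not defined unless you add a second ORDER BY column.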
Summary Count by Group with ActiveRecord
I gave up on trying to do this the ActiveRecord way. Instead I built the query as a string and passed it to
ActiveRecord::Base.connection.execute(sql_string)
This had the side effect that my result set came out as an array instead of a set of objects. So getting at the values changed from a syntax (where user_data is the name assigned to a single record from the result set) like
user_data.total_count
to
user_data['total_count']
But that's a minor issue. Not worth the hassle.
Limit in Group - ActiveRecord Postgres
For anyone experiencing a similar issue, I would recommend checking out window functions and this blog post covering different ways to solve a similar question. The three approaches covered in the post include using 1) group_by, 2) SQL subselects, 3) window functions.
My solution, using window functions:
@events.where("(events.id)
  IN (
    SELECT id FROM
      ( SELECT DISTINCT id,
          row_number() OVER (PARTITION BY DATE_TRUNC('day', start) ORDER BY id) AS rank
        FROM events) AS result
    WHERE (
      start >= '#{startt}' and
      start <= '#{endt}' and
      rank <= 3
    )
  )
")
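The same "at most three rows per day" idea can be tried in isolation. This is a sketch under assumptions, not the code above: it uses an in-memory SQLite database (version 3.25 or later, for window functions) instead of PostgreSQL, date(start) in place of DATE_TRUNC('day', start), a made-up events table, and bound parameters instead of string interpolation. It also applies the date filter before ranking, which is a deliberate difference from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, start TEXT NOT NULL)")

# Hypothetical data: five events on the first day, two on the second.
rows = [(i, "2024-01-01 0%d:00:00" % i) for i in range(1, 6)]
rows += [(6, "2024-01-02 09:00:00"), (7, "2024-01-02 10:00:00")]
conn.executemany("INSERT INTO events (id, start) VALUES (?, ?)", rows)

# Number the rows within each day, then keep only the first three per day.
sql = """
    SELECT id FROM (
        SELECT id,
               row_number() OVER (PARTITION BY date(start) ORDER BY id) AS rn
        FROM events
        WHERE start >= ? AND start <= ?
    ) AS ranked
    WHERE rn <= 3
    ORDER BY id
"""
params = ("2024-01-01 00:00:00", "2024-01-02 23:59:59")
kept_ids = [row[0] for row in conn.execute(sql, params)]

print(kept_ids)
```

Passing the bounds as parameters keeps user-supplied dates out of the SQL text; in Rails the equivalent would be placeholder arguments to where rather than '#{...}' interpolation.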
Find single occurrence of matched and non-matched records from has many through association
There is something odd in your model. The relationship between groups.name and the user_id that shows up in both groups and favourites is unclear. The unique constraint on favourites_groups should make the user_id in favourites unnecessary, so I added a commented-out join condition.
Please try this query to see if it returns what you need:
select g.id as group_id, g.name as group_name
from groups g
left join favourites_groups fg
on fg.group_id = g.id
left join favourites f
on f.id = fg.favorite_id
-- and f.user_id = g.user_id
where g.user_id = 100
and f.product_id = 1002
;
Update
Sorry about that. This should return what you want:
select g.id as group_id, g.name as group_name,
max(f.id) as favorite_id,
max(f.product_id) as product_id
from groups g
left join favourites_groups fg
on fg.group_id = g.id
left join favourites f
on f.id = fg.favorite_id
and f.product_id = 1000
where g.user_id = 100
group by g.id, g.name
order by g.id;
rails group order by count
You need to explicitly list in the SELECT clause the column(s) you GROUP BY.
All other parts of the SELECT clause must be aggregates like count(), sum(), etc.
Notice that we use count(distinct ..) here because each animal ID might appear multiple times due to the chain of JOINs:
SELECT
  interests.id,
  COUNT(DISTINCT animals.id) as animals_count
FROM animals
JOIN interests_animals ON animals.id = interests_animals.animal_id
JOIN interests ON interests_animals.interest_id = interests.id
JOIN interests_users ON interests.id = interests_users.interest_id
WHERE interests_users.user_id = XXX
GROUP BY 1
ORDER BY 2 desc;
-- in GROUP BY and ORDER BY, it is usually convenient to use just numbers -- "1" means "the 1st column of SELECT clause", etc.
Also, "INNER" is an optional keyword (simply "JOIN" and "INNER JOIN" are the same thing).
As a side note, you might find it useful to add this to your SELECT clause:
, array_agg(animals.id order by animals.id) as animal_ids
-- this will give you integer array of all animal IDs that relate to a particular interest, ordered.
PostgreSQL - GROUP BY clause
Postgres 9.1 or later, quoting the release notes of 9.1 ...
Allow non-GROUP BY columns in the query target list when the primary key is specified in the GROUP BY clause (Peter Eisentraut)
The SQL standard allows this behavior, and because of the primary key, the result is unambiguous.
Related:
- Return a grouped list with occurrences using Rails and PostgreSQL
The queries in the question and in @Michael's answer have the logic backwards. We want to count how many tags match per article, not how many articles have a certain tag. So we need to GROUP BY w_article.id, not by a_tags.id.
list all articles with that tag, and also how many of given tags they match
To fix this:
SELECT count(t.tag) AS ct, a.* -- any column from table a allowed ...
FROM a_tags t
JOIN w_articles2tag a2t ON a2t.tag = t.id
JOIN w_article a ON a.id = a2t.article
WHERE t.tag IN ('css', 'php')
GROUP BY a.id -- ... since PK is in GROUP BY
LIMIT 9;
Assuming id is the primary key of w_article.
However, this form will be faster while doing the same:
SELECT a.*, ct
FROM (
   SELECT a2t.article AS id, count(*) AS ct
   FROM a_tags t
   JOIN w_articles2tag a2t ON a2t.tag = t.id
   WHERE t.tag IN ('css', 'php') -- same filter as in the first query
   GROUP BY 1
   LIMIT 9 -- LIMIT early - cheaper
   ) sub
JOIN w_article a USING (id); -- attached alias to article in the sub
Closely related answer from just yesterday:
- Why does the following join increase the query time significantly?
SQL query to return a grouped result as a single row
The following should work in any RDBMS:
SELECT created_at, count(status) AS total,
sum(case when status = 'error' then 1 end) as errors,
sum(case when status = 'complete' then 1 end) as completed,
sum(case when status = 'on hold' then 1 end) as on_hold
FROM jobs
GROUP BY created_at;
The query uses conditional aggregation to pivot the grouped data. It assumes that the status values are known beforehand. If you have additional status values, just add the corresponding sum(case ... expression.
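To see what the pivot returns, here is a small sketch that runs the same query against made-up rows. It assumes an in-memory SQLite database in place of your RDBMS and a two-column jobs table; the sample data is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (created_at TEXT, status TEXT)")

# Hypothetical rows: three jobs on the first day, one on the second.
conn.executemany("INSERT INTO jobs VALUES (?, ?)", [
    ("2024-01-01", "error"),
    ("2024-01-01", "complete"),
    ("2024-01-01", "complete"),
    ("2024-01-02", "on hold"),
])

# One output row per created_at, one column per known status value.
summary = conn.execute("""
    SELECT created_at, count(status) AS total,
           sum(case when status = 'error' then 1 end) AS errors,
           sum(case when status = 'complete' then 1 end) AS completed,
           sum(case when status = 'on hold' then 1 end) AS on_hold
    FROM jobs
    GROUP BY created_at
    ORDER BY created_at
""").fetchall()

for row in summary:
    print(row)
```

A status that never occurs within a group comes back as NULL (None in Python) rather than 0, because the CASE has no ELSE branch. Writing then 1 else 0 end gives zeros instead, if that suits your report better.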
How to GROUP BY several days in PostgreSQL?
SELECT ts, COUNT(DISTINCT(user_id)) FROM
( SELECT current_date + s.ts FROM generate_series(-20,0,1) AS s(ts) )
AS series(ts)
LEFT JOIN messages
ON messages.created_at::date between ts - 1 and ts -- JOIN on a range
GROUP BY ts
ORDER BY ts