GroupingError: ERROR: column must appear in the GROUP BY clause or be used in an aggregate function
You are not allowed to select reviews.id (selected implicitly through the wildcard *) without adding it to the GROUP BY clause or applying an aggregate function like avg(). The solution is to do one of the following:
- Remove the wildcard * from your select
- Add the field reviews.id to your group clause
- Select reviews.id explicitly and apply an aggregate function to it (e.g. sum(reviews.id))
- Replace the wildcard * with the table-specific wildcard albums.*
The second and third options do not make much sense in your scenario, though.
Based on your comment, I added option four.
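To see why the rule exists, here is a minimal plain-Ruby sketch with no database involved. The rows and the rating column are invented sample data, not the asker's schema; they only mimic what a join of albums and reviews might return. Once the rows are grouped by album, reviews.id has several candidate values per group, so it either has to be aggregated or left out.

```ruby
# Hypothetical joined rows (albums JOIN reviews); all values are invented for illustration.
rows = [
  { album_id: 1, review_id: 10, rating: 4 },
  { album_id: 1, review_id: 11, rating: 2 },
  { album_id: 2, review_id: 12, rating: 5 }
]

# GROUP BY album_id: each group may hold several rows.
groups = rows.group_by { |row| row[:album_id] }

# An ungrouped column such as review_id has more than one candidate value per group,
# which is what PostgreSQL refuses to guess.
review_ids_per_album = groups.transform_values { |g| g.map { |row| row[:review_id] } }

# An aggregate (here: the average rating) collapses each group to a single value.
avg_rating_per_album = groups.transform_values do |g|
  g.sum { |row| row[:rating] }.to_f / g.size
end

p review_ids_per_album
p avg_rating_per_album
```

Album 1 ends up with two review ids but only one average, which is the difference between an ungrouped column and an aggregated one.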
PG::GroupingError: ERROR: column events.id must appear in the GROUP BY clause or be used in an aggregate function
Event
.order(:popularity)
.joins(:keywords)
.group('events.id') # <======
.where(keywords: { category: 'taxonomy' })
.group('keywords.name')
Postgres error [column must appear in the GROUP BY clause or be used in an aggregate function]
This is just the first of four errors that you will get. PostgreSQL stops checking your SQL once it hits an error, so a single reported error does not mean that there is only one. In fact you have the same problem with all of id, created_at, updated_at and version. As the error tells you, if you are using GROUP BY then all the columns in the SELECT statement must either be in the GROUP BY clause or have some sort of aggregate function applied to them. Assuming you do not want to add these to the GROUP BY (on the grounds that nothing will then presumably have a COUNT > 10), you either have to drop them from the SELECT or apply some aggregate function. In your case MAX might be suitable, but without knowing more I cannot really tell.
ERROR: column must appear in the GROUP BY clause or be used in an aggregate function when using two joins
Here is a version with the GROUP BY problem corrected:
SELECT
    A.name,
    A.unit,
    B.child,
    REGEXP_MATCHES(A.b_number, '([^.]*--[0-9]*).*') AS number,
    SUM(CAST(A.amount AS decimal)) AS sum_amount,
    COUNT(A.amount) AS cnt_amount
INTO result
FROM B
INNER JOIN A ON B.name = A.name AND B.parent = A.id
INNER JOIN C ON A.name = C.name AND B.child = C.id
GROUP BY
    A.name,
    A.unit,
    B.child,
    number;
Note that every column/alias which appears in the SELECT clause also appears in GROUP BY. Exceptions to this are columns which appear inside aggregate functions. In that case, it is OK for them to not appear in GROUP BY.
Postgres SQL: column must appear in the GROUP BY clause or be used in an aggregate function
As a general rule, any column not listed in the GROUP BY clause should show up aggregated in the SELECT list.
For example, s.name should show up as max(s.name) or min(s.name) since it's not present in the GROUP BY list. However, PostgreSQL implements functional dependency (a SQL Standard feature) for the GROUP BY clause, and detects that s.name is dependent on the s.id column (that is probably a PK); in short, there's a single possible value of s.name for each s.id. Therefore, there's no need in PostgreSQL to aggregate this column (you can, but it's not needed).
On the flip side, for lookupStudyType.description PostgreSQL cannot determine if it's functionally dependent on s.id or not. You'll need to aggregate it as max(lookupStudyType.description) or min(lookupStudyType.description), or any other aggregation expression.
As a side note, I have rarely seen functional dependency implemented in other databases. Isn't PostgreSQL awesome? (I'm not affiliated with PostgreSQL in any way).
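If "functionally dependent" sounds abstract, this small plain-Ruby sketch may help. The rows are invented, and the helper functionally_dependent? is a hypothetical name of my own. Note that PostgreSQL does not inspect the data like this; it infers the dependency from the primary key constraint. The sketch only shows what the property means.

```ruby
# Hypothetical joined rows: a study table (s) joined to a lookup table; values are invented.
rows = [
  { s_id: 1, s_name: "Alpha", description: "Pilot" },
  { s_id: 1, s_name: "Alpha", description: "Follow-up" },
  { s_id: 2, s_name: "Beta",  description: "Pilot" }
]

# A column is functionally dependent on the key if every key value
# maps to exactly one distinct value of that column.
def functionally_dependent?(rows, key, column)
  rows.group_by { |row| row[key] }
      .all? { |_, group| group.map { |row| row[column] }.uniq.size == 1 }
end

name_ok = functionally_dependent?(rows, :s_id, :s_name)       # one name per id
desc_ok = functionally_dependent?(rows, :s_id, :description)  # id 1 has two descriptions

puts "s_name depends on s_id: #{name_ok}"
puts "description depends on s_id: #{desc_ok}"
```

In this sample, s_name passes the check and description does not, which mirrors why only the latter needs an aggregate in the query.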
must appear in the GROUP BY clause or be used in an aggregate function
Yes, this is a common aggregation problem. Before SQL3 (1999), the selected fields had to appear in the GROUP BY clause[*].
To work around this issue, you must calculate the aggregate in a sub-query and then join it back to the original table to get the additional columns you'd need to show:
SELECT m.cname, m.wmname, t.mx
FROM (
    SELECT cname, MAX(avg) AS mx
    FROM makerar
    GROUP BY cname
) t JOIN makerar m ON m.cname = t.cname AND t.mx = m.avg
;
cname | wmname | mx
--------+--------+------------------------
canada | zoro | 2.0000000000000000
spain | usopp | 5.0000000000000000
But you may also use window functions, which look simpler:
SELECT cname, wmname, MAX(avg) OVER (PARTITION BY cname) AS mx
FROM makerar
;
The only thing with this method is that it will show all records (window functions do not group). But it will show the correct (i.e. maxed at cname level) MAX for the country in each row, so it's up to you:
cname | wmname | mx
--------+--------+------------------------
canada | zoro | 2.0000000000000000
spain | luffy | 5.0000000000000000
spain | usopp | 5.0000000000000000
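The difference between the two result sets above can be mimicked in plain Ruby. This is only a sketch of the semantics, not how PostgreSQL executes anything; the rows are shaped like the makerar table, and luffy's avg of 1.0 is an assumed value since it is not shown in the output.

```ruby
# Sample rows shaped like the makerar table; luffy's avg (1.0) is an assumed value.
makerar = [
  { cname: "canada", wmname: "zoro",  avg: 2.0 },
  { cname: "spain",  wmname: "luffy", avg: 1.0 },
  { cname: "spain",  wmname: "usopp", avg: 5.0 }
]

# MAX(avg) per cname, as the GROUP BY sub-query computes it: one value per group.
max_by_cname = makerar.group_by { |r| r[:cname] }
                      .transform_values { |g| g.map { |r| r[:avg] }.max }

# Window-function behaviour: every row is kept and annotated with its partition's max.
windowed = makerar.map { |r| r.slice(:cname, :wmname).merge(mx: max_by_cname[r[:cname]]) }

# Sub-query-plus-join behaviour: keep only the rows whose avg equals the group max.
joined = makerar.select { |r| r[:avg] == max_by_cname[r[:cname]] }
                .map { |r| r.slice(:cname, :wmname).merge(mx: r[:avg]) }

windowed.each { |r| puts r.values.join(" | ") }
puts "--"
joined.each { |r| puts r.values.join(" | ") }
```

The windowed list keeps all three rows, while the joined list keeps one row per country, matching the two outputs shown above.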
The solution, arguably less elegant, to show only the (cname, wmname) tuples matching the max value, is:
SELECT DISTINCT /* distinct here matters, because maybe there are various tuples for the same max value */
    m.cname, m.wmname, t.avg AS mx
FROM (
    SELECT cname, wmname, avg, ROW_NUMBER() OVER (PARTITION BY cname ORDER BY avg DESC) AS rn
    FROM makerar
) t JOIN makerar m ON m.cname = t.cname AND m.wmname = t.wmname AND t.rn = 1
;
cname | wmname | mx
--------+--------+------------------------
canada | zoro | 2.0000000000000000
spain | usopp | 5.0000000000000000
[*]: Interestingly enough, even though the spec sort of allows selecting non-grouped fields, major engines seem to not really like it. Oracle and SQL Server just don't allow this at all. MySQL used to allow it by default, but since 5.7 the ONLY_FULL_GROUP_BY SQL mode is on by default, so the administrator needs to disable it manually in the server configuration for this feature to be supported...
Rails PG GroupingError column must appear in the GROUP BY clause
I don't know why you want to select likes.id in the first place. I see that you basically want the like_count for each Idea; I don't see the point in selecting likes.id. Also, when you already have the ideas.id, I don't see why you would want to get the value of likes.likeable_id since they'll both be equal. :/
Anyway, the problem is that since you're grouping by likeable_id (basically ideas.id), you can't "select" likes.id, since the individual values would be "lost" by the grouping.
I suppose SQLite is lax about this. I imagine it wouldn't group things properly.
ANYWAY(2) =>
Let me propose a cleaner solution.
# model
class Idea < ActiveRecord::Base
  # to save you the effort of specifying the join-conditions
  has_many :likes, foreign_key: :likeable_id
end
# in your code elsewhere
ideas = \
  Idea.
  joins(:likes).
  group("ideas.id").
  select("COUNT(likes.id) AS like_count, ideas.id, ideas.title, ideas.intro").
  order("like_count DESC")
If you still want to get the IDs of likes for each item, then after the above, here's what you could do:
grouped_like_ids = \
  Like.
  select(:id, :likeable_id).
  each_with_object({}) do |like, hash|
    (hash[like.likeable_id] ||= []) << like.id
  end
ideas.each do |idea|
  # selected previously:
  idea.like_count
  idea.id
  idea.title
  idea.intro
  # from the hash
  like_ids = grouped_like_ids[idea.id] || []
end
Other readers: I'd be very interested in a "clean" one-query non-sub-query solution. Let me know in the comments if you leave a response. Thanks.
GroupingError: ERROR: column must appear in the GROUP BY clause or be used in an aggregate function
You cannot combine SELECT * with GROUP BY some_column in Postgres because that's a contradiction (unless it selects from a single table and some_column is its PK). All non-aggregated columns (used in the SELECT, HAVING or ORDER BY clause outside an aggregate function) must be in the GROUP BY list - where the primary key column(s) cover(s) all columns of a table. Else it would be undefined which value to pick from the aggregated set.
The manual:
When GROUP BY is present, or any aggregate functions are present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions or when the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column. A functional dependency exists if the grouped columns (or a subset thereof) are the primary key of the table containing the ungrouped column.
A certain other RDBMS is known to play dirty tricks here and allow this and pick arbitrary values...
You seem to want a list of unique patients that have commented, with the latest comment each. The simplest way in Postgres is with DISTINCT ON:
SELECT DISTINCT ON (patient_id) *
FROM comments
WHERE clinician_id = $1
ORDER BY patient_id, created_at DESC NULLS LAST;
But this won't fly with SQLite - which should not be in the loop to begin with. See:
- Generic Ruby solution for SQLite3 "LIKE" or PostgreSQL "ILIKE"?
NULLS LAST is only relevant if created_at can be NULL:
- Sort by column ASC, but NULL values first?
Details for DISTINCT ON:
- Select first row in each GROUP BY group?
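For readers who think in Ruby rather than SQL, the DISTINCT ON query above behaves roughly like "sort, then keep the first row per patient". Here is a minimal sketch of that idea on in-memory data; the comment rows are invented, and this only illustrates the semantics, not something to run instead of the query.

```ruby
# Hypothetical comments for one clinician; ids and timestamps are invented.
comments = [
  { id: 1, patient_id: 7, created_at: Time.utc(2020, 1, 1) },
  { id: 2, patient_id: 7, created_at: Time.utc(2020, 3, 1) },
  { id: 3, patient_id: 9, created_at: nil },
  { id: 4, patient_id: 9, created_at: Time.utc(2020, 2, 1) }
]

# ORDER BY patient_id, created_at DESC NULLS LAST:
# within each patient, non-NULL timestamps come first (newest first), NULLs go last.
sorted = comments.sort_by do |c|
  ts = c[:created_at]
  [c[:patient_id], ts ? 0 : 1, ts ? -ts.to_f : 0.0]
end

# DISTINCT ON (patient_id): keep the first row of each patient in that order.
latest_per_patient = sorted.uniq { |c| c[:patient_id] }

latest_per_patient.each { |c| puts "patient #{c[:patient_id]} -> comment #{c[:id]}" }
```

Without the NULLS LAST part of the sort key, the comment with a NULL created_at would sort first for patient 9 in descending order, which is why the clause matters when the column is nullable.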