Does the order of columns matter in a group by clause?
No, the order doesn't matter for the GROUP BY clause.
MySQL and SQLite are the only databases I'm aware of that allow you to select columns which are omitted from the group by (non-standard, not portable) but the order doesn't matter there either.
Does Column Order Matter in a Group By Clause?
- The order of column names in GROUP BY does not matter; the results will be the same.
NOTE: The result is an unordered set of rows, which means that every time your query is executed the database could potentially return the rows in a different order.
- To impose an order on this set, you need to use ORDER BY. The order of columns in ORDER BY changes the order of the rows, but not which rows are in the result set.
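A minimal sketch of both points, using Python's built-in sqlite3 module. The `sales` table, its columns, and its rows are hypothetical and exist only for illustration; other databases should behave the same way for this query, but that is an assumption rather than something tested here.

```python
import sqlite3

# Hypothetical sample table; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("east", "apple", 10),
        ("east", "apple", 5),
        ("east", "pear", 7),
        ("west", "apple", 3),
        ("west", "pear", 4),
        ("west", "pear", 6),
    ],
)

query = "SELECT region, product, SUM(amount) FROM sales GROUP BY {}"
by_region_first = conn.execute(query.format("region, product")).fetchall()
by_product_first = conn.execute(query.format("product, region")).fetchall()

# Compare as sets: without ORDER BY the row order is not guaranteed.
same_groups = set(by_region_first) == set(by_product_first)

# ORDER BY fixes the row order; swapping its columns reorders the rows
# but leaves the set of rows unchanged.
grouped = query.format("region, product")
ordered_a = conn.execute(grouped + " ORDER BY region, product").fetchall()
ordered_b = conn.execute(grouped + " ORDER BY product, region").fetchall()

print(same_groups)
print(ordered_a)
print(ordered_b)
```

The two GROUP BY variants are compared as sets on purpose, since comparing the raw lists would depend on whatever order the engine happened to return.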
Why does the order of columns in an index matter for a group by in Postgresql?
You are right that the result is the same no matter in which order the columns appear in the GROUP BY clause, and that the same execution plan could be used.
The PostgreSQL optimizer just doesn't consider reordering the GROUP BY expressions to see if a different ordering would match an existing index.
This is a limitation, and you can ask the pgsql-hackers list if an enhancement here would be desirable or not. You could back this up with a patch that implements the desired functionality.
However, I am not certain that such an enhancement would be accepted. The downside of such an enhancement would be that the optimizer has to work more, and that would affect planning times for all queries that use a GROUP BY clause. In addition, it is quite easy to work around this limitation: just rewrite your query and change the order of GROUP BY expressions. So I would say that things should be left the way they are now.
Order of columns in GROUP BY clause does affect index use
Your observation is correct. Index use can differ because the cost-based optimizer relies on the leftmost "prefix" order of the columns in the composite index declaration. This behavior follows from the use of a B-tree index.
In MySQL 5.7 (the version documented at the link below), GROUP BY also sorts the result, and hence the index would be used if
- the GROUP BY columns are in the same order as in the index, or
- only the leftmost index columns are used in the GROUP BY, or
- the leftmost index column is used in the WHERE clause and the rest appear in the correct order in the GROUP BY clause.
More on this, and on the topic of Loose/Tight Index Scan, can be found here:
https://dev.mysql.com/doc/refman/5.7/en/group-by-optimization.html
BigQuery using a Group By function for two columns, order does not matter
You can use LEAST and GREATEST to sort the fruits in the two columns into alphabetical order, and then group on those sorted values:
SELECT Student,
LEAST(Fruit1, Fruit2) AS Fruit1,
GREATEST(Fruit1, Fruit2) AS Fruit2,
COUNT(*) AS Count,
CASE WHEN COUNT(*) > 1 THEN 'True' ELSE 'False' END AS "Repeated Condition"
FROM fruits
GROUP BY Student, LEAST(Fruit1, Fruit2), GREATEST(Fruit1, Fruit2)
Output:
| student | fruit1 | fruit2 | count | Repeated Condition |
----------------------------------------------------------
| Tom | Apple | Banana | 2 | True |
| Gary | Apple | Banana | 1 | False |
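The same idea can be tried locally with Python's built-in sqlite3 module. This is only a sketch under stated assumptions: SQLite has no LEAST/GREATEST, so its two-argument scalar MIN and MAX are swapped in for them, the three input rows are hypothetical (chosen to be consistent with the output above, since the question's table is not shown here), and the aliases `first_fruit`, `second_fruit`, and `n` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruits (Student TEXT, Fruit1 TEXT, Fruit2 TEXT)")
# Hypothetical rows, chosen to be consistent with the output shown above.
conn.executemany(
    "INSERT INTO fruits VALUES (?, ?, ?)",
    [
        ("Tom", "Apple", "Banana"),
        ("Tom", "Banana", "Apple"),
        ("Gary", "Apple", "Banana"),
    ],
)

# Two-argument MIN/MAX are scalar functions in SQLite and stand in
# for LEAST/GREATEST: they put each pair into alphabetical order.
rows = conn.execute(
    """
    SELECT Student,
           MIN(Fruit1, Fruit2) AS first_fruit,
           MAX(Fruit1, Fruit2) AS second_fruit,
           COUNT(*) AS n
    FROM fruits
    GROUP BY Student, MIN(Fruit1, Fruit2), MAX(Fruit1, Fruit2)
    ORDER BY n DESC, Student
    """
).fetchall()

for row in rows:
    print(row)
```

Because each pair is normalized before grouping, (Apple, Banana) and (Banana, Apple) fall into the same group for Tom.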
Does the order matter in ORDER BY clause?
ORDER BY a ASC, b DESC, c DESC, d ASC means that
- First, order by a in ascending order
- Then, among rows with equal a, order by b in descending order
- Then, among rows with equal a and b, order by c in descending order
- Finally, among rows with equal a, b, and c, order by d in ascending order
In your example it does not matter because the values are the same, but in the example below it makes a difference.
Example unordered list:
| a | b | c | d |
-----------------------
| 10 | 5 | 2 | 7 |
| 10 | 3 | 1 | 7 |
| 5 | 5 | 6 | 3 |
| 4 | 2 | 5 | 3 |
| 5 | 4 | 6 | 3 |
| 3 | 5 | 9 | 4 |
| 3 | 6 | 6 | 4 |
| 5 | 5 | 6 | 3 |
| 4 | 2 | 5 | 3 |
| 5 | 4 | 6 | 3 |
| 3 | 5 | 1 | 4 |
| 3 | 6 | 2 | 4 |
ORDER BY d ASC, a ASC, b DESC, c DESC gives:
| a | b | c | d |
-----------------------
| 4 | 2 | 5 | 3 |
| 4 | 2 | 5 | 3 |
| 5 | 5 | 6 | 3 |
| 5 | 5 | 6 | 3 |
| 5 | 4 | 6 | 3 |
| 5 | 4 | 6 | 3 |
| 3 | 6 | 6 | 4 |
| 3 | 6 | 2 | 4 |
| 3 | 5 | 9 | 4 |
| 3 | 5 | 1 | 4 |
| 10 | 5 | 2 | 7 |
| 10 | 3 | 1 | 7 |
The result after ORDER BY a ASC, b DESC, c DESC, d ASC is:
| a | b | c | d |
-----------------------
| 3 | 6 | 6 | 4 |
| 3 | 6 | 2 | 4 |
| 3 | 5 | 9 | 4 |
| 3 | 5 | 1 | 4 |
| 4 | 2 | 5 | 3 |
| 4 | 2 | 5 | 3 |
| 5 | 5 | 6 | 3 |
| 5 | 5 | 6 | 3 |
| 5 | 4 | 6 | 3 |
| 5 | 4 | 6 | 3 |
| 10 | 5 | 2 | 7 |
| 10 | 3 | 1 | 7 |
For ORDER BY d DESC, c DESC, b ASC, a ASC, the result will be:
| a | b | c | d |
-----------------------
| 10 | 5 | 2 | 7 |
| 10 | 3 | 1 | 7 |
| 3 | 5 | 9 | 4 |
| 3 | 6 | 6 | 4 |
| 3 | 6 | 2 | 4 |
| 3 | 5 | 1 | 4 |
| 5 | 4 | 6 | 3 |
| 5 | 4 | 6 | 3 |
| 5 | 5 | 6 | 3 |
| 5 | 5 | 6 | 3 |
| 4 | 2 | 5 | 3 |
| 4 | 2 | 5 | 3 |
You can try this example with different ORDER BY clauses here: http://rextester.com/LKBRK29917
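The orderings can also be checked locally. The sketch below loads the twelve rows of the unordered list into an in-memory SQLite table through Python's sqlite3 module; the table name `t` and the helper `ordered` are made up for the example, and SQLite is only a stand-in for whatever database you actually use.

```python
import sqlite3

# The twelve rows (a, b, c, d) from the unordered example list.
rows = [
    (10, 5, 2, 7), (10, 3, 1, 7), (5, 5, 6, 3), (4, 2, 5, 3),
    (5, 4, 6, 3), (3, 5, 9, 4), (3, 6, 6, 4), (5, 5, 6, 3),
    (4, 2, 5, 3), (5, 4, 6, 3), (3, 5, 1, 4), (3, 6, 2, 4),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER, d INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)


def ordered(order_by):
    """Return all rows of t sorted by the given ORDER BY column list."""
    return conn.execute(f"SELECT a, b, c, d FROM t ORDER BY {order_by}").fetchall()


by_d_first = ordered("d ASC, a ASC, b DESC, c DESC")
by_a_first = ordered("a ASC, b DESC, c DESC, d ASC")
by_d_desc = ordered("d DESC, c DESC, b ASC, a ASC")

# The same twelve rows come back each time; only their sequence differs.
assert sorted(by_d_first) == sorted(by_a_first) == sorted(by_d_desc) == sorted(rows)

# First row of each ordering.
print(by_d_first[0], by_a_first[0], by_d_desc[0])
```

Comparing the sorted lists rather than sets keeps the duplicate rows in the check, so a dropped or doubled row would be caught.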
SQL: does the order of GROUP BY conditions matter?
In any database but MySQL, your statement would be correct without additional comment. MySQL, with deprecated functionality, returns the result set in the order specified by the GROUP BY. Let me repeat that this is deprecated (MySQL 8.0 removed the implicit sorting) and you should not depend on the functionality, but it is there in older versions and has been used.
In terms of the functionality of GROUP BY, this makes little difference. You will get the same groups with the same calculated values for each group. The ordering of the result set might differ, but GROUP BY does not return an ordered result set unless you use ORDER BY.