Does the Order of Columns Matter in a Group by Clause

Does the order of columns matter in a group by clause?

No, the order doesn't matter for the GROUP BY clause.

MySQL and SQLite are the only databases I'm aware of that allow you to select columns which are omitted from the GROUP BY (non-standard, not portable), but the order doesn't matter there either.
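
As a rough illustration of the non-standard behaviour mentioned above, here is a minimal sketch using Python's built-in sqlite3 module. The sales table and its rows are made up for the example. The first query selects a column that is neither grouped nor aggregated, which SQLite accepts; the second is the portable form.

```python
import sqlite3

# Hypothetical table and data, invented purely for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("east", "apple", 10), ("east", "pear", 5), ("west", "apple", 7)],
)

# 'product' is neither in GROUP BY nor aggregated. SQLite accepts this and
# takes the value from an arbitrary row of each group (non-standard).
bare_rows = conn.execute(
    "SELECT region, product, SUM(amount) FROM sales GROUP BY region"
).fetchall()

# Portable form: every selected column is either grouped or aggregated.
portable_rows = conn.execute(
    "SELECT region, MIN(product), SUM(amount) FROM sales "
    "GROUP BY region ORDER BY region"
).fetchall()

print(bare_rows)
print(portable_rows)
```

Only the grouped and aggregated columns of the first query are predictable; the value shown for product should not be relied on.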

Does Column Order Matter in a Group By Clause?

  • The order of column names in GROUP BY does not matter; the resulting groups are the same.

NOTE: The result is an unordered set of rows, so the database may return the rows in a different order each time the query is executed.

  • To impose an order on this set, use ORDER BY. The order of columns in ORDER BY changes the order of the rows, but not which rows are returned.
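
Both points can be checked with a small sketch. It uses Python's sqlite3 module and a made-up table t (the table name, columns and rows are assumptions for the example): the two GROUP BY orders are compared as sets, and ORDER BY is then used to fix the row order.

```python
import sqlite3

# Hypothetical table and data, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a TEXT, b TEXT, n INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("x", "p", 1), ("x", "q", 2), ("y", "p", 3), ("x", "p", 4)],
)

# Same query, GROUP BY columns swapped. Row order is not guaranteed,
# so the results are compared as sets.
ab = conn.execute("SELECT a, b, SUM(n) FROM t GROUP BY a, b").fetchall()
ba = conn.execute("SELECT a, b, SUM(n) FROM t GROUP BY b, a").fetchall()
same_groups = set(ab) == set(ba)

# ORDER BY decides the row order, and here the column order does matter.
by_a_then_b = conn.execute(
    "SELECT a, b, SUM(n) FROM t GROUP BY b, a ORDER BY a, b"
).fetchall()
by_b_then_a = conn.execute(
    "SELECT a, b, SUM(n) FROM t GROUP BY b, a ORDER BY b, a"
).fetchall()

print(same_groups)
print(by_a_then_b)
print(by_b_then_a)
```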

Why does the order of columns in an index matter for a group by in Postgresql?

You are right that the result is the same no matter in which order the columns appear in the GROUP BY clause, and that the same execution plan could be used.

The PostgreSQL optimizer just doesn't consider reordering the GROUP BY expressions to see if a different ordering would match an existing index.

This is a limitation, and you can ask the pgsql-hackers list if an enhancement here would be desirable or not. You could back this up with a patch that implements the desired functionality.

However, I am not certain that such an enhancement would be accepted. The downside is that the optimizer would have to do more work, which would affect planning times for all queries that use a GROUP BY clause. In addition, it is quite easy to work around this limitation: just rewrite your query and change the order of the GROUP BY expressions. So I would say that things should be left the way they are now.

Order of columns in GROUP BY clause does affect index use

Your observation is correct. The execution plan can differ, because the cost-based optimizer relies on the leftmost-prefix order of the columns declared in the composite index when deciding whether the index can be used. This behavior follows from the use of a B-tree index.

In MySQL, GROUP BY has also been used to order the result, and hence the index would be used if

  • the GROUP BY columns are in the same order as in the index, or
  • only the leftmost columns of the index are used in GROUP BY, or
  • the leftmost column is used in the WHERE clause and the rest appear in index order in the GROUP BY clause.

More on this, and on the topic of Loose/Tight Index Scan, can be found here:
https://dev.mysql.com/doc/refman/5.7/en/group-by-optimization.html

BigQuery using a Group By function for two columns, order does not matter

You can use LEAST and GREATEST to sort the fruits in the two columns into alphabetical order, and then group on those sorted values:

SELECT Student,
       LEAST(Fruit1, Fruit2) AS Fruit1,
       GREATEST(Fruit1, Fruit2) AS Fruit2,
       COUNT(*) AS Count,
       CASE WHEN COUNT(*) > 1 THEN 'True' ELSE 'False' END AS "Repeated Condition"
FROM fruits
GROUP BY Student, LEAST(Fruit1, Fruit2), GREATEST(Fruit1, Fruit2)

Output:

| student | fruit1 | fruit2 | count | Repeated Condition |
---------------------------------------------------------
| Tom     | Apple  | Banana | 2     | True               |
| Gary    | Apple  | Banana | 1     | False              |
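
The same idea can be tried locally with a minimal sketch in Python's sqlite3 module. Two things here are assumptions rather than part of the answer above: the three input rows are invented so that they produce the output shown, and because SQLite has no LEAST/GREATEST, the sketch substitutes SQLite's two-argument scalar MIN()/MAX(). The aliases FirstFruit and SecondFruit are also made up, to avoid clashing with the column names.

```python
import sqlite3

# Hypothetical input rows, invented to match the output shown above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruits (Student TEXT, Fruit1 TEXT, Fruit2 TEXT)")
conn.executemany(
    "INSERT INTO fruits VALUES (?, ?, ?)",
    [
        ("Tom", "Apple", "Banana"),
        ("Tom", "Banana", "Apple"),  # same pair, columns swapped
        ("Gary", "Apple", "Banana"),
    ],
)

# MIN(x, y) / MAX(x, y) with two arguments are scalar functions in SQLite
# and stand in for LEAST / GREATEST here.
pairs = conn.execute(
    """
    SELECT Student,
           MIN(Fruit1, Fruit2) AS FirstFruit,
           MAX(Fruit1, Fruit2) AS SecondFruit,
           COUNT(*) AS Cnt,
           CASE WHEN COUNT(*) > 1 THEN 'True' ELSE 'False' END AS Repeated
    FROM fruits
    GROUP BY Student, MIN(Fruit1, Fruit2), MAX(Fruit1, Fruit2)
    ORDER BY Student DESC
    """
).fetchall()

for row in pairs:
    print(row)
```

Sorting each pair before grouping is what makes ("Apple", "Banana") and ("Banana", "Apple") fall into the same group.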

Does the order matter in ORDER BY clause?

ORDER BY a ASC, b DESC, c DESC, d ASC means:

  • First, order by a in ascending order
  • Then, for rows with equal a, order by b in descending order
  • Then, for rows with equal a and b, order by c in descending order
  • Finally, for rows with equal a, b and c, order by d in ascending order

In your example it does not matter because the values are the same, but in the example below it does make a difference.

Example unordered list:

| a  |  b  |  c  | d  |
-----------------------
| 10 | 5 | 2 | 7 |
| 10 | 3 | 1 | 7 |
| 5 | 5 | 6 | 3 |
| 4 | 2 | 5 | 3 |
| 5 | 4 | 6 | 3 |
| 3 | 5 | 9 | 4 |
| 3 | 6 | 6 | 4 |
| 5 | 5 | 6 | 3 |
| 4 | 2 | 5 | 3 |
| 5 | 4 | 6 | 3 |
| 3 | 5 | 1 | 4 |
| 3 | 6 | 2 | 4 |

After ORDER BY d ASC, a ASC, b DESC, c DESC the result is:

| a  |  b  |  c  | d  |
-----------------------
| 4 | 2 | 5 | 3 |
| 4 | 2 | 5 | 3 |
| 5 | 5 | 6 | 3 |
| 5 | 5 | 6 | 3 |
| 5 | 4 | 6 | 3 |
| 5 | 4 | 6 | 3 |
| 3 | 6 | 6 | 4 |
| 3 | 6 | 2 | 4 |
| 3 | 5 | 9 | 4 |
| 3 | 5 | 1 | 4 |
| 10 | 5 | 2 | 7 |
| 10 | 3 | 1 | 7 |

After ORDER BY a ASC, b DESC, c DESC, d ASC the result is:

| a  |  b  |  c  | d  |
-----------------------
| 3 | 6 | 6 | 4 |
| 3 | 6 | 2 | 4 |
| 3 | 5 | 9 | 4 |
| 3 | 5 | 1 | 4 |
| 4 | 2 | 5 | 3 |
| 4 | 2 | 5 | 3 |
| 5 | 5 | 6 | 3 |
| 5 | 5 | 6 | 3 |
| 5 | 4 | 6 | 3 |
| 5 | 4 | 6 | 3 |
| 10 | 5 | 2 | 7 |
| 10 | 3 | 1 | 7 |

For ORDER BY d DESC, c DESC, b ASC, a ASC the result is:

| a  |  b  |  c  | d  |
-----------------------
| 10 | 5 | 2 | 7 |
| 10 | 3 | 1 | 7 |
| 3 | 5 | 9 | 4 |
| 3 | 6 | 6 | 4 |
| 3 | 6 | 2 | 4 |
| 3 | 5 | 1 | 4 |
| 5 | 4 | 6 | 3 |
| 5 | 4 | 6 | 3 |
| 5 | 5 | 6 | 3 |
| 5 | 5 | 6 | 3 |
| 4 | 2 | 5 | 3 |
| 4 | 2 | 5 | 3 |
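
To try these orderings without an online tool, here is a minimal sketch using Python's sqlite3 module. It loads the twelve unordered rows from the example into an in-memory table (the table name t and the helper function ordered are assumptions made for the sketch) and runs the three ORDER BY clauses discussed above.

```python
import sqlite3

# The twelve unordered example rows as (a, b, c, d).
ROWS = [
    (10, 5, 2, 7), (10, 3, 1, 7), (5, 5, 6, 3), (4, 2, 5, 3),
    (5, 4, 6, 3), (3, 5, 9, 4), (3, 6, 6, 4), (5, 5, 6, 3),
    (4, 2, 5, 3), (5, 4, 6, 3), (3, 5, 1, 4), (3, 6, 2, 4),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER, d INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", ROWS)


def ordered(order_by):
    """Return all rows of t sorted by the given ORDER BY clause."""
    return conn.execute(f"SELECT a, b, c, d FROM t ORDER BY {order_by}").fetchall()


first = ordered("d ASC, a ASC, b DESC, c DESC")
second = ordered("a ASC, b DESC, c DESC, d ASC")
third = ordered("d DESC, c DESC, b ASC, a ASC")

for name, rows in (("first", first), ("second", second), ("third", third)):
    print(name)
    for row in rows:
        print("  ", row)
```

All three results contain the same twelve rows; only their sequence changes with the order of the sort keys.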

You can experiment with this example and different ORDER BY clauses here: http://rextester.com/LKBRK29917

SQL: does the order of GROUP BY conditions matter?

In any database but MySQL, your statement would be correct without additional comment. MySQL -- with deprecated functionality -- returns the result set in the order specified by the group by. Let me repeat that this is deprecated and you should not depend on the functionality, but it is there and has been used.

In terms of the functionality of GROUP BY, this makes little difference. You will get the same groups with the same calculated values for each group. The ordering of the result set might differ -- but GROUP BY does not return an ordered result set unless you use ORDER BY.


