Why Does MySQL Allow "Group By" Queries Without Aggregate Functions

Why does MySQL allow group by queries WITHOUT aggregate functions?

I believe that it was to handle the case where grouping by one field would imply other fields are also being grouped:

SELECT user.id, user.name, COUNT(post.owner_id) AS posts
FROM user
LEFT OUTER JOIN post ON post.owner_id=user.id
GROUP BY user.id

In this case user.name is always unique per user.id, so it is convenient not to have to list user.name in the GROUP BY clause (although, as you say, there is definite scope for problems).
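The functional-dependency point can be checked with a small sketch. It uses Python's built-in sqlite3 module as a stand-in engine (an assumption made so the example runs anywhere; it is not MySQL), and the rows are made up for illustration. It counts post.owner_id so that a user with no posts gets 0, and compares grouping by user.id alone with grouping by user.id and user.name.

```python
import sqlite3

# In-memory stand-in database; table contents are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (id INTEGER PRIMARY KEY, owner_id INTEGER);
    INSERT INTO user VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO post VALUES (10, 1), (11, 1), (12, 2);
""")

BASE = """
    SELECT user.id, user.name, COUNT(post.owner_id) AS posts
    FROM user
    LEFT OUTER JOIN post ON post.owner_id = user.id
    GROUP BY {group_cols}
    ORDER BY user.id
"""

# Group by the primary key only (name is functionally dependent on id).
by_id = conn.execute(BASE.format(group_cols="user.id")).fetchall()

# Group by every bare column, as strict SQL requires.
by_id_and_name = conn.execute(
    BASE.format(group_cols="user.id, user.name")
).fetchall()

print(by_id)
print(by_id == by_id_and_name)
```

Because each user.id maps to exactly one user.name, both forms return the same rows here. That stops being true as soon as the bare column is not determined by the grouped column.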

MySQL: How does groupby work on columns without aggregate functions?

Usually, using GROUP BY while listing a field in the select list without an aggregate function (and without that field appearing in the GROUP BY) is invalid SQL and should throw an error.

MySQL, however, allows this and simply picks the value from an arbitrary row in each group; the result is indeterminate rather than truly random. Try to avoid it, because it is confusing.

To disallow this, you can say at runtime:

SET sql_mode := CONCAT('ONLY_FULL_GROUP_BY,',@@sql_mode);

or use the configuration value and/or command line option sql-mode.

Yes, listing two aggregate functions is completely valid.

Any reason for GROUP BY clause without aggregation function?

Is the GROUP BY statement in any way useful without an accompanying aggregate function?

In such a situation GROUP BY gives the same result as DISTINCT, but the reason you'd want (or have) to use a GROUP BY clause is to be able to add HAVING conditions.

HAVING filters groups, so to apply a per-group condition you have to define a GROUP BY; you can't do it in conjunction with DISTINCT.
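A minimal sketch of the difference, again using Python's built-in sqlite3 module as a stand-in engine (an assumption, not MySQL). The orders table and its rows are hypothetical. DISTINCT and a bare GROUP BY return the same rows, but only the GROUP BY form can take a per-group HAVING condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT);
    INSERT INTO orders VALUES ('ann'), ('ann'), ('bob');
""")

# DISTINCT and GROUP BY without aggregates produce the same set of rows.
distinct_rows = conn.execute(
    "SELECT DISTINCT customer FROM orders ORDER BY customer"
).fetchall()
grouped_rows = conn.execute(
    "SELECT customer FROM orders GROUP BY customer ORDER BY customer"
).fetchall()

# Only the grouped form can filter on a property of each group.
repeat_customers = conn.execute(
    "SELECT customer FROM orders GROUP BY customer HAVING COUNT(*) > 1"
).fetchall()

print(distinct_rows, grouped_rows, repeat_customers)
```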

MySQL breaks without GROUP BY clause

If you need both the result of an aggregate function and the row values that go with it, you can't get them in a single query that avoids GROUP BY, so you must use a subquery (or a join), e.g.:

SELECT table_name, num_att
FROM T
where num_att = (select max(num_att) from T)

This restriction applies by default starting from MySQL 5.7. Earlier versions also allowed selecting a bare column next to an aggregate function without GROUP BY, but listing the column in GROUP BY is the correct way.
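The subquery approach can be tried out with a short sketch. It runs on Python's built-in sqlite3 module as a stand-in engine (an assumption, not MySQL), and the rows in T are invented for illustration. Note that when several rows share the maximum, the subquery form returns all of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T (table_name TEXT, num_att INTEGER);
    INSERT INTO T VALUES ('a', 3), ('b', 7), ('c', 7);
""")

# The scalar subquery computes the maximum once; the outer query then
# returns every row whose num_att equals that maximum.
max_rows = conn.execute("""
    SELECT table_name, num_att
    FROM T
    WHERE num_att = (SELECT MAX(num_att) FROM T)
    ORDER BY table_name
""").fetchall()

print(max_rows)
```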

Why does MySQL allow you to group by columns that are not selected

Because the book is wrong.

The columns in the group by have only one relationship to the columns in the select according to the ANSI standard. If a column is in the select, with no aggregation function, then it (or the expression it is in) needs to be in the group by statement. MySQL actually relaxes this condition.

This is even useful. For instance, if you want to select rows with the highest id for each group from a table, one way to write the query is:

select t.*
from table t
where t.id in (select max(id)
               from table t
               group by thegroup
              );

(Note: There are other ways to write such a query, this is just an example.)
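Here is a runnable sketch of that pattern, using Python's built-in sqlite3 module as a stand-in engine (an assumption, not MySQL). The table name items and its rows are hypothetical; only the column names id and thegroup come from the query above. The subquery groups by thegroup without selecting it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, thegroup TEXT);
    INSERT INTO items VALUES (1, 'x'), (2, 'x'), (3, 'y');
""")

# The subquery groups by a column it does not select and yields one
# max(id) per group; the outer query fetches the matching full rows.
latest_per_group = conn.execute("""
    SELECT t.*
    FROM items t
    WHERE t.id IN (SELECT MAX(id)
                   FROM items
                   GROUP BY thegroup)
    ORDER BY t.id
""").fetchall()

print(latest_per_group)
```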

EDIT:

The query that you are suggesting:

select EMP_ID, SALARY
from EMPLOYEE_PAY_TBL
group by BONUS;

would work in MySQL (as long as ONLY_FULL_GROUP_BY is not enabled) but probably not in any other database (unless BONUS happens to be a poorly named primary key on the table, but that is another matter). It will produce one row for each value of BONUS. For each row, it will get an arbitrary EMP_ID and SALARY from rows in that group. The documentation actually says "indeterminate", but I think arbitrary is easier to understand.

What you should really know about this type of query is simply not to use it. All the "bare" columns in the SELECT (that is, with no aggregation functions) should be in the GROUP BY. This is required in most databases. Note that this is the inverse of what the book says. There is no problem doing:

select EMP_ID
from EMPLOYEE_PAY_TBL
group by EMP_ID, BONUS;

Except that you might get multiple rows back for the same EMP_ID with no way to distinguish among them.

MySQL Aggregate Functions without GROUP BY clause

It's by design - it's one of many extensions to the standard that MySQL permits.

For a query like SELECT name, MAX(age) FROM t; the reference docs say that:

Without GROUP BY, there is a single group and it is indeterminate
which name value to choose for the group

See the documentation on group by handling for more information.

The ONLY_FULL_GROUP_BY setting controls this behavior (see 5.1.7 Server SQL Modes). Enabling it disallows a query that selects a bare column alongside an aggregate function without a GROUP BY clause, and it is enabled by default from MySQL version 5.7.5.

Why columns in selection without aggregate function needs to be part of Group by clause in MySQL?

When GROUP BY is present, or any aggregate functions are present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions (like sum, max, min, etc., which return a single value for each group). Otherwise there would be more than one possible value to return for an ungrouped column, and SELECT won't just return you an arbitrary value.

However, there are multiple workarounds to this.

Option 1. What you did yourself: add the other column to the GROUP BY, as in

SELECT
    matchid
    , mdate
    , COUNT(player)
FROM game
JOIN goal
    ON id = matchid
WHERE (team1 = 'POL' OR team2 = 'POL')
GROUP BY matchid, mdate;

Option 2. In this instance you could also wrap the other column in an aggregate function, as below. This works because mdate is functionally dependent on matchid, so any aggregate function that picks a value (max, min) returns the same date.

SELECT
    matchid
    , max(mdate) as mdate
    , COUNT(player)
FROM game
JOIN goal
    ON id = matchid
WHERE (team1 = 'POL' OR team2 = 'POL')
GROUP BY matchid;

Option 3. You can calculate the aggregate in a sub-query and then join it back to the game table to get the additional columns you need to show, as below

select
    t1.matchid
    , t2.mdate
    , t1.count_player
from
    (SELECT
        matchid
        , COUNT(player) as count_player
    FROM game
    JOIN goal
        ON id = matchid
    WHERE (team1 = 'POL' OR team2 = 'POL')
    GROUP BY matchid) t1
join game t2 on t1.matchid = t2.id;

Option 4. You can also use a window function and take the distinct tuples

SELECT distinct
    matchid
    , mdate
    , COUNT(player) over(partition by matchid) as count_player
FROM game
JOIN goal
    ON id = matchid
WHERE (team1 = 'POL' OR team2 = 'POL');
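
To see that the four options agree, here is a sketch that runs them side by side. It uses Python's built-in sqlite3 module as a stand-in engine (an assumption, not MySQL; the window function needs SQLite 3.25 or later). The game and goal rows, including the player names, are invented for illustration, and an ORDER BY is added only to make the comparison stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE game (id INTEGER PRIMARY KEY, mdate TEXT, team1 TEXT, team2 TEXT);
    CREATE TABLE goal (matchid INTEGER, player TEXT);
    INSERT INTO game VALUES
        (1, '2012-06-08', 'POL', 'GRE'),
        (2, '2012-06-08', 'RUS', 'CZE'),
        (3, '2012-06-12', 'POL', 'RUS');
    INSERT INTO goal VALUES (1, 'p1'), (1, 'p2'), (2, 'p3'), (3, 'p4');
""")

POL = "(team1 = 'POL' OR team2 = 'POL')"

queries = {
    # Option 1: list every bare column in GROUP BY.
    "option1": f"""
        SELECT matchid, mdate, COUNT(player)
        FROM game JOIN goal ON id = matchid
        WHERE {POL}
        GROUP BY matchid, mdate
        ORDER BY matchid""",
    # Option 2: wrap the dependent column in an aggregate.
    "option2": f"""
        SELECT matchid, MAX(mdate) AS mdate, COUNT(player)
        FROM game JOIN goal ON id = matchid
        WHERE {POL}
        GROUP BY matchid
        ORDER BY matchid""",
    # Option 3: aggregate in a subquery, then join back to game.
    "option3": f"""
        SELECT t1.matchid, t2.mdate, t1.count_player
        FROM (SELECT matchid, COUNT(player) AS count_player
              FROM game JOIN goal ON id = matchid
              WHERE {POL}
              GROUP BY matchid) t1
        JOIN game t2 ON t1.matchid = t2.id
        ORDER BY t1.matchid""",
    # Option 4: window function plus DISTINCT.
    "option4": f"""
        SELECT DISTINCT matchid, mdate,
               COUNT(player) OVER (PARTITION BY matchid) AS count_player
        FROM game JOIN goal ON id = matchid
        WHERE {POL}
        ORDER BY matchid""",
}

results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
for name, rows in results.items():
    print(name, rows)
```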

