Can I use non-aggregate columns with group by?
You can't get the Id of the row that MAX found, because there might not be only one id with the maximum age.
GROUP BY clause with non aggregate functions
Sometimes is quite acceptable. Your query, written in more standard SQL, would be something like:
SELECT col, CHAR_LENGTH(col)
FROM (SELECT col FROM table GROUP BY col) c
or as:
SELECT col, MAX(CHAR_LENGTH(col))
FROM table
GROUP BY col
using non aggregate functions you can simplify the query a little bit, but the query would be a little more difficult to read.
It could also be useful when you are sure that all non aggregated columns share the same value:
SELECT id, name, surname
FROM table
GROUP BY id
HAVING COUNT(*)=1
or when it doesn't matter which value you need to return:
SELECT id, name
FROM table
GROUP BY id
will return a single name associated to that id (probably the first name encountered, but we can't be sure which one is the first, order by doesn't help here...). Be warned that if you want to select multiple non aggregated columns:
SELECT id, name, surname
FROM table
GROUP BY id
we have no guarantees that the name and surname returned will belong to the same row.
I would prefer not to use this extension, unless you are 100% sure of why you are using it.
Using GROUP BY and JOIN-ing non-aggregated columns to query
Maybe by aggregating the extra columns?
SELECT foo, bar, baz, moo
, MAX(ins_date) AS mdate
, AVG(percentage) AS avg_perc
, MAX(yes) AS YEAHBABY
, MAX(maybe) AS MAYBEBABY
, MAX(no) AS NONONONONOOO
FROM dummy dum
WHERE ins_date BETWEEN '2018-07-01 00:00:00'
AND '2019-11-01 23:59:59'
AND moo LIKE '%moo%'
AND baz = 'baz'
GROUP BY foo, bar, baz, moo
foo | bar | baz | moo | mdate | avg_perc | YEAHBABY | MAYBEBABY | NONONONONOOO
:------- | :------ | :-- | :---------------- | :------------------ | -------: | -------: | --------: | -----------:
foothing | bar_one | baz | amoosing | 2018-11-29 05:31:00 | 82.8000 | 11236 | 625 | 841
foothing | bar_one | baz | mooman_being | 2019-04-21 10:31:00 | 70.0000 | 3969 | 16 | 121
foothing | bar_one | baz | strawberry_moosse | 2019-03-17 06:37:00 | 80.0000 | 23716 | 529 | 49
db<>fiddle here
Or join to the grouped fields and MAX date.
Or use an emulated row_number.
Or use an EXISTS
.
SELECT foo, bar, baz, moo
, ins_date
, percentage
, yes, maybe, no
FROM dummy dum
WHERE EXISTS (
SELECT 1
FROM dummy dum2
WHERE dum2.ins_date BETWEEN '2018-07-01 00:00:00'
AND '2019-11-01 23:59:59'
AND dum2.moo LIKE '%moo%'
AND dum2.baz = 'baz'
AND dum2.foo = dum.foo
AND dum2.bar = dum.bar
AND dum2.baz = dum.baz
AND dum2.moo = dum.moo
GROUP BY foo, bar, baz, moo
HAVING MAX(dum2.ins_date) = dum.ins_date
);
foo | bar | baz | moo | ins_date | percentage | yes | maybe | no
:------- | :------ | :-- | :---------------- | :------------------ | ---------: | ----: | ----: | --:
foothing | bar_one | baz | strawberry_moosse | 2019-03-17 06:37:00 | 80 | 23716 | 529 | 49
foothing | bar_one | baz | mooman_being | 2019-04-21 10:31:00 | 70 | 3969 | 16 | 121
foothing | bar_one | baz | amoosing | 2018-11-29 05:31:00 | 97 | 9025 | 361 | 1
db<>fiddle here
Are there in guarantees about the non-aggregated columns in a GROUP BY query?
This behavior is covered in SELECT/Simple Select Processing/Side note: Bare columns in an aggregate queries.
In your query the columns id
and letter
, which are not aggregated and are not included in the GROUP BY
clause, are called bare columns.
Because you use the MAX()
aggregate function, the values of these 2 columns:
... take values from the input row which also contains the minimum or
maximum
But, since there may exist more than 1 rows with the maximum val
for the same parent
:
There is still an ambiguity if two or more of the input rows have the
same minimum or maximum value
This means that for your sample data there is no guarantee that for parent = 10
you will get the row with id = 1
in the results.
You may get the row with id = 2
which also contains the maximum val
.
Assuming that in such a case, where for the same parent there may exist more than 1 rows with the maximum val
, you want the row with the minimum id
, you can do it with window functions:
SELECT id, val, parent, letter
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY parent ORDER BY val DESC, id) rn
FROM tablename
)
WHERE rn = 1
or:
SELECT DISTINCT
FIRST_VALUE(id) OVER (PARTITION BY parent ORDER BY val DESC, id) id,
MAX(val) OVER (PARTITION BY parent) val,
parent,
FIRST_VALUE(letter) OVER (PARTITION BY parent ORDER BY val DESC, id) letter
FROM tablename
See the demo.
How do we select non-aggregate columns in a query with a GROUP BY clause, which is not functionally dependent on columns in GROUP BY clause?
Try execute the below query. This will remove the restriction of such.
SET GLOBAL sql_mode=(SELECT REPLACE(@@sql_mode,'ONLY_FULL_GROUP_BY',''));
Select multiple (non-aggregate function) columns with GROUP BY
You have yourself a greatest-n-per-group problem. This is one of the possible solutions:
select c.mukey, c.comppct_r, c.name, c.type
from c yt
inner join(
select c.mukey, max(c.comppct_r) comppct_r
from c
group by c.mukey
) ss on c.mukey = ss.mukey and c.comppct_r= ss.comppct_r
Another possible approach, same output:
select c1.*
from c c1
left outer join c c2
on (c1.mukey = c2.mukey and c1.comppct_r < c2.comppct_r)
where c2.mukey is null;
There's a comprehensive and explanatory answer on the topic here: SQL Select only rows with Max Value on a Column
SQL Server - How to have non-aggregated columns in Select statement when using group by?
Figured this out literally as soon as I posted it. I was just confused on the proper way to use a window function.
SELECT [Id]
,LastModifiedDate
,Row_Number() over (Partition by Id order by LastModifiedDate DESC) as rn
,[FirstName]
,[LastName]
,[Company]
,[Title]
FROM [MyServer].[Database].[Table] as C
where rn = 1
What i was trying to do was:
,Row_Number() over (Partition by Id order by MAX(LastModifiedDate) DESC) as rn
but MAX makes no sense because it is DESC
Related Topics
SQL Select Rows with Only a Certain Value in Them
SQL Statement Help - Select Latest Order for Each Customer
How to Rename an Index in MySQL
SQL Server 2005 Using Dateadd to Add a Day to a Date
Query Across Multiple Databases on Same Server
How to Write SQL Using Speech Recognition
Return Bit Value as 1/0 and Not True/False in SQL Server
Select Distinct on One Column, Return Multiple Other Columns (SQL Server)
Best-Performance Query for "Select Max in Group"
How to Select the Most Frequently Appearing Values
Arithmetic Overflow Error Converting Numeric to Data Type Numeric
Extracting the Total Number of Seconds from an Interval Data-Type
How to Select Columns from a Table Which Have Non Null Values
Delphi - Prevent Against SQL Injection
Windowed Functions Can Only Appear in the Select or Order by Clauses
How to Pull a List of Id's from a SQL Table as a Comma-Separated Values String