Is there a performance difference in using a GROUP BY with MAX() as the aggregate vs ROW_NUMBER over partition by?
The group by should be faster. The row number approach has to assign a row number to every row in the table, and it does this before filtering out the rows it doesn't want.
The second query is, by far, the better construct. In the first, you have to be sure that the columns in the partition clause match the columns that you want. More importantly, "group by" is a well-understood construct in SQL. I would also speculate that the group by might make better use of indexes, but that is speculation.
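The queries from the original question are not reproduced above, so here is a minimal sketch of the two shapes being compared. It assumes a hypothetical Orders table with invented data and runs through Python's sqlite3 module (window functions need SQLite 3.25 or newer). It only shows that both forms return the same rows; it does not measure performance, which depends on the engine, the indexes, and the data.

```python
import sqlite3

# Hypothetical table and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (customerId INTEGER, orderId INTEGER);
    INSERT INTO Orders VALUES (1, 10), (1, 11), (1, 12), (2, 20), (2, 21);
""")

# Variant A: GROUP BY with MAX() as the aggregate.
group_by_rows = conn.execute("""
    SELECT customerId, MAX(orderId) AS lastOrderId
    FROM Orders
    GROUP BY customerId
    ORDER BY customerId
""").fetchall()

# Variant B: number every row per customer, then keep only rn = 1.
row_number_rows = conn.execute("""
    SELECT customerId, orderId AS lastOrderId
    FROM (
        SELECT customerId, orderId,
               ROW_NUMBER() OVER (PARTITION BY customerId
                                  ORDER BY orderId DESC) AS rn
        FROM Orders
    )
    WHERE rn = 1
    ORDER BY customerId
""").fetchall()

print(group_by_rows)
print(row_number_rows)
```

To compare the two on a real system, look at the execution plans your own database produces rather than relying on a toy table like this one.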
SQL: difference between PARTITION BY and GROUP BY
They're used in different places. GROUP BY modifies the entire query, like:
select customerId, count(*) as orderCount
from Orders
group by customerId
But PARTITION BY just works on a window function, like ROW_NUMBER():
select row_number() over (partition by customerId order by orderId)
as OrderNumberForThisCustomer
from Orders
GROUP BY normally reduces the number of rows returned by rolling them up and calculating averages or sums for each group. PARTITION BY does not affect the number of rows returned, but it changes how a window function's result is calculated.
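A small sketch of that row-count difference, using the two queries above against a hypothetical five-row Orders table (the data is invented, and the window query adds customerId and orderId to the select list so the output is readable). It runs through Python's sqlite3 module and assumes SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical data: customer 1 has two orders, customer 2 has three.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (orderId INTEGER, customerId INTEGER);
    INSERT INTO Orders VALUES (1, 1), (2, 1), (3, 2), (4, 2), (5, 2);
""")

# GROUP BY collapses the five rows into one row per customer.
grouped = conn.execute("""
    SELECT customerId, COUNT(*) AS orderCount
    FROM Orders
    GROUP BY customerId
    ORDER BY customerId
""").fetchall()

# PARTITION BY keeps all five rows and only restarts the numbering
# whenever customerId changes.
windowed = conn.execute("""
    SELECT customerId, orderId,
           ROW_NUMBER() OVER (PARTITION BY customerId ORDER BY orderId)
               AS OrderNumberForThisCustomer
    FROM Orders
    ORDER BY customerId, orderId
""").fetchall()

print(len(grouped), grouped)
print(len(windowed), windowed)
```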
Get top 1 row of each group
;WITH cte AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY DocumentID ORDER BY DateCreated DESC) AS rn
FROM DocumentStatusLogs
)
SELECT *
FROM cte
WHERE rn = 1
If you expect 2 entries per day, then this will arbitrarily pick one. To get both entries for a day, use DENSE_RANK instead.
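To make the tie behaviour concrete, here is a rough sketch with a hypothetical DocumentStatusLogs table in which document 1 has two entries on its latest date. The Status column, the values, and the dates are all invented for illustration. It runs through Python's sqlite3 module and assumes SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical data: document 1 has two rows sharing the latest DateCreated.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DocumentStatusLogs (DocumentID INTEGER, Status TEXT, DateCreated TEXT);
    INSERT INTO DocumentStatusLogs VALUES
        (1, 'S1', '2024-01-01'),
        (1, 'S2', '2024-01-02'),
        (1, 'S3', '2024-01-02'),
        (2, 'S1', '2024-01-05');
""")

# The ranking function name is the only thing that differs between the runs.
query = """
    WITH cte AS (
        SELECT DocumentID, Status,
               {rank_fn}() OVER (PARTITION BY DocumentID
                                 ORDER BY DateCreated DESC) AS rn
        FROM DocumentStatusLogs
    )
    SELECT DocumentID, Status
    FROM cte
    WHERE rn = 1
    ORDER BY DocumentID, Status
"""

# ROW_NUMBER: exactly one row per document; which tied row wins is arbitrary.
with_row_number = conn.execute(query.format(rank_fn="ROW_NUMBER")).fetchall()

# DENSE_RANK: tied rows share rank 1, so both latest entries come back.
with_dense_rank = conn.execute(query.format(rank_fn="DENSE_RANK")).fetchall()

print(with_row_number)
print(with_dense_rank)
```

If you need a deterministic single row instead, adding a tie-breaker column to the ORDER BY of the window is the usual fix.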
As for normalised or not, it depends if you want to:
- maintain status in 2 places
- preserve status history
- ...
As it stands, you preserve status history. If you want the latest status in the parent table too (which is denormalisation), you'd need a trigger to maintain "status" in the parent, or you could drop this status history table.
Comparison Group by VS Over Partition By
Yes, it may affect performance.
The second query is an example of an inline view.
Inline views are very useful for reports that need various kinds of counts or other aggregate functions.
Oracle executes the subquery and then uses the resulting rows as a view in the FROM clause.
Where performance matters, prefer an inline view over other kinds of subquery.
One more thing: the second query returns all of the max records, while the first returns only one max record.
Row_number over partition and find the max rn value
Use the MAX window function.
SELECT T.*,MAX(rn) OVER(PARTITION BY OrderNo) AS rn_max
FROM (
select OrderNO,PartCode,Quantity,row_number() over(partition by OrderNO order by DateEntered desc) as rn
from YourTable
) T
Edit: An easier option is to use COUNT, as suggested by @Jason A. Long in the comments.
select OrderNO
,PartCode
,Quantity
,row_number() over(partition by OrderNO order by DateEntered desc) as rn
,count(*) over(partition by OrderNO) as maxrn
from YourTable
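As a quick check that the two options agree, the sketch below computes both MAX(rn) OVER and COUNT(*) OVER for the same rows. The contents of YourTable are invented for illustration; it runs through Python's sqlite3 module and assumes SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical data: order 1 has two parts, order 2 has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE YourTable (OrderNO INTEGER, PartCode TEXT,
                            Quantity INTEGER, DateEntered TEXT);
    INSERT INTO YourTable VALUES
        (1, 'A', 5, '2024-01-01'),
        (1, 'B', 3, '2024-01-02'),
        (2, 'C', 7, '2024-01-01');
""")

rows = conn.execute("""
    SELECT OrderNO, PartCode, rn,
           MAX(rn) OVER (PARTITION BY OrderNO) AS rn_max,  -- option 1
           maxrn                                           -- option 2
    FROM (
        SELECT OrderNO, PartCode,
               ROW_NUMBER() OVER (PARTITION BY OrderNO
                                  ORDER BY DateEntered DESC) AS rn,
               COUNT(*) OVER (PARTITION BY OrderNO) AS maxrn
        FROM YourTable
    ) T
    ORDER BY OrderNO, rn
""").fetchall()

for row in rows:
    print(row)
```

The two columns match because ROW_NUMBER counts 1..n within each partition, so its maximum is the partition's row count. COUNT(*) avoids the extra nesting level.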
Need Help for SQL MIN MAX Group By and AGGREGATE
This is a gaps-and-islands problem, where you want to group together "adjacent" rows having the same holeid and alteration.
Here is one approach using window functions: the difference between row numbers can be used to define the groups.
select
max(id) max_id,
min([from]) min_from,
max([to]) max_to,
alteration
from (
select
a.*,
row_number() over(partition by holeid order by [from]) rn1,
row_number() over(partition by holeid, alteration order by [from]) rn2
from dbo.alt a
) t
group by holeid, alteration, rn1 - rn2
order by min_from
Demo on DB Fiddle:
min_from | max_to | alteration
:------- | :----- | :---------
0.00 | 132.60 | AA-LT-1
132.60 | 171.28 | ARG-1-MSI
171.28 | 216.80 | AA-LT-1
216.80 | 232.60 | ARG-2-Kaol
232.60 | 256.90 | ARG-1-MSI
256.90 | 265.70 | ARG-2-Kaol
265.70 | 290.10 | ARG-1-MSI
290.10 | 294.85 | ARG-2-Kaol
294.85 | 325.00 | ARG-1-MSI
325.00 | 332.10 | ARG-2-Kaol
332.10 | 382.70 | ARG-1-MSI
382.70 | 396.10 | ARG-2-Kaol
396.10 | 416.20 | ARG-1-MSI
Note: your sample data has no column id, so this does not appear in the above results.
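To see why the difference between the two row numbers identifies each island, here is a small sketch on invented data: one hypothetical hole, H1, with five intervals and two alteration codes. It uses Python's sqlite3 module (SQLite 3.25 or newer assumed), so the table is named alt rather than dbo.alt. The [from] and [to] bracket quoting is kept because SQLite accepts it too.

```python
import sqlite3

# Hypothetical data: A, A, B, A, A along one hole, i.e. three islands.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE alt (holeid TEXT, [from] REAL, [to] REAL, alteration TEXT);
    INSERT INTO alt VALUES
        ('H1',  0.0, 10.0, 'A'),
        ('H1', 10.0, 20.0, 'A'),
        ('H1', 20.0, 30.0, 'B'),
        ('H1', 30.0, 40.0, 'A'),
        ('H1', 40.0, 50.0, 'A');
""")

# rn1 numbers all rows of the hole; rn2 numbers rows per (hole, alteration).
# Inside one unbroken run both advance together, so rn1 - rn2 stays constant;
# a different alteration in between advances only rn1 and shifts the value.
islands = conn.execute("""
    SELECT MIN([from]) AS min_from, MAX([to]) AS max_to, alteration
    FROM (
        SELECT a.*,
               ROW_NUMBER() OVER (PARTITION BY holeid ORDER BY [from]) AS rn1,
               ROW_NUMBER() OVER (PARTITION BY holeid, alteration
                                  ORDER BY [from]) AS rn2
        FROM alt a
    ) t
    GROUP BY holeid, alteration, rn1 - rn2
    ORDER BY min_from
""").fetchall()

for island in islands:
    print(island)
```

Here the two runs of A get rn1 - rn2 values of 0 and 1, so they land in separate groups even though the alteration is the same.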
How to parse first_value aggregate in a group by statement [SNOWFLAKE] SQL
FIRST_VALUE is not an aggregate function but a window function, so you get an error when you use it together with a GROUP BY. If you want to use it with a GROUP BY, put an ANY_VALUE around it.
Here is some data I will use below in a CTE:
with data(id, seq, val) as (
select * from values
(1, 1, 10),
(1, 2, 11),
(1, 3, 12),
(1, 4, 13),
(2, 1, 20),
(2, 2, 21),
(2, 3, 22)
)
So to show that FIRST_VALUE is a window function, we can just use it:
select *
,first_value(val)over(partition by id order by seq) as first_val
from data
ID | SEQ | VAL | FIRST_VAL |
---|---|---|---|
1 | 1 | 10 | 10 |
1 | 2 | 11 | 10 |
1 | 3 | 12 | 10 |
1 | 4 | 13 | 10 |
2 | 1 | 20 | 20 |
2 | 2 | 21 | 20 |
2 | 3 | 22 | 20 |
Get records with max value for each group of grouped SQL results
There's a super-simple way to do this in MySQL:
select *
from (select * from mytable order by `Group`, age desc, Person) x
group by `Group`
This works because MySQL allows you to leave non-group-by columns unaggregated, in which case MySQL, in practice, just returns the first row of each group. The solution is to first order the data such that for each group the row you want is first, then group by the columns you want the value for.
You avoid complicated subqueries that try to find the max() etc., and also the problem of returning multiple rows when more than one row has the same maximum value (as the other answers would do).
Note: This is a MySQL-only solution. All other databases I know will throw an SQL syntax error with the message "non aggregated columns are not listed in the group by clause" or similar. Because this solution uses undocumented behavior, the more cautious may want to include a test to assert that it keeps working should a future version of MySQL change this behavior.
Version 5.7 update:
Since version 5.7, the sql-mode setting includes ONLY_FULL_GROUP_BY by default, so to make this work you must not have this option (edit the option file for the server to remove this setting).
Is there any difference between GROUP BY and DISTINCT
MusiGenesis' response is functionally the correct one with regard to your question as stated; the SQL Server is smart enough to realize that if you are using "Group By" and not using any aggregate functions, then what you actually mean is "Distinct" - and therefore it generates an execution plan as if you'd simply used "Distinct."
However, I think it's important to note Hank's response as well - cavalier treatment of "Group By" and "Distinct" could lead to some pernicious gotchas down the line if you're not careful. It's not entirely correct to say that this is "not a question about aggregates" because you're asking about the functional difference between two SQL query keywords, one of which is meant to be used with aggregates and one of which is not.
A hammer can work to drive in a screw sometimes, but if you've got a screwdriver handy, why bother?
(for the purposes of this analogy, Hammer : Screwdriver :: GroupBy : Distinct and screw => get list of unique values in a table column)
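For reference, a minimal sketch of the two forms side by side, on a hypothetical one-column table with invented values. It runs through Python's sqlite3 module and only shows that the result sets match. The claim about identical execution plans is specific to SQL Server and is not something this sketch verifies.

```python
import sqlite3

# Hypothetical table with duplicate values in a single column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (color TEXT);
    INSERT INTO t VALUES ('red'), ('blue'), ('red'), ('green'), ('blue');
""")

# The "hammer": GROUP BY with no aggregate function.
via_group_by = conn.execute(
    "SELECT color FROM t GROUP BY color ORDER BY color"
).fetchall()

# The "screwdriver": DISTINCT states the intent directly.
via_distinct = conn.execute(
    "SELECT DISTINCT color FROM t ORDER BY color"
).fetchall()

print(via_group_by)
print(via_distinct)
```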