The SQL Over() Clause - When and Why Is It Useful

SQL Server: what is this OVER clause doing?

ROW_NUMBER() is a ranking function that returns a sequential number for each row within a portion (partition) of the data. DATEPART returns the week number, which is ordered from smallest to largest. In this query, all rows with the same week number are lined up by their sequence number, i.e. the ROW_NUMBER() value.
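The original query is not shown here, so the following is only a rough sketch of the idea, assuming the query partitions by week number. It uses Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in for SQL Server, with strftime('%W', ...) playing the role of DATEPART. The orders table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; not taken from the original question.
conn.execute("CREATE TABLE orders (order_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("2024-01-01", 10), ("2024-01-03", 20), ("2024-01-08", 30),
     ("2024-01-10", 40), ("2024-01-12", 50)],
)

# Number the rows inside each week; numbering restarts at 1 for every new week.
week_rows = conn.execute("""
    SELECT order_date,
           ROW_NUMBER() OVER (
               PARTITION BY strftime('%W', order_date)
               ORDER BY order_date
           ) AS seq
    FROM orders
    ORDER BY order_date
""").fetchall()

for row in week_rows:
    print(row)
```

The first two dates fall in one week and the last three in the next, so the sequence restarts after the second row.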

SQL - When would an empty OVER clause be used?

OVER() is part of an analytic (window) function and defines the partitions in your recordset. An empty OVER() is just one partition, so the function is applied to the whole dataset.

For example, COUNT(*) OVER() returns, in each row, how many records there are in your dataset.

See http://msdn.microsoft.com/en-us/library/ms189461.aspx
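To see the empty OVER() in action, here is a minimal sketch using Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in for SQL Server. The people table is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample table.
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Ann", 31), ("Bob", 25), ("Cid", 40)],
)

# An empty OVER() treats the whole result set as a single partition,
# so every row carries the total row count without collapsing the rows.
count_rows = conn.execute("""
    SELECT name, COUNT(*) OVER () AS total_rows
    FROM people
    ORDER BY name
""").fetchall()

for row in count_rows:
    print(row)
```

Unlike a plain COUNT(*) with GROUP BY, the detail rows stay in the output and the total is simply repeated on each of them.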

Why do I get a syntax error near my over clause?

over (order by age) is not a modifier of the complete expression. lead(...) over(...) is a single unit, you cannot put random stuff in between. If you want to put the - age part later in the expression, you can, you just have to move it back further than what you tried:

select lead(age, 1) over (order by age) - age as diff
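A quick way to check that placement is to run it. This is a sketch with Python's built-in sqlite3 module (SQLite 3.25 or later) standing in for the original database; the people table and its ages are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data.
conn.execute("CREATE TABLE people (age INTEGER)")
conn.executemany("INSERT INTO people VALUES (?)", [(20,), (25,), (33,)])

# lead(...) over (...) is one unit; the subtraction comes after the whole unit.
diff_rows = conn.execute("""
    SELECT age,
           LEAD(age, 1) OVER (ORDER BY age) - age AS diff
    FROM people
    ORDER BY age
""").fetchall()

for row in diff_rows:
    print(row)
```

The last row has no following row, so LEAD returns NULL there and the difference comes back as None.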

SQL over clause - dividing partition into numbered sub-partitions

If you have SQL Server 2012+, you can use lag() and a windowed running sum to get this:

select *,
       sum(PartNoAdd) over (partition by AccountId order by AsOfDate asc) as PartNo_calc
from
(
    select *,
           case when DebitCredit = lag(DebitCredit, 1) over (partition by AccountId order by AsOfDate asc)
                then 0 else 1 end as PartNoAdd
    from t
) t2
order by AccountId asc, AsOfDate asc

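In the inner query, PartNoAdd checks whether the previous DebitCredit for this account is the same as the current one. If it is, it returns 0 (we should add nothing); otherwise it returns 1. The first row of each account has no previous row, so lag() returns NULL, the comparison fails, and the row gets 1.

Then the outer query keeps a running sum of PartNoAdd for this account, ordered by AsOfDate, so the number goes up by one every time DebitCredit changes.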
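The two steps can be tried out with a small sketch. It uses Python's built-in sqlite3 module (SQLite 3.25 or later) instead of SQL Server, and the rows in t are invented for illustration; only the column names come from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (AccountId INTEGER, AsOfDate TEXT, DebitCredit TEXT)")
# Hypothetical sample rows: account 1 switches D -> C -> D, account 2 switches C -> D.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "2024-01-01", "D"), (1, "2024-01-02", "D"), (1, "2024-01-03", "C"),
     (1, "2024-01-04", "C"), (1, "2024-01-05", "D"),
     (2, "2024-01-01", "C"), (2, "2024-01-02", "D")],
)

part_rows = conn.execute("""
    SELECT AccountId, AsOfDate, DebitCredit, PartNoAdd,
           SUM(PartNoAdd) OVER (PARTITION BY AccountId ORDER BY AsOfDate) AS PartNo_calc
    FROM (
        SELECT t.*,
               -- 0 when the previous row has the same value, otherwise 1
               CASE WHEN DebitCredit = LAG(DebitCredit, 1)
                                       OVER (PARTITION BY AccountId ORDER BY AsOfDate)
                    THEN 0 ELSE 1 END AS PartNoAdd
        FROM t
    ) AS t2
    ORDER BY AccountId, AsOfDate
""").fetchall()

for row in part_rows:
    print(row)
```

Printing PartNoAdd next to PartNo_calc makes it easy to see that each 1 marks the start of a new sub-partition and the running sum turns those marks into numbers.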

OVER clause in Oracle

The OVER clause specifies the partitioning, ordering and window "over which" the analytic function operates.

Example #1: calculate a moving average

AVG(amt) OVER (ORDER BY date ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)

date   amt   avg_amt
=====  ====  =======
1-Jan  10.0  10.5
2-Jan  11.0  17.0
3-Jan  30.0  17.0
4-Jan  10.0  18.0
5-Jan  14.0  12.0

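It operates over a moving window, up to 3 rows wide (the previous, current and next row), with the rows ordered by date. The first and last rows only have one neighbour, so their averages cover 2 rows.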
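The figures in the table can be reproduced with a short sketch. It runs on Python's built-in sqlite3 module (SQLite 3.25 or later) rather than Oracle; the table name sales, the column name dt and the year in the dates are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the five rows from the example.
conn.execute("CREATE TABLE sales (dt TEXT, amt REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("2024-01-01", 10.0), ("2024-01-02", 11.0), ("2024-01-03", 30.0),
     ("2024-01-04", 10.0), ("2024-01-05", 14.0)],
)

# Average over the previous row, the current row and the next row.
moving_avg = conn.execute("""
    SELECT dt, amt,
           AVG(amt) OVER (ORDER BY dt ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS avg_amt
    FROM sales
    ORDER BY dt
""").fetchall()

for row in moving_avg:
    print(row)
```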

Example #2: calculate a running balance

SUM(amt) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)

date   amt   sum_amt
=====  ====  =======
1-Jan  10.0  10.0
2-Jan  11.0  21.0
3-Jan  30.0  51.0
4-Jan  10.0  61.0
5-Jan  14.0  75.0

It operates over a window that includes the current row and all prior rows.

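Note: for an aggregate with an OVER clause specifying a sort ORDER, the default window is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. As long as the date values are unique, this gives the same result as the ROWS version (with RANGE, rows that tie on the sort key are all included together), so the above expression may be simplified to: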

SUM(amt) OVER (ORDER BY date)
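Here is a sketch that runs both forms of the running balance side by side. It uses Python's built-in sqlite3 module (SQLite 3.25 or later) rather than Oracle; the table name sales, the column name dt and the year in the dates are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the five rows from the example.
conn.execute("CREATE TABLE sales (dt TEXT, amt REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("2024-01-01", 10.0), ("2024-01-02", 11.0), ("2024-01-03", 30.0),
     ("2024-01-04", 10.0), ("2024-01-05", 14.0)],
)

running = conn.execute("""
    SELECT dt,
           SUM(amt) OVER (ORDER BY dt
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS explicit_frame,
           SUM(amt) OVER (ORDER BY dt) AS default_frame
    FROM sales
    ORDER BY dt
""").fetchall()

for row in running:
    print(row)
```

With one row per date the two columns agree on every row. If several rows shared a date, the short form would give all of them the same total, which is worth keeping in mind before dropping the explicit frame.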

Example #3: calculate the maximum within each group

MAX(amt) OVER (PARTITION BY dept)

dept  amt   max_amt
====  ====  =======
ACCT  5.0   7.0
ACCT  7.0   7.0
ACCT  6.0   7.0
MRKT  10.0  11.0
MRKT  11.0  11.0
SLES  2.0   2.0

It operates over a window that includes all rows for a particular dept.
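The per-group maximum can be checked the same way. This sketch uses Python's built-in sqlite3 module (SQLite 3.25 or later) rather than Oracle; the table name emp_amounts is an assumption, the rows are the ones from the table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; the data matches the example.
conn.execute("CREATE TABLE emp_amounts (dept TEXT, amt REAL)")
conn.executemany(
    "INSERT INTO emp_amounts VALUES (?, ?)",
    [("ACCT", 5.0), ("ACCT", 7.0), ("ACCT", 6.0),
     ("MRKT", 10.0), ("MRKT", 11.0), ("SLES", 2.0)],
)

# Every row keeps its own amt and also gets the maximum of its dept.
max_rows = conn.execute("""
    SELECT dept, amt,
           MAX(amt) OVER (PARTITION BY dept) AS max_amt
    FROM emp_amounts
    ORDER BY dept, amt
""").fetchall()

for row in max_rows:
    print(row)
```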

SQL Fiddle: http://sqlfiddle.com/#!4/9eecb7d/122

T-SQL OVER clause that has no PARTITION BY but has an ORDER BY clause

The first one is not working for me (it returns Msg 8120: Column 'Salaries.employeeID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause) until I add GROUP BY employeeID:

SELECT 
employeeID as ID,
RANK() OVER (ORDER BY AVG (Salary) DESC) AS Value
FROM Salaries
GROUP BY employeeID

Perhaps, for better understanding, it can be rewritten equivalently as:

;with cte as (
SELECT employeeID, AVG (Salary) as AvgSalary
FROM Salaries
GROUP BY employeeID
)
select employeeID as ID
, RANK() OVER (ORDER BY AvgSalary DESC) as Value
--, AvgSalary
from cte

In this case, the average salary per employee is calculated in the CTE, and then the query is extended with the ranking column Value. Adding partition by employeeID to the OVER clause:

;with cte as (
SELECT employeeID, AVG (Salary) as AvgSalary
FROM Salaries
GROUP BY employeeID
)
select employeeID as ID
, RANK() OVER (partition by employeeID ORDER BY AvgSalary DESC) as Value
--, AvgSalary
from cte

will lead to Value = 1 for every row in the result set (which is probably not what you are trying to achieve), because rank() restarts numbering at 1 for each distinct employeeID, and employeeID is distinct in every row, since the data was aggregated by this column prior to ranking.
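Both variants of the CTE query can be compared with a small sketch. It uses Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in for SQL Server, and the salary figures are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Salaries (employeeID INTEGER, Salary REAL)")
# Hypothetical data: averages are 150 (emp 1), 300 (emp 2) and 100 (emp 3).
conn.executemany(
    "INSERT INTO Salaries VALUES (?, ?)",
    [(1, 100), (1, 200), (2, 300), (3, 50), (3, 150)],
)

cte = """
    WITH cte AS (
        SELECT employeeID, AVG(Salary) AS AvgSalary
        FROM Salaries
        GROUP BY employeeID
    )
"""

# Rank over the whole aggregated result set.
ranked = conn.execute(cte + """
    SELECT employeeID AS ID,
           RANK() OVER (ORDER BY AvgSalary DESC) AS Value
    FROM cte
    ORDER BY employeeID
""").fetchall()

# Rank restarted per employeeID: each partition holds exactly one row.
ranked_partitioned = conn.execute(cte + """
    SELECT employeeID AS ID,
           RANK() OVER (PARTITION BY employeeID ORDER BY AvgSalary DESC) AS Value
    FROM cte
    ORDER BY employeeID
""").fetchall()

print(ranked)
print(ranked_partitioned)
```

The first result ranks employee 2 first, while the partitioned version returns 1 for everybody.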

T-SQL Statement OVER Clause Ranking Functions

Here I have answered a similar problem. Below is the adaptation to your conditions:

-- Preparation
declare @t table (
Field1 int,
Field2 char,
ValidFrom date
);

insert into @t (Field1, Field2, ValidFrom)
values
(200, 'a', '19990101'),
(200, 'b', '20150101'),
(210, 'c', '20150101'),
(210, 'c', '20100101');

-- The query
with cte as (
select t.*,
lag(t.Field2) over(partition by t.Field1 order by t.ValidFrom) as [Prev2]
from @t t
)
select c.Field1, c.Field2, c.ValidFrom,
sum(case when c.Prev2 = c.Field2 then 0 else 1 end)
over(partition by c.Field1 order by c.ValidFrom) as [ExtraColumn]
from cte c;

I only hope you aren't going to run this against millions of records, because two partitioning passes won't be easy on the CPU and memory. Oh yes, and you need SQL Server 2012 or later for this to work.
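For anyone without a SQL Server instance at hand, the same query can be tried as a rough sketch on Python's built-in sqlite3 module (SQLite 3.25 or later). An ordinary table named t replaces the table variable @t, which is an adaptation for SQLite; the rows are the ones inserted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A regular table stands in for the T-SQL table variable @t.
conn.execute("CREATE TABLE t (Field1 INTEGER, Field2 TEXT, ValidFrom TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(200, "a", "19990101"), (200, "b", "20150101"),
     (210, "c", "20150101"), (210, "c", "20100101")],
)

extra_rows = conn.execute("""
    WITH cte AS (
        SELECT t.*,
               LAG(t.Field2) OVER (PARTITION BY t.Field1 ORDER BY t.ValidFrom) AS Prev2
        FROM t
    )
    SELECT c.Field1, c.Field2, c.ValidFrom,
           -- add 1 whenever Field2 differs from the previous row (or there is none)
           SUM(CASE WHEN c.Prev2 = c.Field2 THEN 0 ELSE 1 END)
               OVER (PARTITION BY c.Field1 ORDER BY c.ValidFrom) AS ExtraColumn
    FROM cte c
    ORDER BY c.Field1, c.ValidFrom
""").fetchall()

for row in extra_rows:
    print(row)
```

Field1 = 200 changes from 'a' to 'b', so its rows get 1 and 2; Field1 = 210 stays at 'c', so both of its rows get 1.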


