Partition by with and Without Keep in Oracle

PARTITION BY with and without KEEP in Oracle

In your example, there's no difference, because your aggregate is on the same column that you are sorting on. The real point/power of "KEEP" is when you aggregate and sort on different columns. For example (borrowing the "test" table from the other answer)...

SELECT deptno,  min(name) keep ( dense_rank first order by sal desc, name  ) ,
max(sal)
FROM test
group by deptno

;

This query gets the name of person with the highest salary in each department. Consider the alternative without a "KEEP" clause:

SELECT deptno, name, sal
FROM test t
WHERE not exists ( SELECT 'person with higher salary in same department'
FROM test t2
WHERE t2.deptno = t.deptno
and (( t2.sal > t.sal )
OR ( t2.sal = t.sal AND t2.name < t.name ) ) )

The KEEP clause is easier and more efficient (only 3 consistent gets vs 34 gets for the alternative, in this simple example).

MAX() OVER PARTITION BY in Oracle SQL

Use window function ROW_NUMBER() OVER (PARTITION BY receipt_item ORDER BY receipt_date DESC) to assign a sequence number to each row. The row with the most recent receipt_date for a receipt_item will be numbered as 1.

WITH
-- various other subqueries above...

AllData AS
(
SELECT VEND_NUM, VEND_NAME, RECEIPT_NUM, RECEIPT_ITEM, RECEIPT_DATE,
ROW_NUMBER() OVER (PARTITION BY RECEIPT_ITEM ORDER BY RECEIPT_DATE DESC ) AS RN
FROM tblVend
INNER JOIN tblReceipt ON VEND_NUM = RECEIPT_VEND_NUM
WHERE
VEND_NUM IN ( '100','200') AND RECEIPT_DATE >= '01-Jan-2017'
)
SELECT VEND_NUM, VEND_NAME, RECEIPT_NUM, RECEIPT_ITEM, RECEIPT_DATE
FROM AllData WHERE RN = 1

Partition by with condition statement

I think you want:

select t.*,
count(*) over (partition by item) as num_stores,
sum(promo_flg) over (partition by item) as num_promo_1
from t;

If you actually need distinct counts:

select t.*,
count(distinct store) over (partition by item) as num_stores,
count(distinct case when promo_flg = 1 then store end) over (partition by item) as num_promo_1
from t;

Here is a db<>fiddle. The fiddle uses Oracle because it supports COUNT(DISTINCT) as a window function.

Here is an alternative, if the window functions don't work:

select *
from t join
(select item, count(distinct store) as num_stores, count(distinct case when promo_flg = 1 then store end) as num_stores_promo
from t
group by item
) tt
using (item);

Oracle SQL: Further sort PARTITION BY groups based on first row in each partition

Luckily, you only need to add an analytic max() to the order by clause. You don't need to do anything else.

Suppose "current query" is your existing query, not ordered yet in any way (no order by clause). Add the following at the very end:

... existing query ...
order by max(timetocomplete) over (partition by itemkey) desc,
itemkey,
timetocomplete desc
;

Note that you do not need to add the analytic function to the select clause. The SQL standard says you do; Oracle syntax says you don't. Oracle is taking care of the small additional steps for us, behind the scenes.

This computes the max time to complete for each key. It orders by that max first. In the case of ties (two or more different keys with the same max time to complete), it further orders by key first, and then within each key, by time to complete (descending).

Oracle 'Partition By' and 'Row_Number' keyword

PARTITION BY segregate sets, this enables you to be able to work(ROW_NUMBER(),COUNT(),SUM(),etc) on related set independently.

In your query, the related set comprised of rows with similar cdt.country_code, cdt.account, cdt.currency. When you partition on those columns and you apply ROW_NUMBER on them. Those other columns on those combination/set will receive sequential number from ROW_NUMBER

But that query is funny, if your partition by some unique data and you put a row_number on it, it will just produce same number. It's like you do an ORDER BY on a partition that is guaranteed to be unique. Example, think of GUID as unique combination of cdt.country_code, cdt.account, cdt.currency

newid() produces GUID, so what shall you expect by this expression?

select
hi,ho,
row_number() over(partition by newid() order by hi,ho)
from tbl;

...Right, all the partitioned(none was partitioned, every row is partitioned in their own row) rows' row_numbers are all set to 1

Basically, you should partition on non-unique columns. ORDER BY on OVER needed the PARTITION BY to have a non-unique combination, otherwise all row_numbers will become 1

An example, this is your data:

create table tbl(hi varchar, ho varchar);

insert into tbl values
('A','X'),
('A','Y'),
('A','Z'),
('B','W'),
('B','W'),
('C','L'),
('C','L');

Then this is analogous to your query:

select
hi,ho,
row_number() over(partition by hi,ho order by hi,ho)
from tbl;

What will be the output of that?

HI  HO  COLUMN_2
A X 1
A Y 1
A Z 1
B W 1
B W 2
C L 1
C L 2

You see thee combination of HI HO? The first three rows has unique combination, hence they are set to 1, the B rows has same W, hence different ROW_NUMBERS, likewise with HI C rows.

Now, why is the ORDER BY needed there? If the previous developer merely want to put a row_number on similar data (e.g. HI B, all data are B-W, B-W), he can just do this:

select
hi,ho,
row_number() over(partition by hi,ho)
from tbl;

But alas, Oracle(and Sql Server too) doesn't allow partition with no ORDER BY; whereas in Postgresql, ORDER BY on PARTITION is optional: http://www.sqlfiddle.com/#!1/27821/1

select
hi,ho,
row_number() over(partition by hi,ho)
from tbl;

Your ORDER BY on your partition look a bit redundant, not because of the previous developer's fault, some database just don't allow PARTITION with no ORDER BY, he might not able find a good candidate column to sort on. If both PARTITION BY columns and ORDER BY columns are the same just remove the ORDER BY, but since some database don't allow it, you can just do this:

SELECT cdt.*,
ROW_NUMBER ()
OVER (PARTITION BY cdt.country_code, cdt.account, cdt.currency
ORDER BY newid())
seq_no
FROM CUSTOMER_DETAILS cdt

You cannot find a good column to use for sorting similar data? You might as well sort on random, the partitioned data have the same values anyway. You can use GUID for example(you use newid() for SQL Server). So that has the same output made by previous developer, it's unfortunate that some database doesn't allow PARTITION with no ORDER BY

Though really, it eludes me and I cannot find a good reason to put a number on the same combinations (B-W, B-W in example above). It's giving the impression of database having redundant data. Somehow reminded me of this: How to get one unique record from the same list of records from table? No Unique constraint in the table

It really looks arcane seeing a PARTITION BY with same combination of columns with ORDER BY, can not easily infer the code's intent.

Live test: http://www.sqlfiddle.com/#!3/27821/6


But as dbaseman have noticed also, it's useless to partition and order on same columns.

You have a set of data like this:

create table tbl(hi varchar, ho varchar);

insert into tbl values
('A','X'),
('A','X'),
('A','X'),
('B','Y'),
('B','Y'),
('C','Z'),
('C','Z');

Then you PARTITION BY hi,ho; and then you ORDER BY hi,ho. There's no sense numbering similar data :-) http://www.sqlfiddle.com/#!3/29ab8/3

select
hi,ho,
row_number() over(partition by hi,ho order by hi,ho) as nr
from tbl;

Output:

HI  HO  ROW_QUERY_A
A X 1
A X 2
A X 3
B Y 1
B Y 2
C Z 1
C Z 2

See? Why need to put row numbers on same combination? What you will analyze on triple A,X, on double B,Y, on double C,Z? :-)


You just need to use PARTITION on non-unique column, then you sort on non-unique column(s)'s unique-ing column. Example will make it more clear:

create table tbl(hi varchar, ho varchar);

insert into tbl values
('A','D'),
('A','E'),
('A','F'),
('B','F'),
('B','E'),
('C','E'),
('C','D');

select
hi,ho,
row_number() over(partition by hi order by ho) as nr
from tbl;

PARTITION BY hi operates on non unique column, then on each partitioned column, you order on its unique column(ho), ORDER BY ho

Output:

HI  HO  NR
A D 1
A E 2
A F 3
B E 1
B F 2
C D 1
C E 2

That data set makes more sense

Live test: http://www.sqlfiddle.com/#!3/d0b44/1

And this is similar to your query with same columns on both PARTITION BY and ORDER BY:

select
hi,ho,
row_number() over(partition by hi,ho order by hi,ho) as nr
from tbl;

And this is the ouput:

HI  HO  NR
A D 1
A E 1
A F 1
B E 1
B F 1
C D 1
C E 1

See? no sense?

Live test: http://www.sqlfiddle.com/#!3/d0b44/3


Finally this might be the right query:

SELECT cdt.*,
ROW_NUMBER ()
OVER (PARTITION BY cdt.country_code, cdt.account -- removed: cdt.currency
ORDER BY
-- removed: cdt.country_code, cdt.account,
cdt.currency) -- keep
seq_no
FROM CUSTOMER_DETAILS cdt

What are difference between KEEP and OVER in Oracle analytic SQL

Keep and OVER clause can be used in same query or individually. While OVER clause is used to work over analytical functions KEEP is used with DENSE_RANK FIRST to KEEP the value of FIRST row in DENSE_RANK.

Please refer to below post for detailed explanation.

POST

Oracle SQL Return First & Last Value From Different Columns By Partition

Use first/last option to find statuses. The rest is classic aggregation:

select employee, min(start_), sum(duration),
max(init_status) keep (dense_rank first order by start_),
max(fin_status) keep (dense_rank last order by start_)
from test_data t
group by employee, item_id
order by employee, item_id;

start is a reserved word, so I used start_ for my test.



Related Topics



Leave a reply



Submit