Oracle 'Partition By' and 'Row_Number' Keyword

Oracle 'Partition By' and 'Row_Number' keyword

PARTITION BY segregate sets, this enables you to be able to work(ROW_NUMBER(),COUNT(),SUM(),etc) on related set independently.

In your query, the related set comprised of rows with similar cdt.country_code, cdt.account, cdt.currency. When you partition on those columns and you apply ROW_NUMBER on them. Those other columns on those combination/set will receive sequential number from ROW_NUMBER

But that query is funny, if your partition by some unique data and you put a row_number on it, it will just produce same number. It's like you do an ORDER BY on a partition that is guaranteed to be unique. Example, think of GUID as unique combination of cdt.country_code, cdt.account, cdt.currency

newid() produces GUID, so what shall you expect by this expression?

select
   hi,ho,
   row_number() over(partition by newid() order by hi,ho)
from tbl;

...Right, all the partitioned(none was partitioned, every row is partitioned in their own row) rows' row_numbers are all set to 1

Basically, you should partition on non-unique columns. ORDER BY on OVER needed the PARTITION BY to have a non-unique combination, otherwise all row_numbers will become 1

An example, this is your data:

create table tbl(hi varchar, ho varchar);

insert into tbl values
('A','X'),
('A','Y'),
('A','Z'),
('B','W'),
('B','W'),
('C','L'),
('C','L');

Then this is analogous to your query:

select
   hi,ho,
   row_number() over(partition by hi,ho order by hi,ho)
from tbl;

What will be the output of that?

HI  HO  COLUMN_2
A   X   1
A   Y   1
A   Z   1
B   W   1
B   W   2
C   L   1
C   L   2

You see thee combination of HI HO? The first three rows has unique combination, hence they are set to 1, the B rows has same W, hence different ROW_NUMBERS, likewise with HI C rows.

Now, why is the ORDER BY needed there? If the previous developer merely want to put a row_number on similar data (e.g. HI B, all data are B-W, B-W), he can just do this:

select
   hi,ho,
   row_number() over(partition by hi,ho)
from tbl;

But alas, Oracle(and Sql Server too) doesn't allow partition with no ORDER BY; whereas in Postgresql, ORDER BY on PARTITION is optional: http://www.sqlfiddle.com/#!1/27821/1

select
   hi,ho,
   row_number() over(partition by hi,ho)
from tbl;

Your ORDER BY on your partition look a bit redundant, not because of the previous developer's fault, some database just don't allow PARTITION with no ORDER BY, he might not able find a good candidate column to sort on. If both PARTITION BY columns and ORDER BY columns are the same just remove the ORDER BY, but since some database don't allow it, you can just do this:

SELECT cdt.*,
        ROW_NUMBER ()
        OVER (PARTITION BY cdt.country_code, cdt.account, cdt.currency
              ORDER BY newid())
           seq_no
   FROM CUSTOMER_DETAILS cdt

You cannot find a good column to use for sorting similar data? You might as well sort on random, the partitioned data have the same values anyway. You can use GUID for example(you use newid() for SQL Server). So that has the same output made by previous developer, it's unfortunate that some database doesn't allow PARTITION with no ORDER BY

Though really, it eludes me and I cannot find a good reason to put a number on the same combinations (B-W, B-W in example above). It's giving the impression of database having redundant data. Somehow reminded me of this: How to get one unique record from the same list of records from table? No Unique constraint in the table

It really looks arcane seeing a PARTITION BY with same combination of columns with ORDER BY, can not easily infer the code's intent.

Live test: http://www.sqlfiddle.com/#!3/27821/6

But as dbaseman have noticed also, it's useless to partition and order on same columns.

You have a set of data like this:

create table tbl(hi varchar, ho varchar);

insert into tbl values
('A','X'),
('A','X'),
('A','X'),
('B','Y'),
('B','Y'),
('C','Z'),
('C','Z');

Then you PARTITION BY hi,ho; and then you ORDER BY hi,ho. There's no sense numbering similar data :-) http://www.sqlfiddle.com/#!3/29ab8/3

select
   hi,ho,
   row_number() over(partition by hi,ho order by hi,ho) as nr
from tbl;

Output:

HI  HO  ROW_QUERY_A
A   X   1
A   X   2
A   X   3
B   Y   1
B   Y   2
C   Z   1
C   Z   2

See? Why need to put row numbers on same combination? What you will analyze on triple A,X, on double B,Y, on double C,Z? :-)

You just need to use PARTITION on non-unique column, then you sort on non-unique column(s)'s unique-ing column. Example will make it more clear:

create table tbl(hi varchar, ho varchar);

insert into tbl values
('A','D'),
('A','E'),
('A','F'),
('B','F'),
('B','E'),
('C','E'),
('C','D');

select
   hi,ho,
   row_number() over(partition by hi order by ho) as nr
from tbl;

PARTITION BY hi operates on non unique column, then on each partitioned column, you order on its unique column(ho), ORDER BY ho

Output:

HI  HO  NR
A   D   1
A   E   2
A   F   3
B   E   1
B   F   2
C   D   1
C   E   2

That data set makes more sense

Live test: http://www.sqlfiddle.com/#!3/d0b44/1

And this is similar to your query with same columns on both PARTITION BY and ORDER BY:

select
   hi,ho,
   row_number() over(partition by hi,ho order by hi,ho) as nr
from tbl;

And this is the ouput:

HI  HO  NR
A   D   1
A   E   1
A   F   1
B   E   1
B   F   1
C   D   1
C   E   1

See? no sense?

Live test: http://www.sqlfiddle.com/#!3/d0b44/3

Finally this might be the right query:

SELECT cdt.*,
     ROW_NUMBER ()
     OVER (PARTITION BY cdt.country_code, cdt.account -- removed: cdt.currency
           ORDER BY 
               -- removed: cdt.country_code, cdt.account, 
               cdt.currency) -- keep
        seq_no
FROM CUSTOMER_DETAILS cdt

Oracle Partition By Keyword

The PARTITION BY clause sets the range of records that will be used for each "GROUP" within the OVER clause.

In your example SQL, DEPT_COUNT will return the number of employees within that department for every employee record. (It is as if you're de-nomalising the emp table; you still return every record in the emp table.)

emp_no  dept_no  DEPT_COUNT
1       10       3
2       10       3
3       10       3 <- three because there are three "dept_no = 10" records
4       20       2
5       20       2 <- two because there are two "dept_no = 20" records

If there was another column (e.g., state) then you could count how many departments in that State.

It is like getting the results of a GROUP BY (SUM, AVG, etc.) without the aggregating the result set (i.e. removing matching records).

It is useful when you use the LAST OVER or MIN OVER functions to get, for example, the lowest and highest salary in the department and then use that in a calculation against this records salary without a sub select, which is much faster.

Read the linked AskTom article for further details.

Error on Partition over row number

Remove as rows. It is not proper syntax for the table/query alias. It is syntax for column alias.

select *
from (
   select T.*, 
          row_number() over (partition by DIS_COL order by COL_2) as row_number --ORDER BY FIELD DETERMINES WHICH ROW IS THE FIRST ROW AND THUS WHICH ONE IS SELECTED.
   from MY_TABLE t
   ) 
where row_number = 1
AND (CRITERIA_COL = 'CRIT_1'
OR CRITERIA_COL_2 = 'CRIT_2');

ROW_NUMBER over PARTITION BY restart row counter between breaks

You can use MATCH_RECOGNIZE:

SELECT user_id,
       activity_date,
       foc_id,
       ROW_NUMBER() OVER ( PARTITION BY user_id, mno ORDER BY activity_date ) AS seq_num
FROM   table_name
MATCH_RECOGNIZE (
  PARTITION BY user_id
  ORDER     BY activity_date
  MEASURES
    MATCH_NUMBER() AS mno
  ALL ROWS PER MATCH
  PATTERN ( same_foc_id* last_row  )
  DEFINE
    same_foc_id AS FIRST( foc_id ) = NEXT( foc_id )
)

or, multiple ROW_NUMBERs:

SELECT user_id,
       activity_date,
       foc_id,
       ROW_NUMBER() OVER ( PARTITION BY user_id, foc_id, grp ORDER BY activity_date ) AS seq_num
FROM   (
  SELECT user_id,
         activity_date,
         foc_id,
         ROW_NUMBER() OVER ( PARTITION BY user_id ORDER BY activity_date )
           - ROW_NUMBER() OVER ( PARTITION BY user_id, foc_id ORDER BY activity_date ) AS grp
  FROM   table_name
)
ORDER BY user_id, activity_date

Which, for the sample data:

CREATE TABLE table_name ( user_id, activity_date, foc_id ) AS
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '14:20:34' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '14:21:23' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '14:21:23' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '14:21:23' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:19:39' HOUR TO SECOND, 'B410' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:19:44' HOUR TO SECOND, 'B410' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:19:58' HOUR TO SECOND, 'B410' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:20:11' HOUR TO SECOND, 'B410' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:22:16' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:22:33' HOUR TO SECOND, 'A240' FROM DUAL;

Both output:


USER_ID | ACTIVITY_DATE       | FOC_ID | SEQ_NUM
:------ | :------------------ | :----- | ------:
UVAC3   | 2020-11-04 14:20:34 | A240   |       1
UVAC3   | 2020-11-04 14:21:23 | A240   |       2
UVAC3   | 2020-11-04 14:21:23 | A240   |       3
UVAC3   | 2020-11-04 14:21:23 | A240   |       4
UVAC3   | 2020-11-04 15:19:39 | B410   |       1
UVAC3   | 2020-11-04 15:19:44 | B410   |       2
UVAC3   | 2020-11-04 15:19:58 | B410   |       3
UVAC3   | 2020-11-04 15:20:11 | B410   |       4
UVAC3   | 2020-11-04 15:22:16 | A240   |       1
UVAC3   | 2020-11-04 15:22:33 | A240   |       2

db<>fiddle here

Oracle 12 - Reset a sequence Column with row_number() over Partition

Use a MERGE statement and correlate on the ROWID pseudo-column to efficiently self-join:

MERGE INTO table_name dst
USING (
  SELECT ROWID AS rid,
         ROW_NUMBER() OVER (
           PARTITION BY product
           ORDER BY ROWNUM  -- or ORDER BY seq
         ) AS newSeq
  FROM   table_name
) src
ON ( dst.ROWID = src.rid )
WHEN MATCHED THEN
  UPDATE SET seq = src.newSeq;

(Note: You do NOT need an ORDER BY expression appended to the SELECT statement as it will have no practical effect as the order of the source query is irrelevant during the MERGE and could make the SELECT statement less efficient if the ordering is applied and the SQL engine does unnecessary work.)

Which, for the sample data:

CREATE TABLE table_name ( product, "GROUP", seq ) AS
SELECT 10,  5, 1 FROM DUAL UNION ALL
SELECT 10,  1, 5 FROM DUAL UNION ALL
SELECT 11,  2, 3 FROM DUAL UNION ALL
SELECT 11,  4, 4 FROM DUAL UNION ALL
SELECT 11, 24, 5 FROM DUAL

Then results in:

SELECT * FROM table_name

PRODUCT GROUP SEQ
10 5 1
10 1 2
11 2 1
11 4 2
11 24 3

Related Topics

Which SQL Statement Is Faster? (Having VS. Where...)
Varchar as Foreign Key/Primary Key in Database Good or Bad
Should I Use the Cascade Delete Rule
SQL Select with Column Name Like
Update Rows in One Table with Data from Another Table Based on One Column in Each Being Equal
Is There Any Query for Cassandra as Same as SQL:Like Condition
Sql: Select Rows with a Column Value That Occurs at Least N Times
Temporal Database Design, with a Twist (Live VS Draft Rows)
How to Increment Integer Columns Value by 1 in SQL
Django In/Not in Query
How to Delete from a Table Where Id Is in a List of Ids
SQL Row_Number() Function in Where Clause
Why & When Should I Use Sparse Column? (SQL Server 2008)
Oracle Date "Between" Query
Conditional SQL Count
How to Insert a String Which Contains an "&"
What's the Correct Name for an "Association Table" (A Many-To-Many Relationship)
T-SQL Subquery Max(Date) and Joins

Leave a reply

Submit