What's the Difference Between Rank() and Dense_Rank() Functions in Oracle

What's the difference between the RANK() and DENSE_RANK() functions in Oracle?

RANK() gives you the ranking within your ordered partition. Ties are assigned the same rank, and the ranks those tied rows would otherwise have occupied are skipped. So, if 3 items tie at rank 2, all three get rank 2 and the next item is ranked 5.

DENSE_RANK() again gives you the ranking within your ordered partition, but the ranks are consecutive. No ranks are skipped if there are ranks with multiple items.
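To make the two numbering rules concrete, here is a minimal Python sketch that mimics them on an already sorted list. The function names and the sample values are made up for illustration; this is not how Oracle implements the functions, only a model of the results described above.

```python
def rank(sorted_values):
    """Model of RANK(): ties share a rank, and the next rank skips ahead."""
    ranks = []
    for position, value in enumerate(sorted_values, start=1):
        if ranks and value == sorted_values[position - 2]:
            ranks.append(ranks[-1])   # tie: reuse the previous rank
        else:
            ranks.append(position)    # new value: rank equals the row position
    return ranks


def dense_rank(sorted_values):
    """Model of DENSE_RANK(): ties share a rank, and the next rank is previous + 1."""
    ranks = []
    for index, value in enumerate(sorted_values):
        if index > 0 and value == sorted_values[index - 1]:
            ranks.append(ranks[-1])                       # tie: reuse the previous rank
        else:
            ranks.append(ranks[-1] + 1 if ranks else 1)   # new value: no gap
    return ranks


scores = [10, 20, 20, 20, 30]   # hypothetical data: three items tie at rank 2
print(rank(scores))         # [1, 2, 2, 2, 5]
print(dense_rank(scores))   # [1, 2, 2, 2, 3]
```

The only difference is the branch taken when the value changes: rank jumps to the row position, dense_rank just adds one.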

As for nulls, it depends on the ORDER BY clause. Here is a simple test script you can play with to see what happens:

with q as (
  select 10 deptno, 'rrr' empname, 10000.00 sal from dual union all
  select 11, 'nnn', 20000.00 from dual union all
  select 11, 'mmm', 5000.00 from dual union all
  select 12, 'kkk', 30000 from dual union all
  select 10, 'fff', 40000 from dual union all
  select 10, 'ddd', 40000 from dual union all
  select 10, 'bbb', 50000 from dual union all
  select 10, 'xxx', null from dual union all
  select 10, 'ccc', 50000 from dual
)
select empname, deptno, sal
     , rank()       over (partition by deptno order by sal nulls first) r
     , dense_rank() over (partition by deptno order by sal nulls first) dr1
     , dense_rank() over (partition by deptno order by sal nulls last)  dr2
from q;

EMP     DEPTNO        SAL          R        DR1        DR2
--- ---------- ---------- ---------- ---------- ----------
xxx         10                     1          1          4
rrr         10      10000          2          2          1
fff         10      40000          3          3          2
ddd         10      40000          3          3          2
ccc         10      50000          5          4          3
bbb         10      50000          5          4          3
mmm         11       5000          1          1          1
nnn         11      20000          2          2          2
kkk         12      30000          1          1          1

9 rows selected.
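If you want to check the DR1 and DR2 columns for department 10 by hand, the following Python sketch models how null placement shifts a dense rank. It assumes None stands in for SQL NULL and that all nulls form a single tie group; the function name is made up for this example.

```python
def dense_rank_by_value(values, nulls_first):
    """Map each distinct value to its ascending dense rank; None models SQL NULL."""
    distinct = sorted({v for v in values if v is not None})
    if any(v is None for v in values):
        # Place the single NULL group at the front or the back of the ordering.
        ordered = [None] + distinct if nulls_first else distinct + [None]
    else:
        ordered = distinct
    return {value: position for position, value in enumerate(ordered, start=1)}


dept10 = [10000, 40000, 40000, 50000, None, 50000]   # salaries of deptno 10 from the script

dr1 = dense_rank_by_value(dept10, nulls_first=True)
dr2 = dense_rank_by_value(dept10, nulls_first=False)

print(dr1[None], dr1[10000], dr1[40000], dr1[50000])   # 1 2 3 4
print(dr2[None], dr2[10000], dr2[40000], dr2[50000])   # 4 1 2 3
```

With nulls first, the null row takes rank 1 and pushes every other rank up by one; with nulls last, it simply takes the final rank.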


Oracle rewrite query using rank or dense_rank

You may use RANK as follows:

WITH cte AS (
    SELECT e.employee_id, e.first_name, e.last_name, e.department_id,
           d.department_name, e.sal,
           RANK() OVER (PARTITION BY e.department_id ORDER BY e.sal DESC) rnk
    FROM employees e
    INNER JOIN dept d ON e.department_id = d.department_id
)
SELECT employee_id, first_name, last_name, department_id, department_name, sal
FROM cte
WHERE rnk = 1;

Use RANK or DENSE_RANK along with an aggregate function

You are looking for the max score and a single row, so use row_number():

select score, row_id, name
from (select t.*, row_number() over (order by score desc, row_id) rn from t)
where rn = 1


You can use rank and dense_rank in your example, but they can return more than one row, for instance when you add the row (0.95, 501, 'PQR') to your data.


keep (dense_rank ...) is typically used when the value you want to return is not the value you order by, for instance when looking for the salary of the employee who has worked the longest:

max(salary) keep (dense_rank first order by sysdate - hiredate desc)

max in this case means that if two or more employees have worked the longest, with exactly the same number of days, then we take the highest salary among them.

max(salary) 
keep (dense_rank first order by sysdate - hiredate desc)
over (partition by deptno)

This is the same as above, but the salary of the longest-serving employee is computed for each department separately. You can even use an empty over() to show the salary of the longest-serving employee in a separate column alongside other data such as name, salary, and hire_date.
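As a rough model of what the two expressions compute, here is a Python sketch. The sample rows, names, and salaries are invented, and "longest serving" is modelled as "earliest hire date", which is equivalent to ordering by sysdate - hiredate desc as long as every row is compared against the same sysdate.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows: (deptno, name, salary, hiredate).
employees = [
    (10, "ann", 4000, date(2010, 3, 1)),
    (10, "bob", 5500, date(2010, 3, 1)),   # ties with ann for the longest tenure
    (10, "cid", 9000, date(2018, 7, 15)),
    (20, "dee", 7000, date(2015, 1, 10)),
    (20, "eli", 8000, date(2016, 6, 5)),
]


def max_salary_of_longest_serving(rows):
    """Model of max(salary) keep (dense_rank first order by tenure desc)."""
    earliest = min(hired for _, _, _, hired in rows)
    # Only rows tied for first place are kept; max() then resolves the tie.
    return max(salary for _, _, salary, hired in rows if hired == earliest)


def per_department(rows):
    """Model of the same expression with over (partition by deptno)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)
    return {deptno: max_salary_of_longest_serving(group)
            for deptno, group in groups.items()}


print(max_salary_of_longest_serving(employees))   # 5500
print(per_department(employees))                  # {10: 5500, 20: 7000}
```

Note that cid's 9000 is never returned: the highest salary overall is irrelevant, only the salaries of the rows ranked first by tenure take part in the max.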

Why is dense_rank() function assigning same rank to different records?

Dense_rank and rank will both keep returning the same number as long as the value in their order by clause remains the same.

The difference between dense_rank and rank is what happens once the value in the order by clause changes: dense_rank returns the next consecutive number, while rank returns the row's position in the partition, that is, one more than the number of rows that precede it.

Row_number will return a different number for each row in the partition, regardless of the uniqueness of the order by column within the partition.
If the order by values aren't unique, the numbers that row_number assigns among the tied rows are arbitrary.
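The arbitrariness is easy to reproduce outside the database. In this Python sketch (hypothetical rows and function names), the arrival order of the rows stands in for whatever order the engine happens to read them in; a database makes no promise about that order, which is why tied rows can swap numbers between runs.

```python
def row_number(rows, sort_key):
    """Number rows 1..n after sorting; rows that tie on sort_key keep arrival order."""
    ordered = sorted(rows, key=sort_key)   # Python's sort is stable
    return {row: number for number, row in enumerate(ordered, start=1)}


def by_score(row):
    return row[1]


def by_score_then_name(row):
    return (row[1], row[0])   # name acts as a tie-breaker


# Hypothetical (name, score) rows; the same rows arrive in two different orders.
arrival_a = [("x", 5), ("y", 5), ("z", 7)]
arrival_b = [("y", 5), ("x", 5), ("z", 7)]

print(row_number(arrival_a, by_score)[("x", 5)])             # 1
print(row_number(arrival_b, by_score)[("x", 5)])             # 2
print(row_number(arrival_a, by_score_then_name)[("x", 5)])   # 1
print(row_number(arrival_b, by_score_then_name)[("x", 5)])   # 1
```

Once the sort key is unique, the numbering no longer depends on arrival order.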


Window Function - Dense_Rank and Row_Number difference

In your query, the difference between using dense_rank() and row_number() is that the former allows top ties, while the latter does not.

So if two (or more) records share the same earliest transaction_settled_at for a given signup_id, then the condition dense_rank() ... = 1 will keep them all, while row_number() ... = 1 will select an undefined record out of the tied ones.

If there is no risk of ties, both functions will, in your context, produce the same resulting dataset.

To reduce the possibility of ties, you can also add additional sorting criteria to the order by clause of the window function:

dense_rank() over (
    partition by signup_id
    order by transaction_settled_at, some_other_column desc, some_more_column
)
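To illustrate the "keeps top ties" versus "keeps one row" behaviour of the = 1 filter, here is a small Python sketch. The transaction rows and function names are hypothetical, and the sketch models only the filter result, not the query itself.

```python
from datetime import date

# Hypothetical rows: (signup_id, transaction_id, transaction_settled_at).
transactions = [
    (1, "t1", date(2024, 1, 5)),
    (1, "t2", date(2024, 1, 5)),   # ties with t1 for the earliest date
    (1, "t3", date(2024, 2, 9)),
    (2, "t4", date(2024, 3, 1)),
]


def earliest_keep_ties(rows):
    """Like dense_rank() ... = 1: keep every row tied for the earliest date."""
    earliest = {}
    for signup_id, _, settled_at in rows:
        if signup_id not in earliest or settled_at < earliest[signup_id]:
            earliest[signup_id] = settled_at
    return [row for row in rows if row[2] == earliest[row[0]]]


def earliest_one_row(rows):
    """Like row_number() ... = 1: keep exactly one row per signup_id."""
    chosen = {}
    for row in sorted(rows, key=lambda r: (r[0], r[2])):
        # The first row seen per signup_id wins; among ties, which one is
        # "first" depends on input order, so the choice is not well defined.
        chosen.setdefault(row[0], row)
    return list(chosen.values())


print(len(earliest_keep_ties(transactions)))   # 3
print(len(earliest_one_row(transactions)))     # 2
```

Adding more columns to the sort key narrows the tie groups, which is what the extra order by criteria above are for.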

DENSE_RANK() - What's wrong here?

The problem here is your misunderstanding of ranking functions. From the DENSE_RANK (Transact-SQL) documentation:

Remarks

If two or more rows have the same rank value in the same partition, each of those rows will receive the same rank.

The same is true for RANK. This means that, for your data, as both name and price have the same values, the rows are given the same rank; in this case 1. The difference between the 2 functions is how they handle the rows that follow tied rows. DENSE_RANK will increment sequentially for each "new" rank, whereas RANK will skip rankings where there are tied rows, i.e. 1,1,2,3 and 1,1,3,4 respectively.

What you clearly want here, however, is ROW_NUMBER. I do note, however, that partitioning and ordering by the same columns is normally a flaw as well, as whichever row is numbered first is arbitrary, and that arbitrary numbering might not be the same each time. Ideally you should be ordering by another column that provides an explicit order; perhaps an ID:

SELECT name,
       price,
       ROW_NUMBER() OVER (PARTITION BY name, price ORDER BY SomeOtherColumn) AS RN
FROM dbo.YourTable;

If you don't have a column to order by, you can use an arbitrary value, or even (SELECT NULL) as Gordon has done in their answer.


