Selecting Top N Elements of a Group in Oracle

Selecting top n elements of a group in Oracle

Is this what you are after?

My test set-up:

SQL> alter session set nls_date_format = 'DD-Mon-YYYY HH24:Mi:SS';

Session altered.

SQL> drop table so_test;

Table dropped.

SQL> create table so_test  (
  2    n varchar2(32)
  3  , v varchar2(32)
  4  , t date );

Table created.

SQL> 
SQL> insert into so_test values ( 'X' , 'Test1', to_date('01-Jan-2011 12:00:00','DD-Mon-YYYY HH24:Mi:SS') );

1 row created.

SQL> insert into so_test values ( 'X' , 'Test2', to_date('01-Jan-2011 13:00:00','DD-Mon-YYYY HH24:Mi:SS') );

1 row created.

SQL> insert into so_test values ( 'X' , 'Test3', to_date('01-Jan-2011 14:00:00','DD-Mon-YYYY HH24:Mi:SS') );

1 row created.

SQL> insert into so_test values ( 'Y' , 'Test5', to_date('02-Jan-2011 12:00:00','DD-Mon-YYYY HH24:Mi:SS') );

1 row created.

SQL> insert into so_test values ( 'Y' , 'Test6', to_date('03-Jan-2011 12:00:00','DD-Mon-YYYY HH24:Mi:SS') );

1 row created.

SQL> insert into so_test values ( 'Y' , 'Test7', to_date('04-Jan-2011 12:00:00','DD-Mon-YYYY HH24:Mi:SS') );

1 row created.

SQL>

Here is the query:

SQL> select n,v,t from (
  2  select n, v , t , rank() over ( partition by n order by t desc) r
  3  from so_test
  4  ) where r <= 2;

N                V                T
-------------------------------- -------------------------------- --------------------
X                Test3                01-Jan-2011 14:00:00
X                Test2                01-Jan-2011 13:00:00
Y                Test7                04-Jan-2011 12:00:00
Y                Test6                03-Jan-2011 12:00:00

SQL>

Oracle SQL: How to SELECT N records for each group / cluster

I recommend an analytical function such as rank() or row_number(). You could do this with hard-coded unions, but the analytical function does all the hard work for you.

select *
from 
(
    select
      bt.col_a,
      bt.col_b,
      bt.process_type_cod,
      row_number() over ( partition by process_type_cod order by col_a nulls last ) rank
    from small_table st
    inner join big_table bt
      on st.process_type_cod = bt.process_type_cod
)
where rank < 11
;

You may not even need that join since big_table has all of the types you care about. In that case, just change the 'from clause' to use big_table and drop the join.

What this does is performs the query and then sorts the records using the 'order by' operator in the partition statement. For a given group (here we grouped by col_a), a numerical row number (i.e. 1, 2, 3, 4, 5, n+1...) is applied to each record consecutively. In the outer where clause, just filter by the records with a number lower than N.

Return the first n records per group Oracle SQL

Consider this non-Windows function approach using a count correlated aggregate query. The idea is to run a department rank subquery and then use that in a derived table that filters outer query by this department rank. Please note your desired results do not return by ordered START_DATE but simply query's row number.

SELECT main.EMPLOYEE, t.START_DATE, t.DEPARTMENT
FROM 
   (SELECT t.EMPLOYEE, t.START_DATE, t.DEPARTMENT, 
          (SELECT Count(*) FROM Employees sub 
           WHERE sub.START_DATE <= t.START_DATE 
           AND sub.Department = t.Department) AS DeptRank
FROM Employees t) main
WHERE main.DeptRank <= 3
ORDER BY main.DEPARTMENT, main.START_DATE;

-- EMPLOYEE START_DATE  DEPARTMENT
-- Tawnee   1/2/1904    Legal
-- Jacinta  1/2/1924    Legal
-- Kirsten  1/2/1933    Legal
-- Edwina   1/2/1902    Mergers
-- Louise   1/2/1912    Mergers
-- Kelly    1/2/1954    Mergers
-- Jane     1/2/1900    Sales
-- Amy      1/2/1901    Sales
-- Cara     1/2/1955    Sales

For the Windows function counterpart:

SELECT main.EMPLOYEE, t.START_DATE, t.DEPARTMENT
FROM 
   (SELECT t.EMPLOYEE, t.START_DATE, t.DEPARTMENT, 
           RANK() OVER (PARTITION BY Department
           ORDER BY START_DATE) AS DeptRank
FROM Employees t) main
WHERE main.DeptRank <= 3
ORDER BY main.DEPARTMENT, main.START_DATE;

And as @Matt comments, you may want to handle ties (i.e., employees who started on same day). Both above solutions will output all such employees depending on rank filter. To take one of the ties in correlated subquery, use Employee name as tiebreaker (or better yet a unique ID if available):

SELECT main.EMPLOYEE, t.START_DATE, t.DEPARTMENT
FROM 
   (SELECT t.EMPLOYEE, t.START_DATE, t.DEPARTMENT, 
           (SELECT Count(*) FROM Employees sub 
            WHERE sub.Department =  t.Department 
             AND (sub.START_DATE <= t.START_DATE
                  OR sub.START_DATE = t.START_DATE 
                  AND sub.EMPLOYEE < t.EMPLOYEE) AS DeptRank
FROM Employees t) main
WHERE main.DeptRank <= 3
ORDER BY main.DEPARTMENT, main.START_DATE;

And for window-function query use ROW_NUMBER() in place of RANK().

Oracle SELECT TOP 10 records

You'll need to put your current query in subquery as below :

SELECT * FROM (
  SELECT DISTINCT 
  APP_ID, 
  NAME, 
  STORAGE_GB, 
  HISTORY_CREATED, 
  TO_CHAR(HISTORY_DATE, 'DD.MM.YYYY') AS HISTORY_DATE  
  FROM HISTORY WHERE 
    STORAGE_GB IS NOT NULL AND 
      APP_ID NOT IN (SELECT APP_ID FROM HISTORY WHERE TO_CHAR(HISTORY_DATE, 'DD.MM.YYYY') ='06.02.2009')
  ORDER BY STORAGE_GB DESC )
WHERE ROWNUM <= 10

Oracle applies rownum to the result after it has been returned.
You need to filter the result after it has been returned, so a subquery is required. You can also use RANK() function to get Top-N results.

For performance try using NOT EXISTS in place of NOT IN. See this for more.

How to select top N rows for each group in a Entity Framework GroupBy with EF 3.1

Update (EF Core 6.0):

EF Core 6.0 added support for translating GroupBy result set projection, so the original code for taking (key, items) now works as it should, i.e.

var query = context.Set<DbDocument>()
    .Where(e => partnerIds.Contains(e.SenderId))
    .GroupBy(e => e.SenderId)
    .Select(g => new
    {
        g.Key,
        Documents = g.OrderByDescending(e => e.InsertedDateTime).Take(10)
    });

However flattening (via SelectMany) is still unsupported, so you have to use the below workaround if you need such query shape.

Original (EF Core 3.0/3.1/5.0):

This is a common problem, unfortunately not supported by EF Core 3.0/3.1/5.0 query translator specifically for GroupBy.

The workaround is to do the groping manually by correlating 2 subqueries - one for keys and one for corresponding data.

Applying it to your examples would be something like this.

If you need (key, items) pairs:

var query = context.Set<DbDocument>()
    .Where(t => partnerIds.Contains(t.SenderId))
    .Select(t => t.SenderId).Distinct() // <--
    .Select(key => new
    {
        Key = key,
        Documents = 
            context.Set<DbDocument>().Where(t => t.SenderId == key) // <--
                 .OrderByDescending(t => t.InsertedDateTime).Take(10)
                 .ToList() // <--
    });

If you need just flat result set containing top N items per key:

var query = context.Set<DbDocument>()
    .Where(t => partnerIds.Contains(t.SenderId))
    .Select(t => t.SenderId).Distinct() // <--
    .SelectMany(key => context.Set<DbDocument>().Where(t => t.SenderId == key) // <--
        .OrderByDescending(t => t.InsertedDateTime).Take(10)
    );

Select only n rows in an SQL for an element

You can use row_number()

select * from
(
select *, row_number() over(partition by table_2_id order by id desc) as rn
from tablename
)A where rn<=5

Selecting Top N Elements of a Group in Oracle