Oracle SQL - Identify Sequential Value Ranges

Oracle SQL - Identify sequential value ranges

This is easy to do with a technique called Tabibitosan.

What this technique does is compare the positions of each group's rows to the overall set of rows, in order to work out if rows in the same group are next to each other or not.

E.g., with your example data, this looks like:

WITH your_table AS (SELECT 1 ID, 'Michael' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 2 ID, 'Alex' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 3 ID, 'Tom' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 4 ID, 'John' NAME, 'Sales' department FROM dual UNION ALL
SELECT 5 ID, 'Brad' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 6 ID, 'Leo' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 7 ID, 'Kevin' NAME, 'Production' department FROM dual)
-- end of mimicking your table with data in it. See the SQL below:
SELECT ID,
NAME,
department,
row_number() OVER (ORDER BY ID) overall_rn,
row_number() OVER (PARTITION BY department ORDER BY ID) department_rn,
row_number() OVER (ORDER BY ID) - row_number() OVER (PARTITION BY department ORDER BY ID) grp
FROM your_table;

ID NAME DEPARTMENT OVERALL_RN DEPARTMENT_RN GRP
---------- ------- ---------- ---------- ------------- ----------
1 Michael Marketing 1 1 0
2 Alex Marketing 2 2 0
3 Tom Marketing 3 3 0
4 John Sales 4 1 3
5 Brad Marketing 5 4 1
6 Leo Marketing 6 5 1
7 Kevin Production 7 1 6

Here, I've given all the rows across the entire set of data a row number in ascending id order (the overall_rn column), and I've given the rows in each department a row number (the department_rn column), again in ascending id order.

Now that I've done that, we can subtract one from the other (the grp column).

Notice how the number in the grp column remains the same for deparment rows that are next to each other, but it changes each time there's a gap.

E.g. for the Marketing department, rows 1-3 are next to each other and have grp = 0, but the 4th Marketing row is actually on the 5th row of the overall results set, so it now has a different grp number. Since the 5th marketing row is on the 6th row of the overall set, it has the same grp number as the 4th marketing row, so we know they're next to each other.

Once we have that grp information, it's a simple matter of doing an aggregate query grouping on both the department and our new grp column, using min and max to find the start and end ids:

WITH your_table AS (SELECT 1 ID, 'Michael' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 2 ID, 'Alex' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 3 ID, 'Tom' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 4 ID, 'John' NAME, 'Sales' department FROM dual UNION ALL
SELECT 5 ID, 'Brad' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 6 ID, 'Leo' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 7 ID, 'Kevin' NAME, 'Production' department FROM dual)
-- end of mimicking your table with data in it. See the SQL below:
SELECT department,
MIN(ID) start_id,
MAX(ID) end_id
FROM (SELECT ID,
NAME,
department,
row_number() OVER (ORDER BY ID) - row_number() OVER (PARTITION BY department ORDER BY ID) grp
FROM your_table)
GROUP BY department, grp;

DEPARTMENT START_ID END_ID
---------- ---------- ----------
Marketing 1 3
Marketing 5 6
Sales 4 4
Production 7 7

N.B., I've assumed that gaps in the id columns aren't important (i.e. if there was no row for id = 6 (so Leo and Kevin's ids were 7 and 8 respectively), then Leo and Brad would still appear in the same group, with a start id = 5 and end id = 7.

If gaps in the id columns count as indicating a new group, then you could just use the id to label the overall set of rows (i.e. no need to caluclate the overall_rn; just use the id column instead).

That means your query would become:

WITH your_table AS (SELECT 1 ID, 'Michael' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 2 ID, 'Alex' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 3 ID, 'Tom' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 4 ID, 'John' NAME, 'Sales' department FROM dual UNION ALL
SELECT 5 ID, 'Brad' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 7 ID, 'Leo' NAME, 'Marketing' department FROM dual UNION ALL
SELECT 8 ID, 'Kevin' NAME, 'Production' department FROM dual)
-- end of mimicking your table with data in it. See the SQL below:
SELECT department,
MIN(ID) start_id,
MAX(ID) end_id
FROM (SELECT ID,
NAME,
department,
ID - row_number() OVER (PARTITION BY department ORDER BY ID) grp
FROM your_table)
GROUP BY department, grp;

DEPARTMENT START_ID END_ID
---------- ---------- ----------
Marketing 1 3
Sales 4 4
Marketing 5 5
Marketing 7 7
Production 8 8

Find ranges from a series of numbers in SQL/Oracle

You could do it using ROW_NUMBER analytic function. See Find range of consecutive values in a sequence of numbers or dates.

For example,

Range

SQL> with data(num) as(
2 select 1 from dual union
3 select 2 from dual union
4 select 3 from dual union
5 select 5 from dual union
6 select 6 from dual union
7 select 7 from dual union
8 select 10 from dual union
9 select 11 from dual union
10 select 12 from dual union
11 select 20 from dual
12 )
13 select min(num)||'-'|| max(num) as "range"
14 from (select num,
15 num-Row_Number() over(order by num)
16 as rn
17 from data)
18 group by rn
19 order by min(num);

range
-------------------------------------------------
1-3
5-7
10-12
20-20

SQL>

List

SQL> with data(num) as(
2 select 1 from dual union
3 select 2 from dual union
4 select 3 from dual union
5 select 5 from dual union
6 select 6 from dual union
7 select 7 from dual union
8 select 10 from dual union
9 select 11 from dual union
10 select 12 from dual union
11 select 20 from dual
12 )
13 SELECT listagg(range, ',') WITHIN GROUP(
14 ORDER BY min_num) AS "list"
15 FROM
16 (SELECT MIN(num) min_num,
17 MIN(num)
18 ||'-'
19 || MAX(num) range
20 FROM
21 (SELECT num, num-Row_Number() over(order by num) AS rn FROM DATA
22 )
23 GROUP BY rn
24 );

list
-------------------------------------------------------------------------
1-3,5-7,10-12,20-20

SQL>

Update OP wants a solution in PL/SQL to store the list in a PL/SQL variable.

Setup

SQL> CREATE TABLE t AS
2 SELECT *
3 FROM
4 ( WITH data(num) AS
5 ( SELECT 1 FROM dual
6 UNION
7 SELECT 2 FROM dual
8 UNION
9 SELECT 3 FROM dual
10 UNION
11 SELECT 5 FROM dual
12 UNION
13 SELECT 6 FROM dual
14 UNION
15 SELECT 7 FROM dual
16 UNION
17 SELECT 10 FROM dual
18 UNION
19 SELECT 11 FROM dual
20 UNION
21 SELECT 12 FROM dual
22 UNION
23 SELECT 20 FROM dual
24 )
25 SELECT * FROM DATA);

Table created.

PL/SQL block

SQL> SET SERVEROUTPUT ON
SQL> DECLARE
2 v_list VARCHAR2(100);
3 BEGIN
4 SELECT listagg(RANGE, ',') WITHIN GROUP(
5 ORDER BY min_num)
6 INTO v_list
7 FROM
8 (SELECT MIN(num) min_num,
9 MIN(num)
10 ||'-'
11 || MAX(num) range
12 FROM
13 (SELECT num, num-Row_Number() over(order by num) AS rn FROM t
14 )
15 GROUP BY rn
16 );
17 dbms_output.put_line(v_list);
18 END;
19 /
1-3,5-7,10-12,20-20

PL/SQL procedure successfully completed.

SQL>

Oracle/SQL - Select specified range of sequential records

I believe you're looking for something like this in Oracle:

select * 
from ( select a.*, rownum rnum
from ( YOUR_QUERY_GOES_HERE -- including the order by ) a
where rownum <= MAX_ROWS )
where rnum >= MIN_ROWS
/

Good discussion from Ask Tom is here

Oracle SQL - Find Consecutive Values on Consecutive Dates

You can do this with a very handy trick. The groups of consecutive values can be calculated using a difference of row_number()s. Then, you need to get the count for each group and select the ones that match your condition:

select t.*
from (select t.*, count(*) over (partition by id, flag, grp) as cnt
from (select t.*,
(row_number() over (partition by id order by date) -
row_number() over (partition by id, flag order by date)
) as grp
from table t
) t
) t
where cnt >= 3;

Strictly speaking, you do not need the difference of row_numbers(). Assuming your dates have no time components, the following will also suffice:

select t.*
from (select t.*, count(*) over (partition by id, flag, grp) as cnt
from (select t.*,
(date -
row_number() over (partition by id, flag order by date)
) as grp
from table t
) t
) t
where cnt >= 3;

SQL - Find missing int values in mostly ordered sequential series

I've been there.

FOR ORACLE:

I found this extremely useful query on the net a while ago and noted down, however I don't remember the site now, you may search for "GAP ANALYSIS" on Google.

SELECT   CASE
WHEN ids + 1 = lead_no - 1 THEN TO_CHAR (ids +1)
ELSE TO_CHAR (ids + 1) || '-' || TO_CHAR (lead_no - 1)
END
Missing_track_no
FROM (SELECT ids,
LEAD (ids, 1, NULL)
OVER (ORDER BY ids ASC)
lead_no
FROM YOURTABLE
)
WHERE lead_no != ids + 1

Here, the result is:

MISSING _TRACK_NO
-----------------
6

If there were multiple gaps,say 2,6,7,9 then it would be:

MISSING _TRACK_NO
-----------------
2
6-7
9

Oracle SQL - Set consecutive numbers as a range

In this problem you have to identify the sequence of consecutive weeks and group them. Here's my solution.

  • Use the LAG function to identify any breaks in the sequence.
  • Use the SUM function to assign a group number to each sequence.
  • Find the starting and ending week in each group.
  • Finally use the LISTAGG function to aggregate the result.

Query:

with x(wk, cost, startgroup) as(
--identify the start of a sequence
select wk, cost,
case when wk = lag(wk,1) over (partition by cost order by wk) + 1
then 0
else 1
end
from mytable
where id = '345'
),
y(wk, cost, grp) as(
--assign group number
select wk, cost,
sum(startgroup) over (partition by cost order by wk)
from x
),
z(wk, cost, grp) as(
--get the max/min week for each group
select case when min(wk) = max(wk)
then cast(min(wk) as varchar2(10))
else min(wk) ||'-'||max(wk)
end,
cost, grp
from y
group by cost, grp
)
--aggregate by cost
select listagg(wk,',') within group(order by grp),
cost
from z
group by cost;

Demo at sqlfiddle.

How to query for non-consecutive values?

Alternatively, using LEAD analytic function, along with your fancy formatting. TEST CTE is what you already have; lines #9 onwards is what you need.

SQL> with test (col) as
2 (select 1 from dual union all
3 select 3 from dual union all
4 select 4 from dual union all
5 select 9 from dual union all
6 select 10 from dual union all
7 select 11 from dual
8 ),
9 temp as
10 (select col,
11 lead(col) over (order by col) lcol
12 from test
13 )
14 select '[' || col ||' - '|| lcol ||']' result
15 From temp
16 where lcol - col > 1
17 order by col;

RESULT
-------------------------------------------------------
[1 - 3]
[4 - 9]

SQL>

[EDIT: Adjusted so that you shouldn't have to think too much]

This is what you have:

SQL> select * From t_mark;

M_ID
----------
1
3
4
9
10
11

6 rows selected.

This is what you need:

SQL> with temp as
2 (select m_id,
3 lead(m_id) over (order by m_id) lm_id
4 from t_mark
5 )
6 select '[' || m_id ||' - '|| lm_id ||']' result
7 From temp
8 where lm_id - m_id > 1
9 order by m_id;

RESULT
------------------------------------------------------------------
[1 - 3]
[4 - 9]

SQL>

Basically, you should learn how to use a CTE (common table expression, a.k.a. the with factoring clause).

Detect consecutive dates ranges using SQL

No joins or recursive CTEs needed. The standard gaps-and-island solution is to group by (value minus row_number), since that is invariant within a consecutive sequence. The start and end dates are just the MIN() and MAX() of the group.

WITH t AS (
SELECT InfoDate d,ROW_NUMBER() OVER(ORDER BY InfoDate) i
FROM @d
GROUP BY InfoDate
)
SELECT MIN(d),MAX(d)
FROM t
GROUP BY DATEDIFF(day,i,d)

Oracle sql query to group consecutive records by date

Assuming we can just use start_date to identify the adjacent records (i.e., there are no gaps), then you can use the difference of row numbers approach:

select id, min(start_date) as mn_date, max(end_date) as mx_date, rate
from (select t.*,
row_number() over (partition by id order by start_date) as seqnum_i,
row_number() over (partition by id, rate order by start_date) as seqnum_ir
from t
) t
group by id (seqnum_i - seqnum_ir), rate;

To see how this works, look at the results of the subquery. You should be able to "see" how the difference of the two row numbers defines the groups of adjacent records with the same rate.



Related Topics



Leave a reply



Submit