How to Find Gaps in Sequential Numbering in MySQL

How to find gaps in sequential numbering in MySQL?

Update

ConfexianMJS provided a much better answer in terms of performance.

The (not as fast as possible) answer

Here's a version that works on a table of any size (not just on 100 rows):

SELECT (t1.id + 1) as gap_starts_at, 
       (SELECT MIN(t3.id) -1 FROM arrc_vouchers t3 WHERE t3.id > t1.id) as gap_ends_at
FROM arrc_vouchers t1
WHERE NOT EXISTS (SELECT t2.id FROM arrc_vouchers t2 WHERE t2.id = t1.id + 1)
HAVING gap_ends_at IS NOT NULL

gap_starts_at - first id in current gap
gap_ends_at - last id in current gap
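
To see what gap_starts_at and gap_ends_at look like on real data, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. The helper name find_gaps and the sample ids are made up for illustration. One deliberate difference: filtering on a select alias with HAVING and no GROUP BY is a MySQL extension, so the sketch drops the trailing row with an extra EXISTS condition instead.

```python
import sqlite3

def find_gaps(ids):
    """Return (gap_starts_at, gap_ends_at) pairs for the given ids.

    Hypothetical helper: loads the ids into an in-memory table named
    arrc_vouchers and runs a portable variant of the query above.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE arrc_vouchers (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO arrc_vouchers (id) VALUES (?)",
                     [(i,) for i in ids])
    rows = conn.execute("""
        SELECT t1.id + 1 AS gap_starts_at,
               (SELECT MIN(t3.id) - 1
                FROM arrc_vouchers t3
                WHERE t3.id > t1.id) AS gap_ends_at
        FROM arrc_vouchers t1
        -- a gap starts after t1.id when t1.id + 1 is missing ...
        WHERE NOT EXISTS (SELECT 1 FROM arrc_vouchers t2
                          WHERE t2.id = t1.id + 1)
        -- ... and a larger id exists (replaces HAVING gap_ends_at IS NOT NULL)
          AND EXISTS (SELECT 1 FROM arrc_vouchers t4
                      WHERE t4.id > t1.id)
        ORDER BY gap_starts_at
    """).fetchall()
    conn.close()
    return rows

print(find_gaps([1, 2, 3, 6, 7, 10]))  # [(4, 5), (8, 9)]
```

The highest id (10 here) never starts a gap, which is the job HAVING gap_ends_at IS NOT NULL does in the MySQL version.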

How to find gaps in sequential numbering in HSQLDB?

The problem is the use of the variable @rownum. This is not supported by HSQLDB.

With HSQLDB you can do it in a simple manner.

Suppose the table is called CUSTOMER and the sequence column is called ID. The queries below show how SEQUENCE_ARRAY works and is used for finding the missing values.

-- this returns consecutive numbers within a fixed range
SELECT * FROM UNNEST (SEQUENCE_ARRAY(1, 1000, 1))
-- this returns all the possible consecutive numbers for an existing table
SELECT * FROM UNNEST (SEQUENCE_ARRAY((SELECT MIN(ID) FROM CUSTOMER), (SELECT MAX(ID) FROM CUSTOMER), 1))

-- this returns the list of unused IDs.
SELECT * FROM UNNEST (SEQUENCE_ARRAY((SELECT MIN(ID) FROM CUSTOMER), (SELECT MAX(ID) FROM CUSTOMER), 1)) SEQ(IDCOL)
LEFT OUTER JOIN CUSTOMER ON CUSTOMER.ID = SEQ.IDCOL WHERE CUSTOMER.ID IS NULL

MySQL check for gaps in numeric sequence

Outputs a list of missing ranges as in the link provided, but within the specified range (not extensively tested).

You'll need to iterate through them to get the actual values.

CREATE TABLE tempTable AS ...

DECLARE @StartID INT ...
DECLARE @EndID INT ...

SELECT @StartID as gap_starts_at, 
       COALESCE((SELECT MIN(t3.id) -1 FROM tempTable t3
                 WHERE t3.id > @StartID AND t3.id < @EndID), @EndID) as gap_ends_at
FROM tempTable t1
WHERE NOT EXISTS (SELECT t2.id FROM tempTable t2 WHERE t2.id = @StartID)
UNION
SELECT (t1.id + 1) as gap_starts_at, 
       COALESCE((SELECT MIN(t3.id) -1 FROM tempTable t3 WHERE t3.id > t1.id),
                @EndID) as gap_ends_at
FROM tempTable t1
WHERE NOT EXISTS (SELECT t2.id FROM tempTable t2 WHERE t2.id = t1.id + 1)
      AND id < @EndID

EDIT: Here's a link with a few ways to find missing values (I don't think any of them work with ranges, though some may be easier to extend than others).

Next group of sequential numbers in MySQL

create table nums
(
    num int not null
);

-- truncate table nums;
insert nums (num) values (1),(2),(14),(15),(16),(17),(20),(21),(22),(23),(24),(30),(81),(120),(121),(122),(123),(124);

select min(t2.num)
from
( 
select t1.num
from nums t1 
where 5 in (select count(*) from nums where num in (t1.num,t1.num+1,t1.num+2,t1.num+3,t1.num+4))
) t2;

Answer:
20

MySQL finding gaps in column with multiple ID

You can do this with not exists:

select s.*
from sequence s
where not exists (select 1 from sequence s2 where s2.id = s.id and s2.value = s.value + 1) and
      exists (select 1 from sequence s2 where s2.id = s.id and s2.value > s.value);

The exists clause is important so you don't report the final value for each id.

EDIT:

Here is a better approach:

select s.value + 1 as startgap,
       (select min(s2.value) - 1 from sequence s2 where s2.id = s.id and s2.value > s.value) as endgap
from sequence s
where not exists (select 1 from sequence s2 where s2.id = s.id and s2.value = s.value + 1) and
      exists (select 1 from sequence s2 where s2.id = s.id and s2.value > s.value);
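
As a quick check of the per-id logic, the sketch below runs the second query against an in-memory SQLite database from Python. The helper name find_gaps_per_id and the sample rows are assumptions for illustration, and s.id is added to the select list so each gap can be attributed to its id.

```python
import sqlite3

def find_gaps_per_id(rows):
    """Return (id, startgap, endgap) tuples for (id, value) input rows.

    Hypothetical helper that mirrors the query above on a throwaway table.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sequence (id INTEGER, value INTEGER)")
    conn.executemany("INSERT INTO sequence (id, value) VALUES (?, ?)", rows)
    gaps = conn.execute("""
        SELECT s.id,
               s.value + 1 AS startgap,
               (SELECT MIN(s2.value) - 1
                FROM sequence s2
                WHERE s2.id = s.id AND s2.value > s.value) AS endgap
        FROM sequence s
        -- the next value is missing for this id ...
        WHERE NOT EXISTS (SELECT 1 FROM sequence s2
                          WHERE s2.id = s.id AND s2.value = s.value + 1)
        -- ... and this is not the last value for the id
          AND EXISTS (SELECT 1 FROM sequence s2
                      WHERE s2.id = s.id AND s2.value > s.value)
        ORDER BY s.id, startgap
    """).fetchall()
    conn.close()
    return gaps

sample = [(1, 1), (1, 2), (1, 5), (2, 10), (2, 12), (2, 13)]
print(find_gaps_per_id(sample))  # [(1, 3, 4), (2, 11, 11)]
```

Values 5 (for id 1) and 13 (for id 2) are the last ones for their ids, so the exists clause keeps them out of the result.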

How do I find a gap in running counter with SQL?

In MySQL and PostgreSQL:

SELECT  id + 1
FROM    mytable mo
WHERE   NOT EXISTS
        (
        SELECT  NULL
        FROM    mytable mi 
        WHERE   mi.id = mo.id + 1
        )
ORDER BY
        id
LIMIT 1

In SQL Server:

SELECT  TOP 1
        id + 1
FROM    mytable mo
WHERE   NOT EXISTS
        (
        SELECT  NULL
        FROM    mytable mi 
        WHERE   mi.id = mo.id + 1
        )
ORDER BY
        id

In Oracle:

SELECT  *
FROM    (
        SELECT  id + 1 AS gap
        FROM    mytable mo
        WHERE   NOT EXISTS
                (
                SELECT  NULL
                FROM    mytable mi 
                WHERE   mi.id = mo.id + 1
                )
        ORDER BY
                id
        )
WHERE   rownum = 1

ANSI (works everywhere, least efficient):

SELECT  MIN(id) + 1
FROM    mytable mo
WHERE   NOT EXISTS
        (
        SELECT  NULL
        FROM    mytable mi 
        WHERE   mi.id = mo.id + 1
        )

Systems supporting sliding window functions:

SELECT  -- TOP 1
        -- Uncomment above for SQL Server 2012+
        previd + 1
FROM    (
        SELECT  id,
                LAG(id) OVER (ORDER BY id) previd
        FROM    mytable
        ) q
WHERE   previd <> id - 1
ORDER BY
        id
-- LIMIT 1
-- Uncomment above for PostgreSQL
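
The window-function variant can be tried out with a minimal sketch like the one below, which uses Python's sqlite3 module. It assumes the bundled SQLite is 3.25 or newer (the first release with window functions), and the helper name first_gap is made up. The sketch selects previd + 1 so that, like the other variants, it returns the first missing id.

```python
import sqlite3

def first_gap(ids):
    """Return the first missing id after the smallest id, or None.

    Hypothetical helper; needs SQLite 3.25+ for LAG() OVER (...).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO mytable (id) VALUES (?)",
                     [(i,) for i in ids])
    row = conn.execute("""
        SELECT previd + 1
        FROM (
            -- pair every id with the id just before it
            SELECT id, LAG(id) OVER (ORDER BY id) AS previd
            FROM mytable
        ) q
        -- the first row has previd = NULL and is filtered out here
        WHERE previd <> id - 1
        ORDER BY id
        LIMIT 1
    """).fetchone()
    conn.close()
    return row[0] if row else None

print(first_gap([1, 2, 3, 5, 6, 9]))  # 4
print(first_gap([1, 2, 3]))           # None
```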

Sequential occurrence (advanced gaps and islands problem)

Here is one solution (SQL Server).

DECLARE @max_in_row TABLE(
    hit_finish_dttm VARCHAR(255),
    hid VARCHAR(255),
    agent_login VARCHAR(255),
    flg_no_talk int
);
INSERT INTO @max_in_row(hit_finish_dttm, hid, agent_login, flg_no_talk)
VALUES('2020-03-01', 'EQERR13', 'Dmitrii', 0),
      ('2020-03-02', 'EQERR13', 'Dmitrii', 1),
      ('2020-03-03', 'EQERR13', 'Dmitrii', 1),
      ('2020-03-01', 'RR13EQE', 'Dmitrii', 0),
      ('2020-03-02', 'RR13EQE', 'Dmitrii', 1),
      ('2020-03-03', 'RR13EQE', 'Dmitrii', 0),
      ('2020-03-04', 'RR13EQE', 'Dmitrii', 0),
      ('2020-03-05', 'RR13EQE', 'Dmitrii', 1),
      ('2020-03-06', 'RR13EQE', 'Dmitrii', 1),
      ('2020-03-07', 'RR13EQE', 'Dmitrii', 0),
      ('2020-03-01', 'EQERR13', 'Alex', 1),
      ('2020-03-02', 'EQERR13', 'Alex', 1),
      ('2020-03-03', 'EQERR13', 'Alex', 0),
      ('2020-03-04', 'EQERR13', 'Alex', 1),
      ('2020-03-05', 'EQERR13', 'Alex', 1),
      ('2020-03-06', 'EQERR13', 'Alex', 1),
      ('2020-03-02', 'RR13EQE', 'Alex', 1),
      ('2020-03-03', 'RR13EQE', 'Alex', 0),
      ('2020-03-04', 'RR13EQE', 'Alex', 1)
;
WITH OrderNormalized AS
(
    --Since the 0 and 1 can come out of sequence in the data, build up clusters of distinct groups with a chronological order flag 
    --to use as a virtual grouped timetable
    SELECT *,
        GroupNumber = DENSE_RANK() OVER(ORDER BY hid, agent_login ),
        OrderInGroup = RANK() OVER(PARTITION BY  hid, agent_login ORDER BY hit_finish_dttm) 
    FROM 
        @max_in_row
)
,GapsMarked AS
(
    --Order the Gaps so they can be joined with connected islands
    --This is needed because the value can go from 0 to 1 multiple times per partition. 
    --That condition needs to be accounted for to reset the count.
    SELECT *,
        NoTalkGroupNumber = RANK() OVER(PARTITION BY  GroupNumber ORDER BY OrderInGroup)
    FROM 
        OrderNormalized
    WHERE
        flg_no_talk = 0
        
)
,IslandsGrouped AS
(
    --Data is partitioned and the gaps serialized above. Now join the islands with the closest 
    --gap looking backwards and take the min NOTE: There is a cleaner solution here, I just don't have the time to think it up right now
    SELECT 
        D.*,  
        NoTalkGroupNumber=CASE WHEN MIN(G.NoTalkGroupNumber) IS NULL THEN 0 ELSE MIN(G.NoTalkGroupNumber) END 
    FROM 
        OrderNormalized D
        LEFT JOIN GapsMarked G ON G.GroupNumber = D.GroupNumber AND G.OrderInGroup > D.OrderInGroup
    GROUP BY
        D.agent_login,D.flg_no_talk,D.GroupNumber,D.hid,D.hit_finish_dttm,D.OrderInGroup
)
,SeralizedItemsInIslandGroups AS
(
    SELECT 
        *,
        --This serializes by summing sequential flg_no_talk within each  respective islands 
        ItemOrder = SUM(flg_no_talk) OVER (PARTITION BY GroupNumber,NoTalkGroupNumber ORDER BY OrderInGroup ROWS UNBOUNDED PRECEDING)
    FROM 
        IslandsGrouped 
)

SELECT 
    agent_login, hid, MAX(ItemOrder) FROM SeralizedItemsInIslandGroups
GROUP BY
    agent_login, hid

And here is a PostgreSQL version on SQL Fiddle:

PostgreSQL 9.6 Schema Setup:

    CREATE TABLE max_in_row (
        hit_finish_dttm VARCHAR(255),
        hid VARCHAR(255),
        agent_login VARCHAR(255),
        flg_no_talk int
    );

Query 1:

INSERT INTO max_in_row(hit_finish_dttm, hid, agent_login, flg_no_talk)
VALUES('2020-03-01', 'EQERR13', 'Dmitrii', 0),
      ('2020-03-02', 'EQERR13', 'Dmitrii', 1),
      ('2020-03-03', 'EQERR13', 'Dmitrii', 1),
      ('2020-03-01', 'RR13EQE', 'Dmitrii', 0),
      ('2020-03-02', 'RR13EQE', 'Dmitrii', 1),
      ('2020-03-03', 'RR13EQE', 'Dmitrii', 0),
      ('2020-03-04', 'RR13EQE', 'Dmitrii', 0),
      ('2020-03-05', 'RR13EQE', 'Dmitrii', 1),
      ('2020-03-06', 'RR13EQE', 'Dmitrii', 1),
      ('2020-03-07', 'RR13EQE', 'Dmitrii', 0),
      ('2020-03-01', 'EQERR13', 'Alex', 1),
      ('2020-03-02', 'EQERR13', 'Alex', 1),
      ('2020-03-03', 'EQERR13', 'Alex', 0),
      ('2020-03-04', 'EQERR13', 'Alex', 1),
      ('2020-03-05', 'EQERR13', 'Alex', 1),
      ('2020-03-06', 'EQERR13', 'Alex', 1),
      ('2020-03-02', 'RR13EQE', 'Alex', 1),
      ('2020-03-03', 'RR13EQE', 'Alex', 0),
      ('2020-03-04', 'RR13EQE', 'Alex', 1)

Query 2:

WITH OrderNormalized AS
(
    SELECT *,
        DENSE_RANK() OVER(ORDER BY hid, agent_login ) GroupNumber,
        RANK() OVER(PARTITION BY  hid, agent_login ORDER BY hit_finish_dttm) OrderInGroup 
    FROM 
        max_in_row
)
,GapsMarked AS
(
    SELECT *,
        RANK() OVER(PARTITION BY  GroupNumber ORDER BY OrderInGroup) NoTalkGroupNumber
    FROM 
        OrderNormalized
    WHERE
        flg_no_talk = 0
        
)
,IslandsGrouped AS
(
    SELECT 
        D.*,  
        CASE WHEN MIN(G.NoTalkGroupNumber) IS NULL THEN 0 ELSE MIN(G.NoTalkGroupNumber) END NoTalkGroupNumber
    FROM 
        OrderNormalized D
        LEFT JOIN GapsMarked G ON G.GroupNumber = D.GroupNumber AND G.OrderInGroup > D.OrderInGroup
    GROUP BY
        D.agent_login,D.flg_no_talk,D.GroupNumber,D.hid,D.hit_finish_dttm,D.OrderInGroup
)
,SeralizedItemsInIslandGroups AS
(
    SELECT 
        *,
        SUM(flg_no_talk) OVER (PARTITION BY GroupNumber,NoTalkGroupNumber ORDER BY OrderInGroup ROWS UNBOUNDED PRECEDING) ItemOrder
    FROM 
        IslandsGrouped 
)

SELECT 
    agent_login, hid, MAX(ItemOrder) FROM SeralizedItemsInIslandGroups
GROUP BY
    agent_login, hid

Results:

| agent_login |     hid | max |
|-------------|---------|-----|
|     Dmitrii | RR13EQE |   2 |
|        Alex | RR13EQE |   1 |
|        Alex | EQERR13 |   3 |
|     Dmitrii | EQERR13 |   2 |

How to find ranges of sequential numbers without gaps in a table

You need to identify groups that are the same. There is a trick to this, which is a difference of row numbers.

select min(id) as fromid, max(id) as toid, type
from (select t.*,
             (row_number() over (partition by type order by id) -
              row_number() over (partition by type, badvalue order by id)
             ) as grp
      from table t
     ) grp
where badvalue = 0
group by grp, type;

There is a nuance here, because you only seem to want rows where "bad value" is 0. Note that this condition goes in the outer select, so it doesn't interfere with the row_number() calculations.
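
To make the difference-of-row-numbers trick concrete, here is a minimal sketch using Python's sqlite3 module (SQLite 3.25 or newer is assumed for window functions). The table name readings, the helper name clean_ranges, and the sample rows are all made up for illustration.

```python
import sqlite3

def clean_ranges(rows):
    """Return (fromid, toid, type) for each run of rows with badvalue = 0.

    Hypothetical helper; rows are (id, type, badvalue) tuples.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings (id INTEGER PRIMARY KEY, type TEXT, badvalue INTEGER)"
    )
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)
    result = conn.execute("""
        SELECT MIN(id) AS fromid, MAX(id) AS toid, type
        FROM (SELECT t.*,
                     -- the difference stays constant inside a run of
                     -- equal badvalue and changes when the run breaks
                     ROW_NUMBER() OVER (PARTITION BY type ORDER BY id) -
                     ROW_NUMBER() OVER (PARTITION BY type, badvalue ORDER BY id)
                     AS grp
              FROM readings t) g
        WHERE badvalue = 0
        GROUP BY grp, type
        ORDER BY type, fromid
    """).fetchall()
    conn.close()
    return result

sample = [(1, 'a', 0), (2, 'a', 0), (3, 'a', 1), (4, 'a', 0), (5, 'a', 0),
          (6, 'b', 0), (7, 'b', 1), (8, 'b', 0)]
print(clean_ranges(sample))
# [(1, 2, 'a'), (4, 5, 'a'), (6, 6, 'b'), (8, 8, 'b')]
```

For type 'a' the two row numbers differ by 0 for ids 1 and 2 and by 1 for ids 4 and 5, which is what splits them into separate groups around the bad row at id 3.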
