How to Find a "Gap" in Running Counter With Sql

How do I find a gap in running counter with SQL?

In MySQL and PostgreSQL:

SELECT  id + 1
FROM mytable mo
WHERE NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)
ORDER BY
id
LIMIT 1

In SQL Server:

SELECT  TOP 1
id + 1
FROM mytable mo
WHERE NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)
ORDER BY
id

In Oracle:

SELECT  *
FROM (
SELECT id + 1 AS gap
FROM mytable mo
WHERE NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)
ORDER BY
id
)
WHERE rownum = 1

ANSI (works everywhere, least efficient):

SELECT  MIN(id) + 1
FROM mytable mo
WHERE NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)

Systems supporting sliding window functions:

SELECT  -- TOP 1
-- Uncomment above for SQL Server 2012+
previd
FROM (
SELECT id,
LAG(id) OVER (ORDER BY id) previd
FROM mytable
) q
WHERE previd <> id - 1
ORDER BY
id
-- LIMIT 1
-- Uncomment above for PostgreSQL

How to find a gap in range in SQL

in mysql and postgresql

SELECT  id + 1
FROM test mo
WHERE NOT EXISTS
(
SELECT NULL
FROM test mi
WHERE mi.id = mo.id + 1
) and mo.id> 100
ORDER BY
id
LIMIT 1

fiddle for mysql and fiddle for postgresql

in ms sql

SELECT  TOP 1
id + 1
FROM test mo
WHERE NOT EXISTS
(
SELECT NULL
FROM test mi
WHERE mi.id = mo.id + 1
)
and mo.id > 100
ORDER BY
id

fiddle

Detect consecutive dates in SQL and start counter

One way to tackle such Gaps-And-Islands problem is to calculate a rank that's based on the difference with the previous date.

The method to calculate the difference in days between dates however depends on your RDBMS.

This example uses DATEDIFF (MS Sql Server)

SELECT 
Dates,
ROW_NUMBER() OVER (PARTITION BY Rnk ORDER BY Dates) AS Counter
FROM
(
SELECT
Dates,
SUM(CASE
WHEN DATEDIFF(day, prevDate, Dates) = 1 THEN 0
ELSE 1 END) OVER (ORDER BY Dates) AS Rnk
FROM
(
SELECT
Dates,
LAG(Dates) OVER (ORDER BY Dates) AS prevDate
FROM your_table
) q1
) q2
ORDER BY Dates;

How do I find a gap in running counter in Informix table

The selected answers for the other question only need trivial modifications to work with Informix.

For instance — one solution given in the other question is:

SELECT  id + 1
FROM mytable mo
WHERE NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)
ORDER BY
id
LIMIT 1

The Informix equivalent, adapted to your schema, is:

SELECT LIMIT 1 serial_request + 1 AS id
FROM mission AS m1
WHERE NOT EXISTS
(SELECT *
FROM mission AS m2
WHERE m2.serial_request = m1.serial_request + 1
)
ORDER BY id

Given the sample data, it correctly produces the answer 34 (the first number missing after a number that appears in the table). To get the first number missing after a row entered in 2012, you simply need to add a condition:

   AND m1.year_request = 2012

Leading to:

SELECT LIMIT 1 serial_request + 1 AS id
FROM mission AS m1
WHERE NOT EXISTS
(SELECT *
FROM mission AS m2
WHERE m2.serial_request = m1.serial_request + 1
)
AND m1.year_request = 2012
ORDER BY id

That does yield the answer 47 as requested.

You could also use the OLAP windowing functions solution (again, based on SQL from the other question):

SELECT LIMIT 1
q.previd + 1
FROM (SELECT serial_request,
LAG(serial_request) OVER (ORDER BY serial_request) previd
FROM mission
WHERE year_request = 2012
) AS q
WHERE q.previd <> q.serial_request - 1
ORDER BY q.previd;

This yields 47 too. We can debate about the relative merits of the plus and minus 1 in the code and where they should go.


Test Schema and Data

DROP TABLE IF EXISTS mission;

CREATE TEMP TABLE mission
(
emp_num INTEGER NOT NULL PRIMARY KEY,
serial_request SERIAL NOT NULL UNIQUE,
year_request SMALLINT NOT NULL
);

INSERT INTO mission(emp_num, serial_request, year_request) VALUES(33, 45, 2012);
INSERT INTO mission(emp_num, serial_request, year_request) VALUES(667, 46, 2012);
INSERT INTO mission(emp_num, serial_request, year_request) VALUES(588, 50, 2012);
INSERT INTO mission(emp_num, serial_request, year_request) VALUES(78, 53, 2012);
INSERT INTO mission(emp_num, serial_request, year_request) VALUES(89, 33, 2013);

Testing: Informix 12.10.FC5 running on Mac OS X 10.11.6.

How to find gaps in sequential numbering in mysql?

Update

ConfexianMJS provided much better answer in terms of performance.

The (not as fast as possible) answer

Here's version that works on table of any size (not just on 100 rows):

SELECT (t1.id + 1) as gap_starts_at, 
(SELECT MIN(t3.id) -1 FROM arrc_vouchers t3 WHERE t3.id > t1.id) as gap_ends_at
FROM arrc_vouchers t1
WHERE NOT EXISTS (SELECT t2.id FROM arrc_vouchers t2 WHERE t2.id = t1.id + 1)
HAVING gap_ends_at IS NOT NULL
  • gap_starts_at - first id in current gap
  • gap_ends_at - last id in current gap

Find missing values within range of running counter

SELECT a.id+1000 AS start, MIN(b.id) - 1 AS end
FROM mytable AS a, mytable AS b
WHERE a.id < b.id
GROUP BY a.id
HAVING start < MIN(b.id)

or

SELECT  id + 1000
FROM users mo
WHERE NOT EXISTS
(
SELECT NULL
FROM users mi
WHERE mi.id = mo.id + 1
) AND id < 65000
ORDER BY
id

try this, it will print all the gaps

SQL Query to show gaps between multiple date ranges

The Query:

Assuming the fields containing the interval are named Start and Finish, and the table is named YOUR_TABLE, the query...

SELECT Finish, Start
FROM
(
SELECT DISTINCT Start, ROW_NUMBER() OVER (ORDER BY Start) RN
FROM YOUR_TABLE T1
WHERE
NOT EXISTS (
SELECT *
FROM YOUR_TABLE T2
WHERE T1.Start > T2.Start AND T1.Start < T2.Finish
)
) T1
JOIN (
SELECT DISTINCT Finish, ROW_NUMBER() OVER (ORDER BY Finish) RN
FROM YOUR_TABLE T1
WHERE
NOT EXISTS (
SELECT *
FROM YOUR_TABLE T2
WHERE T1.Finish > T2.Start AND T1.Finish < T2.Finish
)
) T2
ON T1.RN - 1 = T2.RN
WHERE
Finish < Start

...gives the following result on your test data:

Finish                      Start
2012-01-01 10:00:00.000 2012-01-01 18:00:00.000

The important property of this query is that it would work on overlapping intervals as well.


The Algorithm:

1. Merge Overlapping Intervals

The subquery T1 accepts only those interval starts that are outside other intervals. The subquery T2 does the same for interval ends. This is what removes overlaps.

The DISTINCT is important in case there are two identical interval starts (or ends) that are both outside other intervals. The WHERE Finish < Start simply eliminates any empty intervals (i.e. duration 0).

We also attach a row number relative to temporal ordering, which will be needed in the next step.

The T1 yields:

Start                       RN
2012-01-01 08:00:00.000 1
2012-01-01 18:00:00.000 2

The T2 yields:

Finish                      RN
2012-01-01 10:00:00.000 1
2012-01-01 20:00:00.000 2

2. Reconstruct the Result

We can now reconstruct either the "active" or the "inactive" intervals.

The inactive intervals are reconstructed by putting together end of the previous interval with the beginning of the next one, hence - 1 in the ON clause. Effectively, we put...

Finish                      RN
2012-01-01 10:00:00.000 1

...and...

Start                       RN
2012-01-01 18:00:00.000 2

...together, resulting in:

Finish                      Start
2012-01-01 10:00:00.000 2012-01-01 18:00:00.000

(The active intervals could be reconstructed by putting rows from T1 alongside rows from T2, by using JOIN ... ON T1.RN = T2.RN and reverting WHERE.)


The Example:

Here is a slightly more realistic example. The following test data:

Device      Event      Start                      Finish
Device 1 Event A 2012-01-01 08:00:00.000 2012-01-01 10:00:00.000
Device 2 Event B 2012-01-01 18:00:00.000 2012-01-01 20:00:00.000
Device 3 Event C 2012-01-02 11:00:00.000 2012-01-02 15:00:00.000
Device 4 Event D 2012-01-02 10:00:00.000 2012-01-02 12:00:00.000
Device 5 Event E 2012-01-02 10:00:00.000 2012-01-02 15:00:00.000
Device 6 Event F 2012-01-03 09:00:00.000 2012-01-03 10:00:00.000

Gives the following result:

Finish                      Start
2012-01-01 10:00:00.000 2012-01-01 18:00:00.000
2012-01-01 20:00:00.000 2012-01-02 10:00:00.000
2012-01-02 15:00:00.000 2012-01-03 09:00:00.000

How to count the Gap between dates in SQL Server with overlapping dates

I don't quite follow your results.

I think the problem is a combination of gaps-and-islands along with "fill-in-gaps". That is, you need to combine the existing data that overlaps. Then you need to get the pieces in-between.

You can identify the overlaps by using a cumulative max of the end date from preceding rows. When it is less than or equal to the current start, then you have a new "island". The result is aggregation and lead().

The code looks like this:

select roomTypeId, grp, max(enddate), lead(min(startdate)) over (partition by roomtypeid order by min(startdate)),
datediff(day, max(enddate), lead(min(startdate)) over (partition by roomtypeid order by min(startdate)) )
from (select t.*,
sum(case when prev_enddate > startdate then 0 else 1 end) over
(partition by RoomTypeId order by startdate) as grp
from (select t.*,
max(enddate) over (partition by RoomTypeId
order by startdate
rows between unbounded preceding and 1 preceding
) as prev_enddate
from temp t
) t
) t
group by roomtypeid, grp
order by roomtypeid, min(startdate);

Here is the db<>fiddle.

The results are subtly different, but they make more sense to me.

Find the gap in ID column + select the prev/next date column

Should work on most databases other than MySQL

select      *

from (select lag (id) over (order by id) + 1 as gap_start
,id - 1 as gap_end
,lag (dt) over (order by id) as dt_before_gap
,dt as dt_after_gap
,case when lag (id) over (order by id) + 1 <> id then 'Y' end as is_gap

from t
) t

where is_gap = 'Y'
;


Related Topics



Leave a reply



Submit