Detect Overlapping Date Ranges from the Same Table

Detect overlapping date ranges from the same table

If you already have entries for each day that should work, but if you don't the overhead is significant, and if that query is used often, if will affect performance.

If the data is in this format, you can detect overlaps using simple date arithmetic, because an overlap is simply one interval starting after a given interval, but before the given is finished, something like

select dr1.* from date_ranges dr1
inner join date_ranges dr2
on dr2.start > dr1.start -- start after dr1 is started
and dr2.start < dr1.end -- start before dr1 is finished

If you need special handling for interval that are wholly within another interval, or you need to merge intervals, i.e.

PKey  Start       End         Type
==== ===== === ====
01 01/01/2010 20/01/2010 S
02 15/01/2010 31/01/2010 S

yielding

Start       End         Type
===== === ====
01/01/2010 31/01/2010 S

you will need more complex calculation.

In my experience with this kind of problems, once you get how to do the calculation by hand, it's easy to transfer it into SQL :)

Determine Whether Two Date Ranges Overlap

(StartA <= EndB) and (EndA >= StartB)

Proof:

Let ConditionA Mean that DateRange A Completely After DateRange B

_                        |---- DateRange A ------|
|---Date Range B -----| _

(True if StartA > EndB)

Let ConditionB Mean that DateRange A is Completely Before DateRange B

|---- DateRange A -----|                        _ 
_ |---Date Range B ----|

(True if EndA < StartB)

Then Overlap exists if Neither A Nor B is true -

(If one range is neither completely after the other,

nor completely before the other,
then they must overlap.)

Now one of De Morgan's laws says that:

Not (A Or B) <=> Not A And Not B

Which translates to: (StartA <= EndB) and (EndA >= StartB)


NOTE: This includes conditions where the edges overlap exactly. If you wish to exclude that,

change the >= operators to >, and <= to <


NOTE2. Thanks to @Baodad, see this blog, the actual overlap is least of:

{ endA-startA, endA - startB, endB-startA, endB - startB }

(StartA <= EndB) and (EndA >= StartB)
(StartA <= EndB) and (StartB <= EndA)


NOTE3. Thanks to @tomosius, a shorter version reads:

DateRangesOverlap = max(start1, start2) < min(end1, end2)

This is actually a syntactical shortcut for what is a longer implementation, which includes extra checks to verify that the start dates are on or before the endDates. Deriving this from above:

If start and end dates can be out of order, i.e., if it is possible that startA > endA or startB > endB, then you also have to check that they are in order, so that means you have to add two additional validity rules:

(StartA <= EndB) and (StartB <= EndA) and (StartA <= EndA) and (StartB <= EndB)
or:

(StartA <= EndB) and (StartA <= EndA) and (StartB <= EndA) and (StartB <= EndB)
or,

(StartA <= Min(EndA, EndB) and (StartB <= Min(EndA, EndB))
or:

(Max(StartA, StartB) <= Min(EndA, EndB)

But to implement Min() and Max(), you have to code, (using C ternary for terseness),:

(StartA > StartB? Start A: StartB) <= (EndA < EndB? EndA: EndB)

Find Overlapping Dates and Return Overlapping Records

Assuming that your #Overlap table has a primary key (or unique key) TableId

You could use this following query

SELECT * FROM #Overlap o
WHERE EXISTS
(
SELECT 1 FROM #Overlap o2
WHERE o2.SubType = o.SubType
AND o2.Cause = o.Cause
AND o2.CircuitID = o.CircuitID
AND (
o2.BegDate BETWEEN o.BegDate AND o.EndDate
OR o2.EndDate BETWEEN o.BegDate AND o.EndDate
OR o.BegDate BETWEEN o2.BegDate AND o2.EndDate
OR o.EndDate BETWEEN o2.BegDate AND o2.EndDate
)
AND o2.TableId != o.TableId
)
ORDER BY o.SubType, o.Cause, o.BegDate

Demo link: http://rextester.com/KBFX30109

As @HABO suggestion, you also can use to check overlapping

    AND o2.BegDate <= o.EndDate 
AND o.BegDate <= o2.EndDate

Asking for overlapping date ranges in same SQL table

You can use the LAG and LEAD analytic functions and then you do not need to use any self-joins:

Oracle Setup:

CREATE TABLE VEHICLE_GEARS ( vehicle_id, startdate, enddate, gear_id ) AS
SELECT '000000000', DATE '2014-02-25', DATE '2999-12-31', 302 FROM DUAL UNION ALL -- No duplicate vehicle & gear
SELECT '000000000', DATE '2006-12-15', DATE '2999-12-31', 604 FROM DUAL UNION ALL -- No duplicate vehicle & gear
SELECT '000000000', DATE '2006-12-15', DATE '2999-12-31', 605 FROM DUAL UNION ALL -- No duplicate vehicle & gear
SELECT '000000006', DATE '1991-07-01', DATE '2999-12-31', 601 FROM DUAL UNION ALL -- No duplicate vehicle & gear
SELECT '000000006', DATE '2016-01-18', DATE '2999-12-31', 605 FROM DUAL UNION ALL -- No duplicate vehicle & gear
SELECT '000000006', DATE '2012-02-20', DATE '2999-12-31', 804 FROM DUAL UNION ALL -- Overlaps previous
SELECT '000000006', DATE '1991-07-01', DATE '2999-12-31', 804 FROM DUAL UNION ALL -- Overlaps next
SELECT '000000147', DATE '1991-07-01', DATE '1992-10-08', 601 FROM DUAL UNION ALL -- Same end date as next start date
SELECT '000000147', DATE '1992-10-08', DATE '2999-12-31', 601 FROM DUAL UNION ALL -- Same start date as previous end date
SELECT '000000567', DATE '2018-01-01', DATE '2018-12-31', 203 FROM DUAL UNION ALL -- Overlaps next
SELECT '000000567', DATE '2018-12-01', DATE '2019-01-10', 203 FROM DUAL UNION ALL -- Overlaps previous and next
SELECT '000000567', DATE '2018-12-31', DATE '2019-01-10', 203 FROM DUAL UNION ALL -- Overlaps previous and next
SELECT '000000567', DATE '2018-12-31', DATE '2019-01-31', 203 FROM DUAL UNION ALL -- Overlaps previous
SELECT '000000567', DATE '2019-01-31', DATE '2019-02-28', 203 FROM DUAL -- Same start date as previous

Query:

SELECT vehicle_id,
startdate,
enddate,
gear_id
FROM (
SELECT G.*,
LAG( enddate ) OVER ( PARTITION BY vehicle_id, gear_id ORDER BY startdate, enddate ) AS prev_enddate,
LEAD( startdate ) OVER ( PARTITION BY vehicle_id, gear_id ORDER BY startdate, enddate ) AS next_startdate
FROM VEHICLE_GEARS G
)
WHERE prev_enddate > startdate
AND ( enddate <= next_startdate OR next_startdate IS NULL );

Output:


VEHICLE_ID | STARTDATE | ENDDATE | GEAR_ID
:--------- | :-------- | :-------- | ------:
000000006 | 20-FEB-12 | 31-DEC-99 | 804
000000567 | 31-DEC-18 | 31-JAN-19 | 203

db<>fiddle here

Find common date range from a set of overlapping date ranges

This answer counts the number of overlapping intervals. It assumes that date ranges having same EID do not overlap. Below is the query with explanation inline:

DECLARE @EventsTBL TABLE (PID INT, EID INT, StartDate DATETIME, EndDate DATETIME);
INSERT INTO @EventsTBL VALUES
(13579, 1, '01 Jan 2018', '31 Mar 2019'),
(13579, 2, '01 Feb 2018', '31 May 2018'),
(13579, 2, '01 Jul 2018', '31 Jan 2019'),
(13579, 7, '01 Mar 2018', '31 Mar 2019'),
(13579, 5, '01 Feb 2018', '30 Apr 2018'),
(13579, 5, '01 Oct 2018', '31 Mar 2019'),
(13579, 8, '01 Jan 2018', '30 Apr 2018'),
(13579, 8, '01 Jun 2018', '31 Dec 2018'),
(13579, 13, '01 Jan 2018', '31 Mar 2019'),
(13579, 6, '01 Apr 2018', '31 May 2018'),
(13579, 6, '01 Sep 2018', '30 Nov 2018'),
(13579, 4, '01 Feb 2018', '31 Jan 2019'),
(13579, 19, '01 Mar 2018', '31 Jul 2018'),
(13579, 19, '01 Oct 2018', '28 Feb 2019'),
(13570, 16, '01 Feb 2018', '30 Jun 2018'),
(13570, 16, '01 Aug 2018', '31 Aug 2018'),
(13570, 16, '01 Oct 2018', '28 Feb 2019'),
(13570, 23, '01 Mar 2018', '30 Jun 2018'),
(13570, 23, '01 Nov 2018', '31 Jan 2019');

WITH cte1 AS (
/*
* augment the data with the number of distinct EID per PID
* we will need this later
*/
SELECT e.PID, a.EIDCount, StartDate, EndDate
FROM @EventsTBL AS e
JOIN (
SELECT PID, COUNT(DISTINCT EID) AS EIDCount
FROM @EventsTBL
GROUP BY PID
) AS a ON e.PID = a.PID
), cte2 AS (
/*
* build a list of "points in time" at which an event started or ended
* and the number concurrent events changed
* the zero value rows are required!
*/
SELECT PID, EIDCount, StartDate AS pdate, 1 AS pval
FROM cte1
UNION ALL
SELECT PID, EIDCount, EndDate, 0
FROM cte1
UNION ALL
SELECT PID, EIDCount , DATEADD(DAY, 1, EndDate), -1
FROM cte1
), cte3 AS (
/*
* calculate running sum of pval over dates; minus ones first
*/
SELECT PID, EIDCount, pdate, SUM(pval) OVER (PARTITION BY PID ORDER BY pdate, pval) AS evtcount
FROM cte2
), cte4 AS (
/*
* consolidate data for same dates and we are done with the main part
*/
SELECT PID, EIDCount, pdate, MAX(evtcount) AS evtcount
FROM cte3
GROUP BY PID, EIDCount, pdate
), cte5 AS (
/*
* assign "change flag" to rows where number of concurrent events
* enters or exits the required count w.r.t. previous row
*/
SELECT PID, EIDCount, pdate, evtcount, CASE
WHEN evtcount < EIDCount AND LAG(evtcount) OVER (PARTITION BY PID ORDER BY pdate) < EIDCount THEN 0
WHEN evtcount = EIDCount AND LAG(evtcount) OVER (PARTITION BY PID ORDER BY pdate) = EIDCount THEN 0
ELSE 1
END AS chg
FROM cte4
), cte6 AS (
/*
* convert "change flag" to "group numbers" over consecutive rows using running sum
*/
SELECT PID, EIDCount, pdate, evtcount, SUM(chg) OVER (PARTITION BY PID ORDER BY pdate) AS grp
FROM cte5
)
/*
* group rows by pid and group numbers
*/
SELECT PID, MIN(pdate) AS StartDate, MAX(pdate) AS EndDate
FROM cte6
WHERE evtcount = EIDCount
GROUP BY PID, grp
ORDER BY PID, StartDate

Demo on db<>fiddle

Determine whether the time range of one table entry overlaps with another KDB+/Q

A bit of a long-winded solution here but I think this covers your use case:

q)raze{update overlap:{any(x within'z)|y within'z}'[startRange;endRange]{x where y<>til count x}'[;i]count[i]#enlist flip(startRange;endRange)from x}each{select from table where RIC=x}each`A.N`GOOG.O
RIC startRange endRange overlap
--------------------------------------------------------------------------
A.N 2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000 1
A.N 2022.01.03D09:32:04.000000000 2022.01.03D09:32:09.000000000 0
A.N 2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000 1
GOOG.O 2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000 1
GOOG.O 2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000 1

To break this answer down, firstly we needed the time ranges to check for an overlap. We start with all the time ranges for a given RIC:

q)`overlap xcols raze{update overlap:count[i]#enlist flip(startRange;endRange)from x}each{select from table where RIC=x}each`A.N`GOOG.O
overlap ..
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------..
(2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000;2022.01.03D09:32:04.000000000 2022.01.03D09:32:09.000000000;2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000..
(2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000;2022.01.03D09:32:04.000000000 2022.01.03D09:32:09.000000000;2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000..
(2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000;2022.01.03D09:32:04.000000000 2022.01.03D09:32:09.000000000;2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000..
(2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000;2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000) ..
(2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000;2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000) ..

We want to exclude the time range for the entry we're working from:

q)`overlap xcols raze{update overlap:{x where y<>til count x}'[;i]count[i]#enlist flip(startRange;endRange)from x}each{select from table where RIC=x}each`A.N`GOOG.O
overlap RIC startRange endRange ..
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------..
(2022.01.03D09:32:04.000000000 2022.01.03D09:32:09.000000000;2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000) A.N 2022.01.03D09:31:54.000000000 2022.01.03D09:31:5..
(2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000;2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000) A.N 2022.01.03D09:32:04.000000000 2022.01.03D09:32:0..
(2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000;2022.01.03D09:32:04.000000000 2022.01.03D09:32:09.000000000) A.N 2022.01.03D09:31:54.100000000 2022.01.03D09:31:5..
,2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000 GOOG.O 2022.01.03D09:31:54.000000000 2022.01.03D09:31:5..
,2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000 GOOG.O 2022.01.03D09:31:54.100000000 2022.01.03D09:31:5..

And finally see if startRange or endRange are within these time ranges:

q)`overlap xcols raze{update overlap:{any(x within'z)|y within'z}'[startRange;endRange]{x where y<>til count x}'[;i]count[i]#enlist flip(startRange;endRange)from x}each{select from table where RIC=x}each`A.N`GOOG.O
overlap RIC startRange endRange
--------------------------------------------------------------------------
1 A.N 2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000
0 A.N 2022.01.03D09:32:04.000000000 2022.01.03D09:32:09.000000000
1 A.N 2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000
1 GOOG.O 2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000
1 GOOG.O 2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000

EDIT:

A faster solution

q)raze{update overlap:{$[any x within'z;1b;any y within'z]}'[startRange;endRange]{x where y<>til count x}'[;i]count[i]#enlist flip(startRange;endRange)from x}each{select from table where RIC=x}each`A.N`GOOG.O

Find date range overlaps within the same table, for specific user MySQL

Here is the first part: Overlapping cars per user...

SQLFiddle - correlated Query and Join Query

Second part - more than one user in one car at the same time: SQLFiddle - correlated Query and Join Query. Query below...

I use the correlated queries:

You will likely need indexes on userid and 'car'. However - please check the 'explain plan' to see how it mysql is accessing the data. And just try it :)

Overlapping cars per user

The query:

SELECT `allCars`.`userid`  AS `allCars_userid`, 
`allCars`.`car` AS `allCars_car`,
`allCars`.`From` AS `allCars_From`,
`allCars`.`To` AS `allCars_To`,
`allCars`.`tableid` AS `allCars_id`
FROM
`cars` AS `allCars`
WHERE
EXISTS
(SELECT 1
FROM `cars` AS `overlapCar`
WHERE
`allCars`.`userid` = `overlapCar`.`userid`
AND `allCars`.`tableid` <> `overlapCar`.`tableid`
AND NOT ( `allCars`.`From` >= `overlapCar`.`To` /* starts after outer ends */
OR `allCars`.`To` <= `overlapCar`.`From`)) /* ends before outer starts */
ORDER BY
`allCars`.`userid`,
`allCars`.`From`,
`allCars`.`car`;

The results:

allCars_userid  allCars_car  allCars_From  allCars_To  allCars_id  
-------------- ----------- ------------ ---------- ------------
1 Navara 2015-03-01 2015-03-31 3
1 GTR 2015-03-28 2015-04-30 4
1 Skyline 2015-04-29 2015-05-31 9
2 Aygo 2015-03-01 2015-03-31 7
2 206 2015-03-29 2015-04-30 8
2 Skyline 2015-04-29 2015-05-31 10

Why it works? or How I think about it:

I use the correlated query so I don't have duplicates to deal with and it is probably the easiest to understand for me. There are other ways of expressing the query. Each has advantages and drawbacks. I want something I can easily understand.

Requirement: For each user ensure that they don't have two or more cars at the same time.

So, for each user record (AllCars) check the complete table (overlapCar) to see if you can find a different record that overlaps for the time of the current record. If we find one then select the current record we are checking (in allCars).

Therefore the overlap check is:

  • the allCars userid and the overLap userid must be the same

  • the allCars car record and the overlap car record must be different

  • the allCars time range and the overLap time range must overlap.

    The time range check:

    Instead of checking for overlapping times use positive tests. The easiest approach, is to check it doesn't overlap, and apply a NOT to it.

One car with More than One User at the same time...

The query:

SELECT  `allCars`.`car`     AS `allCars_car`,
`allCars`.`userid` AS `allCars_userid`,
`allCars`.`From` AS `allCars_From`,
`allCars`.`To` AS `allCars_To`,
`allCars`.`tableid` AS `allCars_id`

FROM
`cars` AS `allCars`
WHERE
EXISTS
(SELECT 1
FROM `cars` AS `overlapUser`
WHERE
`allCars`.`car` = `overlapUser`.`car`
AND `allCars`.`tableid` <> `overlapUser`.`tableid`
AND NOT ( `allCars`.`From` >= `overlapUser`.`To` /* starts after outer ends */
OR `allCars`.`To` <= `overlapUser`.`From`)) /* ends before outer starts */
ORDER BY
`allCars`.`car`,
`allCars`.`userid`,
`allCars`.`From`;


The results:

allCars_car  allCars_userid  allCars_From  allCars_To    allCars_id  
----------- -------------- ------------ ---------- ------------
Skyline 1 2015-04-29 2015-05-31 9
Skyline 2 2015-04-29 2015-05-31 10

Edit:

In view of the comments, by @philipxy , about time ranges needing 'greater than or equal to' checks I have updated the code here. I havn't changed the SQLFiddles.



Related Topics



Leave a reply



Submit