Flattening Intersecting Timespans

Flattening intersecting timespans

Here is a SQL only solution. I used DATETIME for the columns. Storing the time separate is a mistake in my opinion, as you will have problems when the times go past midnight. You can adjust this to handle that situation though if you need to. The solution also assumes that the start and end times are NOT NULL. Again, you can adjust as needed if that's not the case.

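The gist of the solution is to find every start time that is not covered by another span, find every end time that is not covered by another span, and then pair each such start with the nearest such end that follows it. The final LEFT OUTER JOIN to ET2 discards any pairing that skips over a closer end time.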

The results match your expected results except in one case. Checking by hand, it looks like there is a mistake in your expected output: on the 6th there should be a span that ends at 2009-06-06 10:18:45.000.

SELECT
    ST.start_time,
    ET.end_time
FROM
    (
    -- Start times that no earlier-starting span reaches
    SELECT
        T1.start_time
    FROM
        dbo.Test_Time_Spans T1
    LEFT OUTER JOIN dbo.Test_Time_Spans T2 ON
        T2.start_time < T1.start_time AND
        T2.end_time >= T1.start_time
    WHERE
        T2.start_time IS NULL
    ) AS ST
INNER JOIN
    (
    -- End times that no later-ending span covers
    SELECT
        T3.end_time
    FROM
        dbo.Test_Time_Spans T3
    LEFT OUTER JOIN dbo.Test_Time_Spans T4 ON
        T4.end_time > T3.end_time AND
        T4.start_time <= T3.end_time
    WHERE
        T4.start_time IS NULL
    ) AS ET ON
    ET.end_time > ST.start_time
LEFT OUTER JOIN
    (
    -- Same set of end times, used to rule out pairings that skip a closer end
    SELECT
        T5.end_time
    FROM
        dbo.Test_Time_Spans T5
    LEFT OUTER JOIN dbo.Test_Time_Spans T6 ON
        T6.end_time > T5.end_time AND
        T6.start_time <= T5.end_time
    WHERE
        T6.start_time IS NULL
    ) AS ET2 ON
    ET2.end_time > ST.start_time AND
    ET2.end_time < ET.end_time
WHERE
    ET2.end_time IS NULL
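
To try the query on a small data set without a SQL Server instance, here is a minimal sketch that runs it in an in-memory SQLite database from Python. This is an assumption-laden stand-in, not the original setup: the `dbo.` schema prefix is dropped, timestamps are stored as ISO-8601 text (which sorts chronologically), and the sample rows and the `flatten` helper are made up for illustration.

```python
import sqlite3

# The query from the answer, with the dbo. prefix removed for SQLite.
FLATTEN_SQL = """
SELECT ST.start_time, ET.end_time
FROM (
    SELECT T1.start_time
    FROM Test_Time_Spans T1
    LEFT OUTER JOIN Test_Time_Spans T2 ON
        T2.start_time < T1.start_time AND
        T2.end_time >= T1.start_time
    WHERE T2.start_time IS NULL
) AS ST
INNER JOIN (
    SELECT T3.end_time
    FROM Test_Time_Spans T3
    LEFT OUTER JOIN Test_Time_Spans T4 ON
        T4.end_time > T3.end_time AND
        T4.start_time <= T3.end_time
    WHERE T4.start_time IS NULL
) AS ET ON ET.end_time > ST.start_time
LEFT OUTER JOIN (
    SELECT T5.end_time
    FROM Test_Time_Spans T5
    LEFT OUTER JOIN Test_Time_Spans T6 ON
        T6.end_time > T5.end_time AND
        T6.start_time <= T5.end_time
    WHERE T6.start_time IS NULL
) AS ET2 ON
    ET2.end_time > ST.start_time AND
    ET2.end_time < ET.end_time
WHERE ET2.end_time IS NULL
ORDER BY ST.start_time
"""


def flatten(spans):
    """Load (start_time, end_time) pairs into SQLite and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE Test_Time_Spans ("
            "start_time TEXT NOT NULL, end_time TEXT NOT NULL)"
        )
        conn.executemany("INSERT INTO Test_Time_Spans VALUES (?, ?)", spans)
        return conn.execute(FLATTEN_SQL).fetchall()
    finally:
        conn.close()


# Hypothetical sample data: three overlapping spans, then two that touch.
sample_spans = [
    ("2009-06-01 08:00:00", "2009-06-01 09:00:00"),
    ("2009-06-01 08:30:00", "2009-06-01 10:00:00"),
    ("2009-06-01 09:45:00", "2009-06-01 10:30:00"),
    ("2009-06-01 11:00:00", "2009-06-01 12:00:00"),
    ("2009-06-01 12:00:00", "2009-06-01 12:30:00"),
]

flattened = flatten(sample_spans)
for start_time, end_time in flattened:
    print(start_time, "->", end_time)
```

With this data the five spans collapse into two: 08:00 to 10:30 and 11:00 to 12:30. Note that spans which merely touch are merged, because the comparisons use >= and <=. If several rows share the same uncovered start or end time, the query returns that span more than once, so SELECT DISTINCT may be needed for such data.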

Flattening intersecting timespans by user in PostgreSQL

I found this example of how to make a 'range aggregate' using windowing functions and a lot of nested subqueries. I just adapted it to partition and group by user_id, and it seems to do what you want:

SELECT user_id, min(login_time) as login_time, max(logout_time) as logout_time
FROM (
    SELECT user_id, login_time, logout_time,
        max(new_start) OVER (PARTITION BY user_id ORDER BY login_time, logout_time) AS left_edge
    FROM (
        SELECT user_id, login_time, logout_time,
            CASE
                WHEN login_time <= max(lag_logout_time) OVER (
                    PARTITION BY user_id ORDER BY login_time, logout_time
                ) THEN NULL
                ELSE login_time
            END AS new_start
        FROM (
            SELECT
                user_id,
                login_time,
                logout_time,
                lag(logout_time) OVER (PARTITION BY user_id ORDER BY login_time, logout_time) AS lag_logout_time
            FROM app_log
        ) AS s1
    ) AS s2
) AS s3
GROUP BY user_id, left_edge
ORDER BY user_id, min(login_time)

Results in:

 user_id |     login_time      |     logout_time
---------+---------------------+---------------------
       1 | 2014-01-01 08:00:00 | 2014-01-01 10:49:00
       1 | 2014-01-01 10:55:00 | 2014-01-01 11:00:00
       2 | 2014-01-01 09:00:00 | 2014-01-01 11:49:00
       2 | 2014-01-01 11:55:00 | 2014-01-01 12:00:00
(4 rows)

It works by first detecting the beginning of each new range (partitioned by user_id), then extending and grouping by the detected ranges. I found I had to read that article very carefully to understand it!
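
If the nested window functions are hard to follow, the same idea can be written as a plain loop. The sketch below is only an illustration in Python, not the article's method: `flatten_by_user` and the sample rows are hypothetical, and timestamps are assumed to be ISO-8601 strings so that string comparison matches time order. A new range starts whenever a login is later than the latest logout seen so far for that user, which corresponds to the `new_start` column in the query.

```python
from itertools import groupby
from operator import itemgetter


def flatten_by_user(rows):
    """Merge overlapping (user_id, login_time, logout_time) rows per user."""
    merged = []
    # Same ordering as the window: user_id, then login_time, then logout_time.
    ordered = sorted(rows)
    for user_id, sessions in groupby(ordered, key=itemgetter(0)):
        start = end = None
        for _, login, logout in sessions:
            if start is None:
                # First session for this user opens the first range.
                start, end = login, logout
            elif login <= end:
                # Still inside the current range: extend its right edge.
                end = max(end, logout)
            else:
                # Login is past every earlier logout: a new range begins.
                merged.append((user_id, start, end))
                start, end = login, logout
        merged.append((user_id, start, end))
    return merged


# Hypothetical sample data, not the data behind the results shown above.
sample_log = [
    (1, "2014-01-01 08:00:00", "2014-01-01 09:00:00"),
    (1, "2014-01-01 08:30:00", "2014-01-01 10:00:00"),
    (1, "2014-01-01 10:30:00", "2014-01-01 11:00:00"),
    (2, "2014-01-01 09:00:00", "2014-01-01 09:30:00"),
    (2, "2014-01-01 09:30:00", "2014-01-01 10:00:00"),
]

ranges = flatten_by_user(sample_log)
for row in ranges:
    print(row)
```

Here user 1 ends up with two ranges (08:00 to 10:00 and 10:30 to 11:00) and user 2 with one (09:00 to 10:00), since sessions that touch are treated as one range, just as the `<=` comparison does in the query.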

The article suggests it can be simplified in PostgreSQL 9.0 or later by removing the innermost subquery and changing the window range, but I could not get that to work.

Finding 'free' times in MySQL

ax came up with the best answer I've seen so far to this - that is, http://explainextended.com/2009/06/13/flattening-timespans-mysql/

Calculate missing date ranges and overlapping date ranges between two dates

It's a little variation of the function to flatten intersecting timespans in SQL Server:

  • Flattening timespans: SQL Server

It's one of the rare cases when a cursor-based approach in SQL Server is faster than a set-based one:


-- Hypothetical name: a function cannot share its name with the table it reads (mytable).
CREATE FUNCTION fn_mytable_spans(@p_from DATETIME, @p_till DATETIME)
RETURNS @t TABLE
    (
    q_type VARCHAR(20) NOT NULL,
    q_start DATETIME NOT NULL,
    q_end DATETIME NOT NULL
    )
AS
BEGIN
    DECLARE @qs DATETIME
    DECLARE @qe DATETIME
    DECLARE @ms DATETIME
    DECLARE @me DATETIME
    DECLARE cr_span CURSOR FAST_FORWARD
    FOR
    SELECT startDate, endDate
    FROM mytable
    WHERE startDate BETWEEN @p_from AND @p_till
    ORDER BY
        startDate
    OPEN cr_span
    FETCH NEXT
    FROM cr_span
    INTO @qs, @qe
    -- @ms and @me hold the start and end of the range being merged
    SET @ms = @qs
    SET @me = @qe
    WHILE @@FETCH_STATUS = 0
    BEGIN
        FETCH NEXT
        FROM cr_span
        INTO @qs, @qe
        -- The next span starts after the merged range ends: emit the range and the gap
        IF @qs > @me
        BEGIN
            INSERT
            INTO @t
            VALUES ('overlap', @ms, @me)
            INSERT
            INTO @t
            VALUES ('gap', @me, @qs)
            SET @ms = @qs
        END
        SET @me = CASE WHEN @qe > @me THEN @qe ELSE @me END
    END
    -- Emit the last merged range; all three columns must be supplied
    IF @ms IS NOT NULL
    BEGIN
        INSERT
        INTO @t
        VALUES ('overlap', @ms, @me)
    END
    CLOSE cr_span
    DEALLOCATE cr_span
    RETURN
END
GO

This function compresses each contiguous set of intersecting ranges into one range, and returns both the range and the following gap.
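
The single pass that the cursor makes can be sketched outside the database as well. The Python below is a minimal illustration under stated assumptions, not a translation of the whole function: `overlaps_and_gaps` and the sample dates are hypothetical, dates are ISO-8601 strings, and the `@p_from`/`@p_till` filter is left out.

```python
def overlaps_and_gaps(spans):
    """Return ('overlap', start, end) and ('gap', start, end) rows in order."""
    result = []
    ordered = sorted(spans)  # the cursor reads rows ordered by start date
    if not ordered:
        return result
    ms, me = ordered[0]  # current merged range, like @ms and @me
    for qs, qe in ordered[1:]:
        if qs > me:
            # The next span starts after the merged range ends:
            # emit the range and the gap that follows it.
            result.append(("overlap", ms, me))
            result.append(("gap", me, qs))
            ms = qs
        me = max(me, qe)  # extend the right edge of the merged range
    result.append(("overlap", ms, me))  # the last range has no following gap
    return result


# Hypothetical sample data.
sample_ranges = [
    ("2009-06-01", "2009-06-05"),
    ("2009-06-03", "2009-06-08"),
    ("2009-06-10", "2009-06-12"),
]

rows = overlaps_and_gaps(sample_ranges)
for row in rows:
    print(row)
```

For this input the first two ranges merge into 2009-06-01 to 2009-06-08, followed by a gap from 2009-06-08 to 2009-06-10 and a final range from 2009-06-10 to 2009-06-12.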

Min effective and termdate for contiguous dates

I did something like this to get the effdate and did the same for the termdate; I created the two as separate views and combined them to get the final result.

SELECT DISTINCT e0.effdate, e0.ID
FROM dbo.datatable e0
LEFT OUTER JOIN dbo.datatable PREV ON
    PREV.ID = e0.ID
    AND PREV.termdate = DATEADD(dy, -1, e0.Effdate)
WHERE PREV.ID IS NULL
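
The anti-join keeps an effdate only when no row for the same ID ends on the previous day, that is, when the date opens a new contiguous block. A rough Python equivalent, with a hypothetical `contiguous_starts` helper and made-up rows, might look like this:

```python
from datetime import date, timedelta


def contiguous_starts(rows):
    """Return sorted (id, effdate) pairs that begin a contiguous block."""
    rows = list(rows)
    # Every (id, termdate) pair, used to look up "ended the day before".
    termdates = {(row_id, term) for row_id, _, term in rows}
    one_day = timedelta(days=1)
    starts = {
        (row_id, eff)
        for row_id, eff, _ in rows
        if (row_id, eff - one_day) not in termdates
    }
    return sorted(starts)


# Hypothetical sample data: (ID, effdate, termdate).
sample_rows = [
    (1, date(2010, 1, 1), date(2010, 1, 31)),
    (1, date(2010, 2, 1), date(2010, 2, 28)),  # continues the January row
    (1, date(2010, 4, 1), date(2010, 4, 30)),  # starts after a gap
    (2, date(2010, 3, 1), date(2010, 3, 15)),
]

block_starts = contiguous_starts(sample_rows)
for row_id, effdate in block_starts:
    print(row_id, effdate)
```

ID 1 gets two block starts (2010-01-01 and 2010-04-01) because the February row continues the January one, and ID 2 gets one (2010-03-01). The matching termdate view would apply the mirror-image test: keep a termdate when no row for the same ID starts on the following day.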

