Find Start and Stop Date for Contiguous Dates in Multiple Rows

find start and stop date for contiguous dates in multiple rows

The old answer had a weakness: every row is checked only with the previous to verify if the period are overlapping, if an earlier row have a period that last more the logic will not consider it. For example:

Code | DateStart           | DateFrom            | Overlap
-----+---------------------+---------------------+---------
1006 | 2014-06-18 07:00:00 | 2014-06-18 19:00:00 | 0
1006 | 2014-06-18 08:10:00 | 2014-06-18 10:00:00 | 1
1006 | 2014-06-18 16:00:00 | 2014-06-18 20:30:00 | 0 <- don't overlap with
previous but overlap
with the first

To improve that PrevStop need to become LastStop and have the value of the max of the previous DateFrom for the Code

With N AS (
SELECT Code, DateFrom, DateTo
, LastStop = MAX(DateTo)
OVER (PARTITION BY Code ORDER BY DateFrom, DateTo
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
FROM Table1
), B AS (
SELECT Code, DateFrom, DateTo
, Block = SUM(CASE WHEN LastStop Is Null Then 1
WHEN LastStop < DateFrom Then 1
ELSE 0
END)
OVER (PARTITION BY Code ORDER BY DateFrom, LastStop)
FROM N
)
SELECT Code
, MIN(DateFrom) DateFrom
, MAX(DateTo) DateTo
FROM B
GROUP BY Code, Block
ORDER BY Code, Block

SQLFiddle Demo

ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING is needed to remove the current row from theMAX.


Old Answer

This query works only every the period is not completely inside the precedent.

The idea is to check for every row if it is linked with the next/previous one.

If rows are linked they form a block and they will be grouped together to get the first DateFrom and the last DateTo

With N AS (
SELECT Code, DateFrom, DateTo
, PrevStop = LAG(DateTo, 1, NULL)
OVER (PARTITION BY Code ORDER BY DateFrom)
FROM Table1
), B AS (
SELECT Code, DateFrom, DateTo
, Block = SUM(CASE WHEN PrevStop Is Null Then 1
WHEN PrevStop < DateFrom Then 1
ELSE 0
END)
OVER (PARTITION BY Code ORDER BY PrevStop)
FROM N
)
SELECT Code
, MIN(DateFrom) DateFrom
, MAX(DateTo) DateTo
FROM B
GROUP BY Code, Block
ORDER BY Code, Block

SQLFiddle demo with some data added to check with more block on the same code/day

The query search for the block starter checking every row if they are the first for the code (PrevStop IS NULL) or if they are outside the previous one (PrevStop < DateFrom).

The windowed SUM retrieve only the previous row by the ORDER to create costant value for block of linked data, for example with the test data we will get

Code | DateStart           | DateFrom            | Starter | Block
-----+---------------------+---------------------+---------+------
1006 | 2014-06-18 07:00:00 | 2014-06-18 09:00:00 | 1 | 1
1006 | 2014-06-18 08:10:00 | 2014-06-18 06:00:00 | 0 | 1
1006 | 2014-06-18 08:00:00 | 2014-06-18 08:30:00 | 0 | 1
1006 | 2014-06-18 07:00:00 | 2014-06-18 07:30:00 | 1 | 2
1006 | 2014-06-18 08:00:00 | 2014-06-18 08:30:00 | 1 | 3
1006 | 2014-06-18 08:10:00 | 2014-06-18 09:00:00 | 0 | 3
3006 | 2014-06-18 07:00:00 | 2014-06-18 08:00:00 | 1 | 1
3006 | 2014-06-18 09:00:00 | 2014-06-18 10:00:00 | 1 | 2

grouping by Code and Block get the result

Get Start and End date from multiple rows of dates, excluding weekends

This is a Gaps & Island problem with the twist that you need to consider weekend continuity.

You can do:

select max(name) as name, min(date_from) as date_from, max(date_to) as date_to
from (
select *, sum(inc) over(order by date_to) as grp
from (
select *,
case when lag(ext_to) over(order by date_to) = date_from
then 0 else 1 end as inc
from (
select *,
case when (datepart(weekday, date_to) = 6)
then dateadd(day, 3, date_to)
else dateadd(day, 1, date_to) end as ext_to
from t
) x
) y
) z
group by grp

Result:

name  date_from   date_to
---- ---------- ----------
A 2021-11-08 2021-11-09
A 2021-12-23 2022-01-03

See running example at db<>fiddle #1.

Note: Your question doesn't mention it, but you probably want to segment per person. I didn't do it.

EDIT: Adding partition by name

Partitioning by name is quite easy actually. The following query does it:

select name, min(date_from) as date_from, max(date_to) as date_to
from (
select *, sum(inc) over(partition by name order by date_to) as grp
from (
select *,
case when lag(ext_to) over(partition by name order by date_to) = date_from
then 0 else 1 end as inc
from (
select *,
case when (datepart(weekday, date_to) = 6)
then dateadd(day, 3, date_to)
else dateadd(day, 1, date_to) end as ext_to
from t
) x
) y
) z
group by name, grp
order by name, grp

See running query at db<>fiddle #2.

How to find contiguous dates in numerous rows in SQL Server

You can find where periods of "continuous" service begin by using lag(). Then a cumulative sum of this flag provides a group, which can be used for aggregation:

select people_id, min(datestart) as datestart,
(case when count(dateend) = count(*) then max(dateend) end) as dateend
from (select t.*,
sum(case when prev_dateend = datestart then 0 else 1 end) over
(partition by people_id order by datestart) as grp
from (select t.*,
lag(dateend) over (partition by people_id order by date_start) as prev_dateend
from t
) t
) t
group by people_id, grp
having count(*) > count(dateend);

SQL: find continuous date ranges across multiple rows?

Sample data

create table tbl (person int, startdate datetime, enddate datetime, hours decimal(10,2));
insert tbl values
(5163 ,'2013-04-29 07:00:00.000' ,'2013-04-29 11:00:00.000', 4.00),
(5163 ,'2013-04-29 11:30:00.000' ,'2013-04-29 15:30:00.000', 4.00),
(5163 ,'2013-04-29 15:30:00.000' ,'2013-04-29 19:06:00.000', 3.60),
(5851 ,'2013-05-02 19:00:00.000' ,'2013-05-02 23:00:00.000', 4.00),
(5851 ,'2013-05-02 23:00:00.000' ,'2013-05-03 00:00:00.000', 1.00),
(5851 ,'2013-05-03 00:00:00.000' ,'2013-05-03 00:31:00.000', 0.52);

The query

;with cte as (
select person, startdate, enddate, hours
from tbl
union all
select t.person, cte.startdate, t.enddate, cast(cte.hours + t.hours as decimal(10,2))
from cte
join tbl t on cte.person = t.person and cte.enddate = t.startdate
), cte2 as (
select *, rn = row_number() over (partition by person, enddate order by startdate)
from cte
)
select person, startdate, max(enddate) enddate, max(hours) hours
from cte2
where rn=1
group by person, startdate
order by person, startdate;

Results

person      startdate               enddate                 hours
----------- ----------------------- ----------------------- -------
5163 2013-04-29 07:00:00.000 2013-04-29 11:00:00.000 4.00
5163 2013-04-29 11:30:00.000 2013-04-29 19:06:00.000 7.60
5851 2013-05-02 19:00:00.000 2013-05-03 00:31:00.000 5.52


Related Topics



Leave a reply



Submit