find start and stop date for contiguous dates in multiple rows
The old answer had a weakness: every row is checked only with the previous to verify if the period are overlapping, if an earlier row have a period that last more the logic will not consider it. For example:
Code | DateStart | DateFrom | Overlap
-----+---------------------+---------------------+---------
1006 | 2014-06-18 07:00:00 | 2014-06-18 19:00:00 | 0
1006 | 2014-06-18 08:10:00 | 2014-06-18 10:00:00 | 1
1006 | 2014-06-18 16:00:00 | 2014-06-18 20:30:00 | 0 <- don't overlap with
previous but overlap
with the first
To improve that PrevStop
need to become LastStop
and have the value of the max of the previous DateFrom
for the Code
With N AS (
SELECT Code, DateFrom, DateTo
, LastStop = MAX(DateTo)
OVER (PARTITION BY Code ORDER BY DateFrom, DateTo
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
FROM Table1
), B AS (
SELECT Code, DateFrom, DateTo
, Block = SUM(CASE WHEN LastStop Is Null Then 1
WHEN LastStop < DateFrom Then 1
ELSE 0
END)
OVER (PARTITION BY Code ORDER BY DateFrom, LastStop)
FROM N
)
SELECT Code
, MIN(DateFrom) DateFrom
, MAX(DateTo) DateTo
FROM B
GROUP BY Code, Block
ORDER BY Code, Block
SQLFiddle Demo
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
is needed to remove the current row from theMAX
.
Old Answer
This query works only every the period is not completely inside the precedent.
The idea is to check for every row if it is linked with the next/previous one.
If rows are linked they form a block and they will be grouped together to get the first DateFrom
and the last DateTo
With N AS (
SELECT Code, DateFrom, DateTo
, PrevStop = LAG(DateTo, 1, NULL)
OVER (PARTITION BY Code ORDER BY DateFrom)
FROM Table1
), B AS (
SELECT Code, DateFrom, DateTo
, Block = SUM(CASE WHEN PrevStop Is Null Then 1
WHEN PrevStop < DateFrom Then 1
ELSE 0
END)
OVER (PARTITION BY Code ORDER BY PrevStop)
FROM N
)
SELECT Code
, MIN(DateFrom) DateFrom
, MAX(DateTo) DateTo
FROM B
GROUP BY Code, Block
ORDER BY Code, Block
SQLFiddle demo with some data added to check with more block on the same code/day
The query search for the block starter checking every row if they are the first for the code (PrevStop IS NULL
) or if they are outside the previous one (PrevStop < DateFrom
).
The windowed SUM
retrieve only the previous row by the ORDER
to create costant value for block of linked data, for example with the test data we will get
Code | DateStart | DateFrom | Starter | Block
-----+---------------------+---------------------+---------+------
1006 | 2014-06-18 07:00:00 | 2014-06-18 09:00:00 | 1 | 1
1006 | 2014-06-18 08:10:00 | 2014-06-18 06:00:00 | 0 | 1
1006 | 2014-06-18 08:00:00 | 2014-06-18 08:30:00 | 0 | 1
1006 | 2014-06-18 07:00:00 | 2014-06-18 07:30:00 | 1 | 2
1006 | 2014-06-18 08:00:00 | 2014-06-18 08:30:00 | 1 | 3
1006 | 2014-06-18 08:10:00 | 2014-06-18 09:00:00 | 0 | 3
3006 | 2014-06-18 07:00:00 | 2014-06-18 08:00:00 | 1 | 1
3006 | 2014-06-18 09:00:00 | 2014-06-18 10:00:00 | 1 | 2
grouping by Code
and Block
get the result
Get Start and End date from multiple rows of dates, excluding weekends
This is a Gaps & Island problem with the twist that you need to consider weekend continuity.
You can do:
select max(name) as name, min(date_from) as date_from, max(date_to) as date_to
from (
select *, sum(inc) over(order by date_to) as grp
from (
select *,
case when lag(ext_to) over(order by date_to) = date_from
then 0 else 1 end as inc
from (
select *,
case when (datepart(weekday, date_to) = 6)
then dateadd(day, 3, date_to)
else dateadd(day, 1, date_to) end as ext_to
from t
) x
) y
) z
group by grp
Result:
name date_from date_to
---- ---------- ----------
A 2021-11-08 2021-11-09
A 2021-12-23 2022-01-03
See running example at db<>fiddle #1.
Note: Your question doesn't mention it, but you probably want to segment per person. I didn't do it.
EDIT: Adding partition by name
Partitioning by name is quite easy actually. The following query does it:
select name, min(date_from) as date_from, max(date_to) as date_to
from (
select *, sum(inc) over(partition by name order by date_to) as grp
from (
select *,
case when lag(ext_to) over(partition by name order by date_to) = date_from
then 0 else 1 end as inc
from (
select *,
case when (datepart(weekday, date_to) = 6)
then dateadd(day, 3, date_to)
else dateadd(day, 1, date_to) end as ext_to
from t
) x
) y
) z
group by name, grp
order by name, grp
See running query at db<>fiddle #2.
How to find contiguous dates in numerous rows in SQL Server
You can find where periods of "continuous" service begin by using lag()
. Then a cumulative sum of this flag provides a group, which can be used for aggregation:
select people_id, min(datestart) as datestart,
(case when count(dateend) = count(*) then max(dateend) end) as dateend
from (select t.*,
sum(case when prev_dateend = datestart then 0 else 1 end) over
(partition by people_id order by datestart) as grp
from (select t.*,
lag(dateend) over (partition by people_id order by date_start) as prev_dateend
from t
) t
) t
group by people_id, grp
having count(*) > count(dateend);
SQL: find continuous date ranges across multiple rows?
Sample data
create table tbl (person int, startdate datetime, enddate datetime, hours decimal(10,2));
insert tbl values
(5163 ,'2013-04-29 07:00:00.000' ,'2013-04-29 11:00:00.000', 4.00),
(5163 ,'2013-04-29 11:30:00.000' ,'2013-04-29 15:30:00.000', 4.00),
(5163 ,'2013-04-29 15:30:00.000' ,'2013-04-29 19:06:00.000', 3.60),
(5851 ,'2013-05-02 19:00:00.000' ,'2013-05-02 23:00:00.000', 4.00),
(5851 ,'2013-05-02 23:00:00.000' ,'2013-05-03 00:00:00.000', 1.00),
(5851 ,'2013-05-03 00:00:00.000' ,'2013-05-03 00:31:00.000', 0.52);
The query
;with cte as (
select person, startdate, enddate, hours
from tbl
union all
select t.person, cte.startdate, t.enddate, cast(cte.hours + t.hours as decimal(10,2))
from cte
join tbl t on cte.person = t.person and cte.enddate = t.startdate
), cte2 as (
select *, rn = row_number() over (partition by person, enddate order by startdate)
from cte
)
select person, startdate, max(enddate) enddate, max(hours) hours
from cte2
where rn=1
group by person, startdate
order by person, startdate;
Results
person startdate enddate hours
----------- ----------------------- ----------------------- -------
5163 2013-04-29 07:00:00.000 2013-04-29 11:00:00.000 4.00
5163 2013-04-29 11:30:00.000 2013-04-29 19:06:00.000 7.60
5851 2013-05-02 19:00:00.000 2013-05-03 00:31:00.000 5.52
Related Topics
Blank Values in Date Column Returning as 1900/01/01 on Running Select Statement
How Does One Escape an Apostrophe in Db2 SQL
Rounding Issue in Log and Exp Functions
Merge Overlapping Time Intervals, How
How to Run Raw SQL with Kotlin's Exposed Library
See SQL from Entityframework with Collection-Like Queries
How to Determine Position of Row in SQL Result-Set
How to Subtract 2 Dates in Oracle to Get the Result in Hour and Minute
Can SQL Profiler Display Return Result Sets Alongside the Query
Powershell SQL Select Output to Variable
SQL - Pivot Table and Group by Not Working
How to Force Oracle to Change a Query's Plan Without Using Hints
MySQL Error 1248 (42000): Every Derived Table Must Have Its Own Alias