Merging Date Intervals in SQL Server

Merging date intervals in SQL Server

It takes longer for me to set up the sample data than to write the query - it would be better if you posted questions that include CREATE TABLE and INSERT/SELECT statements. I don't know what your table is called, I've called mine Periods:

create table Periods (
    StartDate date not null,
    EndDate date not null
)
go
insert into Periods(StartDate,EndDate)
select '19820302','19820930' union all
select '19821001','19850117' union all
select '19850626','19850726' union all
select '19850730','19911231' union all
select '19920101','19951231' union all
select '19960101','20040531' union all
select '20040605','20060131' union all
select '20060201','20110520'
go
; with MergedPeriods as (
    Select p1.StartDate, p1.EndDate
    from
        Periods p1
            left join
        Periods p2
            on
                p1.StartDate = DATEADD(day,1,p2.EndDate)
    where
        p2.StartDate is null
    union all
    select p1.StartDate,p2.EndDate
    from
        MergedPeriods p1
            inner join
        Periods p2
            on
                p1.EndDate = DATEADD(day,-1,p2.StartDate)
)
select StartDate,MAX(EndDate) as EndDate
from MergedPeriods group by StartDate

Result:

StartDate   EndDate
1982-03-02  1985-01-17
1985-06-26  1985-07-26
1985-07-30  2004-05-31
2004-06-05  2011-05-20

Merge overlapping dates in SQL Server

SQL DEMO

declare @t table (Name varchar(100),  Datetime_Start  datetime,  Datetime_End datetime);
insert into @t values
 ('A'   , '2017-01-02 00:00' , '2017-03-28 00:10'),
 ('A'   , '2017-05-14 23:50' , '2017-05-29 23:50'),
 ('B'   , '2017-05-18 00:00' , '2017-05-18 04:00'),
 ('B'   , '2017-05-18 02:00' , '2017-05-18 03:00'),
 ('C'   , '2017-01-02 00:00' , '2017-01-17 15:50'),
 ('C'   , '2017-01-14 03:50' , '2017-01-28 15:50');

with Datetime_Starts as 
( 
  select distinct name, Datetime_Start 
  from @t as t1 
  where not exists 
    (select * from @t as t2 
     where t2.name = t1.name 
       and t2.Datetime_Start < t1.Datetime_Start 
       and t2.Datetime_End >= t1.Datetime_Start) 
), 
Datetime_Ends as 
( 
  select distinct name, Datetime_End 
  from @t as t1 
  where not exists 
    (select * from @t as t2 
     where t2.name = t1.name 
       and t2.Datetime_End > t1.Datetime_End 
       and t2.Datetime_Start <= t1.Datetime_End) 
) 

select name, Datetime_Start, 
      (select min(Datetime_End) 
        from Datetime_Ends as e 
        where e.name = s.name 
            and Datetime_End >= Datetime_Start) as Datetime_End 
    from Datetime_Starts as s;

merge consecutive dates ranges using SQL

This is a Gaps & Islands problem. You can use the typical solution using LAG(). For example:

select
  max(client_id) as client_id,
  max(status) as status,
  min(start_date) as start_date,
  max(end_date) as end_date
from (
  select *, sum(i) over(partition by client_id order by start_date) as g
  from (
    select *,
      case when dateadd(day, -1, start_date) <> 
        lag(end_date) over(partition by client_id order by start_date) 
      then 1 else 0 end as i
    from t
  ) x
) y
group by client_id, g
order by client_id, g

Result:

 client_id  status  start_date  end_date   
 ---------- ------- ----------- ---------- 
 1          1       2022-01-01  2022-01-04 
 1          1       2022-01-12  2022-01-15 
 2          1       2022-01-03  2022-01-03 
 2          1       2022-01-05  2022-01-06

See running example at db<>fiddle.

Combine consecutive date ranges

The strange bit you see with my use of the date '31211231' is just a very large date to handle your "no-end-date" scenario. I have assumed you won't really have many date ranges per employee, so I've used a simple Recursive Common Table Expression to combine the ranges.

To make it run faster, the starting anchor query keeps only those dates that will not link up to a prior range (per employee). The rest is just tree-walking the date ranges and growing the range. The final GROUP BY keeps only the largest date range built up per starting ANCHOR (employmentid, startdate) combination.

SQL Fiddle

MS SQL Server 2008 Schema Setup:

create table Tbl (
  employmentid int,
  startdate datetime,
  enddate datetime);

insert Tbl values
(5, '2007-12-03', '2011-08-26'),
(5, '2013-05-02', null),
(30, '2006-10-02', '2011-01-16'),
(30, '2011-01-17', '2012-08-12'),
(30, '2012-08-13', null),
(66, '2007-09-24', null);

/*
-- expected outcome
EmploymentId StartDate   EndDate
5            2007-12-03  2011-08-26
5            2013-05-02  NULL
30           2006-10-02  NULL
66           2007-09-24  NULL
*/

Query 1:

;with cte as (
   select a.employmentid, a.startdate, a.enddate
     from Tbl a
left join Tbl b on a.employmentid=b.employmentid and a.startdate-1=b.enddate
    where b.employmentid is null
    union all
   select a.employmentid, a.startdate, b.enddate
     from cte a
     join Tbl b on a.employmentid=b.employmentid and b.startdate-1=a.enddate
)
   select employmentid,
          startdate,
          nullif(max(isnull(enddate,'32121231')),'32121231') enddate
     from cte
 group by employmentid, startdate
 order by employmentid

Results:

| EMPLOYMENTID |                        STARTDATE |                       ENDDATE |
-----------------------------------------------------------------------------------
|            5 |  December, 03 2007 00:00:00+0000 | August, 26 2011 00:00:00+0000 |
|            5 |       May, 02 2013 00:00:00+0000 |                        (null) |
|           30 |   October, 02 2006 00:00:00+0000 |                        (null) |
|           66 | September, 24 2007 00:00:00+0000 |                        (null) |

Combine continuous datetime intervals by type

Since your ranges are continuous, the problem essentially becomes a gaps-and-islands one. If only you had a criterion to help you to distinguish between different sequences with the same t value, you could group all the rows using that criterion, then just take MIN(s), MAX(e) for every group.

One method of obtaining such a criterion is to use two ROW_NUMBER calls. Consider the following query:

SELECT
  *,
  rnk1 = ROW_NUMBER() OVER (               ORDER BY s),
  rnk2 = ROW_NUMBER() OVER (PARTITION BY t ORDER BY s)
FROM @periods
;

For your example it would return the following set:

s           e           t   rnk1  rnk2
----------  ----------  --  ----  ----
2013-01-01  2013-01-02  3   1     1
2013-01-02  2013-01-04  1   2     1
2013-01-04  2013-01-05  1   3     2
2013-01-05  2013-01-06  2   4     1
2013-01-06  2013-01-07  2   5     2
2013-01-07  2013-01-08  2   6     3
2013-01-08  2013-01-09  1   7     3

The interesting thing about the rnk1 and rnk2 rankings is that if you subtract one from the other, you will get values that, together with t, uniquely identify every distinct sequence of rows with the same t:

s           e           t   rnk1  rnk2  rnk1 - rnk2
----------  ----------  --  ----  ----  -----------
2013-01-01  2013-01-02  3   1     1     0
2013-01-02  2013-01-04  1   2     1     1
2013-01-04  2013-01-05  1   3     2     1
2013-01-05  2013-01-06  2   4     1     3
2013-01-06  2013-01-07  2   5     2     3
2013-01-07  2013-01-08  2   6     3     3
2013-01-08  2013-01-09  1   7     3     4

Knowing that, you can easily apply grouping and aggregation. This is what the final query might look like:

WITH partitioned AS (
  SELECT
    *,
    g = ROW_NUMBER() OVER (               ORDER BY s)
      - ROW_NUMBER() OVER (PARTITION BY t ORDER BY s)
  FROM @periods
)
SELECT
  s = MIN(s),
  e = MAX(e),
  t
FROM partitioned
GROUP BY
  t,
  g
;

If you like, you can play with this solution at SQL Fiddle.

Merging Date Intervals in SQL Server