Detect consecutive dates in SQL and start counter
One way to tackle such Gaps-And-Islands problem is to calculate a rank that's based on the difference with the previous date.
The method to calculate the difference in days between dates however depends on your RDBMS.
This example uses DATEDIFF (MS Sql Server)
SELECT
Dates,
ROW_NUMBER() OVER (PARTITION BY Rnk ORDER BY Dates) AS Counter
FROM
(
SELECT
Dates,
SUM(CASE
WHEN DATEDIFF(day, prevDate, Dates) = 1 THEN 0
ELSE 1 END) OVER (ORDER BY Dates) AS Rnk
FROM
(
SELECT
Dates,
LAG(Dates) OVER (ORDER BY Dates) AS prevDate
FROM your_table
) q1
) q2
ORDER BY Dates;
SQL query to find gaps within a column of dates
This is a type of gaps-and-islands problem. In this case, subtracting a sequential number from each day is probably the simplest solution for identifying the "islands":
select user, status, count(*) as num_days, min(date), max(date)
from (select t.*,
row_number() over (partition by user, status order by date) as seqnum
from t
) t
group by user, status, date - seqnum * interval '1 day'
SQL - Group rows by contiguous date
For RDBMS supported window functions (example on MS SQL database):
with Test(value, dt) as(
select 100, cast('2000-01-01' as date) union all
select 110, cast('2002-05-01' as date) union all
select 100, cast('2003-10-01' as date) union all
select 100, cast('2004-12-01' as date)
)
select max(value) value, min(dt) startDate, max(end_dt) endDate
from (
select a.*, sum(brk) over(order by dt) grp
from (
select t.*,
case when value!=lag(value) over(order by dt) then 1 else 0 end brk,
DATEADD(DAY,-1,lead(dt,1,cast('2099-01-02' as date)) over(order by dt)) end_dt
from Test t
) a
) b
group by grp
order by startDate
How to form groups of consecutive dates allowing for a given maximum gap?
Counting the gaps (greater than given tolerance) in a second window function forms the group numbers you are after:
SELECT person_id, contact_day
, count(*) FILTER (WHERE gap > 3) OVER (PARTITION BY person_id ORDER BY contact_day) AS dash_group
FROM (
SELECT person_id, contact_day
, contact_day - lag(contact_day) OVER (PARTITION BY person_id ORDER BY contact_day) AS gap
FROM mydata
) sub
ORDER BY person_id, contact_day; -- optional
db<>fiddle here
About the aggregate FILTER
clause:
- Aggregate columns with additional (distinct) filters
It's short and intuitive, and typically fastest. See:
- For absolute performance, is SUM faster or COUNT?
It's the classic topic of "gaps and islands". Once you know to look for the tag gaps-and-islands, you'll find plenty of related or nigh identical questions and answers like:
- Select longest continuous sequence
- How to group timestamps into islands (based on arbitrary gap)?
- How to label groups in postgresql when group belonging depends on the preceding line?
Etc.
I tagged accordingly now.
Finding a consecutive date ranges in ms sql server
Try the following:
WITH CTE AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY PRODUCT_TIER5_DESC ORDER BY BEGIN_EFFECTIVE_DT) -
ROW_NUMBER() OVER (PARTITION BY PRODUCT_TIER5_DESC,PRODUCT_TIER4_DESC ORDER BY BEGIN_EFFECTIVE_DT) GRP
FROM test_0907
)
SELECT PRODUCT_TIER4_DESC, PRODUCT_TIER5_DESC,
MIN(BEGIN_EFFECTIVE_DT) BEGIN_EFFECTIVE_DT, MAX(END_EFFECTIVE_DT) END_EFFECTIVE_DT
FROM CTE
GROUP BY PRODUCT_TIER4_DESC, PRODUCT_TIER5_DESC,GRP
See a demo.
Group islands of contiguous dates, including missing weekends
I find that is easier to use lag()
and a window sum to implement the logic you want:
select employee, min(actionDate) actionStart, max(actionDate) actionEnd, count(*) actionLength
from (
select
a.*, sum(
case when actionDate = dateadd(day, 1, lagActionDate)
or (actionDate = dateadd(day, 3, lagActionDate) and datename(weekday, actionDate) = 'Monday')
then 0 else 1 end
) over(partition by employee order by actionDate) grp
from (
select
a.*,
lag(actionDate) over(partition by employee order by actionDate) lagActionDate
from actions a
) a
) a
group by employee, grp
Demo on DB Fiddle:
employee | actionStart | actionEnd | actionLength
:------- | :---------- | :--------- | -----------:
AA | 2019-01-03 | 2019-01-08 | 4
BB | 2019-08-01 | 2019-08-06 | 6
CC | 2019-09-09 | 2019-09-18 | 8
find start and stop date for contiguous dates in multiple rows
The old answer had a weakness: every row is checked only with the previous to verify if the period are overlapping, if an earlier row have a period that last more the logic will not consider it. For example:
Code | DateStart | DateFrom | Overlap
-----+---------------------+---------------------+---------
1006 | 2014-06-18 07:00:00 | 2014-06-18 19:00:00 | 0
1006 | 2014-06-18 08:10:00 | 2014-06-18 10:00:00 | 1
1006 | 2014-06-18 16:00:00 | 2014-06-18 20:30:00 | 0 <- don't overlap with
previous but overlap
with the first
To improve that PrevStop
need to become LastStop
and have the value of the max of the previous DateFrom
for the Code
With N AS (
SELECT Code, DateFrom, DateTo
, LastStop = MAX(DateTo)
OVER (PARTITION BY Code ORDER BY DateFrom, DateTo
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
FROM Table1
), B AS (
SELECT Code, DateFrom, DateTo
, Block = SUM(CASE WHEN LastStop Is Null Then 1
WHEN LastStop < DateFrom Then 1
ELSE 0
END)
OVER (PARTITION BY Code ORDER BY DateFrom, LastStop)
FROM N
)
SELECT Code
, MIN(DateFrom) DateFrom
, MAX(DateTo) DateTo
FROM B
GROUP BY Code, Block
ORDER BY Code, Block
SQLFiddle Demo
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
is needed to remove the current row from theMAX
.
Old Answer
This query works only every the period is not completely inside the precedent.
The idea is to check for every row if it is linked with the next/previous one.
If rows are linked they form a block and they will be grouped together to get the first DateFrom
and the last DateTo
With N AS (
SELECT Code, DateFrom, DateTo
, PrevStop = LAG(DateTo, 1, NULL)
OVER (PARTITION BY Code ORDER BY DateFrom)
FROM Table1
), B AS (
SELECT Code, DateFrom, DateTo
, Block = SUM(CASE WHEN PrevStop Is Null Then 1
WHEN PrevStop < DateFrom Then 1
ELSE 0
END)
OVER (PARTITION BY Code ORDER BY PrevStop)
FROM N
)
SELECT Code
, MIN(DateFrom) DateFrom
, MAX(DateTo) DateTo
FROM B
GROUP BY Code, Block
ORDER BY Code, Block
SQLFiddle demo with some data added to check with more block on the same code/day
The query search for the block starter checking every row if they are the first for the code (PrevStop IS NULL
) or if they are outside the previous one (PrevStop < DateFrom
).
The windowed SUM
retrieve only the previous row by the ORDER
to create costant value for block of linked data, for example with the test data we will get
Code | DateStart | DateFrom | Starter | Block
-----+---------------------+---------------------+---------+------
1006 | 2014-06-18 07:00:00 | 2014-06-18 09:00:00 | 1 | 1
1006 | 2014-06-18 08:10:00 | 2014-06-18 06:00:00 | 0 | 1
1006 | 2014-06-18 08:00:00 | 2014-06-18 08:30:00 | 0 | 1
1006 | 2014-06-18 07:00:00 | 2014-06-18 07:30:00 | 1 | 2
1006 | 2014-06-18 08:00:00 | 2014-06-18 08:30:00 | 1 | 3
1006 | 2014-06-18 08:10:00 | 2014-06-18 09:00:00 | 0 | 3
3006 | 2014-06-18 07:00:00 | 2014-06-18 08:00:00 | 1 | 1
3006 | 2014-06-18 09:00:00 | 2014-06-18 10:00:00 | 1 | 2
grouping by Code
and Block
get the result
Related Topics
Listagg Query "Ora-00937: Not a Single-Group Group Function"
SQL Case: Does the Order of the When Statements Matter
How to Parse Xml Tags in Bigquery Standard SQL
Deleting Duplicate Records Using a Temporary Table
How to Select Only the Latest Entry in a Table
Sql: Difference Between Two Dates
How to Create Sequence Using Starting Value from Query
As400 SQL Query with Parameter
Update Oracle Table with Values from CSV File
In SQL, What Is the Letter After a Table Name in a Select Statement
Listagg Alternative in Oracle 10G
Sql:Find Rows and Sort According to Number of Matching Columns
Query a Table and a Column Name Stored in a Table
Multiple Column Values in a Single Row
SQL Transform Crosstab Pivot Data
Help with SQL Server Trigger to Truncate Bad Data Before Insert