Group consecutive rows of same value using time spans
The query determines each row's EndTime by using NOT EXISTS to make sure that no row for a different course or classroom falls between a course range's StartTime and EndTime, and then uses MIN and GROUP BY to find the StartTime.
The NOT EXISTS part ensures that there are no "breaks" within a StartTime to EndTime range by searching for any rows that have an EndTime inside that range but belong to a different CourseName or ClassRoom.
SELECT
    t0.ClassRoom,
    t0.CourseName,
    MIN(t0.StartTime),
    t0.EndTime
FROM (
    SELECT
        t1.ClassRoom,
        t1.CourseName,
        t1.StartTime,
        (
            SELECT MAX(t2.EndTime)
            FROM tableA t2
            WHERE t2.CourseName = t1.CourseName
              AND t2.ClassRoom = t1.ClassRoom
              AND NOT EXISTS (
                  SELECT 1 FROM tableA t3
                  WHERE t3.EndTime < t2.EndTime
                    AND t3.EndTime > t1.EndTime
                    AND (t3.CourseName <> t2.CourseName
                         OR t3.ClassRoom <> t2.ClassRoom)
              )
        ) EndTime
    FROM tableA t1
) t0
GROUP BY t0.ClassRoom, t0.CourseName, t0.EndTime
http://www.sqlfiddle.com/#!6/39d4b/9
How to group consecutive rows with same values in a result table into groups with date_from and date_until
If you enumerate the rows for each combination of f1, f2, and f3, then subtract that number of days from datum, then the value will be constant on adjacent days where the three columns are the same.
The rest is just aggregation:
select f1, f2, f3, min(datum), max(datum)
from (select t1.*,
             row_number() over (partition by f1, f2, f3 order by datum) as seqnum
      from test1 t1
     ) t1
group by f1, f2, f3, datum - seqnum * interval '1 day'
order by min(datum);
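To see why the subtraction yields a constant per run, the same idea can be traced in plain Python. This is only a sketch of the technique, not the answer's code: the function name islands and the sample rows are made up for illustration, and it assumes at most one row per day within each (f1, f2, f3) combination.

```python
from datetime import date, timedelta
from itertools import groupby

# Hypothetical sample rows: (f1, f2, f3, datum)
rows = [
    ("a", "x", 1, date(2024, 1, 1)),
    ("a", "x", 1, date(2024, 1, 2)),
    ("a", "x", 1, date(2024, 1, 3)),
    ("b", "x", 1, date(2024, 1, 4)),
    ("a", "x", 1, date(2024, 1, 5)),
    ("a", "x", 1, date(2024, 1, 6)),
]

def islands(rows):
    """Collapse runs of adjacent days with equal (f1, f2, f3) into date ranges."""
    keyed = []
    # Sort by the partition key, then by date (PARTITION BY ... ORDER BY datum).
    ordered = sorted(rows, key=lambda r: (r[:3], r[3]))
    for key, part in groupby(ordered, key=lambda r: r[:3]):
        for seqnum, row in enumerate(part, start=1):
            # datum minus seqnum days stays constant while the days are adjacent.
            anchor = row[3] - timedelta(days=seqnum)
            keyed.append((key, anchor, row[3]))
    keyed.sort(key=lambda k: (k[0], k[1]))
    result = []
    # GROUP BY f1, f2, f3, datum - seqnum days
    for (key, _anchor), grp in groupby(keyed, key=lambda k: (k[0], k[1])):
        dates = [d for _, _, d in grp]
        result.append((*key, min(dates), max(dates)))
    # ORDER BY min(datum)
    return sorted(result, key=lambda r: r[3])

for island in islands(rows):
    print(island)
```

With the sample rows, the "a" rows for 1 to 3 January share one anchor date and the "a" rows for 5 and 6 January share another, so they come out as two separate ranges with the single "b" day between them.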
Condense multiple consecutive rows using first and last row
This is a gaps-and-islands problem: you want to group together consecutive rows that have the same status and adjacent periods.
You can use window functions. The idea is to define groups with a window sum that increments whenever the status changes or there is a break between periods:
select min(status) as status, min(starttime) as starttime, max(endtime) as endtime
from (
    select t.*,
        sum(case when starttime = lag_endtime and status = lag_status then 0 else 1 end) over(order by starttime) as grp
    from (
        select t.*,
            lag(endtime) over(order by starttime) lag_endtime,
            lag(status) over(order by starttime) lag_status
        from mytable t
    ) t
) t
group by grp
Demo on DB Fiddle:
status | starttime | endtime
:----- | :---------------------- | :----------------------
State1 | 2020-11-01 13:00:29.000 | 2020-11-01 13:05:29.000
State2 | 2020-11-01 13:05:29.000 | 2020-11-01 13:11:31.000
State2 | 2020-11-01 16:19:35.000 | 2020-11-01 16:19:55.000
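The window-sum idea can also be followed step by step outside the database. The following is a minimal Python sketch rather than the answer's code: the function name condense and the input rows are assumptions (the original input table is not shown here), chosen so that the result matches the demo output above.

```python
from itertools import accumulate

# Hypothetical input rows: (status, starttime, endtime).
# ISO-formatted timestamps compare correctly as plain strings.
rows = [
    ("State1", "2020-11-01 13:00:29", "2020-11-01 13:05:29"),
    ("State2", "2020-11-01 13:05:29", "2020-11-01 13:08:00"),
    ("State2", "2020-11-01 13:08:00", "2020-11-01 13:11:31"),
    ("State2", "2020-11-01 16:19:35", "2020-11-01 16:19:55"),
]

def condense(rows):
    """Merge consecutive rows that share a status and touch end-to-start."""
    ordered = sorted(rows, key=lambda r: r[1])
    flags = []
    prev = None
    for status, start, end in ordered:
        # 1 marks the start of a new group: first row, status change, or time break.
        is_new = prev is None or prev[0] != status or prev[2] != start
        flags.append(1 if is_new else 0)
        prev = (status, start, end)
    # Running sum of the flags, like SUM(...) OVER (ORDER BY starttime).
    groups = list(accumulate(flags))
    merged = {}
    for grp, (status, start, end) in zip(groups, ordered):
        if grp not in merged:
            merged[grp] = [status, start, end]  # first row carries the min starttime
        else:
            merged[grp][2] = max(merged[grp][2], end)
    return [tuple(v) for v in merged.values()]

for row in condense(rows):
    print(row)
```

The flags for the four sample rows are 1, 1, 0, 1, so the running sum is 1, 2, 2, 3 and the two touching State2 rows collapse into one.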
How to group consecutive rows?
If I understand the posted problem correctly, then your CTE effectively determines time buckets (or intervals) for all of your alarms. Your final select clause joins the actual alarm information with your alarm intervals.
Part of the problem is that your alarm system keeps logging “Alarms Scanned” entries while an alarm remains active for a prolonged period (I assume longer than your alarm scan cycle), which effectively splits a single active alarm into several rows.
If you have SQL Server 2012 or higher, it is relatively easy to determine whether an alarm event was split: check whether the end time of an alarm equals the start time of the next alarm of the same alarm type. The LAG window function, available from SQL Server 2012, does this.
The next step is to generate an ID to group by, so that the split events can be combined. A running total (SUM ... OVER with an ORDER BY) of a flag that marks the start of each new event produces that ID.
The following example shows how this can be achieved:
;WITH AlarmTimeBuckets
AS
(
    SELECT EventStart.Ev_Comment AS StartDateTime
          ,MIN(COALESCE (EventEnd.Ev_Comment, EventStart.Ev_Comment)) AS EndDateTime
          ,EventStart.Ev_Message As Machine
    FROM A EventStart
    INNER JOIN A EventEnd
        ON EventStart.Ev_Comment < EventEnd.Ev_Comment
       AND EventStart.Ev_Custom1 = 'Alarms Scanned'
       AND EventEnd.Ev_Custom1 = 'Alarms Scanned'
       AND EventStart.Ev_Message = EventEnd.Ev_Message
    GROUP BY EventStart.Ev_Message, EventStart.Ev_Comment
),
AlarmsByTimeBucket
AS
(
    SELECT AlarmTimeBuckets.Machine
          ,AlarmTimeBuckets.StartDateTime
          ,AlarmTimeBuckets.EndDateTime
          ,Alarm.Ev_Custom1 AS Alarm
          ,(
               CASE
                   WHEN LAG(AlarmTimeBuckets.EndDateTime, 1, NULL) OVER (PARTITION BY Alarm.Ev_Custom1,Alarm.Ev_Message ORDER BY AlarmTimeBuckets.StartDateTime) = AlarmTimeBuckets.StartDateTime THEN 0
                   ELSE 1
               END
           ) AS IsNewEvent
    FROM A Alarm
    INNER JOIN AlarmTimeBuckets
        ON Alarm.Ev_Message = AlarmTimeBuckets.Machine
       AND Alarm.Ev_Comment = AlarmTimeBuckets.StartDateTime
    WHERE (Alarm.Ev_Custom1 <> 'Alarms Scanned')
)
,
AlarmsByGroupingID
AS
(
    SELECT Machine
          ,StartDateTime
          ,EndDateTime
          ,Alarm
          ,SUM(IsNewEvent) OVER (ORDER BY Machine, Alarm, StartDateTime) AS GroupingID
    FROM AlarmsByTimeBucket
)
SELECT MAX(Machine) AS Machine
      ,MIN(StartDateTime) AS StartDateTime
      ,MAX(EndDateTime) AS EndDateTime
      ,MAX(Alarm) AS Alarm
FROM AlarmsByGroupingID
GROUP BY GroupingID
ORDER BY StartDateTime
Group rows in data frame based on time difference between consecutive rows
Here is another possibility, which groups rows where the time difference between consecutive rows is no more than 4 days.
# create date variable
df$date <- with(df, as.Date(paste(YEAR, MONTH, DAY, sep = "-")))
# calculate successive differences between dates
# and identify gaps larger than 4
df$gap <- c(0, diff(df$date) > 4)
# cumulative sum of 'gap' variable
df$group <- cumsum(df$gap) + 1
df
# YEAR MONTH DAY HOUR LON LAT date gap group
# 1 1860 10 3 13 -19.5 3 1860-10-03 0 1
# 2 1860 10 3 17 -19.5 4 1860-10-03 0 1
# 3 1860 10 3 21 -19.5 5 1860-10-03 0 1
# 4 1860 10 5 5 -20.5 6 1860-10-05 0 1
# 5 1860 10 5 13 -21.5 7 1860-10-05 0 1
# 6 1860 10 5 17 -21.5 8 1860-10-05 0 1
# 7 1860 10 6 1 -22.5 9 1860-10-06 0 1
# 8 1860 10 6 5 -22.5 10 1860-10-06 0 1
# 9 1860 12 5 9 -22.5 -7 1860-12-05 1 2
# 10 1860 12 5 18 -23.5 -8 1860-12-05 0 2
# 11 1860 12 5 22 -23.5 -9 1860-12-05 0 2
# 12 1860 12 6 6 -24.5 -10 1860-12-06 0 2
# 13 1860 12 6 10 -24.5 -11 1860-12-06 0 2
# 14 1860 12 6 18 -24.5 -12 1860-12-06 0 2
Disclaimer: the diff & cumsum part is inspired by this Q&A: "How to partition a vector into groups of regular, consecutive sequences?"
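For readers who do not use R, the diff and cumsum steps can be approximated in Python. This is a sketch only: the function name group_by_gap and its max_gap_days parameter are assumptions for illustration, and the dates are the ones shown in the date column of the output above.

```python
from datetime import date
from itertools import accumulate

# Dates taken from the 'date' column of the output above.
dates = (
    [date(1860, 10, 3)] * 3
    + [date(1860, 10, 5)] * 3
    + [date(1860, 10, 6)] * 2
    + [date(1860, 12, 5)] * 3
    + [date(1860, 12, 6)] * 3
)

def group_by_gap(dates, max_gap_days=4):
    """Number the rows so that a gap above max_gap_days starts a new group."""
    if not dates:
        return []
    # Like c(0, diff(date) > 4): 1 where the step from the previous row is too large.
    gaps = [0] + [int((b - a).days > max_gap_days) for a, b in zip(dates, dates[1:])]
    # Like cumsum(gap) + 1: running total of the gap flags, starting at group 1.
    return [g + 1 for g in accumulate(gaps)]

print(group_by_gap(dates))
```

The only step larger than 4 days is the jump from 1860-10-06 to 1860-12-05, so the first eight rows fall into group 1 and the remaining six into group 2, as in the R output.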
Grouping consecutive date periods with criteria
Solution: flag each row with a binary marker for the "consecutive" date criterion (0 when the gap to the previous period is less than 7 days, 1 otherwise), then use that marker in a SUM OVER with ROWS UNBOUNDED PRECEDING to number the groups.
with cte as (
    select
        *,
        COALESCE(DATEDIFF(dd, LAG(EndDate, 1, NULL) OVER (PARTITION BY [Type] ORDER BY StartDate), StartDate), 0) AS GapDays
    from
        #data
),
cte2 as (
    select
        *,
        case when GapDays < 7 then 0 else 1 end as group1
    from
        cte
),
cte3 as (
    select
        *,
        sum(group1) over (partition by [type] order by startDate, endDate rows unbounded preceding) as group2
    from
        cte2
)
select
    [TYPE],
    MIN(StartDate) AS StartDate,
    MAX(EndDate) AS EndDate,
    SUM(Dollars) AS Dollars,
    -- a gap of 7 or more days starts a new group, so it is not counted inside the group
    SUM(CASE WHEN GapDays >= 7 THEN 0 ELSE GapDays END) AS GapDays
from
    cte3
group by
    [Type], group2
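The same grouping rule can be written as a single pass in Python, which may help when checking the query against sample data. This is a procedural sketch, not a translation of the CTEs: the function name merge_periods, the threshold_days parameter, and the sample rows are all assumptions made for illustration.

```python
from datetime import date
from itertools import groupby

# Hypothetical sample rows: (type, start_date, end_date, dollars)
rows = [
    ("A", date(2024, 1, 1), date(2024, 1, 5), 100),
    ("A", date(2024, 1, 8), date(2024, 1, 10), 50),
    ("A", date(2024, 2, 1), date(2024, 2, 3), 25),
    ("B", date(2024, 1, 2), date(2024, 1, 4), 10),
]

def merge_periods(rows, threshold_days=7):
    """Merge periods of one type whose gap to the previous row is under the threshold."""
    result = []
    ordered = sorted(rows, key=lambda r: (r[0], r[1], r[2]))
    for typ, part in groupby(ordered, key=lambda r: r[0]):
        current = None   # [type, start, end, dollars, gap_days] of the open group
        prev_end = None  # end date of the previous row, like LAG(EndDate)
        for _, start, end, dollars in part:
            gap = 0 if prev_end is None else (start - prev_end).days
            if current is None or gap >= threshold_days:
                # Close the open group and start a new one; the large gap is not counted.
                if current is not None:
                    result.append(tuple(current))
                current = [typ, start, end, dollars, 0]
            else:
                current[2] = max(current[2], end)
                current[3] += dollars
                current[4] += gap
            prev_end = end
        result.append(tuple(current))
    return result

for period in merge_periods(rows):
    print(period)
```

In the sample, the first two "A" periods are 3 days apart and are merged with 3 gap days, while the third starts 22 days later and forms its own group.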