Group Consecutive Rows of Same Value Using Time Spans

Group consecutive rows of same value using time spans

The query determines each row's EndTime with a correlated subquery: NOT EXISTS makes sure that no class of a different course or room is scheduled inside the merged range. It then uses MIN and GROUP BY to find the StartTime of each range.

The NOT EXISTS part ensures that there are no "breaks" inside a merged range. It searches for rows whose EndTime falls between the current row's EndTime and the candidate EndTime but that belong to a different CourseName or ClassRoom.

SELECT    
    t0.ClassRoom,
    t0.CourseName,
    MIN(t0.StartTime),
    t0.EndTime
FROM (
    SELECT 
    t1.ClassRoom,
    t1.CourseName,
    t1.StartTime,
    (
        SELECT MAX(t2.EndTime)
        FROM tableA t2
        WHERE t2.CourseName = t1.CourseName
        AND t2.ClassRoom = t1.ClassRoom
        AND NOT EXISTS (SELECT 1 FROM tableA t3
            WHERE t3.EndTime < t2.EndTime 
            AND t3.EndTime > t1.EndTime
            AND (t3.CourseName <> t2.CourseName 
            OR t3.ClassRoom <> t2.ClassRoom)
        )
    ) EndTime
    FROM tableA t1
) t0 GROUP BY t0.ClassRoom, t0.CourseName, t0.EndTime

http://www.sqlfiddle.com/#!6/39d4b/9

How to group consecutive rows with same values in a result table into groups with date_from and date_until

If you enumerate the rows for each combination of f1, f2, and f3 in datum order and subtract that row number, as a number of days, from datum, the result is constant across adjacent days on which the three columns are the same.

The rest is just aggregation:

select f1, f2, f3, min(datum) as date_from, max(datum) as date_until
from (select t1.*,
             row_number() over (partition by f1, f2, f3 order by datum) as seqnum
      from test1 t1
     ) t1
group by f1, f2, f3, datum - seqnum * interval '1 day'
order by min(datum);

Here is a db<>fiddle.
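
To see why the subtraction works, here is a minimal Python sketch of the same idea. The sample rows and the function name islands are made up for illustration, and it assumes at most one row per day within each (f1, f2, f3) combination.

```python
from datetime import date, timedelta
from itertools import groupby

# Hypothetical sample rows (f1, f2, f3, datum); not taken from the original question.
rows = [
    ("a", "x", 1, date(2021, 1, 1)),
    ("a", "x", 1, date(2021, 1, 2)),
    ("a", "x", 1, date(2021, 1, 3)),
    ("b", "x", 1, date(2021, 1, 4)),
    ("a", "x", 1, date(2021, 1, 5)),
    ("a", "x", 1, date(2021, 1, 6)),
]

def islands(rows):
    """Return (f1, f2, f3, date_from, date_until) per run of adjacent days."""
    result = []
    # Sorting by (f1, f2, f3, datum) plays the role of PARTITION BY ... ORDER BY datum.
    ordered = sorted(rows, key=lambda r: (r[:3], r[3]))
    for key, part in groupby(ordered, key=lambda r: r[:3]):
        # datum minus seqnum days stays constant while the days are adjacent.
        anchored = [
            (r[3] - timedelta(days=seqnum), r[3])
            for seqnum, r in enumerate(part, start=1)
        ]
        for _, run in groupby(anchored, key=lambda a: a[0]):
            days = [d for _, d in run]
            result.append((*key, min(days), max(days)))
    # Equivalent of ORDER BY min(datum).
    return sorted(result, key=lambda r: r[3])

for row in islands(rows):
    print(row)
```

With these rows, the combination ("a", "x", 1) yields two ranges because 2021-01-04 is missing from it, which shifts the anchor date by one day.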

Condense multiple consecutive rows using first and last row

This is a gaps and islands problem, where you want to group together consecutive rows that have the same status and adjacent periods.

You can use window functions; the idea is to define groups with a window sum that increments whenever the status changes or there is a break between periods:

select min(status) as status, min(starttime) as starttime, max(endtime) as endtime
from (
    select t.*,
        sum(case when starttime = lag_endtime and status = lag_status then 0 else 1 end) over(order by starttime) as grp
    from (
        select t.*,
            lag(endtime) over(order by starttime) lag_endtime,
            lag(status)  over(order by starttime) lag_status
        from mytable t
    ) t
) t
group by grp

Demo on DB Fiddle:


status | starttime               | endtime                
:----- | :---------------------- | :----------------------
State1 | 2020-11-01 13:00:29.000 | 2020-11-01 13:05:29.000
State2 | 2020-11-01 13:05:29.000 | 2020-11-01 13:11:31.000
State2 | 2020-11-01 16:19:35.000 | 2020-11-01 16:19:55.000
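
The same lag-and-running-sum logic can be sketched outside SQL. The Python below is only an illustration: the input rows are invented (the original input table is not shown here) and the helper names ts and condense are hypothetical.

```python
from datetime import datetime

def ts(text):
    """Parse a timestamp such as '2020-11-01 13:00:29'."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")

# Hypothetical input rows (status, starttime, endtime), invented for illustration.
rows = [
    ("State1", ts("2020-11-01 13:00:29"), ts("2020-11-01 13:05:29")),
    ("State2", ts("2020-11-01 13:05:29"), ts("2020-11-01 13:08:00")),
    ("State2", ts("2020-11-01 13:08:00"), ts("2020-11-01 13:11:31")),
    ("State2", ts("2020-11-01 16:19:35"), ts("2020-11-01 16:19:55")),
]

def condense(rows):
    """Merge consecutive rows that share a status and have adjacent periods."""
    groups = {}
    grp = 0      # running sum, like SUM(...) OVER (ORDER BY starttime)
    lag = None   # previous row, like LAG(...) OVER (ORDER BY starttime)
    for row in sorted(rows, key=lambda r: r[1]):
        status, start, _ = row
        # 0 when the row continues the previous one, 1 when it starts a new group.
        is_new = not (lag is not None and lag[0] == status and lag[2] == start)
        grp += int(is_new)
        groups.setdefault(grp, []).append(row)
        lag = row
    # One output row per group: min(status), min(starttime), max(endtime).
    return [
        (min(r[0] for r in g), min(r[1] for r in g), max(r[2] for r in g))
        for g in groups.values()
    ]

for status, start, end in condense(rows):
    print(status, start, end)
```

The two adjacent State2 rows collapse into one, while the State2 row that starts at 16:19:35 stays separate because its start does not match the previous end.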

how to group consecutive rows?

If I understand the posted problem correctly, your CTE determines the time buckets (intervals) for all of your alarms, and your final SELECT joins the actual alarm information with those intervals.
Part of the problem is that the alarm system keeps logging “Alarms Scanned” entries while an alarm stays active for a prolonged period (I assume longer than the alarm scan cycle), which splits a single active alarm into several rows.
If you have SQL Server 2012 or higher, it is relatively easy to detect a split alarm event: check whether the end time of an alarm equals the start time of the next alarm of the same alarm type. The LAG window function, available since SQL Server 2012, lets you make that comparison.

The next step is to generate an ID that you can group your alarm by so that you can combine your split events. This is achieved via the SUM OVER clause.
The following example shows how this can be achieved:

;WITH AlarmTimeBuckets
AS 
(
    SELECT       EventStart.Ev_Comment AS StartDateTime 
                ,MIN(COALESCE (EventEnd.Ev_Comment, EventStart.Ev_Comment)) AS EndDateTime
                ,EventStart.Ev_Message As Machine
    FROM         A EventStart 
    INNER JOIN   A EventEnd ON EventStart.Ev_Comment < EventEnd.Ev_Comment AND EventStart.Ev_Custom1 = 'Alarms Scanned' AND EventEnd.Ev_Custom1 = 'Alarms Scanned' AND EventStart.Ev_Message = EventEnd.Ev_Message
    GROUP BY     EventStart.Ev_Message, EventStart.Ev_Comment
),
AlarmsByTimeBucket
AS
(
    SELECT      AlarmTimeBuckets.Machine
               ,AlarmTimeBuckets.StartDateTime
               ,AlarmTimeBuckets.EndDateTime 
               ,Alarm.Ev_Custom1 AS Alarm
               ,(
                 CASE
                    WHEN LAG(AlarmTimeBuckets.EndDateTime, 1, NULL) OVER (PARTITION BY Alarm.Ev_Custom1,Alarm.Ev_Message ORDER BY AlarmTimeBuckets.StartDateTime) = AlarmTimeBuckets.StartDateTime THEN 0
                    ELSE 1
                 END
                ) AS IsNewEvent
    FROM       A Alarm 
    INNER JOIN AlarmTimeBuckets  ON Alarm.Ev_Message = AlarmTimeBuckets.Machine AND  Alarm.Ev_Comment = AlarmTimeBuckets.StartDateTime
    WHERE     (Alarm.Ev_Custom1 <> 'Alarms Scanned')
)
,
AlarmsByGroupingID
AS
(
    SELECT   Machine
            ,StartDateTime
            ,EndDateTime
            ,Alarm
            ,SUM(IsNewEvent) OVER (ORDER BY Machine, Alarm, StartDateTime) AS GroupingID
    FROM    AlarmsByTimeBucket
)
SELECT       MAX(Machine) AS Machine
            ,MIN(StartDateTime) AS StartDateTime
            ,MAX(EndDateTime) AS EndDateTime
            ,MAX(Alarm) AS Alarm
FROM        AlarmsByGroupingID
GROUP BY    GroupingID
ORDER BY    StartDateTime

Group rows in data frame based on time difference between consecutive rows

Here is another possibility, which groups rows as long as the time difference between consecutive rows is no more than 4 days; a gap larger than 4 days starts a new group.

# create date variable
df$date <- with(df, as.Date(paste(YEAR, MONTH, DAY, sep = "-")))

# calculate successive differences between dates
# and flag gaps larger than 4 days
df$gap <- c(0, diff(df$date) > 4)

# cumulative sum of 'gap' variable
df$group <- cumsum(df$gap) + 1

df    
#    YEAR MONTH DAY HOUR   LON LAT       date gap group
# 1  1860    10   3   13 -19.5   3 1860-10-03   0     1
# 2  1860    10   3   17 -19.5   4 1860-10-03   0     1
# 3  1860    10   3   21 -19.5   5 1860-10-03   0     1
# 4  1860    10   5    5 -20.5   6 1860-10-05   0     1
# 5  1860    10   5   13 -21.5   7 1860-10-05   0     1
# 6  1860    10   5   17 -21.5   8 1860-10-05   0     1
# 7  1860    10   6    1 -22.5   9 1860-10-06   0     1
# 8  1860    10   6    5 -22.5  10 1860-10-06   0     1
# 9  1860    12   5    9 -22.5  -7 1860-12-05   1     2
# 10 1860    12   5   18 -23.5  -8 1860-12-05   0     2
# 11 1860    12   5   22 -23.5  -9 1860-12-05   0     2
# 12 1860    12   6    6 -24.5 -10 1860-12-06   0     2
# 13 1860    12   6   10 -24.5 -11 1860-12-06   0     2
# 14 1860    12   6   18 -24.5 -12 1860-12-06   0     2

Disclaimer: the diff & cumsum part is inspired by this Q&A: How to partition a vector into groups of regular, consecutive sequences?.
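
For readers who do not use R, the diff and cumsum steps can be sketched in Python as follows. The function name gap_groups is hypothetical; the dates are the distinct dates from the output above.

```python
from datetime import date
from itertools import accumulate

# The distinct dates that appear in the data frame printed above.
dates = [
    date(1860, 10, 3),
    date(1860, 10, 5),
    date(1860, 10, 6),
    date(1860, 12, 5),
    date(1860, 12, 6),
]

def gap_groups(dates, max_gap_days=4):
    """Assign a group number that increases after every gap > max_gap_days."""
    # Like c(0, diff(date) > 4): the first row never starts a new group.
    gaps = [0] + [
        int((later - earlier).days > max_gap_days)
        for earlier, later in zip(dates, dates[1:])
    ]
    # Like cumsum(gap) + 1.
    return [total + 1 for total in accumulate(gaps)]

print(gap_groups(dates))
```

The sketch assumes the dates are already sorted, as they are in the data frame.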

Grouping consecutive date periods with criteria

Solution: flag each row with 0 or 1 depending on whether it is "consecutive" with the previous row of the same type (here, a gap of fewer than 7 days), then turn the flags into group numbers with a SUM ... OVER using ROWS UNBOUNDED PRECEDING.

with cte as (
select
  *,
  COALESCE(DATEDIFF(dd, LAG(EndDate, 1, NULL) OVER (PARTITION BY [Type] ORDER BY StartDate), StartDate),0) AS GapDays
from
  #data
),
cte2 as (
select
  *,
  case when GapDays < 7 then 0 else 1 end as group1
from
  cte
),
cte3 as (
select
  *,
  sum(group1) over (partition by [type] order by startDate, endDate rows unbounded preceding) as group2
from
  cte2
)
select
  [TYPE],
  MIN(StartDate) AS StartDate,
  MAX(EndDate) AS EndDate,
  SUM(Dollars) AS Dollars,
  SUM(CASE WHEN GapDays >= 7 THEN 0 ELSE GapDays END) AS GapDays
from
  cte3
group by
  [Type], group2
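
As a rough cross-check of the grouping logic, here is a small Python sketch under stated assumptions: the sample rows and the function name merge_periods are invented, and the gap is measured in whole days between the previous EndDate and the current StartDate, as DATEDIFF(dd, ...) does.

```python
from datetime import date
from itertools import groupby

# Hypothetical rows (Type, StartDate, EndDate, Dollars), invented for illustration.
rows = [
    ("A", date(2020, 1, 1), date(2020, 1, 10), 100),
    ("A", date(2020, 1, 12), date(2020, 1, 20), 50),
    ("A", date(2020, 2, 15), date(2020, 2, 20), 25),
    ("B", date(2020, 1, 5), date(2020, 1, 6), 10),
]

def merge_periods(rows, threshold_days=7):
    """Merge periods of the same Type when the gap is below threshold_days."""
    merged = []
    ordered = sorted(rows, key=lambda r: (r[0], r[1], r[2]))
    # groupby on Type plays the role of PARTITION BY [Type].
    for type_, part in groupby(ordered, key=lambda r: r[0]):
        prev_end = None  # LAG(EndDate) within the partition
        for _, start, end, dollars in part:
            gap_days = 0 if prev_end is None else (start - prev_end).days
            if prev_end is None or gap_days >= threshold_days:
                # First row of the partition or a large gap: start a new group.
                merged.append([type_, start, end, dollars])
            else:
                # Small gap: extend the current group and add the dollars.
                current = merged[-1]
                current[2] = max(current[2], end)
                current[3] += dollars
            prev_end = end
    return [tuple(m) for m in merged]

for row in merge_periods(rows):
    print(row)
```

The sketch leaves out the summed GapDays column to stay short; it only shows how the 0/1 flag and the running total decide which rows end up in the same group.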
