How to Combine Overlapping Time Ranges (Time Ranges Union)

Given a function that returns truthy if two ranges overlap:

def ranges_overlap?(a, b)
  a.include?(b.begin) || b.include?(a.begin)
end

(this function courtesy of sepp2k and steenslag)

and a function that merges two overlapping ranges:

def merge_ranges(a, b)
  [a.begin, b.begin].min..[a.end, b.end].max
end

then this function, given an array of ranges, returns a new array with any overlapping ranges merged:

def merge_overlapping_ranges(overlapping_ranges)
  overlapping_ranges.sort_by(&:begin).inject([]) do |ranges, range|
    if !ranges.empty? && ranges_overlap?(ranges.last, range)
      ranges[0...-1] + [merge_ranges(ranges.last, range)]
    else
      ranges + [range]
    end
  end
end
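
For comparison, here is a minimal Python sketch of the same sort-then-fold idea, with a small usage example. It assumes each range is a (start, end) tuple of datetime values with inclusive endpoints; the function names mirror the Ruby ones, and the sample data is made up for illustration.

```python
from datetime import datetime
from functools import reduce


def ranges_overlap(a, b):
    # Closed ranges overlap when each one starts no later than the other ends.
    return a[0] <= b[1] and b[0] <= a[1]


def merge_ranges(a, b):
    return (min(a[0], b[0]), max(a[1], b[1]))


def merge_overlapping_ranges(ranges):
    def step(merged, current):
        # Fold step: either widen the last merged range or append a new one.
        if merged and ranges_overlap(merged[-1], current):
            return merged[:-1] + [merge_ranges(merged[-1], current)]
        return merged + [current]

    return reduce(step, sorted(ranges, key=lambda r: r[0]), [])


# Made-up sample data: the first and third ranges overlap.
ranges = [
    (datetime(2018, 1, 1, 8, 0), datetime(2018, 1, 1, 8, 20)),
    (datetime(2018, 1, 1, 8, 30), datetime(2018, 1, 1, 9, 0)),
    (datetime(2018, 1, 1, 8, 10), datetime(2018, 1, 1, 8, 25)),
]
result = merge_overlapping_ranges(ranges)
for start, end in result:
    print(start.isoformat(), end.isoformat())
```

Because the input is sorted by start first, only the last merged range ever needs to be compared with the next one.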

union/merge overlapping time-ranges

If you sort by group and start (in that order) and drop the indx column at the end, this solution posted by David Arenburg works perfectly: How to flatten/merge overlapping time periods in R

library(dplyr)

df1 %>%
  group_by(group) %>%
  arrange(group, start) %>%
  mutate(indx = c(0, cumsum(as.numeric(lead(start)) >
                            cummax(as.numeric(end)))[-n()])) %>%
  group_by(group, indx) %>%
  # max(end) rather than last(end): the last row of a group may be nested inside an earlier one
  summarise(start = first(start), end = max(end)) %>%
  select(-indx)

  group start               end
  <chr> <dttm>              <dttm>
1 A     2018-01-01 08:00:00 2018-01-01 08:20:00
2 A     2018-01-01 08:30:00 2018-01-01 09:00:00
3 A     2018-01-01 09:15:00 2018-01-01 09:30:00
4 B     2018-01-01 14:00:00 2018-01-01 15:30:00

How to flatten / merge overlapping time periods

Here's a possible solution. The basic idea is to compare each row's next start date (using lead) with the maximum end date "until now" (using the cummax function) and to create an index that separates the data into groups of overlapping periods.

data %>%
  arrange(ID, start) %>% # as suggested by @Jonno in case the data is unsorted
  group_by(ID) %>%
  mutate(indx = c(0, cumsum(as.numeric(lead(start)) >
                            cummax(as.numeric(end)))[-n()])) %>%
  group_by(ID, indx) %>%
  # max(end) rather than last(end): the last row of a group may be nested inside an earlier one
  summarise(start = first(start), end = max(end))

# Source: local data frame [3 x 4]
# Groups: ID
#
#   ID indx      start        end
# 1  A    0 2013-01-01 2013-01-06
# 2  A    1 2013-01-07 2013-01-11
# 3  A    2 2013-01-12 2013-01-15
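
To make the index trick easier to follow outside of dplyr, here is a rough Python sketch of the same idea for a single ID, using only the standard library. The helper names group_index and flatten and the sample rows are assumptions made for illustration; they are not part of the original answer.

```python
from datetime import date
from itertools import accumulate


def group_index(starts, ends):
    """Group index per row; rows must already be sorted by start."""
    # Running maximum of the end dates, the counterpart of cummax(end).
    running_max_end = list(accumulate(ends, max))
    # Row i opens a new group when its start lies beyond every earlier end.
    new_group = [0] + [
        int(starts[i] > running_max_end[i - 1]) for i in range(1, len(starts))
    ]
    # Cumulative sum of the flags, the counterpart of cumsum(...).
    return list(accumulate(new_group))


def flatten(rows):
    rows = sorted(rows)
    starts = [s for s, _ in rows]
    ends = [e for _, e in rows]
    merged = {}
    for idx, (s, e) in zip(group_index(starts, ends), rows):
        first, last = merged.get(idx, (s, e))
        merged[idx] = (min(first, s), max(last, e))
    return [merged[k] for k in sorted(merged)]


# Made-up periods; the fourth one is nested inside the third.
rows = [
    (date(2013, 1, 1), date(2013, 1, 5)),
    (date(2013, 1, 3), date(2013, 1, 6)),
    (date(2013, 1, 7), date(2013, 1, 11)),
    (date(2013, 1, 9), date(2013, 1, 10)),
    (date(2013, 1, 12), date(2013, 1, 15)),
]
flattened = flatten(rows)
```

The nested fourth period shows why each group takes the maximum end date and not the end date of its last row.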

Combine Time Ranges

You could have a look at this project which supports TimeRanges and intersection methods:

http://www.codeproject.com/Articles/168662/Time-Period-Library-for-NET

Compute the union time ranges from two arrays of time ranges (with moment-range)

I think I would go with concat and reduce:

const firstBusyPeriods = [
  { start: "2017-04-05T10:00:00Z", end: "2017-04-05T12:00:00Z" },
  { start: "2017-04-05T14:00:00Z", end: "2017-04-05T15:00:00Z" }
];
const secondBusyPeriods = [
  { start: "2017-04-05T08:00:00Z", end: "2017-04-05T11:00:00Z" },
  { start: "2017-04-05T16:00:00Z", end: "2017-04-05T17:00:00Z" }
];

// The timestamps are ISO 8601 UTC strings in one format, so plain string
// comparison orders them chronologically. Bounds are inclusive so that ranges
// sharing a start or end (or merely touching) still count as overlapping.
const isBetween = function(range, date) {
  return range.start <= date && range.end >= date;
};

const rangesOverlap = function(rangeOne, rangeTwo) {
  return isBetween(rangeOne, rangeTwo.start) || isBetween(rangeOne, rangeTwo.end);
};

const mergeRanges = function(rangeOne, rangeTwo) {
  let newRange = {};
  newRange.start = isBetween(rangeOne, rangeTwo.start) ? rangeOne.start : rangeTwo.start;
  newRange.end = isBetween(rangeOne, rangeTwo.end) ? rangeOne.end : rangeTwo.end;
  return newRange;
};

const merge = function(rangeCollectionOne, rangeCollectionTwo) {
  // Subtracting two strings gives NaN, so compare them directly when sorting.
  let concatenatedCollections = rangeCollectionOne
    .concat(rangeCollectionTwo)
    .sort((a, b) => (a.start < b.start ? -1 : a.start > b.start ? 1 : 0));
  let newCollection = concatenatedCollections.reduce((newCollection, range) => {
    let index = newCollection.findIndex(rangeToCheck => rangesOverlap(rangeToCheck, range));
    if (index !== -1) {
      newCollection[index] = mergeRanges(newCollection[index], range);
    } else {
      newCollection.push(range);
    }
    return newCollection;
  }, []);
  return newCollection;
};

console.log(merge(firstBusyPeriods, secondBusyPeriods));
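
If both arrays are already sorted by start, as the two sample arrays are, the concatenate-and-sort step could be replaced by a streaming merge. The following Python sketch illustrates that variation under this assumption; it is not a port of the answer above, and the names parse and union are hypothetical helpers.

```python
from datetime import datetime, timezone
from heapq import merge as heap_merge


def parse(ts):
    # The sample timestamps all use the "...Z" (UTC) form.
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def union(first, second):
    """Union of two lists of {'start', 'end'} dicts, each already sorted by start."""
    result = []
    # heapq.merge walks both sorted lists in start order without re-sorting them.
    for period in heap_merge(first, second, key=lambda p: parse(p["start"])):
        start, end = parse(period["start"]), parse(period["end"])
        if result and start <= result[-1][1]:
            result[-1] = (result[-1][0], max(result[-1][1], end))
        else:
            result.append((start, end))
    return result


first_busy = [
    {"start": "2017-04-05T10:00:00Z", "end": "2017-04-05T12:00:00Z"},
    {"start": "2017-04-05T14:00:00Z", "end": "2017-04-05T15:00:00Z"},
]
second_busy = [
    {"start": "2017-04-05T08:00:00Z", "end": "2017-04-05T11:00:00Z"},
    {"start": "2017-04-05T16:00:00Z", "end": "2017-04-05T17:00:00Z"},
]
busy = union(first_busy, second_busy)
```

For the sample periods this yields three ranges: 08:00 to 12:00, 14:00 to 15:00 and 16:00 to 17:00.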

Merge Overlapping Intervals and Track Maximum Value in BigQuery SQL

Below is for BigQuery Standard SQL. I assume you are still working on the same use case as in your previous question, so I kept this in line with that solution; you can extend it if, for example, you also want to account for priorities.

So, anyway:

#standardSQL
WITH check_times AS (
  SELECT id, start_time AS TIME FROM `project.dataset.table` UNION DISTINCT
  SELECT id, stop_time AS TIME FROM `project.dataset.table`
), distinct_intervals AS (
  SELECT id, TIME AS start_time, LEAD(TIME) OVER(PARTITION BY id ORDER BY TIME) stop_time
  FROM check_times
), deduped_intervals AS (
  SELECT a.id, a.start_time, a.stop_time, MAX(some_value) some_value
  FROM distinct_intervals a
  JOIN `project.dataset.table` b
  ON a.id = b.id
  AND a.start_time BETWEEN b.start_time AND b.stop_time
  AND a.stop_time BETWEEN b.start_time AND b.stop_time
  GROUP BY a.id, a.start_time, a.stop_time
), combined_intervals AS (
  SELECT id, MIN(start_time) start_time, MAX(stop_time) stop_time, MAX(some_value) some_value
  FROM (
    SELECT id, start_time, stop_time, some_value, COUNTIF(flag) OVER(PARTITION BY id ORDER BY start_time) grp
    FROM (
      SELECT id, start_time, stop_time, some_value,
        start_time != IFNULL(LAG(stop_time) OVER(PARTITION BY id ORDER BY start_time), start_time) flag
      FROM deduped_intervals
    )
  )
  GROUP BY id, grp
)
SELECT *
FROM combined_intervals
-- ORDER BY id, start_time

Applied to your sample data, the result is:

Row  id  start_time  stop_time  some_value
1    1   0           36         50
2    1   41          47         23
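
The steps of the query (cut the timeline at every start and stop time, keep the covered pieces with their maximum value, then glue adjacent pieces back together) can be sketched in Python for a single id. This is only an illustration of the approach: the function name merge_with_max and the sample events are invented here and are not the data from the question.

```python
def merge_with_max(events):
    """events: list of (start, stop, value); returns merged (start, stop, max_value)."""
    # All distinct boundaries, like the check_times CTE.
    points = sorted({t for start, stop, _ in events for t in (start, stop)})

    # Elementary pieces between consecutive boundaries, kept only when at
    # least one event fully covers them (like deduped_intervals).
    pieces = []
    for lo, hi in zip(points, points[1:]):
        values = [v for start, stop, v in events if start <= lo and hi <= stop]
        if values:
            pieces.append((lo, hi, max(values)))

    # Glue pieces whose boundaries touch (like combined_intervals).
    merged = []
    for lo, hi, value in pieces:
        if merged and merged[-1][1] == lo:
            prev_lo, _, prev_value = merged[-1]
            merged[-1] = (prev_lo, hi, max(prev_value, value))
        else:
            merged.append((lo, hi, value))
    return merged


# Invented events for one id.
events = [(0, 10, 5), (5, 36, 50), (41, 47, 23)]
combined = merge_with_max(events)
print(combined)  # [(0, 36, 50), (41, 47, 23)]
```

The piece from 36 to 41 is dropped because no event covers it, which is what separates the two merged intervals.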

Is it possible to add one more column to the result which will show the number of events during that time period?

#standardSQL
WITH check_times AS (
  SELECT id, start_time AS TIME FROM `project.dataset.table` UNION DISTINCT
  SELECT id, stop_time AS TIME FROM `project.dataset.table`
), distinct_intervals AS (
  SELECT id, TIME AS start_time, LEAD(TIME) OVER(PARTITION BY id ORDER BY TIME) stop_time
  FROM check_times
), deduped_intervals AS (
  SELECT a.id, a.start_time, a.stop_time, MAX(some_value) some_value, ANY_VALUE(TO_JSON_STRING(b)) event_hash
  FROM distinct_intervals a
  JOIN `project.dataset.table` b
  ON a.id = b.id
  AND a.start_time BETWEEN b.start_time AND b.stop_time
  AND a.stop_time BETWEEN b.start_time AND b.stop_time
  GROUP BY a.id, a.start_time, a.stop_time
), combined_intervals AS (
  SELECT id, MIN(start_time) start_time, MAX(stop_time) stop_time, MAX(some_value) some_value, COUNT(DISTINCT event_hash) events
  FROM (
    SELECT *, COUNTIF(flag) OVER(PARTITION BY id ORDER BY start_time) grp
    FROM (
      SELECT *,
        start_time != IFNULL(LAG(stop_time) OVER(PARTITION BY id ORDER BY start_time), start_time) flag
      FROM deduped_intervals
    )
  )
  GROUP BY id, grp
)
SELECT *
FROM combined_intervals
-- ORDER BY id, start_time

with result

Row  id  start_time  stop_time  some_value  events
1    1   0           36         50          8
2    1   41          47         23          1

How to consolidate date ranges in a list in C#

You mention that dates never overlap but I think it is slightly simpler to write code that just merges overlapping dates. First step is to define the date range type:

class Interval
{
    public DateTime From { get; set; }
    public DateTime To { get; set; }
}

You can then define an extension method that checks if two intervals overlap:

static class IntervalExtensions
{
    public static bool Overlaps(this Interval interval1, Interval interval2)
        => interval1.From <= interval2.From
            ? interval1.To >= interval2.From : interval2.To >= interval1.From;
}

Notice that this code assumes that From <= To, so you might want to change Interval into an immutable type and verify this in the constructor.
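
As a language-neutral illustration of that suggestion, here is a minimal Python sketch of an immutable interval that checks the invariant when it is constructed. The field names start and end and the method name overlaps are choices made for this sketch, not part of the C# code.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Interval:
    start: datetime
    end: datetime

    def __post_init__(self):
        # Reject inverted intervals up front so overlap checks can rely on start <= end.
        if self.start > self.end:
            raise ValueError("start must not be after end")

    def overlaps(self, other):
        return self.start <= other.end and other.start <= self.end


a = Interval(datetime(2020, 1, 1), datetime(2020, 1, 5))
b = Interval(datetime(2020, 1, 4), datetime(2020, 1, 8))
c = Interval(datetime(2020, 1, 9), datetime(2020, 1, 10))
print(a.overlaps(b), a.overlaps(c))  # True False
```

With frozen=True, assigning to a field after construction raises an error, so the invariant checked in __post_init__ cannot be broken later.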

You also need a way to merge two intervals:

// Add this method to the IntervalExtensions class.
public static Interval MergeWith(this Interval interval1, Interval interval2)
    => new Interval
    {
        From = new DateTime(Math.Min(interval1.From.Ticks, interval2.From.Ticks)),
        To = new DateTime(Math.Max(interval1.To.Ticks, interval2.To.Ticks))
    };

The next step is to define another extension method that iterates a sequence of intervals and tries to merge consecutive overlapping intervals. This is best done using an iterator block:

public static IEnumerable<Interval> MergeOverlapping(this IEnumerable<Interval> source)
{
    using (var enumerator = source.GetEnumerator())
    {
        if (!enumerator.MoveNext())
            yield break;
        var previousInterval = enumerator.Current;
        while (enumerator.MoveNext())
        {
            var nextInterval = enumerator.Current;
            if (!previousInterval.Overlaps(nextInterval))
            {
                yield return previousInterval;
                previousInterval = nextInterval;
            }
            else
            {
                previousInterval = previousInterval.MergeWith(nextInterval);
            }
        }
        yield return previousInterval;
    }
}

If two consecutive intervals don't overlap, it yields the previous interval. If they do overlap, it merges the two intervals and keeps the merged interval as the previous interval for the next iteration.

Your sample data is not sorted so before merging the intervals you have to sort them:

var mergedIntervals = intervals.OrderBy(interval => interval.From).MergeOverlapping();

However, if the real data is already sorted, as you have indicated in a comment, you can skip the sorting. The algorithm makes a single pass over the data and is thus O(n); with the sort it is O(n log n) overall.
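
The same single-pass shape can be sketched as a Python generator, which plays the role of the C# iterator block. This sketch assumes the input is already sorted by start and that intervals are plain (start, end) tuples; the function name merge_overlapping is chosen to echo the extension method above.

```python
def merge_overlapping(sorted_intervals):
    """Lazily merge overlapping (start, end) tuples from a start-sorted iterable."""
    iterator = iter(sorted_intervals)
    try:
        previous = next(iterator)
    except StopIteration:
        return  # Empty input: yield nothing.
    for current in iterator:
        if current[0] <= previous[1]:
            # Overlap: widen the pending interval and do not yield yet.
            previous = (previous[0], max(previous[1], current[1]))
        else:
            yield previous
            previous = current
    yield previous


merged = list(merge_overlapping([(1, 3), (2, 5), (7, 8)]))
print(merged)  # [(1, 5), (7, 8)]
```

Each input interval is looked at once and only one pending interval is held in memory, which is where the O(n) bound for sorted input comes from.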


