Merging a list of time-range tuples that have overlapping time-ranges
A few ways to make it more efficient, Pythonic:
- Eliminate the
set()
construction, since the algorithm should prune out duplicates during in the main loop. - If you just need to iterate over the results, use
yield
to generate the values. - Reduce construction of intermediate objects, for example: move the
tuple()
call to the point where the final values are produced, saving you from having to construct and throw away extra tuples, and reuse a listsaved
for storing the current time range for comparison.
Code:
def merge(times):
saved = list(times[0])
for st, en in sorted([sorted(t) for t in times]):
if st <= saved[1]:
saved[1] = max(saved[1], en)
else:
yield tuple(saved)
saved[0] = st
saved[1] = en
yield tuple(saved)
data = [
[(1, 5), (2, 4), (3, 6)],
[(1, 3), (2, 4), (5, 8)]
]
for times in data:
print list(merge(times))
Merging time ranges in python
You need to start from a sorted set of intervals; if they are not yet sorted start with that:
from operator import itemgetter
sorted_times = sorted(times, key=itemgetter('time_open', 'time_close'))
You can then merge times simply by comparing start times to preceding end times; when they don't over lap yield the updated time:
def merge_times(times):
times = iter(times)
merged = next(times).copy()
for entry in times:
start, end = entry['time_open'], entry['time_close']
if start <= merged['time_close']:
# overlapping, merge
merged['time_close'] = max(merged['time_close'], end)
else:
# distinct; yield merged and start a new copy
yield merged
merged = entry.copy()
yield merged
This is a generator function, so merged times are yielded on demand. Use a loop to process these one by one, or use list()
on the generator to pull all results into a list object.
Demo using your sample data (which happens to be sorted already):
>>> for entry in merge_times(times):
... print(entry)
...
{'time_open': datetime.time(9, 0), 'time_close': datetime.time(12, 0)}
{'time_open': datetime.time(13, 0), 'time_close': datetime.time(19, 0)}
>>> list(merge_times(times))
[{'time_open': datetime.time(9, 0), 'time_close': datetime.time(12, 0)}, {'time_open': datetime.time(13, 0), 'time_close': datetime.time(19, 0)}]
How to correctly merge overlapping datetime ranges in Python
I guess this is what you want.
Give it a test and comment:
def merge_date_ranges(data):
result = []
t_old = data[0]
for t in data[1:]:
if t_old[1] >= t[0]: #I assume that the data is sorted already
t_old = ((min(t_old[0], t[0]), max(t_old[1], t[1])))
else:
result.append(t_old)
t_old = t
else:
result.append(t_old)
return result
I am assuming that the dates are already sorted.
BTW, I see that the only weird dates are thos that are single days, maybe you should fix your input data instead.
salida = merge_date_ranges(data)
for item in [t for t in data if t not in salida]:
print item
(datetime.datetime(2016, 1, 10, 14, 0), datetime.datetime(2016, 1, 10, 14, 0))
(datetime.datetime(2016, 1, 11, 14, 0), datetime.datetime(2016, 1, 11, 14, 0))
(datetime.datetime(2016, 1, 12, 14, 0), datetime.datetime(2016, 1, 12, 14, 0))
(datetime.datetime(2016, 1, 13, 14, 0), datetime.datetime(2016, 1, 13, 14, 0))
(datetime.datetime(2016, 1, 13, 22, 0), datetime.datetime(2016, 1, 13, 22, 0))
(datetime.datetime(2016, 1, 14, 14, 0), datetime.datetime(2016, 1, 14, 14, 0))
(datetime.datetime(2016, 1, 14, 22, 0), datetime.datetime(2016, 1, 14, 22, 0))
(datetime.datetime(2016, 1, 15, 14, 0), datetime.datetime(2016, 1, 15, 14, 0))
(datetime.datetime(2016, 1, 15, 22, 0), datetime.datetime(2016, 1, 15, 22, 0))
Merge consecutive and overlapping date ranges
- within customer
groupby("cust", as_index=False)
look for overlapping dates in rows cumsum()
sums booleans to generate a group for overlapping dates- finally simple case on min/max within groups
df.groupby("cust", as_index=False).apply(
lambda d: d.sort_values(["start_date", "end_date"])
.groupby(
["cust", (~(d["start_date"] <= (d["end_date"].shift() + pd.Timedelta(days=1)))).cumsum()],
as_index=False
)
.agg({"start_date": "min", "end_date": "max"})
).reset_index(drop=True)
cust | start_date | end_date | |
---|---|---|---|
0 | CUST123 | 2021-01-01 00:00:00 | 2021-01-31 00:00:00 |
1 | CUST123 | 2021-02-02 00:00:00 | 2021-02-28 00:00:00 |
2 | CUST456 | 2021-01-05 00:00:00 | 2021-01-31 00:00:00 |
Collapse a list of range tuples into the overlapping ranges
import operator
lst = [(0, 2), (1, 2), (0, 4), (2,6), (23, 2), (22, 6), (26, 2), (26, 2), (26, 2)]
lst.sort(key=operator.itemgetter(1))
for i in reversed(xrange(len(lst)-1)):
start, length = lst[i]
for j in xrange(i+1, len(lst)):
lstart, llength = lst[j]
if start >= lstart and start + length <= lstart + llength:
del lst[i]
break
print lst
#[(0, 4), (2, 6), (22, 6)]
How can I merge multiple overlapping date ranges and create new ones?
Here's a sketch of an algorithm to do this:
The data struct would be:
type Boundary struct {
Time time.Time
AddRemove int
Value int
}
A Boundary
would represent a Value
added or removed from the list of values at a given time. For a range:
[from,to]=number
you create two Boundary
objects:
b1:=Boundary{Time:from,AddRemove: 1, Value: number}
b2:=Boundary{Time:to,AddRemove:-1,Value:number}
You can then sort all boundary objects by time and AddRemove
. If times are equal, you should process adds first, then removes. Once this is done, you can process the boundary objects, and create your ranges:
last:=time.Time{}
values:=map[int]struct{}{}
for _,b:=range boundaries {
if last.IsZero() {
last=b.Time
values[b.Value]=struct{}{}
} else {
// Create a new range here with [last,b.Time] with values given in `values`
if b.AddRemove==1 {
values[b.Value]=struct{}{}
} else {
delete(values,b.Value)
}
last=b.Time
}
}
How can I append overlapping tuples in Python and list them in order?
Assuming your meeting times will not overlap each other because you cannot be in two places in the same time. Thus, (1,4) (3,5) cannot happen.
One way to do this is to iterate with a variable that remembers the previous tuple, in this case I'll name it "previous", I'll keep your variable "m" as it is.
Every time you go on the next tuple in your meeting times, you compare the ending meeting time from previous tuple with current starting meeting time. Then you remove the previous tuple because it will be a duplicate. If no condense operation, then you can just append to the list and then set "m" to "previous" for the next iteration.
# Variable containing list of tuples
meeting_times = [(0, 1), (3, 5), (4, 8), (10, 12), (9, 10)]
# Sort meetings by start time
ordered_list = sorted(meeting_times)
print('Given time tuple: ', ordered_list)
def condensed_meeting_time(append_meeting_time, previous):
# For each number in the variable
for m in ordered_list:
if previous: # This is to catch empty tuple that was initial
if m[0] == previous[1]: # If current starting meeting time is equal to previous ending meeting time ...
del append_meeting_time[-1] # Remove the previous element
append_meeting_time.append( (previous[0], m[1]) ) # Create new tuple to merge previous start time to current ending time
else: # If the time does not overlap, append normally and set current meeting time to 'previous' for next iteration
append_meeting_time.append(m)
previous = m
else: # Base case to append the first element.
append_meeting_time.append(m)
previous = m
return append_meeting_time
print('After condensed range: ',condensed_meeting_time([], ()))
This is the output I've got:
Given time tuple: [(0, 1), (3, 5), (4, 8), (9, 10), (10, 12)]
After condensed range: [(0, 1), (3, 5), (4, 8), (9, 12)]
I hope this helps.
Combine overlapping ranges of numbers
If I understand you correctly, you can do this:
from itertools import combinations
l = [[83,77],[103,97],[82,76],[101,95],[78,72],[97,91],[72,66],[89,83],[63,57],[78,72],[53,47],[65,59],[41,35],[50,44],[28,22],[34,28],[14,8],[16,10]]
def any_items_overlap(l):
# For each possible pair of lists in l
for item1, item2 in combinations(l, 2):
max1, min1 = item1
max2, min2 = item2
if min1 > max2 or max1 < min2:
# no overlap so ignore this pair
continue
else: # One of the combinations overlaps, so return them
return item1, item2
return None
while True:
if not any_items_overlap(l):
# No items overlapped - break the loop and finish
print(l)
break
else: # There are still overlaps
item1, item2 = any_items_overlap(l)
# Remove the items from the main list
l.remove(item1)
l.remove(item2)
# Replace them with a merged version
item_values = item1 + item2
l.append([max(item_values), min(item_values)])
# Start the loop again to check for any other overlaps
This gives:
[[41, 35], [103, 91], [65, 57], [53, 44], [34, 22], [16, 8], [89, 66]]
Related Topics
Using Psycopg2 with Lambda to Update Redshift (Python)
How to Annotate Types of Multiple Return Values
Exponentials in Python: X**Y VS Math.Pow(X, Y)
Real World Example About How to Use Property Feature in Python
Appending List But Error 'Nonetype' Object Has No Attribute 'Append'
What Is the Best Approach to Change Primary Keys in an Existing Django App
Unicodeencodeerror: 'Ascii' Codec Can't Encode Character at Special Name
Value Error Trying to Install Python for Windows Extensions
From ... Import or Import ... as for Modules
Python - Read File from and to Specific Lines of Text
How to Print a List with Integers Without the Brackets, Commas and No Quotes