How to Group on Continuous Ranges

How do I group on continuous ranges


WITH q AS
(
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY crew, dayType ORDER BY [date]) AS rnd,
           ROW_NUMBER() OVER (PARTITION BY crew ORDER BY [date]) AS rn
    FROM mytable
)
SELECT MIN([date]), MAX([date]), crew AS name, dayType
FROM q
GROUP BY
    crew, dayType, rnd - rn

This article may be of interest to you:

  • Things SQL needs: SERIES()

How to group data based on continuous date range?


The date range for the price 23.9 is not right because the price is not the same for every day in that range.

Because the same price appears in two separate date ranges, grouping by price alone with an aggregate function collapses them into a single row.

This is a gaps-and-islands problem. We can use the ROW_NUMBER window function to compute a value that stays constant within each continuous run of dates, and then group by that value.

SELECT Product_Code,
       min(Pricing_Date) AS Min_Date,
       max(Pricing_Date) AS Max_Date,
       price
FROM (
    SELECT *,
           ROW_NUMBER() OVER(ORDER BY PRICING_DATE)
           - ROW_NUMBER() OVER(PARTITION BY PRODUCT_CODE,PRICE ORDER BY PRICING_DATE) grp
    FROM PRICE_DATA
) t1
GROUP BY grp,Product_Code,price
ORDER BY min(Pricing_Date)

sqlfiddle

Explanation

A characteristic of gaps-and-islands problems is that, for rows in the same continuous run, (a sequence numbered over all rows) minus (a sequence numbered within each group of matching values) yields the same result, which can serve as a grouping key.

So we can use

  • ROW_NUMBER() OVER(ORDER BY PRICING_DATE) to number all rows in one continuous sequence.
  • ROW_NUMBER() OVER(PARTITION BY PRODUCT_CODE,PRICE ORDER BY PRICING_DATE) to number the rows within each product and price.

The difference between the two gives a grouping column that is constant within each continuous run, as shown in the sqlfiddle.
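To see why the difference of the two row numbers works, here is a minimal Python sketch of the same idea. The sample rows and the function name islands are made up for illustration; the tuple layout (product_code, pricing_date, price) only mirrors the columns used in the query above.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample rows: (product_code, pricing_date, price).
rows = [
    ("A", date(2020, 1, 1), 23.9),
    ("A", date(2020, 1, 2), 23.9),
    ("A", date(2020, 1, 3), 25.0),
    ("A", date(2020, 1, 4), 23.9),
    ("A", date(2020, 1, 5), 23.9),
]


def islands(rows):
    """Group rows into continuous runs using the row-number difference."""
    ordered = sorted(rows, key=lambda r: r[1])
    per_key_rn = defaultdict(int)  # row number within each (product, price)
    groups = defaultdict(list)
    for overall_rn, (code, day, price) in enumerate(ordered, start=1):
        per_key_rn[(code, price)] += 1
        # Constant while the same (product, price) keeps repeating.
        grp = overall_rn - per_key_rn[(code, price)]
        groups[(grp, code, price)].append(day)
    result = [
        (code, min(days), max(days), price)
        for (grp, code, price), days in groups.items()
    ]
    return sorted(result, key=lambda r: r[1])


for row in islands(rows):
    print(row)
```

With this sample, the two runs at price 23.9 get the differences 0 and 1, so they stay separate instead of collapsing into one row.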

How to group data on continuous date ranges

In MS Access, you can use a correlated subquery to generate a sequential number. Then subtract the sequential number from the date to identify the groups:

select id, min(date), max(date), att
from (select t.*,
             (select count(*)
              from t as t2
              where t2.id = t.id and
                    t2.date <= t.date
             ) as seqnum
      from t
     ) as t
group by id, att, dateadd("d", - seqnum, date)
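The date-minus-sequence trick can be sketched in Python as follows. This is an illustration under assumptions, not a translation of Access behaviour: the sample rows and the name group_by_date_anchor are hypothetical, and each id is assumed to have at most one row per date, so a running count equals the correlated count(*).

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical sample rows: (id, date, att).
rows = [
    (1, date(2021, 3, 1), "present"),
    (1, date(2021, 3, 2), "present"),
    (1, date(2021, 3, 5), "present"),
    (1, date(2021, 3, 6), "absent"),
]


def group_by_date_anchor(rows):
    """Group on (id, att, date - seqnum), like the query's GROUP BY."""
    seq_per_id = defaultdict(int)
    groups = defaultdict(list)
    for id_, day, att in sorted(rows):
        seq_per_id[id_] += 1  # rows for this id with date <= current date
        anchor = day - timedelta(days=seq_per_id[id_])
        groups[(id_, att, anchor)].append(day)
    result = [
        (id_, min(days), max(days), att)
        for (id_, att, anchor), days in groups.items()
    ]
    return sorted(result, key=lambda r: (r[0], r[1]))


for row in group_by_date_anchor(rows):
    print(row)
```

Consecutive dates share the same anchor, while a gap in the dates shifts it. One caveat worth checking against your data: because the sequence counts all rows for the id, two runs of the same att separated by a different att on consecutive dates would share an anchor and be merged.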

How do I group on continuous ranges (mysql 5.7)

If I understand correctly, you can do this in MySQL by doing:

select user_id, min(created_at) as ca_0, created_at_1 as ca_1
from (select t.*,
             (select min(t2.created_at)
              from t t2
              where t2.user_id = t.user_id and t2.down = 1 and
                    t2.created_at > t.created_at
             ) as created_at_1
      from t
      where t.down = 0
     ) tt
group by user_id, created_at_1;
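As a rough illustration of what the query computes, here is a Python sketch that pairs each down = 0 row with the next down = 1 timestamp for the same user and keeps the earliest start per pair. The sample data, the integer timestamps, and the name pair_ranges are assumptions made for the example.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical rows: (user_id, created_at, down); integers stand in for timestamps.
rows = [
    (1, 10, 0), (1, 11, 0), (1, 12, 1), (1, 13, 1),
    (1, 14, 0), (1, 15, 1), (1, 16, 0),
]


def pair_ranges(rows):
    """Return (user_id, ca_0, ca_1) tuples like the query's result set."""
    down_times = defaultdict(list)
    for user, ts, down in rows:
        if down == 1:
            down_times[user].append(ts)
    for times in down_times.values():
        times.sort()

    first_start = {}  # (user, next down=1 timestamp) -> earliest down=0 timestamp
    for user, ts, down in rows:
        if down != 0:
            continue
        times = down_times.get(user, [])
        i = bisect_right(times, ts)  # first down=1 timestamp strictly after ts
        nxt = times[i] if i < len(times) else None
        key = (user, nxt)
        if key not in first_start or ts < first_start[key]:
            first_start[key] = ts
    return sorted(
        ((user, start, nxt) for (user, nxt), start in first_start.items()),
        key=lambda r: (r[0], r[1]),
    )


print(pair_ranges(rows))
```

A trailing down = 0 row with no later down = 1 row ends up with None as its end, which corresponds to the NULL group the SQL would produce.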

I have no idea how to express this in Laravel.

SQL Group by continuous range of integer values

This is a type of "gaps-and-islands" problem. You can solve it by subtracting a sequence number from number. The difference is constant when the numbers are sequential:

select owner, min(number) as from_number, max(number) as to_number
from (select t.*,
             row_number() over (partition by owner order by number) as seqnum
      from t
     ) t
group by owner, (number - seqnum);
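The same subtraction can be sketched in Python with itertools.groupby. The sample rows and the name number_ranges are hypothetical, and the sketch assumes each owner has distinct numbers.

```python
from itertools import groupby

# Hypothetical sample rows: (owner, number).
rows = [
    ("alice", 1), ("alice", 2), ("alice", 3), ("alice", 7),
    ("bob", 4), ("bob", 5), ("alice", 8),
]


def number_ranges(rows):
    """Return (owner, from_number, to_number) for each run of consecutive numbers."""
    result = []
    for owner, owner_rows in groupby(sorted(rows), key=lambda r: r[0]):
        # enumerate plays the role of row_number() partitioned by owner.
        numbered = enumerate((n for _, n in owner_rows), start=1)
        # number - seqnum is constant inside a run of consecutive numbers.
        for _, run in groupby(numbered, key=lambda p: p[1] - p[0]):
            numbers = [n for _, n in run]
            result.append((owner, numbers[0], numbers[-1]))
    return result


print(number_ranges(rows))
```

For alice the differences are 0, 0, 0, 3, 3, which splits her numbers into the runs 1 to 3 and 7 to 8.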

How to group continuous ranges using MySQL

MySQL versions before 8.0 don't support analytic (window) functions, but you can emulate such behaviour with user-defined variables:

SELECT CatID, Begin, MAX(Date) AS End, Rate
FROM (
    SELECT my_table.*,
           @f:=CONVERT(
               IF(@c<=>CatId AND @r<=>Rate AND DATEDIFF(Date, @d)=1, @f, Date), DATE
           ) AS Begin,
           @c:=CatId, @d:=Date, @r:=Rate
    FROM my_table JOIN (SELECT @c:=NULL) AS init
    ORDER BY CatId, Rate, Date
) AS t
GROUP BY CatID, Begin, Rate

See it on sqlfiddle.
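The user-variable approach amounts to walking the rows in sorted order while remembering the previous row. A minimal Python sketch of that idea follows; the sample rows and the name rate_ranges are hypothetical, and the tuple layout (cat_id, date, rate) only mirrors the columns in the query.

```python
from datetime import date

# Hypothetical sample rows: (cat_id, date, rate).
rows = [
    (1, date(2012, 1, 1), 10),
    (1, date(2012, 1, 2), 10),
    (1, date(2012, 1, 3), 12),
    (1, date(2012, 1, 4), 10),
    (2, date(2012, 1, 1), 10),
]


def rate_ranges(rows):
    """Return (cat_id, begin, end, rate), carrying state from row to row."""
    result = []
    prev = None  # (cat_id, rate, day) of the previous row, like @c, @r, @d
    # Same ordering as the query: CatId, Rate, Date.
    for cat_id, rate, day in sorted((c, r, d) for c, d, r in rows):
        continues_run = (
            prev is not None
            and prev[0] == cat_id
            and prev[1] == rate
            and (day - prev[2]).days == 1
        )
        if continues_run:
            result[-1][2] = day  # extend the current range
        else:
            result.append([cat_id, day, day, rate])  # start a new range
        prev = (cat_id, rate, day)
    return [tuple(r) for r in result]


for row in rate_ranges(rows):
    print(row)
```

Unlike the SQL version, this does not depend on the order in which a database engine evaluates variable assignments, which is one reason window functions are usually preferred where they are available.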

Identify groups of continuous numbers in a list

more_itertools.consecutive_groups was added in version 4.0.

Demo

import more_itertools as mit


iterable = [2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20]
[list(group) for group in mit.consecutive_groups(iterable)]
# [[2, 3, 4, 5], [12, 13, 14, 15, 16, 17], [20]]

Code

Applying this tool, we make a generator function that finds ranges of consecutive numbers.

def find_ranges(iterable):
    """Yield range of consecutive numbers."""
    for group in mit.consecutive_groups(iterable):
        group = list(group)
        if len(group) == 1:
            # A lone number is yielded as-is rather than as a pair.
            yield group[0]
        else:
            yield group[0], group[-1]


iterable = [2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20]
list(find_ranges(iterable))
# [(2, 5), (12, 17), 20]

The source implementation emulates a classic recipe (as demonstrated by @Nadia Alramli).

Note: more_itertools is a third-party package installable via pip install more_itertools.

Group rows by contiguous date ranges for groups of values

You can identify the groups by using the difference of two row_number() values. Consecutive rows with the same rate will have a constant difference.

select col1, col2, date1, min(date2), max(date2), rate
from (select t.*,
             (row_number() over (partition by col1, col2, date1 order by date2) -
              row_number() over (partition by col1, col2, date1, rate order by date2)
             ) as grp
      from table t
     ) t
group by col1, col2, date1, rate, grp

Group consecutive ranges

I wrote this one, which worked in my tests, and almost all the main databases out there should be able to run it. I suffixed my column names with an underscore, so please change the names before testing:

SELECT
    r1.person_,
    r1.name_,
    r1.country_,
    CASE
        WHEN max(r2.begin_) = max(r1.begin_)
        THEN max(r1.info_) ELSE '-'
    END info_,
    MAX(r2.begin_) begin_,
    r1.end_
FROM stack_39626781 r1
INNER JOIN stack_39626781 r2 ON 1=1
    AND r2.person_ = r1.person_
    AND r2.begin_ <= r1.begin_ -- just optimizing...
LEFT JOIN stack_39626781 r3 ON 1=1
    AND r3.person_ = r1.person_
    -- matches when another range overlaps this range end
    AND r3.end_ >= r1.end_ + 1
    AND r3.begin_ <= r1.end_ + 1
LEFT JOIN stack_39626781 r4 ON 1=1
    AND r4.person_ = r2.person_
    -- matches when another range overlaps this range begin
    AND r4.end_ >= r2.begin_ - 1
    AND r4.begin_ <= r2.begin_ - 1
WHERE 1=1
    -- get rows
    -- with no overlaps on end range and
    -- with no overlaps on begin range
    AND r3.person_ IS NULL
    AND r4.person_ IS NULL
GROUP BY
    r1.person_,
    r1.name_,
    r1.country_,
    r1.end_

This query is based on the fact that the ranges in the output have no connections/overlaps with each other. For an output of five ranges, say, there are five begins and five ends with no connections/overlaps. Finding and pairing them should be easier than generating all the connections/overlaps. So, what this query does is:

  1. Find all ranges per person with no overlaps/connections on their end value;
  2. Find all ranges per person with no overlaps/connections on their begin value;
  3. These are the valid ranges, so associate them all to find the correct pair;
  4. For each person and end, the matching begin is the largest available one that is less than or equal to that end. This rule is easy to validate: first, a begin cannot be greater than its end; second, if two or more candidate begins are less than the end, e.g., begin1 = end - 2 and begin2 = end - 5, selecting the lesser one (begin2) would make the greater one (begin1) an overlap of this range.
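For comparison, the same end result (merged ranges per person) can be sketched in Python with a different technique, a sort-and-sweep pass, rather than the self-join logic described above. The sample data and the name merge_ranges are hypothetical; begin and end are assumed to be integers, and the name_, country_ and info_ columns are left out to keep the sketch short.

```python
# Hypothetical sample ranges: (person, begin, end), with integer bounds.
ranges = [
    ("p1", 1, 5),
    ("p1", 4, 8),
    ("p1", 9, 10),
    ("p1", 20, 25),
    ("p2", 3, 4),
]


def merge_ranges(ranges):
    """Merge overlapping or adjacent ranges per person."""
    merged = []
    for person, begin, end in sorted(ranges):
        last = merged[-1] if merged else None
        # begin <= last end + 1 treats touching ranges as connected,
        # matching the "+ 1" / "- 1" comparisons in the query.
        if last is not None and last[0] == person and begin <= last[2] + 1:
            last[2] = max(last[2], end)
        else:
            merged.append([person, begin, end])
    return [tuple(m) for m in merged]


print(merge_ranges(ranges))
```

The output ranges have no overlaps or connections with each other, which is the property the query relies on when it looks for begins and ends that nothing else touches.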

Hope it helps.


