Finding Continuous Ranges in a Set of Numbers

Finding continuous ranges in a set of numbers

Theoretically the items in a set have no particular value, so I'm assuming you also have some continuous ID column that defines the order of the numbers. Something like this:

ID  Number
1 1000
2 1001
3 1002
4 1010
5 1011
6 1012
7 1013
8 1020
9 1021
10 1022

You could create an extra column that contains the result of Number - ID:

ID  Number  Diff
1 1000 999
2 1001 999
3 1002 999
4 1010 1006
5 1011 1006
6 1012 1006
7 1013 1006
8 1020 1012
9 1021 1012
10 1022 1012

Numbers in the same range will have the same result in the Diff column.

Identify groups of continuous numbers in a list

more_itertools.consecutive_groups was added in version 4.0.

Demo

import more_itertools as mit

iterable = [2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20]
[list(group) for group in mit.consecutive_groups(iterable)]
# [[2, 3, 4, 5], [12, 13, 14, 15, 16, 17], [20]]

Code

Applying this tool, we make a generator function that finds ranges of consecutive numbers.

def find_ranges(iterable):
"""Yield range of consecutive numbers."""
for group in mit.consecutive_groups(iterable):
group = list(group)
if len(group) == 1:
yield group[0]
else:
yield group[0], group[-1]

iterable = [2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20]
list(find_ranges(iterable))
# [(2, 5), (12, 17), 20]

The source implementation emulates a classic recipe (as demonstrated by @Nadia Alramli).

Note: more_itertools is a third-party package installable via pip install more_itertools.

Searching for continuous range of numbers, while ignoring gaps = 5

After wondering a while why my last query works in SQL Fiddle, but not on my "real" MySQL database, I've found the solution.

When building the schema in SQL Fiddle, the thetime values are inserted in ascending order. However, the thetime values produced by the query in (3) are in random order. Because row_number depends on the order in which the rows are processed, the values have to be sorted before feeding them in the last query.

As a result, making the last query work requires the following change:

select *
from
(select min(thetime) as start, max(thetime) as einde, max(thetime) - min(thetime) + 1 as delta
from (select thetime, @curRow := @curRow + 1 as row_number
from (select * from db_tmp_ranges where thetime is not null order by thetime) p
join (select @curRow := 0) r) p
group by thetime - row_number) q
order by start

Finding contiguous ranges in arrays

I think that the following solution will work in O(n) time using O(n) space.

Begin by putting all of the entries in the array into a hash table. Next, create a second hash table which stores elements that we have "visited," which is initially empty.

Now, iterate across the array of elements one at a time. For each element, check if the element is in the visited set. If so, skip it. Otherwise, count up from that element upward. At each step, check if the current number is in the main hash table. If so, continue onward and mark the current value as part of the visited set. If not, stop. Next, repeat this procedure, except counting downward. This tells us the number of contiguous elements in the range containing this particular array value. If we keep track of the largest range found this way, we will have a solution to our problem.

The runtime complexity of this algorithm is O(n). To see this, note that we can build the hash table in the first step in O(n) time. Next, when we begin scanning to array to find the largest range, each range scanned takes time proportional to the length of that range. Since the total sum of the lengths of the ranges is the number of elements in the original array, and since we never scan the same range twice (because we mark each number that we visit), this second step takes O(n) time as well, for a net runtime of O(n).

EDIT: If you're curious, I have a Java implementation of this algorithm, along with a much more detailed analysis of why it works and why it has the correct runtime. It also explores a few edge cases that aren't apparent in the initial description of the algorithm (for example, how to handle integer overflow).

Hope this helps!

Find ranges from a series of numbers in SQL/Oracle

You could do it using ROW_NUMBER analytic function. See Find range of consecutive values in a sequence of numbers or dates.

For example,

Range

SQL> with data(num) as(
2 select 1 from dual union
3 select 2 from dual union
4 select 3 from dual union
5 select 5 from dual union
6 select 6 from dual union
7 select 7 from dual union
8 select 10 from dual union
9 select 11 from dual union
10 select 12 from dual union
11 select 20 from dual
12 )
13 select min(num)||'-'|| max(num) as "range"
14 from (select num,
15 num-Row_Number() over(order by num)
16 as rn
17 from data)
18 group by rn
19 order by min(num);

range
-------------------------------------------------
1-3
5-7
10-12
20-20

SQL>

List

SQL> with data(num) as(
2 select 1 from dual union
3 select 2 from dual union
4 select 3 from dual union
5 select 5 from dual union
6 select 6 from dual union
7 select 7 from dual union
8 select 10 from dual union
9 select 11 from dual union
10 select 12 from dual union
11 select 20 from dual
12 )
13 SELECT listagg(range, ',') WITHIN GROUP(
14 ORDER BY min_num) AS "list"
15 FROM
16 (SELECT MIN(num) min_num,
17 MIN(num)
18 ||'-'
19 || MAX(num) range
20 FROM
21 (SELECT num, num-Row_Number() over(order by num) AS rn FROM DATA
22 )
23 GROUP BY rn
24 );

list
-------------------------------------------------------------------------
1-3,5-7,10-12,20-20

SQL>

Update OP wants a solution in PL/SQL to store the list in a PL/SQL variable.

Setup

SQL> CREATE TABLE t AS
2 SELECT *
3 FROM
4 ( WITH data(num) AS
5 ( SELECT 1 FROM dual
6 UNION
7 SELECT 2 FROM dual
8 UNION
9 SELECT 3 FROM dual
10 UNION
11 SELECT 5 FROM dual
12 UNION
13 SELECT 6 FROM dual
14 UNION
15 SELECT 7 FROM dual
16 UNION
17 SELECT 10 FROM dual
18 UNION
19 SELECT 11 FROM dual
20 UNION
21 SELECT 12 FROM dual
22 UNION
23 SELECT 20 FROM dual
24 )
25 SELECT * FROM DATA);

Table created.

PL/SQL block

SQL> SET SERVEROUTPUT ON
SQL> DECLARE
2 v_list VARCHAR2(100);
3 BEGIN
4 SELECT listagg(RANGE, ',') WITHIN GROUP(
5 ORDER BY min_num)
6 INTO v_list
7 FROM
8 (SELECT MIN(num) min_num,
9 MIN(num)
10 ||'-'
11 || MAX(num) range
12 FROM
13 (SELECT num, num-Row_Number() over(order by num) AS rn FROM t
14 )
15 GROUP BY rn
16 );
17 dbms_output.put_line(v_list);
18 END;
19 /
1-3,5-7,10-12,20-20

PL/SQL procedure successfully completed.

SQL>

Identify groups of continuous numbers from consecutive list in python

L1 = [5,3,2,7,1]
L2 = [3,5,6,8,9,21,2]
L3 = [5,3,6,7,3,9]
cons_l = []
L = [L2] + [L3] #+[L4] #+ ...+ ..... ### Add any number of list here..

j = 0
for l1 in L1:
cons_l.append([])
cons_l[j].append(l1)
for l in range(0, len(L)):
if l1+l+1 in L[l]:
cons_l[j].append(l1+l+1)
else:
del cons_l[j]
j -= 1
break
j += 1
print cons_l

Identify groups of varying continuous numbers in a list

You can create an iterator to help grouping and try to pull the next element from the following group which will be the end of the previous group:

def ranges(lst):
it = iter(lst)
next(it) # move to second element for comparison
grps = groupby(lst, key=lambda x: (x - next(it, -float("inf"))))
for k, v in grps:
i = next(v)
try:
step = next(v) - i # catches single element v or gives us a step
nxt = list(next(grps)[1])
yield xrange(i, nxt.pop(0), step)
# outliers or another group
if nxt:
yield nxt[0] if len(nxt) == 1 else xrange(nxt[0], next(next(grps)[1]), nxt[1] - nxt[0])
except StopIteration:
yield i # no seq

which give you:

In [2]: l1 = [2, 3, 4, 5, 8, 10, 12, 14, 13, 14, 15, 16, 17, 20, 21]

In [3]: l2 = [2, 4, 6, 8, 12, 13, 14, 15, 16, 17, 20]

In [4]: l3 = [13, 14, 15, 16, 17, 18]

In [5]: s1 = [i + 10 for i in xrange(0, 11, 2)]

In [6]: s2 = [30]

In [7]: s3 = [i + 40 for i in xrange(45)]

In [8]: l4 = s1 + s2 + s3

In [9]: l5 = [1, 2, 5, 6, 9, 10]

In [10]: l6 = {1, 2, 3, 5, 6, 9, 10, 13, 19, 21, 22, 23, 24}

In [11]:

In [11]: for l in (l1, l2, l3, l4, l5, l6):
....: print(list(ranges(l)))
....:
[xrange(2, 5), xrange(8, 14, 2), xrange(13, 17), 20, 21]
[xrange(2, 8, 2), xrange(12, 17), 20]
[xrange(13, 18)]
[xrange(10, 20, 2), 30, xrange(40, 84)]
[1, 2, 5, 6, 9, 10]
[xrange(1, 3), 5, 6, 9, 10, 13, 19, xrange(21, 24)]

When the step is 1 it is not included in the xrange output.



Related Topics



Leave a reply



Submit