Find length of sequences of identical values in a numpy array (run length encoding)
While not numpy primitives, itertools functions are often very fast, so do give this one a try (and measure times for various solutions including this one, of course):
import itertools

def runs_of_ones(bits):
    for bit, group in itertools.groupby(bits):
        if bit:
            yield sum(group)
If you do need the values in a list, you can just use list(runs_of_ones(bits)), of course; but a list comprehension might be marginally faster still:
def runs_of_ones_list(bits):
    return [sum(g) for b, g in itertools.groupby(bits) if b]
Moving to "numpy-native" possibilities, what about:
import numpy

def runs_of_ones_array(bits):
    # make sure all runs of ones are well-bounded
    bounded = numpy.hstack(([0], bits, [0]))
    # get 1 at run starts and -1 at run ends
    difs = numpy.diff(bounded)
    run_starts, = numpy.where(difs > 0)
    run_ends, = numpy.where(difs < 0)
    return run_ends - run_starts
Again: be sure to benchmark the solutions against each other on examples that are realistic for you!
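As a rough illustration of such a benchmark, here is a minimal sketch using timeit. The input size, the random seed and the number of repetitions are arbitrary assumptions, not part of the original answer; substitute data that resembles your own.

```python
import itertools
import timeit

import numpy


def runs_of_ones_list(bits):
    # Pure-Python variant: groupby yields one (value, run) pair per run.
    return [sum(g) for b, g in itertools.groupby(bits) if b]


def runs_of_ones_array(bits):
    # NumPy variant: pad with zeros, then diff marks run starts (+1) and ends (-1).
    bounded = numpy.hstack(([0], bits, [0]))
    difs = numpy.diff(bounded)
    run_starts, = numpy.where(difs > 0)
    run_ends, = numpy.where(difs < 0)
    return run_ends - run_starts


# Hypothetical test data: 20,000 random bits with a fixed seed.
rng = numpy.random.default_rng(0)
bits = rng.integers(0, 2, size=20_000)

# Both variants should agree before their timings are compared.
assert runs_of_ones_list(bits) == runs_of_ones_array(bits).tolist()

for func in (runs_of_ones_list, runs_of_ones_array):
    seconds = timeit.timeit(lambda: func(bits), number=5)
    print(f"{func.__name__}: {seconds / 5:.4f} s per call")
```

The timings depend heavily on the input type (list versus array) and on how long the runs are, so treat the printed numbers as indicative only.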
How to find Run length encoding in python
You can do this with groupby:
from itertools import groupby
ar = [2,2,2,1,1,2,2,3,3,3,3]
print([(k, sum(1 for i in g)) for k,g in groupby(ar)])
# [(2, 3), (1, 2), (2, 2), (3, 4)]
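To show that the (value, count) pairs carry the whole sequence, here is a small round-trip sketch. The helper names rle_encode and rle_decode are made up for this illustration and are not part of the original answer.

```python
from itertools import chain, groupby, repeat


def rle_encode(values):
    # One (value, run_length) pair per run of equal neighbours.
    return [(k, sum(1 for _ in g)) for k, g in groupby(values)]


def rle_decode(pairs):
    # Repeat each value run_length times and flatten the result.
    return list(chain.from_iterable(repeat(k, n) for k, n in pairs))


ar = [2, 2, 2, 1, 1, 2, 2, 3, 3, 3, 3]
encoded = rle_encode(ar)
print(encoded)                     # [(2, 3), (1, 2), (2, 2), (3, 4)]
print(rle_decode(encoded) == ar)   # True
```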
Getting ranges of sequences of identical entries with minimum length in a numpy array
One approach using np.diff and np.where -
# Pad with -1 at both ends and take the first difference
# (this relies on `a` holding only the values 1 and -1)
dfa = np.diff(np.hstack((-1,a,-1)))
# Positions where runs of 1s start, and the positions just past where they end
starts = np.where(dfa==2)[0]
stops = np.where(dfa==-2)[0]
# Keep only the runs that are at least 3 long
valid_mask = (stops - starts) >= 3
# Finally collect the valid [start, stop) pairs as the output
out = np.column_stack((starts,stops))[valid_mask].tolist()
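For a concrete run of the steps above, here is a minimal sketch wrapped in a function. The function name long_runs_of_ones, the min_len parameter and the sample array are assumptions for illustration; the input is assumed to contain only 1 and -1.

```python
import numpy as np


def long_runs_of_ones(a, min_len=3):
    # Assumes `a` holds only the values 1 and -1.
    a = np.asarray(a)
    dfa = np.diff(np.hstack((-1, a, -1)))
    starts = np.where(dfa == 2)[0]    # index of the first 1 in each run
    stops = np.where(dfa == -2)[0]    # index just past the last 1 in each run
    valid_mask = (stops - starts) >= min_len
    return np.column_stack((starts, stops))[valid_mask].tolist()


a = np.array([-1, 1, 1, 1, -1, 1, 1, -1, 1, 1, 1, 1])
print(long_runs_of_ones(a))   # [[1, 4], [8, 12]]
```

Each pair is a half-open range, so a[start:stop] slices out exactly one run; the run of two 1s at indices 5 and 6 is dropped because it is shorter than min_len.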
Find Consecutive Repeats of Specific Length in NumPy
Approach #1
We could leverage 1D convolution for a vectorized solution -
def consec_repeat_starts(a, n):
    N = n-1
    m = a[:-1]==a[1:]
    return np.flatnonzero(np.convolve(m,np.ones(N, dtype=int))==N)-N+1
Sample runs -
In [286]: a
Out[286]:
array([ 0,  1,  2,  2,  3,  4,  5,  5,  6,  7,  8,  9,  9,  9, 10, 11, 12,
       13, 13, 13, 14, 15])
In [287]: consec_repeat_starts(a, 2)
Out[287]: array([ 2, 6, 11, 12, 17, 18])
In [288]: consec_repeat_starts(a, 3)
Out[288]: array([11, 17])
In [289]: consec_repeat_starts(a, 4)
Out[289]: array([], dtype=int64)
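One way to convince yourself that the convolution trick is right is to compare it with a direct window-by-window check. The sketch below does that on the sample array; consec_repeat_starts_loop is a hypothetical reference helper written for this comparison, and n is assumed to be at least 2.

```python
import numpy as np


def consec_repeat_starts(a, n):
    # Convolution approach: N consecutive "equal to next" flags mean n equal values.
    N = n - 1
    m = a[:-1] == a[1:]
    return np.flatnonzero(np.convolve(m, np.ones(N, dtype=int)) == N) - N + 1


def consec_repeat_starts_loop(a, n):
    # Hypothetical reference: test every window of n elements directly.
    return [i for i in range(len(a) - n + 1)
            if len(set(a[i:i + n].tolist())) == 1]


a = np.array([0, 1, 2, 2, 3, 4, 5, 5, 6, 7, 8, 9, 9, 9,
              10, 11, 12, 13, 13, 13, 14, 15])
for n in (2, 3, 4):
    assert consec_repeat_starts(a, n).tolist() == consec_repeat_starts_loop(a, n)
print(consec_repeat_starts_loop(a, 3))   # [11, 17]
```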
Approach #2
We could also make use of binary-erosion -
import numpy as np
from scipy.ndimage import binary_erosion

def consec_repeat_starts_v2(a, n):
    N = n-1
    m = a[:-1]==a[1:]
    return np.flatnonzero(binary_erosion(m,[1]*N))-(N//2)
How do I find the length of a run of numbers in a list? (Is there a faster way than what I'm doing?)
I might use itertools.groupby for this one:
lst = [1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0]
from itertools import groupby
from operator import itemgetter
for k,v in groupby(enumerate(lst),key=itemgetter(1)):
    if k:
        v = list(v)
        print(v[0][0], v[-1][0])
This prints the start and end indices of each group of 1s.
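If the run lengths are needed rather than printed indices, the same loop can collect them. This is a small sketch; the function name runs_of_truthy and the (start, end, length) tuple layout are choices made for this example.

```python
from itertools import groupby
from operator import itemgetter


def runs_of_truthy(lst):
    # Returns (start, end, length) for each run of truthy values; end is inclusive.
    runs = []
    for k, v in groupby(enumerate(lst), key=itemgetter(1)):
        if k:
            v = list(v)
            start, end = v[0][0], v[-1][0]
            runs.append((start, end, end - start + 1))
    return runs


lst = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
print(runs_of_truthy(lst))   # [(0, 4, 5), (12, 15, 4)]
```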
Count number of repeated elements in a row in a numpy array
You can use itertools.groupby to perform the operation without invoking numpy.
import itertools
X = [1,1,1,2,2,2,2,2,3,3,1,1,0,0,0,5]
Y = [(x, len(list(y))) for x, y in itertools.groupby(X)]
print(Y)
# [(1, 3), (2, 5), (3, 2), (1, 2), (0, 3), (5, 1)]
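For comparison, if the data already lives in a NumPy array, roughly the same result can be computed with array operations. The sketch below is one possible way to do it, not the method from the answer above; rle_numpy is a hypothetical helper name and the input is assumed to be one-dimensional.

```python
import numpy as np


def rle_numpy(x):
    # Hypothetical helper: returns (values, lengths) for each run in a 1-D array.
    x = np.asarray(x)
    if x.size == 0:
        return x, np.array([], dtype=int)
    # Index of the first element of every run: position 0 plus each change point.
    starts = np.flatnonzero(np.concatenate(([True], x[1:] != x[:-1])))
    # Run lengths are the gaps between consecutive starts (and the array end).
    lengths = np.diff(np.concatenate((starts, [x.size])))
    return x[starts], lengths


X = [1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 1, 1, 0, 0, 0, 5]
values, lengths = rle_numpy(X)
print(list(zip(values.tolist(), lengths.tolist())))
# [(1, 3), (2, 5), (3, 2), (1, 2), (0, 3), (5, 1)]
```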
Match lengths of multiple Numpy arrays of unequal length
First, we can get the minimum length:
mlen = min(map(len, [a, b, c]))
8
Then truncate each array to that length:
newl = [x[:mlen] for x in [a, b, c]]
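Put together, the two steps can be tried as follows. The sample arrays are invented for this sketch (the original inputs are not shown), chosen so that the shortest has length 8.

```python
import numpy as np

# Hypothetical sample arrays of unequal length.
a = np.arange(10)
b = np.arange(8)
c = np.arange(12)

# Shortest length among the arrays, then slice each one down to it.
mlen = min(map(len, [a, b, c]))
newl = [x[:mlen] for x in [a, b, c]]

print(mlen)                       # 8
print([len(x) for x in newl])     # [8, 8, 8]
print(np.vstack(newl).shape)      # (3, 8)
```

Once the lengths match, the arrays can be stacked into a single 2-D array, as the last line shows.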