How to Find The Indices Where There Are N Consecutive Zeroes in a Row

Here are two base R approaches:

1) rle. First run rle, then compute ok to pick out the runs of zeros that are more than 3 long. We then compute the starts and ends of all runs, subsetting to the ok ones at the end.

with(rle(x), {
  ok <- values == 0 & lengths > 3
  ends <- cumsum(lengths)
  starts <- ends - lengths + 1
  data.frame(starts, ends)[ok, ]
})

giving:

  starts ends
1      6   17
2     34   58
3     72   89
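For readers who think in Python, the same run-length idea can be sketched with the standard library. This is only an illustration of the steps above, not a translation of rle itself: the function name long_zero_runs, the min_len parameter, and the sample list are assumptions made up for this example.

```python
from itertools import accumulate, groupby

def long_zero_runs(x, min_len=4):
    """Return 1-based (start, end) pairs for runs of zeros at least min_len long."""
    # Run-length encode: one (value, length) pair per run of equal values.
    runs = [(value, len(list(group))) for value, group in groupby(x)]
    lengths = [length for _, length in runs]
    # Cumulative sums of the lengths give the 1-based end of each run.
    ends = list(accumulate(lengths))
    starts = [end - length + 1 for end, length in zip(ends, lengths)]
    return [
        (start, end)
        for (value, length), start, end in zip(runs, starts, ends)
        if value == 0 and length >= min_len
    ]

# Hypothetical input, not the x from the question.
x = [5, 0, 0, 0, 0, 0, 2, 0, 0, 3, 0, 0, 0, 0]
print(long_zero_runs(x))  # [(2, 6), (11, 14)]
```

The run of two zeros at positions 8 and 9 is dropped because it is shorter than min_len.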

2) gregexpr. Take the sign of each number, which will be 0 or 1 provided x has no negative values, and concatenate the results into one long string. Then use gregexpr to find the locations of runs of at least 4 zeros. The result gives the starts; each end is the start plus the corresponding match.length attribute minus 1.

s <- paste(sign(x), collapse = "")
g <- gregexpr("0{4,}", s)[[1]]
data.frame(starts = 0, ends = attr(g, "match.length") - 1) + g

giving:

  starts ends
1      6   17
2     34   58
3     72   89
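The string-matching idea carries over to other languages as well. Below is a rough Python sketch using the re module. The function name, the min_len parameter, and the sample list are hypothetical. One deliberate difference from the R version: each element is encoded with a zero test rather than its sign, so a negative value still occupies exactly one character.

```python
import re

def long_zero_runs_regex(x, min_len=4):
    """Return 1-based (start, end) pairs for runs of at least min_len zeros."""
    # Encode each element as one character: "0" for zero, "1" otherwise.
    s = "".join("0" if v == 0 else "1" for v in x)
    pattern = re.compile("0{%d,}" % min_len)
    # m.start() is 0-based and m.end() is exclusive, so start + 1 and end
    # are the 1-based inclusive positions of the run.
    return [(m.start() + 1, m.end()) for m in pattern.finditer(s)]

# Hypothetical input that includes a negative value.
x = [5, 0, 0, 0, 0, 0, -2, 0, 0, 3, 0, 0, 0, 0]
print(long_zero_runs_regex(x))  # [(2, 6), (11, 14)]
```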

Find consecutive zeroes in a row

#Had to fix Client 4, one number was missing
DF <- read.table(text = 'Clients Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
"Client 1" 123 768 678 452 213 123 55 10 0 0 0 0
"Client 2" 549 542 21 321 31 59 998 0 546 980 0 987
"Client 3" 500 0 500 0 500 0 500 0 500 0 500 0
"Client 4" 126 545 2315 27 268 126 56 0 0 0 0 0
"Client 5" 546 546 0 0 0 328 486 326 0 0 66 0
"Client 6" 0 0 0 25 78 563 698 631 230 53 0 0', header = TRUE)

Loop over the rows, reverse each one, and find the position of the first non-zero entry; subtracting 1 gives the number of trailing zeros. If the client never had a transaction, return length(x):

n <- apply(DF[, -1], 1, function(x) if (any(x != 0)) which.max(rev(x) != 0) - 1 else length(x))
#[1] 4 0 1 5 1 2

DF$Clients[n >= 3]
#[1] Client 1 Client 4
#Levels: Client 1 Client 2 Client 3 Client 4 Client 5 Client 6
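The trailing-zero count is easy to check by hand in any language. Here is a small Python sketch of the same idea on three of the rows above; the helper name trailing_zeros and the dictionary layout are assumptions for illustration only.

```python
def trailing_zeros(values):
    """Count the zeros at the end of a sequence (its full length if all are zero)."""
    count = 0
    for v in reversed(values):
        if v != 0:
            break
        count += 1
    return count

# Three rows copied from the table above.
clients = {
    "Client 1": [123, 768, 678, 452, 213, 123, 55, 10, 0, 0, 0, 0],
    "Client 3": [500, 0, 500, 0, 500, 0, 500, 0, 500, 0, 500, 0],
    "Client 4": [126, 545, 2315, 27, 268, 126, 56, 0, 0, 0, 0, 0],
}
inactive = [name for name, row in clients.items() if trailing_zeros(row) >= 3]
print(inactive)  # ['Client 1', 'Client 4']
```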

Finding the consecutive zeros in a numpy array

Here's a fairly compact vectorized implementation. I've changed the requirements a bit, so the return value is a bit more "numpythonic": it creates an array with shape (m, 2), where m is the number of "runs" of zeros. The first column is the index of the first 0 in each run, and the second is the index of the first nonzero element after the run. (This indexing pattern matches, for example, how slicing works and how the range function works.)

import numpy as np

def zero_runs(a):
    # Create an array that is 1 where a is 0, and pad each end with an extra 0.
    iszero = np.concatenate(([0], np.equal(a, 0).view(np.int8), [0]))
    absdiff = np.abs(np.diff(iszero))
    # Runs start and end where absdiff is 1.
    ranges = np.where(absdiff == 1)[0].reshape(-1, 2)
    return ranges

For example:

In [236]: a = [1, 2, 3, 0, 0, 0, 0, 0, 0, 4, 5, 6, 0, 0, 0, 0, 9, 8, 7, 0, 10, 11]

In [237]: runs = zero_runs(a)

In [238]: runs
Out[238]:
array([[ 3,  9],
       [12, 16],
       [19, 20]])

With this format, it is simple to get the number of zeros in each run:

In [239]: runs[:,1] - runs[:,0]
Out[239]: array([6, 4, 1])

It's always a good idea to check the edge cases:

In [240]: zero_runs([0,1,2])
Out[240]: array([[0, 1]])

In [241]: zero_runs([1,2,0])
Out[241]: array([[2, 3]])

In [242]: zero_runs([1,2,3])
Out[242]: array([], shape=(0, 2), dtype=int64)

In [243]: zero_runs([0,0,0])
Out[243]: array([[0, 3]])
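
To make the half-open convention concrete, here is a small pure-Python check. It does not recompute the runs; the pairs are copied from the example output above, and min_len is a hypothetical threshold added for illustration.

```python
a = [1, 2, 3, 0, 0, 0, 0, 0, 0, 4, 5, 6, 0, 0, 0, 0, 9, 8, 7, 0, 10, 11]
runs = [(3, 9), (12, 16), (19, 20)]  # copied from the example output

# Each (start, stop) pair can be used directly as a slice of a.
for start, stop in runs:
    assert all(v == 0 for v in a[start:stop])

# Keeping only runs of at least min_len zeros is a one-line filter.
min_len = 4  # hypothetical threshold
long_runs = [(start, stop) for start, stop in runs if stop - start >= min_len]
print(long_runs)  # [(3, 9), (12, 16)]
```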

Finding the first number after consecutive zeros in data frame

We can use rle to select the first row after the first run of consecutive zeroes in each group (ID).

library(dplyr)

data %>%
  group_by(ID) %>%
  slice(with(rle(event == 0), sum(lengths[1:which.max(values)])) + 1)

# ID time event
# <int> <int> <dbl>
#1 1 8 1
#2 2 6 1

Find instances within a column where consecutive rows are non zero?

If I understand correctly, you can use cumsum to create the groupby key:

s1=s[s==0].groupby(s.ne(0).cumsum()).transform('size')
n=5

s[(s==0)&(s1==n)]
Out[753]:
5 0
6 0
7 0
8 0
9 0
dtype: int64

Sample data

import pandas as pd

l=[0,1,1,1,1,0,0,0,0,0,1,1,1,0,0,1,1,1,1,1,0,0,0]
s=pd.Series(l)
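
To see why the cumulative sum works as a grouping key, here is a standard-library sketch of the same logic without pandas. It is an illustration rather than the pandas internals; the names key, sizes, and hits are made up for this example.

```python
from collections import Counter
from itertools import accumulate

l = [0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0]
n = 5

# Running count of nonzero entries: it stays constant across a run of zeros,
# so all zeros in the same run share one key.
key = list(accumulate(int(v != 0) for v in l))
# Size of each group, counting the zeros only.
sizes = Counter(k for k, v in zip(key, l) if v == 0)
# Positions of zeros whose run has exactly n elements.
hits = [i for i, (k, v) in enumerate(zip(key, l)) if v == 0 and sizes[k] == n]
print(hits)  # [5, 6, 7, 8, 9]
```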

Python - Identify groups of consecutive 0's and replace them

I think this handles it.

newa = []
span = 0
for n in a:
    # Is this number non-zero?
    if n:
        # Yes. Have we just passed a string of zeros?
        if span:
            # Yes. Average this value and the last non-zero value
            # and duplicate for as many zeros as we saw.
            avg = (newa[-1] + n) / 2
            newa.extend( [avg] * span )
            span = 0
        # Always add this number to the new list.
        newa.append( n )
    else:
        # No, this number was a zero. Just count it.
        span += 1

Can this series end with a span of zeros? Only you know whether that's a concern or not.

EDIT: to ignore runs of zeros longer than 5 (they are left as zeros):

newa = []
span = 0
for n in a:
    # Is this number non-zero?
    if n:
        # Yes. Have we just passed a string of zeros?
        if span:
            # Yes. Average this value and the last non-zero value
            # and duplicate for as many zeros as we saw.
            if span > 5:
                avg = 0
            else:
                avg = (newa[-1] + n) / 2
            newa.extend( [avg] * span )
            span = 0
        # Always add this number to the new list.
        newa.append( n )
    else:
        # No, this number was a zero. Just count it.
        span += 1
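
If the series can start or end with zeros, one possible way to handle it is sketched below. This is an assumption about the desired behaviour, not part of the original answer: runs with a missing neighbour on either side are simply kept as zeros. The function name fill_zero_spans and the max_span parameter are hypothetical.

```python
def fill_zero_spans(a, max_span=5):
    """Replace interior runs of zeros with the mean of their two neighbours."""
    out = []
    span = 0
    for n in a:
        if n:
            if span:
                # Fill only short runs that have a left-hand neighbour.
                if out and span <= max_span:
                    fill = (out[-1] + n) / 2
                else:
                    fill = 0
                out.extend([fill] * span)
                span = 0
            out.append(n)
        else:
            span += 1
    # Assumption: a trailing run has no right-hand neighbour, so keep its zeros.
    out.extend([0] * span)
    return out

print(fill_zero_spans([2, 0, 0, 4]))     # [2, 3.0, 3.0, 4]
print(fill_zero_spans([0, 1, 0, 3, 0]))  # [0, 1, 2.0, 3, 0]
```

Unlike the loops above, the output always has the same length as the input.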

get index of the first block of at least n consecutive False values in boolean array

I think a plain Python implementation is fine for this linear search. My suggestion looks like this:

def find_block(arr, n_at_least=1):
    current_index = 0
    current_count = 0
    for index, item in enumerate(arr):
        if item:
            current_count = 0
            current_index = index + 1
        else:
            current_count += 1
        if current_count == n_at_least:
            return current_index
    return None # Make sure this is outside for loop

Running this function yields the following outputs:

>>> import numpy
>>> w = numpy.array([True, False, True, True, False, False, False])
>>> find_block(w, n_at_least=1)
1
>>> find_block(w, n_at_least=3)
4
>>> find_block(w, n_at_least=4)
>>> # None

Lowest starting row indices for minimum 2 consecutive non-zero values per column

Minimum 2 consecutive non-zero values case

%// Mask of non-zeros in input, A
mask = A~=0

%// Find starting row indices along with boolean valid flags for minimum two
%// consecutive nonzeros in each column
[valid,idx] = max(mask(1:end-1,:) & mask(2:end,:),[],1)

%// Use the valid flags to set invalid row indices to zeros
out = idx.*valid

Sample run -

A =
0 0 0 0 -4 3
0 2 1 0 0 0
0 5 0 8 7 0
0 9 10 3 1 2
mask =
0 0 0 0 1 1
0 1 1 0 0 0
0 1 0 1 1 0
0 1 1 1 1 1
valid =
0 1 0 1 1 0
idx =
1 2 1 3 3 1
out =
0 2 0 3 3 0

Generic case

For the generic case of a minimum of N consecutive non-zeros, you can use 2D convolution with a column vector of N ones as the kernel, like so -

mask = A~=0  %// Mask of non-zeros in input, A

%// Find starting row indices along with boolean valid flags for minimum N
%// consecutive nonzeros in each column
[valid,idx] = max(conv2(double(mask),ones(N,1),'valid')==N,[],1)

%// Use the valid flags to set invalid row indices to zeros
out = idx.*valid

Please note that the 2D convolution could be replaced by a separable convolution, as mentioned in the comments by Luis, which seems to be a bit faster. That is,

conv2(double(mask),ones(N,1),'valid') could be replaced by conv2(ones(N,1),1,double(mask),'valid').

Sample run -

A =
0 0 0 0 0 3
0 2 1 0 1 2
0 5 0 8 7 9
0 9 0 3 1 2
mask =
0 0 0 0 0 1
0 1 1 0 1 1
0 1 0 1 1 1
0 1 0 1 1 1
N =
3
valid =
0 1 0 0 1 1
idx =
1 2 1 1 2 1
out =
0 2 0 0 2 1
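
As a cross-check of the sample runs, the same result can be reproduced with a plain sliding-window test in Python. Note that this sketch swaps the convolution for an explicit window check, so it illustrates the output rather than the conv2 technique; the function name first_run_start is hypothetical.

```python
def first_run_start(A, N):
    """Per column: 1-based row where the first run of N nonzeros starts, 0 if none."""
    n_rows = len(A)
    out = []
    for column in zip(*A):
        start = 0
        for r in range(n_rows - N + 1):
            # A window of N rows qualifies when every entry is nonzero.
            if all(v != 0 for v in column[r:r + N]):
                start = r + 1  # convert to 1-based, as in the MATLAB output
                break
        out.append(start)
    return out

# Matrix from the generic sample run, with N = 3.
A = [
    [0, 0, 0, 0, 0, 3],
    [0, 2, 1, 0, 1, 2],
    [0, 5, 0, 8, 7, 9],
    [0, 9, 0, 3, 1, 2],
]
print(first_run_start(A, 3))  # [0, 2, 0, 0, 2, 1]
```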

Find the lowest location within rows where there are non-zero elements for each column in a matrix in MATLAB

This should do it:

A = [0, 0, 0, 0, 4, 3;
0, 2, 1, 0, 0, 0;
0, 5, 0, 8, 7, 0;
8, 9, 10, 3, 0, 2];

indices = repmat((1:size(A, 1))', [1, size(A, 2)]);
indices(A == 0) = NaN;
min(indices, [], 1)

Here indices is:

indices =

1 1 1 1 1 1
2 2 2 2 2 2
3 3 3 3 3 3
4 4 4 4 4 4

We then set every element of indices to NaN wherever A is zero, which gives us:

indices = 

NaN NaN NaN NaN 1 1
NaN 2 2 NaN NaN NaN
NaN 3 NaN 3 3 NaN
4 4 4 4 NaN 4

We then simply take the minimum of each column. Since min ignores NaN values, the result is 4 2 2 3 1 1.

How to list the index of all consecutive and single values in a row in R matrix

Here you go. I think this should work with your data:

val = 1;
counter = 1;
temp = matrix();

for (i in 1:nrow(mdata))
{
  for (j in 1:ncol(mdata))
  {
    # Note: a run of -3 that touches the first or last column has no
    # neighbour on that side and is not handled here.
    if (mdata[i,j] == -3)
    {
      # Count how many -3 values follow position j, staying inside the row.
      while ((j + val) <= ncol(mdata))
      {
        if (mdata[i,j + val] == -3)
        {
          counter = counter + 1;
          val = val + 1;
          next;
        }
        else
        {
          break;
        }
      }

      if (counter == 1)
      {
        #print(j);
        #print(mdata[i, (j - 1):(j + 1)]);

        # Single value: replace it with the mean of its two neighbours.
        temp <- t(as.matrix(mdata[i, (j - 1):(j + 1)]))
        cat("\n This is with counter 1 \n")
        print(temp)
        cat("\n matrix: temp-1", temp[,1],"temp-2", temp[,3],"\n");
        to.avg <- c(temp[,1], temp[,3]);
        avg<-mean(to.avg)
        mdata[i,j] = avg;
      }
      else
      {
        # Run of several values: average each filled value with the right neighbour.
        temp <- t(as.matrix(mdata[i,(j - 1):(j + counter)]))
        cat("\n This is with multiple count \n")
        cat(counter,"consecutive values were found, processing accordingly \n")
        print(temp);

        for (k in 0:(counter-1))
        {
          # cat("\n reading temp at the start \n")
          # print(temp)
          cat("\n K is ",(k+1), "and array is",length(temp),"long \n")
          to.avg <- c(temp[,(k+1)], temp[,length(temp)]);
          cat("averaging", temp[,(k+1)],"and", temp[,length(temp)]);
          avg<-mean(to.avg)
          cat("\n average =",avg);
          temp[,(k+2)] = avg;
          # cat("\n reading temp as this \n")
          # print(temp)
          mdata[i,j+k]=avg
        }
      }
    }

    # Reset the counters before moving to the next column.
    val = 1;
    counter = 1;
  }
}
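
The averaging rule in the loop above is easier to see on a single row. Here is a compact Python sketch of that rule as I read it: each value in a run of -3 becomes the mean of the value to its left (already filled, if it was part of the run) and the first value to the right of the run. The function name fill_sentinel_runs is hypothetical, and leaving runs at the row edges untouched is an assumption, since the original does not cover that case.

```python
def fill_sentinel_runs(row, sentinel=-3):
    """Replace each interior run of `sentinel` with successive averages."""
    out = list(row)
    j = 0
    while j < len(out):
        if out[j] != sentinel:
            j += 1
            continue
        # Find the end of the run (exclusive).
        end = j
        while end < len(out) and out[end] == sentinel:
            end += 1
        # Assumption: only runs with a neighbour on both sides are filled.
        if j > 0 and end < len(out):
            right = out[end]
            for k in range(j, end):
                out[k] = (out[k - 1] + right) / 2
        j = end
    return out

print(fill_sentinel_runs([2, -3, 6]))      # [2, 4.0, 6]
print(fill_sentinel_runs([0, -3, -3, 8]))  # [0, 4.0, 6.0, 8]
```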

