Counting Consecutive Positive Values in Python/Pandas Array

Efficient method to count consecutive positive values in pandas dataframe

Call consecutiveCount just once on the unstacked Series, then unstack the result back into a DataFrame.

Using DSM's consecutiveCount, which I named c here for simplicity:

>>> c = lambda y: y * (y.groupby((y != y.shift()).cumsum()).cumcount() + 1)
>>> c(df.unstack()).unstack().T

    a  b
0   0  0
1   1  0
2   0  0
3   1  0
4   2  1
5   0  2
6   0  0
7   0  1
8   1  2
9   2  3
10  0  0
11  1  0
12  0  0
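As a self-contained sketch of the trick above: the small frame below is hypothetical (it is not the df from the question), and `consecutive_counts` is just a named version of `c`. One caveat worth checking on your own data: after `unstack()` the columns sit end to end in one Series, so a run of ones that ends one column and continues at the top of the next would be counted as a single run. The variant here, which is my assumption rather than part of the original answer, also starts a new run whenever the column label changes.

```python
import pandas as pd

def consecutive_counts(y: pd.Series) -> pd.Series:
    """Named version of `c`: running length of each run of ones, 0 elsewhere."""
    run_id = (y != y.shift()).cumsum()
    return y * (y.groupby(run_id).cumcount() + 1)

# Hypothetical 0/1 frame; column "a" ends with 1 and column "b" starts with 1.
df = pd.DataFrame({"a": [0, 1, 1, 0, 1], "b": [1, 1, 1, 0, 0]})

stacked = df.unstack()  # one long Series indexed by (column, row)

# Column label of every element, aligned with the stacked Series.
col = pd.Series(stacked.index.get_level_values(0), index=stacked.index)

# A new run starts when the value changes OR when we cross into a new column.
new_run = (stacked != stacked.shift()) | (col != col.shift())
run_id = new_run.cumsum()

fast = (stacked * (stacked.groupby(run_id).cumcount() + 1)).unstack().T
slow = df.apply(consecutive_counts)  # per-column reference result

print(fast)
```

If no column ever ends on the same nonzero value that the next one starts with, the plain `c(df.unstack()).unstack().T` from the answer gives the same result.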

Timings

# df2 is (65, 40)
df2 = pd.concat([pd.concat([df]*20, axis=1)]*5).T.reset_index(drop=True).T.reset_index(drop=True)

%timeit c(df2.unstack()).unstack().T
5.54 ms ± 296 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit df2.apply(c)
82.5 ms ± 2.19 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

How can I get the count of consecutive positive numbers in each column of a 2-dimensional DataFrame in Python/pandas?

Let us use cumsum and define your function:

def yourfun(x):
    # size of the last run of non-negative values (runs are split at each negative value)
    return x[x.ge(0)].groupby(x.lt(0).cumsum()).size().iloc[-1]
df.loc['Count'] = df.apply(yourfun)
df
Out[62]:
         X    y
a      1.0 -1.0
b     -2.0  2.0
c      3.0 -3.0
d      2.1  4.0
Count  2.0  1.0
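A note on what this counts, with a small sketch. As far as I can tell, yourfun returns the size of the last group of non-negative values, which is the trailing run when the column ends on a non-negative value; for a column that ends with a negative value it reports the run before that instead. The helper below, `trailing_nonneg_run`, is a hypothetical name of mine and not part of the answer. It returns 0 in that case and, like the original, treats 0 as "positive" because the test is `>= 0`.

```python
import numpy as np
import pandas as pd

def trailing_nonneg_run(x: pd.Series) -> int:
    """Length of the run of values >= 0 at the end of x (0 if x ends negative)."""
    neg = x.lt(0).to_numpy()
    if not neg.any():
        return len(neg)                   # no negative value at all
    last_neg = np.flatnonzero(neg)[-1]    # position of the last negative value
    return len(neg) - 1 - int(last_neg)   # number of elements after it

# Same values as the frame shown above.
df = pd.DataFrame(
    {"X": [1.0, -2.0, 3.0, 2.1], "y": [-1.0, 2.0, -3.0, 4.0]},
    index=list("abcd"),
)

counts = df.apply(trailing_nonneg_run)
print(counts)
```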

Count rows with positive values and reset if negative

You first want to mark the positions where new segments (i.e., groups) start:

>>> df['Count'] = df.Slope.lt(0)
>>> df.head(7)
   Slope  Count
0  -25.0   True
1  -15.0   True
2   17.0  False
3    6.0  False
4    0.1  False
5    5.0  False
6   -3.0   True

Now you need to label each group using the cumulative sum: because True counts as 1 in arithmetic, the cumulative sum labels each segment with an incrementing integer. (This is a very powerful concept in pandas!)

>>> df['Count'] = df.Count.cumsum()
>>> df.head(7)
   Slope  Count
0  -25.0      1
1  -15.0      2
2   17.0      2
3    6.0      2
4    0.1      2
5    5.0      2
6   -3.0      3

Now you can use groupby to access each segment; then all you need to do is generate an incrementing sequence starting at zero for each group. There are many ways to do that. I'd just use the reset index of each group, i.e., reset the index, get the fresh RangeIndex starting at 0, and turn it into a series:

>>> df.groupby('Count').apply(lambda x: x.reset_index().index.to_series())
Count
1      0    0
2      0    0
       1    1
       2    2
       3    3
       4    4
3      0    0
       1    1
       2    2
       3    3
4      0    0
5      0    0
       1    1
6      0    0

This results in the expected counts, but note that the final index doesn't match the original dataframe, so we need another reset_index() with drop=True to discard the grouped index to put this into our original dataframe:

>>> df['Count'] = df.groupby('Count').apply(lambda x:x.reset_index().index.to_series()).reset_index(drop=True)

Et voilà:

>>> df
    Slope  Count
0   -25.0      0
1   -15.0      0
2    17.0      1
3     6.0      2
4     0.1      3
5     5.0      4
6    -3.0      0
7     5.0      1
8     1.0      2
9     3.0      3
10   -0.1      0
11   -0.2      0
12    1.0      1
13   -9.0      0
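For reference, the same steps can be sketched more compactly with `cumcount()`, which numbers the rows of each group from 0 and keeps the original index, so no `reset_index()` is needed. This is an alternative I am suggesting, not the method above; the Slope values are the ones shown in the output.

```python
import pandas as pd

df = pd.DataFrame({
    "Slope": [-25.0, -15.0, 17.0, 6.0, 0.1, 5.0, -3.0,
              5.0, 1.0, 3.0, -0.1, -0.2, 1.0, -9.0]
})

# Every negative value opens a new segment; cumsum turns the flags into labels.
segment = df["Slope"].lt(0).cumsum()

# Position of each row inside its segment, starting at 0 on the negative row.
df["Count"] = df.groupby(segment).cumcount()

print(df)
```

Note that if the data started with positive values, that first segment would have no negative row in front of it, so its first positive value would get 0 rather than 1.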

Pandas dataframe: count consecutive True / False values

You can get the group number of each run of consecutive True/False values with .cumsum() and store it in g.

Then group by g and get the size of each group with .transform('size'). Set the sign by multiplying by the return value (1 or -1) of np.where(), as follows:

g = df['Mask'].ne(df['Mask'].shift()).cumsum()

df['Count'] = df.groupby(g)['Mask'].transform('size') * np.where(df['Mask'], 1, -1)

Result:

print(df)

    Mask  Count
0   True      3
1   True      3
2   True      3
3  False     -2
4  False     -2
5   True      1
6  False     -2
7  False     -2
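To make the intermediate steps visible, here is a runnable sketch using the Mask column from the result above. The name `run_len` is mine; everything else follows the two lines of the answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Mask": [True, True, True, False, False, True, False, False]})

# A new group starts wherever the value differs from the previous row.
g = df["Mask"].ne(df["Mask"].shift()).cumsum()

# Size of the run each row belongs to, broadcast back to every row.
run_len = df.groupby(g)["Mask"].transform("size")

# Positive sign for runs of True, negative sign for runs of False.
df["Count"] = run_len * np.where(df["Mask"], 1, -1)

print(g.tolist())  # group labels per row
print(df)
```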

Python Pandas: Compute Consecutive Window Count of Positive Numbers

IIUC you just need to count backwards:

s = df["Col3"][::-1]

df["New"] = s.groupby((s<0).cumsum(), group_keys=False).apply(lambda d: (d>=0).cumsum())

print (df)

   Col1   Col2   Col3                 Col4  New
0     A  0.532 -0.234  2020-01-01 05:00:00    0
1     B  0.242  0.224  2020-01-01 06:00:00    1
2     A  0.152 -0.753  2020-01-01 08:00:00    0
3     C  0.149  0.983  2020-01-01 08:00:00    4
4     A  0.635  0.429  2020-01-01 09:00:00    3
5     A  0.938  0.365  2020-01-01 10:00:00    2
6     C  0.293  0.956  2020-01-02 05:00:00    1
7     A  0.294 -0.234  2020-01-02 06:00:00    0
8     E  0.294  0.394  2020-01-02 07:00:00    5
9     D  0.294  0.258  2020-01-02 08:00:00    4
10    A  0.687  0.666  2020-01-03 05:00:00    3
11    C  0.232  0.494  2020-01-03 06:00:00    2
12    D  0.575  0.845  2020-01-03 07:00:00    1
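A minimal sketch of the backwards count on Col3 alone, with the values taken from the table above. It swaps `apply` for a grouped `cumsum()`, which is my substitution: the result then keeps the (reversed) original index, so assigning it back to the frame aligns row by row.

```python
import pandas as pd

df = pd.DataFrame({
    "Col3": [-0.234, 0.224, -0.753, 0.983, 0.429, 0.365, 0.956,
             -0.234, 0.394, 0.258, 0.666, 0.494, 0.845]
})

s = df["Col3"][::-1]  # walk the column from the bottom up

# Each negative value (seen from the bottom) opens a new group;
# inside a group, count the non-negative values seen so far.
df["New"] = s.ge(0).astype(int).groupby(s.lt(0).cumsum()).cumsum()

print(df)
```

Each row therefore holds the number of consecutive non-negative values from that row forward, and 0 on negative rows.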

Count consecutive positive and negative values in a list

Count consecutive groups of positive/negative values using groupby:

s = pd.Series(y)
v = s.gt(0).ne(s.gt(0).shift()).cumsum()

pd.DataFrame(
    v.groupby(v).count().values.reshape(-1, 2), columns=['pos', 'neg']
)

   pos  neg
0    1    2
1    4    2
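The `reshape(-1, 2)` step assumes the list starts with a positive run and contains an even number of runs. A sketch that avoids that assumption keeps one row per run together with its sign. The list `y` below is hypothetical (the original input is not shown here); it was chosen only so that the run lengths match the output above.

```python
import pandas as pd

y = [1, -1, -2, 3, 4, 5, 6, -1, -2]  # hypothetical input, not from the question

s = pd.Series(y)
is_pos = s.gt(0)

# Label each run: the label increases whenever the sign flag changes.
run_id = is_pos.ne(is_pos.shift()).cumsum()

runs = pd.DataFrame({
    "positive": is_pos.groupby(run_id).first(),  # sign of the run
    "length": s.groupby(run_id).size(),          # number of values in the run
})

print(runs)
```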

Count Positive Consecutive Elements in Dataframe

Here's a similar approach in Pandas

In [792]: df_p = df > 0

In [793]: df_p
Out[793]:
       0      1      2      3      4
0  False  False   True   True   True
1   True   True  False   True   True
2   True   True   True   True  False
3  False  False   True  False   True
4  False  False   True  False  False

In [794]: df_p['0'] * (df_p < df_p.shift(1, axis=1)).idxmax(axis=1).astype(int)
Out[794]:
0    0
1    2
2    4
3    0
4    0
dtype: int32
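Judging by the output, this computes the length of the run of True values at the start of each row (0 when the row starts with False). A different way to sketch the same quantity, which is my assumption and not the answer's method, is a cumulative product along the row: it stays 1 until the first False and is 0 afterwards, so its row sum is the length of the leading run. The boolean frame is copied from `df_p` above.

```python
import pandas as pd

df_p = pd.DataFrame([
    [False, False, True,  True,  True],
    [True,  True,  False, True,  True],
    [True,  True,  True,  True,  False],
    [False, False, True,  False, True],
    [False, False, True,  False, False],
])

# 1 while the row has been True so far, 0 from the first False onward.
still_true = df_p.astype(int).cumprod(axis=1)

leading_run = still_true.sum(axis=1)
print(leading_run)
```

Unlike a search for the first True-to-False transition, this also gives the full row width for a row that is True in every column.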

How to count consecutive repetitions in a pandas series

Here is another approach using fillna to handle NaN values:

s = df.id.fillna('nan')
mask = s.ne(s.shift())

ids = s[mask].to_numpy()
counts = s.groupby(mask.cumsum()).cumcount().add(1).groupby(mask.cumsum()).max().to_numpy()

# Convert 'nan' string back to `NaN`
ids[ids == 'nan'] = np.nan
ser_out = pd.Series(counts, index=ids, name='counts')

[out]

nan    2
1.0    2
2.0    3
nan    2
1.0    3
nan    1
Name: counts, dtype: int64
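A variant that avoids the 'nan' placeholder string is sketched below: two neighbours count as equal if they compare equal or are both NaN. The input series is hypothetical, built only to reproduce the run lengths shown above, and the names `same_as_prev` and `run_id` are mine.

```python
import numpy as np
import pandas as pd

# Hypothetical input matching the output above.
ids = pd.Series(
    [np.nan, np.nan, 1, 1, 2, 2, 2, np.nan, np.nan, 1, 1, 1, np.nan],
    name="id",
)

prev = ids.shift()
# NaN != NaN in pandas, so treat "both missing" as equal explicitly.
same_as_prev = ids.eq(prev) | (ids.isna() & prev.isna())
run_id = (~same_as_prev).cumsum()

values = ids.groupby(run_id).first()  # NaN for runs that are entirely NaN
counts = ids.groupby(run_id).size()   # size() counts NaN rows as well

ser_out = pd.Series(counts.to_numpy(), index=values.to_numpy(), name="counts")
print(ser_out)
```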

