How to add sequential counter column on groups using Pandas groupby
Use cumcount(), which numbers the rows within each group starting from 0 (see the pandas documentation for GroupBy.cumcount):
In [4]: df.groupby(['c1', 'c2']).cumcount()
Out[4]:
0 0
1 1
2 0
3 1
4 0
5 1
6 2
7 0
8 0
9 0
10 1
11 2
dtype: int64
If you want the numbering to start at 1, add 1 to the result:
In [5]: df.groupby(['c1', 'c2']).cumcount()+1
Out[5]:
0 1
1 2
2 1
3 2
4 1
5 2
6 3
7 1
8 1
9 1
10 2
11 3
dtype: int64
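The answer above does not show the DataFrame itself. As a minimal sketch, the snippet below rebuilds one from the c1/c2 values printed in the transform answer further down this page (that reconstruction is an assumption) and stores the 1-based counter in a column named seq.

```python
import pandas as pd

# Assumed sample data, rebuilt from the c1/c2 columns printed elsewhere on this page.
df = pd.DataFrame({
    "c1": list("AAAABBBBCCCC"),
    "c2": list("XXYYXXXYXYYY"),
})

# cumcount() numbers the rows of each (c1, c2) group from 0; adding 1 makes it 1-based.
df["seq"] = df.groupby(["c1", "c2"]).cumcount() + 1
print(df)
```

Because cumcount() returns a Series aligned to the original index, it can be assigned straight back as a new column without sorting or merging.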
How to add sequential counter column on groups using Pandas groupby?
Try groupby with transform:
x = df.groupby('c1')['c2']
df['Ct_X'] = x.transform(lambda x: x.eq('X').cumsum())
df['Ct_Y'] = x.transform(lambda x: x.eq('Y').cumsum())
print(df)
Output:
c1 c2 seq Ct_X Ct_Y
0 A X 1 1 0
1 A X 2 2 0
2 A Y 1 2 1
3 A Y 2 2 2
4 B X 1 1 0
5 B X 2 2 0
6 B X 3 3 0
7 B Y 1 3 1
8 C X 1 1 0
9 C Y 1 1 1
10 C Y 2 1 2
11 C Y 6 1 3
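The same running counts can probably be obtained without a lambda by building the boolean mask first and then taking a grouped cumulative sum. This is only a sketch of that variant; the sample data is assumed from the c1/c2 columns in the table above.

```python
import pandas as pd

# Assumed sample data matching the c1/c2 columns shown above.
df = pd.DataFrame({"c1": list("AAAABBBBCCCC"), "c2": list("XXYYXXXYXYYY")})

for value in ["X", "Y"]:
    # 1 where c2 equals the value, 0 otherwise; the running total is taken per c1 group.
    mask = df["c2"].eq(value).astype(int)
    df[f"Ct_{value}"] = mask.groupby(df["c1"]).cumsum()

print(df)
```

Grouping the mask by df["c1"] keeps everything vectorized, which tends to matter on larger frames where transform with a Python lambda is called once per group.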
add column to dataframe with sequence of integers depending on another column
You can use cumcount(), grouping by B:
import pandas as pd
df = pd.DataFrame({'A':[3,5,2,5,4,2,5,2,3,1,4,1], 'B':['x','y','x','x','y','z','z','x','y','y','x','z']})
df['C'] = df.groupby('B').cumcount() + 1
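To see what the new column looks like, here is the same example as a runnable sketch that prints B next to C. Only the print and the comments are additions; the data is the one from the answer.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [3, 5, 2, 5, 4, 2, 5, 2, 3, 1, 4, 1],
    "B": ["x", "y", "x", "x", "y", "z", "z", "x", "y", "y", "x", "z"],
})

# Each value of B gets its own counter, advancing in row order.
df["C"] = df.groupby("B").cumcount() + 1
print(df[["B", "C"]])
```

The counter follows the order in which rows appear, so the fourth "x" gets 4 even though other values of B sit between the "x" rows.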
Add sequential counter to group within dataframe but skip increment when condition is met
Here is a code snippet that I believe does what you want; you can adapt it if something is not exactly as you expected.
I think the key points are:
1) iterate over pairs of (previousRow, currentRow) so you can easily access the previous row's information
2) write specific if conditions that match what you expect
3) update the count inside the if conditions and set the value afterwards
import pandas as pd
from itertools import zip_longest

d1 = {'col1': [1, 1, 1, 2, 2, 3], 'col2': ['A', 'A', 'B', 'A', 'A', 'B']}
df1 = pd.DataFrame(data=d1)
df1['count'] = 0

# Two iterators over the same rows; advancing the second one by a row
# makes zip_longest yield (previous row, current row) pairs.
df1_previterrows = df1.iterrows()
df1_curriterrows = df1.iterrows()
next(df1_curriterrows)

groups_counter = {}
df1_firstRow = df1.iloc[0]
# DataFrame.set_value was removed in pandas 1.0; .at is its replacement.
if df1_firstRow["col2"] == "A":
    groups_counter[df1_firstRow['col1']] = 1
    df1.at[0, 'count'] = 1
elif df1_firstRow["col2"] == "B":
    groups_counter[df1_firstRow['col1']] = 0
    df1.at[0, 'count'] = 0

zip_list = zip_longest(df1_previterrows, df1_curriterrows)
for (prevRow_idx, prevRow), Curr in zip_list:
    if Curr is not None:
        (currRow_idx, currRow) = Curr
        if (currRow["col1"] == prevRow["col1"]) and (currRow["col2"] == "A"):
            # Same group, "A": increment the group's counter.
            count = groups_counter.get(currRow["col1"], False)
            if not count:
                groups_counter[currRow["col1"]] = 0
            groups_counter[currRow["col1"]] += 1
        elif (currRow["col1"] != prevRow["col1"]) and (currRow["col2"] == "A"):
            # New group starting with "A": counter starts at 1.
            groups_counter[currRow["col1"]] = 1
        elif (currRow["col1"] == prevRow["col1"]) and (currRow["col2"] == "B"):
            # Same group, "B": keep the counter unless it is still unset or 0.
            if not groups_counter.get(currRow["col1"], False):
                groups_counter[currRow["col1"]] = 1
        elif (currRow["col1"] != prevRow["col1"]) and (currRow["col2"] == "B"):
            # New group starting with "B": counter starts at 0.
            groups_counter[currRow["col1"]] = 0
        df1.at[currRow_idx, 'count'] = groups_counter[currRow["col1"]]

print(df1)
OUTPUT:
col1 col2 count
0 1 A 1
1 1 A 2
2 1 B 2
3 2 A 1
4 2 A 2
5 3 B 0
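If the rule is simply "increment on A, hold the previous value on B" (an assumption about the intent), the same output can be sketched without an explicit loop by taking a per-group cumulative sum of an "is A" flag. This is an alternative technique, not the method used in the answer above.

```python
import pandas as pd

df1 = pd.DataFrame({
    "col1": [1, 1, 1, 2, 2, 3],
    "col2": ["A", "A", "B", "A", "A", "B"],
})

# 1 for rows that should advance the counter, 0 for rows that should not.
is_a = df1["col2"].eq("A").astype(int)

# The running total within each col1 group leaves the count unchanged on "B" rows.
df1["count"] = is_a.groupby(df1["col1"]).cumsum()
print(df1)
```

For this sample the result matches the printed output. The two approaches can differ in edge cases: for a group that starts with consecutive B rows, the loop sets the count to 1 on the second B, while this version keeps it at 0.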
pandas group by and sequence
Use cumcount:
df['sequence'] = (df.groupby('category').cumcount() % 3) + 1
print(df)
Output
id category sequence
0 1 a 1
1 2 a 2
2 3 a 3
3 4 a 1
4 5 a 2
5 6 a 3
6 7 b 1
7 8 b 2
8 9 b 3
9 10 b 1
10 11 b 2
11 12 b 3
As an alternative:
df['sequence'] = df.groupby('category').cumcount().mod(3).add(1)
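A self-contained sketch of the above, with the input frame rebuilt from the id and category columns in the printed output (the construction itself is an assumption), confirms that both spellings give the same repeating 1, 2, 3 sequence.

```python
import pandas as pd

# Assumed input, rebuilt from the printed id/category columns.
df = pd.DataFrame({
    "id": range(1, 13),
    "category": ["a"] * 6 + ["b"] * 6,
})

position = df.groupby("category").cumcount()  # 0-based position within each category

# Wrapping the position with modulo 3 restarts the sequence every three rows.
df["sequence"] = (position % 3) + 1
alternative = position.mod(3).add(1)

print(df)
```

Changing the 3 to another cycle length gives a sequence that restarts after that many rows in each category.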
python create count column for each value in row
Use groupby + cumcount:
df['Count'] = df.groupby('SerialNumber').cumcount() + 1
df
SerialNumber ... Count
0 1111 ... 1
1 2222 ... 1
2 1111 ... 2
3 3333 ... 1
4 1111 ... 3
[5 rows x 6 columns]
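As a runnable sketch, the frame below contains only the SerialNumber column from the output above; the other columns are elided in the original and are left out here rather than guessed.

```python
import pandas as pd

# Only the SerialNumber values visible in the printed output are used.
df = pd.DataFrame({"SerialNumber": [1111, 2222, 1111, 3333, 1111]})

# Count gives the running number of times each serial number has appeared so far.
df["Count"] = df.groupby("SerialNumber").cumcount() + 1
print(df)
```

Each row gets the occurrence number of its serial number, so the third 1111 is labelled 3 even though other serial numbers appear in between.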
Pandas groupby and create a unique ID column for every row
If you need a running count for each row within its group, use cumcount:
df['new'] = df.groupby('fruit').cumcount()
df
Out[346]:
fruit count new
0 apple 1 0
1 apple 20 1
2 apple 21 2
3 mango 31 0
4 mango 17 1
Or:
df['new'] = df.assign(new=1).groupby('fruit')['new'].cumsum()-1
df
Out[352]:
fruit count new
0 apple 1 0
1 apple 20 1
2 apple 21 2
3 mango 31 0
4 mango 17 1
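The sketch below rebuilds the sample frame from the printed output, checks that the two approaches agree, and then combines the group key with the counter into a single identifier. The uid column and its "fruit_n" format are hypothetical, added only to illustrate one way to get a unique value per row.

```python
import pandas as pd

df = pd.DataFrame({
    "fruit": ["apple", "apple", "apple", "mango", "mango"],
    "count": [1, 20, 21, 31, 17],
})

# Approach 1: 0-based position of each row within its fruit group.
df["new"] = df.groupby("fruit").cumcount()

# Approach 2: running sum of a helper column of ones, shifted to start at 0.
via_cumsum = df.assign(new=1).groupby("fruit")["new"].cumsum() - 1

# Hypothetical identifier: group key plus position, unique for every row.
df["uid"] = df["fruit"] + "_" + df["new"].astype(str)
print(df)
```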