Pandas Add Column to Groupby Dataframe

Use transform to add the result of a groupby aggregation back as a column. transform returns a Series whose index is aligned with the frame it was called on, so it can be assigned directly:

In [123]:
g = df.groupby('c')['type'].value_counts().reset_index(name='t')
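# Sum the per-type counts within each c; calling transform on g keeps the result aligned with g's index.
g['size'] = g.groupby('c')['t'].transform('sum')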
g

Out[123]:
   c type  t  size
0  1    m  1     3
1  1    n  1     3
2  1    o  1     3
3  2    m  2     4
4  2    n  2     4
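
To see the alignment on the original frame itself, here is a minimal sketch on made-up data (the values are an assumption; only the column names c and type come from the example above):

```python
import pandas as pd

# Hypothetical data: column names mirror the example, values are made up.
df = pd.DataFrame({
    "c": [1, 1, 1, 2, 2, 2, 2],
    "type": ["m", "n", "o", "m", "m", "n", "n"],
})

# transform returns one value per original row, indexed like df,
# so it can be assigned straight back as a new column.
df["size"] = df.groupby("c")["type"].transform("size")
print(df)
```

Every row gets the size of its own c group, and the row order of df is unchanged.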

Adding column to pandas dataframe using group name in function when iterating through groupby

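Use a lambda function. Inside transform, x.name holds the key of the current group (a tuple when grouping by several columns), so you can use it to look up the matching entry in p: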

df['ycalc'] = df.groupby(['a','b'])['x'].transform(lambda x: func(x, p[x.name]))
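
The answer does not define func or p, so the following is only a sketch under assumptions: p is taken to be a dict keyed by the (a, b) group tuple, and func is a hypothetical function that scales the group's values by that parameter.

```python
import pandas as pd

# Hypothetical data; only the column names a, b, x come from the answer.
df = pd.DataFrame({
    "a": ["u", "u", "v", "v"],
    "b": ["k", "k", "m", "m"],
    "x": [1.0, 2.0, 3.0, 4.0],
})

# Assumed shape of p: one parameter per group, keyed by the (a, b) tuple.
p = {("u", "k"): 10.0, ("v", "m"): 100.0}

def func(x, param):
    # Hypothetical stand-in: scale the group's values by its parameter.
    return x * param

# x is the Series of x values for one group; x.name is that group's key.
df["ycalc"] = df.groupby(["a", "b"])["x"].transform(lambda x: func(x, p[x.name]))
print(df)
```

With a single grouping column, x.name would be the plain key rather than a tuple, so p would be keyed accordingly.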

Pandas DataFrame adding column after groupby

You're using groupby on the wrong columns.

Your question suggests that "country" and "account" are the same for all "sku". In this case you should use:

df.groupby(['sku', 'country', 'account'], as_index=False).quantity.sum()
Out []:
                sku country account  quantity
0    CB-BB-AMB12-CA     usa     hch         2
1    CB-BB-CLR12-CA     usa     hch         2
2  CHG-FOOD1COMP-CA     usa     hch         3
3  CHG-FOOD2COMP-CA     usa     hch         2
4  CHG-FOODCONT1-CA     usa     hch         2
5  CHG-FRY-12PT5-CA     usa     hch         4
6   CHG-FRY-9PT5-CA     usa     hch         1
7   Q7-QDH0-EBB5-CA     usa     hch         3

Note: I removed two lines from your example that have neither "sku" nor "quantity". If these cases should be handled, just tell me in a comment.

Pandas create new column with count from groupby

That's not a new column, that's a new DataFrame:

In [11]: df.groupby(["item", "color"]).count()
Out[11]:
             id
item  color
car   black   2
truck blue    1
      red     2

To get the result you want, use reset_index:

In [12]: df.groupby(["item", "color"])["id"].count().reset_index(name="count")
Out[12]:
    item  color  count
0    car  black      2
1  truck   blue      1
2  truck    red      2

To get a "new column" you could use transform:

In [13]: df.groupby(["item", "color"])["id"].transform("count")
Out[13]:
0    2
1    2
2    2
3    1
4    2
dtype: int64
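
To actually attach that Series as a column, assign it back to df. A minimal sketch, assuming a small frame whose rows are made up to be consistent with the counts shown above:

```python
import pandas as pd

# Hypothetical rows chosen to match the group counts in the outputs above.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "item": ["car", "truck", "car", "truck", "truck"],
    "color": ["black", "red", "black", "blue", "red"],
})

# transform("count") yields one count per original row, so the
# result lines up with df and can be stored as a new column.
df["count"] = df.groupby(["item", "color"])["id"].transform("count")
print(df)
```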

I recommend reading the split-apply-combine section of the docs.

Make a new column based on group by conditionally in Python

Almost there. Change filter to transform and use a condition:

df['new_group'] = df.groupby("id")["group"] \
                    .transform(lambda x: 'two' if (x.nunique() == 2) else x)
print(df)

# Output:
   id group new_group
0  x1     A       two
1  x1     B       two
2  x2     A         A
3  x2     A         A
4  x3     B         B

Pandas add column to df after group_by and value_counts

Alternatively join counts on group and color:

counts = df.groupby('group')['color'].value_counts(normalize=True)
df = df.join(counts.rename('freq'), on=['group', 'color'])

   group  color      freq
0      A    red  0.400000
1      A    red  0.400000
2      A  green  0.400000
3      A   blue  0.200000
4      A  green  0.400000
5      B    red  0.750000
6      B    red  0.750000
7      B    red  0.750000
8      B  green  0.250000
9      C   blue  0.333333
10     C  green  0.333333
11     C    red  0.333333

Or calculate the normalized value counts manually with groupby transform, dividing the count of each group + color pair by the count of its group:

df['freq'] = (
        df.groupby(['group', 'color'])['color'].transform('count') /
        df.groupby('group')['group'].transform('count')
)

   group  color      freq
0      A    red  0.400000
1      A    red  0.400000
2      A  green  0.400000
3      A   blue  0.200000
4      A  green  0.400000
5      B    red  0.750000
6      B    red  0.750000
7      B    red  0.750000
8      B  green  0.250000
9      C   blue  0.333333
10     C  green  0.333333
11     C    red  0.333333
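
If you want to convince yourself that the two approaches agree, here is a self-contained sketch. The input frame is reconstructed from the output shown above, so treat it as an assumption about the original data.

```python
import pandas as pd

# Input reconstructed from the printed output (an assumption, not the asker's file).
df = pd.DataFrame({
    "group": list("AAAAABBBBCCC"),
    "color": ["red", "red", "green", "blue", "green",
              "red", "red", "red", "green",
              "blue", "green", "red"],
})

# Approach 1: normalized value counts, joined back on group and color.
counts = df.groupby("group")["color"].value_counts(normalize=True)
joined = df.join(counts.rename("freq"), on=["group", "color"])

# Approach 2: pair count divided by group count, both via transform.
manual = (
    df.groupby(["group", "color"])["color"].transform("count")
    / df.groupby("group")["group"].transform("count")
)

# Largest absolute difference between the two results.
max_diff = (joined["freq"] - manual).abs().max()
print(max_diff)
```

The join keeps the rows of df in their original order, which is why the two Series can be compared element by element.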

Pandas - Add Column Name to Results of groupby

Method 1:

Use the argument as_index=False in your groupby:

df2 = df.groupby(['timeIndex'], as_index=False)['isZero'].sum()

>>> df2
   timeIndex  isZero
0          1       1
1          2       0

>>> df2['isZero']
0    1
1    0
Name: isZero, dtype: int64

Method 2:

You can use to_frame with your desired column name and then reset_index:

df2 = df.groupby(['timeIndex'])['isZero'].sum().to_frame('isZero').reset_index()

>>> df2
   timeIndex  isZero
0          1       1
1          2       0

>>> df2['isZero']
0    1
1    0
Name: isZero, dtype: int64
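
Both methods should give the same frame. A quick sketch to check this, assuming made-up input rows that are consistent with the sums shown above:

```python
import pandas as pd

# Hypothetical input; only the column names come from the answer.
df = pd.DataFrame({
    "timeIndex": [1, 1, 2, 2],
    "isZero": [0, 1, 0, 0],
})

# Method 1: keep the group key as a regular column.
m1 = df.groupby(["timeIndex"], as_index=False)["isZero"].sum()

# Method 2: aggregate to a Series, name it, then move the index into a column.
m2 = df.groupby(["timeIndex"])["isZero"].sum().to_frame("isZero").reset_index()

print(m1)
print(m2)
```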

Add column with previous values by group

Use shift, which moves the values of a column down by one row:

df2['PreviousValues'] = df2['FN'].shift()

Output:

       Date FN  AuM PreviousValues
0  01012021  A   10            NaN
1  01012021  B   20              A
2  02012021  A   12              B
3  02012021  B   23              A
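
A plain shift takes the previous row regardless of group. If what you need is the previous value within each group, combine groupby with shift. The sketch below assumes the groups are defined by FN and that the value of interest is AuM; both are assumptions about the intent, and the column name PreviousAuM is made up.

```python
import pandas as pd

# Same rows as in the output above.
df2 = pd.DataFrame({
    "Date": ["01012021", "01012021", "02012021", "02012021"],
    "FN": ["A", "B", "A", "B"],
    "AuM": [10, 20, 12, 23],
})

# Shift within each FN group: each row gets the AuM of the
# previous row that has the same FN, or NaN if there is none.
df2["PreviousAuM"] = df2.groupby("FN")["AuM"].shift()
print(df2)
```

The first row of each group has no predecessor, so it stays NaN and the column becomes float.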
