Adding a Column of Means by Group to Original Data

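This is what the ave function is for. It returns a vector the same length as its input, with each value replaced by the mean of its group (mean is the default FUN).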

df1$Y.New <- ave(df1$Y, df1$X)

Create new column for mean by group in original dataframe in R

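Use mutate instead of summarise. mutate keeps every row and repeats the group mean on each one, whereas summarise collapses each group to a single row.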

library(dplyr)
df <- df %>%
        group_by(unit_id) %>%
        mutate(mean = mean(outcome))

Creating a new column based on the mean of other values in group

Compute the means of all other values within each group using a double groupby:

sum all the values within the group
subtract the current (focal) value
divide by one less than the number of items in the group

Shift the means down by one row and assign them to a new column, using 0 wherever the previous row belongs to a different col1 value:

means = (
    df.groupby("group")
    .apply(lambda x: x.groupby("col2")["col3"].transform("sum")
                      .sub(x["col3"])
                      .div(len(x["col1"].unique()) - 1))
    .droplevel(0)
)

df["mean"] = means.shift().where(df["col1"].eq(df["col1"].shift()), 0)

>>> df
   col1  col2  col3  group  mean
0     A  2015    10     10   0.0
1     A  2016    20     10   9.0
2     A  2017    25     10  10.5
3     B  2015    10     10   0.0
4     B  2016    12     10   9.0
5     B  2017    14     10  14.5
6     c  2015     8     10   0.0
7     c  2016     9     10  10.0
8     c  2017    10     10  16.0
9     d  2015    50     20   0.0
10    d  2016    60     20  40.0
11    d  2017    70     20  50.0
12    e  2015    40     20   0.0
13    e  2016    50     20  50.0
14    e  2017    60     20  60.0
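
The three steps listed above (sum, subtract the focal value, divide by one less than the group size) can be checked in isolation. The following is a minimal sketch rather than the answer's own code: the frame toy and the column name others_mean are made up for illustration, and the values are the 2015 rows from the output above.

```python
import pandas as pd

# Hypothetical toy frame: the 2015 rows of the example, one row per col1 value.
toy = pd.DataFrame({
    "col1": ["A", "B", "c", "d", "e"],
    "group": [10, 10, 10, 20, 20],
    "col3": [10.0, 10.0, 8.0, 50.0, 40.0],
})

grouped = toy.groupby("group")["col3"]
group_sum = grouped.transform("sum")     # step 1: total per group, repeated on every row
group_size = grouped.transform("count")  # number of rows in each group

# Steps 2 and 3: remove the focal value, then average over the remaining rows.
toy["others_mean"] = (group_sum - toy["col3"]) / (group_size - 1)
print(toy)
```

The resulting values (9.0, 9.0, 10.0, 40.0, 50.0) are the ones the shifted column shows on the 2016 rows of the output above. A group with a single row would divide by zero here, so it would need separate handling.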

Add a column with mean values for groups based on another column

Use groupby with transform to calculate the mean of the desired columns, then join the result back to the initial DataFrame to add the newly created columns:

df = df.join(
    df.groupby('area')[['prod_a', 'prod_b']]
        .transform('mean')  # Calculate the mean for each group
        .rename(columns='mean {} for the area'.format)  # Rename columns 
)

df:

entity  area  prod_a  prod_b  mean prod_a for the area  mean prod_b for the area
001     A     1       3       1.5                       4.5
002     B     2       4       4                         4.5
003     A     2       6       1.5                       4.5
004     C     7       2       5.5                       5
005     C     4       8       5.5                       5
006     B     6       5       4                         4.5
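
The join works because transform returns one row per input row, indexed like the original frame, whereas a plain aggregation returns one row per group. Here is a small sketch of that difference, with the input values read back from the table above (the names per_area and per_row are illustrative only):

```python
import pandas as pd

# Input reconstructed from the output table above.
df = pd.DataFrame({
    "entity": ["001", "002", "003", "004", "005", "006"],
    "area": ["A", "B", "A", "C", "C", "B"],
    "prod_a": [1, 2, 2, 7, 4, 6],
    "prod_b": [3, 4, 6, 2, 8, 5],
})

cols = ["prod_a", "prod_b"]
per_area = df.groupby("area")[cols].mean()            # collapses to one row per area
per_row = df.groupby("area")[cols].transform("mean")  # keeps one row per original row

print(per_area.shape, per_row.shape)
print(per_row.index.equals(df.index))
```

Because per_row shares its index with df, join can align the two frames row by row without a key column.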

Dataframe: adding a column with mean by other column group

Another alternative uses pd.eval to parse the state column, casts the result to int, and takes the group mean with transform:

data['av_state'] = (data.assign(state=pd.eval(data['state']).astype(int))
                       .groupby("group")['state'].transform('mean'))

print(data)

  id group  state  value  av_state
0  1     1   True     11  0.666667
1  2     1  False     12  0.666667
2  3     2  False      5  0.500000
3  4     1   True      8  0.666667
4  5     2   True      3  0.500000
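
If the state column already has a boolean dtype rather than text, the pd.eval step should not be needed, since the mean of 0/1 values is simply the share of True rows in each group. This is a sketch under that assumption, with the data rebuilt from the output above:

```python
import pandas as pd

# Assumption: `state` holds real booleans (not the strings "True"/"False").
data = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "group": [1, 1, 2, 1, 2],
    "state": [True, False, False, True, True],
    "value": [11, 12, 5, 8, 3],
})

# Cast to int so True/False become 1/0, then broadcast each group's mean to its rows.
data["av_state"] = data["state"].astype(int).groupby(data["group"]).transform("mean")
print(data)
```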

Add a column to the original pandas data frame after grouping by 2 columns and taking dot product of two other columns

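One way is to take the row-wise product of the two columns with pandas.DataFrame.prod, then sum the products within each group with transform: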

df["Avg Price"] = df[["Weights", "Price"]].prod(axis=1)
df["Avg Price"] = df.groupby(["Date", "Issuer"])["Avg Price"].transform("sum")
print(df)

Output:

         Date Issuer  Weights  Price  Avg Price
0  2019-11-12      A      0.4    100      120.0
1  2019-15-12      B      0.5    100      100.0
2  2019-11-12      A      0.2    200      120.0
3  2019-15-12      B      0.3    100      100.0
4  2019-11-12      A      0.4    100      120.0
5  2019-15-12      B      0.2    100      100.0
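
The intermediate column can also be avoided by grouping the product Series directly. This is a sketch, not the answer's code: the data is copied from the output above and weighted is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": ["2019-11-12", "2019-15-12"] * 3,
    "Issuer": ["A", "B"] * 3,
    "Weights": [0.4, 0.5, 0.2, 0.3, 0.4, 0.2],
    "Price": [100, 100, 200, 100, 100, 100],
})

# Row-wise product of weight and price.
weighted = df["Weights"] * df["Price"]

# Sum the products within each (Date, Issuer) pair and broadcast back to every row.
df["Avg Price"] = weighted.groupby([df["Date"], df["Issuer"]]).transform("sum")
print(df)
```

Up to floating-point rounding, this should give the same Avg Price column as the two-step version.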

Aggregate by group AND add column to data frame in R

Since you have a tibble, here is a dplyr solution first, followed by a base R version.

Using dplyr:

df1 %>% 
  group_by(place) %>% 
  mutate(sum_num = sum(number))

# A tibble: 11 x 4
# Groups:   place [4]
   place animal number sum_num
   <chr> <chr>   <dbl>   <dbl>
 1 a     cat         5      11
 2 a     bear        6      11
 3 b     cat         7      22
 4 b     bear        4      22
 5 b     pig         5      22
 6 b     goat        6      22
 7 c     cat         8      16
 8 c     bear        5      16
 9 c     goat        3      16
10 d     goat        7      11
11 d     bear        4      11

Using base R:

df1$sum_num <- ave(df1$number, df1$place, FUN = sum)

# A tibble: 11 x 4
   place animal number sum_num
   <chr> <chr>   <dbl>   <dbl>
 1 a     cat         5      11
 2 a     bear        6      11
 3 b     cat         7      22
 4 b     bear        4      22
 5 b     pig         5      22
 6 b     goat        6      22
 7 c     cat         8      16
 8 c     bear        5      16
 9 c     goat        3      16
10 d     goat        7      11
11 d     bear        4      11