How to Calculate Mean of All Columns, by Group

How to calculate mean of all columns, by group?

Edit 2: Recent versions of dplyr (1.0.0 and later) recommend the regular summarise() combined with across(), as in:

library(dplyr)
mtcars %>%
    group_by(cyl, gear) %>%
    summarise(across(everything(), mean))

What you're looking for is either ?summarise_all or ?summarise_each from dplyr. Note that summarise_each is deprecated and summarise_all has since been superseded by across().

Edit: full code:

library(dplyr)
mtcars %>% 
    group_by(cyl, gear) %>%
    summarise_all("mean")

# Source: local data frame [8 x 11]
# Groups: cyl [?]
# 
#     cyl  gear    mpg     disp       hp     drat       wt    qsec    vs    am     carb
#   <dbl> <dbl>  <dbl>    <dbl>    <dbl>    <dbl>    <dbl>   <dbl> <dbl> <dbl>    <dbl>
# 1     4     3 21.500 120.1000  97.0000 3.700000 2.465000 20.0100   1.0  0.00 1.000000
# 2     4     4 26.925 102.6250  76.0000 4.110000 2.378125 19.6125   1.0  0.75 1.500000
# 3     4     5 28.200 107.7000 102.0000 4.100000 1.826500 16.8000   0.5  1.00 2.000000
# 4     6     3 19.750 241.5000 107.5000 2.920000 3.337500 19.8300   1.0  0.00 1.000000
# 5     6     4 19.750 163.8000 116.5000 3.910000 3.093750 17.6700   0.5  0.50 4.000000
# 6     6     5 19.700 145.0000 175.0000 3.620000 2.770000 15.5000   0.0  1.00 6.000000
# 7     8     3 15.050 357.6167 194.1667 3.120833 4.104083 17.1425   0.0  0.00 3.083333
# 8     8     5 15.400 326.0000 299.5000 3.880000 3.370000 14.5500   0.0  1.00 6.000000

How to average all columns in dataset by group

You can use summarise_all instead of multiple uses of summarise:

library(dplyr)

data %>%
  group_by(ID) %>% 
  summarise_all(mean)

# A tibble: 3 x 4
     ID   Tr1   Tr2   Tr3
  <int> <dbl> <dbl> <dbl>
1     1  4     4.33  8   
2     4  3.5   3.5   6   
3     6  3.67  5.33  6.33

Mean per group in a data.frame

This type of operation is exactly what aggregate was designed for:

d <- read.table(text=
'Name     Month  Rate1     Rate2
Aira       1      12        23
Aira       2      18        73
Aira       3      19        45
Ben        1      53        19
Ben        2      22        87
Ben        3      19        45
Cat        1      22        87
Cat        2      67        43
Cat        3      45        32', header=TRUE)

aggregate(d[, 3:4], list(d$Name), mean)

  Group.1    Rate1    Rate2
1    Aira 16.33333 47.00000
2     Ben 31.33333 50.33333
3     Cat 44.66667 54.00000

Here we aggregate columns 3 and 4 of data.frame d, grouping by d$Name, and applying the mean function.

Or, using a formula interface:

aggregate(. ~ Name, d[-2], mean)

Group pandas dataframe and calculate mean for multiple columns

df.groupby("category", as_index=False).mean()
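As a minimal sketch of the one-liner above, the following builds a small hypothetical frame (the column names x, y and label are made up for illustration) and averages every numeric column per category. Passing numeric_only=True is an addition here, not part of the original answer: it is assumed to be needed because recent pandas versions raise an error when mean() meets a text column.

```python
import pandas as pd

# Hypothetical data: two numeric columns and one text column.
df = pd.DataFrame({
    "category": ["a", "a", "b", "b"],
    "x": [1.0, 3.0, 10.0, 20.0],
    "y": [2.0, 4.0, 30.0, 50.0],
    "label": ["p", "q", "r", "s"],
})

# as_index=False keeps "category" as a regular column;
# numeric_only=True skips the non-numeric "label" column.
means = df.groupby("category", as_index=False).mean(numeric_only=True)
print(means)
```

If every non-key column is already numeric, numeric_only=True can be dropped and the call matches the one-liner exactly.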

Group by columns under conditions to calculate average

Use DataFrame.pivot_table with a helper column new (a copy of ColB), then flatten the resulting MultiIndex columns and join the output to a new DataFrame created by aggregating Counter with sum:

df1 = (df.assign(new=df['ColB'])
         .pivot_table(index=['ColA', 'ColB'], 
                      columns='new', 
                      values=['interval','duration'], 
                      fill_value=0,
                      aggfunc='mean'))
df1.columns = df1.columns.map(lambda x: f'{x[0]}{x[1]}')
df = (df.groupby(['ColA','ColB'])['Counter']
        .sum()
        .to_frame(name='SumCounter')
        .join(df1).reset_index())
print(df)
  ColA ColB  SumCounter  durationSD  durationUD  intervalSD  intervalUD
0    A   SD           3         2.5         0.0         3.5           0
1    A   UD          10         0.0         2.0         0.0           1
2    B   SD          32         2.0         0.0         3.5           0
3    B   UD           4         0.0         1.5         0.0           2
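The answer does not show its input data, so here is a self-contained sketch of the same approach on a small hypothetical frame. The rows below are invented for illustration and are not the original poster's data; only the column names are taken from the answer.

```python
import pandas as pd

# Hypothetical input; the original question's data is not shown.
df = pd.DataFrame({
    "ColA": ["A", "A", "A", "B"],
    "ColB": ["SD", "SD", "UD", "SD"],
    "Counter": [1, 2, 10, 4],
    "interval": [3, 4, 1, 2],
    "duration": [2.0, 3.0, 2.0, 5.0],
})

# Copy ColB into a helper column so it can serve both as an index level
# and as the column level of the pivot.
wide = (df.assign(new=df["ColB"])
          .pivot_table(index=["ColA", "ColB"],
                       columns="new",
                       values=["interval", "duration"],
                       fill_value=0,
                       aggfunc="mean"))

# Flatten the two-level column index: ("duration", "SD") -> "durationSD".
wide.columns = [f"{value}{group}" for value, group in wide.columns]

# Sum Counter per group, then attach the per-group means.
result = (df.groupby(["ColA", "ColB"])["Counter"]
            .sum()
            .to_frame(name="SumCounter")
            .join(wide)
            .reset_index())
print(result)
```

Cells for combinations that cannot occur, such as durationUD in an SD row, are filled with 0 by fill_value.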

How to calculate mean values grouped on another column in Pandas

You can group by StationID and then take the mean() of BiasTemp. To get a DataFrame as output, pass as_index=False:

In [4]: df.groupby('StationID', as_index=False)['BiasTemp'].mean()
Out[4]:
  StationID  BiasTemp
0        BB       5.0
1     KEOPS       2.5
2    SS0279      15.0

Without as_index=False, it returns a Series instead:

In [5]: df.groupby('StationID')['BiasTemp'].mean()
Out[5]:
StationID
BB            5.0
KEOPS         2.5
SS0279       15.0
Name: BiasTemp, dtype: float64
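The difference between the two calls can be checked with a short, self-contained sketch. The rows below are hypothetical and were chosen only so that the group means match the output shown above. Calling reset_index() on the Series is an extra step, not part of the original answer, shown as another way to get back to a DataFrame.

```python
import pandas as pd

# Hypothetical readings; only the column names come from the answer.
df = pd.DataFrame({
    "StationID": ["BB", "KEOPS", "KEOPS", "SS0279"],
    "BiasTemp": [5.0, 2.0, 3.0, 15.0],
})

# DataFrame result: StationID stays a regular column.
as_frame = df.groupby("StationID", as_index=False)["BiasTemp"].mean()

# Series result: StationID becomes the index.
as_series = df.groupby("StationID")["BiasTemp"].mean()

# Moving the index back into a column also yields a DataFrame.
also_frame = as_series.reset_index()

print(type(as_frame).__name__, type(as_series).__name__)
```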

Read more about groupby in this pydata tutorial.

Means multiple columns by multiple groups

We can use dplyr's summarise_at to get the mean of the columns of interest after grouping by the relevant columns:

library(dplyr)
airquality %>%
   group_by(City, year) %>% 
   summarise_at(vars("PM25", "Ozone", "CO2"), mean)

Or, using across(), which first appeared in the devel version of dplyr (version - ‘0.8.99.9000’) and was released in dplyr 1.0.0:

airquality %>%
     group_by(City, year) %>%
     summarise(across(PM25:CO2, mean))
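For readers following the pandas answers on this page, a rough pandas counterpart of the same operation might look like the sketch below. The data is hypothetical: the frame only borrows the column names from the R code and is not the airquality dataset bundled with R.

```python
import pandas as pd

# Hypothetical measurements using the column names from the R example.
airquality = pd.DataFrame({
    "City": ["X", "X", "Y", "Y"],
    "year": [2020, 2020, 2020, 2021],
    "PM25": [10.0, 20.0, 30.0, 50.0],
    "Ozone": [1.0, 3.0, 5.0, 7.0],
    "CO2": [400.0, 410.0, 420.0, 430.0],
})

# Group by two keys and average only the selected columns.
means = (airquality
         .groupby(["City", "year"], as_index=False)[["PM25", "Ozone", "CO2"]]
         .mean())
print(means)
```

Selecting the columns with a list before calling mean() plays the role of vars(...) in summarise_at.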
