Pandas Get Column Average/Mean

If you only want the mean of the weight column, select the column (which is a Series) and call .mean():

In [479]: df
Out[479]: 
         ID  birthyear    weight
0    619040       1962  0.123123
1    600161       1963  0.981742
2  25602033       1963  1.312312
3    624870       1987  0.942120

In [480]: df["weight"].mean()
Out[480]: 0.83982437500000007
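
As a minimal sketch, the frame above can be rebuilt by hand to try this out. The values are the rounded ones from the printout, so the last digits of the result may differ slightly from Out[480]. The numeric_only=True call is an addition here, only to show the same idea for every numeric column at once.

```python
import pandas as pd

# Rebuilt from the printed frame above (values as displayed, i.e. rounded)
df = pd.DataFrame({
    "ID": [619040, 600161, 25602033, 624870],
    "birthyear": [1962, 1963, 1963, 1987],
    "weight": [0.123123, 0.981742, 1.312312, 0.942120],
})

# One column (a Series) -> a single float
weight_mean = df["weight"].mean()

# Whole frame -> a Series with one mean per numeric column
column_means = df.mean(numeric_only=True)

print(weight_mean)
print(column_means)
```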

How to average column values every n rows in pandas

If I understand correctly, you can reshape the frame to long format with DataFrame.melt and then take the mean for each site with GroupBy.mean:

# df_tmp = df_tmp.astype(int) # get correct result
df_tmp.melt('site').groupby('site')['value'].mean()

Or:

# df_tmp = df_tmp.astype(int) # get correct result
df_tmp.set_index('site').stack().groupby(level=0).mean()
# df_tmp.set_index('site').stack().mean(level=0)  # the level argument was deprecated and is gone in pandas 2.0; use groupby(level=0) instead

Output

site
1    3.333333
2    7.333333
Name: value, dtype: float64
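
The original df_tmp is not shown, so here is a rough sketch with made-up data (the column names a and b and all the values are assumptions, picked so that the result matches the output above). It shows what melt does before the groupby.

```python
import pandas as pd

# Hypothetical input: several rows per site, two measurement columns
df_tmp = pd.DataFrame({
    "site": [1, 1, 1, 2, 2, 2],
    "a":    [1, 3, 5, 6, 7, 8],
    "b":    [2, 4, 5, 7, 8, 8],
})

# Wide -> long: one row per (site, variable) pair, values in a 'value' column
long_form = df_tmp.melt("site")

# Average every value that belongs to the same site
site_means = long_form.groupby("site")["value"].mean()
print(site_means)
```

After melt, the values of a and b are pooled per site, so the mean is taken over all six numbers of each site.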

compute column average based on conditions pandas

You can use .groupby() and .mean(), then rename the resulting column with .rename():

df2 = df.groupby(['names', 'subject'], as_index=False)['value'].mean().rename({'value': 'average'}, axis=1)

Result:

print(df2)

  names subject    average
0     A       X  10.000000
1     A       Y  15.666667
2     B       P  12.250000
3     B       Q  10.000000
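
The input frame is not shown in the answer, so the data below is invented (chosen only so that the group means match the result above). As a possible alternative to mean() followed by rename(), named aggregation sets the output column name in the same step.

```python
import pandas as pd

# Made-up input, chosen so the group means match the result printed above
df = pd.DataFrame({
    "names":   ["A", "A", "A", "A", "B", "B", "B", "B", "B"],
    "subject": ["X", "Y", "Y", "Y", "P", "P", "P", "P", "Q"],
    "value":   [10, 15, 16, 16, 12, 12, 12, 13, 10],
})

# Named aggregation: new column name = (input column, aggregation function)
df2 = df.groupby(["names", "subject"], as_index=False).agg(average=("value", "mean"))
print(df2)
```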

Calculating mean of column based on the occurrence of a number in another column Pandas dataframe Python

Filter the rows with a boolean mask on s1, then take the mean of s2 for those rows:

df.loc[df['s1'] == 5, 's2'].mean()
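
A small sketch of the same idea on made-up data (only the column names s1 and s2 come from the question). It also shows what happens when no row matches, which is easy to overlook.

```python
import pandas as pd

# Hypothetical data: s1 is the column to match on, s2 the column to average
df = pd.DataFrame({"s1": [5, 3, 5, 7, 5], "s2": [10.0, 20.0, 30.0, 40.0, 50.0]})

mask = df["s1"] == 5                      # boolean Series, True where s1 is 5
mean_when_5 = df.loc[mask, "s2"].mean()   # mean of s2 over the matching rows

# No matching rows: the mean of an empty selection is NaN, not an error
mean_when_9 = df.loc[df["s1"] == 9, "s2"].mean()

print(mean_when_5, mean_when_9)
```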

pandas get column average for rows with a certain value?

Use pandas.core.groupby.GroupBy.mean:

df.groupby("city")["timeDiff"].mean()

how to get the average of dataframe column values

df.mean() skips NaN values by default, so missing data is handled sensibly. For example, averaging across the columns of each row with axis=1:

>>> df
                 A      B
DATE                     
2013-05-01  473077  71333
2013-05-02   35131  62441
2013-05-03     727  27381
2013-05-04     481   1206
2013-05-05     226   1733
2013-05-06     NaN   4064
2013-05-07     NaN  41151
2013-05-08     NaN   8144
2013-05-09     NaN     23
2013-05-10     NaN     10
>>> df.mean(axis=1)
DATE
2013-05-01    272205.0
2013-05-02     48786.0
2013-05-03     14054.0
2013-05-04       843.5
2013-05-05       979.5
2013-05-06      4064.0
2013-05-07     41151.0
2013-05-08      8144.0
2013-05-09        23.0
2013-05-10        10.0
dtype: float64

You can use df[["A", "B"]].mean(axis=1) if there are other columns to ignore.
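
To make the NaN handling explicit, here is a sketch that rebuilds the frame above and compares the default with skipna=False. The skipna comparison and the per-column call are additions for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

nan = np.nan
df = pd.DataFrame(
    {
        "A": [473077, 35131, 727, 481, 226, nan, nan, nan, nan, nan],
        "B": [71333, 62441, 27381, 1206, 1733, 4064, 41151, 8144, 23, 10],
    },
    index=pd.date_range("2013-05-01", periods=10, name="DATE"),
)

row_means = df.mean(axis=1)              # NaN skipped: rows with only B average to B
strict = df.mean(axis=1, skipna=False)   # NaN propagates to the row result instead
col_means = df.mean()                    # per column; A is averaged over its 5 values

print(row_means)
print(strict)
print(col_means)
```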

How to calculate mean of specific rows in python dataframe?

Avoid iterating over the rows of a DataFrame whenever you can, because it is very inefficient.

groupby is the way to go when you want to apply the same processing to groups of rows identified by their values in one or more columns. Here, what you want is:

df.groupby('TagName')['Sample_value'].mean().reset_index()

This gives the expected result:

     TagName  Sample_value
0      Steam  1.081447e+06
1  Utilities  3.536931e+05

Details on each step:

groupby: identifies the column(s) used to group the rows (rows with the same values end up in the same group)
['Sample_value']: restricts the groupby object to the column of interest
mean(): computes the mean per group
reset_index(): by default the grouping columns go into the index, which is fine for the mean operation; reset_index() turns them back into regular columns
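
The steps above can be sketched one at a time. The readings are hypothetical (the original data is not shown); only the column names come from the answer.

```python
import pandas as pd

# Hypothetical readings; only the column names come from the answer above
df = pd.DataFrame({
    "TagName": ["Steam", "Utilities", "Steam", "Utilities"],
    "Sample_value": [100.0, 10.0, 300.0, 30.0],
})

grouped = df.groupby("TagName")      # rows bucketed by TagName
column = grouped["Sample_value"]     # keep only the column to average
means = column.mean()                # Series indexed by TagName
result = means.reset_index()         # TagName back as a regular column

print(means)
print(result)
```

Printing means and result side by side shows the only difference: where TagName lives (index versus column).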

Compute row average in pandas

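Assign the result to a new column. The mean has to be taken across each row, so use axis=1. Because Region is not numeric, pass numeric_only=True so that it is left out (recent pandas versions no longer drop it silently).

>>> df['mean'] = df.mean(axis=1, numeric_only=True)
>>> df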
       Y1961      Y1962      Y1963      Y1964      Y1965 Region       mean
0  82.567307  83.104757  83.183700  83.030338  82.831958     US  82.943612
1   2.699372   2.610110   2.587919   2.696451   2.846247     US   2.688020
2  14.131355  13.690028  13.599516  13.649176  13.649046     US  13.743824
3   0.048589   0.046982   0.046583   0.046225   0.051750     US   0.048026
4   0.553377   0.548123   0.582282   0.577811   0.620999     US   0.576518
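
A short sketch using the first two rows of the frame above. With a text column such as Region in the frame, there are two ways to keep it out of the row mean: numeric_only=True, or selecting the year columns yourself. The startswith("Y") selection is just an assumption about how the year columns are named.

```python
import pandas as pd

# First two rows of the frame shown above
df = pd.DataFrame({
    "Y1961": [82.567307, 2.699372],
    "Y1962": [83.104757, 2.610110],
    "Y1963": [83.183700, 2.587919],
    "Y1964": [83.030338, 2.696451],
    "Y1965": [82.831958, 2.846247],
    "Region": ["US", "US"],
})

# Option 1: let pandas leave out the non-numeric columns
df["mean"] = df.mean(axis=1, numeric_only=True)

# Option 2: pick the year columns explicitly (assumes they all start with "Y")
year_cols = [c for c in df.columns if c.startswith("Y")]
explicit = df[year_cols].mean(axis=1)

print(df)
```

Option 2 is safer if you later add other numeric columns that should not be part of the average.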

For a column in pandas dataframe, calculate mean of column values in previous 4th, 8th and 12th row from the present row?

.shift() is the missing piece: it gives access to earlier rows relative to the current row in a pandas DataFrame.

Let's use .groupby(), .apply() and .shift() as follows:

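# group_keys=False keeps the original row index, so the result lines up with df
df['New column'] = (
    df.groupby((df['Row number'] - 1) // 13, group_keys=False)['Existing column']
      .apply(lambda x: (x.shift(4) + x.shift(8) + x.shift(12)) / 3)
)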

Here, the rows are split into blocks of 13 by the group number (df['Row number'] - 1) // 13.

Then, within each group, we use .apply() on the column 'Existing column' and use .shift() to get the 4th, 8th and 12th previous entries. Rows that do not have all three of those entries inside their group get NaN, which is why only the last row of each block ends up with a value.

Test Run

import numpy as np
import pandas as pd

data = {'Row number': np.arange(1, 40), 'Existing column': np.arange(11, 50)}
df = pd.DataFrame(data)

print(df)

    Row number  Existing column
0            1               11
1            2               12
2            3               13
3            4               14
4            5               15
5            6               16
6            7               17
7            8               18
8            9               19
9           10               20
10          11               21
11          12               22
12          13               23
13          14               24
14          15               25
15          16               26
16          17               27
17          18               28
18          19               29
19          20               30
20          21               31
21          22               32
22          23               33
23          24               34
24          25               35
25          26               36
26          27               37
27          28               38
28          29               39
29          30               40
30          31               41
31          32               42
32          33               43
33          34               44
34          35               45
35          36               46
36          37               47
37          38               48
38          39               49

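# group_keys=False keeps the original row index, so the result lines up with df
df['New column'] = (
    df.groupby((df['Row number'] - 1) // 13, group_keys=False)['Existing column']
      .apply(lambda x: (x.shift(4) + x.shift(8) + x.shift(12)) / 3)
)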

print(df)

    Row number  Existing column  New column
0            1               11         NaN
1            2               12         NaN
2            3               13         NaN
3            4               14         NaN
4            5               15         NaN
5            6               16         NaN
6            7               17         NaN
7            8               18         NaN
8            9               19         NaN
9           10               20         NaN
10          11               21         NaN
11          12               22         NaN
12          13               23        15.0
13          14               24         NaN
14          15               25         NaN
15          16               26         NaN
16          17               27         NaN
17          18               28         NaN
18          19               29         NaN
19          20               30         NaN
20          21               31         NaN
21          22               32         NaN
22          23               33         NaN
23          24               34         NaN
24          25               35         NaN
25          26               36        28.0
26          27               37         NaN
27          28               38         NaN
28          29               39         NaN
29          30               40         NaN
30          31               41         NaN
31          32               42         NaN
32          33               43         NaN
33          34               44         NaN
34          35               45         NaN
35          36               46         NaN
36          37               47         NaN
37          38               48         NaN
38          39               49        41.0
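
As an alternative sketch (not the approach used in the answer above), the same numbers can be obtained without a lambda by calling .shift() directly on the grouped column, since GroupBy.shift already shifts within each group and keeps the original index. The name block is just a local helper variable.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Row number": np.arange(1, 40),
    "Existing column": np.arange(11, 50),
})

# Group number for each row: 0 for rows 1-13, 1 for rows 14-26, 2 for rows 27-39
block = (df["Row number"] - 1) // 13
g = df.groupby(block)["Existing column"]

# Each shift stays inside its own block, so values never leak across blocks
df["New column"] = (g.shift(4) + g.shift(8) + g.shift(12)) / 3

print(df[df["New column"].notna()])
```

Only the 13th row of each block has all three earlier entries available, so those are the only rows printed.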
