Calculating the Mean of Each Month by Year in Python

IIUC, you can use pd.Grouper. I took the liberty of adding a few rows to your dataframe (with different months) to show:

>>> df
              ds       y
1256  2000-01-03  1.8050
1257  2000-01-04  1.8405
1258  2000-01-05  1.8560
1259  2000-01-06  1.8400
1260  2000-01-07  1.8310
1261  2000-01-10  1.8190
1262  2000-01-11  1.8225
1263  2000-01-12  1.8350
1263  2000-02-12  1.8350
1263  2000-02-15  2.9450
5844  2018-04-09  3.3950
5845  2018-04-10  3.4146
5846  2018-04-11  3.3955
5847  2018-04-12  3.3902
5848  2018-04-13  3.4088
5849  2018-04-16  3.4282
5850  2018-04-17  3.4022
5851  2018-04-18  3.3844
5852  2018-04-19  3.4028
5853  2018-04-20  3.4121
5854  2018-04-23  3.4463
5855  2018-04-24  3.4685
5856  2018-04-25  3.5090
5857  2018-04-26  3.4992

# first cast ds to datetime
df['ds'] = pd.to_datetime(df['ds'])
# then group by month, and get the mean:
df.groupby(pd.Grouper(key='ds', freq='M')).mean().dropna()

                       y
    ds                  
    2000-01-31  1.831125
    2000-02-29  2.390000
    2018-04-30  3.425486

The result is a DataFrame holding the mean of y for each month, indexed by the last calendar day of that month.
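A minimal, self-contained sketch of the same Grouper approach follows. The five sample rows are a cut-down version of the data above, and the `MONTH_END` constant is a hypothetical helper of mine, not part of pandas: as far as I know, pandas 2.2 and later prefer the alias 'ME' for month-end frequency while older releases only accept 'M', so the snippet probes which one the installed version understands.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

# Assumption: newer pandas accepts "ME" for month-end, older pandas only "M".
# MONTH_END is an illustrative name, not a pandas constant.
try:
    to_offset("ME")
    MONTH_END = "ME"
except ValueError:
    MONTH_END = "M"

# Cut-down sample in the same shape as the dataframe above
df = pd.DataFrame({
    "ds": ["2000-01-03", "2000-01-04", "2000-02-12", "2000-02-15", "2018-04-09"],
    "y": [1.8050, 1.8405, 1.8350, 2.9450, 3.3950],
})
df["ds"] = pd.to_datetime(df["ds"])

# One bin per calendar month; dropna() discards the months with no rows
monthly = df.groupby(pd.Grouper(key="ds", freq=MONTH_END))["y"].mean().dropna()
print(monthly)
```

Selecting the `y` column before calling mean returns a Series rather than a DataFrame, which is often more convenient when only one column is of interest.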

Python - Aggregate by month and calculate average

Probably the simplest approach is to use the resample method. First, when you read in your data, make sure you parse the dates and set the date column as your index (ignore the StringIO part and the header=0; I am reading your sample data from a multi-line string):

>>> df = pd.read_csv(StringIO(data), header=0, parse_dates=['Date'],
                     index_col='Date')
>>> df

            Sentiment
Date
2014-01-03       0.40
2014-01-04      -0.03
2014-01-09       0.00
2014-01-10       0.07
2014-01-12       0.00
2014-02-24       0.00 
2014-02-25       0.00
2014-02-25       0.00
2014-02-26       0.00
2014-02-28       0.00
2014-03-01       0.10
2014-03-02      -0.50
2014-03-03       0.00
2014-03-08      -0.06
2014-03-11      -0.13
2014-03-22       0.00
2014-03-23       0.33
2014-03-23       0.30
2014-03-25      -0.14
2014-03-28      -0.25


>>> df.resample('M').mean()

            Sentiment
2014-01-31      0.088
2014-02-28      0.000
2014-03-31     -0.035

And if you want a month counter, you can add it after your resample:

>>> agg = df.resample('M').mean()
>>> agg['cnt'] = range(len(agg))
>>> agg

            Sentiment  cnt
2014-01-31      0.088    0
2014-02-28      0.000    1
2014-03-31     -0.035    2

You can also do this with the groupby method and pd.Grouper (group by month, then call the mean convenience method that groupby provides). Older answers use pd.TimeGrouper for this, but it has since been removed from pandas.

>>> df.groupby(pd.Grouper(freq='M')).mean()

            Sentiment
2014-01-31      0.088
2014-02-28      0.000
2014-03-31     -0.035

Pandas, how to calculate mean values of the past n years for every month

I could not tell which columns and index your dataframe has, so I will assume it looks like this:

df = pd.DataFrame({'year': [1999.0, 1999.0, 1999.0, 2000.0, 2000.0, 2000.0,
                            2001.0, 2001.0, 2001.0, 2002.0, 2002.0, 2002.0,
                            2003.0, 2003.0, 2003.0],
                   'Month': ['1', '2', '3', '1', '2', '3', '1', '2', '3',
                             '1', '2', '3', '1', '2', '3'],
                   'value': [6, 9, 7, 5, 7, 6, 4, 6, 8,
                             7, 9, 8, 5, 7, 7]})

giving:

0   year Month value
1   1999     1     6
2   1999     2     9
3   1999     3     7
4   2000     1     5
5   2000     2     7
6   2000     3     6
7   2001     1     4
8   2001     2     6
9   2001     3     8
10  2002     1     7
11  2002     2     9
12  2002     3     8
13  2003     1     5
14  2003     2     7
15  2003     3     7

You can group by month and use a rolling window of size 3 to compute the rolling mean over 3 consecutive years per month, then shift the result by one row within each month so that each year only sees the 3 years before it:

df['average_past_3_years'] = df.groupby('Month').rolling(3).agg(
                      {'value':'mean', 'year': 'max'}).reset_index(level=0).groupby(
                      'Month').transform('shift')['value']

This gives the expected result:

0   year Month value  average_past_3_years
1   1999     1     6                   NaN
2   1999     2     9                   NaN
3   1999     3     7                   NaN
4   2000     1     5                   NaN
5   2000     2     7                   NaN
6   2000     3     6                   NaN
7   2001     1     4                   NaN
8   2001     2     6                   NaN
9   2001     3     8                   NaN
10  2002     1     7              5.000000
11  2002     2     9              7.333333
12  2002     3     8              7.000000
13  2003     1     5              5.333333
14  2003     2     7              7.333333
15  2003     3     7              7.333333
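The same shifted rolling mean can also be sketched with a single transform, which some readers may find easier to follow. This is an alternative formulation of mine rather than the answer's exact code, and it assumes numeric `value` entries and rows that are already in year order within each month.

```python
import pandas as pd

# Same sample data as above, with numeric columns
df = pd.DataFrame({
    "year": [1999] * 3 + [2000] * 3 + [2001] * 3 + [2002] * 3 + [2003] * 3,
    "Month": [1, 2, 3] * 5,
    "value": [6, 9, 7, 5, 7, 6, 4, 6, 8, 7, 9, 8, 5, 7, 7],
})

# Within each month: mean of 3 consecutive years, then shift by one row
# so that a given year only averages the 3 years that precede it.
df["average_past_3_years"] = (
    df.groupby("Month")["value"]
      .transform(lambda s: s.rolling(3).mean().shift())
)
print(df)
```

Because transform returns a result aligned to the original index, no reset_index or second groupby is needed before assigning the new column.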

Get monthly average in pandas

We can convert your datetime column into a PeriodIndex on monthly frequency, then take the mean using GroupBy.mean:

df.groupby(pd.PeriodIndex(df['Date'], freq="M"))['Value'].mean()
    
Date
2006-01    14.6
2019-12    38.2
Freq: M, Name: Value, dtype: float64

df.groupby(pd.PeriodIndex(df['Date'], freq="M"))['Value'].mean().reset_index()

      Date  Value
0  2006-01   14.6
1  2019-12   38.2

One caveat of this approach is that missing months are not shown. If that's important, use set_index and resample.mean in the same way.
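To make the caveat concrete, here is a small sketch on made-up data (three hypothetical rows, with nothing in February). It uses the month-start alias 'MS' for resample, which is my choice for the illustration; the labels are therefore the first day of each month rather than a period.

```python
import pandas as pd

# Hypothetical sample: January and March have data, February has none
df = pd.DataFrame({
    "Date": pd.to_datetime(["2006-01-05", "2006-01-20", "2006-03-10"]),
    "Value": [10.0, 20.0, 30.0],
})

# Grouping by period only yields months that actually occur in the data
by_period = df.groupby(pd.PeriodIndex(df["Date"], freq="M"))["Value"].mean()

# Resampling builds a regular monthly grid, so February appears as NaN
by_resample = df.set_index("Date")["Value"].resample("MS").mean()

print(by_period)
print(by_resample)
```

If the gaps should be filled rather than shown as NaN, a fillna or interpolate call can follow the resample step.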

How do I calculate mean value for each month in the dataset?

Try:

df.index = pd.to_datetime(df.index)
df.groupby([df.index.year, df.index.month]).mean()

             RPT        VAL        ROS  ...        CLO        BEL     MAL
DATE DATE                                   ...                              
1961 1     12.373333   9.333333  11.043333  ...   7.906667   8.833333  11.960
     2     12.230000  12.020000   8.560000  ...   9.210000  15.290000  15.125
     3     10.580000   6.630000  11.750000  ...   5.880000   5.460000  10.880
1962 3     13.330000  13.250000  11.420000  ...  10.340000  12.920000  11.830
     6     13.210000   8.120000   9.960000  ...   7.500000   8.120000  13.170
1968 7     12.230000  12.020000   8.560000  ...   9.210000  15.290000  15.125
1976 8     11.955000   9.940000  11.585000  ...   8.110000   9.190000  11.355
1978 9     13.355000  11.205000   9.730000  ...   7.730000  11.040000  13.480
     12    10.960000   9.750000   7.620000  ...  10.460000  16.620000  16.460
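If a year-by-month table is easier to read than the stacked index above, the grouped result can be unstacked. The sketch below uses a tiny made-up frame with a single RPT column, and the level names "year" and "month" are my own labels, chosen to avoid the duplicated DATE/DATE header.

```python
import pandas as pd

# Hypothetical sample with a datetime index, as in the answer above
idx = pd.to_datetime(["1961-01-01", "1961-01-02", "1961-02-01", "1962-03-01"])
df = pd.DataFrame({"RPT": [10.0, 14.0, 12.0, 13.0]}, index=idx)

# Name the two grouping keys so the resulting levels are distinguishable
monthly = df.groupby(
    [df.index.year.rename("year"), df.index.month.rename("month")]
).mean()

# Rows are years, columns are months; absent combinations become NaN
table = monthly["RPT"].unstack("month")
print(table)
```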

How to group and calculate monthly average in pandas dataframe

First convert the time column to datetimes. Then sum the values per name and month using Grouper, and finally take the mean of those monthly sums per name (the first index level):

data['time'] = pd.to_datetime(data['time'])

df = (data.groupby(['name', pd.Grouper(freq='M', key='time')])['values'].sum()
          .groupby(level=0)
          .mean()
          .reset_index(name='Monthly Average'))
print (df)
  name  Monthly Average
0    A               25
1    B               30

To group by monthly periods instead, replace the Grouper with Series.dt.to_period:

data['time'] = pd.to_datetime(data['time'])

df = (data.groupby(['name', data['time'].dt.to_period('M')])['values']
          .sum()
          .groupby(level=0)
          .mean()
          .reset_index(name='Monthly Average'))
print (df)
  name  Monthly Average
0    A               25
1    B               30
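Since the original input is not shown, here is a runnable sketch of the two-step aggregation on made-up rows. The sample values are hypothetical and were picked only so that the result has the same shape as the output above.

```python
import pandas as pd

# Hypothetical input: several records per name, spread over two months
data = pd.DataFrame({
    "name": ["A", "A", "A", "B", "B"],
    "time": ["2020-01-05", "2020-01-20", "2020-02-10", "2020-01-15", "2020-02-15"],
    "values": [10, 20, 20, 25, 35],
})
data["time"] = pd.to_datetime(data["time"])

# Step 1: total per name and calendar month
monthly_totals = data.groupby(["name", data["time"].dt.to_period("M")])["values"].sum()

# Step 2: average those monthly totals per name
result = (monthly_totals.groupby(level="name")
                        .mean()
                        .reset_index(name="Monthly Average"))
print(result)
```

Note that this averages monthly totals, which differs from the plain mean of all rows per name whenever months contain different numbers of records.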

How to calculate the Monthly Average over Multiple Years with multiple Latitude and Longitude - Pandas - Xarray

If I understand, you're after the long-term mean for each month. If so, you can use xarray with groupby() instead of resample() to calculate these climatologies.

climatology = Multidata.groupby("time.month").mean("time")

See the section of the xarray documentation on calculating monthly anomalies.
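For a single time series without latitude and longitude dimensions, the same climatology idea can be sketched in plain pandas. This is an analogue on synthetic data, not the xarray call itself: grouping on the month number pools every January together, every February together, and so on, across all years.

```python
import pandas as pd

# Synthetic monthly series covering 2000-2002:
# value = month number + 10 for each year after 2000
times = pd.date_range("2000-01-01", periods=36, freq="MS")
values = (times.month + 10 * (times.year - 2000)).to_numpy(dtype=float)
s = pd.Series(values, index=times)

# Long-term mean for each calendar month (12 values, indexed 1..12)
climatology = s.groupby(s.index.month).mean()

# Monthly anomalies: each observation minus its month's long-term mean
anomalies = s - s.groupby(s.index.month).transform("mean")

print(climatology)
```

This differs from resample, which would keep each year's months separate instead of pooling them.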
