Pandas Dataframe Groupby Datetime Month

pandas dataframe groupby datetime month

Managed to do it:

b = pd.read_csv('b.dat')
b.index = pd.to_datetime(b['date'],format='%m/%d/%y %I:%M%p')
b.groupby(by=[b.index.month, b.index.year])

b.groupby(pd.Grouper(freq='M'))  # update for v0.21+
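
For anyone who wants to try this without the original b.dat file, here is a minimal sketch of the same idea. The rows and the `value` column are made up for illustration; only the date format string comes from the answer above. The group keys are renamed to `year` and `month` so the two index levels do not share a name.

```python
import pandas as pd

# Hypothetical stand-in for b.dat: dates in the '%m/%d/%y %I:%M%p' layout
b = pd.DataFrame({
    "date": ["01/05/13 09:30AM", "01/20/13 04:15PM",
             "02/03/13 11:00AM", "01/07/14 08:45AM"],
    "value": [1, 2, 3, 4],
})

# Parse the strings and use the result as the index
b.index = pd.to_datetime(b["date"], format="%m/%d/%y %I:%M%p")

# Group by (year, month) pairs taken from the DatetimeIndex
keys = [b.index.year.rename("year"), b.index.month.rename("month")]
by_year_month = b.groupby(keys)["value"].sum()
print(by_year_month)
```

Grouping on year first keeps January 2013 and January 2014 in separate groups and orders the result chronologically.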

Pandas groupby month and year

You can use either resample or Grouper (which resamples under the hood).

First make sure that the datetime column actually holds datetimes (hit it with pd.to_datetime). It's easier if it's a DatetimeIndex:

In [11]: df1
Out[11]:
            abc  xyz
Date
2013-06-01  100  200
2013-06-03  -20   50
2013-08-15   40   -5
2014-01-20   25   15
2014-02-21   60   80

In [12]: g = df1.groupby(pd.Grouper(freq="M"))  # DataFrameGroupBy (grouped by Month)

In [13]: g.sum()
Out[13]:
            abc  xyz
Date
2013-06-30   80  250
2013-07-31  NaN  NaN
2013-08-31   40   -5
2013-09-30  NaN  NaN
2013-10-31  NaN  NaN
2013-11-30  NaN  NaN
2013-12-31  NaN  NaN
2014-01-31   25   15
2014-02-28   60   80

In [14]: df1.resample("M").sum()  # the same (the old how='sum' argument has been removed)
Out[14]:
            abc  xyz
Date
2013-06-30   80  250
2013-07-31  NaN  NaN
2013-08-31   40   -5
2013-09-30  NaN  NaN
2013-10-31  NaN  NaN
2013-11-30  NaN  NaN
2013-12-31  NaN  NaN
2014-01-31   25   15
2014-02-28   60   80

Note: Previously pd.Grouper(freq="M") was written as pd.TimeGrouper("M"). The latter has been deprecated since 0.21.
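
A self-contained sketch of the Grouper/resample equivalence, rebuilding df1 from the values shown above. Two things here are my own choices rather than part of the answer: the frequency is passed as a pd.offsets.MonthEnd() object, because the string alias for month end was renamed from "M" to "ME" in pandas 2.2, and min_count=1 is used to get NaN for empty months, since recent versions otherwise report 0 for an empty sum.

```python
import pandas as pd

# df1 rebuilt from the transcript above
df1 = pd.DataFrame(
    {"abc": [100, -20, 40, 25, 60], "xyz": [200, 50, -5, 15, 80]},
    index=pd.DatetimeIndex(
        ["2013-06-01", "2013-06-03", "2013-08-15", "2014-01-20", "2014-02-21"],
        name="Date",
    ),
)

# An offset object sidesteps the "M" vs "ME" alias spelling
month_end = pd.offsets.MonthEnd()

via_grouper = df1.groupby(pd.Grouper(freq=month_end)).sum()
via_resample = df1.resample(month_end).sum()

# min_count=1 makes months with no rows come out as NaN instead of 0
with_gaps = df1.resample(month_end).sum(min_count=1)

print(via_grouper)
print(with_gaps)
```

Both calls produce one row per calendar month from June 2013 through February 2014, including the months that have no data.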

I had thought the following would work, but it doesn't (due to as_index not being respected? I'm not sure.). I'm including this for interest's sake.

If it's a column (it has to be a datetime64 column! as I say, hit it with to_datetime), you can use the PeriodIndex:

In [21]: df
Out[21]:
        Date  abc  xyz
0 2013-06-01  100  200
1 2013-06-03  -20   50
2 2013-08-15   40   -5
3 2014-01-20   25   15
4 2014-02-21   60   80

In [22]: pd.DatetimeIndex(df.Date).to_period("M")  # old way
Out[22]:
<class 'pandas.tseries.period.PeriodIndex'>
[2013-06, ..., 2014-02]
Length: 5, Freq: M

In [23]: per = df.Date.dt.to_period("M")  # new way to get the same

In [24]: g = df.groupby(per)

In [25]: g.sum()  # dang not quite what we want (doesn't fill in the gaps)
Out[25]:
         abc  xyz
2013-06   80  250
2013-08   40   -5
2014-01   25   15
2014-02   60   80

To get the desired result we have to reindex...
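
One way the reindex step could look, as a sketch rather than the original author's code: build a full monthly pd.period_range between the first and last period and reindex the grouped result against it. The variable names `monthly`, `full_range` and `filled` are mine. The numeric columns are selected explicitly because recent pandas versions may refuse to sum the datetime column.

```python
import pandas as pd

# df rebuilt from the transcript above
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2013-06-01", "2013-06-03", "2013-08-15", "2014-01-20", "2014-02-21"]
    ),
    "abc": [100, -20, 40, 25, 60],
    "xyz": [200, 50, -5, 15, 80],
})

# Group on the monthly period of each date
per = df["Date"].dt.to_period("M")
monthly = df.groupby(per)[["abc", "xyz"]].sum()

# Every month between the first and last observed period
full_range = pd.period_range(monthly.index.min(), monthly.index.max(), freq="M")

# Months without data become NaN rows
filled = monthly.reindex(full_range)
print(filled)
```

If zeros are preferable to NaN for the missing months, `monthly.reindex(full_range, fill_value=0)` does that in one step.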

Pandas groupby month and year (date as datetime64[ns]) and summarized by count

You can group by dt.year and dt.month_name() taken from the date column.

print (df.groupby([df['date'].dt.year.rename('year'), 
                   df['date'].dt.month_name().rename('month')])
         ['rides'].sum().reset_index())
   year    month    rides
0  2019  January  2596765
1  2020    March   880003
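
Since the question title asks for a count while the snippet above sums, here is a small sketch that returns both. The data is invented for illustration, and the output column names `total` and `n_rows` are my own. Passing sort=False is also my addition: it keeps the groups in order of first appearance, whereas the default sorts month names alphabetically within each year.

```python
import pandas as pd

# Made-up ride data, already in chronological order
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2019-01-05", "2019-01-20", "2019-02-11", "2020-03-02", "2020-03-15"]
    ),
    "rides": [10, 20, 5, 7, 8],
})

keys = [
    df["date"].dt.year.rename("year"),
    df["date"].dt.month_name().rename("month"),
]

# Named aggregation: one column for the sum, one for the row count
summary = (
    df.groupby(keys, sort=False)["rides"]
      .agg(total="sum", n_rows="count")
      .reset_index()
)
print(summary)
```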

How can I group by month from a date field using Python and Pandas?

Try this:

In [6]: df['date'] = pd.to_datetime(df['date'])

In [7]: df
Out[7]:
        date  Revenue
0 2017-06-02      100
1 2017-05-23      200
2 2017-05-20      300
3 2017-06-22      400
4 2017-06-21      500

In [59]: df.groupby(df['date'].dt.strftime('%B'))['Revenue'].sum().sort_values()
Out[59]:
date
May      500
June    1000
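
Note that a '%B' label drops the year, so rows from the same month in different years would land in one group, and sort_values() orders by revenue rather than by date. If that matters, a variant (my suggestion, not part of the answer above) is to group on a '%Y-%m' label, which keeps years apart and sorts chronologically as plain text:

```python
import pandas as pd

# Same data as in the transcript above
df = pd.DataFrame({
    "date": ["2017-06-02", "2017-05-23", "2017-05-20", "2017-06-22", "2017-06-21"],
    "Revenue": [100, 200, 300, 400, 500],
})
df["date"] = pd.to_datetime(df["date"])

# 'YYYY-MM' strings sort in calendar order and keep each year separate
by_year_month = df.groupby(df["date"].dt.strftime("%Y-%m"))["Revenue"].sum()
print(by_year_month)
```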

pandas dataframe - groupby dataframe by datetime (last 12 months) and aggregating two columns

Assuming date column's type is datetime, you can extract months to a different column:

df["month"] = df["date"].dt.month

Then group by month column and find the averages:

df.groupby("month").agg(wealth_avg=("wealth", "mean"), state_money_avg=("state_money", "mean"))
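
The two steps above do not restrict the data to the last 12 months mentioned in the question. A possible sketch of the full flow follows, with invented rows and with "last 12 months" interpreted as the 12 months before the latest date in the data; that interpretation and the `cutoff` and `recent` names are assumptions of mine.

```python
import pandas as pd

# Made-up data; column names follow the answer above
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2021-01-10", "2021-11-05", "2021-11-20", "2022-01-15", "2022-01-30"]
    ),
    "wealth": [100.0, 10.0, 30.0, 50.0, 70.0],
    "state_money": [9.0, 1.0, 3.0, 5.0, 7.0],
})

# Keep only rows within 12 months of the most recent date
cutoff = df["date"].max() - pd.DateOffset(months=12)
recent = df[df["date"] > cutoff].copy()

# Extract the month number and average both columns per month
recent["month"] = recent["date"].dt.month
result = recent.groupby("month").agg(
    wealth_avg=("wealth", "mean"),
    state_money_avg=("state_money", "mean"),
)
print(result)
```

Filtering first means each month number refers to a single calendar month, so January 2021 is not averaged together with January 2022.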

Convert year-month into Date while GroupBy

Your question wasn't totally clear, as it didn't include a workable example, but I've had a crack at it here with data I made up:

import pandas as pd

data = {'period':['202201','202201','202201','202201','202202','202202','202203'], 'actuals':[10,20,30,40,50,60,70]}
    
df = pd.DataFrame(data)
print("BEFORE:")
print(df)

This gives period as you described, but it's stored as object rather than datetime:

BEFORE:
   period  actuals
0  202201       10
1  202201       20
2  202201       30
3  202201       40
4  202202       50
5  202202       60
6  202203       70

Here format='%Y%m' converts it to datetime (%Y%m means parse the incoming string as YYYYMM). Then .dt.strftime('%Y/%m') converts it back to a string (object dtype), but in the date format you require.

df['period'] = pd.to_datetime(df['period'], format='%Y%m').dt.strftime('%Y/%m')

print("AFTER:")
groupedresults = df.groupby('period')['actuals'].sum()
print(groupedresults)

And here's your output. Change the date format of period to suit your needs:

AFTER:
period
2022/01    100
2022/02    110
2022/03     70
Name: actuals, dtype: int64
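
If a formatted string is not strictly required, an alternative sketch (my variation, not what the answer above does) is to convert to a monthly Period instead of calling strftime. The group keys then stay a real date-like type that sorts and compares correctly:

```python
import pandas as pd

# Same made-up data as above
data = {
    "period": ["202201", "202201", "202201", "202201", "202202", "202202", "202203"],
    "actuals": [10, 20, 30, 40, 50, 60, 70],
}
df = pd.DataFrame(data)

# Parse YYYYMM, then keep it as a monthly Period rather than a string
df["period"] = pd.to_datetime(df["period"], format="%Y%m").dt.to_period("M")

grouped = df.groupby("period")["actuals"].sum()
print(grouped)
```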

How to groupby specifically datetime index in a multiindex column by month

datadf1 = datadf.drop(columns='Unnamed: 0')
prac = datadf1
prac = prac.set_index('ArrDate')
prac_dates = prac.copy()

prac = prac.resample('D').apply({'ShipName':'count','ComoQty':'sum'}).reset_index()

prac_dates = ((prac_dates.resample('M').apply({'ComoQty':'sum'}))/1000).reset_index()
prac_dates['Month'] = pd.DatetimeIndex(prac_dates['ArrDate']).strftime('%B')
del prac_dates['ArrDate']
# prac_dates

prac['Month'] = pd.DatetimeIndex(prac['ArrDate']).strftime('%B')
# prac['Month'] = pd.to_datetime(prac['Month'], format='%B')
prac['ArrDate'] = pd.DatetimeIndex(prac['ArrDate']).strftime('%d')
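
The snippet above depends on a datadf that is not shown, so here is a runnable sketch of the same resample pattern on invented rows. The column names come from the answer; the values, the `daily` and `monthly` names, and the use of pd.offsets.MonthEnd() in place of the 'M' alias are assumptions of mine.

```python
import pandas as pd

# Hypothetical stand-in for datadf
datadf = pd.DataFrame({
    "ArrDate": pd.to_datetime(
        ["2021-01-03", "2021-01-03", "2021-01-05", "2021-02-10"]
    ),
    "ShipName": ["A", "B", "C", "D"],
    "ComoQty": [1000, 2000, 3000, 4000],
})
prac = datadf.set_index("ArrDate")

# Daily view: number of ships and total quantity per day
daily = (
    prac.resample("D")
        .agg({"ShipName": "count", "ComoQty": "sum"})
        .reset_index()
)

# Monthly view: total quantity in thousands, labelled by month name
monthly = (prac[["ComoQty"]].resample(pd.offsets.MonthEnd()).sum() / 1000).reset_index()
monthly["Month"] = monthly["ArrDate"].dt.strftime("%B")

print(daily.head())
print(monthly)
```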
