Group by Week in Pandas

group by week in pandas

First, convert the Date column with to_datetime and subtract one week, because we want each label to cover the week after that date, not the week before it.

Then use groupby with Grouper at frequency W-MON and aggregate with sum:

df['Date'] = pd.to_datetime(df['Date']) - pd.to_timedelta(7, unit='d')
# the parentheses let the method chain span several lines
df = (df.groupby(['Name', pd.Grouper(key='Date', freq='W-MON')])['Quantity']
        .sum()
        .reset_index()
        .sort_values('Date'))
print(df)
  Name       Date  Quantity
0 Apple 2017-07-10 90
3 orange 2017-07-10 20
1 Apple 2017-07-17 30
2 Orange 2017-07-24 40
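
To see what the one-week shift does to the labels, here is a minimal sketch. The sample rows and the helper name weekly_sum are made up for illustration and are not from the original question.

```python
import pandas as pd

# Hypothetical sample data; 2017-07-10, 2017-07-17 and 2017-07-24 are Mondays.
sample = pd.DataFrame({
    "Name": ["Apple", "Apple", "Apple"],
    "Date": ["2017-07-11", "2017-07-17", "2017-07-18"],
    "Quantity": [10, 20, 40],
})
sample["Date"] = pd.to_datetime(sample["Date"])


def weekly_sum(frame):
    """Sum Quantity per Name and per week ending on Monday (hypothetical helper)."""
    return (frame.groupby(["Name", pd.Grouper(key="Date", freq="W-MON")])["Quantity"]
                 .sum()
                 .reset_index())


# Without the shift, each row is labelled with the Monday that closes its week.
unshifted = weekly_sum(sample)

# With the shift, the same rows are labelled one week earlier.
shifted = weekly_sum(sample.assign(Date=sample["Date"] - pd.to_timedelta(7, unit="d")))

print(unshifted)
print(shifted)
```

The sums are identical in both results; only the Date label moves back by seven days.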

Pandas Group by date weekly

Use DataFrame.resample with frequency W and aggregate with sum:

#convert date column to datetimes
df['date'] = pd.to_datetime(df['date'])

# select several columns with a list (double brackets)
df1 = df.resample('W', on='date')[['count1', 'count2']].sum()

Or use Grouper:

df1 = df.groupby(pd.Grouper(freq='W', key='date'))[['count1', 'count2']].sum()

print (df1)
count1 count2
date
2019-12-15 3 75
2019-12-22 4 43
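
A self-contained sketch of both variants follows. The input rows are invented so that the weekly totals match the output above; the original data may have looked different.

```python
import pandas as pd

# Hypothetical data; 2019-12-15 and 2019-12-22 are Sundays.
df = pd.DataFrame({
    "date": ["2019-12-10", "2019-12-14", "2019-12-16", "2019-12-22"],
    "count1": [1, 2, 3, 1],
    "count2": [50, 25, 40, 3],
})
df["date"] = pd.to_datetime(df["date"])

# 'W' is weekly with bins that end on Sunday.
by_resample = df.resample("W", on="date")[["count1", "count2"]].sum()
by_grouper = df.groupby(pd.Grouper(key="date", freq="W"))[["count1", "count2"]].sum()

print(by_resample)
print(by_grouper)
```

Both calls bin the rows the same way, so which one to use is mostly a matter of taste.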

Group data by week in Pandas

import pandas as pd 

Name = ["Apple", "Orange", "Apple", "Orange", "Apple", "Banana", "Apple","Orange"]
Date = ["2022-03-15","2022-03-16","2022-03-17","2022-03-18","2022-03-19","2022-03-20","2019-12-19","2004-01-07"]
author = ["sahil_1","sahil_2","sahil_3","sahil_1","sahil_2","sahil_3","sahil_3","sahil_1"]

df = pd.DataFrame(zip(Name,Date,author), columns=["Name", "Date", "Author"])
df['Date'] = pd.to_datetime(df['Date']) - pd.to_timedelta(7, unit='d')
x = df.groupby(['Name', pd.Grouper(key='Date', freq='W-MON')])['Name'].count()
print(x)

Pandas - Group data by week and add column for count of rows in group

If your dataset is a dataframe, you can use:

df.assign(Count=1).groupby('Date')['Count'].count()

If it's a series:

series.to_frame().assign(Count=1).groupby('Date')['Count'].count()

For example:

df = pd.DataFrame({'Date':['2015-09-05',
'2015-09-05',
'2015-07-08',
'2017-09-05',
'2018-09-05',
'2018-09-05']})
df.assign(Count=1).groupby('Date')['Count'].count().reset_index()

Returns:

         Date  Count
0 2015-07-08 1
1 2015-09-05 2
2 2017-09-05 1
3 2018-09-05 2
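
The example above counts rows per exact date. If the count should be per week and attached to every row as a new column, one possible approach is a weekly Grouper combined with transform. This is a sketch under assumptions: the Value column and the sample dates are hypothetical.

```python
import pandas as pd

# Hypothetical data; 2015-09-06 and 2015-09-13 are Sundays.
events = pd.DataFrame({
    "Date": pd.to_datetime(["2015-09-01", "2015-09-05", "2015-09-07"]),
    "Value": [10, 20, 30],
})

# transform("count") returns one value per original row, so the result can be
# assigned straight back as a new column.
events["Count"] = (events.groupby(pd.Grouper(key="Date", freq="W"))["Value"]
                         .transform("count"))
print(events)
```

The first two rows fall into the week ending 2015-09-06 and the third into the following week.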

How to group by week (start is Thursday) using pandas?

First convert the columns: Users_gain to integers with astype, and Date to datetimes with to_datetime:

df_users['Users_gain'] = df_users['Users_gain'].astype(int)
df_users['Date'] = pd.to_datetime(df_users['Date'], format='%d.%m.%Y')

Then aggregate with DataFrame.resample or with Grouper, using weeks that end on Wednesday (W-Wed) so that each week starts on Thursday:

df_users = df_users.resample('W-Wed',on='Date')['Users_gain'].sum().reset_index()
#alternative
#df_users = df_users.groupby(pd.Grouper(key='Date', freq='W-Wed')).sum().reset_index()

Finally, build the date-range label: subtract 6 days to get the start of each week and format both ends with Series.dt.strftime:

s = (df_users['Date'] - pd.offsets.DateOffset(days=6)).dt.strftime('%d.%m.%Y-')
df_users['Date'] = s + df_users['Date'].dt.strftime('%d.%m.%Y')

print (df_users)
Date Users_gain
0 13.02.2020-19.02.2020 6
1 20.02.2020-26.02.2020 10
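
Put together, the steps could look like the sketch below. The input rows are made up so that the totals match the output above, and the variable name weekly is only for illustration.

```python
import pandas as pd

# Hypothetical input chosen to reproduce the totals shown above.
df_users = pd.DataFrame({
    "Date": ["13.02.2020", "19.02.2020", "20.02.2020", "26.02.2020"],
    "Users_gain": ["1", "5", "4", "6"],
})
df_users["Users_gain"] = df_users["Users_gain"].astype(int)
df_users["Date"] = pd.to_datetime(df_users["Date"], format="%d.%m.%Y")

# 'W-WED' closes each bin on Wednesday, so every week runs Thursday to Wednesday.
weekly = df_users.resample("W-WED", on="Date")["Users_gain"].sum().reset_index()

# The bin label is the closing Wednesday; six days earlier is the opening Thursday.
start = (weekly["Date"] - pd.Timedelta(days=6)).dt.strftime("%d.%m.%Y")
end = weekly["Date"].dt.strftime("%d.%m.%Y")
weekly["Date"] = start + "-" + end
print(weekly)
```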

groupby week - pandas dataframe

Try extracting the ISO calendar (year, week, day) from the index, then group by year and week:

s = dt.index.isocalendar()

dt.groupby([s.year, s.week]).sum()

You would get something like this:

            a   b   c   d   e
year week
2019 1     18  33  31  26  25
     2     36  31  25  28  31
     3     33  22  44  22  29
     4     36  36  35  33  31
     5     27  30  26  31  36
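
Here is a small runnable sketch of the same idea. The frame is hypothetical (one row per day, all values 1), so each weekly sum is simply the number of days that fall into that ISO week.

```python
import pandas as pd

# Hypothetical frame: one row per day, 14 days starting on Tuesday 2019-01-01.
frame = pd.DataFrame({"a": 1}, index=pd.date_range("2019-01-01", periods=14, freq="D"))

# isocalendar() returns a DataFrame with year, week and day columns.
iso = frame.index.isocalendar()
weekly = frame.groupby([iso.year, iso.week]).sum()
print(weekly)

# ISO years can differ from calendar years around 1 January.
print(pd.Timestamp("2018-12-31").isocalendar())
```

Grouping by the ISO year rather than the calendar year matters near the turn of the year: Monday 2018-12-31 already belongs to ISO week 1 of 2019.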

Pandas grouping by week

With the caveat that the result depends on which day you treat as the first day of the week, you could adapt the following code.

import numpy as np
import pandas as pd

df = pd.DataFrame(data=d)
df['Date'] = pd.to_datetime(df['Date'])

I. Discontinuous index

Monday is chosen as the first day of week

#(1) Build a series of first_day_of_week, monday is chosen as the first day of week
weeks_index = df['Date'] - df['Date'].dt.weekday * np.timedelta64(1, 'D')

#(2) Groupby and some tidying
df2 = (df.groupby([df['Name'], weeks_index])
         .count()
         .rename(columns={'Date': 'Count'})

         .swaplevel()  # weeks to first level
         .sort_index()
         .unstack(1).fillna(0.0)

         .astype(int)
         .rename_axis('first_day_of_week')
       )

>>> print(df2)
Name A B C D K M R
first_day_of_week
2021-08-30 1 0 0 0 0 0 0
2021-09-06 0 0 3 1 0 0 0
2021-09-13 0 0 0 0 1 0 0
2021-09-20 0 0 0 1 0 0 1
2021-09-27 0 0 0 0 1 1 0
2021-10-18 0 1 0 0 0 0 0
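
The key step is turning each date into the Monday of its week. A minimal sketch of just that step, with made-up dates:

```python
import pandas as pd

# Hypothetical dates; 2021-09-06 and 2021-09-13 are Mondays, 2021-09-12 is a Sunday.
dates = pd.Series(pd.to_datetime(["2021-09-06", "2021-09-08", "2021-09-12", "2021-09-13"]))

# dt.weekday is 0 for Monday through 6 for Sunday, so subtracting that many
# days lands on the Monday of the same week.
week_start = dates - pd.to_timedelta(dates.dt.weekday, unit="D")
print(week_start.dt.strftime("%Y-%m-%d").tolist())
# → ['2021-09-06', '2021-09-06', '2021-09-06', '2021-09-13']
```

To start weeks on another day, you would shift the weekday number before subtracting; that variation is not shown here.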

II. Continuous index

This part does not differ much from the previous one.

We build a continuous version of the index and use it to reindex.

Monday is chosen as the first day of the week (obviously for both indices).

#(1a) Build a series of first_day_of_week, monday is chosen as the first day of week
weeks_index = df['Date'] - df['Date'].dt.weekday * np.timedelta64(1, 'D')
#(1b) Build a continuous series of first_day_of_week
continuous_weeks_index = pd.date_range(start=weeks_index.min(),
                                       end=weeks_index.max(),
                                       freq='W-MON')  # monday

#(2) Groupby, unstack, reindex, and some tidying
df2 = (df
       # groupby and count
       .groupby([df['Name'], weeks_index])
       .count()
       .rename(columns={'Date': 'Count'})

       # unstack on weeks
       .swaplevel()  # weeks to first level
       .sort_index()
       .unstack(1)

       # reindex to insert weeks with no data
       .reindex(continuous_weeks_index)  # new index

       # clean up
       .fillna(0.0)
       .astype(int)
       .rename_axis('first_day_of_week')
       )

>>> print(df2)
Name A B C D K M R
first_day_of_week
2021-08-30 1 0 0 0 0 0 0
2021-09-06 0 0 3 1 0 0 0
2021-09-13 0 0 0 0 1 0 0
2021-09-20 0 0 0 1 0 0 1
2021-09-27 0 0 0 0 1 1 0
2021-10-04 0 0 0 0 0 0 0
2021-10-11 0 0 0 0 0 0 0
2021-10-18 0 1 0 0 0 0 0
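
The gap-filling idea can be shown on its own with a simpler count per week. This is a sketch with hypothetical dates, not the question's data:

```python
import pandas as pd

# Hypothetical dates three weeks apart (a Wednesday and a Thursday).
dates = pd.Series(pd.to_datetime(["2021-09-08", "2021-09-30"]), name="Date")
week_start = dates - pd.to_timedelta(dates.dt.weekday, unit="D")

# Only weeks that contain data show up here.
counts = dates.groupby(week_start).size()

# Every Monday between the first and last observed week, gaps included.
all_weeks = pd.date_range(week_start.min(), week_start.max(), freq="W-MON")
filled = counts.reindex(all_weeks, fill_value=0)
print(filled)
```

Passing fill_value=0 to reindex avoids the separate fillna and astype(int) steps when the result is a plain count.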

As a last step, if needed, move the columns back into the index:

df2.stack()

Aggregating weekly data by group into monthly sums in pandas

'Week' is not in the year_month format you need in your expected output, so first convert it into year_month:

date = df['Week'].str.split(' ', expand=True)[0]
year_month = pd.to_datetime(date, errors='coerce').dt.strftime('%Y-%b').fillna(date)

before you use groupby:

df.groupby([year_month, 'Clinic']).sum()
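
A runnable sketch of this approach follows. The layout of the Week strings ("start date to end date"), the Visits column and all values are assumptions made for illustration, since the original data is not shown here.

```python
import pandas as pd

# Hypothetical data: the 'Week' layout "<start date> to <end date>" is an assumption.
df = pd.DataFrame({
    "Week": ["2019-01-07 to 2019-01-13",
             "2019-01-14 to 2019-01-20",
             "2019-02-04 to 2019-02-10"],
    "Clinic": ["A", "A", "A"],
    "Visits": [1, 2, 5],
})

# Keep the first token of each string and turn it into a 'YYYY-Mon' label.
# Tokens that are not dates become NaT and fall back to the original text.
date = df["Week"].str.split(" ", expand=True)[0]
year_month = (pd.to_datetime(date, errors="coerce")
                .dt.strftime("%Y-%b")
                .fillna(date)
                .rename("Month"))

monthly = df.groupby([year_month, "Clinic"])["Visits"].sum()
print(monthly)
```

Note that %b produces locale-dependent month abbreviations, so the labels read 2019-Jan and 2019-Feb only under an English or C locale.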

pandas group by week

Your date column is probably stored as strings.

To use it in a Grouper with a frequency, start by converting this column to datetime:

df['date'] = pd.to_datetime(df['date'])

Then, because date is an ordinary data column (not the index), pass the key='date' parameter together with a frequency.

To sum up, below you have a working example:

import pandas as pd

d = [['2018-08-19 19:08:19', 'pga', 'yes'],
['2018-08-19 19:09:27', 'pga', 'no'],
['2018-08-19 19:10:45', 'lry', 'no'],
['2018-09-07 19:12:31', 'lry', 'yes'],
['2018-09-19 19:13:07', 'pga', 'yes'],
['2018-10-22 19:13:20', 'lry', 'no']]
df = pd.DataFrame(data=d, columns=['date', 'user', 'answer'])
df['date'] = pd.to_datetime(df['date'])
gr = df.groupby(pd.Grouper(key='date', freq='W'))
for name, group in gr:
    print(' ', name)
    # weeks without any rows still appear as (empty) groups, so skip them
    if len(group) > 0:
        print(group)

Note that the group key (name) is the ending date of a week, so the dates of the group members are earlier than or equal to the date printed above them.

You can change this by passing the label='left' parameter to Grouper.
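
A short sketch of the difference between the two labels, using made-up timestamps:

```python
import pandas as pd

# Hypothetical timestamps; 2018-08-12 and 2018-08-19 are Sundays.
df = pd.DataFrame({
    "date": pd.to_datetime(["2018-08-14 10:00", "2018-08-16 12:30"]),
    "answer": ["yes", "no"],
})

# Default: the group is labelled with the Sunday that closes the week.
right = df.groupby(pd.Grouper(key="date", freq="W")).size()

# label='left': the same group is labelled with the preceding Sunday.
left = df.groupby(pd.Grouper(key="date", freq="W", label="left")).size()

print(right.index[0], right.iloc[0])
print(left.index[0], left.iloc[0])
```

The rows in each group are the same in both cases; only the label changes. With label='left' the label is the left bin edge, which itself belongs to the previous week.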
