Pandas Resample Documentation

B         business day frequency
C         custom business day frequency (experimental)
D         calendar day frequency
W         weekly frequency
M         month end frequency
SM        semi-month end frequency (15th and end of month)
BM        business month end frequency
CBM       custom business month end frequency
MS        month start frequency
SMS       semi-month start frequency (1st and 15th)
BMS       business month start frequency
CBMS      custom business month start frequency
Q         quarter end frequency
BQ        business quarter end frequency
QS        quarter start frequency
BQS       business quarter start frequency
A         year end frequency
BA, BY    business year end frequency
AS, YS    year start frequency
BAS, BYS  business year start frequency
BH        business hour frequency
H         hourly frequency
T, min    minutely frequency
S         secondly frequency
L, ms     milliseconds
U, us     microseconds
N         nanoseconds

See the timeseries documentation. It includes a list of offsets (and 'anchored' offsets), and a section about resampling.

Note that there isn't a list of all the different how options, because how can be any NumPy array function, and any function that is available via groupby dispatching can be passed to how by name.
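As a minimal sketch of what that means in practice: the how keyword belongs to older pandas releases, and in current versions the aggregation is called on the resampler object instead. The series s and its values below are made up for illustration only.

```python
import pandas as pd

# Illustrative data (not from the original answer): six daily values.
rng = pd.date_range("2021-01-01", periods=6, freq="D")
s = pd.Series([1, 2, 3, 4, 5, 6], index=rng)

# One resampler object, grouping the days into 2-day bins.
r = s.resample("2D")

by_method = r.sum()                                  # aggregation called as a method
by_name = r.agg("sum")                               # same aggregation, passed by name
by_callable = r.apply(lambda x: x.max() - x.min())   # any custom function

print(by_method)
print(by_callable)
```

Passing the name as a string and calling the method give the same result here; the callable form is the escape hatch for anything that has no built-in name.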

Python Pandas Frequency documentation

http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases

And, almost immediately below that: W-SAT and others.

I'll admit, links to this particular piece of documentation are pretty scarce. More general frequencies can be represented by supplying a DateOffset instance. Even more general resamplings can be done via groupby.
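To make the anchored offsets and the DateOffset route concrete, here is a small sketch. The data is invented for illustration; pd.offsets.Week(weekday=5) is the offset object that corresponds to the W-SAT alias, and the last line shows the plain groupby route for a grouping that no frequency alias expresses.

```python
import pandas as pd

# Illustrative data: ten daily values starting on Friday 2021-01-01.
rng = pd.date_range("2021-01-01", periods=10, freq="D")
s = pd.Series(range(1, 11), index=rng)

# Weekly bins anchored on Saturday, via the alias string...
weekly_alias = s.resample("W-SAT").sum()

# ...and via the equivalent offset instance (weekday=5 is Saturday).
weekly_offset = s.resample(pd.offsets.Week(weekday=5)).sum()

# A grouping that is not a frequency at all: totals per weekday name.
by_weekday = s.groupby(s.index.day_name()).sum()

print(weekly_alias)
print(by_weekday)
```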

pandas resample - 5 minute blocks (not every 5th minute of the hour)

Use the origin parameter, set to the first value of the index:

import pandas as pd

rng = pd.date_range('2017-04-03 12:07:00', periods=10, freq='min')
df = pd.DataFrame({'a': range(10)}, index=rng)

# "5min" is the same frequency as the older "5T" alias
df = df.resample("5min", origin=df.index[0]).mean()
print(df)
                       a
2017-04-03 12:07:00  2.0
2017-04-03 12:12:00  7.0
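For comparison, a short sketch of what happens without origin. By default the bins are aligned to midnight of the first day, so they fall on every 5th minute of the hour; anchoring at the first timestamp gives blocks that start where the data starts. The variable names are chosen here for illustration.

```python
import pandas as pd

rng = pd.date_range("2017-04-03 12:07:00", periods=10, freq="min")
df = pd.DataFrame({"a": range(10)}, index=rng)

# Default origin: bin edges land on 12:05, 12:10, 12:15.
default_bins = df.resample("5min").mean()

# Anchored at the first timestamp: bin edges land on 12:07, 12:12.
anchored_bins = df.resample("5min", origin=df.index[0]).mean()

print(default_bins)
print(anchored_bins)
```

The default version splits the ten rows into three uneven groups (3, 5 and 2 rows), while the anchored version gives two full 5-minute blocks.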

Pandas resample monthly data into custom frequency (seasonal) data

Try mapping each month to a season value, then group by season and resample each group by year:

df['season'] = df['time'].dt.month.map({
    12: 0, 1: 0, 2: 0,
    3: 1, 4: 1, 5: 1,
    6: 2, 7: 2, 8: 2, 9: 2,
    10: 3, 11: 3
})

df = df.groupby('season').resample('Y', on='time')['data'].sum().reset_index()

df:

   season       time      data
0       0 2015-12-31  0.221993
1       0 2016-12-31  1.077451
2       1 2016-12-31  2.018766
3       2 2016-12-31  1.768848
4       3 2016-12-31  0.080741
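The same totals can also be reached with a plain groupby on season and calendar year, which avoids the yearly frequency alias altogether. This is only an alternative sketch, not the approach used in the answer; the year column and the season_of_month name are introduced here for illustration, and the values are copied from the sample data.

```python
import pandas as pd

# Values copied from the sample data used in this answer.
df = pd.DataFrame({
    "time": pd.to_datetime([
        "2015-12-31", "2016-01-31", "2016-02-29", "2016-03-31",
        "2016-04-30", "2016-05-31", "2016-06-30", "2016-07-31",
        "2016-08-31", "2016-09-30", "2016-10-31",
    ]),
    "data": [0.221993, 0.870732, 0.206719, 0.918611, 0.488411, 0.611744,
             0.765908, 0.518418, 0.296801, 0.187721, 0.080741],
})

# Same month-to-season mapping as in the answer.
season_of_month = {12: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1,
                   6: 2, 7: 2, 8: 2, 9: 2, 10: 3, 11: 3}
df["season"] = df["time"].dt.month.map(season_of_month)
df["year"] = df["time"].dt.year  # hypothetical helper column

# One row per (season, year) pair, summing the data column.
totals = df.groupby(["season", "year"])["data"].sum().reset_index()
print(totals)
```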

To count the previous December as part of the next year, add MonthBegin from pandas.tseries.offsets so that December 2015 is shifted to January 2016, then move every month in the season mapping forward by one to match:

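from pandas.tseries.offsets import MonthBegin

# Shift each month-end date to the first day of the following month
df['time'] = df['time'] + MonthBegin(1)
df['season'] = df['time'].dt.month.map({
    1: 0, 2: 0, 3: 0,
    4: 1, 5: 1, 6: 1,
    7: 2, 8: 2, 9: 2, 10: 2,
    11: 3, 12: 3
})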

df = df.groupby('season').resample('Y', on='time')['data'].sum().reset_index()

df:

   season       time      data
0       0 2016-12-31  1.299445
1       1 2016-12-31  2.018766
2       2 2016-12-31  1.768848
3       3 2016-12-31  0.080741
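The effect of the MonthBegin shift can be checked on single timestamps. A small sketch with variable names picked for illustration: a month-end date rolls forward to the first of the next month, and a date that already sits on a month start moves a full month ahead, which matters if the data is not month-end based.

```python
import pandas as pd
from pandas.tseries.offsets import MonthBegin

# A month-end timestamp rolls forward to the next month start.
month_end = pd.Timestamp("2015-12-31")
shifted = month_end + MonthBegin(1)
print(shifted)

# A timestamp already on a month start jumps to the following month start.
already_start = pd.Timestamp("2016-01-01")
shifted_again = already_start + MonthBegin(1)
print(shifted_again)
```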

Sample Data Used:

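import numpy as np
import pandas as pd

np.random.seed(5)
# 11 month-end dates, from December 2015 through October 2016
dti = pd.date_range("2015-12-31", periods=11, freq="M")
df = pd.DataFrame({'time': dti,
                   'data': np.random.rand(len(dti))})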

df:

         time      data
0  2015-12-31  0.221993
1  2016-01-31  0.870732
2  2016-02-29  0.206719
3  2016-03-31  0.918611
4  2016-04-30  0.488411
5  2016-05-31  0.611744
6  2016-06-30  0.765908
7  2016-07-31  0.518418
8  2016-08-31  0.296801
9  2016-09-30  0.187721
10 2016-10-31  0.080741

