pandas resample documentation
B business day frequency
C custom business day frequency (experimental)
D calendar day frequency
W weekly frequency
M month end frequency
SM semi-month end frequency (15th and end of month)
BM business month end frequency
CBM custom business month end frequency
MS month start frequency
SMS semi-month start frequency (1st and 15th)
BMS business month start frequency
CBMS custom business month start frequency
Q quarter end frequency
BQ business quarter end frequency
QS quarter start frequency
BQS business quarter start frequency
A year end frequency
BA, BY business year end frequency
AS, YS year start frequency
BAS, BYS business year start frequency
BH business hour frequency
H hourly frequency
T, min minutely frequency
S secondly frequency
L, ms milliseconds
U, us microseconds
N nanoseconds
See the timeseries documentation. It includes a list of offsets (and 'anchored' offsets), and a section about resampling.
Note that there isn't a list of all the different 'how' options, because it can be any NumPy array function, and any function that is available via groupby dispatching can be passed to 'how' by name.
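As a rough illustration of the point above: in recent pandas versions the 'how' argument is no longer part of resample, and the same function names and callables go to .agg() on the resampled object instead. The sketch below uses made-up daily data (an assumption, not from the original answer) and shows a function passed by name, several names at once, and a custom callable built on NumPy.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 14 daily values starting on Monday 2021-01-04.
idx = pd.date_range("2021-01-04", periods=14, freq="D")
s = pd.Series(range(14), index=idx)

weekly = s.resample("W")  # weekly bins, each ending on a Sunday

# A function given by name, as dispatched by groupby.
by_name = weekly.agg("sum")

# Several names at once produce one column per function.
several = weekly.agg(["min", "max", "mean"])

# A custom callable; here NumPy's peak-to-peak (max minus min) per bin.
spread = weekly.agg(lambda x: np.ptp(x.to_numpy()))

print(by_name)
print(several)
print(spread)
```

Passing names as strings tends to be the safest choice, since pandas can map them to its own optimized implementations.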
Python Pandas Frequency documentation
http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases
And, almost immediately below that: W-SAT and others.
I'll admit, links to this particular piece of documentation are pretty scarce. More general frequencies can be represented by supplying a DateOffset instance. Even more general resamplings can be done via groupby.
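The three options mentioned above can be sketched side by side. The data is made up for illustration, and pd.offsets.Week is used here as one example of an offset object from the DateOffset family; other offsets should behave analogously.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 14 daily values starting on Monday 2021-01-04.
idx = pd.date_range("2021-01-04", periods=14, freq="D")
s = pd.Series(range(14), index=idx)

# Anchored alias: weekly bins that end on Saturday.
by_alias = s.resample("W-SAT").sum()

# The same frequency expressed as an offset object (weekday=5 is Saturday).
by_offset = s.resample(pd.offsets.Week(weekday=5)).sum()

# A grouping that no single frequency expresses: weekdays versus weekends.
kind = np.where(s.index.dayofweek >= 5, "weekend", "weekday")
by_kind = s.groupby(kind).sum()

print(by_alias)
print(by_offset)
print(by_kind)
```

The alias and the offset object should give identical bins; groupby is the fallback when the buckets are not regular intervals in time.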
pandas resample - 5 minute blocks (not every 5th minute of the hour)
Use the origin parameter, set to the first value of the index:
import pandas as pd

rng = pd.date_range('2017-04-03 12:07:00', periods=10, freq='min')
df = pd.DataFrame({'a': range(10)}, index=rng)
# Anchor the 5-minute bins at the first timestamp instead of at midnight
df = df.resample("5min", origin=df.index[0]).mean()
print(df)
                       a
2017-04-03 12:07:00  2.0
2017-04-03 12:12:00  7.0
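For comparison, here is a small sketch of what the origin argument changes, assuming pandas 1.1 or later (where, as far as I know, origin was introduced). It reuses the same ten-minute example and contrasts the default binning with origin="start", which should be equivalent to passing the first index value explicitly.

```python
import pandas as pd

rng = pd.date_range("2017-04-03 12:07:00", periods=10, freq="min")
df = pd.DataFrame({"a": range(10)}, index=rng)

# Default: bins are aligned to midnight, so they start at 12:05, 12:10, 12:15.
default_bins = df.resample("5min").mean()

# origin="start": bins are aligned to the first timestamp in the index.
from_start = df.resample("5min", origin="start").mean()

# Explicit timestamp, as in the answer above.
from_first = df.resample("5min", origin=df.index[0]).mean()

print(default_bins)
print(from_start)
print(from_first)
```

With the default, the ten rows are split across three partially filled bins; with either origin setting they fall into two full 5-minute blocks.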
Pandas resample monthly data into custom frequency (seasonal) data
Try mapping each month to a season value, then group by season and resample yearly within each season:
df['season'] = df['time'].dt.month.map({
    12: 0, 1: 0, 2: 0,
    3: 1, 4: 1, 5: 1,
    6: 2, 7: 2, 8: 2, 9: 2,
    10: 3, 11: 3
})
df = df.groupby('season').resample('Y', on='time')['data'].sum().reset_index()
df:
   season       time      data
0       0 2015-12-31  0.221993
1       0 2016-12-31  1.077451
2       1 2016-12-31  2.018766
3       2 2016-12-31  1.768848
4       3 2016-12-31  0.080741
To count the previous December as part of the following year, add MonthBegin from pandas.tseries.offsets to shift December 2015 into January 2016, then move every key in the season mapping forward by one month:
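from pandas.tseries.offsets import MonthBegin

# Roll each month-end date forward to the first day of the next month
df['time'] = df['time'] + MonthBegin(1)
df['season'] = df['time'].dt.month.map({
    1: 0, 2: 0, 3: 0,
    4: 1, 5: 1, 6: 1,
    7: 2, 8: 2, 9: 2, 10: 2,
    11: 3, 12: 3
})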
df = df.groupby('season').resample('Y', on='time')['data'].sum().reset_index()
df:
   season       time      data
0       0 2016-12-31  1.299445
1       1 2016-12-31  2.018766
2       2 2016-12-31  1.768848
3       3 2016-12-31  0.080741
Sample Data Used:
import numpy as np
import pandas as pd

np.random.seed(5)
dti = pd.date_range("2015-12-31", periods=11, freq="M")
df = pd.DataFrame({'time': dti,
                   'data': np.random.rand(len(dti))})
df:
         time      data
0  2015-12-31  0.221993
1  2016-01-31  0.870732
2  2016-02-29  0.206719
3  2016-03-31  0.918611
4  2016-04-30  0.488411
5  2016-05-31  0.611744
6  2016-06-30  0.765908
7  2016-07-31  0.518418
8  2016-08-31  0.296801
9  2016-09-30  0.187721
10 2016-10-31  0.080741
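An alternative sketch, not from the answer above: instead of shifting the dates, one could compute a "season year" column and group on it directly. The data below is deterministic and made up for illustration (month-start dates, values 1 to 11), and the column name season_year is hypothetical. The month-to-season mapping is the one used in the first snippet.

```python
import pandas as pd

# Hypothetical data: one row per month from December 2015 to October 2016.
time = pd.date_range("2015-12-01", periods=11, freq="MS")
df = pd.DataFrame({"time": time, "data": range(1, 12)})

# Same month-to-season mapping as in the first snippet.
season_of_month = {
    12: 0, 1: 0, 2: 0,
    3: 1, 4: 1, 5: 1,
    6: 2, 7: 2, 8: 2, 9: 2,
    10: 3, 11: 3,
}
df["season"] = df["time"].dt.month.map(season_of_month)

# December is counted towards the following year's season 0.
is_december = (df["time"].dt.month == 12).astype(int)
df["season_year"] = df["time"].dt.year + is_december

totals = df.groupby(["season_year", "season"])["data"].sum().reset_index()
print(totals)
```

This avoids modifying the time column, at the cost of losing the year-end timestamp that resample produces as a label.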