Adding Months to a Pandas Object in Python

Adding months to a pandas object in Python

I believe you need a loop with f-strings for the new column names:

for i in range(1, 4):
    df[f'Date_added_{i}_months'] = df['Date'] + pd.offsets.MonthBegin(i)
print(df)
        Date Date_added_1_months Date_added_2_months Date_added_3_months
0 2018-12-14          2019-01-01          2019-02-01          2019-03-01
1 2019-01-11          2019-02-01          2019-03-01          2019-04-01
2 2019-01-25          2019-02-01          2019-03-01          2019-04-01
3 2019-02-08          2019-03-01          2019-04-01          2019-05-01
4 2019-02-22          2019-03-01          2019-04-01          2019-05-01
5 2019-07-26          2019-08-01          2019-09-01          2019-10-01

Or, to keep the same day of the month, use pd.DateOffset(months=i). (The original answer used pd.offsets.MonthOffset(i), which does not appear to be available in recent pandas releases.)

for i in range(1, 4):
    df[f'Date_added_{i}_months'] = df['Date'] + pd.DateOffset(months=i)
print(df)
        Date Date_added_1_months Date_added_2_months Date_added_3_months
0 2018-12-14          2019-01-14          2019-02-14          2019-03-14
1 2019-01-11          2019-02-11          2019-03-11          2019-04-11
2 2019-01-25          2019-02-25          2019-03-25          2019-04-25
3 2019-02-08          2019-03-08          2019-04-08          2019-05-08
4 2019-02-22          2019-03-22          2019-04-22          2019-05-22
5 2019-07-26          2019-08-26          2019-09-26          2019-10-26
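
As a self-contained sketch of the two loops above, the following builds the frame from the dates shown in the output and adds both kinds of columns. The column names first_of_month_plus_{i} and same_day_plus_{i} are made up for this illustration, and pd.DateOffset(months=i) is used for the same-day variant.

```python
import pandas as pd

# Sample dates copied from the output shown above.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2018-12-14", "2019-01-11", "2019-01-25",
        "2019-02-08", "2019-02-22", "2019-07-26",
    ])
})

for i in range(1, 4):
    # MonthBegin(i) rolls forward to the first day of a later month.
    df[f"first_of_month_plus_{i}"] = df["Date"] + pd.offsets.MonthBegin(i)
    # DateOffset(months=i) keeps the day of the month where that day exists.
    df[f"same_day_plus_{i}"] = df["Date"] + pd.DateOffset(months=i)

# When the target month is shorter, the day is clipped to the month end.
clipped = pd.Timestamp("2019-01-31") + pd.DateOffset(months=1)

print(df)
print(clipped)
```

The clipped value illustrates the one case where "same day" cannot hold: 31 January plus one month lands on the last day of February.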

Add months to a date in Pandas

You could use pd.DateOffset with the months keyword:

In [1756]: df.date + pd.DateOffset(months=plus_month_period)
Out[1756]:
0 2017-01-11
1 2017-02-01
Name: date, dtype: datetime64[ns]

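Another way that is sometimes suggested is pd.offsets.MonthOffset. Note that in the output below it moved the dates from Details forward by 3 days, not 3 months, and the class does not appear to be available in recent pandas releases, so pd.DateOffset(months=...) is the safer choice: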

In [1785]: df.date + pd.offsets.MonthOffset(plus_month_period)
Out[1785]:
0 2016-10-14
1 2016-11-04
Name: date, dtype: datetime64[ns]

Details

In [1757]: df
Out[1757]:
date
0 2016-10-11
1 2016-11-01

In [1758]: plus_month_period
Out[1758]: 3
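
To reproduce the pd.DateOffset result without an IPython session, here is a minimal sketch that rebuilds the two-row frame from Details. It also shows a pitfall worth knowing: the plural keyword months adds to the date, while the singular month replaces the month field.

```python
import pandas as pd

# Data taken from the Details section above.
df = pd.DataFrame({"date": pd.to_datetime(["2016-10-11", "2016-11-01"])})
plus_month_period = 3

# Plural keyword: shift every date forward by three calendar months.
shifted = df["date"] + pd.DateOffset(months=plus_month_period)
print(shifted)

# Singular keyword: set the month to March instead of adding three months.
replaced = pd.Timestamp("2016-10-11") + pd.DateOffset(month=3)
print(replaced)
```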

Add n months to a pandas Period object?

Add a pd.offsets.MonthEnd object:

pd.Period('2018-11', 'M') + pd.offsets.MonthEnd(3)
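
A short sketch of the same idea, with the result stored so it can be inspected. For a monthly Period, adding a plain integer moves it by that many months as well, so both forms below should give the same result.

```python
import pandas as pd

p = pd.Period("2018-11", freq="M")

# Adding an offset that matches the period's monthly frequency.
via_offset = p + pd.offsets.MonthEnd(3)

# Adding an integer shifts by that many periods (months here).
via_int = p + 3

print(via_offset, via_int)
```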

Add months to a datetime column in pandas

This is a vectorized way to do it, so it should be quite performant. Note that it doesn't handle month crossings or month ends, and it doesn't deal well with DST changes. The months are treated as fixed-length durations rather than calendar months, which I believe is why you get the times.

In [32]: df['START_DATE'] + df['MONTHS'].values.astype("timedelta64[M]")
Out[32]:
0 2035-03-20 20:24:00
1 2035-03-20 20:24:00
2 2035-03-20 20:24:00
3 2035-03-20 20:24:00
4 2035-03-20 20:24:00
5 2024-12-31 10:12:00
6 2036-12-31 20:24:00
7 NaT
8 NaT
9 NaT
Name: START_DATE, dtype: datetime64[ns]

If you need exact MonthEnd/MonthBegin handling, this is an appropriate method. (Use pd.DateOffset(months=...) instead to land on the same day of the month.)

In [33]: df.dropna().apply(lambda x: x['START_DATE'] + pd.offsets.MonthEnd(x['MONTHS']), axis=1)
Out[33]:
0 2035-02-28
1 2035-02-28
2 2035-02-28
3 2035-02-28
4 2035-02-28
5 2024-12-31
6 2036-12-31
dtype: datetime64[ns]
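
For the same-day variant with a different number of months per row, one possible sketch is a list comprehension over the non-null rows. The sample values below are hypothetical (the original frame is not shown), and END_DATE is a made-up column name.

```python
import pandas as pd

# Hypothetical data: two complete rows and one row with missing values.
df = pd.DataFrame({
    "START_DATE": pd.to_datetime(["2015-03-20", "2014-12-31", None]),
    "MONTHS": [240, 120, None],
})

valid = df.dropna()

# DateOffset needs an integer month count, so cast each value explicitly.
shifted = pd.Series(
    [d + pd.DateOffset(months=int(m))
     for d, m in zip(valid["START_DATE"], valid["MONTHS"])],
    index=valid.index,
)

# Assignment aligns on the index, so the dropped row ends up as NaT.
df["END_DATE"] = shifted
print(df)
```

This is a row-by-row loop, so it trades speed for calendar-correct results and avoids the stray times seen in the timedelta64[M] approach.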

Adding months to a date which is bigger than the limit of Timestamp type

You can use Periods, as per the Representing out-of-bounds spans section of the guide on timestamps posted by @HenryEcker in comments. To convert the column simply use .dt.to_period():

>>> df['date'].dt.to_period(freq='M')
0 2017-01
1 2018-05
2 2016-03
3 2007-05
Name: date, dtype: period[M]

The rest is easy, adding the int64 months can even be done without conversion:

>>> df['shifted_date'] = df['date'].dt.to_period(freq='M') + df['months']
>>> df
date months shifted_date
0 2017-01-28 9999 2850-04
1 2018-05-13 9999 2851-08
2 2016-03-22 9999 2849-06
3 2007-05-12 9999 2840-08
>>> df['shifted_date']
0 2850-04
1 2851-08
2 2849-06
3 2840-08
Name: shifted_date, dtype: period[M]

Depending on the dates you have, you could convert to a finer-grained period:

>>> df['shifted_date'].astype('Period[D]')
0 2850-04-30
1 2851-08-31
2 2849-06-30
3 2840-08-31
Name: shifted_date, dtype: period[D]

Going back to datetimes would trigger the overflow you’re trying to avoid:

>>> df['shifted_date'].dt.start_time
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.8/site-packages/pandas/core/accessor.py", line 78, in _getter
    return self._delegate_property_get(name)
  File "/usr/lib64/python3.8/site-packages/pandas/core/indexes/accessors.py", line 70, in _delegate_property_get
    result = getattr(values, name)
  File "/usr/lib64/python3.8/site-packages/pandas/core/arrays/period.py", line 420, in start_time
    return self.to_timestamp(how="start")
  File "/usr/lib64/python3.8/site-packages/pandas/core/arrays/period.py", line 465, in to_timestamp
    new_data = libperiod.periodarr_to_dt64arr(new_data.asi8, base)
  File "pandas/_libs/tslibs/period.pyx", line 977, in pandas._libs.tslibs.period.periodarr_to_dt64arr
  File "pandas/_libs/tslibs/conversion.pyx", line 246, in pandas._libs.tslibs.conversion.ensure_datetime64ns
  File "pandas/_libs/tslibs/np_datetime.pyx", line 113, in pandas._libs.tslibs.np_datetime.check_dts_bounds
pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 2850-04-01 00:00:00
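
To see why the Period route helps, here is a small sketch using two of the dates above. It prints the upper bound of nanosecond timestamps, then does the month arithmetic entirely on periods and reads the year back through the .dt accessor, without ever converting to datetimes.

```python
import pandas as pd

# Upper bound for nanosecond-resolution timestamps (falls in the year 2262).
print(pd.Timestamp.max)

dates = pd.Series(pd.to_datetime(["2017-01-28", "2007-05-12"]))
months = pd.Series([9999, 9999])

# Monthly periods are not limited by the nanosecond timestamp range.
shifted = dates.dt.to_period("M") + months
print(shifted)

# Period components stay accessible without going back to datetimes.
years = shifted.dt.year
print(years)
```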

How to add month column to a date column in python?

import pandas as pd

# `date` is the existing DataFrame; parse the string column into datetimes
date['date_format'] = pd.to_datetime(date['Maturity_date'])

# Add a column with the abbreviated month name (e.g. 'Jan')
date['Month'] = date['date_format'].dt.strftime('%b')
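
Since the snippet above assumes a DataFrame named date already exists, here is a self-contained sketch with hypothetical Maturity_date values. It also adds a numeric month column (Month_number, a made-up name), which does not depend on the locale the way '%b' does.

```python
import pandas as pd

# Hypothetical sample data standing in for the original DataFrame.
date = pd.DataFrame({"Maturity_date": ["2021-01-15", "2021-06-30", "2021-12-01"]})

# Parse the strings into datetimes.
date["date_format"] = pd.to_datetime(date["Maturity_date"])

# Abbreviated month name; the exact text depends on the active locale.
date["Month"] = date["date_format"].dt.strftime("%b")

# Month as an integer from 1 to 12.
date["Month_number"] = date["date_format"].dt.month

print(date)
```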

