Python/Pandas - Convert type from pandas period to string
You can use to_series and then convert to string:
print(df)
# Seasonal
#Date
#2014-12 -1.089744
#2015-01 -0.283654
#2015-02 0.158974
#2015-03 0.461538
print(df.index)
#PeriodIndex(['2014-12', '2015-01', '2015-02', '2015-03'],
# dtype='int64', name=u'Date', freq='M')
df.index=df.index.to_series().astype(str)
print(df)
# Seasonal
#Date
#2014-12 -1.089744
#2015-01 -0.283654
#2015-02 0.158974
#2015-03 0.461538
print(df.index)
#Index([u'2014-12', u'2015-01', u'2015-02', u'2015-03'], dtype='object', name=u'Date')
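On recent pandas versions the intermediate to_series call is probably not needed, because a PeriodIndex can be cast with astype(str) directly. A minimal sketch, assuming a small made-up frame whose values mirror the output above:

```python
import pandas as pd

# Hypothetical frame with a monthly PeriodIndex named "Date".
idx = pd.period_range("2014-12", periods=4, freq="M", name="Date")
df = pd.DataFrame(
    {"Seasonal": [-1.089744, -0.283654, 0.158974, 0.461538]}, index=idx
)

# Cast the PeriodIndex to plain strings; the index name is preserved.
df.index = df.index.astype(str)

print(df.index.tolist())
```

The values are ordinary Python strings afterwards, so period arithmetic on the index is no longer available.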
Converting dtype: period[M] to string format
Use Series.dt.strftime to convert the Series to strings in the last step:
df["Date_Modified"]= df["Date_Modified"].dt.strftime('%Y-%m')
Or convert it before the groupby; then converting to a monthly period is not necessary:
df['Date_Modified'] = pd.to_datetime(df['Collection_End_Date']).dt.strftime('%Y-%m')
df = df.groupby(["Date_Modified", "Entity"]).sum().reset_index()
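Both routes can be tried side by side on a toy frame. This is only a sketch: the column names follow the snippets above, while the "Value" column and the sample rows are made up for illustration.

```python
import pandas as pd

# Hypothetical data; "Value" is an invented numeric column.
df = pd.DataFrame({
    "Collection_End_Date": ["2021-01-15", "2021-01-20", "2021-02-03"],
    "Entity": ["A", "A", "B"],
    "Value": [1, 2, 3],
})

# Route 1: go through a monthly period, then format it as text.
periods = pd.to_datetime(df["Collection_End_Date"]).dt.to_period("M")
as_text = periods.dt.strftime("%Y-%m")

# Route 2: format the datetimes directly and group on the strings.
df["Date_Modified"] = pd.to_datetime(df["Collection_End_Date"]).dt.strftime("%Y-%m")
out = df.groupby(["Date_Modified", "Entity"], as_index=False)["Value"].sum()

print(out)
```

Both routes produce the same "YYYY-MM" labels; the second simply skips the period step.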
Converting period[Q-DEC] column into a dataframe to a string in Python
Just use .astype(str):
df.Quarter=df.Quarter.astype(str)
df.dtypes
Month int64
Year int64
Quarter object
dtype: object
print(df)
Month Year Quarter
0 1 2015 2015Q1
1 8 2020 2020Q3
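For reference, a small self-contained version of the same cast. The frame is rebuilt here from the values shown above, and the Quarter column is constructed with pd.PeriodIndex purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Month": [1, 8], "Year": [2015, 2020]})
# Build a quarterly period column matching the printed example.
df["Quarter"] = pd.PeriodIndex(["2015Q1", "2020Q3"], freq="Q")

before = str(df["Quarter"].dtype)  # the period dtype before the cast

# Cast every period to its string representation.
df["Quarter"] = df["Quarter"].astype(str)

print(before)
print(df["Quarter"].tolist())
```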
Convert period_range to list of string
You can use PeriodIndex.strftime with '%y%m' to indicate "YYMM" format:
p.strftime('%y%m')
# Index(['1905', '1906', '1907', '1908', '1909', '1910', '1911', '1912', '2001',
# '2002', '2003'],
# dtype='object')
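To end up with an actual Python list, tolist() can be chained onto the result. A short sketch, assuming p is a monthly period_range covering the same months as the output above:

```python
import pandas as pd

# Assumed definition of p: May 2019 through March 2020, monthly.
p = pd.period_range("2019-05", "2020-03", freq="M")

# Format each period as two-digit year plus two-digit month, then listify.
labels = p.strftime("%y%m").tolist()

print(labels)
```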
How to convert Period string to actual Period type
You can use the pd.PeriodIndex() constructor.
Assume you have the following DF:
In [517]: x
Out[517]:
str_col
0 1971q1
1 1971q2
2 1971q3
3 1971q4
4 1972q1
5 1972q2
6 1972q3
7 1972q4
In [518]: x.dtypes
Out[518]:
str_col object
dtype: object
Let's create a new 'period' column:
In [519]: x['period'] = pd.PeriodIndex(x.str_col, freq='Q')
In [520]: x
Out[520]:
str_col period
0 1971q1 1971Q1
1 1971q2 1971Q2
2 1971q3 1971Q3
3 1971q4 1971Q4
4 1972q1 1972Q1
5 1972q2 1972Q2
6 1972q3 1972Q3
7 1972q4 1972Q4
In [521]: x.dtypes
Out[521]:
str_col object
period object
dtype: object
Now we can do "time algebra", for example let's subtract one quarter from each period:
In [525]: x.period - 1
Out[525]:
0 1970Q4
1 1971Q1
2 1971Q2
3 1971Q3
4 1971Q4
5 1972Q1
6 1972Q2
7 1972Q3
Name: period, dtype: object
Alternatively, you can cast the str_col column to a regular pandas/NumPy datetime:
In [527]: pd.to_datetime(x.str_col, errors='coerce')
Out[527]:
0 1971-01-01
1 1971-04-01
2 1971-07-01
3 1971-10-01
4 1972-01-01
5 1972-04-01
6 1972-07-01
7 1972-10-01
Name: str_col, dtype: datetime64[ns]
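A self-contained sketch of the string-to-period round trip, using a shortened version of the frame above. Upper-casing the strings first is a precaution added here, not something the original answer does, and the '%Yq%q' format used to get back to the lower-case spelling is likewise an illustrative choice.

```python
import pandas as pd

x = pd.DataFrame({"str_col": ["1971q1", "1971q2", "1971q3", "1971q4"]})

# Parse the quarter strings into a quarterly period column.
x["period"] = pd.PeriodIndex(x["str_col"].str.upper(), freq="Q")

# Period arithmetic: shift every value back by one quarter.
previous = x["period"] - 1

# Format the periods back into the original lower-case spelling.
back = x["period"].dt.strftime("%Yq%q")

print(x["period"].dtype)
print(previous.astype(str).tolist())
```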
Convert a string column to period in pandas preserving the string
Let's say there is a dummy dataframe, similar to yours:
dictionary = {'company' : ['Facebook', 'Facebook', 'Facebook_Total','Google','Google_Total'],
'date' : ['2019-09-14 09:00:08.279000+09:00 Total',
'2020-09-14 09:00:08.279000+09:00 Total',
'-',
'2021-09-14 09:00:08.279000+09:00 Total',
'-'],
'revenue' : [10,20,30,40,50]}
df = pd.DataFrame(dictionary)
I used the re module to remove the trailing ' Total' from the date column, as follows:
import re

substring = ' Total'
for i in range(len(df)):
    # Strip the suffix only from rows that contain it (case-insensitive check).
    if re.search(substring, df['date'][i], flags=re.IGNORECASE):
        df.loc[i, 'date'] = df['date'][i].replace(substring, '')
Then, I used pd.PeriodIndex as follows:
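for i in range(len(df)):
    # Leave the '-' placeholders untouched.
    if df['date'][i] != '-':
        # Convert the timestamp string to a yearly period.
        df.loc[i, 'date'] = pd.PeriodIndex(pd.Series(df['date'][i]), freq='Y')[0]
for i in range(len(df)):
    if df['date'][i] != '-':
        # Turn the period back into text and restore the suffix.
        df.loc[i, 'date'] = str(df['date'][i]) + ' Total'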
The above code returns:
Out[1]:
company date revenue
0 Facebook 2019 Total 10
1 Facebook 2020 Total 20
2 Facebook_Total - 30
3 Google 2021 Total 40
4 Google_Total - 50
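The row-by-row loops can probably be replaced by a vectorized variant. The sketch below is not the original answer's method: it parses only the first ten characters (the calendar date) of each string, which sidesteps the time-zone offset, and it assumes every non-placeholder value starts with a YYYY-MM-DD date.

```python
import pandas as pd

df = pd.DataFrame({
    "company": ["Facebook", "Facebook", "Facebook_Total", "Google", "Google_Total"],
    "date": ["2019-09-14 09:00:08.279000+09:00 Total",
             "2020-09-14 09:00:08.279000+09:00 Total",
             "-",
             "2021-09-14 09:00:08.279000+09:00 Total",
             "-"],
    "revenue": [10, 20, 30, 40, 50],
})

suffix = " Total"
has_date = df["date"] != "-"  # rows that hold a real timestamp

# Parse the date part, convert to a yearly period, and render it as text.
years = (
    pd.to_datetime(df.loc[has_date, "date"].str.slice(0, 10))
    .dt.to_period("Y")
    .astype(str)
)

# Write the shortened labels back, keeping the '-' placeholders as they are.
df.loc[has_date, "date"] = years + suffix

print(df)
```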
How do I convert Pandas DateTime Quarter object into string
Just call str() on it:
>>> p = pd.Period('2001Q1')
>>> str(p)
'2001Q1'
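If a different layout is wanted, Period.strftime accepts a format string, including the pandas-specific %q directive for the quarter number. A brief sketch; the '%Y-Q%q' layout is just an example choice:

```python
import pandas as pd

p = pd.Period("2001Q1")

default_text = str(p)                # default string form of the period
custom_text = p.strftime("%Y-Q%q")   # custom layout using the quarter directive

print(default_text, custom_text)
```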
How can I convert string of year and quarter number into the period[Q-DEC] datatype in Python?
Convert to the Pandas Period type like this:
import pandas as pd
d = "20151"
t = d[:-1] + "Q" + d[-1:]
quarter = pd.Period(t, freq="Q")
print(quarter)
returns:
2015Q1
Alternatively, if you need a PeriodIndex:
values = ["2015Q1", "2015Q2", "2015Q3", "2015Q4"]
index = pd.PeriodIndex(values, freq="Q")
print(index)
Will return:
PeriodIndex(['2015Q1', '2015Q2', '2015Q3', '2015Q4'], dtype='period[Q-DEC]', freq='Q-DEC')
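For a whole column of such codes, the same string manipulation can be applied with the .str accessor. The helper name to_quarter and the sample codes below are hypothetical, and the sketch assumes each code is a year followed by a single quarter digit.

```python
import pandas as pd

def to_quarter(code: str) -> pd.Period:
    """Convert a code such as '20151' (year + quarter digit) to a quarterly Period."""
    year, quarter = code[:-1], code[-1]
    if quarter not in {"1", "2", "3", "4"}:
        raise ValueError(f"invalid quarter digit in {code!r}")
    return pd.Period(f"{year}Q{quarter}", freq="Q")

# Scalar use.
single = to_quarter("20151")

# Column use: insert the "Q" with string slicing, then build the index.
codes = pd.Series(["20151", "20152", "20204"])
quarters = pd.PeriodIndex(codes.str[:-1] + "Q" + codes.str[-1], freq="Q")

print(single, quarters.dtype)
```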
Pandas Date Functionality: Extracting Period Index information as String
Okay, so I found an answer while going through the docs. The strftime() method of the period index can be used for this:
In[3]: df_qtr.columns.strftime('%YQ%q')
Out[3]: array(['2008Q4', '2009Q1', '2009Q2'], dtype='<U6')
It turns out strftime() can be used with datetime, timestamp, and period indexes; note that the %q (quarter) directive is specific to pandas periods. To learn more, see: strftime() and strptime() Behavior
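A runnable sketch of the same idea, assuming df_qtr is a frame whose columns form a quarterly PeriodIndex (the data row is made up). The exact container returned by strftime may differ between pandas versions, so the result is converted to a list before use.

```python
import pandas as pd

# Hypothetical frame with quarterly periods as column labels.
cols = pd.period_range("2008Q4", periods=3, freq="Q")
df_qtr = pd.DataFrame([[1.0, 2.0, 3.0]], columns=cols)

# Render each column label as text, e.g. for plotting or export.
labels = list(df_qtr.columns.strftime("%YQ%q"))
df_qtr.columns = labels

print(labels)
```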
Convert a column of datetime and strings to period in pandas
If you need a true period column, it cannot be mixed with strings; values that cannot be parsed become NaT:
df['booking_date'] = pd.to_datetime(df['booking_date'], errors='coerce').dt.to_period('M')
print(df)
booking_date ... credit debit
0 NaT ... 10185.00 -10185.00
1 2017-01 ... 1796.00 0.00
2 2018-07 ... 7423.20 -11.54
3 2017-04 ... 1704.00 0.00
4 2017-12 ... 1938.60 -1938.60
5 2018-12 ... 1403.47 -102.01
6 2018-01 ... 2028.00 -76.38
7 2019-01 ... 800.00 -256.98
8 NaT ... 10185.00 -10185.00
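Mixing is still possible if you accept an object column, by restoring the original values wherever the conversion failed:
orig = df['booking_date']
df['booking_date'] = pd.to_datetime(df['booking_date'], errors='coerce').dt.to_period('M')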
df.loc[df['booking_date'].isna(), 'booking_date'] = orig
print (df)
booking_date ... credit debit
0 None ... 10185.00 -10185.00
1 2017-01 ... 1796.00 0.00
2 2018-07 ... 7423.20 -11.54
3 2017-04 ... 1704.00 0.00
4 2017-12 ... 1938.60 -1938.60
5 2018-12 ... 1403.47 -102.01
6 2018-01 ... 2028.00 -76.38
7 2019-01 ... 800.00 -256.98
8 Total ... 10185.00 -10185.00
print (df['booking_date'].apply(type))
0 <class 'NoneType'>
1 <class 'pandas._libs.tslibs.period.Period'>
2 <class 'pandas._libs.tslibs.period.Period'>
3 <class 'pandas._libs.tslibs.period.Period'>
4 <class 'pandas._libs.tslibs.period.Period'>
5 <class 'pandas._libs.tslibs.period.Period'>
6 <class 'pandas._libs.tslibs.period.Period'>
7 <class 'pandas._libs.tslibs.period.Period'>
8 <class 'str'>
Name: booking_date, dtype: object
The same can be done with np.where:
new = pd.to_datetime(df['booking_date'], errors='coerce').dt.to_period('M')
df['booking_date'] = np.where(new.isna(), df['booking_date'], new)
print (df)
booking_date ... credit debit
0 None ... 10185.00 -10185.00
1 2017-01 ... 1796.00 0.00
2 2018-07 ... 7423.20 -11.54
3 2017-04 ... 1704.00 0.00
4 2017-12 ... 1938.60 -1938.60
5 2018-12 ... 1403.47 -102.01
6 2018-01 ... 2028.00 -76.38
7 2019-01 ... 800.00 -256.98
8 Total ... 10185.00 -10185.00
print (df['booking_date'].apply(type))
0 <class 'NoneType'>
1 <class 'pandas._libs.tslibs.period.Period'>
2 <class 'pandas._libs.tslibs.period.Period'>
3 <class 'pandas._libs.tslibs.period.Period'>
4 <class 'pandas._libs.tslibs.period.Period'>
5 <class 'pandas._libs.tslibs.period.Period'>
6 <class 'pandas._libs.tslibs.period.Period'>
7 <class 'pandas._libs.tslibs.period.Period'>
8 <class 'str'>
Name: booking_date, dtype: object
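A compact, self-contained sketch of the np.where variant on made-up data. The sample values are assumptions, and the result is kept in a separate object Series (named mixed here) rather than written back to the frame.

```python
import numpy as np
import pandas as pd

# Hypothetical column mixing a missing value, date strings, and a label.
df = pd.DataFrame({"booking_date": [None, "2017-01-15", "2018-07-02", "Total"]})

# Unparseable entries become NaT in the period conversion.
new = pd.to_datetime(df["booking_date"], errors="coerce").dt.to_period("M")

# Keep the original value where conversion failed, the period otherwise.
mixed = pd.Series(np.where(new.isna(), df["booking_date"], new), dtype=object)

kinds = [type(v).__name__ for v in mixed]
print(kinds)
```

Because the result has object dtype, the .dt accessor and period arithmetic are not available on it as a whole.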