Period to String

Python/Pandas - Convert type from pandas period to string

You can use to_series and then convert to string:

print(df)

# Seasonal
#Date
#2014-12 -1.089744
#2015-01 -0.283654
#2015-02 0.158974
#2015-03 0.461538

print(df.index)

#PeriodIndex(['2014-12', '2015-01', '2015-02', '2015-03'],
# dtype='int64', name=u'Date', freq='M')

df.index = df.index.to_series().astype(str)
print(df)

# Seasonal
#Date
#2014-12 -1.089744
#2015-01 -0.283654
#2015-02 0.158974
#2015-03 0.461538

print(df.index)

#Index([u'2014-12', u'2015-01', u'2015-02', u'2015-03'], dtype='object', name=u'Date')
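
For anyone who wants to try this end to end, here is a minimal sketch. The frame is a hypothetical one built from the values printed above, and the names via_series and direct are only for illustration. In recent pandas versions, calling astype(str) on the index itself should give the same labels, so the to_series step looks optional.

```python
import pandas as pd

# Hypothetical frame mirroring the one printed above.
idx = pd.period_range("2014-12", periods=4, freq="M", name="Date")
df = pd.DataFrame(
    {"Seasonal": [-1.089744, -0.283654, 0.158974, 0.461538]}, index=idx
)

# Approach from the answer: go through a Series, then cast to str.
via_series = df.index.to_series().astype(str)

# Shorter alternative: cast the PeriodIndex directly.
direct = df.index.astype(str)

# Replace the PeriodIndex with plain string labels.
df.index = direct
print(df.index.tolist())
```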

Converting dtype: period[M] to string format

Use Series.dt.strftime to convert the Series to strings as the last step:

df["Date_Modified"]= df["Date_Modified"].dt.strftime('%Y-%m')

Or format the dates as strings before the groupby, so that converting to a month period is not necessary at all:

df['Date_Modified'] = pd.to_datetime(df['Collection_End_Date']).dt.strftime('%Y-%m')
df = df.groupby(["Date_Modified", "Entity"]).sum().reset_index()
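
Both options can be compared on a toy frame. This is only a sketch: the data and the Value column are made up, while Collection_End_Date, Date_Modified and Entity follow the names used above.

```python
import pandas as pd

# Hypothetical data; only the column names come from the answer above.
df = pd.DataFrame({
    "Collection_End_Date": ["2021-01-15", "2021-01-20", "2021-02-03"],
    "Entity": ["A", "A", "B"],
    "Value": [1, 2, 3],
})

# Option 1: convert to a month period first, format as text last.
periods = pd.to_datetime(df["Collection_End_Date"]).dt.to_period("M")
as_text = periods.dt.strftime("%Y-%m")

# Option 2: format the datetimes directly and group on the strings.
df["Date_Modified"] = pd.to_datetime(df["Collection_End_Date"]).dt.strftime("%Y-%m")
out = df.groupby(["Date_Modified", "Entity"])["Value"].sum().reset_index()
print(out)
```

Option 2 skips the period dtype entirely, which is fine when the month is only needed as a grouping label.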

Converting period[Q-DEC] column into a dataframe to a string in Python

Just use .astype(str):

df['Quarter'] = df['Quarter'].astype(str)
df.dtypes

Month       int64
Year        int64
Quarter    object
dtype: object

print(df)
   Month  Year Quarter
0      1  2015  2015Q1
1      8  2020  2020Q3
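
A runnable sketch of the same idea, assuming the Quarter column starts out as period[Q-DEC]. The frame is rebuilt here from the values shown above, and before_dtype is just an illustrative name.

```python
import pandas as pd

# Rebuild a frame like the one above with a quarterly period column.
df = pd.DataFrame({"Month": [1, 8], "Year": [2015, 2020]})
df["Quarter"] = pd.PeriodIndex(["2015Q1", "2020Q3"], freq="Q")
before_dtype = str(df["Quarter"].dtype)

# Cast the periods to plain strings.
df["Quarter"] = df["Quarter"].astype(str)
print(before_dtype)
print(df["Quarter"].tolist())
```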

Convert period_range to list of string

You can use PeriodIndex.strftime with '%y%m' to indicate "YYMM" format:

p.strftime('%y%m')
# Index(['1905', '1906', '1907', '1908', '1909', '1910', '1911', '1912', '2001',
# '2002', '2003'],
# dtype='object')
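
Since p is not defined in the snippet, here is a small self-contained version. The range bounds are an assumption, picked so that the labels match the output shown above; tolist() then gives a plain Python list.

```python
import pandas as pd

# Assumed range, chosen to reproduce the labels shown above.
p = pd.period_range("2019-05", "2020-03", freq="M")

# Format each period as YYMM and collect the result in a list.
labels = p.strftime("%y%m").tolist()
print(labels)
```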

How to convert Period string to actual Period type

You can use the pd.PeriodIndex() constructor.

Assume you have the following DF:

In [517]: x
Out[517]:
str_col
0 1971q1
1 1971q2
2 1971q3
3 1971q4
4 1972q1
5 1972q2
6 1972q3
7 1972q4

In [518]: x.dtypes
Out[518]:
str_col object
dtype: object

Let's create a new 'period' column:

In [519]: x['period'] = pd.PeriodIndex(x.str_col, freq='Q')

In [520]: x
Out[520]:
str_col period
0 1971q1 1971Q1
1 1971q2 1971Q2
2 1971q3 1971Q3
3 1971q4 1971Q4
4 1972q1 1972Q1
5 1972q2 1972Q2
6 1972q3 1972Q3
7 1972q4 1972Q4

In [521]: x.dtypes
Out[521]:
str_col object
period object
dtype: object

Now we can do "time algebra", for example let's subtract one quarter from each period:

In [525]: x.period - 1
Out[525]:
0 1970Q4
1 1971Q1
2 1971Q2
3 1971Q3
4 1971Q4
5 1972Q1
6 1972Q2
7 1972Q3
Name: period, dtype: object

Alternatively you can cast the str_col column to regular Pandas/NumPy datetime:

In [527]: pd.to_datetime(x.str_col, errors='coerce')
Out[527]:
0 1971-01-01
1 1971-04-01
2 1971-07-01
3 1971-10-01
4 1972-01-01
5 1972-04-01
6 1972-07-01
7 1972-10-01
Name: str_col, dtype: datetime64[ns]
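
If the period column already exists, another route to timestamps is the .dt.start_time accessor. The sketch below uses a shortened copy of the frame above; previous and starts are illustrative names.

```python
import pandas as pd

# Shortened copy of the frame used above.
x = pd.DataFrame({"str_col": ["1971q1", "1971q2", "1971q3", "1971q4"]})
x["period"] = pd.PeriodIndex(x["str_col"], freq="Q")

# Period arithmetic: shift every value back by one quarter.
previous = x["period"] - 1

# First timestamp covered by each period.
starts = x["period"].dt.start_time

print(previous.astype(str).tolist())
print(starts.dt.strftime("%Y-%m-%d").tolist())
```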

Convert a string column to period in pandas preserving the string

Let's say there is a dummy dataframe, similar to yours:

import pandas as pd

dictionary = {'company': ['Facebook', 'Facebook', 'Facebook_Total', 'Google', 'Google_Total'],
              'date': ['2019-09-14 09:00:08.279000+09:00 Total',
                       '2020-09-14 09:00:08.279000+09:00 Total',
                       '-',
                       '2021-09-14 09:00:08.279000+09:00 Total',
                       '-'],
              'revenue': [10, 20, 30, 40, 50]}
df = pd.DataFrame(dictionary)

I used the re module to strip the trailing ' Total' from the date column as follows:

import re

substring = ' Total'
for i in range(len(df)):
    # df.loc avoids chained assignment, which may not write back to df
    if re.search(substring, df.loc[i, 'date'], flags=re.IGNORECASE):
        df.loc[i, 'date'] = df.loc[i, 'date'].replace(' Total', '')

Then, I used pd.PeriodIndex as follows:

for i in range(len(df)):
    if df.loc[i, 'date'] != '-':
        # Parse the timestamp string and keep only its yearly period
        df.loc[i, 'date'] = pd.PeriodIndex(pd.Series(df.loc[i, 'date']), freq='Y')[0]

for i in range(len(df)):
    if df.loc[i, 'date'] != '-':
        # Turn the Period back into text and restore the suffix
        df.loc[i, 'date'] = str(df.loc[i, 'date']) + ' Total'

The code above returns:

Out[1]: 
company date revenue
0 Facebook 2019 Total 10
1 Facebook 2020 Total 20
2 Facebook_Total - 30
3 Google 2021 Total 40
4 Google_Total - 50
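
The row-by-row loops can probably be replaced by a vectorized version. The sketch below is not the method from the answer: it assumes the year is always the first four characters of the string, which holds for the dummy data but may not for real input. mask and years are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    "company": ["Facebook", "Facebook_Total", "Google"],
    "date": ["2019-09-14 09:00:08.279000+09:00 Total",
             "-",
             "2021-09-14 09:00:08.279000+09:00 Total"],
})

# Rows holding a real date (everything except the '-' placeholder).
mask = df["date"] != "-"

# Assumption: the year is always the first four characters.
years = pd.PeriodIndex(df.loc[mask, "date"].str[:4], freq="Y")

# Write the labels back; placeholder rows are left untouched.
df.loc[mask, "date"] = [f"{p} Total" for p in years]
print(df)
```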

How do I convert Pandas DateTime Quarter object into string

Just call str() on it

>>> p = pd.Period('2001Q1')
>>> str(p)
'2001Q1'
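
When a different layout is needed, Period.strftime with the %q (quarter) directive is an option. A short sketch, with the "Q1 2001" layout picked only as an example:

```python
import pandas as pd

p = pd.Period("2001Q1")

# Default string form of the period.
default_text = str(p)

# Custom layout: %q is the quarter number, %Y the year.
custom_text = p.strftime("Q%q %Y")

print(default_text, custom_text)
```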

How can I convert string of year and quarter number into the period[Q-DEC] datatype in Python?

Convert to the Pandas Period type like this:

import pandas as pd

d = "20151"
# Insert a "Q" before the last digit: "20151" -> "2015Q1"
t = d[:-1] + "Q" + d[-1:]
quarter = pd.Period(t, freq="Q")
print(quarter)

returns:

2015Q1

Alternatively, if you need a PeriodIndex:

values = ["2015Q1", "2015Q2", "2015Q3", "2015Q4"]
index = pd.PeriodIndex(values, freq="Q")
print(index)

Will return:

PeriodIndex(['2015Q1', '2015Q2', '2015Q3', '2015Q4'], dtype='period[Q-DEC]', freq='Q-DEC')
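
The same string trick should carry over to a whole column. A minimal sketch, assuming every value has the form YYYY followed by a single quarter digit; raw, labels and quarters are illustrative names.

```python
import pandas as pd

# Hypothetical column of "YYYYQ" codes.
raw = pd.Series(["20151", "20152", "20203"])

# Insert a "Q" before the last character of every value.
labels = raw.str[:-1] + "Q" + raw.str[-1]

# Parse the labels into quarterly periods.
quarters = pd.Series(pd.PeriodIndex(labels, freq="Q"))
print(quarters.dtype)
print(quarters.astype(str).tolist())
```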

Pandas Date Functionality: Extracting Period Index information as String

Okay so I found an answer while going through the docs. The strftime() method of the PeriodIndex can be used for this:

In[3]: df_qtr.columns.strftime('%YQ%q')
Out[3]: array(['2008Q4', '2009Q1', '2009Q2'], dtype='<U6')

Turns out strftime() is available on datetime, timestamp and period indexes, although the %q (quarter) directive is specific to periods. To know more read here: strftime() and strptime() Behavior
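
Here is a small reproduction with quarterly period columns. The frame contents are made up; only the name df_qtr and the format string come from the snippet above.

```python
import pandas as pd

# Hypothetical frame whose columns are quarterly periods.
cols = pd.period_range("2008Q4", periods=3, freq="Q")
df_qtr = pd.DataFrame([[1, 2, 3]], columns=cols)

# Replace the period columns with their string labels.
df_qtr.columns = list(cols.strftime("%YQ%q"))
print(list(df_qtr.columns))
```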

Convert a column of datetime and strings to period in pandas

If you need Periods only, they cannot be mixed with strings, so values that fail to parse become NaT:

df['booking_date'] = pd.to_datetime(df['booking_date'], errors='coerce').dt.to_period('M')
print (df)
booking_date ... credit debit
0 NaT ... 10185.00 -10185.00
1 2017-01 ... 1796.00 0.00
2 2018-07 ... 7423.20 -11.54
3 2017-04 ... 1704.00 0.00
4 2017-12 ... 1938.60 -1938.60
5 2018-12 ... 1403.47 -102.01
6 2018-01 ... 2028.00 -76.38
7 2019-01 ... 800.00 -256.98
8 NaT ... 10185.00 -10185.00

But it is possible to keep the original values where parsing failed, at the cost of an object column with mixed types:

orig = df['booking_date']

df['booking_date'] = pd.to_datetime(df['booking_date'], errors='coerce').dt.to_period('M')

df.loc[df['booking_date'].isna(), 'booking_date'] = orig
print (df)
booking_date ... credit debit
0 None ... 10185.00 -10185.00
1 2017-01 ... 1796.00 0.00
2 2018-07 ... 7423.20 -11.54
3 2017-04 ... 1704.00 0.00
4 2017-12 ... 1938.60 -1938.60
5 2018-12 ... 1403.47 -102.01
6 2018-01 ... 2028.00 -76.38
7 2019-01 ... 800.00 -256.98
8 Total ... 10185.00 -10185.00

print (df['booking_date'].apply(type))
0 <class 'NoneType'>
1 <class 'pandas._libs.tslibs.period.Period'>
2 <class 'pandas._libs.tslibs.period.Period'>
3 <class 'pandas._libs.tslibs.period.Period'>
4 <class 'pandas._libs.tslibs.period.Period'>
5 <class 'pandas._libs.tslibs.period.Period'>
6 <class 'pandas._libs.tslibs.period.Period'>
7 <class 'pandas._libs.tslibs.period.Period'>
8 <class 'str'>
Name: booking_date, dtype: object


Another option is numpy.where, which gives the same mixed-type result:

import numpy as np

new = pd.to_datetime(df['booking_date'], errors='coerce').dt.to_period('M')

df['booking_date'] = np.where(new.isna(), df['booking_date'], new)
print (df)
booking_date ... credit debit
0 None ... 10185.00 -10185.00
1 2017-01 ... 1796.00 0.00
2 2018-07 ... 7423.20 -11.54
3 2017-04 ... 1704.00 0.00
4 2017-12 ... 1938.60 -1938.60
5 2018-12 ... 1403.47 -102.01
6 2018-01 ... 2028.00 -76.38
7 2019-01 ... 800.00 -256.98
8 Total ... 10185.00 -10185.00

print (df['booking_date'].apply(type))
0 <class 'NoneType'>
1 <class 'pandas._libs.tslibs.period.Period'>
2 <class 'pandas._libs.tslibs.period.Period'>
3 <class 'pandas._libs.tslibs.period.Period'>
4 <class 'pandas._libs.tslibs.period.Period'>
5 <class 'pandas._libs.tslibs.period.Period'>
6 <class 'pandas._libs.tslibs.period.Period'>
7 <class 'pandas._libs.tslibs.period.Period'>
8 <class 'str'>
Name: booking_date, dtype: object
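
A compact way to try the mixed-type idea on its own is Series.where. This is a sketch on made-up data, not the code from the answer; periods and mixed are illustrative names.

```python
import pandas as pd

# Hypothetical column mixing date strings and a label.
s = pd.Series(["2017-01-15", "2018-07-02", "Total"], name="booking_date")

# Unparseable values become NaT before the period conversion.
periods = pd.to_datetime(s, errors="coerce").dt.to_period("M")

# Keep Period objects where parsing worked, the original text elsewhere.
mixed = periods.astype(object).where(periods.notna(), s)
print(mixed.apply(type).tolist())
```

The result is an object column, so period-specific accessors such as .dt are no longer available on it.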

