How to Convert Column With Dtype as Object to String in Pandas Dataframe

Convert object data type to string issue in python

object is the default container capable of holding strings, or any combination of dtypes.

If you are using a version of pandas < '1.0.0' this is your only option. If you are using pd.__version__ >= '1.0.0' then you can use the new experimental pd.StringDtype() dtype. Being experimental, the behavior is subject to change in future versions, so use at your own risk.

df.dtypes
#country object

# .astype(str) and .astype('str') keep the column as object.
df['country'] = df['country'].astype(str)
df.dtypes
#country object

df['country'] = df['country'].astype(pd.StringDtype())
df.dtypes
#country string
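
One practical difference between the two dtypes is how missing values behave. In pandas 1.x, astype(str) typically turns a missing value into literal text such as 'nan' or 'None', while the string dtype keeps it as a missing value. The sketch below uses a made-up three-row country column (the values are assumptions for illustration only) to show the string-dtype side of that:

```python
import pandas as pd

# Hypothetical data: an object column with one missing entry.
df = pd.DataFrame(
    {"country": pd.Series(["France", None, "Japan"], dtype=object)}
)

# Convert to the dedicated string dtype (pandas >= 1.0).
converted = df["country"].astype("string")

print(converted.dtype)
# The missing entry is still reported as missing after conversion.
print(converted.isna().tolist())
```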

Pandas: convert object to str

I realized that object is not a problem; it is the dtype pandas uses for strings or mixed types (https://pbpython.com/pandas_dtypes.html). More precisely:

Pandas dtype | Python type | NumPy type | Usage
object | str or mixed | string_, unicode_, mixed types | Text or mixed numeric and non-numeric values
int64 | int | int_, int8, int16, int32, int64, uint8, uint16, uint32, uint64 | Integer numbers
float64 | float | float_, float16, float32, float64 | Floating point numbers

Pandas: change data type of Series to String

A new answer to reflect current practice: as of now (v1.2.4), neither astype('str') nor astype(str) changes the dtype to string; both leave the column as object.

As per the documentation, a Series can be converted to the string datatype in the following ways:

df['id'] = df['id'].astype("string")

df['id'] = pandas.Series(df['id'], dtype="string")

df['id'] = pandas.Series(df['id'], dtype=pandas.StringDtype())
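
When several columns need converting at once, DataFrame.convert_dtypes (available since pandas 1.0) may be more convenient than converting each column by hand. A minimal sketch, assuming a small hypothetical frame whose 'id' column holds only strings:

```python
import pandas as pd

# Hypothetical frame: 'id' is an object column of strings, 'count' is int64.
df = pd.DataFrame(
    {
        "id": pd.Series(["a1", "b2", "c3"], dtype=object),
        "count": [1, 2, 3],
    }
)

# convert_dtypes picks the best nullable dtype for each column;
# object columns that contain only strings become the string dtype.
converted = df.convert_dtypes()
print(converted.dtypes)
```

Note that this also changes other columns (for example, integers move to a nullable integer dtype), so it is only a good fit when that is acceptable.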

Cannot convert pandas column to string

That is how pandas defines the column type: by default there is no dedicated string column type, so a column of strings is reported as object. The individual values are still Python str:

df.column1.apply(type)
0 <class 'str'>
1 <class 'str'>
2 <class 'str'>
3 <class 'str'>
4 <class 'str'>
5 <class 'str'>
Name: column1, dtype: object
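
Because object can also hold mixed values, it can be useful to check whether every element really is a str before relying on string methods. A rough sketch of that check, using two made-up Series (the names and values are assumptions):

```python
import pandas as pd

# Hypothetical columns: one holds only strings, one holds mixed values.
only_strings = pd.Series(["a", "b", "c"], dtype=object)
mixed = pd.Series(["a", 1, 2.5], dtype=object)

def all_str(series):
    """Return True if every element of the Series is a Python str."""
    return bool(series.apply(lambda value: isinstance(value, str)).all())

print(all_str(only_strings))
print(all_str(mixed))
```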

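DataFrame does not have str.replace

The .str accessor exists on a Series, not on a DataFrame. Use DataFrame.replace on the whole frame:

df.replace({'...': '...'})

Or use .str.replace on a single column:

df['column1'] = df['column1'].str.replace('...', '...')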
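
The two calls are not interchangeable: with a plain dict, DataFrame.replace swaps cells whose entire value matches, whereas Series.str.replace substitutes inside each string. A small sketch with made-up fruit names (the data is an assumption for illustration):

```python
import pandas as pd

# Hypothetical data to contrast the two kinds of replacement.
df = pd.DataFrame({"column1": ["red apple", "apple", "green apple"]})

# DataFrame.replace: only the cell that is exactly 'apple' changes.
whole_cell = df.replace({"apple": "pear"})

# Series.str.replace: every occurrence of the substring changes.
substring = df["column1"].str.replace("apple", "pear", regex=False)

print(whole_cell["column1"].tolist())
print(substring.tolist())
```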

Pandas converting dtype object to string

Converting your dates to datetime lets you easily compare a user-supplied date with the dates in your data.

import numpy as np
import pandas as pd

#Load in the data
dt = pd.read_csv('data/Tesla.csv')

#Change the 'Date' column into DateTime
dt['Date']=pd.to_datetime(dt['Date'])

#Find a Date using strings
np.where(dt['Date']=='2014-02-28')
#returns (array([0]),)

np.where(dt['Date']=='2014-02-21')
#returns (array([5]),)

#To get the entire row's information
index = np.where(dt['Date']=='2014-02-21')[0][0]
dt.iloc[index]

#returns:
Date 2014-02-21 00:00:00
Open 211.64
High 213.98
Low 209.19
Close 209.6
Volume 7818800
Adj Close 209.6
Name: 5, dtype: object

So if you wanted to use a for loop, you could create a list or numpy array of dates, then iterate through it, looking up the row for each date in turn:

dates = np.array(['2014-02-21', '2014-02-28'])
for date in dates:
    # Position of the first row whose Date matches
    index = np.where(dt['Date'] == date)[0][0]
    print(dt.iloc[index])
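
The same lookup can also be written without np.where or a loop, and the dates can be turned back into strings afterwards if that is what is needed. This is only a sketch: the two-row frame and its Close values are placeholders, not the contents of the Tesla.csv file.

```python
import pandas as pd

# Placeholder frame standing in for the CSV; Close values are made up.
dt = pd.DataFrame({"Date": ["2014-02-28", "2014-02-21"], "Close": [1.0, 2.0]})
dt["Date"] = pd.to_datetime(dt["Date"])

# Boolean mask: select the row(s) for one date.
one_day = dt.loc[dt["Date"] == "2014-02-21"]

# isin: select the rows for several dates in one step.
wanted = pd.to_datetime(["2014-02-21", "2014-02-28"])
several_days = dt[dt["Date"].isin(wanted)]

# Format the datetime column back into text.
dt["Date_str"] = dt["Date"].dt.strftime("%Y-%m-%d")
print(dt)
```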

