Convert object data type to string issue in python
object is the default dtype for columns that hold strings or any mix of Python types. If you are using a version of pandas older than 1.0.0, it is your only option. From pandas 1.0.0 onward you can use the pd.StringDtype() extension dtype instead. It was introduced as experimental, so its behavior may change in future versions; use it at your own risk.
df.dtypes
#country object
# .astype(str) and .astype('str') keep the column as object.
df['country'] = df['country'].astype(str)
df.dtypes
#country object
df['country'] = df['country'].astype(pd.StringDtype())
df.dtypes
#country string
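A minimal sketch of the version check described above, assuming a column of strings that may contain missing values. The helper name to_text_dtype and the sample values are hypothetical. It parses the version numerically, because comparing version strings directly is lexicographic and would misorder, for example, '10.0.0' and '2.0.0'.

```python
import pandas as pd


def to_text_dtype(series):
    """Hypothetical helper: use the string dtype when this pandas provides it."""
    # Parse "major.minor" numerically instead of comparing version strings.
    major, minor = (int(part) for part in pd.__version__.split(".")[:2])
    if (major, minor) >= (1, 0):
        return series.astype("string")
    # Older pandas: object is the only dtype that can hold text.
    return series.astype(object)


# Made-up sample data with one missing value.
country = pd.Series(["US", "FR", None], dtype=object)
converted = to_text_dtype(country)
print(converted.dtype)
print(converted.isna().tolist())
```

With the string dtype, the missing entry stays missing instead of being turned into text.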
Pandas: convert object to str
I realized that object is not a problem; it is the dtype that pandas uses for string or mixed-type columns (https://pbpython.com/pandas_dtypes.html). More precisely:
Pandas dtype | Python type | NumPy type | Usage |
---|---|---|---|
object | str or mixed | string_, unicode_, mixed types | Text or mixed numeric and non-numeric values |
int64 | int | int_, int8, int16, int32, int64, uint8, uint16, uint32, uint64 | Integer numbers |
float64 | float | float_, float16, float32, float64 | Floating point numbers |
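A small sketch of how these dtypes show up in practice, using made-up values. A Series that mixes text and numbers is the case the table calls "mixed" and is stored as object.

```python
import pandas as pd

ints = pd.Series([1, 2, 3])
floats = pd.Series([1.5, 2.0])
mixed = pd.Series(["a", 1, 2.5])  # text mixed with numbers

print(ints.dtype)    # int64
print(floats.dtype)  # float64
print(mixed.dtype)   # object
```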
Pandas: change data type of Series to String
A new answer to reflect the most current practices: as of now (v1.2.4), neither astype('str') nor astype(str) converts a column to the string dtype; both leave it as object.
As per the documentation, a Series can be converted to the string datatype in the following ways:
df['id'] = df['id'].astype("string")
df['id'] = pandas.Series(df['id'], dtype="string")
df['id'] = pandas.Series(df['id'], dtype=pandas.StringDtype())
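A self-contained sketch comparing the three spellings, assuming pandas 1.0 or later and a made-up id column that already holds text. Note that it passes a StringDtype instance rather than the class.

```python
import pandas as pd

# Made-up ids stored as a plain object column.
ids = pd.Series(["a1", "b2", "c3"], dtype=object)

via_astype = ids.astype("string")
via_alias = pd.Series(ids, dtype="string")
via_instance = pd.Series(ids, dtype=pd.StringDtype())

# All three should report the string extension dtype.
for result in (via_astype, via_alias, via_instance):
    print(result.dtype)
```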
Cannot convert pandas column to string
That is how pandas defines the column type: by default there is no separate string column type, so a column of strings is reported as object. Each element is still a Python str:
df.column1.apply(type)
0 <class 'str'>
1 <class 'str'>
2 <class 'str'>
3 <class 'str'>
4 <class 'str'>
5 <class 'str'>
Name: column1, dtype: object
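As a complement to checking each element's type, the sketch below uses pd.api.types.infer_dtype to ask what an object column actually contains. The sample values are made up.

```python
import pandas as pd

all_text = pd.Series(["a", "b"], dtype=object)
mixed = pd.Series(["a", 1], dtype=object)

# The column dtype is object in both cases; infer_dtype inspects the values.
print(all_text.dtype, pd.api.types.infer_dtype(all_text))
print(mixed.dtype, pd.api.types.infer_dtype(mixed))

# Equivalent element-wise check in plain Python.
print(all(isinstance(value, str) for value in all_text))
```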
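DataFrame does not have str.replace
The .str accessor exists on a Series, not on a DataFrame. Either replace values across the whole DataFrame:
df.replace({'...':'...'})
Or call str.replace on a single column:
df['column1'] = df['column1'].str.replace('...', '...')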
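A sketch of both options with made-up column names and values. Passing a dict to DataFrame.replace swaps whole cell values, while Series.str.replace rewrites substrings inside each string of one column.

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({"column1": ["a-b", "c-d"], "column2": ["x", "y"]})

# A DataFrame itself has no .str accessor.
print(hasattr(df, "str"))

# Substring replacement on one column (literal match, not a regex).
df["column1"] = df["column1"].str.replace("-", "_", regex=False)

# Whole-value replacement across the DataFrame; returns a new DataFrame.
replaced = df.replace({"x": "z"})
print(replaced)
```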
Pandas converting dtype object to string
Converting your dates to datetime lets you easily compare a user-supplied date with the dates in your data.
import numpy as np
import pandas as pd

# Load the data
dt = pd.read_csv('data/Tesla.csv')
# Convert the 'Date' column to datetime
dt['Date'] = pd.to_datetime(dt['Date'])
#Find a Date using strings
np.where(dt['Date']=='2014-02-28')
#returns (array([0]),)
np.where(dt['Date']=='2014-02-21')
#returns (array([5]),)
#To get the entire row's information
index = np.where(dt['Date']=='2014-02-21')[0][0]
dt.iloc[index]
#returns:
Date 2014-02-21 00:00:00
Open 211.64
High 213.98
Low 209.19
Close 209.6
Volume 7818800
Adj Close 209.6
Name: 5, dtype: object
So if you want to loop over several dates, create a list or NumPy array of date strings and iterate through it, substituting each date into the lookup:
dates = np.array(['2014-02-21', '2014-02-28'])
for date in dates:
    index = np.where(dt['Date'] == date)[0][0]
    print(dt.iloc[index])
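As an alternative to looping with np.where, the sketch below selects all matching rows at once with a boolean mask. This is a swapped-in technique, not the approach above, and it uses a small in-memory frame with placeholder Close values instead of data/Tesla.csv.

```python
import pandas as pd

# Placeholder data standing in for the CSV; the Close values are made up.
dt = pd.DataFrame({
    "Date": ["2014-02-28", "2014-02-27", "2014-02-21"],
    "Close": [1.0, 2.0, 3.0],
})
dt["Date"] = pd.to_datetime(dt["Date"])

# Parse the user-supplied dates once, then filter with isin.
wanted = pd.to_datetime(["2014-02-21", "2014-02-28"])
matches = dt[dt["Date"].isin(wanted)]
print(matches)

# A single date can be compared directly after parsing.
single = dt[dt["Date"] == pd.to_datetime("2014-02-21")]
print(single)
```

Rows come back in their original order, and a date that is not present simply yields no row instead of raising an IndexError.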