converting currency with $ to numbers in Python pandas
@EdChum's answer is clever and works well. But since there's more than one way to bake a cake, why not use a regex? For example:
df[df.columns[1:]] = df[df.columns[1:]].replace(r'[\$,]', '', regex=True).astype(float)
To me, that is a little bit more readable.
Convert a column of dollar values in a dataframe to integer
Use Series.replace, then convert to floats with Series.astype:
df2.PRICE = df2.PRICE.replace(r'[\$,]', '', regex=True).astype(float)
print (df2)
PRICE
0 179000.0
1 110000.0
2 275000.0
3 140000.0
4 180000.0
564611 85500.0
564612 80800.0
564613 74500.0
564614 75900.0
564615 66700.0
If the values are always integers:
df2.PRICE = df2.PRICE.replace(r'[\$,]', '', regex=True).astype(float).astype(int)
print (df2)
PRICE
0 179000
1 110000
2 275000
3 140000
4 180000
564611 85500
564612 80800
564613 74500
564614 75900
564615 66700
If converting to floats fails, use to_numeric with errors='coerce', which turns values that cannot be converted to a number into missing values:
df2.PRICE = pd.to_numeric(df2.PRICE.replace(r'[\$,]', '', regex=True), errors='coerce')
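As a small illustration of the errors='coerce' behaviour, here is a minimal sketch on a made-up Series (the sample values are assumptions, not data from the question):

```python
import pandas as pd

# Hypothetical sample: the last entry cannot be parsed as a number.
prices = pd.Series(["$179,000", "$1,250.50", "N/A"])

# Strip the dollar sign and thousands separators first.
cleaned = prices.replace(r"[\$,]", "", regex=True)

# Unparseable entries become NaN instead of raising an error.
numeric = pd.to_numeric(cleaned, errors="coerce")
print(numeric)
```

You can then decide what to do with the NaN rows, for example drop them or inspect them separately.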
How to convert currency in a database using pandas
For the answer below I'll assume that you have a dataframe, df, with columns named currency and amount.
I have cobbled together a demo Jupyter notebook to illustrate the method.
Work out what currencies you have in your dataframe
You'll need an exchange rate for every currency you have in your dataframe, so you need to know what currencies you have.
currencies = df.currency.unique().tolist()
currencies = dict.fromkeys(currencies, pd.NA)
Define an exchange rate for every currency
Exchange rates vary over time, and can vary depending on who you ask, so you'll need to define a set exchange rate to use. You can define these yourself manually:
currencies['CAD'] = 1.23
currencies['GBP'] = 0.72
Alternatively, as It_is_Chris pointed out, you could use the CurrencyConverter library to source these automatically:
from currency_converter import CurrencyConverter
c = CurrencyConverter()
for key in currencies:
    try:
        currencies[key] = c.convert(1, 'USD', key)
    except Exception:
        # no rate available for this currency; keep the pd.NA placeholder
        pass
Convert the currencies in your dataframe
Try to avoid looping through pandas dataframes; the built-in methods are much faster. In this case you can use apply():
df['amount_conv'] = df.apply(lambda x: x.amount / currencies[x.currency], axis=1)
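If the row-wise apply() turns out to be slow on a large dataframe, one possible alternative is to look up each row's rate with Series.map and divide the columns directly. This is a sketch on made-up data; the rates and amounts below are illustrative assumptions:

```python
import pandas as pd

# Hypothetical data; the rates are units of each currency per 1 USD.
df = pd.DataFrame({
    "currency": ["USD", "CAD", "GBP"],
    "amount": [100.0, 123.0, 72.0],
})
currencies = {"USD": 1.0, "CAD": 1.23, "GBP": 0.72}

# Look up the rate for every row in one pass, then divide column-wise.
rates = df["currency"].map(currencies)
df["amount_conv"] = df["amount"] / rates
print(df)
```

Currencies missing from the dictionary map to NaN, so unconverted rows are easy to spot afterwards.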
Pandas DataFrame: Converting between currencies
Let's try replace and pd.eval:
df["price"].replace({"SEK": "*1", "EUR": "*10"}, regex=True).map(pd.eval)
Output:
0 42000
1 12000
2 22000
Name: price, dtype: int64
This works nicely assuming you have no NaNs, and that there are only two currencies with one of them needing conversion. If you do have NaNs, fill them first. Finally, assign this back to the column to update the DataFrame.
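If you have NaNs or more than two currencies, a different approach (not the replace/pd.eval one above) is to split each value into an amount and a currency code with str.extract and multiply by a rate table. The input format ("42000 SEK") and the rates here are assumptions for illustration only:

```python
import pandas as pd

# Hypothetical input: "<amount> <3-letter code>", with one missing value.
prices = pd.Series(["42000 SEK", "1200 EUR", None, "2200 EUR"])

# Illustrative multipliers, matching the "*1" / "*10" used above.
rates = {"SEK": 1, "EUR": 10}

# Named groups become the columns "amount" and "code".
parts = prices.str.extract(r"(?P<amount>[\d.]+)\s*(?P<code>[A-Z]{3})")

# Missing inputs stay NaN all the way through.
converted = parts["amount"].astype(float) * parts["code"].map(rates)
print(converted)
```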
Pandas dataframe currency to numeric
You are trying to replace , with . but the resulting string cannot be converted to float. For example, 2.553.00 contains two dots, and converting it to float throws an exception.
Change the code to:
data['Gross'] = data['Gross'].str.replace(r'[$,]', '', regex=True).astype('float').fillna(0.0)
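To see the difference, here is a minimal sketch with made-up values (the sample column is an assumption): replacing the comma with a dot produces an unparseable string, while removing it parses cleanly.

```python
import pandas as pd

# Hypothetical sample with one missing value.
data = pd.DataFrame({"Gross": ["$2,553.00", "$120.50", float("nan")]})

# Replacing "," with "." yields two dots, which float() rejects.
wrong = "$2,553.00".replace("$", "").replace(",", ".")
try:
    float(wrong)
    failed = False
except ValueError:
    failed = True

# Removing "$" and "," instead gives a parseable number; fill NaN last.
stripped = data["Gross"].str.replace(r"[$,]", "", regex=True)
data["Gross"] = stripped.astype(float).fillna(0.0)
print(data)
```

Note that regex=True is passed explicitly, since recent pandas versions treat the pattern in str.replace as a literal string by default.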
Python Pandas df, best way to replace $, M and K in currency amount to change to int
Updated Solution:
New solution: using .replace() and astype() only, without relying on pd.eval for formula evaluation:
You can translate M and K to the corresponding magnitudes in exponential format:
K is converted to e+03 in scientific notation
M is converted to e+06 in scientific notation
(this supports integers as well as floats with any number of decimal places)
Then, convert the text in scientific notation to float type, followed by casting to integer for final required format, as follows:
df['Value'] = df['Value'].replace({'€': '', ' ': '', 'M': 'e+06', 'K': 'e+03'}, regex=True).astype(float).astype(int)
Input data:
Value
0 €8.5M
1 €0
2 €9.5M
3 €2M
4 €21M
16534 €1.8M
16535 €1.1M
16536 €550K
16537 €650K
16538 €1.1M
Output:
print(df)
Value
0 8500000
1 0
2 9500000
3 2000000
4 21000000
16534 1800000
16535 1100000
16536 550000
16537 650000
16538 1100000
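To make the intermediate step visible, this sketch runs the same replacement on a few of the sample values and keeps the scientific-notation text before casting (the variable names are just for illustration):

```python
import pandas as pd

df = pd.DataFrame({"Value": ["€8.5M", "€0", "€550K", "€1.1M"]})

# Each key is treated as a regex and replaced inside the strings.
suffixes = {"€": "", " ": "", "M": "e+06", "K": "e+03"}
as_text = df["Value"].replace(suffixes, regex=True)
print(as_text.tolist())

# float() understands scientific notation, so the cast does the arithmetic.
df["Value"] = as_text.astype(float).astype(int)
print(df)
```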
Old Solution:
You can convert M and K to formulas and then use pd.eval to evaluate the numeric values.
K is converted to the formula * 1000
M is converted to the formula * 1000000
This way the base values can have any number of decimal places (or none at all), and the formulas still evaluate to the correct results.
df['Value'] = df['Value'].str.replace('€', '')
df['Value'] = df['Value'].str.replace('M', ' * 1000000')
df['Value'] = df['Value'].str.replace('K', ' * 1000')
df['Value'] = df['Value'].map(pd.eval).astype(int)
Or simplified code in one line, thanks to @MustafaAydın's suggestion:
df['Value'] = df['Value'].replace({"€": "", "M": "*1E6", "K": "*1E3"}, regex=True).map(pd.eval).astype(int)
Result:
print(df)
Value
0 8500000
1 0
2 9500000
3 2000000
4 21000000
16534 1800000
16535 1100000
16536 550000
16537 650000
16538 1100000
With the input sample data as follows:
Value
0 €8.5M
1 €0
2 €9.5M
3 €2M
4 €21M
16534 €1.8M
16535 €1.1M
16536 €550K
16537 €650K
16538 €1.1M
Before the last step, we got:
Value
0 8.5 * 1000000
1 0
2 9.5 * 1000000
3 2 * 1000000
4 21 * 1000000
16534 1.8 * 1000000
16535 1.1 * 1000000
16536 550 * 1000
16537 650 * 1000
16538 1.1 * 1000000
Then we feed it to pd.eval, which evaluates each formula to a numeric value (a float) that we can further cast to integer.
Convert currency to float (and parentheses indicate negative amounts)
Just add ) to the existing command, and then convert ( to - to make numbers in parentheses negative. Then convert to float.
(df['Currency'].replace(r'[\$,)]', '', regex=True)
 .replace(r'[(]', '-', regex=True).astype(float))
Currency
0 1
1 2000
2 -3000
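For a self-contained version you can run, here is the same idea on a made-up three-row column (the input strings are assumptions chosen to match the output above):

```python
import pandas as pd

# Hypothetical input; parentheses mark a negative amount.
df = pd.DataFrame({"Currency": ["$1", "$2,000", "($3,000)"]})

# Drop "$", "," and the closing parenthesis.
no_symbols = df["Currency"].replace(r"[\$,)]", "", regex=True)

# Turn the opening parenthesis into a minus sign, then cast.
df["Currency"] = no_symbols.replace(r"\(", "-", regex=True).astype(float)
print(df)
```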