Converting Currency with $ to Numbers in Python Pandas

@EdChum's answer is clever and works well. But since there's more than one way to bake a cake, why not use a regex? For example:

df[df.columns[1:]] = df[df.columns[1:]].replace(r'[\$,]', '', regex=True).astype(float)

To me, that is a little bit more readable.
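Here is a minimal, self-contained sketch of that one-liner. The column names and values are made up for illustration; the only assumption carried over from the answer is that the first column is a label and every other column holds currency strings.

```python
import pandas as pd

# Hypothetical sample data: the first column is a label, the rest hold currency strings.
df = pd.DataFrame({
    "state": ["California", "New York"],
    "1st": ["$11,593,820", "$10,861,680"],
    "2nd": ["$109,264,246", "$45,336,041"],
})

# Strip "$" and "," from every column except the first, then cast to float.
money_cols = df.columns[1:]
df[money_cols] = df[money_cols].replace(r"[\$,]", "", regex=True).astype(float)

print(df.dtypes)
```

The raw string (r"...") keeps Python from treating \$ as an escape sequence.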

Convert a column of dollar values in a dataframe to integer

Use Series.replace to strip the symbols, then convert to floats with Series.astype:

df2.PRICE = df2.PRICE.replace(r'[\$,]', '', regex=True).astype(float)
print(df2)
           PRICE
0       179000.0
1       110000.0
2       275000.0
3       140000.0
4       180000.0
564611   85500.0
564612   80800.0
564613   74500.0
564614   75900.0
564615   66700.0

If the values are always integers:

df2.PRICE = df2.PRICE.replace(r'[\$,]', '', regex=True).astype(float).astype(int)
print(df2)
         PRICE
0       179000
1       110000
2       275000
3       140000
4       180000
564611   85500
564612   80800
564613   74500
564614   75900
564615   66700

If the conversion to float fails, use to_numeric with errors='coerce'; values that cannot be parsed as numbers become NaN:

df2.PRICE = pd.to_numeric(df2.PRICE.replace(r'[\$,]', '', regex=True), errors='coerce')
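To see what errors='coerce' does, here is a small sketch with a made-up PRICE column that contains one entry that is not a number. The sample values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical PRICE column with one value that cannot be parsed as a number.
df2 = pd.DataFrame({"PRICE": ["$179,000", "$110,000", "unknown"]})

# Remove "$" and "," first, then let to_numeric turn anything unparseable into NaN.
cleaned = df2["PRICE"].replace(r"[\$,]", "", regex=True)
df2["PRICE"] = pd.to_numeric(cleaned, errors="coerce")

print(df2)
```

Because NaN is a float, the resulting column is float even though the valid prices are whole numbers.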

How to convert currency in a database using pandas

For the answer below I'll assume that you have a dataframe, df, with columns named currency and amount.

I have cobbled together a demo jupyter notebook to illustrate the method.

  1. Work out what currencies you have in your dataframe

    You'll need an exchange rate for every currency you have in your dataframe, so you need to know what currencies you have.

    currencies = df.currency.unique().tolist()
    currencies = dict.fromkeys(currencies, pd.NA)

  2. Define an exchange rate for every currency

    Exchange rates vary over time, and can vary depending on who you ask, so you'll need to define a set exchange rate to use. You can define these yourself manually:

    currencies['CAD'] = 1.23
    currencies['GBP'] = 0.72

    Alternatively, as It_is_Chris pointed out, you could use the CurrencyConverter library to look these up automatically:

    from currency_converter import CurrencyConverter
    c = CurrencyConverter()
    for key in currencies:
        try:
            currencies[key] = c.convert(1, 'USD', key)
        except Exception:
            # leave pd.NA in place for currencies the library cannot convert
            pass

  3. Convert the currencies in your dataframe

    Try to avoid writing explicit loops over pandas dataframes; the built-in methods are usually faster. In this case you can use apply():

    df['amount_conv'] = df.apply(lambda x: x.amount / currencies[x.currency], axis=1 )
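As an alternative to apply() with axis=1, which still calls the lambda once per row, the same division can be done on whole columns. This is a sketch, not part of the original answer: the sample rows are made up, and the rates are assumed to be units of each currency per 1 USD, as in step 2.

```python
import pandas as pd

# Hypothetical data; rates are units of each currency per 1 USD.
df = pd.DataFrame({
    "currency": ["USD", "CAD", "GBP"],
    "amount": [100.0, 123.0, 72.0],
})
currencies = {"USD": 1.0, "CAD": 1.23, "GBP": 0.72}

# Look up each row's rate with map(), then divide the whole column at once.
rates = df["currency"].map(currencies)
df["amount_conv"] = df["amount"] / rates

print(df)
```

A currency code that is missing from the dictionary maps to NaN, so the converted amount for that row is NaN as well rather than raising an error.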

Pandas DataFrame: Converting between currencies

Let's try replace and pd.eval:

df["price"].replace({"SEK": "*1", "EUR": "*10"}, regex=True).map(pd.eval)

Output:

0    42000
1    12000
2    22000
Name: price, dtype: int64

This works nicely assuming you have no NaNs, and that there are only two currencies with one of them needing conversion. If you do have NaNs, fill them first. Finally, assign this back to the column to update the DataFrame.
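If you would rather not evaluate strings, one possible alternative (a different technique from the replace and pd.eval approach above) is to split each value into an amount and a currency code and multiply by a lookup table. The input format "amount, space, three-letter code" and the factors are assumptions made for this sketch.

```python
import pandas as pd

# Hypothetical input: an amount followed by a three-letter currency code.
df = pd.DataFrame({"price": ["42000 SEK", "1200 EUR", "22000 SEK"]})

# Hypothetical conversion factors into the target currency.
factors = {"SEK": 1, "EUR": 10}

# Split each string into a numeric part and a currency code.
parts = df["price"].str.extract(r"^\s*(?P<amount>[\d.]+)\s*(?P<code>[A-Z]{3})\s*$")

# Multiply the amount by the factor that belongs to its currency.
df["price"] = parts["amount"].astype(float) * parts["code"].map(factors)

print(df)
```

This version handles any number of currencies, and rows that do not match the pattern come out as NaN instead of raising inside pd.eval.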

Pandas dataframe currency to numeric

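You are trying to replace , with . but the resulting string cannot be converted to float. For example, 2.553.00 contains two dots, so converting it to float raises an exception.

Change the code to remove $ and , instead. Pass regex=True explicitly, because recent pandas versions treat the str.replace pattern as a literal string by default, and fill missing values after the conversion, because .str methods return NaN for non-string entries:

data['Gross'] = data['Gross'].str.replace('[$,]', '', regex=True).astype('float').fillna(0.0)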
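The sketch below reproduces the failure and the fix on made-up data. The Gross values are assumptions for illustration; the point is only that swapping , for . produces a string with two dots, while removing , produces a valid number.

```python
import pandas as pd

# Hypothetical Gross column in US format, with one missing value.
data = pd.DataFrame({"Gross": ["$2,553.00", "$120.50", None]})

# Swapping "," for "." yields "2.553.00", which float() rejects.
try:
    float("$2,553.00".replace("$", "").replace(",", "."))
except ValueError as exc:
    print("conversion failed:", exc)

# Removing "$" and "," instead leaves a valid number.
gross = data["Gross"].str.replace(r"[$,]", "", regex=True).astype(float)

# Fill missing values once the column is numeric.
data["Gross"] = gross.fillna(0.0)

print(data)
```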

Python Pandas df, best way to replace $, M and K in currency amount to change to int

Updated Solution:

New solution: Using .replace() and astype() only.

Without relying on pd.eval for formula evaluation:

You can translate M and K to the corresponding magnitudes in exponential format:

K is converted to e+03 in scientific notation

M is converted to e+06 in scientific notation

(this supports integers as well as floats with any number of decimal places)

Then convert the text in scientific notation to float, and cast to integer for the final required format, as follows:

df['Value'] = df['Value'].replace({'€': '', ' ': '', 'M': 'e+06', 'K': 'e+03'}, regex=True).astype(float).astype(int)

Input data:

       Value
0      €8.5M
1         €0
2      €9.5M
3        €2M
4       €21M
16534  €1.8M
16535  €1.1M
16536  €550K
16537  €650K
16538  €1.1M

Output:

print(df)

          Value
0       8500000
1             0
2       9500000
3       2000000
4      21000000
16534   1800000
16535   1100000
16536    550000
16537    650000
16538   1100000
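To make the intermediate step visible, here is a small sketch that runs the same replacement on a few of the sample rows and prints the scientific-notation strings before they are cast. The variable name as_text is introduced here only for illustration.

```python
import pandas as pd

# A few rows in the same format as the sample input.
df = pd.DataFrame({"Value": ["€8.5M", "€0", "€550K", "€1.1M"]})

# "M" -> "e+06" and "K" -> "e+03" turn each string into scientific notation.
as_text = df["Value"].replace(
    {"€": "", " ": "", "M": "e+06", "K": "e+03"}, regex=True
)
print(as_text.tolist())

# Strings in scientific notation parse directly as floats, then cast to int.
df["Value"] = as_text.astype(float).astype(int)
print(df)
```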

Old Solution:

You can convert M, K to formula and then use pd.eval to evaluate the numeric values.

K converted to formula * 1000

M converted to formula * 1000000

This way the base value can have any number of decimal places, or none at all: the formula gives the correct result however long the fractional part is.

df['Value'] = df['Value'].str.replace('€', '')
df['Value'] = df['Value'].str.replace('M', ' * 1000000')
df['Value'] = df['Value'].str.replace('K', ' * 1000')
df['Value'] = df['Value'].map(pd.eval).astype(int)

Or simplified code in one line, thanks to @MustafaAydın's suggestion:

df['Value'] = df['Value'].replace({"€": "", "M": "*1E6", "K": "*1E3"}, regex=True).map(pd.eval).astype(int)

Result:

print(df)

          Value
0       8500000
1             0
2       9500000
3       2000000
4      21000000
16534   1800000
16535   1100000
16536    550000
16537    650000
16538   1100000

With the input sample data as follows:

       Value
0      €8.5M
1         €0
2      €9.5M
3        €2M
4       €21M
16534  €1.8M
16535  €1.1M
16536  €550K
16537  €650K
16538  €1.1M

Before the last step, we got:

               Value
0      8.5 * 1000000
1                  0
2      9.5 * 1000000
3        2 * 1000000
4       21 * 1000000
16534  1.8 * 1000000
16535  1.1 * 1000000
16536     550 * 1000
16537     650 * 1000
16538  1.1 * 1000000

Then we feed each string to pd.eval, which evaluates the formula to a numeric value that we can then cast to integer.

Convert currency to float (and parentheses indicate negative amounts)

Just add ) to the existing command, and then convert ( to - to make numbers in parentheses negative. Then convert to float.

(df['Currency'].replace(r'[\$,)]', '', regex=True)
               .replace(r'[(]', '-', regex=True).astype(float))

   Currency
0         1
1      2000
2     -3000
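A runnable version of the same chain is sketched below. The three input strings are assumptions chosen to produce the values shown above; the answer itself does not show its input.

```python
import pandas as pd

# Hypothetical accounting-style strings; parentheses mark negative amounts.
df = pd.DataFrame({"Currency": ["$1", "$2,000", "($3,000)"]})

df["Currency"] = (
    df["Currency"]
    .replace(r"[\$,)]", "", regex=True)  # drop "$", "," and the closing ")"
    .replace(r"[(]", "-", regex=True)    # turn the opening "(" into a minus sign
    .astype(float)
)

print(df)
```

The order of the two replacements does not matter here, since they touch different characters.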

