How to Efficiently Handle European Decimal Separators Using the Pandas read_csv Function

How to efficiently handle European decimal separators using the pandas read_csv function?

You can use the converters keyword argument of read_csv. Given a file /tmp/data.csv like this:

"x","y"
"one","1.234,56"
"two","2.000,00"

you can do:

In [20]: pandas.read_csv('/tmp/data.csv', converters={'y': lambda x: float(x.replace('.','').replace(',','.'))})
Out[20]:
     x        y
0  one  1234.56
1  two  2000.00
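
The lambda above can also be written as a named function, which makes it easier to deal with edge cases such as empty cells. The following is only a minimal sketch: the function name parse_european_number and the extra "three" row with an empty value are illustrative assumptions, not part of the original answer.

```python
import io

import pandas as pd


def parse_european_number(text):
    """Convert a string such as '1.234,56' to the float 1234.56."""
    text = text.strip()
    if not text:
        # Assumption: an empty cell should be treated as a missing value.
        return float("nan")
    # Drop the '.' thousands separators, then turn the ',' decimal mark into '.'.
    return float(text.replace(".", "").replace(",", "."))


# Same layout as /tmp/data.csv, plus one hypothetical row with an empty value.
csv_text = '"x","y"\n"one","1.234,56"\n"two","2.000,00"\n"three",""\n'

df = pd.read_csv(io.StringIO(csv_text), converters={"y": parse_european_number})
print(df)
```

Note that a converter runs once per cell in Python, so on large files the built-in decimal and thousands parameters discussed further down are likely to be faster.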

How to read csv with redundant characters as dataframe?

You just need to pass decimal="," to read_csv:

import pandas as pd
from io import StringIO

file = '''ID,columnA,columnB
A,0,"15,6"
B,"1,2",0
C,0,'''

df = pd.read_csv(StringIO(file), decimal=",")
print(df)

Output:

  ID  columnA  columnB
0  A      0.0     15.6
1  B      1.2      0.0
2  C      0.0      NaN

How to handle decimal separator in float using Pandas?

The pandas read_csv function has a thousands argument, which you can set to ','. Its default is None, meaning no thousands separator is recognized.

df = pd.read_csv('file', thousands=',')
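
To see what the thousands argument changes, the sketch below reads the same text twice, once with the defaults and once with thousands=','. The sample data and the column names item and amount are made up for illustration. Because the comma is also the field delimiter here, the numbers have to be quoted in the file.

```python
import io

import pandas as pd

# Hypothetical sample data; the amounts are quoted because ',' is also the delimiter.
csv_text = 'item,amount\nA,"1,234"\nB,"56,789"\n'

# Default settings: the amounts cannot be parsed as numbers and stay as text.
raw = pd.read_csv(io.StringIO(csv_text))

# With thousands=',' the grouping commas are dropped and the column becomes numeric.
parsed = pd.read_csv(io.StringIO(csv_text), thousands=",")

print(raw["amount"].tolist())
print(parsed["amount"].tolist())
```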

Convert commas decimal separators to dots within a Dataframe

pandas.read_csv has a decimal parameter for this (see the read_csv documentation).

For example, try:

df = pd.read_csv(Input, delimiter=";", decimal=",")

pandas reading CSV data formatted with comma for thousands separator

Pass param thousands=',' to read_csv to read those values as thousands:

In [27]:
import pandas as pd
import io

t="""id;value
0;123,123
1;221,323,330
2;32,001"""
pd.read_csv(io.StringIO(t), thousands=r',', sep=';')

Out[27]:
   id      value
0   0     123123
1   1  221323330
2   2      32001

Pandas: Read csv with quoted values, comma as decimal separator, and period as digit grouping symbol

What about this?

import pandas

table = pandas.read_csv("data.csv", sep=";", decimal=",")

print(table["Amount"][0]) # -36.37
print(type(table["Amount"][0])) # <class 'numpy.float64'>
print(table["Amount"][0] + 36.37) # 0.0

Pandas automatically detects that the column is numeric and converts it to numpy.float64.

Edit:

As @bweber discovered, some values in data.csv had more than three digits before the decimal separator and used '.' as the digit grouping symbol. For those strings to be parsed as numbers, that symbol must be passed to read_csv() via the thousands parameter:

table = pandas.read_csv("data.csv", sep=";", decimal=",", thousands='.')
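
Since data.csv itself is not shown, here is a small self-contained sketch of the same call on in-memory text. The Description column and both rows are invented for illustration; only the Amount column name and the read_csv arguments come from the answer above.

```python
import io

import pandas

# Hypothetical stand-in for data.csv: quoted values, ';' as the field separator,
# ',' as the decimal mark and '.' as the digit grouping symbol.
csv_text = 'Description;Amount\n"Coffee";"-36,37"\n"Rent";"-1.234,56"\n'

table = pandas.read_csv(io.StringIO(csv_text), sep=";", decimal=",", thousands=".")

print(table)
print(table["Amount"].dtype)
```

One thing to watch for: with thousands='.', other columns whose values consist only of digits and dots may also end up being parsed as numbers, so it is worth checking the resulting dtypes.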

How to convert a string in Greek number format to float

You can do it this way:
pandas.read_csv('your_file.csv', decimal=',')

From the doc:

decimal : str, default ‘.’

Character to recognize as decimal point (e.g. use ‘,’ for European data).

https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html

How to convert multiple columns with european numbers (comma as decimal separator) to float?

Read the columns as strings and then use translate:

tt = str.maketrans(',', '.', '.€%')
df.col1 = df.col1.str.translate(tt).astype(float)

PS: you may need to adapt the third argument, which lists the characters to remove.
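
Because the question is about several columns, the same translation table can be applied in a loop. This is a minimal sketch: the DataFrame contents and the column name col2 are made up for illustration. When the data comes from a file, passing dtype=str to read_csv is one way to keep the columns as strings first.

```python
import pandas as pd

# Hypothetical data: European-formatted numbers with currency and percent signs.
df = pd.DataFrame({
    "col1": ["1.234,56€", "2.000,00€"],
    "col2": ["12,5%", "7,25%"],
})

# Map ',' to '.', and delete '.', '€' and '%' (the third argument lists characters to remove).
tt = str.maketrans(",", ".", ".€%")

for col in ["col1", "col2"]:
    df[col] = df[col].str.translate(tt).astype(float)

print(df)
```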
