Convert comma decimal separators to dots within a DataFrame
pandas.read_csv has a decimal parameter for this (see the documentation). For example, try:
df = pd.read_csv(Input, delimiter=";", decimal=",")
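As a minimal sketch of the decimal parameter in action, the snippet below reads a small in-memory sample instead of a file. The sample text and the column names (name, value) are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical sample: semicolon-separated columns, comma as decimal mark
raw = "name;value\na;0,140711\nb;0,0999\n"

# decimal="," tells the parser to read 0,140711 as the float 0.140711
df = pd.read_csv(io.StringIO(raw), delimiter=";", decimal=",")

print(df.dtypes)
print(df)
```

If the value column comes back as float64 rather than object, no further string replacement is needed.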
Replace comma with dot Pandas
You need to assign the result of your operation back, as the operation isn't in place. You can also use apply, or stack and unstack, with the vectorised str.replace to do this more quickly:
In [5]:
df.apply(lambda x: x.str.replace(',','.'))
Out[5]:
1-8 1-7
H0 0.140711 0.140711
H1 0.0999 0.0999
H2 0.001 0.001
H3 0.140711 0.140711
H4 0.140711 0.140711
H5 0.140711 0.140711
H6 0 0
H7 0 0
H8 0.140711 0.140711
H9 0.140711 0.140711
H10 0.140711 0.1125688
H11 0.140711 0.1125688
H12 0.140711 0.1125688
H13 0.140711 0.1125688
H14 0.140711 0.140711
H15 0.140711 0.140711
H16 0.140711 0.140711
H17 0.140711 0.140711
H18 0.140711 0.140711
H19 0.140711 0.140711
H20 0.140711 0.140711
H21 0.140711 0.140711
H22 0.140711 0.140711
H23 0.140711 0.140711
In [4]:
df.stack().str.replace(',','.').unstack()
Out[4]:
1-8 1-7
H0 0.140711 0.140711
H1 0.0999 0.0999
H2 0.001 0.001
H3 0.140711 0.140711
H4 0.140711 0.140711
H5 0.140711 0.140711
H6 0 0
H7 0 0
H8 0.140711 0.140711
H9 0.140711 0.140711
H10 0.140711 0.1125688
H11 0.140711 0.1125688
H12 0.140711 0.1125688
H13 0.140711 0.1125688
H14 0.140711 0.140711
H15 0.140711 0.140711
H16 0.140711 0.140711
H17 0.140711 0.140711
H18 0.140711 0.140711
H19 0.140711 0.140711
H20 0.140711 0.140711
H21 0.140711 0.140711
H22 0.140711 0.140711
H23 0.140711 0.140711
The key thing here is to assign the result back:
df = df.stack().str.replace(',','.').unstack()
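Note that after the replacement the cells are still strings. A small sketch of the full round trip, assuming every cell holds a number written with a decimal comma (the two-row frame below is a cut-down, illustrative version of the table above):

```python
import pandas as pd

# Illustrative subset of the table above, stored as strings with decimal commas
df = pd.DataFrame(
    {"1-8": ["0,140711", "0,0999"], "1-7": ["0,140711", "0,1125688"]},
    index=["H0", "H1"],
)

# Replace the comma in every column, assign the result, then cast to float
converted = df.apply(lambda col: col.str.replace(",", ".", regex=False)).astype(float)

print(converted.dtypes)
```

The astype(float) call will raise if any cell is not a valid number after the replacement, which is a useful early warning about dirty data.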
Loop to convert decimal comma (,) into dot (.) to change class of data.frame columns
Attempt 1: gsub does not modify strings in place; you need to assign the result back to df[,i].
df[,i] <- gsub(",", ".", df[ , i])
Attempt 2: Right idea, but x[nm] gives you a data frame, while gsub takes vectors. It is better to use x[,nm], with the optional drop = TRUE (this is the default). Also, you have the arguments of your function moved around. You want to apply fc over the different values of inx, keeping x = df fixed.
Try:
inx = 1:4
fc <- function(x, inx){
  nm <- names(x)[inx]
  gsub(pattern = ",", replacement = ".", x = x[,nm])
}
sapply(inx, fc, x = df)
This returns a matrix because sapply will try to simplify. If this is not desired, use lapply and wrap it in a data frame.
data.frame(lapply(inx, fc, x = df))
Or do it in one line with an anonymous function. Data frames are fundamentally lists, so you can iterate over the columns with lapply like so:
data.frame(lapply(df, function(x) gsub(",", ".", x, fixed = TRUE)))
data frame with commas as decimal separator
When you read in the .csv file, you can specify the sep and dec parameters based on the file type:
# assuming file uses ; for separating columns and , for decimal point
# Using base functions
read.csv(filename, sep = ";", dec = ",")
# Using data.table
library(data.table)
fread(filename, sep = ";", dec = ",")
You should attempt to address the source of the issue first; regular expressions and other workarounds should be used only if that fails to give the desired result.
replace commas to decimal points in DataFrame columns to make them numeric
import re
import pandas as pd

# Swap the decimal comma for a dot in each column, then convert to numeric
for col in ['b', 'c', 'd']:
    df[col] = pd.to_numeric(df[col].apply(lambda x: re.sub(',', '.', str(x))))
Search and replace dots and commas in pandas dataframe
If possible, the best option is to use the parameters of read_csv:
df = pd.read_csv(file, thousands='.', decimal=',')
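A short sketch of how the two parameters combine, using an in-memory sample rather than a real file. The sample contents, the semicolon separator, and the column names are assumptions made for the example.

```python
import io

import pandas as pd

# Hypothetical sample: dot as thousands separator, comma as decimal mark
raw = "col1;col2\nx;1.234,56\ny;12,5\n"

df = pd.read_csv(io.StringIO(raw), sep=";", thousands=".", decimal=",")

print(df)
print(df.dtypes)
```

Here 1.234,56 should be read as the float 1234.56, so col2 is numeric straight away.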
If that is not possible, then replace should help:
df['col2'] = (df['col2'].replace(r'\.', '', regex=True)
                        .replace(',', '.', regex=True)
                        .astype(float))
Replacing dot with comma from a dataframe using Python
Where does the dataframe come from - how was it generated? Was it imported from a CSV file?
Your code works if you apply it to columns which are strings, as long as you remember to do df = df.apply() and not just df.apply(), e.g.:
import pandas as pd
df = pd.DataFrame()
df['a'] =['some . text', 'some . other . text']
# regex=False makes '.' a literal dot rather than a regex wildcard
df = df.apply(lambda x: x.str.replace('.', ',', regex=False))
print(df)
However, you are trying to do this with numbers, not strings.
To be precise, the other question is: what are the dtypes of your dataframe?
If you type
df.dtypes
what's the output?
I presume your columns are numeric and not strings, right? After all, if they are numbers they should be stored as such in your dataframe.
The next question: how are you exporting this table to Excel?
If you are saving a csv file, pandas' to_csv()
method has a decimal
argument which lets you specify what should be the separator for the decimals (typically, dot in the English-speaking world and comma in many countries in continental Europe). Look up the syntax.
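For instance, a minimal sketch of writing decimal commas with to_csv. The frame and the semicolon column separator are illustrative choices, not taken from the question; a separator other than a comma is needed so the columns stay unambiguous.

```python
import pandas as pd

# Illustrative numeric frame
df = pd.DataFrame({"Open": [1.5, 2.25], "Close": [1.75, 2.5]})

# With no path given, to_csv returns the CSV text as a string
text = df.to_csv(sep=";", decimal=",", index=False)

print(text)
```

The numbers stay floats inside the DataFrame; only the text written out uses the comma.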
If you are using the to_excel() method, it shouldn't matter because Excel should treat it internally as a number, and how it displays it (whether with a dot or comma for decimal separator) will typically depend on the options set in your computer.
Please clarify how you are exporting the data and what happens when you open it in Excel: does Excel treat it as a string? Or as a number, but you would like to see a different separator for the decimals?
Also look here for how to change decimal separators in Excel: https://www.officetooltips.com/excel_2016/tips/change_the_decimal_point_to_a_comma_or_vice_versa.html
UPDATE: OP, you have still not explained where the dataframe comes from. Do you import it from an external source? Do you create or calculate it yourself?
The fact that the columns are objects makes me think they are either stored as strings, or maybe some rows are numeric and some are not.
What happens if you try to convert a column to float?
df['Open'] = df['Open'].astype('float64')
If the entire column should be numeric but it's not, then start by cleansing your data.
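One possible way to start that cleansing, sketched under the assumption that the column holds strings with decimal commas plus a few stray non-numeric entries (the sample values are invented): pd.to_numeric with errors="coerce" turns anything unparseable into NaN, which makes the offending rows easy to list.

```python
import pandas as pd

# Invented sample: two valid numbers with decimal commas and one bad entry
df = pd.DataFrame({"Open": ["1,5", "2,25", "n/a"]})

# Unparseable values become NaN instead of raising an error
as_number = pd.to_numeric(
    df["Open"].str.replace(",", ".", regex=False), errors="coerce"
)

# Rows that could not be converted and need a manual look
bad_rows = df[as_number.isna()]

print(bad_rows)
```

Once bad_rows is empty (or the bad entries are dealt with), as_number can be assigned back to the column.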
Second question: what happens when you use Excel to open the file you have just created? Excel displays a comma, but what character Excel uses to separate decimals depends on the Windows/Mac/Excel settings, not on how pandas created the file. Have you tried the link I gave above? Can you change how Excel displays decimals? Also, does Excel treat those numbers as numbers or as strings?
Convert commas to dots in txt with python that also contains scientific number formatting
You may try reading the entire file into a Python string, and then doing a global replacement of comma to dot:
with open('nums.csv', 'r') as file:
    data = file.read().replace(',', '.').replace(' ', ';')

with open("nums_out.csv", "w") as out_file:
    out_file.write(data)
For a possibly more robust solution, in case two columns are separated by multiple whitespace characters, use re.sub:
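import re

with open('nums.csv', 'r') as file:
    data = file.read().replace(',', '.')

# Strip leading spaces/tabs on every line (Python's re needs fixed-width lookbehinds, so anchor on ^)
data = re.sub(r'^[^\S\r\n]+', '', data, flags=re.MULTILINE)
# Turn each remaining run of spaces/tabs between values into one semicolon
data = re.sub(r'(?<=\S)[^\S\r\n]+', ';', data)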
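The whitespace handling can be checked on a small in-memory string before touching a real file. This is only a sketch: the sample rows are invented, and it anchors on the start of each line with re.MULTILINE instead of a lookbehind, since Python's re module accepts only fixed-width lookbehinds.

```python
import re

# Invented sample: decimal commas, scientific notation, uneven spacing
raw = "  1,5E-03   2,25\n 3,0   4,75E+02\n"

data = raw.replace(",", ".")
# Drop leading spaces/tabs at the start of every line
data = re.sub(r"^[^\S\r\n]+", "", data, flags=re.MULTILINE)
# Collapse each run of spaces/tabs between values into a single semicolon
data = re.sub(r"(?<=\S)[^\S\r\n]+", ";", data)

# Parse the cleaned text to confirm every field is a valid float
rows = [[float(v) for v in line.split(";")] for line in data.splitlines()]

print(data)
print(rows)
```

Scientific notation such as 1.5E-03 is untouched by the replacement because only commas and horizontal whitespace are rewritten.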