How to Convert Entire Dataframe to Numeric While Preserving Decimals

How to convert entire dataframe to numeric while preserving decimals?

You might need to do some checking first. Factors cannot be safely converted directly to numeric: calling as.numeric() on a factor returns its internal integer codes, not the values shown in its labels, so as.character() must be applied first. I would check each column with is.factor() and coerce to numeric only where necessary.

df1[] <- lapply(df1, function(x) {
  if (is.factor(x)) as.numeric(as.character(x)) else x
})
sapply(df1, class)
# a b
# "numeric" "numeric"
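
To see why the as.character() step matters, here is a minimal sketch with a made-up factor (the vector f and its values are illustrative, not taken from the question):

```r
# Illustrative factor holding numbers as labels
f <- factor(c("10.5", "2.25", "10.5"))

# Direct conversion returns the internal level codes, not the values
codes <- as.numeric(f)

# Going through character first recovers the values, decimals included
values <- as.numeric(as.character(f))

# Equivalent route: convert the levels once, then index by the factor
values_via_levels <- as.numeric(levels(f))[f]

print(codes)
print(values)
```

The levels are sorted as text, so "10.5" comes before "2.25" and the codes bear no relation to the numbers themselves.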

How to convert an entire data.frame to numeric

In base R we can do:

df[] <- lapply(df, as.numeric)

or

df[cols_to_convert] <- lapply(df[cols_to_convert], as.numeric)
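
One way to build cols_to_convert, sketched under the assumption that you only want character columns whose non-missing values all parse as numbers. The helper parses_as_numeric and the sample data are hypothetical, made up for illustration:

```r
# Illustrative data: two numeric-looking character columns and one text column
df <- data.frame(id    = c("1", "2", "3"),
                 price = c("2.50", "10.125", NA),
                 label = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: TRUE if x is character and every non-missing
# value converts to a number without producing NA
parses_as_numeric <- function(x) {
  is.character(x) && !anyNA(suppressWarnings(as.numeric(x[!is.na(x)])))
}

cols_to_convert <- names(df)[vapply(df, parses_as_numeric, logical(1))]
df[cols_to_convert] <- lapply(df[cols_to_convert], as.numeric)

print(sapply(df, class))
```

Columns that fail the check (label here) are left untouched instead of being turned into NA with a coercion warning.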

Here's a benchmark of the solutions (ignoring the considerations about factors, so factor columns are converted to their integer codes):

DF <- data.frame(a = 1:10000, b = letters[1:10000],
                 c = seq(as.Date("2004-01-01"), by = "week", len = 10000),
                 stringsAsFactors = TRUE)
DF <- setNames(do.call(cbind, replicate(50, DF, simplify = FALSE)), paste0("V", 1:150))

dim(DF)
# [1] 10000 150

library(dplyr)
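# Each function is named after the author of the answer it comes from
n1tk  <- function(x) data.frame(data.matrix(x))
mm    <- function(x) { x[] <- lapply(x, as.numeric); x }
akrun <- function(x) mutate_all(x, as.numeric)
mo    <- function(x) {
  for (i in seq_along(x)) x[, i] <- as.numeric(x[, i])
  x  # return the converted data frame; a bare for loop returns NULL
}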

microbenchmark::microbenchmark(
  akrun = akrun(DF),
  n1tk = n1tk(DF),
  mo = mo(DF),
  mm = mm(DF)
)

# Unit: milliseconds
#   expr      min        lq       mean    median        uq      max neval
#  akrun 152.9837 177.48150 198.292412 190.38610 206.56800 432.2679   100
#   n1tk  10.8700  14.48015  22.632782  17.43660  21.68520  89.4694   100
#     mo   9.3512  11.41880  15.313889  14.71970  17.66530  37.6390   100
#     mm   4.8294   5.91975   8.906348   7.80095  10.11335  71.2647   100

R data frame: convert all data frame elements from characters to numerics while keeping decimals

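Do you mean something like the following? Note that str2lang() accepts only a single string, so this works as written only for a one-row data frame like the one under "data" below.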

df[] <- lapply(
  df,
  function(x) {
    eval(str2lang(gsub("d", "e", x)))
  }
)

and you will see

> df
  V1    V2   V3    V4     V5   V6
1  1 0.007 0.73 4.165 1438.8 6050

> str(df)
'data.frame':	1 obs. of  6 variables:
 $ V1: num 1
 $ V2: num 0.007
 $ V3: num 0.73
 $ V4: num 4.16
 $ V5: num 1439
 $ V6: num 6050

data

> dput(df)
structure(list(V1 = 1L, V2 = "0.007d0", V3 = "0.73d0", V4 = "4.165d0",
V5 = "1438.8d0", V6 = 6050L), class = "data.frame", row.names = "1")
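
If the data frame has more than one row, a vectorized variant may be easier to work with. The following is a sketch, not the approach from the answer above: it swaps eval(str2lang()) for as.numeric(), and the two-row sample data is made up for illustration.

```r
# Illustrative data: Fortran-style "d" exponents stored as text
df <- data.frame(V1 = c(1L, 2L),
                 V2 = c("0.007d0", "1.5d-3"),
                 V3 = c("0.73d0", "2d2"),
                 stringsAsFactors = FALSE)

# Replace the "d" exponent marker with "e", then convert each column;
# sub() coerces non-character columns (V1 here) to character first
df[] <- lapply(df, function(x) as.numeric(sub("d", "e", x, fixed = TRUE)))

str(df)
```

Because as.numeric() is vectorized, this handles any number of rows and avoids evaluating the data as R code. Keep in mind that str() shows only about 3 significant digits, so values such as 4.165 appearing as 4.16 are a display effect, not lost decimals.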

Dataframe column: Remove quotes, change decimals and turn into numeric

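Using chartr(), which here replaces each double quote with a space and each comma with a decimal point (as.numeric() ignores the surrounding whitespace):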

as.numeric(chartr('",', ' .', test$v1))
# [1] 2.60 1.30 850.00 1000.00 57.25 98.00
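
The same cleanup can be spelled out step by step, which may be easier to adapt. This is a sketch on made-up values, since the original test data is not shown; the column v1_num is a name chosen for illustration.

```r
# Illustrative data: quoted strings with a decimal comma
test <- data.frame(v1 = c('"2,6"', '"1,3"', '"850"', '"57,25"'),
                   stringsAsFactors = FALSE)

# Step 1: drop the literal double quotes
cleaned <- gsub('"', "", test$v1, fixed = TRUE)

# Step 2: turn the decimal comma into a decimal point, then convert
test$v1_num <- as.numeric(sub(",", ".", cleaned, fixed = TRUE))

print(test$v1_num)
```

Neither this nor the chartr() version handles thousands separators; values such as "1.000,50" would need an extra step.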

Decimal places deleted in R

Try setting this option first. The decimals are usually still stored; R just prints 7 significant digits by default:

options(digits = 15)
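
A quick way to check that the decimals are only hidden rather than deleted, without changing the global option. This is a small sketch with an arbitrary value:

```r
x <- as.numeric("1438.123456789")

# Default-style display: 7 significant digits
short <- format(x, digits = 7)

# Same stored value, shown with up to 15 significant digits
full <- format(x, digits = 15)

print(short)
print(full)

# The stored value still carries the full precision
still_precise <- abs(x - 1438.123456789) < 1e-9
```

print(x, digits = 15) works the same way for a single call, so you can inspect a value without affecting how everything else is printed.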

