An Elegant Way to Change Columns Type in Dataframe in R

Elegance in code is somewhat subjective. To me the following is elegant, although it may be less efficient than the OP's code. Since the question asks for elegant code, it is a fair option.

test.data[] <- lapply(test.data, function(x) if(is.integer(x)) as.numeric(x) else x)
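As a quick check of the one-liner above, here is a minimal sketch on a made-up data frame. The columns of test.data below are hypothetical, since the original data is not shown.

```r
# Hypothetical sample data; the original test.data is not shown
test.data <- data.frame(
  id    = 1:3,               # integer
  score = c(1.5, 2.5, 3.5),  # already numeric (double)
  label = c("a", "b", "c"),  # character
  stringsAsFactors = FALSE
)

# Assigning into test.data[] replaces the columns but keeps the data.frame itself
test.data[] <- lapply(test.data, function(x) if (is.integer(x)) as.numeric(x) else x)

# Only the integer column changes class
sapply(test.data, class)
#          id       score       label
#   "numeric"   "numeric" "character"
```

The empty brackets on the left-hand side matter: without them, test.data would be overwritten by a plain list.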

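Another elegant option is dplyr. The original answer used mutate_each() with funs(), both of which are deprecated in current dplyr releases; with dplyr >= 1.0.0 the same conversion is written with across():

library(dplyr)
library(magrittr)  # provides the %<>% assignment pipe
test.data %<>%
  mutate(across(where(is.integer), as.numeric))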

Change the class from factor to numeric of many columns in a data frame

Further to Ramnath's answer, the behaviour you are experiencing is due to as.numeric(x) returning the internal, numeric representation of the factor x at the R level. If you want to preserve the numbers that are the levels of the factor (rather than their internal representation), you need to convert to character via as.character() first, as per Ramnath's example.

Your for loop is just as reasonable as an apply call and might be slightly more readable as to what the intention of the code is. Just change this line:

stats[,i] <- as.numeric(stats[,i])

to read

stats[,i] <- as.numeric(as.character(stats[,i]))

This is FAQ 7.10 in the R FAQ.

HTH
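To make the difference concrete, here is a small sketch with a made-up factor (the vector f is hypothetical, not taken from the question). The last line uses the levels-based form that, as far as I recall, the R FAQ also mentions as a more efficient alternative.

```r
# Hypothetical factor whose levels happen to look like numbers
f <- factor(c("10", "20", "10", "30"))

# Direct conversion returns the internal integer codes, not the labels
codes <- as.numeric(f)
codes
# [1] 1 2 1 3

# Going through character first recovers the values the labels represent
values <- as.numeric(as.character(f))
values
# [1] 10 20 10 30

# Alternative: convert the levels once, then index by the codes
values2 <- as.numeric(levels(f))[as.integer(f)]
```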

How to change data type of column in Data frame to Date from Char

Use

data$Date <- as.Date(data$date, "%m/%d/%Y")

and then to extract month

data$Month <- format(data$Date, "%m")

We can also use lubridate

data$date <- lubridate::mdy(data$date)

and use lubridate::month() to extract the month.

data$month <- lubridate::month(data$date)

and with anytime

data$Date <- anytime::anydate(data$date)
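For reference, a minimal base R sketch of the as.Date() route on made-up input. The data frame and the MonthNum column are hypothetical and only serve the illustration.

```r
# Hypothetical input: dates stored as month/day/year strings
data <- data.frame(date = c("01/15/2020", "12/31/2021"), stringsAsFactors = FALSE)

# Parse the strings; the format must match the input layout
data$Date <- as.Date(data$date, format = "%m/%d/%Y")

# format() returns a zero-padded character month, not a number
data$Month <- format(data$Date, "%m")

# Convert explicitly if a numeric month is needed
data$MonthNum <- as.integer(data$Month)

# A string that does not match the format yields NA rather than an error
bad <- as.Date("not a date", format = "%m/%d/%Y")

str(data)
```

Because mismatches become NA silently, it is worth checking anyNA(data$Date) after the conversion.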

R: change/match data types of common columns between two data frames

Maybe something using match.fun would work:

str(df_1) ## The source classes...
# 'data.frame': 5 obs. of 4 variables:
# $ a: int 4 2 5 9 8
# $ b: Factor w/ 5 levels "a","b","c","d",..: 1 2 3 4 5
# $ c: logi FALSE FALSE TRUE FALSE FALSE
# $ d: int 6 7 8 9 10

str(df_2) ## Before conversion
# 'data.frame': 5 obs. of 5 variables:
# $ a : chr "8" "10" "9" "3" ...
# $ foo: int 1 2 3 4 5
# $ b : chr "a" "b" "c" "d" ...
# $ bar: num 0.294 0.34 0.372 0.459 0.736
# $ c : chr "FALSE" "TRUE" "TRUE" "FALSE" ...

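This is the conversion step. Here common holds the names of the columns present in both data frames; its definition is not shown in the original, and intersect() is one way to build it:

common <- intersect(names(df_2), names(df_1))
df_2[common] <- lapply(common, function(x) {
  match.fun(paste0("as.", class(df_1[[x]])))(df_2[[x]])
})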

str(df_2) ## After conversion
# 'data.frame': 5 obs. of 5 variables:
# $ a : int 8 10 9 3 1
# $ foo: int 1 2 3 4 5
# $ b : Factor w/ 5 levels "a","b","c","d",..: 1 2 3 4 5
# $ bar: num 0.294 0.34 0.372 0.459 0.736
# $ c : logi FALSE TRUE TRUE FALSE FALSE
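A self-contained sketch of the same idea on made-up data follows. The data frames here are hypothetical, and taking only the first element of class() is my own assumption: it guards against columns that carry more than one class (for example POSIXct), where pasting the whole class vector would produce several function names.

```r
# Hypothetical reference data frame with the desired column classes
df_1 <- data.frame(
  a    = c(4L, 2L, 5L),
  c    = c(FALSE, FALSE, TRUE),
  when = as.Date("2020-01-01") + 0:2,
  d    = 6:8
)

# Hypothetical data frame whose shared columns arrived as character
df_2 <- data.frame(
  a    = c("8", "10", "9"),
  foo  = 1:3,
  c    = c("FALSE", "TRUE", "TRUE"),
  when = c("2021-03-01", "2021-03-02", "2021-03-03"),
  stringsAsFactors = FALSE
)

# Columns present in both data frames
common <- intersect(names(df_2), names(df_1))

df_2[common] <- lapply(common, function(x) {
  target <- class(df_1[[x]])[1]                # first class only (assumption)
  match.fun(paste0("as.", target))(df_2[[x]])  # e.g. as.integer, as.logical, as.Date
})

sapply(df_2, class)
```

This only works when a matching as.<class>() function exists and can parse the values as stored; dates, for instance, need to be in a format as.Date() recognises by default.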

Change data types using a list of data type names

An option would be

library(tidyverse)
my_df[] <- map2(my_df, str_c("as.", my_types), ~ get(.y)(.x))

Or in base R

my_df[] <- Map(function(x, y) get(y)(x), my_df, paste0("as.", my_types))

Checking the classes again:

sapply(my_df, class)
# Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# "factor" "character" "numeric" "logical" "character"
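A runnable sketch of the base R variant is shown below. The inputs are assumptions: my_df is taken to be the first rows of iris and my_types is reconstructed from the output above, since neither definition appears in the original. It also swaps get() for match.fun(), which fails early with a clear error if a requested converter is not a function.

```r
# Assumed inputs, reconstructed from the printed classes above
my_df <- head(iris)
my_types <- c("factor", "character", "numeric", "logical", "character")

# Resolve every converter up front; match.fun() errors on unknown names
converters <- lapply(paste0("as.", my_types), match.fun)

# Apply the i-th converter to the i-th column, keeping the data.frame structure
my_df[] <- Map(function(x, f) f(x), my_df, converters)

sapply(my_df, class)
```

The type names must line up with the columns by position, so a length check such as stopifnot(length(my_types) == ncol(my_df)) before converting is a cheap safeguard.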

Changing Class of Column Across Multiple Dataframes

We can collect the datasets into a list with mget() (assuming the objects named in dts already exist in the global environment), loop over that list with map, and convert the 'Name' column to character inside mutate. The _dfr suffix in map_dfr row-binds the results into a single data frame.

library(dplyr)
library(purrr)
out <- map_dfr(mget(dts), ~ .x %>%
  mutate(Name = as.character(Name)))

If many columns differ in class across the datasets, it may be better to convert every column to a single class, bind the rows, and then let type.convert() restore suitable types:

out <- map_dfr(mget(dts), ~ .x %>%
  mutate(across(everything(), as.character)))
out <- type.convert(out, as.is = TRUE)

If the dplyr version is < 1.0.0, use mutate_all

out <- map_dfr(mget(dts), ~ .x %>%
  mutate_all(as.character))
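If dplyr and purrr are not available, roughly the same steps can be sketched in base R. This is a swapped-in alternative rather than the approach above, and the data frames df_a and df_b are made up for the illustration.

```r
# Hypothetical datasets whose Name column differs in class
df_a <- data.frame(Name = factor(c("x", "y")), value = 1:2)
df_b <- data.frame(Name = c("z", "w"), value = 3:4, stringsAsFactors = FALSE)

# Names of the objects to combine
dts <- c("df_a", "df_b")

# Fetch the objects by name into a named list
dfs <- mget(dts)

# Harmonise the class of Name in every data frame
fixed <- lapply(dfs, function(d) {
  d$Name <- as.character(d$Name)
  d
})

# Row-bind the list and drop the automatically generated row names
out <- do.call(rbind, fixed)
rownames(out) <- NULL

str(out)
```

Unlike the dplyr-based row bind, rbind() requires all data frames to have the same column names.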

dplyr change many data types

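You can use the standard evaluation version of mutate_each (which is mutate_each_) to change the column classes. Here l1 and l2 are presumably character vectors of column names (their definitions are not shown). Note that mutate_each_ and funs are deprecated in current dplyr releases.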

dat %>% mutate_each_(funs(factor), l1) %>% mutate_each_(funs(as.numeric), l2)
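With dplyr >= 1.0.0 the same conversion can be sketched with across() and all_of(). The data frame and the contents of l1 and l2 below are hypothetical, chosen only to make the example runnable.

```r
library(dplyr)

# Hypothetical data: every column starts out as character
dat <- data.frame(
  g = c("a", "b", "a"),
  h = c("x", "x", "y"),
  n = c("1", "2", "3"),
  stringsAsFactors = FALSE
)

# Hypothetical column-name vectors: l1 becomes factor, l2 becomes numeric
l1 <- c("g", "h")
l2 <- "n"

dat <- dat %>%
  mutate(across(all_of(l1), factor)) %>%
  mutate(across(all_of(l2), as.numeric))

sapply(dat, class)
```

all_of() makes it explicit that l1 and l2 are external character vectors, and it raises an error if any listed column is missing.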

