dplyr change many data types
You can use mutate_each_, the standard-evaluation version of mutate_each, to change the column classes (both are deprecated in current dplyr in favour of across):
dat %>% mutate_each_(funs(factor), l1) %>% mutate_each_(funs(as.numeric), l2)
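For newer dplyr versions, the same idea can be sketched with across and all_of. The data frame and the contents of l1 and l2 below are made up for illustration; the original only tells us they hold the columns to convert, so treating them as character vectors of column names is an assumption.

```r
library(dplyr)

# Hypothetical data and column-name vectors, for illustration only
dat <- data.frame(g = c("a", "b", "a"),
                  h = c("x", "x", "y"),
                  n = c("1", "2", "3"),
                  stringsAsFactors = FALSE)
l1 <- c("g", "h")  # columns to turn into factors
l2 <- "n"          # columns to turn into numbers

out <- dat %>%
  mutate(across(all_of(l1), factor)) %>%
  mutate(across(all_of(l2), as.numeric))

# Inspect the resulting class of each column
sapply(out, class)
```

Here g and h should come back as factors and n as numeric, while the row order and values are unchanged.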
Change the data types of multiple columns through their names in R
Using dplyr
library(dplyr)
df <- df %>%
mutate(across(c(COLB, COlC), as.integer))
Or, if there are many columns and they are in sequence, specify a range with :
df %>%
mutate(across(COLB:COlC, as.integer))
Or, if only the first column needs to be skipped, exclude it with -
df %>%
mutate(across(-COLA, as.integer))
In base R, we can use lapply
nm1 <- names(df)[-1]
df[nm1] <- lapply(df[nm1], as.integer)
It is also possible to do this automatically with type.convert
type.convert(df, as.is = TRUE)
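As a small sketch of what type.convert does, the example below uses a made-up data frame whose columns were all read as character. The column names and values are hypothetical. Note that type.convert returns a new object, so the result has to be assigned.

```r
# Hypothetical data frame in which every column was read as character
df <- data.frame(id = c("1", "2", "3"),
                 score = c("1.5", "2.5", "3.5"),
                 label = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

# as.is = TRUE keeps non-numeric text as character instead of factor
converted <- type.convert(df, as.is = TRUE)

sapply(converted, class)
# id: "integer", score: "numeric", label: "character"
```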
using dplyr can we change to numeric data type only those columns for which data type is integer
With base R, we can do
i1 <- sapply(dat, is.integer)
dat[i1] <- lapply(dat[i1], as.numeric)
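Since the question asks for dplyr, a minimal sketch of the same selection with across and where follows. The data frame is invented for illustration; only the is.integer test and the as.numeric conversion come from the answer above.

```r
library(dplyr)

# Hypothetical data: one integer, one double and one character column
dat <- data.frame(a = 1:3,
                  b = c(1.5, 2.5, 3.5),
                  c = letters[1:3],
                  stringsAsFactors = FALSE)

# where() applies the predicate to each column and keeps those returning TRUE
out <- dat %>%
  mutate(across(where(is.integer), as.numeric))

sapply(out, class)
```

Only column a is touched here; b is already double and c is left as character.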
converting multiple columns from character to numeric format in r
You could try
DF <- data.frame("a" = as.character(0:5),
                 "b" = paste(0:5, ".1", sep = ""),
                 "c" = letters[1:6],
                 stringsAsFactors = FALSE)
# Check columns classes
sapply(DF, class)
# a b c
# "character" "character" "character"
cols.num <- c("a","b")
# lapply keeps each column as its own vector (sapply would return a matrix)
DF[cols.num] <- lapply(DF[cols.num], as.numeric)
sapply(DF, class)
# a b c
# "numeric" "numeric" "character"
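If it is not known in advance which character columns hold numbers, one possible approach is to test each column before converting it. This is only a sketch: parses_as_number is a hypothetical helper written for this example, not a base R function, and it treats a column as numeric only when every non-missing value converts without producing NA.

```r
DF <- data.frame(a = as.character(0:5),
                 b = paste0(0:5, ".1"),
                 c = letters[1:6],
                 stringsAsFactors = FALSE)

# Hypothetical helper: TRUE when every non-missing value in x parses as a number
parses_as_number <- function(x) {
  converted <- suppressWarnings(as.numeric(x))
  all(!is.na(converted[!is.na(x)]))
}

# One logical per column, then convert only the columns that passed
ok <- vapply(DF, parses_as_number, logical(1))
DF[ok] <- lapply(DF[ok], as.numeric)

sapply(DF, class)
```

Columns a and b pass the check and become numeric, while c stays character because letters do not parse as numbers.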
Changing column data types using Mutate_at() in R
Contract Date and Hire Date have different formats. Try:
library(dplyr)
library(lubridate)
data2 %>%
mutate(`Contract Date` = as_date(mdy_hm(`Contract Date`)),
`Hire Date` = mdy(`Hire Date`))
We can also use base R to do this:
transform(data2, `Contract Date` = as.Date(as.POSIXct(`Contract Date`,
                                   format = "%m/%d/%Y %H:%M")),
          `Hire Date` = as.Date(`Hire Date`, "%m/%d/%Y"))
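To see the two formats in isolation, here is a small base R sketch on plain character vectors. The sample strings are invented to match the format strings used above; the real data is not shown in the question. Setting tz = "UTC" is an added assumption so that the date does not shift with the local time zone.

```r
# Hypothetical sample values matching the two formats
contract <- c("3/15/2021 14:30", "12/1/2020 09:05")  # month/day/year hour:minute
hire     <- c("03/15/2021", "12/01/2020")            # month/day/year

# Parse the date-time first, then drop the time of day
contract_date <- as.Date(as.POSIXct(contract,
                                    format = "%m/%d/%Y %H:%M",
                                    tz = "UTC"))
hire_date <- as.Date(hire, format = "%m/%d/%Y")

format(contract_date)
format(hire_date)
```

Both vectors end up as Date objects, so they can be compared or subtracted directly.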
Converting multiple columns to double type in R using dplyr
It would be better to do this with type.convert from base R, which automatically corrects the type based on the values in each column
df1 <- type.convert(df, as.is = TRUE)
In dplyr, it can be done with across, specifying the range of columns either by numeric index
df %>%
mutate(across(2:4, as.numeric))
Or by a range of column names
df %>%
mutate(across(X11:P3, as.numeric))
convert selected columns at once to integer in R in dplyr
You can use across to apply the same function to multiple columns.
library(dplyr)
df %>% mutate(across(all_of(cols), as.integer))
# In older versions of dplyr we would use `mutate_at`
# df %>% mutate_at(vars(all_of(cols)), as.integer)
# depth table price x y z
# <int> <int> <int> <int> <int> <dbl>
#1 61 55 326 3 3 2.43
#2 59 61 326 3 3 2.31
#3 56 65 327 4 4 2.31
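Using all_of is not strictly required, but it is good practice when the column names are stored in an external character vector (here cols) rather than being columns of the data frame. It makes the intent unambiguous and raises an error if any of the named columns is missing.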
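The following sketch shows the pattern end to end. The data frame and the contents of cols are assumptions made up for the example, since the answer does not show how cols was defined. It also shows any_of, which skips missing names where all_of would raise an error.

```r
library(dplyr)

# Hypothetical data and column-name vector
df <- data.frame(depth = c(61.5, 59.8),
                 table = c(55, 61),
                 z     = c(2.43, 2.31))
cols <- c("depth", "table")

# all_of() looks the names up in the external vector `cols`
out <- df %>% mutate(across(all_of(cols), as.integer))
sapply(out, class)

# any_of() silently ignores names that are not columns of df
out2 <- df %>% mutate(across(any_of(c(cols, "not_a_column")), as.integer))
```

Note that as.integer truncates towards zero, so 61.5 becomes 61 rather than being rounded.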
dplyr::mutate_if - Using created variables to build new ones
Instead of a list, return a tibble from the function. Inside a tibble call a later column can refer to an earlier one by name, and the resulting tibble columns can then be expanded with unnest
library(dplyr)
library(tidyr)
library(tibble)  # for is_tibble()
x %>%
  mutate(across(starts_with('x'),
                ~ tibble(`1` = (.x * 2),
                         `2` = `1` * 4), .names = "{.col}_new")) %>%
  unnest(where(is_tibble), names_sep = "")
-output
# A tibble: 10 × 7
x x1 y x_new1 x_new2 x1_new1 x1_new2
<int> <int> <int> <dbl> <dbl> <dbl> <dbl>
1 1 21 10 2 8 42 168
2 2 22 11 4 16 44 176
3 3 23 12 6 24 46 184
4 4 24 13 8 32 48 192
5 5 25 14 10 40 50 200
6 6 26 15 12 48 52 208
7 7 27 16 14 56 54 216
8 8 28 17 16 64 56 224
9 9 29 18 18 72 58 232
10 10 30 19 20 80 60 240
Or we could build a one-column tibble first and then use mutate on it to add the second column
x %>%
transmute(across(starts_with('x'), ~ tibble(new1 = .x *2) %>%
mutate(new2 = new1 *4))) %>%
unnest(where(is_tibble), names_sep = "_") %>%
bind_cols(x, .)
-output
x x1 y x_new1 x_new2 x1_new1 x1_new2
1 1 21 10 2 8 42 168
2 2 22 11 4 16 44 176
3 3 23 12 6 24 46 184
4 4 24 13 8 32 48 192
5 5 25 14 10 40 50 200
6 6 26 15 12 48 52 208
7 7 27 16 14 56 54 216
8 8 28 17 16 64 56 224
9 9 29 18 18 72 58 232
10 10 30 19 20 80 60 240
Or wrap the multiple statements in a {} block
x %>%
mutate(across(starts_with('x'), ~
{
new <- .x * 2
new2 <- new * 4
tibble(new, new2)}, .names = "{.col}_")) %>%
unnest(where(is_tibble), names_sep = "")
# A tibble: 10 × 7
x x1 y x_new x_new2 x1_new x1_new2
<int> <int> <int> <dbl> <dbl> <dbl> <dbl>
1 1 21 10 2 8 42 168
2 2 22 11 4 16 44 176
3 3 23 12 6 24 46 184
4 4 24 13 8 32 48 192
5 5 25 14 10 40 50 200
6 6 26 15 12 48 52 208
7 7 27 16 14 56 54 216
8 8 28 17 16 64 56 224
9 9 29 18 18 72 58 232
10 10 30 19 20 80 60 240
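If the nested tibble is not needed, a simpler variant is to chain two across calls, the second one picking up the columns created by the first. This is a sketch rather than the approach shown above: the input x is reconstructed from the printed output (it is not given in the answer), and the new columns come out in a different order.

```r
library(dplyr)

# Input reconstructed from the printed output above (an assumption)
x <- tibble(x = 1:10, x1 = 21:30, y = 10:19)

out <- x %>%
  # first step: double every column whose name starts with "x"
  mutate(across(starts_with("x"), ~ .x * 2, .names = "{.col}_new")) %>%
  # second step: build on the columns created in the first step
  mutate(across(ends_with("_new"), ~ .x * 4, .names = "{.col}2"))

names(out)
```

The values match the earlier outputs, but the columns are ordered x_new, x1_new, x_new2, x1_new2 instead of being grouped per source column.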