Dplyr Change Many Data Types

dplyr change many data types

In older versions of dplyr, you can use mutate_each_, the standard evaluation version of mutate_each, to change the column classes:

dat %>% mutate_each_(funs(factor), l1) %>% mutate_each_(funs(as.numeric), l2)
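mutate_each_() and funs() are deprecated in current dplyr releases. A minimal sketch of the same conversion with across() follows. The data frame dat and the name vectors l1 and l2 are made up here for illustration, since the original data is not shown.

```r
library(dplyr)

# Hypothetical data: the original `dat`, `l1`, and `l2` are not shown
dat <- data.frame(g  = c("a", "b", "a"),
                  h  = c("x", "y", "y"),
                  n1 = c("1", "2", "3"),
                  stringsAsFactors = FALSE)
l1 <- c("g", "h")   # assumed: names of columns to turn into factors
l2 <- "n1"          # assumed: names of columns to turn into numerics

res <- dat %>%
  mutate(across(all_of(l1), factor)) %>%
  mutate(across(all_of(l2), as.numeric))

# Inspect the resulting column classes
sapply(res, class)
```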

Change the data types of multiple columns through their names in R

Using dplyr

library(dplyr)
df <- df %>%
    mutate(across(c(COLB, COlC), as.integer))

Or, if there are many columns and they are adjacent, specify a range with :

df %>%
    mutate(across(COLB:COlC, as.integer))

Or, if only the first column needs to be skipped, exclude it with -

df %>%
    mutate(across(-COLA, as.integer))

In base R, we can use lapply

nm1 <- names(df)[-1]
df[nm1] <- lapply(df[nm1], as.integer)

It is also possible to do this automatically with type.convert

type.convert(df, as.is = TRUE)
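Note that type.convert returns a new object, so the result has to be assigned to keep it. The sketch below shows what it might do on a small data frame whose columns were all read in as character. The data is hypothetical and only reuses the column names from the examples above.

```r
# Hypothetical data frame where every column is stored as character
df <- data.frame(COLA = c("a", "b", "c"),
                 COLB = c("1", "2", "3"),
                 COlC = c("1.5", "2.5", "3.5"),
                 stringsAsFactors = FALSE)

# as.is = TRUE keeps non-numeric strings as character instead of factor
converted <- type.convert(df, as.is = TRUE)

# Inspect the classes chosen for each column
sapply(converted, class)
```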

Using dplyr, can we change to numeric only those columns whose data type is integer?

With base R we can do

i1 <- sapply(dat, is.integer)
dat[i1] <- lapply(dat[i1], as.numeric)
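Since the question asks about dplyr, here is a sketch of the equivalent with across() and where(), which should be available in dplyr 1.0.0 and later. The data frame is invented for illustration.

```r
library(dplyr)

# Hypothetical data: one integer, one double, and one character column
dat <- data.frame(id    = 1:3,
                  score = c(1.5, 2.5, 3.5),
                  label = c("a", "b", "c"),
                  stringsAsFactors = FALSE)

# Convert only the columns for which is.integer() is TRUE
dat2 <- dat %>%
  mutate(across(where(is.integer), as.numeric))

sapply(dat2, class)
```

Columns that are not integer (score and label here) are left untouched.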

Converting multiple columns from character to numeric format in R

You could try

DF <- data.frame("a" = as.character(0:5),
                 "b" = paste(0:5, ".1", sep = ""),
                 "c" = letters[1:6],
                 stringsAsFactors = FALSE)

# Check columns classes
sapply(DF, class)

#           a           b           c 
# "character" "character" "character" 

cols.num <- c("a","b")
DF[cols.num] <- lapply(DF[cols.num], as.numeric)
sapply(DF, class)

#          a           b           c 
#  "numeric"   "numeric" "character"
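The column list above is hard-coded. As a sketch, the convertible columns could instead be detected by checking whether every value parses as a number. The helper parses_as_number is a hypothetical name introduced here, not a base R function.

```r
DF <- data.frame(a = as.character(0:5),
                 b = paste0(0:5, ".1"),
                 c = letters[1:6],
                 stringsAsFactors = FALSE)

# Hypothetical helper: TRUE when every non-missing value parses as a number
parses_as_number <- function(x) {
  parsed <- suppressWarnings(as.numeric(x[!is.na(x)]))
  all(!is.na(parsed))
}

# Pick the convertible columns, then convert only those
cols.num <- names(DF)[sapply(DF, parses_as_number)]
DF[cols.num] <- lapply(DF[cols.num], as.numeric)

sapply(DF, class)
```

Calling as.numeric directly on a column such as c would produce NA values with a warning, which is why the check comes first.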

Changing column data types using mutate_at() in R

Contract Date and Hire Date have different formats, so each needs its own parser. Try:

library(dplyr)
library(lubridate)

data2 %>% 
  mutate(`Contract Date` = as_date(mdy_hm(`Contract Date`)), 
         `Hire Date` = mdy(`Hire Date`))

We can also use base R to do this:

# tz = "UTC" keeps the date from shifting when the time part is dropped
transform(data2, `Contract Date` = as.Date(as.POSIXct(`Contract Date`, 
                                   format = "%m/%d/%Y %H:%M", tz = "UTC")), 
                 `Hire Date` = as.Date(`Hire Date`, "%m/%d/%Y"))
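To try the conversion, a small stand-in for data2 is needed. The sketch below makes one up, assuming the string layouts implied by the format strings above (month/day/year, with hours and minutes on Contract Date). The values themselves are invented.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample data; the original `data2` is not shown
data2 <- tibble(`Contract Date` = c("1/15/2020 10:30", "12/3/2019 0:00"),
                `Hire Date`     = c("2/1/2020", "11/20/2019"))

res <- data2 %>%
  mutate(`Contract Date` = as_date(mdy_hm(`Contract Date`)),  # drop the time part
         `Hire Date`     = mdy(`Hire Date`))

sapply(res, class)
```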

Converting multiple columns to double type in R using dplyr

It would be better to do this with type.convert from base R, which automatically corrects the type based on the values in each column

df1 <- type.convert(df, as.is = TRUE)

In dplyr, it can be done with across by specifying the range of columns, either with a numeric index

df %>%
   mutate(across(2:4, as.numeric))

Or with a range of column names

df %>%
   mutate(across(X11:P3, as.numeric))

convert selected columns at once to integer in R in dplyr

You can use across to apply the same function to multiple columns.

library(dplyr)

df %>% mutate(across(all_of(cols), as.integer))
# In older versions of dplyr, use `mutate_at`
# df %>% mutate_at(vars(all_of(cols)), as.integer)

#  depth table price     x     y     z
#  <int> <int> <int> <int> <int> <dbl>
#1    61    55   326     3     3  2.43
#2    59    61   326     3     3  2.31
#3    56    65   327     4     4  2.31

Using all_of is not strictly required, but it is good practice when the column names come from an external character vector (here cols) rather than from the data frame itself.
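A minimal sketch of that point follows. The data frame and the contents of cols are assumptions chosen to resemble the printed output, since neither is shown in the original. It also illustrates that all_of is strict: a name that is missing from the data raises an error instead of being silently skipped.

```r
library(dplyr)

# Hypothetical data and column vector; the originals are not shown
df <- data.frame(depth = c(61.5, 59.8, 56.9),
                 table = c(55, 61, 65),
                 price = c(326, 326, 327),
                 z     = c(2.43, 2.31, 2.31))
cols <- c("depth", "table", "price")

res <- df %>% mutate(across(all_of(cols), as.integer))
sapply(res, class)

# "carat" is not a column of df, so all_of() makes this call fail
bad <- try(df %>% mutate(across(all_of(c(cols, "carat")), as.integer)),
           silent = TRUE)
inherits(bad, "try-error")
```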

dplyr::mutate_if - Using created variables to build new ones

Instead of a list, return a tibble from the function. Inside tibble(), a later column can refer to an earlier one by name. Then unnest the tibble columns

library(dplyr)
library(tidyr)
library(tibble)  # provides is_tibble()
x %>% 
  mutate(across(starts_with('x'), 
                ~ tibble(`1` = .x * 2,
                         `2` = `1` * 4),
                .names = "{.col}_new")) %>% 
  unnest(where(is_tibble), names_sep = "")

Output:

# A tibble: 10 × 7
       x    x1     y x_new1 x_new2 x1_new1 x1_new2
   <int> <int> <int>  <dbl>  <dbl>   <dbl>   <dbl>
 1     1    21    10      2      8      42     168
 2     2    22    11      4     16      44     176
 3     3    23    12      6     24      46     184
 4     4    24    13      8     32      48     192
 5     5    25    14     10     40      50     200
 6     6    26    15     12     48      52     208
 7     7    27    16     14     56      54     216
 8     8    28    17     16     64      56     224
 9     9    29    18     18     72      58     232
10    10    30    19     20     80      60     240
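The input x is not shown. Judging from the printed output, it could be reconstructed as below, which is an assumption. The sketch then cross-checks the expected columns with a different, simpler technique: a named list of functions passed to across. This only works here because the second column can be written directly in terms of the original one (x * 2 * 4 is x * 8). It does not replace the tibble approach when a later column truly depends on an earlier computed one.

```r
library(dplyr)

# Assumed input, reconstructed from the printed output
x <- tibble(x = 1:10, x1 = 21:30, y = 10:19)

# Swapped-in technique: one named function per new column
check <- x %>%
  mutate(across(starts_with("x"),
                list(new1 = ~ .x * 2,
                     new2 = ~ .x * 8),
                .names = "{.col}_{.fn}"))

names(check)
```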

Alternatively, create the tibble first and add the second column with mutate

x %>%
  transmute(across(starts_with('x'),
                   ~ tibble(new1 = .x * 2) %>% 
                       mutate(new2 = new1 * 4))) %>%
  unnest(where(is_tibble), names_sep = "_") %>% 
  bind_cols(x, .)

Output:

    x x1  y x_new1 x_new2 x1_new1 x1_new2
1   1 21 10      2      8      42     168
2   2 22 11      4     16      44     176
3   3 23 12      6     24      46     184
4   4 24 13      8     32      48     192
5   5 25 14     10     40      50     200
6   6 26 15     12     48      52     208
7   7 27 16     14     56      54     216
8   8 28 17     16     64      56     224
9   9 29 18     18     72      58     232
10 10 30 19     20     80      60     240

Or wrap the multiple statements in a {} block

x %>%
  mutate(across(starts_with('x'), ~ {
    new <- .x * 2
    new2 <- new * 4
    tibble(new, new2)
  }, .names = "{.col}_")) %>% 
  unnest(where(is_tibble), names_sep = "")
# A tibble: 10 × 7
       x    x1     y x_new x_new2 x1_new x1_new2
   <int> <int> <int> <dbl>  <dbl>  <dbl>   <dbl>
 1     1    21    10     2      8     42     168
 2     2    22    11     4     16     44     176
 3     3    23    12     6     24     46     184
 4     4    24    13     8     32     48     192
 5     5    25    14    10     40     50     200
 6     6    26    15    12     48     52     208
 7     7    27    16    14     56     54     216
 8     8    28    17    16     64     56     224
 9     9    29    18    18     72     58     232
10    10    30    19    20     80     60     240
