Case_When in Mutate Pipe

case_when in mutate pipe

As of version 0.7.0 of dplyr, case_when works within mutate as follows:

library(dplyr) # >= 0.7.0
mtcars %>% 
  mutate(cg = case_when(carb <= 2 ~ "low",
                        carb > 2  ~ "high"))

For more information: http://dplyr.tidyverse.org/reference/case_when.html

case_when() mutate in pipe but maintain original data

Fortunately, this is straightforward. case_when() checks its conditions in order, so if the last condition is TRUE and its result is carb, every row that matches no earlier condition keeps its original value.

I have produced an illustrative reprex below.

library(tidyverse)

mtcars %>%
  mutate(carb = case_when(
    cyl == 4 ~ 5000,
    TRUE ~ carb
  )) %>% 
  head(10)
#>                    mpg cyl  disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4         21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag     21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
#> Datsun 710        22.8   4 108.0  93 3.85 2.320 18.61  1  1    4 5000
#> Hornet 4 Drive    21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
#> Hornet Sportabout 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
#> Valiant           18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
#> Duster 360        14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
#> Merc 240D         24.4   4 146.7  62 3.69 3.190 20.00  1  0    4 5000
#> Merc 230          22.8   4 140.8  95 3.92 3.150 22.90  1  0    4 5000
#> Merc 280          19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4

Created on 2021-04-07 by the reprex package (v2.0.0)
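To make the in-order evaluation concrete, here is a minimal sketch on a made-up vector (x and the labels are illustrative assumptions, not part of the original question). It shows that the first matching condition wins, and that values matching no condition become NA unless a TRUE fallback is supplied.

```r
library(dplyr)

x <- c(1, 5, 10)  # toy data, assumed for illustration

# The first matching condition wins, so "huge" is never reached:
# 10 already satisfies x > 3. 1 matches nothing and becomes NA.
size <- case_when(
  x > 3 ~ "big",
  x > 8 ~ "huge"
)

# Putting the stricter condition first and adding a TRUE fallback
# gives every element a label.
size_fixed <- case_when(
  x > 8 ~ "huge",
  x > 3 ~ "big",
  TRUE  ~ "small"
)
```

This is why the fallback TRUE ~ carb has to come last: placed first, it would match every row.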

How can I use case_when inside dplyr mutate() function?

dplyr has since been updated, so the code should now work without using .$ or converting to data.table; see https://github.com/tidyverse/dplyr/issues/1965

Mutate factor to new variable in case_when() statement

case_when does not convert types automatically: currency is a factor, whereas the values returned by the other conditions are character. Converting currency to character makes every branch return the same class, and the call then works.

library(dplyr)
df %>%
  mutate(
    invoice_converted = case_when(
      currency == "EUR" ~ round(invoice / EUR_EXCHANGE),
      currency == "CAD" ~ round(invoice / CAD_EXCHANGE),
      TRUE ~ invoice
    ), currency_converted = case_when(currency == "EUR" ~ "USD",
                                   currency == "CAD" ~ "USD",
                                   TRUE ~ as.character(currency)))

-output

# A tibble: 5 × 5
  shipment invoice currency invoice_converted currency_converted
  <chr>      <dbl> <fct>                <dbl> <chr>             
1 A            500 USD                    500 USD               
2 B            500 EUR                    568 USD               
3 C            500 CAD                    373 USD               
4 D            500 <NA>                   500 <NA>              
5 E            500 SDD                    500 SDD

If we want to keep it as a factor, either wrap the case_when call in factor(), or use fct_recode directly instead of case_when:

library(forcats)
df %>%
  mutate(
    invoice_converted = case_when(
      currency == "EUR" ~ round(invoice / EUR_EXCHANGE),
      currency == "CAD" ~ round(invoice / CAD_EXCHANGE),
      TRUE ~ invoice
    ), currency_converted = fct_recode(currency, USD = "EUR", USD = "CAD"))

-output

# A tibble: 5 × 5
  shipment invoice currency invoice_converted currency_converted
  <chr>      <dbl> <fct>                <dbl> <fct>             
1 A            500 USD                    500 USD               
2 B            500 EUR                    568 USD               
3 C            500 CAD                    373 USD               
4 D            500 <NA>                   500 <NA>              
5 E            500 SDD                    500 SDD
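The other option mentioned above, wrapping the case_when result in factor(), could look roughly like the sketch below. The input tibble is rebuilt here from the printed output, so its construction is an assumption; only the currency column matters for this step.

```r
library(dplyr)

# Assumed reconstruction of the input, based on the printed output above
df <- tibble(
  shipment = c("A", "B", "C", "D", "E"),
  invoice  = 500,
  currency = factor(c("USD", "EUR", "CAD", NA, "SDD"))
)

out <- df %>%
  mutate(
    # case_when returns character; factor() converts the result back
    currency_converted = factor(case_when(
      currency %in% c("EUR", "CAD") ~ "USD",
      TRUE ~ as.character(currency)
    ))
  )
```

Note that the new factor only has the levels that actually occur in the result, whereas fct_recode starts from the levels of the original factor.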

mutate and case_when with multiple cases

You probably meant to write:

library(dplyr)

bsp1 <- bsp1 %>%
          mutate(xxx =
                  case_when(
                    Geschlecht == "m" & Alter > 18 & xx == 55 ~ 1, 
                    Geschlecht == "m" & Alter > 18 & xx == 56 ~ 2, 
                    TRUE  ~ 3 
                 ))
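As a quick check of how these combined conditions behave, here is a sketch on invented rows (the values in this bsp1 are hypothetical; only the column names come from the code above). Each row gets the value of the first condition where all three comparisons hold, and 3 otherwise.

```r
library(dplyr)

# Hypothetical rows, for illustration only
bsp1 <- tibble(
  Geschlecht = c("m", "m", "w", "m"),
  Alter      = c(30, 25, 40, 17),
  xx         = c(55, 56, 55, 55)
)

bsp1 <- bsp1 %>%
  mutate(xxx = case_when(
    Geschlecht == "m" & Alter > 18 & xx == 55 ~ 1,
    Geschlecht == "m" & Alter > 18 & xx == 56 ~ 2,
    TRUE ~ 3  # wrong Geschlecht, too young, or any other xx
  ))
```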

clean code inside of mutate = case_when in dplyr

You don't need case_when here; you just need a group_by:

library(dplyr)

find.after <- function(data, expr) {
  data %>% 
    group_by(id) %>% 
    # := is required for the glue-style name on the left-hand side
    mutate("a.{{expr}}" := lead({{expr}}))
}

This way you only ever look for the next value within the same id.
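A usage sketch on toy data follows (the tibble toy and its value column are made up for illustration). The helper is restated so the block runs on its own, and it assumes the new column name is injected with :=, which rlang's glue-style names rely on.

```r
library(dplyr)

# Helper restated so this block is self-contained
find.after <- function(data, expr) {
  data %>%
    group_by(id) %>%
    mutate("a.{{expr}}" := lead({{ expr }}))
}

# Hypothetical input: two ids with two rows each
toy <- tibble(
  id    = c(1, 1, 2, 2),
  value = c(10, 20, 30, 40)
)

res <- find.after(toy, value)
# The last row of each id has no following row in its group,
# so lead() returns NA there instead of borrowing from the next id.
```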

mutate with case_when and contains

We can use grepl:

df %>%  
   mutate(group = case_when(grepl("Bl", b) ~ "Group1",
                            grepl("re", b, ignore.case = TRUE) ~"Group2"))
#    a     b  group
#1   1 Black Group1
#2   2 Green Group2
#3   3 Green Group2
#4   4 Green Group2
#5   5   Red Group2
#6   6 Green Group2
#7   7 Black Group1
#8   8 Black Group1
#9   9 Green Group2
#10 10 Green Group2
#11  1 Green Group2
#12  2 Green Group2
#13  3  Blue Group1
#14  4   Red Group2
#15  5  Blue Group1
#16  6   Red Group2
#17  7  Blue Group1
#18  8  Blue Group1
#19  9 Black Group1
#20 10 Black Group1
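If stringr is available, the same matching can be sketched with str_detect, which some readers find easier to scan than grepl. The four-row data frame below is a made-up subset of the colours shown above, not the original df.

```r
library(dplyr)
library(stringr)

# Hypothetical subset of the data, for illustration
df_small <- tibble(b = c("Black", "Green", "Red", "Blue"))

grouped <- df_small %>%
  mutate(group = case_when(
    str_detect(b, "Bl") ~ "Group1",
    # regex(ignore_case = TRUE) plays the role of ignore.case in grepl
    str_detect(b, regex("re", ignore_case = TRUE)) ~ "Group2"
  ))
```

Note that the argument order is reversed relative to grepl: str_detect takes the string first and the pattern second.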

Use mutate case_when() in a specific range of columns in dplyr

dplyr's c_across is very handy for operations like this:

df1 %>% 
  rowwise() %>% 
  mutate(inner_S = ifelse(any(grepl('S', c_across(col1:col4))), 'YES', 'NO'))

  position correction    col1  col2  col3  col4  col5  inner_S
     <dbl> <chr>         <chr> <chr> <chr> <chr> <chr> <chr>  
1      100 62M89S        NA    NA    NA    62M   89S   NO     
2      200 8M1D55M88S    NA    8M    1D    55M   88S   NO     
3      300 1S25M1S36M89S 1S    25M   1S    36M   89S   YES
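As an alternative sketch that avoids rowwise(), if_any() (dplyr 1.0.4 or later) can test the same column range in a vectorised way. The data frame is rebuilt from the printed output above, with the empty cells assumed to be real missing values.

```r
library(dplyr)

# Assumed reconstruction of df1 from the printed output
df1 <- tibble(
  position   = c(100, 200, 300),
  correction = c("62M89S", "8M1D55M88S", "1S25M1S36M89S"),
  col1 = c(NA, NA, "1S"),
  col2 = c(NA, "8M", "25M"),
  col3 = c(NA, "1D", "1S"),
  col4 = c("62M", "55M", "36M"),
  col5 = c("89S", "88S", "89S")
)

res <- df1 %>%
  # if_any() is TRUE for a row when the test holds in at least one
  # of col1:col4; grepl() returns FALSE for NA, so NAs do not count
  mutate(inner_S = if_else(if_any(col1:col4, ~ grepl("S", .x)), "YES", "NO"))
```

Because col5 is outside the selected range, the S values it contains do not affect inner_S, which matches the output shown above.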
