Can Dplyr Package Be Used For Conditional Mutating

Can dplyr package be used for conditional mutating?

Use ifelse:

df %>%
  mutate(g = ifelse(a == 2 | a == 5 | a == 7 | (a == 1 & b == 4), 2,
               ifelse(a == 0 | a == 1 | a == 4 | a == 3 |  c == 4, 3, NA)))
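
To see the nested ifelse in action, here is a minimal sketch. The data frame df below is hypothetical (the question's data is not reproduced here), with one row per branch.

```r
library(dplyr)

# Hypothetical data: one row for each branch of the nested ifelse().
df <- data.frame(a = c(2, 1, 0, 9),
                 b = c(0, 4, 0, 0),
                 c = c(0, 0, 0, 0))

res <- df %>%
  mutate(g = ifelse(a == 2 | a == 5 | a == 7 | (a == 1 & b == 4), 2,
             ifelse(a == 0 | a == 1 | a == 4 | a == 3 | c == 4, 3, NA)))

res$g  # 2 2 3 NA
```

The last row matches neither condition, so it falls through to NA.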

Added - if_else: dplyr 0.5 introduced an if_else function, so an alternative is to replace ifelse with if_else. However, if_else is stricter than ifelse (both legs of the condition must have the same type), so the NA would have to be replaced with NA_real_.

df %>%
  mutate(g = if_else(a == 2 | a == 5 | a == 7 | (a == 1 & b == 4), 2,
               if_else(a == 0 | a == 1 | a == 4 | a == 3 |  c == 4, 3, NA_real_)))
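
The difference in strictness can be sketched with a deliberately mismatched pair of branches (a number and a string, chosen here only for illustration). As far as I know, dplyr 1.1.0 and later also accept a bare NA in if_else, but NA_real_ works across versions.

```r
library(dplyr)

x <- c(TRUE, FALSE)

# Base ifelse() silently coerces mixed branch types to a common type.
loose <- ifelse(x, 1, "a")
class(loose)  # "character"

# dplyr::if_else() raises an error when the branch types cannot be combined.
strict <- tryCatch(if_else(x, 1, "a"),
                   error = function(e) "type error")
strict        # "type error"
```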

Added - case_when: Since this question was posted, dplyr has added case_when, so another alternative would be:

df %>% mutate(g = case_when(a == 2 | a == 5 | a == 7 | (a == 1 & b == 4) ~ 2,
                            a == 0 | a == 1 | a == 4 | a == 3 |  c == 4 ~ 3,
                            TRUE ~ NA_real_))
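
One thing worth knowing about case_when is that the first matching condition decides the result. A small sketch with a hypothetical df, where the first row satisfies both conditions:

```r
library(dplyr)

# Hypothetical data: row 1 (a == 1, b == 4) matches both conditions,
# row 2 (a == 1, b == 0) matches only the second.
df <- data.frame(a = c(1, 1), b = c(4, 0), c = c(0, 0))

res <- df %>%
  mutate(g = case_when(a == 2 | a == 5 | a == 7 | (a == 1 & b == 4) ~ 2,
                       a == 0 | a == 1 | a == 4 | a == 3 | c == 4 ~ 3,
                       TRUE ~ NA_real_))

res$g  # 2 3
```

This matches the nested ifelse version, where the outer test also takes priority.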

Added - arithmetic/na_if: If the values are numeric and the conditions (apart from the default NA at the end) are mutually exclusive, as is the case in the question, we can use an arithmetic expression in which each condition is multiplied by its desired result, then apply na_if at the end to replace 0 with NA.

df %>%
  mutate(g = 2 * (a == 2 | a == 5 | a == 7 | (a == 1 & b == 4)) +
             3 * (a == 0 | a == 1 | a == 4 | a == 3 |  c == 4),
         g = na_if(g, 0))
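
The mutual-exclusivity requirement matters: if a row satisfies both conditions, the terms add up. A sketch with hypothetical data whose last row (a == 1, b == 4) deliberately matches both:

```r
library(dplyr)

# Hypothetical data: rows 1-3 hit at most one condition, row 4 hits both.
df <- data.frame(a = c(2, 0, 9, 1),
                 b = c(0, 0, 0, 4),
                 c = c(0, 0, 0, 0))

res <- df %>%
  mutate(g = 2 * (a == 2 | a == 5 | a == 7 | (a == 1 & b == 4)) +
             3 * (a == 0 | a == 1 | a == 4 | a == 3 | c == 4),
         g = na_if(g, 0))

res$g  # 2 3 NA 5
```

The 5 in the last row is 2 + 3, which is neither of the intended values, so this approach is only safe when no such rows exist in the data.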

dplyr mutate with conditional values

Try this:

myfile %>% mutate(V5 = (V1 == 1 & V2 != 4) + 2 * (V2 == 4 & V3 != 1))

giving:

  V1 V2 V3 V4 V5
1  1  2  3  5  1
2  2  4  4  1  2
3  1  4  1  1  0
4  4  5  1  3  0
5  5  5  5  4  0

or this:

myfile %>% mutate(V5 = ifelse(V1 == 1 & V2 != 4, 1, ifelse(V2 == 4 & V3 != 1, 2, 0)))

giving:

  V1 V2 V3 V4 V5
1  1  2  3  5  1
2  2  4  4  1  2
3  1  4  1  1  0
4  4  5  1  3  0
5  5  5  5  4  0

Note

I suggest choosing a better name for your data frame; myfile makes it seem as if it holds a file name.

Above used this input:

myfile <- 
structure(list(V1 = c(1L, 2L, 1L, 4L, 5L), V2 = c(2L, 4L, 4L, 
5L, 5L), V3 = c(3L, 4L, 1L, 1L, 5L), V4 = c(5L, 1L, 1L, 3L, 4L
)), .Names = c("V1", "V2", "V3", "V4"), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5"))

Update 1: Since this was originally posted, dplyr has changed %.% to %>%, so the answer has been modified accordingly.

Update 2: dplyr now has case_when, which provides another solution:

myfile %>% 
       mutate(V5 = case_when(V1 == 1 & V2 != 4 ~ 1, 
                             V2 == 4 & V3 != 1 ~ 2,
                             TRUE ~ 0))
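
As a quick check that the three approaches agree, here is a sketch that computes all of them side by side. The input has the same values as myfile above, rebuilt with data.frame() for brevity; the column names arith, nested and cw are made up for this comparison.

```r
library(dplyr)

# Same values as the input shown above.
myfile <- data.frame(V1 = c(1L, 2L, 1L, 4L, 5L),
                     V2 = c(2L, 4L, 4L, 5L, 5L),
                     V3 = c(3L, 4L, 1L, 1L, 5L),
                     V4 = c(5L, 1L, 1L, 3L, 4L))

out <- myfile %>%
  mutate(arith  = (V1 == 1 & V2 != 4) + 2 * (V2 == 4 & V3 != 1),
         nested = ifelse(V1 == 1 & V2 != 4, 1,
                         ifelse(V2 == 4 & V3 != 1, 2, 0)),
         cw     = case_when(V1 == 1 & V2 != 4 ~ 1,
                            V2 == 4 & V3 != 1 ~ 2,
                            TRUE ~ 0))

# All three columns hold 1 2 0 0 0.
all(out$arith == out$nested, out$nested == out$cw)  # TRUE
```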

dplyr conditional mutate with ifelse

One option is to insert the * with a regex:

library(dplyr)
library(stringr)
df %>% 
   mutate(across(Kingdom:Genus,  ~str_replace(.x, "(Incertae sedis)", "*\\1*")))
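
A self-contained sketch of the same call, assuming a hypothetical taxonomy table (the original df is not shown, so the column contents here are invented):

```r
library(dplyr)
library(stringr)

# Hypothetical taxonomy table with the column range used above.
df <- data.frame(Kingdom = c("Fungi", "Incertae sedis"),
                 Phylum  = c("Incertae sedis", "Ascomycota"),
                 Genus   = c("Candida", "Incertae sedis"),
                 stringsAsFactors = FALSE)

res <- df %>%
  mutate(across(Kingdom:Genus,
                ~ str_replace(.x, "(Incertae sedis)", "*\\1*")))

res$Phylum  # "*Incertae sedis*" "Ascomycota"
```

The capture group is written back as \\1, so only the surrounding asterisks are added and non-matching values are left untouched.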

dplyr and filter for conditional mutation but return full dataset

We can use n_distinct() and subset the data inside the call to n_distinct().

library(dplyr)

output<-data %>%
        group_by(Condition) %>%
        mutate(Result = ifelse(n_distinct(Names[Value>3])>1,
                               'pass',
                               'fail')) %>%
        ungroup()


identical(output, desired)
[1] TRUE
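
Since the question's data is not reproduced here, the following sketch uses a hypothetical data frame with the same column names to show how the grouped test behaves:

```r
library(dplyr)

# Hypothetical data: group A has two names with Value > 3, group B only one.
data <- data.frame(Condition = c("A", "A", "B", "B"),
                   Names     = c("x", "y", "x", "y"),
                   Value     = c(4, 5, 4, 1),
                   stringsAsFactors = FALSE)

output <- data %>%
  group_by(Condition) %>%
  mutate(Result = ifelse(n_distinct(Names[Value > 3]) > 1, "pass", "fail")) %>%
  ungroup()

output$Result  # "pass" "pass" "fail" "fail"
```

The test yields a single value per group, which mutate recycles to every row of that group, so no rows are dropped.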

dplyr: Mutate, with conditional if applied to the row before

After arranging the dataset by 'devices' and 'StartSession', create 'EndSessionN' with case_when: if the next value of 'StartSession' (lead) is less than 'EndSession', return that next value; otherwise return 'EndSession'.

library(dplyr)
datateste5 %>% 
  arrange(devices, StartSession) %>% 
  mutate(StartSessionN = StartSession,
         EndSessionN = case_when(lead(StartSession) < EndSession
            ~ lead(StartSession), TRUE ~ EndSession))
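
A minimal sketch of the lead logic, assuming hypothetical sessions for a single device, with plain numbers standing in for timestamps:

```r
library(dplyr)

# Hypothetical sessions: the first overlaps the second, the second does not
# overlap the third.
datateste5 <- data.frame(devices      = c("d1", "d1", "d1"),
                         StartSession = c(1, 5, 20),
                         EndSession   = c(8, 12, 25),
                         stringsAsFactors = FALSE)

res <- datateste5 %>%
  arrange(devices, StartSession) %>%
  mutate(StartSessionN = StartSession,
         EndSessionN = case_when(lead(StartSession) < EndSession
                                 ~ lead(StartSession),
                                 TRUE ~ EndSession))

res$EndSessionN  # 5 12 25
```

On the last row lead() returns NA, the comparison is NA, and case_when falls through to the TRUE branch. Note that without a group_by(devices), lead() can also pick up the first session of the next device; whether that is wanted depends on the data.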

In the dplyr package can you mutate a column based on the values in a different column

You could list the conditions for "Not hazardous" and assign the rest to "Hazardous".

df2 %>%
  mutate(audit_score_cat = case_when(
    gender ==    "Male" & between(audit_score, 0, 3) ~ "Not hazardous",
    gender == "Females" & between(audit_score, 0, 2) ~ "Not hazardous",
    TRUE                                             ~ "Hazardous"
  ))
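
To illustrate, here is a sketch with a hypothetical df2 that sits right on the cut-offs (between is inclusive at both ends):

```r
library(dplyr)

# Hypothetical data around the thresholds: 3 for "Male", 2 for "Females".
df2 <- data.frame(gender      = c("Male", "Male", "Females", "Females"),
                  audit_score = c(3, 4, 2, 3),
                  stringsAsFactors = FALSE)

res <- df2 %>%
  mutate(audit_score_cat = case_when(
    gender == "Male"    & between(audit_score, 0, 3) ~ "Not hazardous",
    gender == "Females" & between(audit_score, 0, 2) ~ "Not hazardous",
    TRUE                                             ~ "Hazardous"
  ))

res$audit_score_cat
# "Not hazardous" "Hazardous" "Not hazardous" "Hazardous"
```

The gender labels must match the values in the data exactly; any row with a label not listed here ends up in the TRUE branch.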

Conditional and grouped mutate dplyr

If you are comparing year 1991 with 1990, you can do:

socks %>% 
    group_by(drawer_nbr) %>% 
    summarise(growth = +(sock_total[year == 1991] - sock_total[year == 1990] > 0))
# A tibble: 3 x 2
#  drawer_nbr growth
#       <int>  <int>
#1          1      0
#2          2      1
#3          3      0
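
For a runnable version, here is a sketch with a small hypothetical socks table (two drawers, two years each). The unary + turns the logical comparison into an integer 0/1.

```r
library(dplyr)

# Hypothetical data: drawer 1 shrinks from 1990 to 1991, drawer 2 grows.
socks <- data.frame(drawer_nbr = c(1L, 1L, 2L, 2L),
                    year       = c(1990L, 1991L, 1990L, 1991L),
                    sock_total = c(10, 8, 5, 9))

res <- socks %>%
  group_by(drawer_nbr) %>%
  summarise(growth = +(sock_total[year == 1991] - sock_total[year == 1990] > 0))

res$growth  # 0 1
```

This assumes each drawer has exactly one row per year.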
