Can dplyr package be used for conditional mutating?
Use ifelse
df %>%
  mutate(g = ifelse(a == 2 | a == 5 | a == 7 | (a == 1 & b == 4), 2,
             ifelse(a == 0 | a == 1 | a == 4 | a == 3 | c == 4, 3, NA)))
Added - if_else: Note that dplyr 0.5 defines an if_else function, so an alternative is to replace ifelse with if_else. However, if_else is stricter than ifelse (both legs of the condition must have the same type), so the NA in that case would have to be replaced with NA_real_.
df %>%
  mutate(g = if_else(a == 2 | a == 5 | a == 7 | (a == 1 & b == 4), 2,
             if_else(a == 0 | a == 1 | a == 4 | a == 3 | c == 4, 3, NA_real_)))
Added - case_when: Since this question was posted, dplyr has added case_when, so another alternative would be:
df %>%
  mutate(g = case_when(a == 2 | a == 5 | a == 7 | (a == 1 & b == 4) ~ 2,
                       a == 0 | a == 1 | a == 4 | a == 3 | c == 4 ~ 3,
                       TRUE ~ NA_real_))
Added - arithmetic/na_if: If the values are numeric and the conditions (except for the default value of NA at the end) are mutually exclusive, as is the case in the question, then we can use an arithmetic expression in which each condition is multiplied by its desired result, using na_if at the end to replace 0 with NA.
df %>%
  mutate(g = 2 * (a == 2 | a == 5 | a == 7 | (a == 1 & b == 4)) +
             3 * (a == 0 | a == 1 | a == 4 | a == 3 | c == 4),
         g = na_if(g, 0))
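As a minimal sketch of the arithmetic/na_if idea, the snippet below applies both the case_when version and the arithmetic version to a small made-up data frame (the name toy and its values are assumptions, not the data from the question) and checks that they agree. The arithmetic form only matches when no row satisfies both conditions; a row with a == 1 and b == 4 would satisfy both and yield 5 rather than 2.

```r
library(dplyr)

# Hypothetical toy data; no row satisfies both conditions at once
toy <- data.frame(a = c(2, 5, 0, 3, 9),
                  b = c(1, 1, 1, 1, 1),
                  c = c(0, 0, 0, 0, 0))

res <- toy %>%
  mutate(g_case = case_when(a == 2 | a == 5 | a == 7 | (a == 1 & b == 4) ~ 2,
                            a == 0 | a == 1 | a == 4 | a == 3 | c == 4 ~ 3,
                            TRUE ~ NA_real_),
         # each logical condition is coerced to 0/1 and scaled by its result
         g_arith = 2 * (a == 2 | a == 5 | a == 7 | (a == 1 & b == 4)) +
                   3 * (a == 0 | a == 1 | a == 4 | a == 3 | c == 4),
         # rows matching neither condition sum to 0, which becomes NA
         g_arith = na_if(g_arith, 0))

# TRUE when both approaches produce the same column
identical(res$g_case, res$g_arith)
```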
dplyr mutate with conditional values
Try this:
myfile %>% mutate(V5 = (V1 == 1 & V2 != 4) + 2 * (V2 == 4 & V3 != 1))
giving:
  V1 V2 V3 V4 V5
1  1  2  3  5  1
2  2  4  4  1  2
3  1  4  1  1  0
4  4  5  1  3  0
5  5  5  5  4  0
or this:
myfile %>% mutate(V5 = ifelse(V1 == 1 & V2 != 4, 1, ifelse(V2 == 4 & V3 != 1, 2, 0)))
giving:
  V1 V2 V3 V4 V5
1  1  2  3  5  1
2  2  4  4  1  2
3  1  4  1  1  0
4  4  5  1  3  0
5  5  5  5  4  0
Note
Suggest you get a better name for your data frame. myfile makes it seem as if it holds a file name.
Above used this input:
myfile <-
structure(list(V1 = c(1L, 2L, 1L, 4L, 5L), V2 = c(2L, 4L, 4L,
5L, 5L), V3 = c(3L, 4L, 1L, 1L, 5L), V4 = c(5L, 1L, 1L, 3L, 4L
)), .Names = c("V1", "V2", "V3", "V4"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5"))
Update 1: Since this was originally posted, dplyr has changed %.% to %>%, so the answer has been modified accordingly.
Update 2: dplyr now has case_when, which provides another solution:
myfile %>%
  mutate(V5 = case_when(V1 == 1 & V2 != 4 ~ 1,
                        V2 == 4 & V3 != 1 ~ 2,
                        TRUE ~ 0))
dplyr conditional mutate with ifelse
One option is to insert the * with a regex:
library(dplyr)
library(stringr)
df %>%
  mutate(across(Kingdom:Genus, ~ str_replace(.x, "(Incertae sedis)", "*\\1*")))
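A minimal sketch of what the replacement does, on a hypothetical two-column data frame (the name taxa and its values are made up for illustration): the parentheses form a capture group, and \\1 in the replacement puts the matched text back between the two asterisks.

```r
library(dplyr)
library(stringr)

# Hypothetical stand-in for the taxonomy table
taxa <- data.frame(Kingdom = c("Fungi", "Incertae sedis"),
                   Genus   = c("Incertae sedis", "Agaricus"))

res <- taxa %>%
  # wrap every match in asterisks; non-matching values are left unchanged
  mutate(across(Kingdom:Genus, ~ str_replace(.x, "(Incertae sedis)", "*\\1*")))

res
```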
dplyr and filter for conditional mutation but return full dataset
We can use n_distinct() and subset the data inside the call to n_distinct().
library(dplyr)
output <- data %>%
  group_by(Condition) %>%
  mutate(Result = ifelse(n_distinct(Names[Value > 3]) > 1,
                         'pass',
                         'fail')) %>%
  ungroup()
identical(output, desired)
[1] TRUE
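To see the grouped n_distinct() check in isolation, here is a small sketch on made-up data (the values below are assumptions; only the column names follow the answer). A group passes when more than one distinct name has a Value above 3.

```r
library(dplyr)

# Hypothetical input; column names follow the answer above
data <- data.frame(Condition = c("x", "x", "y", "y", "z", "z"),
                   Names     = c("a", "b", "a", "a", "a", "b"),
                   Value     = c(4, 5, 4, 5, 1, 5))

output <- data %>%
  group_by(Condition) %>%
  # count distinct Names among the rows of each group with Value > 3
  mutate(Result = ifelse(n_distinct(Names[Value > 3]) > 1, "pass", "fail")) %>%
  ungroup()

# Only group "x" has two distinct names above the threshold
output$Result
```

Because the filtering happens inside n_distinct(), every row of the original data is kept and the result is simply recycled across each group.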
dplyr: Mutate, with conditional if applied to the row before
After arranging the dataset by 'devices' and 'StartSession' (arrange), create 'EndSessionN' using case_when: if the next value of 'StartSession' (lead) is less than 'EndSession', return the next value of 'StartSession'; otherwise return 'EndSession'.
library(dplyr)
datateste5 %>%
  arrange(devices, StartSession) %>%
  mutate(StartSessionN = StartSession,
         EndSessionN = case_when(lead(StartSession) < EndSession ~ lead(StartSession),
                                 TRUE ~ EndSession))
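A minimal sketch of the lead()/case_when logic on hypothetical data (the sessions data frame and its numeric times are made up). Unlike the code above, this sketch also groups by devices so that lead() never looks at the next device's first session; that grouping is an assumption about the intent, not part of the original answer.

```r
library(dplyr)

# Hypothetical sessions; numeric times keep the example short
sessions <- data.frame(devices      = c("A", "A", "B"),
                       StartSession = c(1, 5, 2),
                       EndSession   = c(7, 9, 4))

res <- sessions %>%
  arrange(devices, StartSession) %>%
  group_by(devices) %>%   # assumption: compare sessions only within a device
  mutate(EndSessionN = case_when(lead(StartSession) < EndSession ~ lead(StartSession),
                                 TRUE ~ EndSession)) %>%
  ungroup()

# The first "A" session is cut at 5, where the next one starts;
# the last session of each device keeps its own EndSession
res$EndSessionN
```

For the last row of each group lead() returns NA, so the comparison is NA and case_when falls through to the TRUE branch.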
In the dplyr package can you mutate a column based on the values in a different column
You could list the conditions for "Not hazardous" and assign the rest to "Hazardous".
df2 %>%
  mutate(audit_score_cat = case_when(
    gender == "Male" & between(audit_score, 0, 3) ~ "Not hazardous",
    gender == "Females" & between(audit_score, 0, 2) ~ "Not hazardous",
    TRUE ~ "Hazardous"
  ))
Conditional and grouped mutate dplyr
If you are comparing year 1991 with 1990, you can do:
socks %>%
  group_by(drawer_nbr) %>%
  summarise(growth = +(sock_total[year == 1991] - sock_total[year == 1990] > 0))
# A tibble: 3 x 2
#   drawer_nbr growth
#        <int>  <int>
# 1          1      0
# 2          2      1
# 3          3      0