Extract column name in mutate_if call
You have to use quo instead of enquo.
#enquo(.) :
<quosure: empty>
~function (expr)
{
enexpr(expr)
}
...
#quo(.) :
<quosure: frame>
~x
<quosure: frame>
~y
<quosure: frame>
~z
With your example:
mutate_if(df, is.numeric, funs({
  lookup_value <- df_lookup %>% pull(quo_name(quo(.)))
  ifelse(is.na(.), lookup_value, .)
}))
# A tibble: 10 x 4
x y z a
<int> <int> <int> <chr>
1 1 1 8 a
2 2 2 2 b
3 3 3 3 c
4 4 5 8 d
5 5 1 8 e
6 6 2 2 a
7 7 3 3 b
8 8 5 8 c
9 9 1 8 d
10 10 2 2 e
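Note that funs() has been deprecated since dplyr 0.8.0. On dplyr 1.0.0 or later, the same lookup can be sketched with across() and cur_column(), which returns the name of the column currently being processed. The df and df_lookup below are made-up stand-ins, not the data from the question:

```r
library(dplyr)

# Hypothetical data, invented for illustration only
df <- tibble(x = c(1, NA, 3), y = c(NA, 5, 6), a = c("p", "q", "r"))
df_lookup <- tibble(x = 100, y = 200)

result <- df %>%
  mutate(across(
    where(is.numeric),
    # cur_column() is the current column name, used here to index df_lookup
    ~ ifelse(is.na(.x), df_lookup[[cur_column()]], .x)
  ))

result
```

This assumes df_lookup holds one replacement value per numeric column, under the same column name.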
Using group_by with mutate_if by column name
You can use mutate_at along with contains from dplyr as follows:
library(dplyr)
exm_data %>%
  group_by(group) %>%
  mutate_at(vars(contains('demo')), funs(mean)) %>%
  mutate_at(vars(contains('meas')), funs(median))
which gives,
# A tibble: 50 x 5
# Groups: group [5]
group demo_age demo_height meas_score1 meas_score2
<chr> <dbl> <dbl> <dbl> <dbl>
1 d 0.12916082 60.26550 0.1932882 -0.5356818
2 b -0.31142894 64.50839 0.3219514 -0.4777860
3 b -0.31142894 64.50839 0.3219514 -0.4777860
4 a -0.34373403 64.84180 0.1929516 -0.3821047
5 a -0.34373403 64.84180 0.1929516 -0.3821047
6 b -0.31142894 64.50839 0.3219514 -0.4777860
7 d 0.12916082 60.26550 0.1932882 -0.5356818
8 a -0.34373403 64.84180 0.1929516 -0.3821047
9 d 0.12916082 60.26550 0.1932882 -0.5356818
10 c -0.05963747 59.07845 -0.2395409 -0.4484245
Bonus: you don't need to load stringr, since contains is available from dplyr.
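For dplyr 1.0.0 or later, here is a rough equivalent using across() in place of mutate_at() and funs(). The exm_data below is a small invented data set with the same naming pattern, not the data from the question:

```r
library(dplyr)

# Hypothetical data, invented for illustration only
exm_data <- tibble(
  group       = c("a", "a", "b", "b"),
  demo_age    = c(1, 3, 5, 7),
  meas_score1 = c(10, 20, 30, 50)
)

result <- exm_data %>%
  group_by(group) %>%
  mutate(
    across(contains("demo"), mean),    # group mean for demo_* columns
    across(contains("meas"), median)   # group median for meas_* columns
  ) %>%
  ungroup()

result
```

Both across() calls can sit in one mutate(), so the second pipeline step is no longer needed.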
Access the column names in the `mutate_at` to use it for subsetting a list
Does this work for you?
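library(dplyr)
library(rlang)
library(magrittr)  # provides the %<>% assignment pipe used below
df %>%
  mutate_at(vars(var1, var2),
            .funs = function(x) {
              # keep only the recode entries for the current column
              recode_list %<>% .[[as_label(enquo(x))]]
              recode(x, !!!recode_list)
            })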
## A tibble: 4 x 2
# var1 var2
# <dbl> <dbl>
#1 1 0
#2 1 -1
#3 2 1
#4 3 1
I suspect the reason this works, while placing the subset of recode_list directly into recode does not, is that enquo delays evaluation of x until the assignment with %<>%. Then !!! can force evaluation after the subset has already been properly evaluated.
Edit
Your approach with rlang also works with some modifications:
library(rlang)
df %>%
  mutate_at(vars(var1, var2), function(x) {
    var_name <- rlang::as_label(substitute(x))
    recode(x, !!!recode_list[[var_name]])
  })
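On dplyr 1.0.0 or later, cur_column() gives the current column name inside across(), which sidesteps the quoting tricks. A minimal sketch, assuming recode_list is a list of per-column named lists (the df and recode_list here are invented for illustration). do.call is used rather than !!! because, as far as I can tell, !!! would be processed when mutate captures its arguments, before cur_column() is available:

```r
library(dplyr)

# Hypothetical data and recode mapping, invented for illustration only
df <- tibble(var1 = c("a", "b", "a"), var2 = c("b", "a", "b"))
recode_list <- list(
  var1 = list(a = 1, b = 2),
  var2 = list(a = 0, b = -1)
)

result <- df %>%
  mutate(across(
    c(var1, var2),
    # build the call recode(.x, a = ..., b = ...) from the entry
    # of recode_list that matches the current column name
    ~ do.call(recode, c(list(.x), recode_list[[cur_column()]]))
  ))

result
```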
Dplyr _if verbs with predicate function referring to the column names & multiple conditions?
You can do:
library(dplyr)
library(stringr)  # for str_detect
btest %>%
  select_if(str_detect(names(.), "jcr") & sapply(., is.numeric))
jcr_fourth
1 6
2 7
3 8
4 9
5 10
6 11
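With dplyr 1.0.0 or later, the same two conditions can probably be written with tidyselect helpers, combining where() (a predicate on column values) and matches() (a regex on column names) with &. The btest below is a made-up stand-in for the question's data:

```r
library(dplyr)

# Hypothetical data, invented for illustration only
btest <- data.frame(
  jcr_first  = c("a", "b"),
  jcr_fourth = c(6, 7),
  other_num  = c(1, 2)
)

# keep columns that are numeric AND whose name matches "jcr"
result <- btest %>%
  select(where(is.numeric) & matches("jcr"))

names(result)
```

Only jcr_fourth satisfies both conditions here, so it is the only column kept.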
mutate variable if column name contains a string
You should wrap your contains("trait") variable filter in a vars() call:
my_data %>%
  mutate_at(vars(contains('trait')), funs(. == 'True'))
P.S. I suggest you also drop your if_else() call and just use the logical comparison directly.
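If you are on dplyr 1.0.0 or later, a sketch of the same conversion with across(), which takes the selection helper directly without vars(). The my_data below is invented for illustration:

```r
library(dplyr)

# Hypothetical data, invented for illustration only
my_data <- tibble(
  id      = 1:3,
  trait_a = c("True", "False", "True"),
  trait_b = c("False", "False", "True")
)

# turn every column whose name contains "trait" into a logical
result <- my_data %>%
  mutate(across(contains("trait"), ~ .x == "True"))

result
```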
dplyr::mutate_if() with multiple conditions including column class not working
Note that mutate_if is being phased out in favour of across, so the following is perhaps what you want:
df %>%
  mutate(across(where(is.character) & matches(varnames), ~mean(as.numeric(.))))
a b c d
1 2 2 4 1
2 2 3 3 5
3 2 4 2 9
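To make the interaction of the two conditions concrete, here is a self-contained sketch. The data and the varnames regex are assumptions made up for this example; in particular, varnames is taken to be a single regular expression, as matches() expects:

```r
library(dplyr)

# Hypothetical data and name pattern, invented for illustration only
df <- tibble(a = c("1", "3"), b = c("2", "4"), c = c("x", "y"))
varnames <- "^(a|b)$"

# only character columns whose name matches varnames are converted;
# column c is character but is skipped because its name does not match
result <- df %>%
  mutate(across(where(is.character) & matches(varnames),
                ~ mean(as.numeric(.x))))

result
```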
Mutate if col name contains 'dat' to date
Instead of mutate_if, we need mutate_at. In newer versions of dplyr, it can be done with mutate and across:
library(dplyr) # >= 1.0.0
df1 <- df %>%
  mutate(across(contains('dat'), ~ as.Date(as.character(.), format = '%Y%m%d')))
Prior to 1.0.0, mutate_at can be used:
df1 <- df %>%
  mutate_at(vars(contains('dat')), ~ as.Date(as.character(.), format = '%Y%m%d'))
mutate_if is generally used to check some condition based on the values of the columns, e.g.
df %>%
  mutate_if(is.numeric, ~ as.Date(as.character(.), format = '%Y%m%d'))
As a reproducible example:
head(iris) %>%
  mutate_if(is.numeric, ~ .x * 5)
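For the name-based date conversion itself, a small runnable sketch. The df here is invented, and it assumes the date columns are stored as integers in YYYYMMDD form:

```r
library(dplyr)  # >= 1.0.0

# Hypothetical data, invented for illustration only
df <- tibble(
  id        = 1:2,
  start_dat = c(20200101L, 20211231L),
  end_dat   = c(20200215L, 20220101L)
)

# convert every column whose name contains "dat" to Date
df1 <- df %>%
  mutate(across(contains("dat"),
                ~ as.Date(as.character(.x), format = "%Y%m%d")))

df1
```

The id column is left alone because its name does not contain "dat", even though it is also an integer column.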
how to use mutate_if to change values
In this small example, I'm not sure that you actually need mutate_if(). mutate_if is designed to use the _if part to determine which columns to subset and work on, rather than as an if condition when modifying a value.
Rather, you can use mutate_at() to select your columns to operate on, either based on their exact name or by using vars(contains('your_string')).
See the help page for more info on the mutate_* functions: https://dplyr.tidyverse.org/reference/mutate_all.html
Here are 3 options, using mutate() and mutate_at():
# using mutate()
tbl %>%
  mutate(
    b = ifelse(a > 25, NA, b)
  )
# mutate_at - we select only column 'b'
tbl %>%
  mutate_at(vars(c('b')), ~ ifelse(a > 25, NA, .))
# select only columns with 'b' in the col name
tbl %>%
  mutate_at(vars(contains('b')), ~ ifelse(a > 25, NA, .))
Which all produce the same output:
# A tibble: 6 x 2
a b
<dbl> <dbl>
1 10 12
2 20 23
3 30 NA
4 40 NA
5 10 56
6 60 NA
I know it's not mutate_if, but I suspect you don't actually need it.
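On dplyr 1.0.0 or later, the same thing can be sketched with across(). The tbl below is a guess at the input (the original values of b in the replaced rows are not shown above, so those are made up):

```r
library(dplyr)

# Hypothetical input, invented for illustration only
tbl <- tibble(
  a = c(10, 20, 30, 40, 10, 60),
  b = c(12, 23, 34, 45, 56, 67)
)

# set b to NA wherever a is greater than 25
result <- tbl %>%
  mutate(across(contains("b"), ~ ifelse(a > 25, NA, .x)))

result
```

The column a can be referenced inside the lambda because across() is evaluated within mutate(), where all columns of the data are in scope.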