Extract column name in mutate_if call

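You have to use quo() instead of enquo(). Inside funs(), enquo(.) does not capture the column, whereas quo(.) yields one quosure per column, as the printed results show: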

# enquo(.):
<quosure: empty>
~function (expr)
{
    enexpr(expr)
}
...

# quo(.):
<quosure: frame>
~x
<quosure: frame>
~y
<quosure: frame>
~z

With your example:

mutate_if(df, is.numeric, funs({
  # quo(.) captures the current column; quo_name() turns it into a string
  lookup_value <- df_lookup %>% pull(quo_name(quo(.)))
  ifelse(is.na(.), lookup_value, .)
}))

# A tibble: 10 x 4
       x     y     z a
   <int> <int> <int> <chr>
 1     1     1     8 a
 2     2     2     2 b
 3     3     3     3 c
 4     4     5     8 d
 5     5     1     8 e
 6     6     2     2 a
 7     7     3     3 b
 8     8     5     8 c
 9     9     1     8 d
10    10     2     2 e
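
On dplyr 1.0.0 or later, the same lookup can probably be written with across() and cur_column(), which returns the name of the column currently being processed. This is only a sketch of an alternative, not the method from the answer above, and the small df and df_lookup here are made-up stand-ins for the question's data.

```r
library(dplyr)  # assumes dplyr >= 1.0.0 for across() and cur_column()

# Hypothetical stand-ins for the question's data
df <- tibble(x = c(1L, NA, 3L), y = c(NA, 2L, 3L), a = c("a", "b", "c"))
df_lookup <- tibble(x = 10L, y = 20L)

filled <- df %>%
  mutate(across(
    where(is.numeric),
    # cur_column() gives the current column's name, used to index the lookup table
    ~ coalesce(.x, df_lookup[[cur_column()]])
  ))
```

Here coalesce() replaces each NA with the single lookup value for that column, which plays the role of the ifelse(is.na(.), ...) call.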

Using group_by with mutate_if by column name

You can use mutate_at along with contains from dplyr as follows:

library(dplyr)

exm_data %>%
  group_by(group) %>%
  mutate_at(vars(contains('demo')), funs(mean)) %>%
  mutate_at(vars(contains('meas')), funs(median))

which gives:

# A tibble: 50 x 5
# Groups:   group [5]
   group    demo_age demo_height meas_score1 meas_score2
   <chr>       <dbl>       <dbl>       <dbl>       <dbl>
 1 d      0.12916082    60.26550   0.1932882  -0.5356818
 2 b     -0.31142894    64.50839   0.3219514  -0.4777860
 3 b     -0.31142894    64.50839   0.3219514  -0.4777860
 4 a     -0.34373403    64.84180   0.1929516  -0.3821047
 5 a     -0.34373403    64.84180   0.1929516  -0.3821047
 6 b     -0.31142894    64.50839   0.3219514  -0.4777860
 7 d      0.12916082    60.26550   0.1932882  -0.5356818
 8 a     -0.34373403    64.84180   0.1929516  -0.3821047
 9 d      0.12916082    60.26550   0.1932882  -0.5356818
10 c     -0.05963747    59.07845  -0.2395409  -0.4484245

Bonus: you don't need to load stringr.
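
Since funs() and the mutate_at() family are superseded in recent dplyr releases, here is a minimal sketch of the same grouped summary using across(). It assumes dplyr 1.0.0 or later, and the four-row exm_data below is invented purely for illustration.

```r
library(dplyr)  # assumes dplyr >= 1.0.0

# Hypothetical miniature version of exm_data
exm_data <- tibble(
  group       = c("a", "a", "b", "b"),
  demo_age    = c(1, 3, 5, 7),
  meas_score1 = c(1, 2, 10, 30)
)

out <- exm_data %>%
  group_by(group) %>%
  mutate(
    across(contains("demo"), mean),    # group mean for demo_* columns
    across(contains("meas"), median)   # group median for meas_* columns
  ) %>%
  ungroup()
```

Each group's rows end up holding that group's mean or median, as in the mutate_at() output shown earlier.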

Access the column names in `mutate_at` to use them for subsetting a list

Does this work for you?

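library(dplyr)
library(rlang)
library(magrittr)  # provides the %<>% assignment pipe

df %>%
  mutate_at(vars(var1, var2),
            .funs = function(x) {
              # replace recode_list (locally) with the entry named after this column
              recode_list %<>% .[[as_label(enquo(x))]]
              recode(x, !!!recode_list)
            })
## A tibble: 4 x 2
#   var1  var2
#  <dbl> <dbl>
#1     1     0
#2     1    -1
#3     2     1
#4     3     1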

I suspect this works, while placing the subsetted recode_list directly into recode does not, because enquo delays evaluation of x until the assignment with %<>%. The !!! operator can then splice the list once it has already been evaluated.

Edit

Your approach with rlang also works with some modifications:

library(rlang)
df %>%
  mutate_at(vars(var1, var2), function(x) {
    var_name <- rlang::as_label(substitute(x))
    recode(x, !!!recode_list[[var_name]])
  })
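
With dplyr 1.0.0 or later, another way to reach the column name is cur_column() inside across(). The sketch below swaps recode() for a plain named-vector lookup, because (as far as I understand) !!! is processed when the expression is captured, before cur_column() has a column to report. The df and recode_list here are hypothetical examples, not the question's data.

```r
library(dplyr)  # assumes dplyr >= 1.0.0

# Hypothetical data and per-column recode maps (old value = new value)
df <- tibble(var1 = c("a", "b"), var2 = c("x", "y"))
recode_list <- list(
  var1 = c(a = 1, b = 2),
  var2 = c(x = 0, y = -1)
)

out <- df %>%
  mutate(across(
    c(var1, var2),
    # pick the map for the current column, then index it by the old values
    ~ unname(recode_list[[cur_column()]][as.character(.x)])
  ))
```

Values missing from a map would come back as NA with this lookup, so it is only a drop-in replacement when every value has an entry.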

Dplyr _if verbs with predicate function referring to the column names & multiple conditions?

You can do:

library(dplyr)
library(stringr)  # for str_detect()

btest %>%
  select_if(str_detect(names(.), "jcr") & sapply(., is.numeric))

  jcr_fourth
1          6
2          7
3          8
4          9
5         10
6         11
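
In current dplyr the two conditions can also be combined with tidyselect helpers, which avoids building the logical vector by hand. This is a sketch assuming dplyr 1.0.0 or later; the btest data frame below is an invented stand-in with one numeric and one non-numeric "jcr" column.

```r
library(dplyr)  # assumes dplyr >= 1.0.0 (tidyselect with where())

# Hypothetical stand-in for btest
btest <- data.frame(
  jcr_first  = c("a", "b", "c"),
  jcr_fourth = 6:8,
  other      = 1:3
)

# Keep columns whose name contains "jcr" AND whose values are numeric
out <- btest %>%
  select(contains("jcr") & where(is.numeric))
```

The name test and the type test are both selection expressions here, so & intersects the two column sets.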

mutate variable if column name contains a string

You should wrap your contains("trait") variable filter in a vars() call:

my_data %>%
  mutate_at(vars(contains('trait')), funs(. == 'True'))

P.S. I suggest you also drop your if_else() call and just use a logical comparison directly.
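
For newer dplyr versions, where funs() is deprecated, the same idea can be sketched with across(). The my_data tibble below is a made-up example, and dplyr 1.0.0 or later is assumed.

```r
library(dplyr)  # assumes dplyr >= 1.0.0

# Hypothetical data with "True"/"False" stored as strings
my_data <- tibble(
  id      = 1:3,
  trait_a = c("True", "False", "True"),
  trait_b = c("False", "False", "True")
)

# Convert every column whose name contains "trait" to logical
out <- my_data %>%
  mutate(across(contains("trait"), ~ .x == "True"))
```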

dplyr::mutate_if() with multiple conditions including column class not working

Note that mutate_if is being phased out in favour of across, so the following is perhaps what you want:

df %>%
  mutate(across(where(is.character) & matches(varnames), ~ mean(as.numeric(.))))

  a b c d
1 2 2 4 1
2 2 3 3 5
3 2 4 2 9
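
To make the combined selection easier to try out, here is a small self-contained sketch. The data and the varnames pattern are assumptions chosen for illustration (the question's actual values are not shown here); dplyr 1.0.0 or later is assumed.

```r
library(dplyr)  # assumes dplyr >= 1.0.0

# Hypothetical data: two character columns, one numeric column
df <- tibble(
  a = c("1", "2", "3"),
  b = c(2, 3, 4),
  c = c("4", "3", "2")
)
varnames <- "^a$"  # hypothetical regex naming the columns to convert

out <- df %>%
  mutate(across(
    # must be character AND match the name pattern
    where(is.character) & matches(varnames),
    ~ mean(as.numeric(.x))
  ))
```

Only column a satisfies both conditions, so column c stays character even though it would pass the type test alone.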

Mutate if col name contains 'dat' to date

Instead of mutate_if, we need mutate_at. In newer versions of dplyr, this can be done with mutate and across:

library(dplyr) # >= 1.0.0
df1 <- df %>%
  mutate(across(contains('dat'), ~ as.Date(as.character(.), format = '%Y%m%d')))

Prior to 1.0.0, mutate_at can be used:

df1 <- df %>%
  mutate_at(vars(contains('dat')), ~ as.Date(as.character(.), format = '%Y%m%d'))

mutate_if is generally used to select columns by a condition on their values rather than on their names, for example:

df %>%
  mutate_if(is.numeric, ~ as.Date(as.character(.), format = '%Y%m%d'))

As a reproducible example:

head(iris) %>%
  mutate_if(is.numeric, ~ .x * 5)
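
The name-based date conversion can also be tried on a small invented data frame. This sketch assumes dplyr 1.0.0 or later and that the dates are stored as yyyymmdd integers or strings; the column names are hypothetical.

```r
library(dplyr)  # assumes dplyr >= 1.0.0

# Hypothetical data: one date column stored as integer, one as character
df <- tibble(
  id        = 1:2,
  start_dat = c(20200101L, 20201231L),
  end_dat   = c("20210115", "20210228")
)

# Convert every column whose name contains "dat" to Date
df1 <- df %>%
  mutate(across(contains("dat"), ~ as.Date(as.character(.x), format = "%Y%m%d")))
```

The as.character() step is what lets the same formula handle both the integer and the character column.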

how to use mutate_if to change values

In this small example, I'm not sure that you actually need mutate_if(). The _if part of mutate_if() selects which columns to operate on; it is not an if condition applied when modifying individual values.

Rather, you can use mutate_at() to select your columns to operate on - either based on their exact name or by using vars(contains('your_string')).

See the help page for more info on the mutate_* functions: https://dplyr.tidyverse.org/reference/mutate_all.html

Here are 3 options, using mutate() and mutate_at():

# using mutate()
tbl %>%
  mutate(
    b = ifelse(a > 25, NA, b)
  )

# mutate_at - we select only column 'b'
tbl %>%
  mutate_at(vars(c('b')), ~ifelse(a > 25, NA, .))

# select only columns with 'b' in the col name
tbl %>%
  mutate_at(vars(contains('b')), ~ifelse(a > 25, NA, .))

Which all produce the same output:

# A tibble: 6 x 2
      a     b
  <dbl> <dbl>
1    10    12
2    20    23
3    30    NA
4    40    NA
5    10    56
6    60    NA

I know it's not mutate_if, but I suspect you don't actually need it.
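
For completeness, a fourth variant using across() might look like the sketch below. It assumes dplyr 1.0.0 or later, and the b values in rows that end up NA are invented, since the original input is not shown.

```r
library(dplyr)  # assumes dplyr >= 1.0.0

# Hypothetical input; only the non-NA b values are taken from the output above
tbl <- tibble(
  a = c(10, 20, 30, 40, 10, 60),
  b = c(12, 23, 35, 43, 56, 60)
)

# Blank out b wherever a exceeds 25; other columns of tbl remain visible inside across()
out <- tbl %>%
  mutate(across(contains("b"), ~ if_else(a > 25, NA_real_, .x)))
```

if_else() is stricter about types than ifelse(), which is why the replacement value is written as NA_real_.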


