Pass a String as Variable Name in dplyr::filter

Pass a string as variable name in dplyr::filter

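!! (or UQ) unquotes var, that is, it inserts the value of var (the string 'cyl') into the expression. So mtcars %>% filter(!!var == 4) is the same as mtcars %>% filter('cyl' == 4), where the condition always evaluates to FALSE because a string is compared with a number. You can confirm this by printing !!var inside the filter call: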

mtcars %>% filter({ print(!!var); (!!var) == 4 })
# [1] "cyl"
# [1] mpg cyl disp hp drat wt qsec vs am gear carb
# <0 rows> (or 0-length row.names)

To make var refer to the cyl column, first convert the string to the symbol cyl, then unquote that symbol so it is evaluated as a column:

Using rlang:

library(dplyr)
library(rlang)
var <- 'cyl'
mtcars %>% filter((!!sym(var)) == 4)

# mpg cyl disp hp drat wt qsec vs am gear carb
#1 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
#2 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
#3 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
# ...

Or use as.symbol/as.name from base R:

mtcars %>% filter((!!as.symbol(var)) == 4)

mtcars %>% filter((!!as.name(var)) == 4)
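
Another option, sketched here on the assumption that a reasonably recent dplyr is installed, is the .data pronoun. It looks a column up by its string name, so no symbol has to be built first:

```r
library(dplyr)

var <- "cyl"  # column name held in a string, as in the examples above

# .data[[var]] refers to the column of the current data frame named by var
four_cyl <- mtcars %>% filter(.data[[var]] == 4)
nrow(four_cyl)
# [1] 11
```

This tends to read more plainly than !!sym(var) when the string names a single column.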

Pass a string as variable name in dplyr::mutate

This can be done with := as follows: on the left-hand side, unquote the string with !! to set the column name; on the right-hand side, convert the string to a symbol and unquote it so that it is evaluated as a column:

library(dplyr)
var <- 'vs'  # assumed here, since the check below inspects the vs column
my_mtcars <- mtcars %>%
  mutate(!! var := factor(!! rlang::sym(var)))
class(my_mtcars$vs)
#[1] "factor"

Or, more simply, use mutate_at, which accepts column names as strings in vars() and applies the function of interest:

my_mtcars2 <- mtcars %>%
  mutate_at(vars(var), factor)
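
In current dplyr releases mutate_at is superseded by across(). A minimal sketch of the same conversion, assuming dplyr 1.0.0 or later and with var set to "vs" to match the column checked above:

```r
library(dplyr)

var <- "vs"  # column name held in a string

# all_of() selects the column(s) named by the character vector
my_mtcars3 <- mtcars %>%
  mutate(across(all_of(var), factor))
class(my_mtcars3$vs)
# [1] "factor"
```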

Filter function with variable name in R/dplyr

This is likely a dupe (of the link I provided in my comment), but for your case:

name <- "my_dna_42_x"
gene <- "my_gene_12213"
df2 <- df1 %>%
  group_by(DNA, ID) %>%
  filter(any(DNA == name & ID == gene))
###                      ^--- single '&'

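See the difference between the scalar && and the vectorized &:

c(TRUE, TRUE) && c(TRUE, FALSE)
# [1] TRUE
# (older R versions compare only the first elements; R 4.3.0 and later raise an error here)
c(TRUE, TRUE) & c(TRUE, FALSE)
# [1] TRUE FALSE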

Using a string for variable name in dplyr top_n

You need to use sym() (or as.name() in base R) to turn the string into a symbol, then add !! to unquote it.

top_n(df, 5, !!sym(metric))
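
As a rough sketch of how this looks end to end, with mtcars standing in for df and metric set to "mpg" purely for illustration. Note that top_n has been superseded by slice_max() in dplyr 1.0.0 and later, so both forms are shown:

```r
library(dplyr)

metric <- "mpg"  # hypothetical column name, chosen for illustration

# top_n(): unquote the symbol built from the string
top5_a <- top_n(mtcars, 5, !!rlang::sym(metric))

# slice_max(): same rows here, returned sorted by the metric in descending order
top5_b <- slice_max(mtcars, .data[[metric]], n = 5)
```

Both calls keep ties by default, so they can return more than five rows when values are tied at the cutoff.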

dplyr passing column names as a variable with is.na filter

!! alone is often not enough to unquote variable names held in strings. You usually need it in conjunction with rlang::sym. And if you have more than one variable to unquote, you need !!! together with rlang::syms.

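library(dplyr)

df_construction <- function(selected_month, selected_variable){

  df1 <- airquality %>%
    # keep the requested month and drop rows where the chosen column is NA
    dplyr::filter(Month == selected_month,
                  !is.na(!!rlang::sym(selected_variable))) %>%
    # all_of() selects the column named by the string
    dplyr::select(Month, Day, dplyr::all_of(selected_variable))

  return(df1)
}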

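For select, you can pass variable names as strings directly. dplyr also has the newer {{ }} (embrace) operator for unquoting, but it is designed for unquoted column names passed as function arguments rather than for strings, so it does not work in all cases.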

If you start passing variable names into functions, you might run into difficulties with dplyr. In that respect, data.table is easier to use (see a blog post I wrote on the subject).
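
The multi-variable case mentioned above (!!! with rlang::syms) can be sketched as follows. The grouping columns are a hypothetical choice on mtcars, picked only to keep the example self-contained:

```r
library(dplyr)

group_vars <- c("cyl", "am")  # hypothetical choice of columns

counts <- mtcars %>%
  # syms() turns each string into a symbol; !!! splices them in as separate arguments
  group_by(!!!rlang::syms(group_vars)) %>%
  summarise(n = n(), .groups = "drop")
```

Here group_by receives cyl and am as if they had been typed directly, one argument per string.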

Return variable name as string with dplyr in wrapper function

This kind of provides your expected output ("Var" is a list, so not ideal); does it solve your problem?

library(tidyverse)
data("iris")

fxn1<-function(DF, grp, var){
  out<-DF %>%
    group_by({{grp}}) %>%
    summarize(Mean_Val=mean({{var}}, na.rm=TRUE),
              Var=deparse(substitute({{var}})))
}

Demo1<-fxn1(iris, Species, Petal.Width)
#> `summarise()` has grouped output by 'Species'. You can override using the
#> `.groups` argument.
Demo1
#> # A tibble: 12 × 3
#> # Groups: Species [3]
#> Species Mean_Val Var
#> <fct> <dbl> <chr>
#> 1 setosa 0.246 "(function (...) "
#> 2 setosa 0.246 "{"
#> 3 setosa 0.246 " .External2(ffi_tilde_eval, sys.call(), environment(…
#> 4 setosa 0.246 "})(Petal.Width)"
#> 5 versicolor 1.33 "(function (...) "
#> 6 versicolor 1.33 "{"
#> 7 versicolor 1.33 " .External2(ffi_tilde_eval, sys.call(), environment(…
#> 8 versicolor 1.33 "})(Petal.Width)"
#> 9 virginica 2.03 "(function (...) "
#> 10 virginica 2.03 "{"
#> 11 virginica 2.03 " .External2(ffi_tilde_eval, sys.call(), environment(…
#> 12 virginica 2.03 "})(Petal.Width)"

fxn2<-function(DF, grp, var){
  out<-DF %>%
    group_by({{grp}}) %>%
    summarize(Mean_Val=mean({{var}}, na.rm=TRUE),
              Var=deparse(substitute(var)))
}

Demo2<-fxn2(iris, Species, Petal.Width)
Demo2
#> # A tibble: 3 × 3
#> Species Mean_Val Var
#> <fct> <dbl> <chr>
#> 1 setosa 0.246 var
#> 2 versicolor 1.33 var
#> 3 virginica 2.03 var

Desired<-iris %>% group_by(Species) %>% summarize(Mean_Val=mean(Petal.Width), Var="Petal.Width")
Desired
#> # A tibble: 3 × 3
#> Species Mean_Val Var
#> <fct> <dbl> <chr>
#> 1 setosa 0.246 Petal.Width
#> 2 versicolor 1.33 Petal.Width
#> 3 virginica 2.03 Petal.Width

fxn3 <- function(DF, grp, var){
  DF %>%
    group_by({{grp}}) %>%
    summarize(Mean_Val=mean({{var}}, na.rm=TRUE),
              Var=c(ensym(var)))
}

Demo3 <- fxn3(iris, Species, Petal.Width)
Demo3
#> # A tibble: 3 × 3
#> Species Mean_Val Var
#> <fct> <dbl> <list>
#> 1 setosa 0.246 <sym>
#> 2 versicolor 1.33 <sym>
#> 3 virginica 2.03 <sym>

print.data.frame(Demo3)
#> Species Mean_Val Var
#> 1 setosa 0.246 Petal.Width
#> 2 versicolor 1.326 Petal.Width
#> 3 virginica 2.026 Petal.Width

Created on 2022-04-21 by the reprex package (v2.0.1)
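
If a plain character column is wanted rather than a list of symbols, one possible variation (a sketch, not part of the reprex above; fxn4 and Demo4 are names introduced here for illustration) is to capture the column name as a string before the pipeline runs:

```r
library(dplyr)

fxn4 <- function(DF, grp, var){
  # capture the name of the column passed as var, as a string
  var_name <- rlang::as_name(rlang::ensym(var))
  DF %>%
    group_by({{grp}}) %>%
    summarize(Mean_Val = mean({{var}}, na.rm = TRUE),
              Var = var_name)
}

Demo4 <- fxn4(iris, Species, Petal.Width)
```

Demo4$Var is then an ordinary <chr> column holding "Petal.Width" in each row, which matches the shape of Desired.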

Turn variable name into string inside function dplyr

Abstracting out the plotting part and focusing on the file name, I think you can also use rlang::as_name here to convert the symbol into the string you need.

library(ggplot2)

df <- data.frame(region = c(1, 2), mean_age = c(20, 30))

bar_plot <- function(table, col_plot) {
  # ggplot(table, aes(
  #   x = region,
  #   y = {{ col_plot }}
  # )) +
  #   geom_bar(stat = "identity", fill = "steelblue")
  filename <- glue::glue("results/{rlang::as_name(enquo(col_plot))}.png")
  filename
}

bar_plot(df, mean_age)
#> results/mean_age.png

Note that we need to do two things: first wrap the argument col_plot in enquo, so we get mean_age instead of literally col_plot. Then convert with as_name() to turn the symbol mean_age into the string "mean_age".

How to use string variable as filter condition in dplyr

We can use rlang::parse_expr with eval:

library(dplyr)
cond <- "cyl == 4"  # assumed definition, consistent with the output below
mtcars %>% filter(eval(rlang::parse_expr(cond)))

# mpg cyl disp hp drat wt qsec vs am gear carb
#1 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
#2 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
#3 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
#4 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
#5 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
#6 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
#7 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
#8 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
#9 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
#10 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
#11 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2

Or use eval and parse from base R:

mtcars %>% filter(eval(parse(text = cond)))
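
A variation on the same idea, offered as a sketch: instead of calling eval inside filter, the parsed expression can be unquoted with !!, so that filter evaluates it against the data itself. cond is assumed here to hold "cyl == 4". As with any parse-based approach, the string is executed as code, so it should come from a trusted source.

```r
library(dplyr)

cond <- "cyl == 4"  # assumed filter condition stored as a string

# parse_expr() turns the string into an expression; !! injects it into filter()
res <- mtcars %>% filter(!!rlang::parse_expr(cond))
nrow(res)
# [1] 11
```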

