Curly Curly Tidy Evaluation and Modifying Inputs or Their Names

curly curly Tidy evaluation and modifying inputs or their names

Say you want a version of the following function that takes multiple inputs instead of just a single var:

mean_by <- function(data, var, by) {
  data %>%
    group_by({{ by }}) %>%
    summarise(average = mean({{ var }}, na.rm = TRUE))
}

You can't just pass ... to summarise(), because then the user has to call mean() themselves.

mean_by <- function(data, ..., by) {
  data %>%
    group_by({{ by }}) %>%
    summarise(...)
}

mtcars %>% mean_by(foo = disp)
#> Error: Column `foo` must be length 1 (a summary value), not 32

mtcars %>% mean_by(foo = mean(disp))
#> # A tibble: 1 x 1
#> foo
#> <dbl>
#> 1 231.

The solution is to quote the dots, modify each of the inputs so they are wrapped in a new call to mean(), and then splice them back:

mean_by <- function(data, ..., by) {
  # `.named` makes sure the dots have default names, if not supplied
  dots <- enquos(..., .named = TRUE)

  # Go over all inputs, and wrap them in a call
  dots <- lapply(dots, function(dot) call("mean", dot, na.rm = TRUE))

  # Finally, splice the expressions back into `summarise()`:
  data %>%
    group_by({{ by }}) %>%
    summarise(!!!dots)
}

We are considering how we could improve syntax for this case. Early thoughts at http://rpubs.com/lionel-/superstache
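As a related sketch that is not part of the original answer: in dplyr 1.0.0 and later, across() can apply a function to several columns selected through a single embraced argument, and its .names argument controls the output names. The `vars` argument name and the "mean_{.col}" naming pattern below are illustrative choices.

```r
library(dplyr)

# Hypothetical variant of mean_by(): `vars` is a tidyselect specification
mean_by <- function(data, vars, by) {
  data %>%
    group_by({{ by }}) %>%
    summarise(
      across({{ vars }}, ~ mean(.x, na.rm = TRUE), .names = "mean_{.col}")
    )
}

res <- mtcars %>% mean_by(c(disp, hp), by = cyl)
res
```

The caller selects columns with c(disp, hp) instead of passing them through the dots, which leaves ... free for other uses.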

curly curly tidy evaluation programming with multiple inputs and custom function across columns

Maybe I'm misunderstanding what the issue is, but the standard pattern of forwarding the dots seems to work fine here:

my_function <- function(data, ..., by) {
  data %>%
    group_by({{ by }}) %>%
    filter_at(vars(...), any_vars(n_distinct(.) != 1)) %>%
    ungroup()
}

foo %>%
  my_function(a, b, by = group) # works
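The scoped verbs such as filter_at() are superseded in current dplyr. A minimal sketch of the same dots-forwarding pattern with if_any() (available from dplyr 1.0.4) might look like this. The data frame `foo` below is made up for illustration, since the original question's data is not shown.

```r
library(dplyr)

my_function <- function(data, ..., by) {
  data %>%
    group_by({{ by }}) %>%
    # keep groups where at least one selected column has more than one distinct value
    filter(if_any(c(...), ~ n_distinct(.x) != 1)) %>%
    ungroup()
}

# Made-up example data: only group 2 varies in column `a`
foo <- tibble(
  group = c(1, 1, 2, 2),
  a     = c(1, 1, 1, 2),
  b     = c(3, 3, 3, 3)
)

res <- foo %>% my_function(a, b, by = group)
res
```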

Cannot understand curly-curly in tidyeval in R

You can write this as a function that takes unquoted column names rather than strings by using the curly-curly operator {{ }}:

my_function <- function(test.data, new.col.name, col) {
  test.data %>%
    mutate({{new.col.name}} := lag({{col}}, 2))
}

my_function(test.data, lag_pm10, pm10)

Output:

  pm10 lag_pm10
1    1       NA
2    2       NA
3    3        1
4    4        2
5    5        3
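When the new column name should be derived from the input column instead of being passed separately, rlang's name-glue syntax (rlang 0.4.3 and later) allows {{ }} inside a string on the left-hand side of :=. The helper name `add_lag` and the example data are assumptions made for this sketch.

```r
library(dplyr)

# Made-up example data matching the output shown above
test.data <- data.frame(pm10 = 1:5)

# Hypothetical helper: the new column name is built from the input column name
add_lag <- function(data, col) {
  data %>%
    mutate("{{ col }}_lag2" := lag({{ col }}, 2))
}

res <- add_lag(test.data, pm10)
res
```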

dplyr non standard evaluation with curly curly outside a function

Curly-curly is meant to be used inside functions, with arguments that are passed as unquoted column names.

library(dplyr)
library(rlang)
library(ggplot2) # provides the mpg dataset

my_func <- function(data, var) {
  data %>% group_by({{var}}) %>% summarise(n=n())
}

my_func(mpg, model)

# model n
# <chr> <int>
# 1 4runner 4wd 6
# 2 a4 7
# 3 a4 quattro 8
# 4 a6 quattro 3
# 5 altima 6
# 6 c1500 suburban 2wd 5
# 7 camry 7
# 8 camry solara 7
# 9 caravan 2wd 11
#10 civic 9
# … with 28 more rows

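Outside a function, or when the column name is stored as a string, convert the string to a symbol with sym() and unquote it with !!:

# assumes my_var holds a column name as a string, e.g. my_var <- "model"
mpg %>% group_by(!!sym(my_var)) %>% summarise(n=n())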

Or use group_by_at

mpg %>% group_by_at(my_var) %>% summarise(n=n())
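In dplyr 1.0.0 and later, group_by_at() is superseded. A sketch of the same grouping with across() and all_of() follows. The value assigned to `my_var` is an assumption, since the original snippet does not show it.

```r
library(dplyr)
library(ggplot2) # provides the mpg dataset

# Assumed value; the original answer does not show how my_var is defined
my_var <- "model"

res <- mpg %>%
  group_by(across(all_of(my_var))) %>%
  summarise(n = n())
res
```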

Tidy Evaluation not working with mutate and stringr

Which tidy evaluation tool you need depends on how the inputs are passed to the function.

For example, if you pass the column as an unquoted variable, your attempt works.

library(dplyr)
library(stringr)
library(rlang)

change_fun <- function(df, text_col) {
df %>% mutate({{text_col}} := str_replace_all({{text_col}}, "a","X"))
}

change_fun(iris, Species) %>% head

# Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#1 5.1 3.5 1.4 0.2 setosX
#2 4.9 3.0 1.4 0.2 setosX
#3 4.7 3.2 1.3 0.2 setosX
#4 4.6 3.1 1.5 0.2 setosX
#5 5.0 3.6 1.4 0.2 setosX
#6 5.4 3.9 1.7 0.4 setosX

To pass the column name as a string instead, convert it to a symbol with sym() first and then unquote it with !!.

change_fun <- function(df, text_col) {
df %>% mutate(!!text_col := str_replace_all(!!sym(text_col), "a","X"))
}

change_fun(iris, "Species") %>% head

# Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#1 5.1 3.5 1.4 0.2 setosX
#2 4.9 3.0 1.4 0.2 setosX
#3 4.7 3.2 1.3 0.2 setosX
#4 4.6 3.1 1.5 0.2 setosX
#5 5.0 3.6 1.4 0.2 setosX
#6 5.4 3.9 1.7 0.4 setosX
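Another way to handle string column names, sketched here as an alternative and not taken from the original answer, is the .data pronoun together with rlang's name-glue syntax (rlang 0.4.3 and later) for the left-hand side.

```r
library(dplyr)
library(stringr)

change_fun <- function(df, text_col) {
  df %>%
    # "{text_col}" names the result; .data[[text_col]] looks the column up by name
    mutate("{text_col}" := str_replace_all(.data[[text_col]], "a", "X"))
}

res <- change_fun(iris, "Species")
head(res)
```

This avoids converting the string to a symbol, at the cost of supporting only string input.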

How to use tidy evaluation with column name as strings?

We can also use ensym() with !!:

my_summarise <- function(df, group_var) {
  df %>%
    group_by(!!rlang::ensym(group_var)) %>%
    summarise(a = mean(a))
}

my_summarise(df, 'g1')

Or another option is group_by_at

my_summarise <- function(df, group_var) {
  df %>%
    group_by_at(vars(group_var)) %>%
    summarise(a = mean(a))
}

my_summarise(df, 'g1')

Select named [list] element using tidy evaluation

You can use tidyselect which implements the backend for select():

select2 <- function(.x, ...) {
  vars <- rlang::names2(.x)
  vars <- tidyselect::vars_select(vars, ...)
  .x[vars]
}

x <- list(a = 1, b = 2)
select2(x, dplyr::starts_with("a"))

Note that it's bad practice to implement an S3 method when you don't own either the generic (e.g. select() owned by dplyr) or the class (e.g. list from R core).
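Newer tidyselect releases (1.0.0 onward) provide eval_select(), which takes the data itself and returns named positions. A minimal sketch of the same helper written against that interface, assuming a named list as input:

```r
select2 <- function(.x, ...) {
  # eval_select() returns a named integer vector of positions in `.x`
  pos <- tidyselect::eval_select(rlang::expr(c(...)), .x)
  # the names of `pos` carry any renaming requested in the selection
  rlang::set_names(.x[pos], names(pos))
}

x <- list(a = 1, b = 2)
res <- select2(x, tidyselect::starts_with("a"))
res
```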

How to Enquote Multiple independent vars?

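Here's an approach using pivot_longer (the successor to gather) and the "curly curly" operator, which combines enquo and !! in one step. Note that var here is an ordinary character vector of column names rather than a function argument, so {{ var }} simply injects its value into the selection.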

var <- names(mtcars)[2:11]
var
# [1] "cyl" "disp" "hp" "drat" "wt" "qsec" "vs" "am" "gear" "carb"

library(tidyverse)
mtcars %>%
pivot_longer(cols = {{ var }})

# # A tibble: 320 x 3
#      mpg name  value
#    <dbl> <chr> <dbl>
#  1    21 cyl     6
#  2    21 disp  160
#  3    21 hp    110
#  4    21 drat    3.9
#  5    21 wt      2.62
#  6    21 qsec   16.5
#  7    21 vs      0
#  8    21 am      1
#  9    21 gear    4
# 10    21 carb    4
# … with 310 more rows

Or if you want to stick with gather:

mtcars %>% 
gather("name", "value", {{ var }})
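When the column names are already stored in a character vector, tidyselect's all_of() states that intent explicitly. A short sketch of the pivot_longer call written that way:

```r
library(dplyr)
library(tidyr)

var <- names(mtcars)[2:11]

# all_of() selects exactly the columns named in the character vector
long <- mtcars %>%
  pivot_longer(cols = all_of(var))
long
```

all_of() raises an error if any name in var is missing from the data, which makes typos easier to catch.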

For more on the "curly curly" or "embrace" operator, here are more resources:

https://www.tidyverse.org/blog/2019/06/rlang-0-4-0/#a-simpler-interpolation-pattern-with-

https://sharla.party/post/tidyeval/

https://www.brodrigues.co/blog/2019-06-20-tidy_eval_saga/

R How to use curly curly with filter or filter_?

As Lionel points out, curly-curly works inside functions. To use it with filter, you thus have to wrap the call inside a function.

f <- function(.df, v) {
  filter(.df, {{ v }} > 0)
}

# Curly-curly provides automatic NSE support
f( A, var2 )
# # A tibble: 3 x 3
# var1 var2 var3
# <dbl> <dbl> <dbl>
# 1 -2.35 0.0645 0.460
# 2 0.429 0.959 -0.694
# 3 -0.890 2.42 -0.936

# Strings have to be first converted to symbols
f( A, !!sym("var3") )
# # A tibble: 3 x 3
# var1 var2 var3
# <dbl> <dbl> <dbl>
# 1 -1.21 -0.477 0.134
# 2 -2.35 0.0645 0.460
# 3 -0.575 -0.511 0.575

Curly-curly is meant to reference a single argument. You can extend it to work with multiple variables through sequential application with the help of purrr::reduce. (Don't forget to convert your strings into actual variable names first!):

syms(varnames_2) %>% reduce(f, .init=A)
# # A tibble: 1 x 3
# var1 var2 var3
# <dbl> <dbl> <dbl>
# 1 -2.35 0.0645 0.460
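If the column names are available as strings, a helper built on the .data pronoun can take them directly, and if_all() (dplyr 1.0.4 and later) expresses the combined condition in one step. The data frame `A`, the vector `varnames` and the helper `f_str` below are made up for this sketch.

```r
library(dplyr)
library(purrr)

# Made-up example data and column names
A <- tibble(
  var1 = c(-1, 2, 3),
  var2 = c(1, -2, 3),
  var3 = c(1, 2, 3)
)
varnames <- c("var2", "var3")

# Hypothetical string-based counterpart of f(): looks the column up by name
f_str <- function(.df, v) {
  filter(.df, .data[[v]] > 0)
}

# Sequential application, one column name at a time
res_reduce <- reduce(varnames, f_str, .init = A)

# The same condition on all named columns at once
res_if_all <- A %>% filter(if_all(all_of(varnames), ~ .x > 0))

res_reduce
```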

