How to use non-standard evaluation in R
The symbol returned by sym must be evaluated with eval or rlang::eval_tidy before it can be used in plot. For example:
library(rlang)  # provides sym() and eval_tidy()
a <- 1:10
x <- sym('a')   # the symbol `a`, not the vector it names
plot(eval(x))
plot(rlang::eval_tidy(x))
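!! and !!! are the unquoting operators: inside tidyverse functions they force the captured argument to be evaluated (injected) instead of being taken literally. !! injects a single expression, while !!! splices a list of expressions as separate arguments.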
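To make the difference between the two operators concrete, here is a minimal sketch using rlang::expr, which builds an expression and honours !! and !!!. The names single, spliced and args are only illustrative and do not come from the original answer.

```r
library(rlang)

a <- 1:10
x <- sym("a")            # a symbol pointing at `a`
args <- list(1, 2, 3)    # illustrative list of arguments

# !! injects one expression in place
single <- expr(mean(!!x))

# !!! splices every element of the list as a separate argument
spliced <- expr(sum(!!!args))

print(single)
print(spliced)

# Both are ordinary calls that can be evaluated afterwards
eval(single)
eval(spliced)
```

Printing the two objects shows the calls that were built, which is usually the quickest way to check what an unquoting operator actually did.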
Functions and non-standard evaluation in dplyr
You could do the following:
library(tidyverse)
xy <- data.frame(xvar = 1:10, yvar = 11:20)
plotfunc <- function(data, x, y){
  x <- enquo(x)
  y <- enquo(y)
  print(
    ggplot(data, aes(x = !!x, y = (!!y)^2)) +
      geom_line()
  )
}
plotfunc(xy, xvar, yvar)
Non-standard evaluation basically means that you're passing the argument as an expression rather than a value. quo and enquo also associate an evaluation environment with this expression.
Hadley Wickham introduces it like this in his book:
In most programming languages, you can only access the values of a
function’s arguments. In R, you can also access the code used to
compute them. This makes it possible to evaluate code in non-standard
ways: to use what is known as non-standard evaluation, or NSE for
short. NSE is particularly useful for functions when doing interactive
data analysis because it can dramatically reduce the amount of typing.
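The point about quo and enquo attaching an environment to the expression can be checked with a small sketch. The function make_quosure and the variable local_value are hypothetical names chosen for illustration.

```r
library(rlang)

# The quosure is created inside a function, so it remembers that
# function's environment, where local_value is defined.
make_quosure <- function() {
  local_value <- 10
  quo(local_value * 2)
}

q <- make_quosure()

# local_value does not exist in the global environment, yet the quosure
# can still be evaluated because it carries its own environment.
result <- eval_tidy(q)
print(result)
```

A bare expression created with quote would not keep that link, which is why tidyverse functions work with quosures rather than plain expressions.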
Non standard evaluation in dplyr: how do you indirect a function's multiple arguments?
You can define an argument for the data frame and add ... for the other variables to group by:
library(dplyr)
testfunc <- function(df, ...) {
  df %>%
    group_by(...) %>%
    summarise(mpg = mean(mpg))
}
testfunc(mtcars, cyl, gear)
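Forwarding ... directly is usually enough. If the grouping variables need to be inspected or modified first, one possible variant (a sketch, not part of the original answer) captures them with enquos and splices them back with !!!. The name testfunc2 is hypothetical.

```r
library(dplyr)
library(rlang)

testfunc2 <- function(df, ...) {
  groups <- enquos(...)          # list of quosures, one per grouping variable
  df %>%
    group_by(!!!groups) %>%      # splice them back in as separate arguments
    summarise(mpg = mean(mpg), .groups = "drop")
}

res <- testfunc2(mtcars, cyl, gear)
print(res)
```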
dplyr group_by and summarize with non-standard evaluation
In this case, it is better to use ensym, as we are passing a string. ensym also works with an unquoted argument:
foo2 <- function(df, var) {
  var <- ensym(var)
  df %>%
    group_by(a) %>%
    summarize(trues = sum(!!var),
              falses = sum(!(!!var)))
}
foo2(df, 'b')
# A tibble: 2 x 3
# a trues falses
#* <dbl> <int> <int>
#1 1 2 1
#2 2 1 2
foo2(df, b)
# A tibble: 2 x 3
# a trues falses
#* <dbl> <int> <int>
#1 1 2 1
#2 2 1 2
If the column name is stored in an object, unquote it with !! when calling the function, so that its value is used rather than the literal name of the object:
foo2(df, !!var)
# A tibble: 2 x 3
# a trues falses
#* <dbl> <int> <int>
#1 1 2 1
#2 2 1 2
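The behaviour of ensym with quoted and unquoted input can be verified in isolation. The helper show_name below is a hypothetical function written only for this check.

```r
library(rlang)

# Capture the argument as a symbol and return its name as a string
show_name <- function(var) {
  as_string(ensym(var))
}

from_bare   <- show_name(b)     # unquoted column name
from_string <- show_name("b")   # the same name passed as a string

print(from_bare)
print(from_string)
```

Both calls produce the same symbol, which is why foo2 gives identical results for 'b' and b.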
Non-standard evaluation in dplyr when using dots for variable number of arguments
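Inside your function, across(..., should instead be across(c(...), . Without c(), only the first name in the dots is matched to the column selection and the remaining ones are matched to the following arguments of across.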
library(dplyr, warn.conflicts = FALSE)
sessionInfo()$otherPkgs$dplyr$Version
#> [1] "1.0.7"
tib <- tibble(
x = c("cats and dogs", "foxes and hounds"),
y = c("whales and dolphins", "cats and foxes"),
z = c("dogs and geese", "cats and mice")
)
filter_words <- function(.data, ...) {
  words_to_filter <- c("cat", "dog")
  .data %>% mutate(
    across(c(...), ~ gsub(
      paste0(words_to_filter, collapse = "|"),
      "#@!*", ., perl = TRUE
    ))
  )
}
tib %>%
filter_words(x, y)
#> # A tibble: 2 × 3
#> x y z
#> <chr> <chr> <chr>
#> 1 #@!*s and #@!*s whales and dolphins dogs and geese
#> 2 foxes and hounds #@!*s and foxes cats and mice
Created on 2022-01-17 by the reprex package (v2.0.1)
What is non-standard evaluation and how can you pass an undefined variable to a function in R?
For the second question, the reason you can pass x like a variable rather than a string is non-standard evaluation. Effectively, the function arguments are captured rather than being evaluated immediately, and are then evaluated within a scope where they exist. For example, with the quote() function we can capture the input as-is, rather than looking up the value of mpg. Then we can evaluate it inside another environment, such as the mtcars data frame.
var <- quote(mpg)
var
mpg
eval(var, envir = mtcars)
[1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
[16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
[31] 15.0 21.4
We can make a similar use of NSE within functions:
f <- function(x) {
  input <- substitute(x)
  print(input)
  eval(input, envir = mtcars)
}
Here, we capture whatever was passed to the argument, and then execute it in the scope of the mtcars data frame.
f(cyl)
cyl
[1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4
You can read more about this at the above link and here.
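The same substitute and eval pattern can be generalised so that the data frame is an argument too. The sketch below is an illustration under that assumption; with_data and threshold are hypothetical names. Passing enclos = parent.frame() lets the expression also see variables defined by the caller.

```r
# Evaluate an unevaluated expression with the columns of `data` in scope
with_data <- function(data, expr) {
  captured <- substitute(expr)
  # Columns are looked up in `data` first, anything else in the caller
  eval(captured, envir = data, enclos = parent.frame())
}

threshold <- 25
n_above <- with_data(mtcars, sum(mpg > threshold))
print(n_above)
```

This is essentially how base functions such as with and subset let you refer to columns without quoting them.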
Using Standard Evaluation
We can achieve the same results without NSE, but the way we call the functions will differ. In this case, arguments will be immediately evaluated and you will get an object not found error if you pass an undefined variable to the function.
f <- function(x) {
  print(x)
  mtcars[[x]]
}
To use this function, mpg must be passed as a string.
f("mpg")
[1] "mpg"
[1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
[16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
[31] 15.0 21.4
You can see the results are identical to the first example, but in this case mpg is a string rather than a captured expression. The second line of the function can be interpreted as mtcars[["mpg"]]. Trying to use NSE with this function will result in an error:
f(mpg)
Error in print(x) : object 'mpg' not found
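If a function should accept both forms, one possible approach (a sketch, not something the original answer proposes) is to capture the argument with substitute and convert it to a column name only when it is not already a string. The name get_column is hypothetical.

```r
get_column <- function(x) {
  captured <- substitute(x)
  # A string literal is captured as a character vector; a bare name as a symbol
  name <- if (is.character(captured)) captured else deparse(captured)
  mtcars[[name]]
}

bare   <- get_column(mpg)
quoted <- get_column("mpg")
print(identical(bare, quoted))
```

Note the limitation: a variable that holds a column name, such as v <- "mpg", would be captured as the symbol v and looked up as a column called "v", so this convenience comes at the cost of programmability.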
Non-standard evaluation (NSE) with dplyr in R
You should use curly-curly ({{ }}), which avoids quo and !!. You can also use count, which is a shortcut for group_by + summarise.
table_summary <- function(data, group_by1) {
  data %>%
    # count() names its column `n` by default; name = "N" matches the mutate() below
    dplyr::count({{ group_by1 }}, name = "N") %>%
    dplyr::mutate(pct = paste0(round(N / sum(N) * 100, 2), " %"))
}
table_summary(clientData, agegroup)
It seems agegroup is a string. To continue with the OP's approach, we need to convert it to a symbol (sym) and unquote it (!!):
table_summary <- function(data, group_by1) {
  data %>%
    # group_by1 is a string here, e.g. "agegroup"
    dplyr::group_by(!!rlang::sym(group_by1)) %>%
    dplyr::summarise(N = dplyr::n()) %>%
    dplyr::mutate(pct = paste0(round(N / sum(N) * 100, 2), " %"))
}
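When the column name arrives as a string, another commonly used option is the .data pronoun, which avoids sym and !! altogether. The sketch below uses mtcars instead of the OP's clientData, which is not shown here; table_summary_chr is a hypothetical name.

```r
library(dplyr)

table_summary_chr <- function(data, group_col) {
  data %>%
    group_by(.data[[group_col]]) %>%            # look the column up by its string name
    summarise(N = n(), .groups = "drop") %>%
    mutate(pct = paste0(round(N / sum(N) * 100, 2), " %"))
}

res <- table_summary_chr(mtcars, "cyl")
print(res)
```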
Using strings as arguments in custom dplyr function using non-standard evaluation
You can either use sym to turn "y" into a symbol or parse_expr to parse it into an expression, then unquote it using !!:
library(rlang)
testFun(data.frame(x = c("a", "b", "c"), y = 1:3), !!sym(myVar))
testFun(data.frame(x = c("a", "b", "c"), y = 1:3), !!parse_expr(myVar))
Result:
x y
1 a 0
2 b 100
3 c 200
Check my answer in this question for an explanation of the difference between sym and parse_expr.
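In short, sym always produces a single name, while parse_expr parses the string as R code. A minimal sketch of the difference, with illustrative variable names:

```r
library(rlang)

as_symbol <- sym("y * 100")         # one symbol whose name is literally `y * 100`
as_call   <- parse_expr("y * 100")  # the call y * 100

print(is.symbol(as_symbol))
print(is.call(as_call))

# Only the parsed call can be evaluated as a computation on y
y <- 1:3
scaled <- eval(as_call)
print(scaled)
```

For a plain column name such as "y" both functions return the same symbol, so the choice only matters when the string contains more than a name.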
How to evaluate a constructed string with non-standard evaluation using dplyr?
Use sym and := like this:
library(dplyr)
library(rlang)
t <- tibble( x_01 = c(1, 2, 3), x_02 = c(4, 5, 6))
i <- 1
new <- sym(sprintf("d_%02d", i))
var <- sym(sprintf("x_%02d", i))
t %>% mutate(!!new := (!!var) * 2)
giving:
# A tibble: 3 x 3
x_01 x_02 d_01
<dbl> <dbl> <dbl>
1 1 4 2
2 2 5 4
3 3 6 6
Also note that this is trivial in base R:
tdf <- data.frame( x_01 = c(1, 2, 3), x_02 = c(4, 5, 6))
i <- 1
new <- sprintf("d_%02d", i)
var <- sprintf("x_%02d", i)
tdf[[new]] <- 2 * tdf[[var]]
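Because the base R version works with plain strings, it extends naturally to a loop over several constructed names. The following sketch assumes the same two-column data frame as above:

```r
tdf <- data.frame(x_01 = c(1, 2, 3), x_02 = c(4, 5, 6))

# Build the source and target column names for each index and double the values
for (i in 1:2) {
  new <- sprintf("d_%02d", i)
  var <- sprintf("x_%02d", i)
  tdf[[new]] <- 2 * tdf[[var]]
}

print(tdf)
```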