Non-Standard Evaluation and Quasiquotation in Dplyr() Not Working as (Naively) Expected

So, I've realized that what I was struggling with in this question (and many other problems) is not really quasiquotation and/or non-standard evaluation, but rather converting character strings into object names. Here is my new solution:

letrs_top.df <- letrs_count.df %>%
    top_n(5, get(count_colname))
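For comparison, here is a minimal sketch of two tidy-eval alternatives to get(). One caveat with get() is that, if the column does not exist, it can silently pick up a variable of the same name from an enclosing environment, whereas .data[[...]] only looks in the data frame. The toy data frame and the column name "count" below are assumptions made up for illustration; they are not the OP's data.

```r
library(dplyr)
library(rlang)

# Hypothetical stand-in for letrs_count.df (illustrative values only)
letrs_count.df <- tibble(
  letter = letters[1:8],
  count  = c(3, 9, 1, 7, 5, 8, 2, 6)
)
count_colname <- "count"

# The approach from the answer: look the column up by name with get()
via_get <- letrs_count.df %>% top_n(5, get(count_colname))

# Convert the string to a symbol and unquote it
via_sym <- letrs_count.df %>% top_n(5, !!sym(count_colname))

# Use the .data pronoun, which only looks inside the data frame
via_data <- letrs_count.df %>% top_n(5, .data[[count_colname]])
```

All three calls should keep the same five rows here.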

how to use non-standard evaluation in R

The value returned by sym() is a symbol, so it has to be evaluated with eval() or rlang::eval_tidy() before it can be passed to plot(). For example:

library(rlang)  # provides sym() and eval_tidy()

a <- 1:10

x <- sym('a')

plot(eval(x))
plot(rlang::eval_tidy(x))

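!! and !!! are unquoting operators: inside tidyverse (tidy eval) functions they force early evaluation of their operand, and !!! additionally splices a list of arguments into the call.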
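To see what the two operators do without involving plot() or dplyr, rlang::expr() can be used to build a call and inspect it. This is only a small sketch; the variables a and b are made up for the example.

```r
library(rlang)

a <- 1:10
b <- 1:10

x    <- sym("a")             # a single symbol
args <- syms(c("a", "b"))    # a list of symbols

# !! inserts one object into the expression being built
e1 <- expr(plot(!!x))        # the call plot(a)

# !!! splices each element of a list in as a separate argument
e2 <- expr(sum(!!!args))     # the call sum(a, b)

total <- eval(e2)            # evaluates sum(a, b)
```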

Non-standard eval in dplyr::mutate

We can convert the string to symbol and then evaluate

tmp %>% 
   mutate(!! pct.name := (!! sym(upper)) / (!! sym(lower)))
# A tibble: 6 x 10
#  qa11a state_abbv fipscode     qa1a reg.pct precleared  year shelby pres  rej.reg
#  <dbl> <chr>      <chr>       <dbl>   <dbl>      <dbl> <dbl>  <dbl> <lgl>   <dbl>
#1  1616 AL         0100100000  34727  0.0465       1.00  2010      0 F      0.0465
#2  7293 AL         0100300000 114952  0.0634       1.00  2010      0 F      0.0634
#3  1528 AL         0100500000  16450  0.0929       1.00  2010      0 F      0.0929
#4  1219 AL         0100700000  12239  0.0996       1.00  2010      0 F      0.0996
#5  2049 AL         0100900000  31874  0.0643       1.00  2010      0 F      0.0643
#6   286 AL         0101100000   7650  0.0374       1.00  2010      0 F      0.0374
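Since tmp, upper, lower and pct.name are not defined in this excerpt, here is a self-contained sketch of the same idea. The toy tmp only carries the two numeric columns, with values copied from the first three rows of the output above, and pct.name is assumed to be "rej.reg" because that is the column the output shows.

```r
library(dplyr)
library(rlang)

# Toy version of tmp (first three rows of the printed output, two columns only)
tmp <- tibble(
  qa11a = c(1616, 7293, 1528),
  qa1a  = c(34727, 114952, 16450)
)

upper    <- "qa11a"
lower    <- "qa1a"
pct.name <- "rej.reg"   # assumed name of the new column

# Strings become symbols with sym(); !! unquotes them inside mutate()
res <- tmp %>%
  mutate(!!pct.name := (!!sym(upper)) / (!!sym(lower)))
```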

When we apply enquo() to a string, the result is a quosure that wraps the string itself (quotes included) rather than a column name:

enquo(upper)
# <quosure>
#  expr: ^"qa11a"
#  env:  empty

Instead of converting from a string, it could be easier to do

upper <- quo(qa11a)
lower <- quo(qa1a)

In the OP's code, calling enquo() on a string object produces a quosure that wraps the string, which is not what is intended:

upper <- "qa11a"
lower <- "qa1a"
enquo(upper)
#<quosure>
#  expr: ^"qa11a"
#  env:  empty

We can compare it to

upper <- quo(qa11a)
lower <- quo(qa1a)
upper
# <quosure>
#  expr: ^qa11a
#  env:  global
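The difference between the two quosures can also be checked programmatically. This is a small sketch using rlang::quo_get_expr() to pull out the wrapped expression; the object names are made up for the example.

```r
library(rlang)

string_quo <- quo("qa11a")   # wraps a character string
symbol_quo <- quo(qa11a)     # wraps a symbol (a column name)

# The string quosure holds a character vector, which dplyr treats as a constant
string_is_chr <- is.character(quo_get_expr(string_quo))

# The symbol quosure holds a name that dplyr can look up in the data
symbol_is_sym <- is.symbol(quo_get_expr(symbol_quo))

# sym() turns the string into the same symbol
same_symbol <- identical(sym("qa11a"), quo_get_expr(symbol_quo))
```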

and executing it

tmp %>% 
     mutate(!! pct.name := (!! upper)/ (!! lower))
# A tibble: 6 x 10
#  qa11a state_abbv fipscode     qa1a reg.pct precleared  year shelby pres  rej.reg
#  <dbl> <chr>      <chr>       <dbl>   <dbl>      <dbl> <dbl>  <dbl> <lgl>   <dbl>
#1  1616 AL         0100100000  34727  0.0465       1.00  2010      0 F      0.0465
#2  7293 AL         0100300000 114952  0.0634       1.00  2010      0 F      0.0634
#3  1528 AL         0100500000  16450  0.0929       1.00  2010      0 F      0.0929
#4  1219 AL         0100700000  12239  0.0996       1.00  2010      0 F      0.0996
#5  2049 AL         0100900000  31874  0.0643       1.00  2010      0 F      0.0643
#6   286 AL         0101100000   7650  0.0374       1.00  2010      0 F      0.0374

Using pre-existing character vectors in quasiquotation of an expression with rlang

Up to and including dplyr 0.5.0, the underlying framework for non-standard evaluation was lazyeval, which required special consideration for strings. Hadley Wickham then released a fundamentally new version of dplyr built on a new foundation called rlang, which provides a more consistent framework for non-standard evaluation. This was version 0.7.0; here's an explanation of why 0.6.0 was skipped: https://blog.rstudio.org/2017/06/13/dplyr-0-7-0/

The following now works without any special considerations:

library("tidyverse")
my_cols <- c("Petal.Width", "Petal.Length")
iris %>%
  select(my_cols)
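In more recent releases, passing a bare character vector to select() is treated as ambiguous (is my_cols a column or an external variable?), and wrapping it in all_of() is the recommended way to make the intent explicit. A sketch, assuming a dplyr/tidyselect version recent enough to provide all_of():

```r
library(dplyr)

my_cols <- c("Petal.Width", "Petal.Length")

# all_of() states that my_cols is an external character vector of column names
sel <- iris %>%
  select(all_of(my_cols))
```

all_of() errors if any of the names is missing from the data, which is usually what you want when the vector comes from user input.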

Note that the new rlang framework also lets you capture a list of bare (unquoted) column names as quosures and splice them in with !!!:

my_quos <- quos(Petal.Width, Petal.Length)
iris %>%
  select(!!!my_quos)

You can read more about programming with dplyr here - http://dplyr.tidyverse.org/articles/programming.html

Comparison in Shiny

library("shiny")
library("tidyverse")
library("DT")
library("rlang")
shinyApp(
  ui = fluidPage(
    selectInput(
      "cols_to_show",
      "Columns to show",
      choices = colnames(iris),
      multiple = TRUE
    ),
    dataTableOutput("verb_table"),
    dataTableOutput("tidyeval_table")
  ),
  server = function(input, output) {
    output$verb_table <- renderDataTable({
      iris %>%
        select_(.dots = input$cols_to_show)

    })

    output$tidyeval_table <- renderDataTable({
      iris %>%
        select(!!!syms(input$cols_to_show))

    })
  }
)

Non-Standard Evaluation and Character Vectors

For standard evaluation, use the variants of the dplyr verbs that have an underscore appended to their name; in this case that is select_(). You also need the .dots argument to pass your vector into the call. (These underscored verbs were deprecated in dplyr 0.7.0 in favour of tidy evaluation.)

d %>% select_(.dots = v)

See help(select) and vignette("nse") for more.

Create R function using dplyr::filter problem

The reason it did not work in your original function is that col_1 was a string, but dplyr::filter() expects an "unquoted" variable on the LHS. Thus, you need to first convert col_1 to a symbol using sym() and then unquote it inside filter() using !! (bang bang).

rlang has a really nice function, qq_show(), that shows what actually happens with quoting and unquoting.
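The OP's function is not part of this excerpt, so here is a sketch of the pattern with a hypothetical filter_func() applied to iris. The function name, its arguments and the choice of data are all assumptions for illustration.

```r
library(dplyr)
library(rlang)

# Hypothetical function: col_1 is a column name passed as a string
filter_func <- function(df, col_1, val_1) {
  col_1 <- sym(col_1)                 # string -> symbol
  df %>% filter((!!col_1) == val_1)   # unquote the symbol on the LHS
}

res <- filter_func(iris, "Species", "setosa")

# qq_show() prints the expression as it looks after unquoting
col_1 <- sym("Species")
qq_show((!!col_1) == "setosa")
```

The parentheses around !!col_1 are not strictly required with current rlang, but they make the intended precedence explicit.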

Using eval(parse()) construction within dplyr

Try select(df, !!paste0('Peter_', target))
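As df and target are not shown here, the sketch below uses a made-up data frame and target value to compare the unquoting approach with the eval(parse()) construction it replaces.

```r
library(dplyr)

# Hypothetical data and target (illustrative only)
df <- tibble(Peter_a = 1:3, Peter_b = 4:6, other = 7:9)
target <- "b"

# Build the column name as a string and unquote it inside select()
via_unquote <- select(df, !!paste0("Peter_", target))

# The eval(parse()) construction: builds and parses source code at run time
via_parse <- eval(parse(text = paste0("select(df, Peter_", target, ")")))
```

Both select the same column, but the first version never turns data into code, so a stray value in target cannot change what gets executed.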

Why is enquo + !! preferable to substitute + eval

I want to give an answer that is independent of dplyr, because there is a very clear advantage to using enquo over substitute. Both look in the calling environment of a function to identify the expression that was given to that function. The difference is that substitute() does it only once, while !!enquo() will correctly walk up the entire calling stack.

Consider a simple function that uses substitute():

f <- function( myExpr ) {
  eval( substitute(myExpr), list(a=2, b=3) )
}

f(a+b)   # 5
f(a*b)   # 6

This functionality breaks when the call is nested inside another function:

g <- function( myExpr ) {
  val <- f( substitute(myExpr) )
  ## Do some stuff
  val
}

g(a+b)
# myExpr     <-- OOPS

Now consider the same functions re-written using enquo():

library( rlang )

f2 <- function( myExpr ) {
  eval_tidy( enquo(myExpr), list(a=2, b=3) )
}

g2 <- function( myExpr ) {
  val <- f2( !!enquo(myExpr) )
  val
}

g2( a+b )    # 5
g2( b/a )    # 1.5

And that is why enquo() + !! is preferable to substitute() + eval(). dplyr simply takes full advantage of this property to build a coherent set of NSE functions.

UPDATE: rlang 0.4.0 introduced a new operator {{ (pronounced "curly curly"), which is effectively shorthand for !!enquo(). This allows us to simplify the definition of g2 to

g2 <- function( myExpr ) {
  val <- f2( {{myExpr}} )
  val
}

Using Dplyr within a user-defined function to summarise data then plot it

First of all, inside dplyr functions you don't need to refer to variables by indexing the data frame, as in df[, timevar]; just use the variable name. Besides that, df[timevar] returns a one-column data frame rather than the column itself, which is not what functions like mean() expect.

About the function, it's a problem of evaluation.

This structure below is working:

ConsistencyPlot <- function(df, var1, timevar, lossvar){
  var1 <- enquo(var1)
  timevar <- enquo(timevar)
  lossvar <- enquo(lossvar)

  df1 <- df %>%
    group_by(!!timevar, !!var1) %>%
    summarise(MeanLoss = mean(!!lossvar))

  ggplot(df1, aes(x = !!var1, y = MeanLoss, color = !!timevar, group = !!timevar)) +
    geom_line() +
    geom_point()
}

Note that the arguments are captured with enquo() and then unquoted with !! inside the dplyr and ggplot2 calls. So, you can pass the arguments without quoting them.

ConsistencyPlot(df, JudicialOrientation, Year, Loss)
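With rlang 0.4.0 or later (and ggplot2 3.0.0 or later), the same function can probably be written more compactly with the {{ }} operator, which combines enquo() and !! in one step. This is a sketch under those version assumptions; ConsistencyPlot2 and the toy data frame are made up for illustration, and .groups = "drop" assumes dplyr 1.0.0 or later.

```r
library(dplyr)
library(ggplot2)

# Hypothetical variant of ConsistencyPlot using {{ }} instead of enquo() + !!
ConsistencyPlot2 <- function(df, var1, timevar, lossvar) {
  df1 <- df %>%
    group_by({{ timevar }}, {{ var1 }}) %>%
    summarise(MeanLoss = mean({{ lossvar }}), .groups = "drop")

  ggplot(df1, aes(x = {{ var1 }}, y = MeanLoss,
                  color = {{ timevar }}, group = {{ timevar }})) +
    geom_line() +
    geom_point()
}

# Toy data with the column names used in the call above (values are invented)
toy <- data.frame(
  JudicialOrientation = rep(c(1, 2, 3), times = 4),
  Year = rep(c(2010, 2011), each = 6),
  Loss = c(10, 20, 30, 12, 22, 32, 11, 21, 31, 13, 23, 33)
)

p <- ConsistencyPlot2(toy, JudicialOrientation, Year, Loss)
```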

I hope you find it useful.
