How Do {{}} Double Curly Brackets Work in Dplyr

How do {{}} double curly brackets work in dplyr?

{{}} (curly-curly) has a lot of applications. It is a form of metaprogramming and is mainly used when writing functions. For example, consider this pipeline:

library(dplyr)
library(rlang)

mtcars %>% group_by(cyl) %>% summarise(new_mpg = mean(mpg))

# A tibble: 3 x 2
# cyl new_mpg
# <dbl> <dbl>
#1 4 26.7
#2 6 19.7
#3 8 15.1

Now, if you want to write this as a function that takes unquoted variable names (not strings), you can use {{}}:

my_fun <- function(data, group_col, col, new_col) {
  data %>%
    group_by({{group_col}}) %>%
    summarise({{new_col}} := mean({{col}}))
}

mtcars %>% my_fun(cyl, mpg, new_mpg)

# cyl new_mpg
# <dbl> <dbl>
#1 4 26.7
#2 6 19.7
#3 8 15.1

Notice that all the variables are passed without quotes, and that the grouping column (cyl), the column being aggregated (mpg) and the name of the new column (new_mpg) are all dynamic. This is just one use case.
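A related pattern, sketched here under the assumption of dplyr 1.0 or later, is to build the new column name from the input column with a glue-style string on the left of `:=`. The helper name `mean_by` is hypothetical and only serves as an illustration.

```r
library(dplyr)

# Hypothetical helper: the output column is named after the input column.
mean_by <- function(data, group_col, col) {
  data %>%
    group_by({{ group_col }}) %>%
    # "mean_{{ col }}" is interpolated to "mean_mpg" when col is mpg
    summarise("mean_{{ col }}" := mean({{ col }}), .groups = "drop")
}

res <- mean_by(mtcars, cyl, mpg)
names(res)
#> [1] "cyl"      "mean_mpg"
```

This removes the need for a separate new_col argument when the name can be derived from the column itself.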

To learn more refer to:

  • Programming with dplyr
  • rlang 0.4.0: curly-curly

How to use double curly brackets in R with ggplot and dates?

The double curly braces work with unquoted variable names. When the variable names are character strings, use aes_string instead (note that aes_string has been deprecated since ggplot2 3.0.0).

suppressPackageStartupMessages({
library(dplyr)
library(ggplot2)
library(lubridate)
})

#Sample data
test <- tibble(Week = seq(as.Date("2014/09/04"), by = "week", length.out = 8),
               Month = ymd(rep('2014-09-01', 4), rep('2014-10-01', 4)),
               Week_score = c(2, 3, 4, 6, 5, 7, 8, 9),
               Month_score = c(15, NA, NA, NA, 29, NA, NA, NA))

#My function
make_time_plot <- function(data, time_period = c("Week", "Month")) {

  time_period <- match.arg(time_period)
  time_period_score <- paste0(time_period, "_score")

  data %>%
    ggplot(aes_string(x = time_period, y = time_period_score)) +
    geom_line()
}

#make_time_plot(data = test, time_period = "Month")
make_time_plot(data = test, time_period = "Week")

Created on 2022-04-02 by the reprex package (v2.0.1)
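Since aes_string is deprecated in recent ggplot2 releases, the same function can also be sketched with the .data pronoun, which looks up a column by its string name. The name `make_time_plot2` and the small `scores` data frame are made up for this illustration.

```r
library(ggplot2)

# Small illustrative data set (invented for this sketch)
scores <- data.frame(
  Week = seq(as.Date("2014-09-04"), by = "week", length.out = 4),
  Week_score = c(2, 3, 4, 6)
)

# Hypothetical variant of make_time_plot that avoids aes_string
make_time_plot2 <- function(data, time_period = c("Week", "Month")) {
  time_period <- match.arg(time_period)
  time_period_score <- paste0(time_period, "_score")

  # .data[[name]] selects the column whose name is stored in the string
  ggplot(data, aes(x = .data[[time_period]], y = .data[[time_period_score]])) +
    geom_line()
}

p <- make_time_plot2(scores, time_period = "Week")
```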

Turn a {{}} (dplyr double curly braces) interpolation into a string

Double curly braces evaluate the variable in the context of the data frame. To use the variable's name as a label, capture the argument with enquo and convert it to a string with rlang::as_label:

library(tidyverse)

f <- function(.df, .var) {
  .df %>% ggplot(aes(x = wt, y = {{ .var }})) +
    geom_point() +
    ggtitle(rlang::as_label(enquo(.var)))
}

f(mtcars, hp)

Created on 2021-01-08 by the reprex package (v0.3.0)
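enquo on its own returns a quosure rather than a character string. As a minimal sketch, rlang::as_label turns the captured argument into text, and it also handles full expressions, not just bare column names. The helper name `label_of` is hypothetical.

```r
library(rlang)

# Hypothetical helper: return the text of whatever was passed as .var
label_of <- function(.var) {
  as_label(enquo(.var))
}

label_of(hp)
#> [1] "hp"
label_of(log(hp))
#> [1] "log(hp)"
```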

In R, how do you select a column while using double curly brackets? Why can't I use the $ operator with curly brackets?

Based on the example, pass a single column as time_range. Filter the data into dat_score1 and dat_score2, keeping the rows where time_range is on or after start_date and where the 'score_1' or 'score_2' column, respectively, is not NA. Then pass each subset to geom_line as its data argument:

library(lubridate)
library(dplyr)
library(ggplot2)

make_chart <- function(data, time_range = Month, start_date = NA_Date_) {

  dat_score1 <- data %>%
    filter(complete.cases(score_1), {{time_range}} >= as.Date(start_date))
  dat_score2 <- data %>%
    filter(complete.cases(score_2),
           {{time_range}} >= as.Date(start_date))

  data %>%
    ggplot(aes(x = {{time_range}})) +
    geom_line(data = dat_score1,
              aes(y = score_1, colour = "red",
                  linetype = "score 1"), size = 1) +
    geom_line(data = dat_score2,
              aes(y = score_2, colour = "blue", linetype = "score 2"), size = 1)

}

-testing

make_chart(data = test, time_range = Month, start_date = '1990-02-06' )
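On the `$` part of the question: `$` expects a literal column name on its right-hand side, so something like `data${{col}}` is not valid R. A minimal sketch of the usual alternative is dplyr::pull, which accepts an embraced argument. The helper name `col_mean` is hypothetical.

```r
library(dplyr)

# Hypothetical helper: extract one column as a vector, then summarise it
col_mean <- function(data, col) {
  data %>%
    pull({{ col }}) %>%   # pull() plays the role of data$col here
    mean()
}

avg_mpg <- col_mean(mtcars, mpg)
```

Here avg_mpg should equal mean(mtcars$mpg).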


How to filter dates with double curly brackets?

{{}} is meant to be used on unquoted variable names, not on strings.

Remove the "" from time_period when calling your function, and it works:

date_filter <- function(data, time_period, start_date = NA_Date_) {
  data %>%
    filter({{time_period}} > as.Date(start_date))
}

date_filter(data = test, time_period = Week, start_date = '2014-09-06')

#> # A tibble: 7 × 4
#> Week Month Week_score Month_score
#> <date> <date> <dbl> <dbl>
#> 1 2014-09-11 2014-09-01 3 NA
#> 2 2014-09-18 2014-09-01 4 NA
#> 3 2014-09-25 2014-09-01 6 NA
#> 4 2014-10-02 2014-10-01 5 29
#> 5 2014-10-09 2014-10-01 7 NA
#> 6 2014-10-16 2014-10-01 8 NA
#> 7 2014-10-23 2014-10-01 9 NA

Created on 2022-04-02 by the reprex package (v2.0.1)

rlang has a nice article which I recommend everybody to read before working with {{}}.

Why does time_period = "Week" work in the first function but not in the other?

Because {{time_period}} gets interpreted as a string, and not the variable Week. You quite literally are trying to compare "Week" > "2014-09-06" in the first function (which is always TRUE), and "Week" > as.Date("2014-09-06") in the second function (which makes R want to convert "Week" into a date, and thus you get an error).
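If the column name really does need to arrive as a string, one option is the .data pronoun instead of {{}}. The following is a minimal sketch; the function name `date_filter_chr` and the `weeks` data are made up for illustration.

```r
library(dplyr)

# Small illustrative data set (invented for this sketch)
weeks <- tibble(
  Week = seq(as.Date("2014-09-04"), by = "week", length.out = 4),
  Week_score = c(2, 3, 4, 6)
)

# Hypothetical variant that takes the column name as a character string
date_filter_chr <- function(data, time_period, start_date) {
  data %>%
    # .data[["Week"]] refers to the Week column, not to the string "Week"
    filter(.data[[time_period]] > as.Date(start_date))
}

res <- date_filter_chr(weeks, time_period = "Week", start_date = "2014-09-06")
nrow(res)
#> [1] 3
```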

Double brackets (i.e., {{}}, curly-curly) do not work within tidyr::complete() and tidyr::nesting()

You can write the complete() call as:

library(dplyr)
library(tidyr)

complete_data <- function(x, variable){
  x %>% complete(Time = seq(min(Time), max(Time), by = 1), {{variable}})
}

cw_complete <- cw_subset %>% complete_data(variable = Chick)

To use nesting(), you can use ensym():

complete_data <- function(x, variable){
  x %>%
    complete(Time = seq(min(Time), max(Time), by = 1),
             nesting(!!ensym(variable)))
}
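Because cw_subset is not defined in this answer, here is a self-contained sketch of the first version on a tiny invented data set (`growth`), to show what the completed result looks like.

```r
library(dplyr)
library(tidyr)

# Invented data: two chicks, each measured at Time 0 and Time 2 only
growth <- tibble(
  Time   = c(0, 2, 0, 2),
  Chick  = c("a", "a", "b", "b"),
  weight = c(40, 55, 42, 60)
)

complete_data <- function(x, variable) {
  x %>% complete(Time = seq(min(Time), max(Time), by = 1), {{ variable }})
}

filled <- complete_data(growth, Chick)
# Time 0, 1, 2 crossed with chicks a and b gives 6 rows;
# the two new Time == 1 rows have weight NA
```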

Using rlang double curly braces {{ in data.table

I don't think you want to use rlang with data.table, since data.table already has more convenient facilities of its own. I also suggest not using setDT here, as it has the side effect of changing dt in place.

library(data.table)

test_dt <- function(dt, col1, col2) {
  as.data.table(dt)[, .(test = mean(.SD[[col1]])), by = c(col2)]
}

test_dt(dt = iris, col1 = 'Sepal.Length', col2 = 'Species')
## Species test
## 1: setosa 5.006
## 2: versicolor 5.936
## 3: virginica 6.588

This also works:

test_dt <- function(dt, col1, col2) {
  as.data.table(dt)[, .(test = mean(get(col1))), by = c(col2)]
}

test_dt(dt=iris, col1='Sepal.Length', col2='Species')
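When several columns need the same summary, a possible generalisation is .SDcols together with lapply. This is only a sketch, and the function name `mean_cols` is hypothetical.

```r
library(data.table)

# Hypothetical helper: mean of several columns (given as strings) per group
mean_cols <- function(dt, cols, by_cols) {
  # .SDcols restricts .SD to the requested columns
  as.data.table(dt)[, lapply(.SD, mean), by = c(by_cols), .SDcols = cols]
}

res <- mean_cols(iris, cols = c("Sepal.Length", "Petal.Length"), by_cols = "Species")
```

The result keeps the original column names, with one row per Species.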

rlang double curly braces within lm() formula

A couple of nuances. First, {{ is generally only supported in functions powered by the tidyverse. The rlang::inject() function lets you extend that support to arbitrary functions.

Second, {{ is shorthand for !!enquo(), which captures the expression provided to the function AND the environment where that expression should be evaluated. Since the environment is already provided by the data frame df, the better verb to use here is ensym(), which captures the symbol only.

The following works with rlang 0.4.10:

my_lm <- function(df, yvar, xvar) {
  ysym <- rlang::ensym(yvar)
  xsym <- rlang::ensym(xvar)
  rlang::inject( lm(!!ysym ~ !!xsym, data=df) )
}

# mpg is the example data set that ships with ggplot2
my_lm(ggplot2::mpg, cty, hwy)
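If the column names are available as strings rather than bare symbols, a base R alternative is stats::reformulate, which builds the formula without rlang. This is a sketch of a different technique from the one above, and the name `my_lm_chr` is hypothetical.

```r
# Hypothetical string-based variant; uses only base R
my_lm_chr <- function(df, yvar, xvar) {
  # reformulate("wt", response = "mpg") builds the formula mpg ~ wt
  lm(reformulate(xvar, response = yvar), data = df)
}

fit <- my_lm_chr(mtcars, yvar = "mpg", xvar = "wt")
names(coef(fit))
#> [1] "(Intercept)" "wt"
```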

In R, how do I get an if statement to recognize if input in double curly brackets is a certain value?

You have to use:

make_plot(test_data, x_variable = "ses")

or alternatively:

ses <- "ses"
make_plot(test_data, x_variable = ses)

This error means that the object ses is not declared.


If you want to be able to pass undeclared objects such as ses as an input, you could use substitute(x_variable) or deparse(substitute(x_variable))

make_plot <- function(data, x_variable) {
  #print(deparse(substitute(x_variable)))
  if(deparse(substitute(x_variable)) == "ses") {
    # use the data argument rather than the global test_data
    ggplot(data = data, aes(x = ses, y = total)) +
      geom_col()
  } else {
    print("This function isn't designed for this variable, sorry!")
  }
}


This is non-standard evaluation however, so make sure this is indeed what you're after as it can lead to surprising behaviours.
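For comparison, the same check can be sketched with rlang instead of deparse(substitute()). ensym accepts either a bare name or a string, so both calling styles give the same answer. The helper name `is_ses` is hypothetical.

```r
library(rlang)

# Hypothetical helper: TRUE when the argument names the ses column
is_ses <- function(x_variable) {
  # ensym() captures the argument as a symbol; as_name() converts it to a string
  as_name(ensym(x_variable)) == "ses"
}

is_ses(ses)
#> [1] TRUE
is_ses("ses")
#> [1] TRUE
is_ses(age)
#> [1] FALSE
```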

Advanced R explains the difference between the two options.

Is it possible to pass multiple variables to the same curly curly?

If your function takes several groups of multiple variables, you need external quoting with vars(). This function simply captures its inputs as a list of expressions:

vars(foo, bar)
#> [[1]]
#> <quosure>
#> expr: ^foo
#> env: global
#>
#> [[2]]
#> <quosure>
#> expr: ^bar
#> env: global

Take an argument that you splice with !!!:

group_mean <- function(.data, .vars, ...) {
  .data <- doingsomethingelse(.data, ...)

  .data %>%
    group_by(!!!.vars) %>%
    summarise_all(mean)
}

Use it like this:

data %>% group_mean(vars(foo, bar), baz, quux)
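With dplyr 1.0 or later, another way to pass several variables through one embraced argument is across(), where the caller wraps the columns in c(). This is a sketch of an alternative to vars(), and the name `group_mean2` is hypothetical.

```r
library(dplyr)

# Hypothetical helper: each argument can hold several columns, e.g. c(cyl, am)
group_mean2 <- function(.data, group_vars, mean_vars) {
  .data %>%
    group_by(across({{ group_vars }})) %>%
    summarise(across({{ mean_vars }}, mean), .groups = "drop")
}

res <- group_mean2(mtcars, c(cyl, am), c(mpg, hp))
names(res)
#> [1] "cyl" "am"  "mpg" "hp"
```

The caller no longer needs vars(), and the function body does not need !!!.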

