What Does the %<>% Operator Mean in R

What does !! operator mean in R

The !! and {{ operators are placeholders to flag a variable as having been quoted. They are usually only needed if you intend to program with the tidyverse.
The tidyverse likes to leverage NSE (non-standard Evaluation) in order to reduce the amount of repetition. The most frequent application is towards the "data.frame" class, in which expressions/symbols are evaluated in the context of a data.frame before searching other scopes.
In order for this to work, some special functions (like in the package dplyr) have arguments that are quoted. To quote an expression, is to save the symbols that make up the expression and prevent the evaluation (in the context of tidyverse they use "quosures", which is like a quoted expression except it contains a reference to the environment the expression was made).
While NSE is great for interactive use, it is notably harder to program with.
Lets consider the dplyr::select

 library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union

iris <- as_tibble(iris)

my_select <- function(.data, col) {
select(.data, col)
}

select(iris, Species)
#> # A tibble: 150 × 1
#> Species
#> <fct>
#> 1 setosa
#> 2 setosa
#> 3 setosa
#> 4 setosa
#> 5 setosa
#> 6 setosa
#> 7 setosa
#> 8 setosa
#> 9 setosa
#> 10 setosa
#> # … with 140 more rows
my_select(iris, Species)
#> Error: object 'Species' not found

we encounter an error because within the scope of my_select
the col argument is evaluated with standard evaluation and
cannot find a variable named Species.

If we attempt to create a variable in the global environemnt, we see that the funciton
works - but it isn't behaving to the heuristics of the tidyverse. In fact,
they produce a note to inform you that this is ambiguous use.

 Species <- "Sepal.Width"
my_select(iris, Species)
#> Note: Using an external vector in selections is ambiguous.
#> ℹ Use `all_of(col)` instead of `col` to silence this message.
#> ℹ See <https://tidyselect.r-lib.org/reference/faq-external-vector.html>.
#> This message is displayed once per session.
#> # A tibble: 150 × 1
#> Sepal.Width
#> <dbl>
#> 1 3.5
#> 2 3
#> 3 3.2
#> 4 3.1
#> 5 3.6
#> 6 3.9
#> 7 3.4
#> 8 3.4
#> 9 2.9
#> 10 3.1
#> # … with 140 more rows

To remedy this, we need
to prevent evaluation with enquo() and unquote with !! or just use {{.

 my_select2 <- function(.data, col) {
col_quo <- enquo(col)
select(.data, !!col_quo) #attempting to find whatever symbols were passed to `col` arugment
}
#' `{{` enables the user to skip using the `enquo()` step.
my_select3 <- function(.data, col) {
select(.data, {{col}})
}

my_select2(iris, Species)
#> # A tibble: 150 × 1
#> Species
#> <fct>
#> 1 setosa
#> 2 setosa
#> 3 setosa
#> 4 setosa
#> 5 setosa
#> 6 setosa
#> 7 setosa
#> 8 setosa
#> 9 setosa
#> 10 setosa
#> # … with 140 more rows
my_select3(iris, Species)
#> # A tibble: 150 × 1
#> Species
#> <fct>
#> 1 setosa
#> 2 setosa
#> 3 setosa
#> 4 setosa
#> 5 setosa
#> 6 setosa
#> 7 setosa
#> 8 setosa
#> 9 setosa
#> 10 setosa
#> # … with 140 more rows

In summary, you really only need !! and {{ if you are trying to apply NSE programatically
or do some type of programming on the language.

!!! is used to splice a list/vector of some sort into arguments of some quoting expression.

 library(rlang)
quo_let <- quo(paste(!!!LETTERS))
quo_let
#> <quosure>
#> expr: ^paste("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L",
#> "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y",
#> "Z")
#> env: global
eval_tidy(quo_let)
#> [1] "A B C D E F G H I J K L M N O P Q R S T U V W X Y Z"

Created on 2021-08-30 by the reprex package (v2.0.1)

Use of $ and %% operators in R

You are not really pulling a value from a function but rather from the list object that the function returns. $ is actually an infix that takes two arguments, the values preceding and following it. It is a convenience function designed that uses non-standard evaluation of its second argument. It's called non-standard because the unquoted characters following $ are first quoted before being used to extract a named element from the first argument.

 t.test  # is the function
t.test(x) # is a named list with one of the names being "p.value"

The value can be pulled in one of three ways:

 t.test(x)$p.value
t.test(x)[['p.value']] # numeric vector
t.test(x)['p.value'] # a list with one item

my.name.for.p.val <- 'p.value'
t.test(x)[[ my.name.for.p.val ]]

When you surround a set of characters with flanking "%"-signs you can create your own vectorized infix function. If you wanted a pmax for which the defautl was na.rm=TRUE do this:

 '%mypmax%' <- function(x,y) pmax(x,y, na.rm=TRUE)

And then use it without quotes:

> c(1:10, NA) %mypmax% c(NA,10:1)
[1] 1 10 9 8 7 6 7 8 9 10 1

What does the %% operator mean in R?

The help, ?magrittr::`%<>%`, answers all your questions, if you are refering to magrittr`s compound assignment pipe-operator:

[...] %<>% is used to update a value
by first piping it into one or more rhs expressions, and then
assigning the result. For example, some_object %<>% foo %>% bar is
equivalent to some_object <- some_object %>% foo %>% bar. It must be
the first pipe-operator in a chain, but otherwise it works like %>%.

So

library(magrittr)
set.seed(1);x <- rnorm(5)
x %<>% abs %>% sort
x
# [1] 0.1836433 0.3295078 0.6264538 0.8356286 1.5952808

is the same as

set.seed(1);x <- rnorm(5)
x <- sort(abs(x))
x
# [1] 0.1836433 0.3295078 0.6264538 0.8356286 1.5952808

What is the difference between '/' operator and %/% operator in r programming language?

/ does the usual division.

Regarding %/%, the documentation states:

%% indicates x mod y and %/% indicates integer division. It
is guaranteed that x == (x %% y) + y * ( x %/% y ) (up to
rounding error) unless y == 0 […]

To find the documentation for such operators, enter the following on the R console:

help("%/%")

What does the double percentage sign (%%) mean?

The "Arithmetic operators" help page (which you can get to via ?"%%") says

%%’ indicates ‘x mod y’

which is only helpful if you've done enough programming to know that this is referring to the modulo operation, i.e. integer-divide x by y and return the remainder. This is useful in many, many, many applications. For example (from @GavinSimpson in comments), %% is useful if you are running a loop and want to print some kind of progress indicator to the screen every nth iteration (e.g. use if (i %% 10 == 0) { #do something} to do something every 10th iteration).

Since %% also works for floating-point numbers in R, I've just dug up an example where if (any(wts %% 1 != 0)) is used to test where any of the wts values are non-integer.

Meaning of @ operator in R language?

meuse is an S4 object

isS4(meuse)
[1] TRUE

If you take the structure of of meuse (str_meuse) you'll see some fields are denoted with your @ operator, including one called data. These slots can be accessed with @ similar to how you might see other slots in other objects accessed using the $ operator. So meuse@data gives you the data portion of the meuse object.

str(meuse)

Formal class 'SpatialPointsDataFrame' [package "sp"] with 5 slots
..@ data :'data.frame': 155 obs. of 12 variables:
.. ..$ cadmium: num [1:155] 11.7 8.6 6.5 2.6 2.8 3 3.2 2.8 2.4 1.6 ...
.. ..$ copper : num [1:155] 85 81 68 81 48 61 31 29 37 24 ...
.. ..$ lead : num [1:155] 299 277 199 116 117 137 132 150 133 80 ...
.. ..$ zinc : num [1:155] 1022 1141 640 257 269 ...
.. ..$ elev : num [1:155] 7.91 6.98 7.8 7.66 7.48 ...
.. ..$ dist : num [1:155] 0.00136 0.01222 0.10303 0.19009 0.27709 ...
.. ..$ om : num [1:155] 13.6 14 13 8 8.7 7.8 9.2 9.5 10.6 6.3 ...
.. ..$ ffreq : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
.. ..$ soil : Factor w/ 3 levels "1","2","3": 1 1 1 2 2 2 2 1 1 2 ...
.. ..$ lime : Factor w/ 2 levels "0","1": 2 2 2 1 1 1 1 1 1 1 ...
.. ..$ landuse: Factor w/ 15 levels "Aa","Ab","Ag",..: 4 4 4 11 4 11 4 2 2 15 ...
.. ..$ dist.m : num [1:155] 50 30 150 270 380 470 240 120 240 420 ...
..@ coords.nrs : int [1:2] 1 2
..@ coords : num [1:155, 1:2] 181072 181025 181165 181298 18130

See how that subsetting is working?

str(meuse@data)
'data.frame': 155 obs. of 12 variables:
$ cadmium: num 11.7 8.6 6.5 2.6 2.8 3 3.2 2.8 2.4 1.6 ...
$ copper : num 85 81 68 81 48 61 31 29 37 24 ...
$ lead : num 299 277 199 116 117 137 132 150 133 80 ...
$ zinc : num 1022 1141 640 257 269 ...
$ elev : num 7.91 6.98 7.8 7.66 7.48 ...
$ dist : num 0.00136 0.01222 0.10303 0.19009 0.27709 ...
$ om : num 13.6 14 13 8 8.7 7.8 9.2 9.5 10.6 6.3 ...
$ ffreq : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
$ soil : Factor w/ 3 levels "1","2","3": 1 1 1 2 2 2 2 1 1 2 ...
$ lime : Factor w/ 2 levels "0","1": 2 2 2 1 1 1 1 1 1 1 ...
$ landuse: Factor w/ 15 levels "Aa","Ab","Ag",..: 4 4 4 11 4 11 4 2 2 15 ...
$ dist.m : num 50 30 150 270 380 470 240 120 240 420 ...

R: What are operators like %in% called and how can I learn about them?

There are several different things going on here with the percent symbol:

Binary Operators

As several have already pointed out, things of the form %%, %in%, %*% are binary operators (respectively modulo, match, and matrix multiply), just like a +, -, etc. They are functions that operate on two arguments that R recognizes as being special due to their name structure (starts and ends with a %). This allows you to use them in form:

Argument1 %fun_name% Argument2

instead of the more traditional:

fun_name(Argument1, Argument2)

Keep in mind that the following are equivalent:

10 %% 2 == `%%`(10, 2)
"hello" %in% c("hello", "world") == `%in%`("hello", c("hello", "world"))
10 + 2 == `+`(10, 2)

R just recognizes the standard operators as well as the %x% operators as special and allows you to use them as traditional binary operators if you don't quote them. If you quote them (in the examples above with backticks), you can use them as standard two argument functions.

Custom Binary Operators

The big difference between the standard binary operators and %x% operators is that you can define custom binary operators and R will recognize them as special and treat them as binary operators:

`%samp%` <- function(e1, e2) sample(e1, e2)
1:10 %samp% 2
# [1] 1 9

Here we defined a binary operator version of the sample function

"%" (Percent) as a token in special function

The meaning of "%" in function like sprintf or format is completely different and has nothing to do with binary operators. The key thing to note is that in those functions the % character is part of a quoted string, and not a standard symbol on the command line (i.e. "%" and % are very different). In the context of sprintf, inside a string, "%" is a special character used to recognize that the subsequent characters have a special meaning and should not be interpreted as regular text. For example, in:

sprintf("I'm a number: %.2f", runif(3))
# [1] "I'm a number: 0.96" "I'm a number: 0.74" "I'm a number: 0.99"

"%.2f" means a floating point number (f) to be displayed with two decimals (.2). Notice how the "I'm a number: " piece is interpreted literally. The use of "%" allows sprintf users to mix literal text with special instructions on how to represent the other sprintf arguments.

What does %% function mean in R?

%...% operators

%>% has no builtin meaning but the user (or a package) is free to define operators of the form %whatever% in any way they like. For example, this function will return a string consisting of its left argument followed by a comma and space and then it's right argument.

"%,%" <- function(x, y) paste0(x, ", ", y)

# test run

"Hello" %,% "World"
## [1] "Hello, World"

The base of R provides %*% (matrix mulitiplication), %/% (integer division), %in% (is lhs a component of the rhs?), %o% (outer product) and %x% (kronecker product). It is not clear whether %% falls in this category or not but it represents modulo.

expm The R package, expm, defines a matrix power operator %^%. For an example see Matrix power in R .

operators The operators R package has defined a large number of such operators such as %!in% (for not %in%). See http://cran.r-project.org/web/packages/operators/operators.pdf

igraph This package defines %--% , %->% and %<-% to select edges.

lubridate This package defines %m+% and %m-% to add and subtract months and %--% to define an interval. igraph also defines %--% .

Pipes

magrittr In the case of %>% the magrittr R package has defined it as discussed in the magrittr vignette. See http://cran.r-project.org/web/packages/magrittr/vignettes/magrittr.html

magittr has also defined a number of other such operators too. See the Additional Pipe Operators section of the prior link which discusses %T>%, %<>% and %$% and http://cran.r-project.org/web/packages/magrittr/magrittr.pdf for even more details.

dplyr The dplyr R package used to define a %.% operator which is similar; however, it has been deprecated and dplyr now recommends that users use %>% which dplyr imports from magrittr and makes available to the dplyr user. As David Arenburg has mentioned in the comments this SO question discusses the differences between it and magrittr's %>% : Differences between %.% (dplyr) and %>% (magrittr)

pipeR The R package, pipeR, defines a %>>% operator that is similar to magrittr's %>% and can be used as an alternative to it. See http://renkun.me/pipeR-tutorial/

The pipeR package also has defined a number of other such operators too. See: http://cran.r-project.org/web/packages/pipeR/pipeR.pdf

postlogic The postlogic package defined %if% and %unless% operators.

wrapr The R package, wrapr, defines a dot pipe %.>% that is an explicit version of %>% in that it does not do implicit insertion of arguments but only substitutes explicit uses of dot on the right hand side. This can be considered as another alternative to %>%. See https://winvector.github.io/wrapr/articles/dot_pipe.html

Bizarro pipe. This is not really a pipe but rather some clever base syntax to work in a way similar to pipes without actually using pipes. It is discussed in http://www.win-vector.com/blog/2017/01/using-the-bizarro-pipe-to-debug-magrittr-pipelines-in-r/ The idea is that instead of writing:

1:8 %>% sum %>% sqrt
## [1] 6

one writes the following. In this case we explicitly use dot rather than eliding the dot argument and end each component of the pipeline with an assignment to the variable whose name is dot (.) . We follow that with a semicolon.

1:8 ->.; sum(.) ->.; sqrt(.)
## [1] 6

Update Added info on expm package and simplified example at top. Added postlogic package.

Update 2 The development version of R has defined a |> pipe. Unlike magrittr's %>% it can only substitute into the first argument of the right hand side. Although limited, it works via syntax transformation so it has no performance impact.

tilde(~) operator in R

From ?lm:

[..] when fitting a linear model y ~ x - 1 specifies a line through the
origin [..]

The "-" in the formula removes a specified term.

So y ~ 1 is just a model with a constant (intercept) and no regressor.

lm(mtcars$mpg ~ 1)
#Call:
#lm(formula = mtcars$mpg ~ 1)
#
#Coefficients:
#(Intercept)
# 20.09

Can it be any other number instead of 1?

No, just try and see.

lm(mtcars$mpg ~ 0) tells R to remove the constant (equal to y ~ -1), and lm(mtcars$mpg ~ 2) gives an error (correctly).

You should read y ~ 1 as y ~ constant inside the formula, it's not a simple number.



Related Topics



Leave a reply



Submit