R: What Do You Call the :: and ::: Operators and How Do They Differ

R: What do you call the :: and ::: operators and how do they differ?

It turns out there is a unique way to access help info for operators such as these colons: add quotations marks around the operator. [E.g., ?'::' or help(":::")].

  • Also, instead of quotation marks, back-ticks (i.e, ` ) also work.

Double Colon Operator and Triple Colon Operator

The answer to the question can be found on the help page for "Double Colon and Triple Colon Operators" (see here).

For a package pkg, pkg::name returns the value of the exported variable name in namespace pkg, whereas pkg:::name returns the value of the internal variable name. The package namespace will be loaded if it was not loaded before the call, but the package will not be attached to the search path.

The difference can be seen by examining the code of each:

> `::`
function (pkg, name)
{
pkg <- as.character(substitute(pkg))
name <- as.character(substitute(name))
getExportedValue(pkg, name)
}
<bytecode: 0x00000000136e2ae8>
<environment: namespace:base>

> `:::`
function (pkg, name)
{
pkg <- as.character(substitute(pkg))
name <- as.character(substitute(name))
get(name, envir = asNamespace(pkg), inherits = FALSE)
}
<bytecode: 0x0000000013482f50>
<environment: namespace:base>

:: calls getExportedValue(pkg, name), returning the value of the exported variable name in the package's namespace.

::: calls get(name, envir = asNamespace(pkg), inherits = FALSE), searching for the object name in the Namespace environment of the package, and returning the value of the internal variable name.


So, what exactly is a namespace?

This site does a good job of explaining the concept of namespaces in R. Importantly:

As the name suggests, namespaces provide “spaces” for “names”. They provide a context for looking up the value of an object associated with a name.

Is there a technical difference between = and -

Yes there is. This is what the help page of '=' says:

The operators <- and = assign into the
environment in which they are
evaluated. The operator <- can be used
anywhere, whereas the operator = is
only allowed at the top level (e.g.,
in the complete expression typed at
the command prompt) or as one of the
subexpressions in a braced list of
expressions.

With "can be used" the help file means assigning an object here. In a function call you can't assign an object with = because = means assigning arguments there.

Basically, if you use <- then you assign a variable that you will be able to use in your current environment. For example, consider:

matrix(1,nrow=2)

This just makes a 2 row matrix. Now consider:

matrix(1,nrow<-2)

This also gives you a two row matrix, but now we also have an object called nrow which evaluates to 2! What happened is that in the second use we didn't assign the argument nrow 2, we assigned an object nrow 2 and send that to the second argument of matrix, which happens to be nrow.

Edit:

As for the edited questions. Both are the same. The use of = or <- can cause a lot of discussion as to which one is best. Many style guides advocate <- and I agree with that, but do keep spaces around <- assignments or they can become quite hard to interpret. If you don't use spaces (you should, except on twitter), I prefer =, and never use ->!

But really it doesn't matter what you use as long as you are consistent in your choice. Using = on one line and <- on the next results in very ugly code.

What's the difference between `=` and ` -` in R?

From here:

The operators <- and = assign into the environment in which they are evaluated. The operator <- can be used anywhere, whereas the operator = is only allowed at the top level (e.g., in the complete expression typed at the command prompt) or as one of the subexpressions in a braced list of expressions.

What are the differences between = and - assignment operators?


What are the differences between the assignment operators = and <- in R?

As your example shows, = and <- have slightly different operator precedence (which determines the order of evaluation when they are mixed in the same expression). In fact, ?Syntax in R gives the following operator precedence table, from highest to lowest:


‘-> ->>’ rightwards assignment
‘<- <<-’ assignment (right to left)
‘=’ assignment (right to left)

But is this the only difference?

Since you were asking about the assignment operators: yes, that is the only difference. However, you would be forgiven for believing otherwise. Even the R documentation of ?assignOps claims that there are more differences:

The operator <- can be used anywhere,
whereas the operator = is only allowed at the top level (e.g.,
in the complete expression typed at the command prompt) or as one
of the subexpressions in a braced list of expressions.

Let’s not put too fine a point on it: the R documentation is wrong. This is easy to show: we just need to find a counter-example of the = operator that isn’t (a) at the top level, nor (b) a subexpression in a braced list of expressions (i.e. {…; …}). — Without further ado:

x
# Error: object 'x' not found
sum((x = 1), 2)
# [1] 3
x
# [1] 1

Clearly we’ve performed an assignment, using =, outside of contexts (a) and (b). So, why has the documentation of a core R language feature been wrong for decades?

It’s because in R’s syntax the symbol = has two distinct meanings that get routinely conflated (even by experts, including in the documentation cited above):

  1. The first meaning is as an assignment operator. This is all we’ve talked about so far.
  2. The second meaning isn’t an operator but rather a syntax token that signals named argument passing in a function call. Unlike the = operator it performs no action at runtime, it merely changes the way an expression is parsed.

So how does R decide whether a given usage of = refers to the operator or to named argument passing? Let’s see.

In any piece of code of the general form …

‹function_name›(‹argname› = ‹value›, …)
‹function_name›(‹args›, ‹argname› = ‹value›, …)

… the = is the token that defines named argument passing: it is not the assignment operator. Furthermore, = is entirely forbidden in some syntactic contexts:

if (‹var› = ‹value›) …
while (‹var› = ‹value›) …
for (‹var› = ‹value› in ‹value2›) …
for (‹var1› in ‹var2› = ‹value›) …

Any of these will raise an error “unexpected '=' in ‹bla›”.

In any other context, = refers to the assignment operator call. In particular, merely putting parentheses around the subexpression makes any of the above (a) valid, and (b) an assignment. For instance, the following performs assignment:

median((x = 1 : 10))

But also:

if (! (nf = length(from))) return()

Now you might object that such code is atrocious (and you may be right). But I took this code from the base::file.copy function (replacing <- with =) — it’s a pervasive pattern in much of the core R codebase.

The original explanation by John Chambers, which the the R documentation is probably based on, actually explains this correctly:

[= assignment is] allowed in only two places in the grammar: at the top level (as a complete program or user-typed expression); and when isolated from surrounding logical structure, by braces or an extra pair of parentheses.


In sum, by default the operators <- and = do the same thing. But either of them can be overridden separately to change its behaviour. By contrast, <- and -> (left-to-right assignment), though syntactically distinct, always call the same function. Overriding one also overrides the other. Knowing this is rarely practical but it can be used for some fun shenanigans.

What does !! operator mean in R

The !! and {{ operators are placeholders to flag a variable as having been quoted. They are usually only needed if you intend to program with the tidyverse.
The tidyverse likes to leverage NSE (non-standard Evaluation) in order to reduce the amount of repetition. The most frequent application is towards the "data.frame" class, in which expressions/symbols are evaluated in the context of a data.frame before searching other scopes.
In order for this to work, some special functions (like in the package dplyr) have arguments that are quoted. To quote an expression, is to save the symbols that make up the expression and prevent the evaluation (in the context of tidyverse they use "quosures", which is like a quoted expression except it contains a reference to the environment the expression was made).
While NSE is great for interactive use, it is notably harder to program with.
Lets consider the dplyr::select

 library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union

iris <- as_tibble(iris)

my_select <- function(.data, col) {
select(.data, col)
}

select(iris, Species)
#> # A tibble: 150 × 1
#> Species
#> <fct>
#> 1 setosa
#> 2 setosa
#> 3 setosa
#> 4 setosa
#> 5 setosa
#> 6 setosa
#> 7 setosa
#> 8 setosa
#> 9 setosa
#> 10 setosa
#> # … with 140 more rows
my_select(iris, Species)
#> Error: object 'Species' not found

we encounter an error because within the scope of my_select
the col argument is evaluated with standard evaluation and
cannot find a variable named Species.

If we attempt to create a variable in the global environemnt, we see that the funciton
works - but it isn't behaving to the heuristics of the tidyverse. In fact,
they produce a note to inform you that this is ambiguous use.

 Species <- "Sepal.Width"
my_select(iris, Species)
#> Note: Using an external vector in selections is ambiguous.
#> ℹ Use `all_of(col)` instead of `col` to silence this message.
#> ℹ See <https://tidyselect.r-lib.org/reference/faq-external-vector.html>.
#> This message is displayed once per session.
#> # A tibble: 150 × 1
#> Sepal.Width
#> <dbl>
#> 1 3.5
#> 2 3
#> 3 3.2
#> 4 3.1
#> 5 3.6
#> 6 3.9
#> 7 3.4
#> 8 3.4
#> 9 2.9
#> 10 3.1
#> # … with 140 more rows

To remedy this, we need
to prevent evaluation with enquo() and unquote with !! or just use {{.

 my_select2 <- function(.data, col) {
col_quo <- enquo(col)
select(.data, !!col_quo) #attempting to find whatever symbols were passed to `col` arugment
}
#' `{{` enables the user to skip using the `enquo()` step.
my_select3 <- function(.data, col) {
select(.data, {{col}})
}

my_select2(iris, Species)
#> # A tibble: 150 × 1
#> Species
#> <fct>
#> 1 setosa
#> 2 setosa
#> 3 setosa
#> 4 setosa
#> 5 setosa
#> 6 setosa
#> 7 setosa
#> 8 setosa
#> 9 setosa
#> 10 setosa
#> # … with 140 more rows
my_select3(iris, Species)
#> # A tibble: 150 × 1
#> Species
#> <fct>
#> 1 setosa
#> 2 setosa
#> 3 setosa
#> 4 setosa
#> 5 setosa
#> 6 setosa
#> 7 setosa
#> 8 setosa
#> 9 setosa
#> 10 setosa
#> # … with 140 more rows

In summary, you really only need !! and {{ if you are trying to apply NSE programatically
or do some type of programming on the language.

!!! is used to splice a list/vector of some sort into arguments of some quoting expression.

 library(rlang)
quo_let <- quo(paste(!!!LETTERS))
quo_let
#> <quosure>
#> expr: ^paste("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L",
#> "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y",
#> "Z")
#> env: global
eval_tidy(quo_let)
#> [1] "A B C D E F G H I J K L M N O P Q R S T U V W X Y Z"

Created on 2021-08-30 by the reprex package (v2.0.1)

Boolean operators && and ||

The shorter ones are vectorized, meaning they can return a vector, like this:

((-2:2) >= 0) & ((-2:2) <= 0)
# [1] FALSE FALSE TRUE FALSE FALSE

The longer form evaluates left to right examining only the first element of each vector, so the above gives

((-2:2) >= 0) && ((-2:2) <= 0)
# [1] FALSE

As the help page says, this makes the longer form "appropriate for programming control-flow and [is] typically preferred in if clauses."

So you want to use the long forms only when you are certain the vectors are length one.

You should be absolutely certain your vectors are only length 1, such as in cases where they are functions that return only length 1 booleans. You want to use the short forms if the vectors are length possibly >1. So if you're not absolutely sure, you should either check first, or use the short form and then use all and any to reduce it to length one for use in control flow statements, like if.

The functions all and any are often used on the result of a vectorized comparison to see if all or any of the comparisons are true, respectively. The results from these functions are sure to be length 1 so they are appropriate for use in if clauses, while the results from the vectorized comparison are not. (Though those results would be appropriate for use in ifelse.

One final difference: the && and || only evaluate as many terms as they need to (which seems to be what is meant by short-circuiting). For example, here's a comparison using an undefined value a; if it didn't short-circuit, as & and | don't, it would give an error.

a
# Error: object 'a' not found
TRUE || a
# [1] TRUE
FALSE && a
# [1] FALSE
TRUE | a
# Error: object 'a' not found
FALSE & a
# Error: object 'a' not found

Finally, see section 8.2.17 in The R Inferno, titled "and and andand".

What exactly is ggplot doing with the `+` operator?

I'll answer the first question. You should ask the second question in a separate posting.

R lets you override most operators. The easiest way to do it is using the "S3" object system. This is a very simple system where you attach an attribute named "class" to the object, and that affects how R processes some functions. (The ones this applies to are called "generic functions". There are other functions that don't pay any attention to the class.)

Each ggplot2 function returns an object with a class. You can use the class() function to get the class. For example, class(ggplot(data = "mtcars")) is a character vector containing c("gg", "ggplot"), and class(geom_histogram(bins = 10, color="purple", fill="white")) is the vector c("LayerInstance","Layer","ggproto","gg").

If you ask for methods("+") you'll see all the classes with methods defined for addition, and that includes "gg", so R will call that method to process the addition in the expression you used.

How does one look up documentation for infix operators in R?

For basic operators you need to put quotes around them.

Also here is a list of R base package functions for you to play around with.

?"%*%"
# matmult {base} R Documentation
# Matrix Multiplication ...

? "'"
# Quote
# Quotes {base} R Documentation
# Quotes ...

Also has hadley mentioned:

?"?"
# Question {utils} R Documentation
# Documentation Shortcuts ...
?"??"
# help.search {utils} R Documentation
# Search the Help System

Difference between the == and %in% operators in R

%in% is value matching and "returns a vector of the positions of (first) matches of its first argument in its second" (See help('%in%')) This means you could compare vectors of different lengths to see if elements of one vector match at least one element in another. The length of output will be equal to the length of the vector being compared (the first one).

1:2 %in% rep(1:2,5)
#[1] TRUE TRUE

rep(1:2,5) %in% 1:2
#[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

#Note this output is longer in second

== is logical operator meant to compare if two things are exactly equal. If the vectors are of equal length, elements will be compared element-wise. If not, vectors will be recycled. The length of output will be equal to the length of the longer vector.

1:2 == rep(1:2,5)
#[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

rep(1:2,5) == 1:2
#[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

1:10 %in% 3:7
#[1] FALSE FALSE TRUE TRUE TRUE TRUE TRUE FALSE FALSE FALSE

#is same as

sapply(1:10, function(a) any(a == 3:7))
#[1] FALSE FALSE TRUE TRUE TRUE TRUE TRUE FALSE FALSE FALSE

NOTE: If possible, try to use identical or all.equal instead of == and.



Related Topics



Leave a reply



Submit