Difference Between the == and %In% Operators in R

Difference between the == and %in% operators in R

%in% is value matching and "returns a vector of the positions of (first) matches of its first argument in its second" (See help('%in%')) This means you could compare vectors of different lengths to see if elements of one vector match at least one element in another. The length of output will be equal to the length of the vector being compared (the first one).

1:2 %in% rep(1:2,5)
#[1] TRUE TRUE

rep(1:2,5) %in% 1:2
#[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

#Note this output is longer in second

== is logical operator meant to compare if two things are exactly equal. If the vectors are of equal length, elements will be compared element-wise. If not, vectors will be recycled. The length of output will be equal to the length of the longer vector.

1:2 == rep(1:2,5)
#[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

rep(1:2,5) == 1:2
#[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

1:10 %in% 3:7
#[1] FALSE FALSE TRUE TRUE TRUE TRUE TRUE FALSE FALSE FALSE

#is same as

sapply(1:10, function(a) any(a == 3:7))
#[1] FALSE FALSE TRUE TRUE TRUE TRUE TRUE FALSE FALSE FALSE

NOTE: If possible, try to use identical or all.equal instead of == and.

What is the difference between `%in%` and `==`?

The problem is vector recycling.

Your first line does exactly what you'd expect. It checks what elements of df$time are in c(0.5, 3) and returns the values which are.

Your second line is trickier. It's actually equivalent to

df[df$time == rep(c(0.5,3), length.out=nrow(df)),]

To see this, let's see what happens if use a vector rep(0.5, 10):

rep(0.5, 10) == c(0.5, 3)
[1] TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE

See how it returns every odd value. Essentially it's matching 0.5 to the vector c(0.5, 3, 0.5, 3, 0.5...)

You can manipulate a vector to produce no matches this way. Take the vector: rep(c(3, 0.5), 5):

rep(c(3, 0.5), 5) == c(0.5, 3)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

They're all FALSE; you are matching every 0.5 with 3 and vice versa.

What's the difference between `=` and `-` in R?

From here:

The operators <- and = assign into the environment in which they are evaluated. The operator <- can be used anywhere, whereas the operator = is only allowed at the top level (e.g., in the complete expression typed at the command prompt) or as one of the subexpressions in a braced list of expressions.

The difference between & and && in R

& is a logical operator so R coverts your quantities to logical values before comparison. For numeric values any non-0 (and non-NA/Null/NaN stuff) gets the value TRUE and 0 gets FALSE. So with that said things make quite a bit of sense

> as.logical(c(1,2,3))
[1] TRUE TRUE TRUE
> as.logical(c(1,3,3))
[1] TRUE TRUE TRUE
> as.logical(c(1,2,3)) & as.logical(c(1,2,3))
[1] TRUE TRUE TRUE
> as.logical(c(1,2,3)) & as.logical(c(1,3,3))
[1] TRUE TRUE TRUE

What is the difference between = and ==?

It depends on context as to what = means. == is always for testing equality.

= can be

  1. in most cases used as a drop-in replacement for <-, the assignment operator.

    > x = 10
    > x
    [1] 10
  2. used as the separator for key-value pairs used to assign values to arguments in function calls.

    rnorm(n = 10, mean = 5, sd = 2)

Because of 2. above, = can't be used as a drop-in replacement for <- in all situations. Consider

> rnorm(N <- 10, mean = 5, sd = 2)
[1] 4.893132 4.572640 3.801045 3.646863 4.522483 4.881694 6.710255 6.314024
[9] 2.268258 9.387091
> rnorm(N = 10, mean = 5, sd = 2)
Error in rnorm(N = 10, mean = 5, sd = 2) : unused argument (N = 10)
> N
[1] 10

Now some would consider rnorm(N <- 10, mean = 5, sd = 2) poor programming, but it is valid and you need to be aware of the differences between = and <- for assignment.

== is always used for equality testing:

> set.seed(10)
> logi <- sample(c(TRUE, FALSE), 10, replace = TRUE)
> logi
[1] FALSE TRUE TRUE FALSE TRUE TRUE TRUE TRUE FALSE TRUE
> logi == TRUE
[1] FALSE TRUE TRUE FALSE TRUE TRUE TRUE TRUE FALSE TRUE
> seq.int(1, 10) == 5L
[1] FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE

Do be careful with == too however, as it really means exactly equal to and on a computer where floating point operations are involved you may not get the answer you were expecting. For example, from ?'==':

> x1 <- 0.5 - 0.3
> x2 <- 0.3 - 0.1
> x1 == x2 # FALSE on most machines
[1] FALSE
> identical(all.equal(x1, x2), TRUE) # TRUE everywhere
[1] TRUE

where all.equal() tests for equality allowing for a little bit of fuzziness due to loss of precision/floating point operations.

Is there a technical difference between = and -

Yes there is. This is what the help page of '=' says:

The operators <- and = assign into the
environment in which they are
evaluated. The operator <- can be used
anywhere, whereas the operator = is
only allowed at the top level (e.g.,
in the complete expression typed at
the command prompt) or as one of the
subexpressions in a braced list of
expressions.

With "can be used" the help file means assigning an object here. In a function call you can't assign an object with = because = means assigning arguments there.

Basically, if you use <- then you assign a variable that you will be able to use in your current environment. For example, consider:

matrix(1,nrow=2)

This just makes a 2 row matrix. Now consider:

matrix(1,nrow<-2)

This also gives you a two row matrix, but now we also have an object called nrow which evaluates to 2! What happened is that in the second use we didn't assign the argument nrow 2, we assigned an object nrow 2 and send that to the second argument of matrix, which happens to be nrow.

Edit:

As for the edited questions. Both are the same. The use of = or <- can cause a lot of discussion as to which one is best. Many style guides advocate <- and I agree with that, but do keep spaces around <- assignments or they can become quite hard to interpret. If you don't use spaces (you should, except on twitter), I prefer =, and never use ->!

But really it doesn't matter what you use as long as you are consistent in your choice. Using = on one line and <- on the next results in very ugly code.

What is the difference between '/' operator and %/% operator in r programming language?

/ does the usual division.

Regarding %/%, the documentation states:

%% indicates x mod y and %/% indicates integer division. It
is guaranteed that x == (x %% y) + y * ( x %/% y ) (up to
rounding error) unless y == 0 […]

To find the documentation for such operators, enter the following on the R console:

help("%/%")

Boolean operators && and ||

The shorter ones are vectorized, meaning they can return a vector, like this:

((-2:2) >= 0) & ((-2:2) <= 0)
# [1] FALSE FALSE TRUE FALSE FALSE

The longer form evaluates left to right examining only the first element of each vector, so the above gives

((-2:2) >= 0) && ((-2:2) <= 0)
# [1] FALSE

As the help page says, this makes the longer form "appropriate for programming control-flow and [is] typically preferred in if clauses."

So you want to use the long forms only when you are certain the vectors are length one.

You should be absolutely certain your vectors are only length 1, such as in cases where they are functions that return only length 1 booleans. You want to use the short forms if the vectors are length possibly >1. So if you're not absolutely sure, you should either check first, or use the short form and then use all and any to reduce it to length one for use in control flow statements, like if.

The functions all and any are often used on the result of a vectorized comparison to see if all or any of the comparisons are true, respectively. The results from these functions are sure to be length 1 so they are appropriate for use in if clauses, while the results from the vectorized comparison are not. (Though those results would be appropriate for use in ifelse.

One final difference: the && and || only evaluate as many terms as they need to (which seems to be what is meant by short-circuiting). For example, here's a comparison using an undefined value a; if it didn't short-circuit, as & and | don't, it would give an error.

a
# Error: object 'a' not found
TRUE || a
# [1] TRUE
FALSE && a
# [1] FALSE
TRUE | a
# Error: object 'a' not found
FALSE & a
# Error: object 'a' not found

Finally, see section 8.2.17 in The R Inferno, titled "and and andand".

What are the differences between = and - assignment operators?

What are the differences between the assignment operators = and <- in R?

As your example shows, = and <- have slightly different operator precedence (which determines the order of evaluation when they are mixed in the same expression). In fact, ?Syntax in R gives the following operator precedence table, from highest to lowest:


‘-> ->>’ rightwards assignment
‘<- <<-’ assignment (right to left)
‘=’ assignment (right to left)

But is this the only difference?

Since you were asking about the assignment operators: yes, that is the only difference. However, you would be forgiven for believing otherwise. Even the R documentation of ?assignOps claims that there are more differences:

The operator <- can be used anywhere,
whereas the operator = is only allowed at the top level (e.g.,
in the complete expression typed at the command prompt) or as one
of the subexpressions in a braced list of expressions.

Let’s not put too fine a point on it: the R documentation is wrong. This is easy to show: we just need to find a counter-example of the = operator that isn’t (a) at the top level, nor (b) a subexpression in a braced list of expressions (i.e. {…; …}). — Without further ado:

x
# Error: object 'x' not found
sum((x = 1), 2)
# [1] 3
x
# [1] 1

Clearly we’ve performed an assignment, using =, outside of contexts (a) and (b). So, why has the documentation of a core R language feature been wrong for decades?

It’s because in R’s syntax the symbol = has two distinct meanings that get routinely conflated (even by experts, including in the documentation cited above):

  1. The first meaning is as an assignment operator. This is all we’ve talked about so far.
  2. The second meaning isn’t an operator but rather a syntax token that signals named argument passing in a function call. Unlike the = operator it performs no action at runtime, it merely changes the way an expression is parsed.

So how does R decide whether a given usage of = refers to the operator or to named argument passing? Let’s see.

In any piece of code of the general form …

‹function_name›(‹argname› = ‹value›, …)
‹function_name›(‹args›, ‹argname› = ‹value›, …)

… the = is the token that defines named argument passing: it is not the assignment operator. Furthermore, = is entirely forbidden in some syntactic contexts:

if (‹var› = ‹value›) …
while (‹var› = ‹value›) …
for (‹var› = ‹value› in ‹value2›) …
for (‹var1› in ‹var2› = ‹value›) …

Any of these will raise an error “unexpected '=' in ‹bla›”.

In any other context, = refers to the assignment operator call. In particular, merely putting parentheses around the subexpression makes any of the above (a) valid, and (b) an assignment. For instance, the following performs assignment:

median((x = 1 : 10))

But also:

if (! (nf = length(from))) return()

Now you might object that such code is atrocious (and you may be right). But I took this code from the base::file.copy function (replacing <- with =) — it’s a pervasive pattern in much of the core R codebase.

The original explanation by John Chambers, which the the R documentation is probably based on, actually explains this correctly:

[= assignment is] allowed in only two places in the grammar: at the top level (as a complete program or user-typed expression); and when isolated from surrounding logical structure, by braces or an extra pair of parentheses.


In sum, by default the operators <- and = do the same thing. But either of them can be overridden separately to change its behaviour. By contrast, <- and -> (left-to-right assignment), though syntactically distinct, always call the same function. Overriding one also overrides the other. Knowing this is rarely practical but it can be used for some fun shenanigans.



Related Topics



Leave a reply



Submit