Do you reassign == and != to isTRUE( all.equal() )?
As @joran alluded to, you'll run into floating point issues with == and != in pretty much any other language too. One important aspect of them in R is the vectorization part.
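To illustrate the "any other language" point, here is a minimal Python sketch (variable names are illustrative only). It shows exact comparison failing on a classic case, a tolerance-based comparison with the standard library's math.isclose, and the element-by-element loop that R's vectorized == makes unnecessary:

```python
import math

# 0.1 + 0.2 is not exactly 0.3 in binary floating point,
# so the exact comparison is False.
exact = (0.1 + 0.2 == 0.3)

# math.isclose compares with a relative tolerance (default rel_tol=1e-09).
close = math.isclose(0.1 + 0.2, 0.3)

# Element-wise comparison has to be written out explicitly here;
# in R, == already works element by element.
xs = [1.0, 1.0 + 1e-12, 1.1]
pairwise = [math.isclose(x, 1.0) for x in xs]

print(exact, close, pairwise)  # False True [True, True, False]
```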
It would be much better to define a new function almostEqual, fuzzyEqual or similar. It is unfortunate that there is no such base function. all.equal isn't very efficient since it handles all kinds of objects and returns a string describing the difference when mostly you just want TRUE or FALSE.
Here's an example of such a function. It's vectorized like ==.
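almostEqual <- function(x, y, tolerance = 1e-8) {
  diff <- abs(x - y)
  mag <- pmax(abs(x), abs(y))
  # Relative difference when the magnitude exceeds the tolerance,
  # absolute difference otherwise (avoids dividing by values near zero)
  ifelse(mag > tolerance, diff / mag <= tolerance, diff <= tolerance)
}
almostEqual(1, c(1+1e-8, 1+2e-8)) # [1] TRUE FALSE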
It is around 2x faster than all.equal for scalar values, and much faster with vectors.
x <- 1
y <- 1+1e-8
system.time(for(i in 1:1e4) almostEqual(x, y)) # 0.44 seconds
system.time(for(i in 1:1e4) all.equal(x, y)) # 0.93 seconds
Why doesn't all.equal work within dplyr's mutate function?
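As @DavidArenburg said, all.equal() is not vectorized: it compares its two arguments as whole objects and returns a single result, so inside mutate() it does not produce one value per row.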
The following code will work:
mutate(patientdata, isAge34 = age == 34)
Incorrect logical return for simple sequence
You can approximate what all.equal is doing by writing a comparison function of your own (note that this version uses an absolute tolerance, whereas all.equal normally compares the relative difference):
is.nearenough <- function(x, y, tol = .Machine$double.eps^0.5) {
  abs(x - y) < tol
}
Then you can do which(is.nearenough(s, 1.2)), where s is your sequence. You may need to tweak the tolerance for your application.
What's the difference between identical(x, y) and isTRUE(all.equal(x, y))?
all.equal tests for near equality, while identical is more exact (e.g. it has no tolerance for differences, and it compares storage type). From ?identical:
The function ‘all.equal’ is also
sometimes used to test equality this
way, but was intended for something
different: it allows for small
differences in numeric results.
And one reason you would wrap all.equal in isTRUE is that all.equal will report differences rather than simply return FALSE.
Test for equality among all elements of a single numeric vector
I use this method, which compares the min and the max, after dividing by the mean:
# Determine if the range of a vector is zero, up to floating point tolerance.
zero_range <- function(x, tol = .Machine$double.eps ^ 0.5) {
  if (length(x) == 1) return(TRUE)
  # Scale min and max by the mean so the comparison is relative
  x <- range(x) / mean(x)
  isTRUE(all.equal(x[1], x[2], tolerance = tol))
}
If you were using this more seriously, you'd probably want to remove missing values before computing the range and mean.
How to avoid `all` function returning `TRUE` when comparing to `NULL` or an empty object
The documentation for all clearly says:
That all(logical(0)) is true is a useful convention: it ensures that
all(all(x), all(y)) == all(x, y) even if x has length zero.
so there is no way to obtain your desired result with all.
As noted in the comments, identical and all.equal are closer matches to your request. However, identical wouldn't warn you if the objects under comparison are of different length. The drawback of all.equal is that it wouldn't return a logical value in the case of different lengths:
all.equal(y[y>5 & y<0],z[z>5 & z<10])
# [1] "Numeric: lengths (0, 4) differ"
and I believe that the official documentation suggests not to use all.equal directly in if expressions:
Do not use all.equal directly in if expressions—either use
isTRUE(all.equal(....)) or identical if appropriate.
However, isTRUE(all.equal(y[y>5 & y<0],z[z>5 & z<10])) wouldn't tell you about different lengths.
[Solution]
You can simply write your own function for this purpose and add some syntactic sugar for convenience:
'%=%' <- function(a, b) {
  if (length(a) != length(b)) warning('Objects are of different length')
  identical(a, b)
}
It will return TRUE if the objects are identical:
y[y>5 & y<10] %=% z[z>5 & z<10]
# [1] TRUE
and FALSE if the objects are different (plus a warning if they are of different length):
y[y>5 & y<0] %=% z[z>5 & z<10]
# [1] FALSE
# Warning message:
# In y[y > 5 & y < 0] %=% z[z > 5 & z < 10] :
# Objects are of different length
Which equals operator (== vs ===) should be used in JavaScript comparisons?
The strict equality operator (===) behaves identically to the abstract equality operator (==) except no type conversion is done, and the types must be the same to be considered equal.
Reference: Javascript Tutorial: Comparison Operators
The == operator will compare for equality after doing any necessary type conversions. The === operator will not do the conversion, so if two values are not the same type === will simply return false. Both are equally quick.
To quote Douglas Crockford's excellent JavaScript: The Good Parts,
JavaScript has two sets of equality operators: === and !==, and their evil twins == and !=. The good ones work the way you would expect. If the two operands are of the same type and have the same value, then === produces true and !== produces false. The evil twins do the right thing when the operands are of the same type, but if they are of different types, they attempt to coerce the values. The rules by which they do that are complicated and unmemorable. These are some of the interesting cases:
'' == '0' // false
0 == '' // true
0 == '0' // true
false == 'false' // false
false == '0' // true
false == undefined // false
false == null // false
null == undefined // true
' \t\r\n ' == 0 // true
The lack of transitivity is alarming. My advice is to never use the evil twins. Instead, always use === and !==. All of the comparisons just shown produce false with the === operator.
Update:
A good point was brought up by @Casebash in the comments and in @Phillipe Laybaert's answer concerning objects. For objects, == and === act consistently with one another (except in a special case).
var a = [1,2,3];
var b = [1,2,3];
var c = { x: 1, y: 2 };
var d = { x: 1, y: 2 };
var e = "text";
var f = "te" + "xt";
a == b // false
a === b // false
c == d // false
c === d // false
e == f // true
e === f // true
The special case is when you compare a primitive with an object that evaluates to the same primitive, due to its toString or valueOf method. For example, consider the comparison of a string primitive with a string object created using the String constructor.
"abc" == new String("abc") // true
"abc" === new String("abc") // false
Here the == operator is checking the values of the two objects and returning true, but the === is seeing that they're not the same type and returning false. Which one is correct? That really depends on what you're trying to compare. My advice is to bypass the question entirely and just don't use the String constructor to create string objects from string literals.
Reference
http://www.ecma-international.org/ecma-262/5.1/#sec-11.9.3
Is there a difference between == and is?
is will return True if two variables point to the same object (in memory), while == will return True if the objects referred to by the variables are equal.
>>> a = [1, 2, 3]
>>> b = a
>>> b is a
True
>>> b == a
True
# Make a new copy of list `a` via the slice operator,
# and assign it to variable `b`
>>> b = a[:]
>>> b is a
False
>>> b == a
True
In your case, the second test only works because Python caches small integer objects, which is an implementation detail. For larger integers, this does not work:
>>> 1000 is 10**3
False
>>> 1000 == 10**3
True
The same holds true for string literals:
>>> "a" is "a"
True
>>> "aa" is "a" * 2
True
>>> x = "a"
>>> "aa" is x * 2
False
>>> "aa" is intern(x*2)
True
Please see this question as well.
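The transcript above appears to come from Python 2, where intern is a builtin; in Python 3 the function is sys.intern. As a minimal Python 3 sketch of the same identity-versus-equality checks (variable names are illustrative, and the cases whose result depends on interpreter caching are deliberately left out):

```python
import sys

a = [1, 2, 3]
b = a      # second name for the same list object
c = a[:]   # shallow copy: equal contents, different object

same_object = b is a     # identity: both names refer to one object
copy_is_same = c is a    # the copy is a distinct object
copy_is_equal = c == a   # but its contents compare equal

# Build "aa" at run time so it is not a compile-time constant.
x = "a"
built = x * 2
# sys.intern returns the canonical copy of a string, so two interned
# strings with equal contents are the same object.
interned_same = sys.intern(built) is sys.intern("aa")

print(same_object, copy_is_same, copy_is_equal, interned_same)
# True False True True
```

Whether two equal literals or small integers happen to be the same object is an implementation detail, so == remains the right operator for comparing values.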