Numeric Comparison Difficulty in R

I've never been a fan of all.equal for such things. It seems to me the tolerance works in mysterious ways sometimes. Why not just check whether the difference is at least 0.05 minus a small tolerance?

tol = 1e-5

(a-b) >= (0.05-tol)

In general, I find plain comparison logic, without any rounding, clearer than all.equal.

If x == y, then x - y == 0. In floating point, x - y may not be exactly 0, so for such cases I use

abs(x-y) <= tol

all.equal relies on a tolerance anyway (its default is about 1.5e-8), and this is more compact and straightforward than all.equal.
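As a minimal sketch of the two checks above: the helper names within_tol and at_least are made up here for illustration, and the tolerance values are arbitrary choices that would need to suit the scale of your data.

```r
# Hypothetical helpers, not part of base R
within_tol <- function(x, y, tol = 1e-9) abs(x - y) <= tol
at_least   <- function(d, threshold, tol = 1e-5) d >= (threshold - tol)

# Exact comparison is tripped up by binary representation
(0.1 + 0.2) == 0.3           # FALSE
within_tol(0.1 + 0.2, 0.3)   # TRUE

# 0.15 - 0.1 is stored as slightly less than 0.05
a <- 0.15
b <- 0.1
(a - b) >= 0.05              # FALSE
at_least(a - b, 0.05)        # TRUE
```

Note that this is an absolute tolerance, so a value that works for numbers near 1 may be too strict or too loose for very large or very small numbers.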

Using R: How to compare two numbers with floating-point issues related to the options(digits=n) and how the numbers were introduced?

all.equal might be useful here:

n1 <- 0.14999999999999999
n2 <- .15
n3 <- 0.15000000000000002

all.equal(n1,n2)
# [1] TRUE

all.equal(n1,n3)
# [1] TRUE

You can manually specify a tolerance if you like, e.g.,

all.equal(n1, n3, tolerance = 1.5e-16)
# [1] "Mean relative difference: 1.850372e-16"

Finally, as the help page for all.equal says, if you need a single TRUE or FALSE (for example in an if condition), wrap the call as isTRUE(all.equal(...)), or use identical if exact equality is what you want.
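A short sketch of why the wrapper matters: when the values differ, all.equal returns a character description rather than FALSE, so it cannot be used directly as a condition. The values of x and y below are arbitrary illustrations.

```r
x <- 1
y <- 1.1

res <- all.equal(x, y)
is.character(res)   # TRUE: a description of the difference, not FALSE
isTRUE(res)         # FALSE

# Safe to use as a condition
if (isTRUE(all.equal(x, y))) {
  msg <- "nearly equal"
} else {
  msg <- "different"
}
msg                 # "different"
```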

Less or equal for floats in R

Interesting question. I am sure there are better ways, but this simple function takes two vectors of doubles and returns whether they are nearly equal element-wise (mode = "ae"), given the specified tolerance. It can also return whether x is less than y (mode = "lt"), or nearly equal or less than (mode = "ne.lt"), along with the "gt" equivalents...

near_equal <- function( x , y , tol = 1.5e-8 ,
                        mode = c( "ae" , "gt" , "lt" , "ne.gt" , "ne.lt" ) ){
  # Fail early on an unknown mode instead of silently returning NULL
  mode <- match.arg( mode )
  # Element-wise "nearly equal"; all.equal's tolerance is relative by default
  ae <- mapply( function(x,y) isTRUE( all.equal( x , y , tolerance = tol ) ) , x , y )
  gt <- x > y
  lt <- x < y
  switch( mode ,
          ae    = ae ,
          gt    = gt ,
          lt    = lt ,
          ne.gt = ae | gt ,
          ne.lt = ae | lt )
}


# And in action....
set.seed(1)
x <- 1:5
# [1] 1 2 3 4 5
y <- 1:5 + rnorm(5,sd=0.1)
# [1] 0.9373546 2.0183643 2.9164371 4.1595281 5.0329508


near_equal( x , y , tol = 0.05 , mode = "ae" )
#[1] FALSE TRUE TRUE TRUE TRUE

near_equal( x , y , tol = 0.05 , mode = "ne.gt" )
#[1] TRUE TRUE TRUE TRUE TRUE
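
Note that all.equal uses a relative tolerance by default, so the results above depend on the size of x. For comparison, here is a sketch of a fully vectorised variant that uses an absolute tolerance instead. The name near_equal_abs is hypothetical, and y is typed in from the rounded values printed above rather than regenerated.

```r
# Hypothetical absolute-tolerance variant of the "ae" mode
near_equal_abs <- function(x, y, tol = 1.5e-8) abs(x - y) <= tol

x <- 1:5
y <- c(0.9373546, 2.0183643, 2.9164371, 4.1595281, 5.0329508)

near_equal_abs(x, y, tol = 0.05)
# [1] FALSE  TRUE FALSE FALSE  TRUE
```

With an absolute tolerance, 3 vs 2.916 and 4 vs 4.160 no longer count as nearly equal, because their differences exceed 0.05 even though they are under 5% of the value.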

Hope that helps.

First circle of R hell. 0.1 != 0.3/3

See these questions:

  • In R, what is the difference between these two?
  • Numeric comparison difficulty in R

Generally speaking, you can deal with this by including a tolerance level as per the second link above.
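
A quick illustration of the problem in the title and of the tolerance-based workaround. The 1e-9 threshold is an arbitrary choice for this sketch.

```r
exact   <- 0.1 == 0.3 / 3                    # FALSE
relaxed <- isTRUE(all.equal(0.1, 0.3 / 3))   # TRUE
manual  <- abs(0.1 - 0.3 / 3) < 1e-9         # TRUE
```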

Rounding issue in all.equal

This has to do with floating point accuracy. The manual isn't entirely clear at first glance, but in your example the absolute difference 2 - 1.981 is 0.019, which is greater than the tolerance of 0.01, yet all.equal still returns TRUE. That is because scale is NULL, so the comparison is made on the relative difference: the mean absolute difference divided by the mean absolute value of target (the first argument). Eh?!

Using a tolerance implies that you care about the magnitude of the numbers involved. Relative difference measures not how big the difference is in absolute terms, but how big it is relative to the numbers being compared. Given the example in the link, the difference between 5 and 6 is more significant (I use the term loosely) than the difference between 1,000,000,000 and 1,000,000,001.
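
The same two pairs can be tried directly. This is only an illustration using the default tolerance of all.equal (about 1.5e-8):

```r
small <- all.equal(5, 6)
small
# [1] "Mean relative difference: 0.2"

large <- all.equal(1e9, 1e9 + 1)
large
# [1] TRUE
```

Both pairs differ by exactly 1, but relative to the values that is 0.2 in the first case and 1e-9 in the second, which is below the default tolerance.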

So if the relative difference between the two numbers is less than tolerance, the numbers are considered equal. For two single numbers (as in this example) the relative difference is given by:

abs( target - current ) / abs( target )

where target is the first argument and current is the second. Here that is

( 2 - 1.981 ) / 2    # 0.0095

The tolerance you specified is 0.01, so the numbers are considered equal because the relative difference is less than this. The difference between these numbers ± the relative difference also just happens to be exactly the machine epsilon, .Machine$double.eps (the smallest positive x for which 1 + x != 1)!

identical( abs( ( 2 - 0.0095 ) - ( 1.981 + 0.0095 ) ) , .Machine$double.eps )
# [1] TRUE

Now try:

all.equal( 2 , 1.981 , 0.00949999999999 )
# [1] "Mean relative difference: 0.0095"
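
To make the arithmetic concrete, here is a small sketch that recomputes the relative difference by hand. The helper rel_diff is hypothetical; it assumes the default scale = NULL and a target whose mean absolute value is larger than the tolerance, and it is not the actual source of all.equal.

```r
# Hypothetical helper mirroring the default relative comparison
rel_diff <- function(target, current) {
  mean(abs(target - current)) / mean(abs(target))
}

rd <- rel_diff(2, 1.981)
rd                                                      # approximately 0.0095

loose  <- isTRUE(all.equal(2, 1.981, tolerance = 0.01))  # TRUE
strict <- isTRUE(all.equal(2, 1.981, tolerance = 0.009)) # FALSE
```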

