Compare Double to Zero Using Epsilon

Compare double to zero using epsilon

Assuming a 64-bit IEEE double, there is a 52-bit mantissa and an 11-bit exponent. Let's break it down into bits:

1.0000 00000000 00000000 00000000 00000000 00000000 00000000 × 2^0 = 1

The smallest representable number greater than 1:

1.0000 00000000 00000000 00000000 00000000 00000000 00000001 × 2^0 = 1 + 2^-52

Therefore:

epsilon = (1 + 2^-52) - 1 = 2^-52

Are there any numbers between 0 and epsilon? Plenty... E.g. the minimal positive representable (normal) number is:

1.0000 00000000 00000000 00000000 00000000 00000000 00000000 × 2^-1022 = 2^-1022

In fact there are (1022 - 52 + 1)×2^52 = 4372995238176751616 numbers between 0 and epsilon (counting the subnormals), which is about 47% of all the positive representable numbers...
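These values are easy to verify in code. Below is a minimal C++ sketch (not part of the original answer) that prints epsilon, the gap just above 1.0, and the smallest positive normal number in hex-float notation:

#include <cmath>
#include <cstdio>
#include <limits>

int main() {
    const double eps = std::numeric_limits<double>::epsilon();    // 2^-52
    const double above_one = std::nextafter(1.0, 2.0);            // 1 + 2^-52
    const double min_normal = std::numeric_limits<double>::min(); // 2^-1022

    std::printf("epsilon     = %a\n", eps);              // prints 0x1p-52
    std::printf("gap above 1 = %a\n", above_one - 1.0);  // prints 0x1p-52
    std::printf("min normal  = %a\n", min_normal);       // prints 0x1p-1022
}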

How to safely compare std::complex<double> with zero with some precision epsilon?

Have you tried std::fpclassify from <cmath>?

if (std::fpclassify(a.real()) == FP_ZERO) {}

To check if both the real and imaginary parts of a complex number are 0:

if (a == 0.0) {}

As @eerorika mentioned long before I did, in an answer you rejected.

Floating point precision, rounding, flag raising, and subnormal behavior are all implementation-defined (see [basic.fundamental], [support.limits.general], and ISO C 5.2.4.2.2).
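A self-contained sketch tying the two checks together (the variable a and its value are illustrative, not from the original answers):

#include <cmath>
#include <complex>
#include <iostream>

int main() {
    std::complex<double> a{0.0, -0.0};

    // fpclassify reports FP_ZERO for both +0.0 and -0.0.
    if (std::fpclassify(a.real()) == FP_ZERO)
        std::cout << "real part is zero\n";

    // operator== checks both parts; signed zeros compare equal to 0.0.
    if (a == 0.0)
        std::cout << "both parts are exactly zero\n";
}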

The right way to compare a System.Double to '0' (a number, int?)

Well, how close do you need the value to be to 0? If you go through a lot of floating point operations which in "infinite precision" might result in 0, you could end up with a result "very close" to 0.

Typically in this situation you want to provide some sort of epsilon, and check that the result is just within that epsilon:

if (Math.Abs(something) < 0.001)

The epsilon you should use is application-specific - it depends on what you're doing.

Of course, if the result should be exactly zero, then a simple equality check is fine.
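To make the "very close to 0" point concrete, here is a small C++ rendering of the same idea (a sketch; the question is about .NET's System.Double, but that is the same IEEE 754 format):

#include <cmath>
#include <iostream>

int main() {
    double sum = 0.0;
    for (int i = 0; i < 10; ++i)
        sum += 0.1;               // 0.1 is not exactly representable

    double diff = sum - 1.0;      // mathematically zero
    std::cout << (diff == 0.0) << '\n';            // typically 0 (false)
    std::cout << (std::abs(diff) < 0.001) << '\n'; // 1 (true)
}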

Modern practice to compare double/float for equality in modern C++

Is this code, with modern C++11/14/17/20, still the way we should compare floats and doubles, or is it now OK to just write if (double1 == double2) and have the compiler handle the epsilon issue for us?

Both approaches function the same in modern C++ as they did in early C++.

Both approaches are also flawed.

  • Using == assumes that your code has accounted for any floating point rounding errors, and it's very rare/difficult for code to do that.

  • Comparing against epsilon assumes that a reasonable amount of rounding error will be less than the constant epsilon, and that is very likely a wrong assumption!

    • If your numbers have magnitude greater than 2.0, the epsilon trick is no different from direct comparison and has the same flaws, regardless of whether you use < or <=.
    • If your numbers have the same sign and magnitudes smaller than epsilon, the epsilon trick says they are always equal, even if one is hundreds of times larger than the other. Both would compare equal to zero, too. (Both failure modes are demonstrated in the sketch below.)
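Here is that demonstration (a sketch, taking epsilon to be std::numeric_limits<double>::epsilon(), i.e. 2^-52; the values are illustrative):

#include <cmath>
#include <iostream>
#include <limits>

int main() {
    const double eps = std::numeric_limits<double>::epsilon();

    // Magnitude > 2: adjacent doubles are more than eps apart,
    // so the epsilon test degenerates into plain ==.
    double a = 3.0, b = std::nextafter(3.0, 4.0);  // closest possible neighbors
    std::cout << (std::abs(a - b) < eps) << '\n';  // 0: reported "not equal"

    // Both magnitudes below eps: everything is reported "equal".
    double c = 1e-20, d = 5e-18;                   // d is 500x larger than c
    std::cout << (std::abs(c - d) < eps) << '\n';  // 1: reported "equal"
}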

A wise approach may be to avoid writing code that depends on whether floating point numbers are exactly equal, and instead test whether they are relatively close, by some factor.

The code below tests whether two numbers are within about 0.01% of each other, regardless of their scale.

#include <algorithm>  // std::max
#include <cmath>      // std::abs
#include <iostream>

const auto relative_difference_factor = 0.0001;    // 0.01%
const auto greater_magnitude = std::max(std::abs(double1), std::abs(double2));

if ( std::abs(double1 - double2) < relative_difference_factor * greater_magnitude )
    std::cout << "Relatively close";
else
    std::cout << "Not relatively close";

Comparing floating point number to zero

You are correct with your observation.
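(For context, the function being analyzed is presumably the FAQ-style relative comparison, which has roughly this shape; the exact definition is an assumption here:)

#include <cmath>

// Assumed FAQ-style definition: y counts as "equal" to x when their
// difference is within epsilon scaled by |x|.
bool is_equal(double x, double y, double epsilon = 1e-9) {
    return std::abs(x - y) <= epsilon * std::abs(x);
}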

If x == 0.0, then abs(x) * epsilon is zero and you're testing whether abs(y) <= 0.0.

If y == 0.0 then you're testing abs(x) <= abs(x) * epsilon which means either epsilon >= 1 (it isn't) or x == 0.0.

So both is_equal(val, 0.0) and is_equal(0.0, val) are pointless; you could just write val == 0.0, if you want to accept only exactly +0.0 and -0.0.

The FAQ's recommendation in this case is of limited utility. There is no "one size fits all" floating-point comparison. You have to think about the semantics of your variables, the acceptable range of values, and the magnitude of error introduced by your computations. Even the FAQ mentions a caveat, saying this function is not usually a problem "when the magnitudes of x and y are significantly larger than epsilon, but your mileage may vary".

precision of comparing double values with EPSILON in C

Bounds calculations are convoluted and have holes.

See that upper_bound_x[n] == lower_bound_x[n+1]. Then when a value compares as D->values[k][col2] == upper_bound_x[n], it fits in neither region n nor region n+1.

// Existing code
upper_bound_x[0] = min_x + interval_x; //upper bound of the first region in y
lower_bound_x[0] = min_x;              //lower bound of the first region in y
for (j = 0; j < X_REGIONS; j++) {
    upper_bound_x[j+1] = upper_bound_x[j] + interval_x;
    lower_bound_x[j+1] = lower_bound_x[j] + interval_x;
}
....
if (D->values[k][col2] < upper_bound_x[j] && D->values[k][col2] > lower_bound_x[j]) {

Suggested re-write: use a single bound_x[X_REGIONS+1] array and then use the compare:

if (D->values[k][col2] >= bound_x[j] && D->values[k][col2] < bound_x[j+1]) {
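Fleshed out, that suggestion might look like the sketch below (X_REGIONS, min_x, and interval_x are stand-in values, and v is an illustrative test value; none of them are from the question):

#include <cstdio>

enum { X_REGIONS = 4 };

int main() {
    const double min_x = 0.0, interval_x = 0.25;
    double bound_x[X_REGIONS + 1];

    // Compute each boundary directly rather than accumulating,
    // so rounding error does not drift from region to region.
    for (int j = 0; j <= X_REGIONS; j++)
        bound_x[j] = min_x + j * interval_x;

    const double v = 0.25;  // a value that fell through the old bounds
    for (int j = 0; j < X_REGIONS; j++) {
        if (v >= bound_x[j] && v < bound_x[j + 1]) {
            std::printf("%g falls in region %d\n", v, j);
            break;
        }
    }
}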

Alternatively, the code could skip the bound[] arrays (for both x and y) and calculate the bounds on the fly.

Minor:

Repeated code: make helper functions to calculate the min and max, then call each once for x and once for y.

The post should include the definition of CSV. It is confusing to have x in one column and y in another. Better to have an array of points (make your own struct holding an x and a y) rather than an array of double pairs.

Be sure to #include <math.h>


