Floating Point Comparison

How should I do floating point comparison?

Comparing for greater/smaller is not really a problem unless you're working right at the edge of the float/double precision limit.

For a "fuzzy equals" comparison, this (Java code, should be easy to adapt) is what I came up with for The Floating-Point Guide after a lot of work and taking into account lots of criticism:

public static boolean nearlyEqual(float a, float b, float epsilon) {
    final float absA = Math.abs(a);
    final float absB = Math.abs(b);
    final float diff = Math.abs(a - b);

    if (a == b) { // shortcut, handles infinities
        return true;
    } else if (a == 0 || b == 0 || diff < Float.MIN_NORMAL) {
        // a or b is zero or both are extremely close to it
        // relative error is less meaningful here
        return diff < (epsilon * Float.MIN_NORMAL);
    } else { // use relative error; clamp the sum so it cannot overflow to infinity
        return diff / Math.min(absA + absB, Float.MAX_VALUE) < epsilon;
    }
}

It comes with a test suite. You should immediately dismiss any solution that doesn't, because it is virtually guaranteed to fail in some edge cases, such as one value being 0, two very small values on opposite sides of zero, or infinities.
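
To make those edge cases concrete, here is a minimal Python sketch of the same logic, adapted to Python's double-precision floats. It is not the Guide's test suite. The name nearly_equal and the use of sys.float_info.min as a stand-in for Float.MIN_NORMAL are assumptions of this sketch, and the denominator is clamped to the largest finite double so the sum of magnitudes cannot overflow.

```python
import math
import sys

MIN_NORMAL = sys.float_info.min  # smallest positive normal double (assumed stand-in)
MAX_VALUE = sys.float_info.max   # largest finite double


def nearly_equal(a: float, b: float, epsilon: float) -> bool:
    """Fuzzy equality for doubles, following the structure of nearlyEqual above."""
    diff = abs(a - b)
    if a == b:  # shortcut, also handles equal infinities
        return True
    if a == 0 or b == 0 or diff < MIN_NORMAL:
        # Relative error is not meaningful this close to zero.
        return diff < epsilon * MIN_NORMAL
    # Relative error; clamp the denominator so abs(a) + abs(b) cannot overflow.
    return diff / min(abs(a) + abs(b), MAX_VALUE) < epsilon


# A few of the edge cases mentioned above, checked with epsilon = 1e-5:
print(nearly_equal(1000000.0, 1000001.0, 1e-5))  # ordinary close values
print(nearly_equal(0.0, 1e-10, 1e-5))            # one value is zero
print(nearly_equal(1e-310, -1e-310, 1e-5))       # tiny values on opposite sides of zero
print(nearly_equal(math.inf, math.inf, 1e-5))    # equal infinities
print(nearly_equal(math.inf, -math.inf, 1e-5))   # opposite infinities
```

Only the first and fourth calls should report equality; a real test suite would cover many more cases than these five.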

An alternative (see The Floating-Point Guide for more details) is to convert the floats' bit patterns to integers and accept everything within a fixed integer distance.
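
The bit-pattern idea can be sketched in Python as follows. This is only an illustration for 64-bit doubles; the function names and the default of 4 ULPs are assumptions of this sketch, not values taken from the Guide.

```python
import math
import struct


def ordered_bits(x: float) -> int:
    """Map a double to an integer whose ordering matches the float ordering."""
    (bits,) = struct.unpack("<q", struct.pack("<d", x))  # reinterpret as signed 64-bit int
    # Negative floats are stored as sign + magnitude; negate the magnitude so
    # the integers increase together with the float value (+0.0 and -0.0 both map to 0).
    return bits if bits >= 0 else -(bits & 0x7FFFFFFFFFFFFFFF)


def ulp_distance(a: float, b: float) -> int:
    """Number of steps between a and b along the sequence of representable doubles."""
    return abs(ordered_bits(a) - ordered_bits(b))


def nearly_equal_ulps(a: float, b: float, max_ulps: int = 4) -> bool:
    if math.isnan(a) or math.isnan(b):
        return False  # NaN never compares equal
    if math.isinf(a) or math.isinf(b):
        return a == b  # do not treat the largest finite value as "close" to infinity
    return ulp_distance(a, b) <= max_ulps


print(ulp_distance(0.1 + 0.2, 0.3))       # how many doubles apart the two results are
print(nearly_equal_ulps(0.1 + 0.2, 0.3))
print(nearly_equal_ulps(1.0, 1.0001))
```

Note that a fixed integer distance is very strict near zero, where adjacent doubles are extremely close together.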

In any case, there probably isn't any solution that is perfect for all applications. Ideally, you'd develop/adapt your own with a test suite covering your actual use cases.

What is the best way to compare floats for almost-equality in Python?

Python 3.5 adds the math.isclose and cmath.isclose functions as described in PEP 485.

If you're using an earlier version of Python, the equivalent function is given in the documentation.

def isclose(a, b, rel_tol=1e-09, abs_tol=0.0):
    return abs(a-b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

rel_tol is a relative tolerance; it is multiplied by the greater of the magnitudes of the two arguments. As the values get larger, so does the allowed difference between them while still considering them equal.

abs_tol is an absolute tolerance that is applied as-is in all cases. If the difference is less than or equal to either of those tolerances, the values are considered equal.
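
A short usage sketch (Python 3.5+) shows how the two tolerances interact; the sample values are chosen here purely for illustration.

```python
import math

# Default rel_tol=1e-09: the allowed difference scales with the larger magnitude.
print(math.isclose(1e10, 1e10 + 1))    # difference 1 is within 1e-09 * ~1e10
print(math.isclose(1.0, 1.0 + 1e-8))   # difference ~1e-8 exceeds 1e-09 * ~1.0

# Near zero a relative tolerance allows almost nothing, so abs_tol is needed.
print(math.isclose(1e-12, 0.0))                # rel_tol * 1e-12 is far below 1e-12
print(math.isclose(1e-12, 0.0, abs_tol=1e-9))  # 1e-12 is within abs_tol
```

Because abs_tol defaults to 0.0, comparisons against zero only succeed with the default arguments when the other value is exactly zero.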

Why does comparison of floating-point to infinity work?

As anyone knowledgeable enough will know, you cannot compare two floating-point numbers with simple logic operators and expect a logical result.

This has no basis in the IEEE 754 standard or any other specification of floating-point behavior I am aware of. It is an unfortunately common misstatement of floating-point arithmetic.

The fact is that comparison for equality is a perfect operation in floating-point: It produces true if and only if the two operands represent the same number. There is never any error in a comparison for equality.

Another misstatement is that floating-point numbers approximate real numbers. Per IEEE 754, each floating-point value other than NaN represents one number, and it represents that number exactly.

The fact is that floating-point numbers are exact while floating-point operations approximate real arithmetic; correctly-rounded operations produce the nearest representable value (nearest in any direction or in a selected direction, with various rules for ties).

This distinction is critical for understanding, analyzing, designing, and writing proofs about floating-point arithmetic.
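
The distinction can be observed directly. The sketch below uses Python's fractions module (an illustrative choice, not something the answer relies on) to read off the exact number each double holds and to compare exact addition with floating-point addition.

```python
from fractions import Fraction

# Fraction(x) converts a float to the exact rational number it stores.
a, b = Fraction(0.1), Fraction(0.2)
print(a == Fraction(1, 10))  # converting the literal 0.1 to a double rounded it

exact_sum = a + b                 # exact rational arithmetic, no rounding
float_sum = Fraction(0.1 + 0.2)   # float addition rounds to a representable double
print(exact_sum == float_sum)     # the addition, not the operands, introduced an error

# Equality itself is exact: true only when both operands are the same number.
print(0.1 + 0.2 == 0.3)
print(0.5 + 0.25 == 0.75)
```

The last line compares values for which every step is exactly representable, so equality holds there.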

Why then, does a logical EQUAL TO comparison to INFINITY always return true when the number is, in fact, INFINITY?

As stated above, comparison for equality produces true if and only if its operands represent the same number. If x is infinity, then x == INFINITY returns true. If x is three, then x == 3 returns true.

People sometimes run into trouble when they do not understand what value is in a number. For example, in float x = 3.3;, people sometimes do not realize that C converts the double 3.3 to float, and therefore x does not contain the same value as 3.3. This is because the conversion operation approximates its results, not because the value of x is anything other than its specific assigned value.
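
The same effect can be reproduced outside C. In this Python sketch, the hypothetical helper to_float32 rounds a double to the nearest 32-bit float via the struct module, which models the conversion in float x = 3.3;.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest 32-bit float, returned as a double."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


x = to_float32(3.3)            # models `float x = 3.3;` in C
print(x == 3.3)                # x holds a different number than the double 3.3
print(x == to_float32(3.3))    # but it equals the result of the same conversion
print(to_float32(3.5) == 3.5)  # 3.5 is exactly representable in both formats
```

The comparison is exact in every case; only the conversion decides which number ends up in x.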

I tried the same comparison with NAN,…

A NaN is Not a Number, so, in a comparison for equality, it never satisfies “the two operands represent the same number”, so the comparison produces false.
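
Python floats follow the same comparison rules, so the behaviour is easy to check. This is a small illustrative sketch, not code from the question.

```python
import math

inf, nan = math.inf, math.nan

print(inf == float("inf"))  # same number, so equality holds
print(-inf == inf)          # different numbers
print(nan == nan)           # a NaN is not a number, so equality is false
print(nan != nan)           # and inequality is therefore true
print(math.isnan(nan))      # use a dedicated test to detect NaN
```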

Floating point equality and tolerances

This blog post contains an example, a fairly foolproof implementation, and the detailed theory behind it:
http://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/
It is also one of a series, so you can always read more.
In short: use ULPs for most numbers and an epsilon for numbers near zero, but there are still caveats. If you want to be sure about your floating-point math, I recommend reading the whole series.
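
As a rough sketch of that summary (not the blog's implementation), the Python function below applies an absolute epsilon first and then a ULP-scaled check. The name almost_equal and both default thresholds are assumptions picked for illustration, and math.ulp requires Python 3.9 or later.

```python
import math


def almost_equal(a: float, b: float, max_ulps: int = 4, abs_eps: float = 1e-12) -> bool:
    """Absolute check near zero, ULP-scaled check everywhere else."""
    if math.isnan(a) or math.isnan(b):
        return False
    if math.isinf(a) or math.isinf(b):
        return a == b
    diff = abs(a - b)
    if diff <= abs_eps:  # near zero, where ULPs are extremely small
        return True
    # math.ulp(x) is the spacing between adjacent doubles at magnitude x.
    return diff <= max_ulps * math.ulp(max(abs(a), abs(b)))


print(almost_equal(1e15, 1e15 + 0.125))  # one ULP apart at this magnitude
print(almost_equal(1e15, 1e15 + 1.0))    # eight ULPs apart
print(almost_equal(1e-15, -1e-15))       # accepted by the absolute epsilon
```

Suitable thresholds depend on how much rounding error the preceding computation can accumulate.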

.NET - Floating Point Comparison

Believe it or not, this is intended behaviour, and it conforms to the IEEE 754 standard.

It's not possible to represent every everyday value, such as a very large number or a small fraction, with complete fidelity in a single binary representation. The floating-point types in .NET, such as float and double, do their best to minimize error when you assign numbers to them, so when you assigned 0.2 to the variable, the language chose the representable value with the smallest error.

It's not that the number somehow degrades in memory; the rounding is a deliberate step. If you are comparing floating-point numbers, you should always allow an acceptable region on either side of your comparison. Your representation of 0.2 is accurate to a very large number of decimal places. Is this good enough for your application? The error looks glaring to your eyes, but it is actually very small. When comparing doubles and floats (to integers or to each other), you should always consider what precision is acceptable, and accept a range on either side of your expected result.

You can also choose to use other types, like decimal, which has extremely good precision on decimal places but is also very large compared to floats and doubles.
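
The size of the error, and what a decimal type changes, can be illustrated with Python's decimal module. It is used here only as a stand-in for the .NET decimal type, which differs in its details.

```python
from decimal import Decimal

stored = Decimal(0.2)    # exact value of the double nearest to 0.2
wanted = Decimal("0.2")  # exactly two tenths

print(stored)                                   # the binary value written out in decimal
print(stored == wanted)                         # the double is not exactly 0.2
print(abs(stored - wanted) < Decimal("1e-16"))  # but the error is tiny

# Decimal arithmetic on decimal fractions has no such representation error.
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))
```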

efficient floating point comparison

This looks like a contrived way of attempting to circumvent the "should never compare equality on floating point" rule. Comparing for inequality is not very different from comparing for equality, as you are implicitly relying on floating-point precision in both cases. Your final 'else' statement is an implicit A == B.

The normal idiom is if (::fabs(A - B) < e) where e is some tolerance, although in your case you don't need the ::fabs.

If you want different results for positive, negative and equality (within limits of computational precision), then do something like

if (A - B > e){
    return 0;
} else if (A - B < -e){
    return 1;
} else {
    return -1;
}

The smallest tolerance you can hope for is e set to std::numeric_limits<double>::epsilon(), which is the spacing of doubles near 1. The appropriate value depends on the number of computational steps executed in order to arrive at A and B. 1e-08 is probably realistic.
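
For illustration, the tolerance-based three-way comparison can be sketched in Python as follows. The function name, the default tolerance, and the 1 / -1 / 0 return convention are assumptions of this sketch and deliberately differ from the return codes in the snippet above.

```python
def compare_with_tolerance(a: float, b: float, e: float = 1e-8) -> int:
    """Return 1 if a > b, -1 if a < b, and 0 if they are within e of each other."""
    diff = a - b
    if diff > e:
        return 1
    if diff < -e:
        return -1
    return 0  # also reached when either operand is NaN


print(compare_with_tolerance(2.0, 1.0))
print(compare_with_tolerance(1.0, 2.0))
print(compare_with_tolerance(0.1 + 0.2, 0.3))  # differ only by rounding error
```

A fixed absolute tolerance like this only makes sense when the magnitudes of A and B are known in advance.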

As for speed, it is what it is unfortunately: I can't see this being either the bottleneck or running any faster.

3-way comparison of floating-point numbers

What's the appropriate way to do it, taking that (NAN) into account?

Return a FP type to allow 4 different return values: -1.0, 0.0, 1.0, NAN.

The code below also returns -0.0 in select cases involving -0.0.

#include <math.h>

double fcmp(double a, double b) {
  if (isunordered(a, b)) return NAN;
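  if (a > b) return 1.0;
  if (a < b) return -1.0;
  if (isinf(a)) return 0.0; // a == b and infinite: a - b would be NAN
  return a - b; // a == b here: yields 0.0 or -0.0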
}

I'd even consider propagating a or b when one is NAN, to maintain the NAN payload. There may exist many different not-a-number values.

double fcmp(double a, double b) {
  if (isnan(a)) return a;
  if (isnan(b)) return b;
  ...
}

But let us look at a 3-way comparison used for sorting, as with qsort(). A question is where to put the NANs. A common goal is to put them at the end of the list, that is, all NANs are greater than all other values. To do so, we need to compare consistently, even if both operands are NANs with different payloads.

// All NAN considered greater than others
// return 0, a positive or negative int.
int fcmp_for_qsort(const void *ap, const void *bp) {
  double a = *(const double *) ap;
  double b = *(const double *) bp;
  if (isnan(a)) {
    if (isnan(b)) {
      return 0;
    }
    return 1;
  }
  if (isnan(b)) return -1; // only b is NAN, so a sorts before it
  return (a > b) - (a < b);
}
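
The same ordering can be tried out quickly in Python. This sketch ports the comparator above (the name fcmp_for_sort is made up here) and plugs it into sorted() through functools.cmp_to_key.

```python
import math
from functools import cmp_to_key


def fcmp_for_sort(a: float, b: float) -> int:
    """Three-way compare that treats every NaN as greater than any number."""
    if math.isnan(a):
        return 0 if math.isnan(b) else 1
    if math.isnan(b):
        return -1
    return (a > b) - (a < b)


values = [3.0, math.nan, -1.0, math.inf, 0.0, math.nan, -math.inf]
ordered = sorted(values, key=cmp_to_key(fcmp_for_sort))
print(ordered)  # numbers in ascending order, NaNs last
```

Because two NaNs compare as equal here, the comparator is consistent regardless of payload, which is what a sort routine needs.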