Rounding Errors

Rounding errors in Python

Each multiplication produces a result with twice as many digits (or bits) as its operands, and that result has to be rounded to fit back into the space allocated for a floating-point number. Because of this rounding, rearranging the order of operations can change the final result.
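
A quick Python check (the values are just the classic 0.1 example, chosen for illustration) shows that regrouping the same multiplications changes the rounded result:

a = (0.1 * 0.1) * 10   # 0.10000000000000002: two roundings push it above 0.1
b = 0.1 * (0.1 * 10)   # 0.1, because 0.1 * 10 happens to round to exactly 1.0
print(a == b)          # False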

Rounding errors for very large numbers in Python

Contrary to common misconception, the decimal module does not perform exact arithmetic. It performs adjustable-precision, decimal, floating-point arithmetic.

The decimal module defaults to 28-decimal-digit precision. The rounding operation you requested needs at least 29-digit precision, and rather than silently giving you less precision, decimal throws an error for this particular operation.

Of course, the operations you wanted to perform require much more than 29 digits of precision, but because you did half your math in float arithmetic instead of decimal arithmetic, you lost most of your precision before decimal ever got involved. A float has a little less than 16 decimal digits of precision, and wrapping a float computation in a Decimal call won't redo the math in Decimal arithmetic; you need to start with Decimal:

import decimal

decimal.getcontext().prec = 100  # plenty of digits for both operands
print(round(decimal.Decimal('0.123456789123456789123456789123456789123') +
            decimal.Decimal('123456789123456789123456789'), 2))
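
As a short illustration of that last point (digits shortened here only for readability), compare doing the arithmetic in float first with doing it in Decimal from the start:

from decimal import Decimal, getcontext

getcontext().prec = 50
# Wrong: the addition happens in float, so the fractional part is already
# gone (and the trailing integer digits are off) before Decimal sees it.
print(Decimal(0.123456789123456789 + 123456789123456789.0))
# Right: start from strings so every digit survives.
print(Decimal('0.123456789123456789') + Decimal('123456789123456789'))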

How do I avoid rounding errors with doubles?

Let's start with this:

How do I avoid rounding errors with doubles?

Basically, you can't. They are inherent to numerical calculations using floating point types. Trust me ... or take the time to read this article:

  • "What Every Computer Scientist Should Know About Floating-Point Arithmetic" by David Goldberg.

In this case, the other thing that comes into play is that trigonometric functions are implemented by computing a finite number of steps of an infinite series with finite precision (i.e. floating point) arithmetic. The javadoc for the Math class leaves some "wiggle room" on the accuracy of the math functions. It is worth reading the javadocs to understand the expected error bounds.

Finally, if you are computing (for example) sin π/2 you need to consider how accurate your representation of π/2 is.
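
To illustrate that last point (in Python here rather than Java, simply because it is quick to try), the library value of π is itself rounded, so the sine of "π" is not exactly zero:

import math

print(math.sin(math.pi))      # roughly 1.2e-16, not 0: math.pi is only the nearest double to pi
print(math.cos(math.pi / 2))  # roughly 6.1e-17, for the same reason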


So what you should really be asking is how to deal with the rounding error that unavoidably happens.

In this case, what you are really asking is how to make it look to the user of your program as if there is no rounding error. There are two approaches to this:

  • Leave it alone! The rounding errors occur, so we should not lie to the users about them. It is better to educate them. (Honestly, this is high-school maths, and even "the pointy-haired boss" should understand that the arithmetic is inexact.)

    Routines like printf do a pretty good job, and the -0.000 displayed in this case is actually a truthful answer: it means the computed value rounds to zero at 3 decimal places but is in fact slightly negative. That is not hard for someone with high-school maths to understand, if you explain it.

  • Lie. Fake it. Put in some special-case code to explicitly convert numbers between -0.0005 and zero to exactly zero (a sketch of this follows the list). The code suggested in a comment

      System.out.printf(Locale.ROOT, "%.3f ", Math.round(v * 1000d) / 1000d);

    is another way to do the job. But the risk is that the lie could be dangerous in some circumstances. On the other hand, you could argue that the real mistake is displaying the numbers to only 3 decimal places.
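
A rough sketch of that special-casing (written in Python only to keep the example short; the cut-off -0.0005 matches the 3-decimal-place display):

def display(v, places=3):
    # Snap values that would print as "-0.000" to exactly zero before formatting.
    if -0.0005 < v < 0:
        v = 0.0
    return '%.*f' % (places, v)

print(display(-0.0001))  # "0.000" instead of "-0.000"
print(display(-0.4999))  # "-0.500" is left alone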

Rounding error in R?

This is a combination of two extremely Frequently Asked Questions.

  • finite floating-point precision: this is R FAQ 7.31; see e.g. Why are these numbers not equal? The value gets rounded to 178379.5 when it is stored. Setting options(digits=22) to print more decimal places won't help; the precision has already been lost, because (as you suggested) R only stores values to 53 binary digits, roughly 15-16 decimal digits, of precision.
  • round to even: R "rounds to even", see Is there an error in round function in R? That means the stored value 178379.5 will be rounded up to the even integer 178380.

This is not about printing precision.

If you had used fewer '9's, you would have seen what you expected (a combination of R's limited printing precision plus the expected rounding):

> x <- 178379.49
>
> x
[1] 178379.5 ## prints as .5, but full precision is present
> round(x)
[1] 178379

Rounding Errors?

It is true.

It is an inherent limitation of how floating point values are represented in memory in a finite number of bits.

This program, for instance, prints "false":

public class Main {
    public static void main(String[] args) {
        double a = 0.7;
        double b = 0.9;
        double x = a + 0.1;           // 0.7999999999999999
        double y = b - 0.1;           // 0.8
        System.out.println(x == y);   // false
    }
}

Instead of an exact comparison with ==, you usually decide on some level of precision and ask whether the numbers are "close enough":

System.out.println(Math.abs(x - y) < 0.0001);

SQL Server: Avoiding Rounding Errors

What you are dealing with here is not floating-point error but simply a loss of precision caused by rounding to a fixed number of decimal places.

A solution I've implemented (mainly in reporting) is to do something like the below, where you calculate the difference due to rounding and add that to one row (usually the largest value to minimise the effect).

declare @TOTALCOST numeric(18, 4) = 1.125;

with CTE as (
    select item = 'Item1', volume = 3.636
    union
    select item = 'Item2', volume = 14.946
    union
    select item = 'Item3', volume = 26.05
), cte2 as (
    select
        item,
        volume,
        totVol = (Sum(volume) over ()),
        proportion = Round((Sum(volume) over (partition by item)) / (Sum(volume) over ()), 3),
        costAllocation = Round((Sum(volume) over (partition by item)) * @TOTALCOST / (Sum(volume) over ()), 3),
        Row_Number() over (order by volume desc) rn
    from CTE
)
select
    item, volume, totVol, proportion,
    -- give the rounding difference to the row with the largest volume
    costAllocation + case when rn = 1 then @TOTALCOST - Sum(costAllocation) over () else 0 end costAllocation
from cte2
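
The same largest-row adjustment, sketched in Python with the figures from the query above (the variable names are mine, purely for illustration):

total_cost = 1.125
volumes = {'Item1': 3.636, 'Item2': 14.946, 'Item3': 26.05}
total_volume = sum(volumes.values())

# Round each allocation to 3 decimals, then push the leftover rounding
# difference onto the row with the largest volume.
alloc = {k: round(total_cost * v / total_volume, 3) for k, v in volumes.items()}
largest = max(volumes, key=volumes.get)
alloc[largest] += round(total_cost - sum(alloc.values()), 3)

print(alloc)
print(round(sum(alloc.values()), 3))  # sums back to 1.125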

Rounding errors: deal with operation on vectors with very small components

You are dealing with two different issues here:

Underflow / Overflow

Calculating the norm of very small values may underflow to zero when you calculate the square. Large values may overflow to infinity. This can be solved by using a stable norm algorithm.
A simple way to deal with this is to scale the values temporarily. See for example this:

a = np.array((1e-30, 2e-30), dtype='f4')
np.linalg.norm(a) # result is 0 due to underflow in single precision
scale = 1. / np.max(np.abs(a))
np.linalg.norm(a * scale) / scale # result is 2.236e-30

This is now a two-pass algorithm because you have to iterate over all your data before determining a scaling value. If this is not to your liking, there are single-pass algorithms, though you probably don't want to implement them in Python. The classic would be Blue's algorithm:
http://degiorgi.math.hr/~singer/aaa_sem/Float_Norm/p15-blue.pdf

A simpler but much less efficient way is to simply chain calls to hypot (which uses a stable algorithm). You should never do this, but just for completeness:

import math

norm = 0.
for value in a:
    norm = math.hypot(norm, value)

Or even a hierarchical version like this to reduce the number of numpy calls:

norm = a
while len(norm) > 1:
    hlen = len(norm) >> 1
    front, back = norm[:hlen], norm[hlen:2 * hlen]
    tail = norm[2 * hlen:]  # only present when the length is odd
    norm = np.append(np.hypot(front, back), tail)
norm = norm[0]

You are free to combine these strategies. For example if you don't have your data available all at once but blockwise (e.g. because the data set is too large and you read it from disk), you can pick a scaling value per block, then chain the blocks together with a few calls to hypot.
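
A rough sketch of that blockwise combination (the function name, and the assumption that the data arrives as an iterable of numpy arrays, are mine):

import math
import numpy as np

def blockwise_norm(blocks):
    # Each block is scaled by its own peak value so that squaring inside
    # np.linalg.norm cannot underflow or overflow; the per-block norms are
    # then combined with the stable math.hypot.
    norm = 0.0
    for block in blocks:
        peak = float(np.max(np.abs(block)))
        if peak == 0.0:
            continue  # an all-zero block contributes nothing
        scale = 1.0 / peak
        block_norm = float(np.linalg.norm(block * scale)) / scale
        norm = math.hypot(norm, block_norm)
    return norm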

Rounding errors

You accumulate rounding errors, especially when accumulating values of different magnitude. If you accumulate values of different signs, you may also experience catastrophic cancellation. To avoid these issues, you need to use a compensated summation scheme. Python provides a very good one with math.fsum.
So if you absolutely need highest accuracy, go with something like this:

math.sqrt(math.fsum(np.square(a * scale))) / scale

Note that this is overkill for a simple norm since there are no sign changes in the accumulation (so no cancelation) and the squaring increases all differences in magnitude so that the result will always be dominated by its largest components, unless you are dealing with a truly horrifying dataset. That numpy does not provide built-in solutions for these issues tells you that the naive algorithm is actually good enough for most real-world applications. No reason to go overboard with the implementation before you actually run into trouble.

Application to dot products

I've focused on the l2 norm because that is the case that is more generally understood to be hazardous. Of course you can apply similar strategies to a dot product.

np.dot(a, b)  # naive dot product

ascale = 1. / np.max(np.abs(a))
bscale = 1. / np.max(np.abs(b))

np.dot(a * ascale, b * bscale) / (ascale * bscale)  # scaled version, analogous to the norm above

This is particularly useful if you use mixed precision. For example, the dot product could be calculated in single precision, while the final division by (ascale * bscale) takes place in double or even extended precision.

And of course math.fsum is still available: dot = math.fsum(a * b)
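
A hedged sketch of the mixed-precision idea (assuming a and b are the arrays from above, held in single precision):

import numpy as np

a32 = np.asarray(a, dtype='f4')
b32 = np.asarray(b, dtype='f4')

ascale = 1.0 / float(np.max(np.abs(a32)))
bscale = 1.0 / float(np.max(np.abs(b32)))

dot32 = np.dot(a32 * ascale, b32 * bscale)  # accumulated in float32
dot = float(dot32) / (ascale * bscale)      # rescaled in float64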

Bonus thoughts

The scaling itself introduces some rounding error, because nothing guarantees that the scaled values are exactly representable in floating point. You can avoid this, however, by picking a scaling factor that is an exact power of 2: multiplying by a power of 2 is always exact in floating point (as long as you stay within the representable range). You can get the required exponent with math.frexp.
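
A minimal sketch of the power-of-two variant (reusing the array a from the earlier examples):

import math
import numpy as np

# frexp returns (mantissa, exponent) with value == mantissa * 2**exponent,
# so 2**(-exponent) is an exact power-of-two scale that brings the peak
# value into [0.5, 1).
_, exponent = math.frexp(float(np.max(np.abs(a))))
scale = math.ldexp(1.0, -exponent)  # exactly 2**(-exponent)

print(float(np.linalg.norm(a * scale)) / scale)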


