Java: Why Should We Use BigDecimal Instead of Double in the Real World

Java: Why should we use BigDecimal instead of Double in the real world?

It's called loss of precision, and it is very noticeable when working with either very big numbers or very small numbers. The binary representation of decimal numbers with a radix point is in many cases an approximation, not an exact value. To understand why, you need to read up on floating-point number representation in binary. Here is a link: http://en.wikipedia.org/wiki/IEEE_754-2008. Here is a quick demonstration:

in bc (an arbitrary-precision calculator language) with scale=10:

(1/3+1/12+1/8+1/15) = 0.6083333332

(1/3+1/12+1/8) = 0.5416666666

(1/3+1/12) = 0.4166666666

Java double:

0.6083333333333333

0.5416666666666666

0.41666666666666663

Java float:

0.60833335

0.5416667

0.4166667
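
For reference, here is a minimal Java sketch of the same sums (assuming import java.math.BigDecimal and java.math.MathContext; the 10-digit MathContext mirrors the bc setting, though bc truncates where MathContext rounds, so the last digit may differ):

// double: every term is already an approximation and the errors add up
System.out.println(1.0/3 + 1.0/12 + 1.0/8 + 1.0/15);   // 0.6083333333333333

// BigDecimal: the precision is explicit; 1/3 has no finite decimal expansion,
// so divide() needs a MathContext (or a scale) or it throws ArithmeticException
MathContext mc = new MathContext(10);
BigDecimal sum = BigDecimal.ONE.divide(BigDecimal.valueOf(3), mc)
        .add(BigDecimal.ONE.divide(BigDecimal.valueOf(12), mc))
        .add(BigDecimal.ONE.divide(BigDecimal.valueOf(8), mc))
        .add(BigDecimal.ONE.divide(BigDecimal.valueOf(15), mc));
System.out.println(sum);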

If you are a bank responsible for thousands of transactions every day, even though they are not all to and from one and the same account (or maybe they are), you have to have reliable numbers. Binary floats are not reliable - not unless you understand how they work and their limitations.

Double vs. BigDecimal?

A BigDecimal is an exact way of representing numbers. A Double has a certain precision. Working with doubles of sufficiently different magnitudes (say d1=1.0e17 and d2=0.001) can result in the 0.001 being dropped altogether when summing, because the difference in magnitude is so large. With BigDecimal this would not happen.
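
A quick sketch of that magnitude effect (1.0e17 is just an illustrative value where the gap is wide enough; assumes import java.math.BigDecimal):

double d1 = 1.0e17, d2 = 0.001;
// doubles near 1.0e17 are spaced 16 apart, so the 0.001 vanishes entirely
System.out.println(d1 + d2 == d1);                                           // true
System.out.println(new BigDecimal("1.0E17").add(new BigDecimal("0.001")));   // 100000000000000000.001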

The disadvantage of BigDecimal is that it's slower, and it's a bit more awkward to program algorithms with it (since + - * and / are not overloaded).

If you are dealing with money, or precision is a must, use BigDecimal. Otherwise doubles tend to be good enough.

I do recommend reading the javadoc of BigDecimal, as it explains these things better than I do here :)

Using BigDecimal as a beginner instead of Double?

double should be used whenever you are working with real numbers where perfect precision is not required. Here are some common examples:

  • Computer graphics, for several reasons: exact precision is rarely required, as few monitors are more than a low four-digit number of pixels wide or tall; additionally, most trigonometric functions are available only for float and double, and trigonometry is essential to most graphics work
  • Statistical analysis; metrics like mean and standard deviation are typically expected to have at most a little more precision than the individual data points
  • Randomness (e.g. Random.nextDouble()), where the point isn't a specific number of digits; the priority is a real number over some specific distribution
  • Machine learning, where multiplication factors are being learned and specific decimal precision isn't required

For values like money, using double at any point at all is a recipe for a bad time.
BigDecimal should generally be used for money and anything else where you care about a specific number of decimal digits, but it has inferior performance and offers fewer mathematical operations.
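
A classic one-line illustration of why (assuming import java.math.BigDecimal):

System.out.println(0.1 + 0.2);                                         // 0.30000000000000004
System.out.println(new BigDecimal("0.1").add(new BigDecimal("0.2")));  // 0.3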

Why is BigDecimal more precise than double?

A double is a remarkably fast floating point data type implemented at a very low level on many chipsets.

Its precision is sufficient for very many applications: e.g. measuring the distance from the Sun to Pluto to the nearest centimetre!

There is always a performance trade-off when thinking about moving to a more precise data type, as the latter will be much slower and your favourite mathematical libraries may not support it. Remember that the outputs of your program are a function of the quality of the inputs.

As a final remark: never use a double to represent cash quantities!

Why not use Double or Float to represent currency?

Because floats and doubles cannot accurately represent the base 10 multiples that we use for money. This issue isn't just for Java, it's for any programming language that uses base 2 floating-point types.

In base 10, you can write 10.25 as 1025 × 10⁻² (an integer times a power of 10). IEEE-754 floating-point numbers are different, but a very simple way to think about them is to multiply by a power of two instead. For instance, you could be looking at 164 × 2⁻⁴ (an integer times a power of two), which is also equal to 10.25. That's not how the numbers are represented in memory, but the math implications are the same.

Even in base 10, this notation cannot accurately represent most simple fractions. For instance, you can't represent 1/3: the decimal representation is repeating (0.3333...), so there is no finite integer that you can multiply by a power of 10 to get 1/3. You could settle on a long sequence of 3's and a small exponent, like 3333333333 × 10⁻¹⁰, but it is not accurate: if you multiply that by 3, you won't get 1.

However, for the purpose of counting money, at least for countries whose money is valued within an order of magnitude of the US dollar, usually all you need is to be able to store multiples of 10⁻², so it doesn't really matter that 1/3 can't be represented.

The problem with floats and doubles is that the vast majority of money-like numbers don't have an exact representation as an integer times a power of 2. In fact, the only multiples of 0.01 between 0 and 1 (which are significant when dealing with money because they're integer cents) that can be represented exactly as an IEEE-754 binary floating-point number are 0, 0.25, 0.5, 0.75 and 1. All the others are off by a small amount. As an analogy to the 0.333333 example, if you take the floating-point value for 0.01 and you multiply it by 10, you won't get 0.1. Instead you will get something like 0.099999999786...

Representing money as a double or float will probably look good at first as the software rounds off the tiny errors, but as you perform more additions, subtractions, multiplications and divisions on inexact numbers, errors will compound and you'll end up with values that are visibly not accurate. This makes floats and doubles inadequate for dealing with money, where perfect accuracy for multiples of base 10 powers is required.

A solution that works in just about any language is to use integers instead, and count cents. For instance, 1025 would be $10.25. Several languages also have dedicated types to deal with money: among others, Java has the BigDecimal class, Rust has the rust_decimal crate, and C# has the decimal type.
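
As a rough sketch of both approaches in Java (the variable names and the 8.25% rate are made up for illustration; assumes import java.math.BigDecimal and java.math.RoundingMode):

// 1) integer cents: exact, as long as you decide explicitly how to round
long priceCents = 1025;                                // $10.25
long taxCents = (priceCents * 825 + 5000) / 10000;     // 8.25% tax, rounded half up -> 85 cents

// 2) BigDecimal with an explicit scale and rounding mode
BigDecimal price = new BigDecimal("10.25");
BigDecimal rate  = new BigDecimal("0.0825");
BigDecimal tax   = price.multiply(rate).setScale(2, RoundingMode.HALF_UP);   // 0.85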

A realistic example where using BigDecimal for currency is strictly better than using double

I can see four basic ways that double can screw you when dealing with currency calculations.

Mantissa Too Small

With ~15 decimal digits of precision in the mantissa, you are going to get the wrong result any time you deal with amounts larger than that. If you are tracking cents, problems would start to occur before 10¹³ (ten trillion) dollars.

While that's a big number, it's not that big. The US GDP of ~18 trillion exceeds it, so anything dealing with country or even corporation sized amounts could easily get the wrong answer.

Furthermore, there are plenty of ways that much smaller amounts could exceed this threshold during calculation. You might be doing a growth projection over a number of years, which results in a large final value. You might be doing a "what if" scenario analysis where various possible parameters are examined, and some combination of parameters might result in very large values. You might be working under financial rules which allow fractions of a cent, which could chop another two orders of magnitude or more off of your range, putting you roughly in line with the wealth of mere individuals in USD.

Finally, let's not take a US-centric view of things. What about other currencies? One USD is worth roughly 13,000 Indonesian Rupiah, so that's another 2 orders of magnitude you need when tracking amounts in that currency (assuming there are no "cents"!). You're almost getting down to amounts that are of interest to mere mortals.

Here is an example where a growth projection calculation starting from 1e9 at 5% goes wrong:

method   year                         amount           delta
double 0 $ 1,000,000,000.00
Decimal 0 $ 1,000,000,000.00 (0.0000000000)
double 10 $ 1,628,894,626.78
Decimal 10 $ 1,628,894,626.78 (0.0000004768)
double 20 $ 2,653,297,705.14
Decimal 20 $ 2,653,297,705.14 (0.0000023842)
double 30 $ 4,321,942,375.15
Decimal 30 $ 4,321,942,375.15 (0.0000057220)
double 40 $ 7,039,988,712.12
Decimal 40 $ 7,039,988,712.12 (0.0000123978)
double 50 $ 11,467,399,785.75
Decimal 50 $ 11,467,399,785.75 (0.0000247955)
double 60 $ 18,679,185,894.12
Decimal 60 $ 18,679,185,894.12 (0.0000534058)
double 70 $ 30,426,425,535.51
Decimal 70 $ 30,426,425,535.51 (0.0000915527)
double 80 $ 49,561,441,066.84
Decimal 80 $ 49,561,441,066.84 (0.0001678467)
double 90 $ 80,730,365,049.13
Decimal 90 $ 80,730,365,049.13 (0.0003051758)
double 100 $ 131,501,257,846.30
Decimal 100 $ 131,501,257,846.30 (0.0005645752)
double 110 $ 214,201,692,320.32
Decimal 110 $ 214,201,692,320.32 (0.0010375977)
double 120 $ 348,911,985,667.20
Decimal 120 $ 348,911,985,667.20 (0.0017700195)
double 130 $ 568,340,858,671.56
Decimal 130 $ 568,340,858,671.55 (0.0030517578)
double 140 $ 925,767,370,868.17
Decimal 140 $ 925,767,370,868.17 (0.0053710938)
double 150 $ 1,507,977,496,053.05
Decimal 150 $ 1,507,977,496,053.04 (0.0097656250)
double 160 $ 2,456,336,440,622.11
Decimal 160 $ 2,456,336,440,622.10 (0.0166015625)
double 170 $ 4,001,113,229,686.99
Decimal 170 $ 4,001,113,229,686.96 (0.0288085938)
double 180 $ 6,517,391,840,965.27
Decimal 180 $ 6,517,391,840,965.22 (0.0498046875)
double 190 $ 10,616,144,550,351.47
Decimal 190 $ 10,616,144,550,351.38 (0.0859375000)

The delta (the difference between double and BigDecimal) first hits > 1 cent at year 160, at around 2 trillion (which might not be all that much 160 years from now), and of course it just keeps getting worse.

Of course, the 53 bits of mantissa mean that the relative error for this kind of calculation is likely to be very small (hopefully you don't lose your job over 1 cent out of 2 trillion). Indeed, the relative error basically holds fairly steady through most of the example. You could certainly organize things, though, so that you (for example) subtract two values with loss of precision in the mantissa, resulting in an arbitrarily large error (exercise left to the reader).
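
For the curious, a hedged sketch of how a comparison like the table above could be generated (it won't reproduce the deltas digit for digit, since the original rounding conventions aren't shown; assumes import java.math.BigDecimal and java.math.RoundingMode):

double approx = 1_000_000_000.00;
BigDecimal exact = new BigDecimal("1000000000.00");
BigDecimal rate = new BigDecimal("1.05");
for (int year = 1; year <= 190; year++) {
    approx *= 1.05;                     // double path: error creeps in each year
    exact = exact.multiply(rate);       // BigDecimal path: exact, the scale just grows
    if (year % 10 == 0) {
        BigDecimal delta = BigDecimal.valueOf(approx)
                .subtract(exact)
                .setScale(10, RoundingMode.HALF_EVEN);
        System.out.printf("%3d  %,20.2f  %,20.2f  %s%n",
                year, approx, exact, delta.toPlainString());
    }
}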

Changing Semantics

So you think you are pretty clever, and managed to come up with a rounding scheme that lets you use double and have exhaustively tested your methods on your local JVM. Go ahead and deploy it. Tomorrow or next week or whenever is worst for you, the results change and your tricks break.

Unlike almost every other basic language expression, and certainly unlike integer or BigDecimal arithmetic, the results of many floating-point expressions do not by default have a single standards-defined value, due to the strictfp feature. Platforms are free to use, at their discretion, higher-precision intermediates, which may produce different results on different hardware, JVM versions, etc. The result, for the same inputs, may even vary at runtime when the method switches from interpreted to JIT-compiled!

If you had written your code in the pre-Java 1.2 days, you'd have been pretty pissed when Java 1.2 suddenly introduced the now-default variable FP behavior. You might be tempted to just use strictfp everywhere and hope you don't run into any of the multitude of related bugs - but on some platforms you'd be throwing away much of the performance that double bought you in the first place.

There's nothing to say that the JVM spec won't again change in the future to accommodate further changes in FP hardware, or that the JVM implementors won't use the rope that the default non-strictfp behavior gives them to do something tricky.

Inexact Representations

As Roland pointed out in his answer, a key problem with double is that it doesn't have exact representations for most non-integer values. Although a single non-exact value like 0.1 will often "round-trip" OK in some scenarios (e.g., Double.toString(0.1).equals("0.1")), as soon as you do math on these imprecise values the error can compound, and this can be irrecoverable.

In particular, if you are "close" to a rounding point, e.g., ~1.005, you might get a value of 1.00499999... when the true value is 1.0050000001..., or vice versa. Because the errors go in both directions, there is no rounding magic that can fix this. There is no way to tell whether a value of 1.004999999... should be bumped up or not. Your roundToTwoPlaces() method (a type of double rounding) only works because it handles a case where 1.0049999 should be bumped up, but it will never be able to cross the boundary; e.g., if cumulative errors cause 1.0050000000001 to be turned into 1.00499999999999, it can't fix it.

You don't need big or small numbers to hit this. You only need some math and for the result to fall close to the boundary. The more math you do, the larger the possible deviations from the true result, and the more chance of straddling a boundary.

As requested, here is a searching test that does a simple calculation - amount * tax - and rounds it to 2 decimal places (i.e., dollars and cents). There are a few rounding methods in there; the one currently used, roundToTwoPlacesB, is a souped-up version of yours (by increasing the multiplier for n in the first rounding you make it a lot more sensitive - the original version fails right away on trivial inputs).

The test spits out the failures it finds, and they come in bunches. For example, the first few failures:

Failed for 1234.57 * 0.5000 = 617.28 vs 617.29
Raw result : 617.2850000000000000000000, Double.toString(): 617.29
Failed for 1234.61 * 0.5000 = 617.30 vs 617.31
Raw result : 617.3050000000000000000000, Double.toString(): 617.31
Failed for 1234.65 * 0.5000 = 617.32 vs 617.33
Raw result : 617.3250000000000000000000, Double.toString(): 617.33
Failed for 1234.69 * 0.5000 = 617.34 vs 617.35
Raw result : 617.3450000000000000000000, Double.toString(): 617.35

Note that the "raw result" (i.e., the exact unrounded result) is always close to a x.xx5000 boundary. Your rounding method errs both on the high and low sides. You can't fix it generically.

Imprecise Calculations

Several of the java.lang.Math methods don't require correctly rounded results, but rather allow errors of up to 2.5 ulp. Granted, you probably aren't going to be using the hyperbolic functions much with currency, but functions such as exp() and pow() often find their way into currency calculations, and these are only specified to an accuracy of 1 ulp. So the number is already "wrong" when it is returned.

This interacts with the "Inexact Representations" issue, since this type of error is much more serious than that from the normal mathematical operations, which at least choose the best possible value from within the representable domain of double. It means that you can have many more round-boundary-crossing events when you use these methods.
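
For instance, a small sketch using the 5%-over-30-years factor from the table above (assumes import java.math.BigDecimal; the trailing digits of the Math.pow result are the part only guaranteed to within 1 ulp):

double factor = Math.pow(1.05, 30);                        // ≈ 4.32194237515..., trailing digits not guaranteed
BigDecimal exactFactor = new BigDecimal("1.05").pow(30);   // exact: the scale simply grows to 60 digits
System.out.println(factor);
System.out.println(exactFactor);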

Float and BigDecimal precision difference

3.1 defines a double while 3.1f defines a float. What you see is the problem the float has representing that value (a float uses "only" 32 bits, a double 64 bits).

If you want to define 3.1 exactly using BigDecimal, use the String constructor:

BigDecimal foo = new BigDecimal("3.1");
System.out.println(foo);

Output:

3.1
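
Compare that with the double-based constructor, which faithfully records the double's binary approximation rather than the literal you typed - a common beginner surprise:

System.out.println(new BigDecimal(3.1));     // prints the long decimal expansion of the double nearest to 3.1
System.out.println(new BigDecimal("3.1"));   // 3.1
System.out.println(BigDecimal.valueOf(3.1)); // 3.1 (valueOf goes through Double.toString)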

When would you use a BigInteger object instead of simply using double?

When you are using BigInteger, you can't use operators such as *. You must use the methods of the BigInteger class:

return factorial(a-1).multiply(a);

The reason for using BigInteger instead of double is precision: double has only about 15-16 significant decimal digits, so large integers (anything beyond 2^53) can't all be represented accurately.

EDIT: You should actually use

return BigInteger.valueOf(a).multiply(factorial(a-1));

since BigInteger's multiply(long v) overload is package-private.
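
Put together, a minimal sketch of the factorial in question might look like this (assuming the parameter is an int; the method name comes from the snippet above):

import java.math.BigInteger;

static BigInteger factorial(int a) {
    if (a <= 1) {
        return BigInteger.ONE;
    }
    return BigInteger.valueOf(a).multiply(factorial(a - 1));
}

For example, factorial(21) already overflows long and is far beyond 2^53, the limit below which double can still represent every integer exactly.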


