Java Floating Point High Precision Library

Java floating point high precision library

There are three libraries mentioned on the Arbitrary Precision Arithmetic page: java.math (containing the mentioned BigDecimal), Apfloat and JScience. I run a little speed check on them which just uses addition and multiplication.

The result is that for a relatively small number of digits BigDecimal is OK (half as fast as the others for 1000 digits), but if you use more digits it is way off - JScience is about 4 times faster. But the clear performance winner is Apfloat. The other libraries seem to use naive multiplication algorithms that take time proportional to the square of the number of digits, but the time of Apfloat seems to grow almost linearly. On 10000 digits it was 4 times as fast as JScience, but on 40000 digits it is 16 times as fast as JScience.

On the other hand: JScience provides EXCELLENT functionality for mathematical problems: matrices, vectors, symbolic algorithms, solution of equation systems and what not. So I'll probably go with JScience and later write a wrapper to integrate Apfloat into the algorithms of JScience - due to the good design this seems easily possible.

(UPDATE: I wrote a test suite for the number package of JScience and fixed a number of bugs. This went into release 4.3.1. So I can recommend checking it out.)

Retain precision with double in Java

As others have mentioned, you'll probably want to use the BigDecimal class, if you want to have an exact representation of 11.4.

Now, a little explanation into why this is happening:

The float and double primitive types in Java are floating point numbers, where the number is stored as a binary representation of a fraction and a exponent.

More specifically, a double-precision floating point value such as the double type is a 64-bit value, where:

  • 1 bit denotes the sign (positive or negative).
  • 11 bits for the exponent.
  • 52 bits for the significant digits (the fractional part as a binary).

These parts are combined to produce a double representation of a value.

(Source: Wikipedia: Double precision)

For a detailed description of how floating point values are handled in Java, see the Section 4.2.3: Floating-Point Types, Formats, and Values of the Java Language Specification.

The byte, char, int, long types are fixed-point numbers, which are exact representions of numbers. Unlike fixed point numbers, floating point numbers will some times (safe to assume "most of the time") not be able to return an exact representation of a number. This is the reason why you end up with 11.399999999999 as the result of 5.6 + 5.8.

When requiring a value that is exact, such as 1.5 or 150.1005, you'll want to use one of the fixed-point types, which will be able to represent the number exactly.

As has been mentioned several times already, Java has a BigDecimal class which will handle very large numbers and very small numbers.

From the Java API Reference for the BigDecimal class:

Immutable,
arbitrary-precision signed decimal
numbers. A BigDecimal consists of an
arbitrary precision integer unscaled
value and a 32-bit integer scale. If
zero or positive, the scale is the
number of digits to the right of the
decimal point. If negative, the
unscaled value of the number is
multiplied by ten to the power of the
negation of the scale. The value of
the number represented by the
BigDecimal is therefore (unscaledValue
× 10^-scale).

There has been many questions on Stack Overflow relating to the matter of floating point numbers and its precision. Here is a list of related questions that may be of interest:

  • Why do I see a double variable initialized to some value like 21.4 as 21.399999618530273?
  • How to print really big numbers in C++
  • How is floating point stored? When does it matter?
  • Use Float or Decimal for Accounting Application Dollar Amount?

If you really want to get down to the nitty gritty details of floating point numbers, take a look at What Every Computer Scientist Should Know About Floating-Point Arithmetic.

JVM Arbitrary Precision Libraries

Unfortunately, I think you are out of luck for a Java native library. I have not found one. I recommend wrapping GMP, which has excellent arbitrary precision performance, using JNI. There is JNI overhead, but if you're in the 1500 digit range, that should be small compared to the difference in algorithmic complexity. You can find various wrappings of GMP for Java (I believe the most popular one is here).

Higher precision doubles and trigonometric functions in Java

I'm pretty sure primitives won't get you there, but the BigDecimal class is as good as it gets (for everything except trigonometry).

For trigonometric functions, however, you will have to resort to an external library, like APFloat (see this previous question).

fixed point arithmetics in java with fast performance

Are you completely sure BigDecimal is the performance problem? Did you use a profiler to find out? If yes, two options that could help are:

1) Use long and multiply all values by a factor (for example 100 if you are interested in cents).

2) Use a specially designed class that implements something similar to BigDecimal, but using long internally. I don't know if a good open source library exists (maybe the Java Math Fixed Point Library?). I wrote one such class myself quite a long time ago (2001 I believe) for J2ME. It's a bit tricky to get right. Please note BigDecimal uses a long internally as well except if high precision is needed, so this solution will only help a tiny bit in most cases.

Using double isn't a good option in many cases, because of rounding and precision problems.

Significant precision difference in floating point calculations in Java / C#

The cause of not getting 15 digits is the subtraction of two similar numbers. z / Math.sin(lat0) is about 6347068.978968251. n0 * (1 - eSq) is about 6346068.978912728. With 7 digits before the decimal point, a change in the 9th decimal place in the subtraction result corresponds to a change of less than one part in 10^15 in one of those inputs.

The simplest solution to this type of problem is usually to display only the digits that are supported by reliable digits in the inputs. Very few measurements can be done to one part in 10^12, so in this case it is almost certain that the digits that differ due to floating point rounding error would be dropped.

For example, it looks as though your data relates to location. One of the most carefully measured pieces of such data is the height of Mount Everest. The current best estimate, based on high precision GPS measurement, is "29,035 feet, with an error margin of plus or minus 6.5 feet" Encyclopedia Britannica. An error in the 13th significant digit corresponds to an error of less than one thousandth of an inch in measuring the circumference of the earth.

If the rounding error really is significant relative to the result requirements, and to what is realistically achievable given the accuracy of the inputs, then you might need to look at more clever ways of arranging the calculation or at higher precision arithmetic.

x86 80-bit floating point type in Java

An 80-bit value should be best held as combination of a long (for the mantissa) and an int for the exponent and sign. For many operations, it will probably be most practical to place the upper and lower halves of the long into separate "long" values, so the code for addition of two numbers with matching signs and exponents would probably be something like:

long resultLo = (num1.mant & 0xFFFFFFFFL)+(num2.mant & 0xFFFFFFFFL);
long resultHi = (num1.mant >>> 32)+(num2.mant >>> 32)+(resultLo >>> 32);
result.exp = num1.exp; // Should match num2.exp
if (resultHi > 0xFFFFFFFFL) {
exponent++;
resultHi = (resultHi + ((resultHi & 2)>>>1)) >>> 1; // Round the result
}
rest.mant = (resultHi << 32) + resultLo;

A bit of a nuisance all around, but not completely unworkable. The key is
to break numbers into pieces small enough that you can do all your math as
type "long".

BTW, note that if one of the numbers did not originally have the same exponent,
it will be necessary to keep track of whether any bits "fell off the end" when
shifting it left or right to match the exponent of the first number, so as to
be able to properly round the result afterward.

Library for strict floating point Math in .NET

Unfortunately there is no way to enforce FP strictness in C#, the .Net CLR just lacks the ability to do calculations with less precision that the maximum that is can.

I think there's a performance gain for this - it doesn't check that you might want less precision. There's also no need - .Net doesn't run in a virtual machine and so doesn't worry about different floating point processors.

However isn't strictfp optional? Could you execute your Java code without the strictfp modifier? Then it should pick up the same floating point mechanism as .Net

So instead of forcing .Net to use strictfp and checking it comes out with the same values as your Java code you could force Java to not use strictfp and check that it then comes out the same as the .Net code.

Float and double datatype in Java

The Wikipedia page on it is a good place to start.

To sum up:

  • float is represented in 32 bits, with 1 sign bit, 8 bits of exponent, and 23 bits of the significand (or what follows from a scientific-notation number: 2.33728*1012; 33728 is the significand).

  • double is represented in 64 bits, with 1 sign bit, 11 bits of exponent, and 52 bits of significand.

By default, Java uses double to represent its floating-point numerals (so a literal 3.14 is typed double). It's also the data type that will give you a much larger number range, so I would strongly encourage its use over float.

There may be certain libraries that actually force your usage of float, but in general - unless you can guarantee that your result will be small enough to fit in float's prescribed range, then it's best to opt with double.

If you require accuracy - for instance, you can't have a decimal value that is inaccurate (like 1/10 + 2/10), or you're doing anything with currency (for example, representing $10.33 in the system), then use a BigDecimal, which can support an arbitrary amount of precision and handle situations like that elegantly.



Related Topics



Leave a reply



Submit