Exactly Storing Large Integers

Should I use double data structure to store very large Integer values?

Whether double arithmetic is slow as compared to integer arithmetic depends on the CPU and the bit size of the integer/double.

On modern hardware, floating point arithmetic is generally not slow. Even though the general rule may be that integer arithmetic is a bit faster than floating point arithmetic, this is not always true. For instance, multiplication and division can even be significantly faster for floating point than for their integer counterparts.

This may be different for embedded systems with no hardware support for floating point. Then double arithmetic will be extremely slow.

Regarding your original problem: You should note that a 64-bit long long int can store every integer from -2^63 to 2^63 - 1 exactly, while a double can store every integer exactly only up to 2^53 in magnitude. A double can hold larger numbers, but not all integers above that limit: they get rounded to the nearest representable value.
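
A minimal sketch of that 2^53 limit. Python is used here only because its float is an IEEE 754 double and its int is arbitrary precision; the helper name survives_double is made up for illustration.

```python
# Python's float is an IEEE 754 double; its int is arbitrary precision.
LIMIT = 2 ** 53  # 9007199254740992


def survives_double(n: int) -> bool:
    """Return True if n is unchanged by a round trip through a double."""
    return int(float(n)) == n


print(survives_double(LIMIT))        # True
print(survives_double(LIMIT + 1))    # False: rounded to a neighbouring value
print(survives_double(2 ** 63 - 1))  # False: largest signed 64-bit value
```

Above 2^53 only some integers (for example the even ones just past the limit) still survive the round trip, which is why the rounding can go unnoticed for a while.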

The nice thing about floating point is that it is often more convenient to work with. There are special values for infinity (Inf) and for undefined results (NaN). This means that, for instance, division by zero yields a value instead of raising an exception. One can also use NaN as a return value in case of an error or abnormal condition. With integers one often uses -1 or a similar sentinel to indicate an error, and that value can propagate through calculations undetected, whereas NaN stays NaN as it propagates and so remains detectable.
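
The propagation point can be sketched as follows. Both helper functions are hypothetical and exist only for this illustration. Note also that Python itself raises ZeroDivisionError for 1.0 / 0.0 instead of returning Inf, so the sketch shows only the NaN-versus-sentinel behaviour.

```python
import math


def mean_or_nan(values):
    """Hypothetical helper: return NaN instead of failing on empty input."""
    if not values:
        return math.nan
    return sum(values) / len(values)


def mean_or_minus_one(values):
    """Hypothetical helper: same idea, but with -1 as the error marker."""
    if not values:
        return -1
    return sum(values) / len(values)


# The NaN marker survives further arithmetic and can be checked at the end.
result = mean_or_nan([]) * 2 + 10
print(math.isnan(result))  # True

# The -1 marker silently turns into an ordinary-looking number.
result = mean_or_minus_one([]) * 2 + 10
print(result)  # 8
```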

Practical example: The programming language MATLAB has double as the default data type. It is always used, even in cases where integers are typically used, e.g. array indexing. Even though MATLAB is an interpreted language and not as fast as a compiled language such as C or C++, it is quite fast and a powerful tool.

Bottom line: Using double instead of integers will not be slow. It is perhaps not the most efficient choice, but the performance hit is not severe (at least not on modern desktop computer hardware).

How to store a large (10 digits) integer?

Your concrete example could be stored in long (or java.lang.Long if this is necessary).

If at any point you need bigger numbers, you can try
java.math.BigInteger (if integer), or java.math.BigDecimal (if decimal)

Storing a large integer


Question: Can we store this square root in 105.5 bits (rounded to something like 13 bytes + 2 bits) and later read it back and square the value to get the original value?

No. You need to take log_2 of an integer (not a floating-point value) to see how many bits it needs. Example: log_2(256) = 8, so 256 needs 9 bits and can be stored as binary 100000000 (0x100). However, log_2(256.123456789) ~= 8 as well, even though the second number obviously carries more information.

To get around this, you could multiply your value by a power of 2 or 10 and store that as an integer (this is essentially fixed point: http://en.wikipedia.org/wiki/Fixed-point_arithmetic). So in your example, multiply 57636793900346419278364744407607.475108338 by 10^9 to get the integer 57636793900346419278364744407607475108338, which is what you would store. Log_2 of that is about 135.4, so you need at least 136 bits to store that number exactly.
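
As a sketch of that scaling step, the following uses Python's fractions module for exact arithmetic (a choice made here, not something the question prescribes). The variable names are illustrative.

```python
from fractions import Fraction

text = "57636793900346419278364744407607.475108338"
SCALE = 10 ** 9  # nine decimal places, as in the example above

scaled = Fraction(text) * SCALE  # exact rational arithmetic, no rounding
assert scaled.denominator == 1   # nine places were enough to reach an integer
stored = int(scaled)

print(stored)               # 57636793900346419278364744407607475108338
print(stored.bit_length())  # 136

# Dividing by the same scale restores the original value exactly.
restored = Fraction(stored, SCALE)
print(restored == Fraction(text))  # True
```

The scale factor has to be stored or agreed on separately, since the integer alone does not say where the decimal point belongs.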

How to store BIG int values

For values which only require up to 28 digits, you can use System.Decimal. The way it's designed, you don't encounter the scaling issue you do with double, where for large numbers the gap between two adjacent numbers is bigger than 1. Admittedly this looks slightly odd, given that decimal is usually used for non-integer values - but in some cases I think it's a perfectly reasonable use, so long as you document it properly.

For values bigger than that, you can use System.Numerics.BigInteger.

Another alternative is to just use double and accept that you really don't get a precision down to the integer. When it comes to the distance between galaxies, are you really going to have a value which is accurate to a metre anyway? It does depend on how you're going to use this - it can certainly make testing simpler if you use a nicely-predictable integer type, but you should really think about where the values are going to go and what you're going to do with them.
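
To get a feel for how large the gap between adjacent double values becomes, here is a small sketch in Python, whose float is the same IEEE 754 double. The distance value is an arbitrary figure chosen for illustration, and math.ulp requires Python 3.9 or later.

```python
import math  # math.ulp requires Python 3.9+

distance_m = 1e20  # hypothetical distance in metres, for illustration only

# Spacing between this double and the next larger representable one.
print(math.ulp(distance_m))          # 16384.0

# Adding one metre is lost entirely at this magnitude.
print(distance_m + 1 == distance_m)  # True

# An exact integer type keeps the difference.
print(10 ** 20 + 1 == 10 ** 20)      # False
```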

Handling very large numbers in Python

Python supports a "bignum" integer type which can work with arbitrarily large numbers. In Python 2.5+, this type is called long and is separate from the int type, but the interpreter will automatically use whichever is more appropriate. In Python 3.0+, the separate long type has been dropped completely: the two types were unified, and int itself is arbitrary precision.

That's just an implementation detail, though: as long as you have version 2.5 or better, just perform standard math operations and any number which exceeds the boundaries of 32-bit math will be automatically (and transparently) converted to a bignum.
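
A short sketch of that transparent behaviour, assuming Python 3 (where the unified type is reported as int):

```python
import sys

big = 2 ** 100  # far beyond any 32-bit or 64-bit machine integer
print(big)                # 1267650600228229401496703205376
print(type(big))          # <class 'int'> on Python 3
print(big > sys.maxsize)  # True

# Arithmetic stays exact at this size.
print((big + 1) - big)    # 1
```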

You can find all the gory details in PEP 0237.

Is there a way to store a large number precisely in R?

You can try the bigz class from the gmp package:

> library("gmp")
> 2^10000
[1] Inf
> 2^(as.bigz(10000))
[1] "199506.... and a LOT of more numbers!

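It stores the number in GMP's arbitrary-precision format rather than as a native integer or double (the value is printed as a string of digits), thereby avoiding the integer/double limits.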


