Should I use double data structure to store very large Integer values?
Whether double arithmetic is slow compared to integer arithmetic depends on the CPU and on the bit size of the integer/double.
On modern hardware floating point arithmetic is generally not slow. Even though the general rule may be that integer arithmetic is typically a bit faster than floating point arithmetic, this is not always true. For instance, multiplication and division can even be significantly faster in floating point than their integer counterparts (see this answer).
This may be different on embedded systems with no hardware support for floating point, where double arithmetic will be extremely slow.
Regarding your original problem: note that a 64-bit long long int can store all integers up to 2^63 - 1 exactly, while a double can store integers exactly only up to 2^53. A double can hold larger numbers, but not every integer: beyond 2^53 they get rounded.
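A quick sketch of this boundary in Python, whose float is a 64-bit IEEE double:

```python
# A 64-bit signed integer can hold every value up to 2**63 - 1 exactly.
print((2**63 - 1).bit_length())   # 63 value bits, plus a sign bit

# A double represents every integer up to 2**53 exactly...
assert float(2**53) == 2**53
# ...but past that point integers start to get rounded:
assert float(2**53 + 1) == float(2**53)   # 2**53 + 1 rounds back to 2**53
assert float(2**53 + 2) == 2**53 + 2      # even integers are still exact here
```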
The nice thing about floating point is that it is much more convenient to work with. It has special symbols for infinity (Inf) and for undefined results (NaN). This makes division by zero possible rather than an exception, and NaN can serve as a return value for errors or abnormal conditions. With integers one often uses -1 or the like to indicate an error; such a sentinel can propagate through calculations undetected, while a NaN cannot go undetected, because it propagates through every subsequent operation.
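To illustrate in Python (whose float is an IEEE double): a NaN survives further arithmetic and stays detectable, while an integer sentinel silently blends in. The safe_ratio helper below is hypothetical, just for the sketch:

```python
import math

def safe_ratio(a, b):
    # Hypothetical helper: return NaN instead of raising on division by zero.
    return a / b if b != 0 else float("nan")

bad = safe_ratio(1.0, 0.0)
result = bad * 100 + 5        # NaN propagates through every operation
assert math.isnan(result)     # ...so the error is still detectable here

# An integer sentinel does not propagate: -1 just becomes another number.
sentinel = -1
assert sentinel * 100 + 5 == -95   # the error is now invisible
```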
Practical example: the programming language MATLAB has double as its default data type. It is used even in cases where integers are typical, e.g. array indexing. Even though MATLAB is an interpreted language and not as fast as a compiled language such as C or C++, it is quite fast and a powerful tool.
Bottom line: using double instead of integers will not be slow. It is perhaps not the most efficient choice, but the performance hit is not severe (at least not on modern desktop hardware).
How to store a large (10 digits) integer?
Your concrete example could be stored in long (or java.lang.Long if necessary).
If at any point you need bigger numbers, you can try java.math.BigInteger (for integers) or java.math.BigDecimal (for decimals).
storing large integer
Question: Can we store this square root in 105.5 bits (rounded up to, say, 13 bytes + 2 bits) and later read and square the value to get the original value back?
No. You need to take log_2 of an integer (not of a floating-point number) to see how many bits it needs. For example, log_2(256) = 8 bits; that number can be stored as 0x100. However, log_2(256.123456789) ≈ 8 bits as well, even though there is obviously more information in that second number.
To get around this, you can multiply your value by a power of 2 or 10 and store that as an integer (this is essentially fixed point: http://en.wikipedia.org/wiki/Fixed-point_arithmetic). So in your example, multiply 57636793900346419278364744407607.475108338 by 10^9 to get the integer 57636793900346419278364744407607475108338, which is what you would store. Log_2 of that is 135.4, so you need at least 136 bits of information to store that number exactly.
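As a sketch in Python, whose arbitrary-precision integers make the bookkeeping easy, scaling the example value by 10^9 (the number of fractional digits) turns it into an exact integer:

```python
from decimal import Decimal, getcontext

s = "57636793900346419278364744407607.475108338"

# Scale by 10**9: dropping the decimal point is exactly that multiplication.
scaled = int(s.replace(".", ""))
print(scaled.bit_length())   # 136 bits are needed to store it exactly

# Recover the original value by dividing the scale factor back out.
getcontext().prec = 50       # enough significant digits for an exact result
assert Decimal(scaled) / 10**9 == Decimal(s)
```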
How to store BIG int values
For values which only require up to 28 digits, you can use System.Decimal. The way it's designed, you don't hit the scaling issue you do with double, where for large numbers the gap between two adjacent representable numbers is bigger than 1. Admittedly it looks slightly odd, given that decimal is usually used for non-integer values, but in some cases I think it's a perfectly reasonable use, so long as you document it properly.
For values bigger than that, you can use System.Numerics.BigInteger.
Another alternative is to just use double and accept that you don't get precision down to the integer. When it comes to the distance between galaxies, are you really going to have a value which is accurate to a metre anyway? It does depend on how you're going to use this: a nicely predictable integer type can certainly make testing simpler, but you should really think about where the values are going to go and what you're going to do with them.
Handling very large numbers in Python
Python supports a "bignum" integer type which can work with arbitrarily large numbers. In Python 2.5+, this type is called long and is separate from the int type, but the interpreter will automatically use whichever is more appropriate. In Python 3.0+, the separate long type has been dropped completely: int itself is arbitrary-precision.
That's just an implementation detail, though — as long as you have version 2.5 or better, just perform standard math operations and any number which exceeds the boundaries of 32-bit math will be automatically (and transparently) converted to a bignum.
You can find all the gory details in PEP 0237.
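A quick sketch of this in Python 3: no special types or imports are needed, ordinary arithmetic just works at any size.

```python
# Ordinary ints in Python 3 have arbitrary precision.
n = 2**10000
print(len(str(n)))        # 3011 decimal digits
assert n * n == 2**20000  # arithmetic stays exact, no overflow or rounding
```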
Is there a way to store a large number precisely in R?
You can try the bigz class from the gmp package:
> library("gmp")
> 2^10000
[1] Inf
> 2^(as.bigz(10000))
[1] "199506...." (followed by a LOT more digits!)
It stores the number with arbitrary precision (via the GMP library) rather than as a native integer or double, and so avoids their limits.