Handling very large numbers in Python
Python supports a "bignum" integer type which can work with arbitrarily large numbers. In Python 2.5+, this type is called long
and is separate from the int
type, but the interpreter will automatically use whichever is more appropriate. In Python 3.0+, the int
type has been dropped completely.
That's just an implementation detail, though — as long as you have version 2.5 or better, just perform standard math operations and any number which exceeds the boundaries of 32-bit math will be automatically (and transparently) converted to a bignum.
You can find all the gory details in PEP 0237.
Does Python have trouble with large numbers/lists, or is there something wrong with my code?
Somewhere along the way you are getting a floating point error. "But I am using integers!" I hear you say, well Python is doing a float division on this line:
i = n / 2
Seems innocuous enough, but changing to integer division solves the problem:
i = n // 2
After several hundred values, one of those divisions is giving you an error with the value being some epsilon less than the actual integer value, which then rounds down when you call int(n)
.
EDIT: After double checking my last point to find the value that fails, I wasn't quite correct. What is actually happening is that while integer division is always accurate thanks to Pythons BigInt implementation, float division isn't since it still uses regular floating point numbers for speed. If your number is large enough then there simply aren't enough bytes to accurately store the number which will result in rounding errors.
The number in question is 19981441939834942
. With integer division you get 9990720969917471
, while float division yields 9990720969917472.0
.
This will be a problem in any language you use (except that most other languages won't allow you to accidentally use float division on integers), so make sure to use integer divisions!
Working with big numbers in Python and writing them to file
I would like to give a lavish ;-) answer, but don't have the time now. Elaborating on my comment, the decimal
module is what you really want here. It's much faster at computing the power, and very very much faster to convert the result to a decimal string:
>>> import decimal
You need to change its internals so that it avoids floating point, giving it more than enough internal digits to store the final result. We want exact integer arithmetic here, not rounded floating-point. So we fiddle things so decimal
uses as much precision as it's capable of using, and tell it to raise the "Inexact" exception if it ever loses information to rounding. Note that you need a 64-bit version of Python for decimal
to be capable of using enough precision to hold the exact result in your example:
>>> import decimal
>>> c = decimal.getcontext()
>>> c.prec = decimal.MAX_PREC
>>> c.Emax = decimal.MAX_EMAX
>>> c.Emin = decimal.MIN_EMIN
>>> c.traps[decimal.Inexact] = 1
Now create a Decimal
for the base:
>>> base = decimal.Decimal(12345678901234567890123456)
>>> base
Decimal('12345678901234567890123456')
And raise to the power - the exponent will automatically be converted to Decimal
, because the base is already Decimal
:
>>> x = base ** 12345678
That takes less than a minute on my box! The reasons for that are involved. It's not really because it's working in base 10, but because the person who wrote the decimal
module implemented "advanced" algorithms for doing very large multiplications.
Now convert to a string. Because it's already stored in a variant of base 10, converting to a decimal string goes very fast (a few seconds on my box, just because the string has hundreds of millions of digits):
>>> y = str(x)
>>> len(y)
309771765
And, for sanity, let's just look at the last 10, and first 10, digits:
>>> y[-10:]
'6044706816'
>>> y[:10]
'2759594879'
As @StefanPochmann noted in a comment, the last 10 digits can be obtained very quickly with native ints by using modular (3-argument) pow()
:
>>> pow(int(base), 12345678, 10**10)
6044706816
Which matches the last 10 digits of the string above. For the first 10 digits, we can use decimal
again but with much less precision, which will cause it (you'll just to have trust me on this) to use a different approach under the covers:
>>> c.prec = 12
>>> c.traps[decimal.Inexact] = 0 # don't trap on rounding!
>>> base ** 12345678
Decimal('2.75959487945E+309771764')
Rounding that back to 10 digits matches the earlier result, and the exponent is consistent with the length of y
too.
How to compute huge numbers with python?
You can do this with the Decimal library:
from decimal import Decimal
X = 53710695204323513509337733909021562547350740845028323195225592059762435955297110591848019878050853425581981564064692996024279718640577281681757923541806197728862534268310235863990001242041406600195234734872865710114622767319497082014412908147635982838670976889326329911714511434374891326542317244606912177994106645736126820796903212224
p = 79293686916250308867562846577205340336400039290615139607865873515636529820700152685808430350565795397930362488139681935988728405965018046160143856932183271822052154707966219579166490625165957544852172686883789422725879425460374250873493847078682057057098206096021890926255094441718327491846721928463078710174998090939469826268390010887
g = 73114111352295288774462814798129374078459933691513097211327217058892903294045760490674069858786617415857709128629468431860886481058309114786300536376329001946020422132220459480052973446624920516819751293995944131953830388015948998083956038870701901293308432733590605162069671909743966331031815478333541613484527212362582446507824584241
X=Decimal(X)
p=Decimal(p)
g=Decimal(g)
print X.ln() - abs(p).ln()/g.ln()
gives
769.7443428855116199351294830
Related Topics
How to Create a Zip Archive of a Directory
How to Extract Text from a PDF File
Element-Wise Addition of 2 Lists
How to Round to 2 Decimals with Python
How Accurate Is Python's Time.Sleep()
Capturing Repeating Subpatterns in Python Regex
Removing Elements That Have Consecutive Duplicates
Why Is the Pygame Animation Is Flickering
Understanding _Get_ and _Set_ and Python Descriptors
Convert Pyspark String to Date Format
Getting Python Error "From: Can't Read /Var/Mail/Bio"
Create an Empty List with Certain Size in Python
Dynamically Evaluate an Expression from a Formula in Pandas
Passing Variable from Python Script to Bash Script