Handling Very Large Numbers in Python

Handling very large numbers in Python

Python supports a "bignum" integer type which can work with arbitrarily large numbers. In Python 2.5+, this type is called long and is separate from the int type, but the interpreter will automatically use whichever is more appropriate. In Python 3.0+, the int type has been dropped completely.

That's just an implementation detail, though — as long as you have version 2.5 or better, just perform standard math operations and any number which exceeds the boundaries of 32-bit math will be automatically (and transparently) converted to a bignum.

You can find all the gory details in PEP 0237.

Does Python have trouble with large numbers/lists, or is there something wrong with my code?

Somewhere along the way you are getting a floating point error. "But I am using integers!" I hear you say, well Python is doing a float division on this line:

i = n / 2

Seems innocuous enough, but changing to integer division solves the problem:

i = n // 2

After several hundred values, one of those divisions is giving you an error with the value being some epsilon less than the actual integer value, which then rounds down when you call int(n).

EDIT: After double checking my last point to find the value that fails, I wasn't quite correct. What is actually happening is that while integer division is always accurate thanks to Pythons BigInt implementation, float division isn't since it still uses regular floating point numbers for speed. If your number is large enough then there simply aren't enough bytes to accurately store the number which will result in rounding errors.

The number in question is 19981441939834942. With integer division you get 9990720969917471, while float division yields 9990720969917472.0.

This will be a problem in any language you use (except that most other languages won't allow you to accidentally use float division on integers), so make sure to use integer divisions!

Working with big numbers in Python and writing them to file

I would like to give a lavish ;-) answer, but don't have the time now. Elaborating on my comment, the decimal module is what you really want here. It's much faster at computing the power, and very very much faster to convert the result to a decimal string:

>>> import decimal

You need to change its internals so that it avoids floating point, giving it more than enough internal digits to store the final result. We want exact integer arithmetic here, not rounded floating-point. So we fiddle things so decimal uses as much precision as it's capable of using, and tell it to raise the "Inexact" exception if it ever loses information to rounding. Note that you need a 64-bit version of Python for decimal to be capable of using enough precision to hold the exact result in your example:

>>> import decimal
>>> c = decimal.getcontext()
>>> c.prec = decimal.MAX_PREC
>>> c.Emax = decimal.MAX_EMAX
>>> c.Emin = decimal.MIN_EMIN
>>> c.traps[decimal.Inexact] = 1

Now create a Decimal for the base:

>>> base = decimal.Decimal(12345678901234567890123456)
>>> base
Decimal('12345678901234567890123456')

And raise to the power - the exponent will automatically be converted to Decimal, because the base is already Decimal:

>>> x = base ** 12345678

That takes less than a minute on my box! The reasons for that are involved. It's not really because it's working in base 10, but because the person who wrote the decimal module implemented "advanced" algorithms for doing very large multiplications.

Now convert to a string. Because it's already stored in a variant of base 10, converting to a decimal string goes very fast (a few seconds on my box, just because the string has hundreds of millions of digits):

>>> y = str(x)
>>> len(y)
309771765

And, for sanity, let's just look at the last 10, and first 10, digits:

>>> y[-10:]
'6044706816'
>>> y[:10]
'2759594879'

As @StefanPochmann noted in a comment, the last 10 digits can be obtained very quickly with native ints by using modular (3-argument) pow():

>>> pow(int(base), 12345678, 10**10)
6044706816

Which matches the last 10 digits of the string above. For the first 10 digits, we can use decimal again but with much less precision, which will cause it (you'll just to have trust me on this) to use a different approach under the covers:

>>> c.prec = 12
>>> c.traps[decimal.Inexact] = 0 # don't trap on rounding!
>>> base ** 12345678
Decimal('2.75959487945E+309771764')

Rounding that back to 10 digits matches the earlier result, and the exponent is consistent with the length of y too.

How to compute huge numbers with python?

You can do this with the Decimal library:

from decimal import Decimal

X = 53710695204323513509337733909021562547350740845028323195225592059762435955297110591848019878050853425581981564064692996024279718640577281681757923541806197728862534268310235863990001242041406600195234734872865710114622767319497082014412908147635982838670976889326329911714511434374891326542317244606912177994106645736126820796903212224
p = 79293686916250308867562846577205340336400039290615139607865873515636529820700152685808430350565795397930362488139681935988728405965018046160143856932183271822052154707966219579166490625165957544852172686883789422725879425460374250873493847078682057057098206096021890926255094441718327491846721928463078710174998090939469826268390010887
g = 73114111352295288774462814798129374078459933691513097211327217058892903294045760490674069858786617415857709128629468431860886481058309114786300536376329001946020422132220459480052973446624920516819751293995944131953830388015948998083956038870701901293308432733590605162069671909743966331031815478333541613484527212362582446507824584241
X=Decimal(X)
p=Decimal(p)
g=Decimal(g)

print X.ln() - abs(p).ln()/g.ln()

gives

769.7443428855116199351294830


Related Topics



Leave a reply



Submit