How Math.Pow (And So On) Actually Works

pow is usually evaluated with this formula:

x^y = exp2(y*log2(x))

The functions exp2(x) and log2(x) are implemented directly in the FPU. If you want to implement bignums, they can also be evaluated with basic operators, using a precomputed table of successive square roots of 2 to speed up the process:

2^(1/2), 2^(1/4), 2^(1/8), 2^(1/16), 2^(1/32) ...
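As a rough illustration of the table idea, here is a minimal Python sketch. The names SQRT_POWERS, exp2_table and pow_via_log2 are made up for this example, math.log2 stands in for a real log2 implementation, and only positive bases are handled.

```python
import math

# SQRT_POWERS[i] = 2**(1/2**(i+1)), i.e. 2^(1/2), 2^(1/4), 2^(1/8), ...
# built by taking square roots repeatedly (52 entries cover a double's mantissa).
SQRT_POWERS = []
_value = 2.0
for _ in range(52):
    _value = math.sqrt(_value)
    SQRT_POWERS.append(_value)


def exp2_table(t):
    """Approximate 2**t using only multiplications and the table."""
    n = math.floor(t)              # integer part becomes a binary exponent
    frac = t - n                   # fractional part in [0, 1)
    result = 1.0
    for root in SQRT_POWERS:
        frac *= 2.0                # shift out the next binary digit of frac
        if frac >= 1.0:
            frac -= 1.0
            result *= root         # digit is 1: multiply by the matching root
    return math.ldexp(result, n)   # result * 2**n


def pow_via_log2(x, y):
    """x**y = exp2(y * log2(x)); this sketch accepts x > 0 only."""
    if x <= 0.0:
        raise ValueError("sketch handles positive bases only")
    return exp2_table(y * math.log2(x))
```

Each binary digit of the fractional part selects one table entry, so the cost is at most one multiplication per mantissa bit.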

If you also need to handle roots of negative bases, see:

real domain pow based on complex domain math

What's the algorithm behind Math.pow() in Java

This is a fun question. If you look into the source code for Java's Math class, you will find that it calls StrictMath.pow(double1, double2), and StrictMath's signature is public static native double pow(double a, double b);

So, in the end, it is a truly native call that might differ depending on the platform. However, there is an implementation somewhere, and it isn't very easy to look at. Here is the description of the function and the code for the function itself:

Note

Looking through the math and trying to understand it might inevitably lead to even more questions. But by searching through this GitHub repository of Java math function source code and glancing at the mathematical summaries, you can definitely understand the native functions better. Happy Exploring :)

Method Description

Method:  Let x =  2**n * (1+f)
      1. Compute and return log2(x) in two pieces:
              log2(x) = w1 + w2,
         where w1 has 53-24 = 29 bit trailing zeros.
      2. Perform y*log2(x) = n+y' by simulating multi-precision
         arithmetic, where |y'|<=0.5.
      3. Return x**y = 2**n*exp(y'*log2)
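The three steps above can be sketched in Python as follows. This is only an outline under simplifying assumptions: it works in plain double precision instead of the two-piece (w1 + w2) arithmetic, it uses math.log2 and math.exp in place of the polynomial approximations, and it assumes a positive, finite x. The names decompose and pow_sketch are invented for the example.

```python
import math


def decompose(x):
    """Split a positive finite x into (n, f) with x = 2**n * (1 + f), 0 <= f < 1."""
    m, e = math.frexp(x)              # x = m * 2**e with 0.5 <= m < 1
    return e - 1, 2.0 * m - 1.0


def pow_sketch(x, y):
    n, f = decompose(x)
    log2_x = n + math.log2(1.0 + f)   # step 1, kept as a single double here
    t = y * log2_x
    k = round(t)                      # step 2: t = k + y_prime with |y_prime| <= 0.5
    y_prime = t - k
    # step 3: 2**k * exp(y_prime * ln 2)
    return math.ldexp(math.exp(y_prime * math.log(2.0)), k)
```

Splitting off the integer part keeps the argument of exp small, which is what makes a short polynomial approximation accurate enough in the real implementation.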

Special Cases

      1.  (anything) ** 0  is 1
      2.  (anything) ** 1  is itself
      3.  (anything) ** NAN is NAN
      4.  NAN ** (anything except 0) is NAN
      5.  +-(|x| > 1) **  +INF is +INF
      6.  +-(|x| > 1) **  -INF is +0
      7.  +-(|x| < 1) **  +INF is +0
      8.  +-(|x| < 1) **  -INF is +INF
      9.  +-1         ** +-INF is NAN
      10. +0 ** (+anything except 0, NAN)               is +0
      11. -0 ** (+anything except 0, NAN, odd integer)  is +0
      12. +0 ** (-anything except 0, NAN)               is +INF
      13. -0 ** (-anything except 0, NAN, odd integer)  is +INF
      14. -0 ** (odd integer) = -( +0 ** (odd integer) )
      15. +INF ** (+anything except 0,NAN) is +INF
      16. +INF ** (-anything except 0,NAN) is +0
      17. -INF ** (anything)  = -0 ** (-anything)
      18. (-anything) ** (integer) is (-1)**(integer)*(+anything**integer)
      19. (-anything except 0 and inf) ** (non-integer) is NAN
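A few of these rules can be checked from Python, with some caveats. Python's math.pow is a separate implementation, not the function listed here: it raises ValueError where the table says NAN for a negative base with a non-integer exponent, and its documentation states that pow(1.0, x) returns 1.0 even for NaN, which differs from rule 3. The sketch below therefore only exercises rules where the two are known to agree.

```python
import math

inf = float("inf")
nan = float("nan")

checks = {
    "nan ** 0 (rule 1)": math.pow(nan, 0.0),
    "2 ** +inf (rule 5)": math.pow(2.0, inf),
    "2 ** -inf (rule 6)": math.pow(2.0, -inf),
    "0.5 ** +inf (rule 7)": math.pow(0.5, inf),
    "0.5 ** -inf (rule 8)": math.pow(0.5, -inf),
    "-2 ** 3 (rule 18)": math.pow(-2.0, 3.0),
}
for label, value in checks.items():
    print(label, "->", value)

# Rule 19: Python reports the domain error as an exception instead of a NaN.
try:
    math.pow(-2.0, 0.5)
    rule_19 = "no error"
except ValueError:
    rule_19 = "ValueError"
print("-2 ** 0.5 (rule 19) ->", rule_19)
```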

Accuracy

       pow(x,y) returns x**y nearly rounded. In particular
                      pow(integer,integer)
       always returns the correct integer provided it is
       representable.

Source Code

#ifdef __STDC__
        double __ieee754_pow(double x, double y)
#else
        double __ieee754_pow(x,y)
        double x, y;
#endif
{
        double z,ax,z_h,z_l,p_h,p_l;
        double y1,t1,t2,r,s,t,u,v,w;
        int i0,i1,i,j,k,yisint,n;
        int hx,hy,ix,iy;
        unsigned lx,ly;

        i0 = ((*(int*)&one)>>29)^1; i1=1-i0;
        hx = __HI(x); lx = __LO(x);
        hy = __HI(y); ly = __LO(y);
        ix = hx&0x7fffffff;  iy = hy&0x7fffffff;

    /* y==zero: x**0 = 1 */
        if((iy|ly)==0) return one;

    /* +-NaN return x+y */
        if(ix > 0x7ff00000 || ((ix==0x7ff00000)&&(lx!=0)) ||
           iy > 0x7ff00000 || ((iy==0x7ff00000)&&(ly!=0)))
                return x+y;

    /* determine if y is an odd int when x < 0
     * yisint = 0       ... y is not an integer
     * yisint = 1       ... y is an odd int
     * yisint = 2       ... y is an even int
     */
        yisint  = 0;
        if(hx<0) {
            if(iy>=0x43400000) yisint = 2; /* even integer y */
            else if(iy>=0x3ff00000) {
                k = (iy>>20)-0x3ff;        /* exponent */
                if(k>20) {
                    j = ly>>(52-k);
                    if((j<<(52-k))==ly) yisint = 2-(j&1);
                } else if(ly==0) {
                    j = iy>>(20-k);
                    if((j<<(20-k))==iy) yisint = 2-(j&1);
                }
            }
        }

    /* special value of y */
        if(ly==0) {
            if (iy==0x7ff00000) {       /* y is +-inf */
                if(((ix-0x3ff00000)|lx)==0)
                    return  y - y;      /* (+-1)**+-inf is NaN */
                else if (ix >= 0x3ff00000)/* (|x|>1)**+-inf = inf,0 */
                    return (hy>=0)? y: zero;
                else                    /* (|x|<1)**-,+inf = inf,0 */
                    return (hy<0)?-y: zero;
            }
            if(iy==0x3ff00000) {        /* y is  +-1 */
                if(hy<0) return one/x; else return x;
            }
            if(hy==0x40000000) return x*x; /* y is  2 */
            if(hy==0x3fe00000) {        /* y is  0.5 */
                if(hx>=0)       /* x >= +0 */
                return sqrt(x);
            }
        }

        ax   = fabs(x);
    /* special value of x */
        if(lx==0) {
            if(ix==0x7ff00000||ix==0||ix==0x3ff00000){
                z = ax;                 /*x is +-0,+-inf,+-1*/
                if(hy<0) z = one/z;     /* z = (1/|x|) */
                if(hx<0) {
                    if(((ix-0x3ff00000)|yisint)==0) {
                        z = (z-z)/(z-z); /* (-1)**non-int is NaN */
                    } else if(yisint==1)
                        z = -1.0*z;             /* (x<0)**odd = -(|x|**odd) */
                }
                return z;
            }
        }

        n = (hx>>31)+1;

    /* (x<0)**(non-int) is NaN */
        if((n|yisint)==0) return (x-x)/(x-x);

        s = one; /* s (sign of result -ve**odd) = -1 else = 1 */
        if((n|(yisint-1))==0) s = -one;/* (-ve)**(odd int) */

    /* |y| is huge */
        if(iy>0x41e00000) { /* if |y| > 2**31 */
            if(iy>0x43f00000){  /* if |y| > 2**64, must o/uflow */
                if(ix<=0x3fefffff) return (hy<0)? huge*huge:tiny*tiny;
                if(ix>=0x3ff00000) return (hy>0)? huge*huge:tiny*tiny;
            }
        /* over/underflow if x is not close to one */
            if(ix<0x3fefffff) return (hy<0)? s*huge*huge:s*tiny*tiny;
            if(ix>0x3ff00000) return (hy>0)? s*huge*huge:s*tiny*tiny;
        /* now |1-x| is tiny <= 2**-20, suffice to compute
           log(x) by x-x^2/2+x^3/3-x^4/4 */
            t = ax-one;         /* t has 20 trailing zeros */
            w = (t*t)*(0.5-t*(0.3333333333333333333333-t*0.25));
            u = ivln2_h*t;      /* ivln2_h has 21 sig. bits */
            v = t*ivln2_l-w*ivln2;
            t1 = u+v;
            __LO(t1) = 0;
            t2 = v-(t1-u);
        } else {
            double ss,s2,s_h,s_l,t_h,t_l;
            n = 0;
        /* take care subnormal number */
            if(ix<0x00100000)
                {ax *= two53; n -= 53; ix = __HI(ax); }
            n  += ((ix)>>20)-0x3ff;
            j  = ix&0x000fffff;
        /* determine interval */
            ix = j|0x3ff00000;          /* normalize ix */
            if(j<=0x3988E) k=0;         /* |x|<sqrt(3/2) */
            else if(j<0xBB67A) k=1;     /* |x|<sqrt(3)   */
            else {k=0;n+=1;ix -= 0x00100000;}
            __HI(ax) = ix;

        /* compute ss = s_h+s_l = (x-1)/(x+1) or (x-1.5)/(x+1.5) */
            u = ax-bp[k];               /* bp[0]=1.0, bp[1]=1.5 */
            v = one/(ax+bp[k]);
            ss = u*v;
            s_h = ss;
            __LO(s_h) = 0;
        /* t_h=ax+bp[k] High */
            t_h = zero;
            __HI(t_h)=((ix>>1)|0x20000000)+0x00080000+(k<<18);
            t_l = ax - (t_h-bp[k]);
            s_l = v*((u-s_h*t_h)-s_h*t_l);
        /* compute log(ax) */
            s2 = ss*ss;
            r = s2*s2*(L1+s2*(L2+s2*(L3+s2*(L4+s2*(L5+s2*L6)))));
            r += s_l*(s_h+ss);
            s2  = s_h*s_h;
            t_h = 3.0+s2+r;
            __LO(t_h) = 0;
            t_l = r-((t_h-3.0)-s2);
        /* u+v = ss*(1+...) */
            u = s_h*t_h;
            v = s_l*t_h+t_l*ss;
        /* 2/(3log2)*(ss+...) */
            p_h = u+v;
            __LO(p_h) = 0;
            p_l = v-(p_h-u);
            z_h = cp_h*p_h;             /* cp_h+cp_l = 2/(3*log2) */
            z_l = cp_l*p_h+p_l*cp+dp_l[k];
        /* log2(ax) = (ss+..)*2/(3*log2) = n + dp_h + z_h + z_l */
            t = (double)n;
            t1 = (((z_h+z_l)+dp_h[k])+t);
            __LO(t1) = 0;
            t2 = z_l-(((t1-t)-dp_h[k])-z_h);
        }

    /* split up y into y1+y2 and compute (y1+y2)*(t1+t2) */
        y1  = y;
        __LO(y1) = 0;
        p_l = (y-y1)*t1+y*t2;
        p_h = y1*t1;
        z = p_l+p_h;
        j = __HI(z);
        i = __LO(z);
        if (j>=0x40900000) {                            /* z >= 1024 */
            if(((j-0x40900000)|i)!=0)                   /* if z > 1024 */
                return s*huge*huge;                     /* overflow */
            else {
                if(p_l+ovt>z-p_h) return s*huge*huge;   /* overflow */
            }
        } else if((j&0x7fffffff)>=0x4090cc00 ) {        /* z <= -1075 */
            if(((j-0xc090cc00)|i)!=0)           /* z < -1075 */
                return s*tiny*tiny;             /* underflow */
            else {
                if(p_l<=z-p_h) return s*tiny*tiny;      /* underflow */
            }
        }
    /*
     * compute 2**(p_h+p_l)
     */
        i = j&0x7fffffff;
        k = (i>>20)-0x3ff;
        n = 0;
        if(i>0x3fe00000) {              /* if |z| > 0.5, set n = [z+0.5] */
            n = j+(0x00100000>>(k+1));
            k = ((n&0x7fffffff)>>20)-0x3ff;     /* new k for n */
            t = zero;
            __HI(t) = (n&~(0x000fffff>>k));
            n = ((n&0x000fffff)|0x00100000)>>(20-k);
            if(j<0) n = -n;
            p_h -= t;
        }
        t = p_l+p_h;
        __LO(t) = 0;
        u = t*lg2_h;
        v = (p_l-(t-p_h))*lg2+t*lg2_l;
        z = u+v;
        w = v-(z-u);
        t  = z*z;
        t1  = z - t*(P1+t*(P2+t*(P3+t*(P4+t*P5))));
        r  = (z*t1)/(t1-two)-(w+z*w);
        z  = one-(r-z);
        j  = __HI(z);
        j += (n<<20);
        if((j>>20)<=0) z = scalbn(z,n); /* subnormal output */
        else __HI(z) += (n<<20);
        return s*z;
}
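Two ideas in this listing are easier to follow in isolation: the __HI/__LO macros read the upper and lower 32 bits of a double, and yisint records whether y is an odd or even integer. The Python sketch below mimics both. The function names hi_lo and classify_y are made up, and classify_y assumes y is neither NaN nor zero, since the listing handles those cases earlier.

```python
import math
import struct


def hi_lo(x):
    """Return (hi, lo): upper 32 bits as a signed int, lower 32 bits as unsigned."""
    hi, lo = struct.unpack(">iI", struct.pack(">d", x))
    return hi, lo


def classify_y(y):
    """Mimic yisint: 0 = not an integer, 1 = odd integer, 2 = even integer."""
    ay = abs(y)
    if ay >= 2.0 ** 53:        # corresponds to iy >= 0x43400000: treated as even
        return 2
    if ay < 1.0 or ay != math.floor(ay):
        return 0
    return 2 - (int(ay) & 1)   # odd -> 1, even -> 2


# The constants compared against hy in the listing are just these high words.
print(hex(hi_lo(1.0)[0]), hex(hi_lo(2.0)[0]), hex(hi_lo(0.5)[0]))
```

Masking the high word with 0x7fffffff, as the listing does for ix and iy, drops the sign bit so that magnitudes can be compared as plain integers.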

How to correctly work with Math.Pow()?

Pretty much none of these answers are right.

Your question is:

Where is the bug in my code?

If you are using double-precision arithmetic to solve a problem in the integers, you are doing something wrong. Do not use Math.Pow at all, and particularly do not use it to extract cube roots and expect that you will get an exact integer answer.
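To make the point concrete, here is a small sketch of how a cube root can be taken without any floating-point arithmetic. It is written in Python rather than C# for brevity, and the names integer_cube_root and is_perfect_cube are invented for the example.

```python
def integer_cube_root(n):
    """Largest r with r**3 <= n, for n >= 0, using integer arithmetic only."""
    if n < 0:
        raise ValueError("sketch handles non-negative integers only")
    lo, hi = 0, 1
    while hi ** 3 <= n:        # grow hi until hi**3 exceeds n
        hi *= 2
    # Invariant: lo**3 <= n < hi**3; binary search narrows the gap to 1.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid
    return lo


def is_perfect_cube(n):
    return n >= 0 and integer_cube_root(n) ** 3 == n
```

Because every intermediate value is an integer, the answer is exact by construction and does not depend on how a floating-point pow happens to round.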

So how should you actually solve this problem?

Let's be smarter about not doing unnecessary work. Your program discovers that 1³ + 12³ = 9³ + 10³, but also that 12³ + 1³ = 10³ + 9³, and so on. If you know the first one then you can know the second one pretty easily.

So what should we do to make this more efficient?

First, b must be larger than a. That way we never waste any time figuring out that 1³ + 12³ = 12³ + 1³.

Similarly, d must be larger than c.

Now, we can also say that c and d must be between a and b. Do you see why?

Once we put these restrictions in place:

    for (int a = 1; a <= 1000; ++a)
        for (int b = a + 1; b <= 1000; ++b)
            for (int c = a + 1; c < b; ++c)
                for (int d = c + 1; d < b; ++d)
                    if (a * a * a + b * b * b == c * c * c + d * d * d)
                        Console.WriteLine($"{a} {b} {c} {d}");

Your program becomes a lot faster.

Now, there are ways to make it faster still, if you're willing to trade more memory for less time. Can you think of some ways that this program is wasting time? How many times are the same computations done over and over again? How can we improve this situation?

We might notice for example that a * a * a is computed every time through the three inner loops!

    for (int a = 1; a <= 1000; ++a)
    {
        int a3 = a * a * a;
        for (int b = a + 1; b <= 1000; ++b)
        {
            int b3 = b * b * b;
            int sum = a3 + b3;
            for (int c = a + 1; c < b; ++c)
            {
                int c3 = c * c * c;
                int d3 = sum - c3;
                for (int d = c + 1; d < b; ++d)
                    if (d3 == d * d * d)
                        Console.WriteLine($"{a} {b} {c} {d}");
            }
        }
    }

But we could be even smarter than that. For example: what if we created a Dictionary<int, int>, called dict below, that maps cubes to their cube roots? There are only 1000 of them! Then we could say:

    for (int a = 1; a <= 1000; ++a)
    {
        int a3 = a * a * a;
        for (int b = a + 1; b <= 1000; ++b)
        {
            int b3 = b * b * b;
            int sum = a3 + b3;
            for (int c = a + 1; c < b; ++c)
            {
                int c3 = c * c * c;
                int d3 = sum - c3;
                if (dict.ContainsKey(d3))
                {
                    int d = dict[d3];
                    if (d > c) // skip the mirrored pair (d, c)
                        Console.WriteLine($"{a} {b} {c} {d}");
                }
            }
        }
    }

Now you don't have to compute cubes or cube roots of d; you just look up whether it is a cube, and if it is, what its cube root is.
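For comparison, the same lookup idea can be sketched in a few lines of Python. The function name find_cube_sum_pairs and the limit parameter are made up for this example; the structure follows the loops above, with the dictionary built up front.

```python
def find_cube_sum_pairs(limit):
    """Find a < c < d < b with a^3 + b^3 == c^3 + d^3, all values <= limit."""
    # Maps each cube to its cube root, so "is this a cube?" is one lookup.
    cube_root_of = {n ** 3: n for n in range(1, limit + 1)}
    found = []
    for a in range(1, limit + 1):
        a3 = a ** 3
        for b in range(a + 1, limit + 1):
            total = a3 + b ** 3
            for c in range(a + 1, b):
                d = cube_root_of.get(total - c ** 3)
                # d > c avoids reporting the same pair twice as (c, d) and (d, c).
                if d is not None and d > c:
                    found.append((a, b, c, d))
    return found


print(find_cube_sum_pairs(20))
```

Python integers do not overflow, so the sketch needs no care about the range of the sums; in C#, int is only just large enough for a limit of 1000.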

Why is math.pow not natively able to deal with ints? (floor/ceil, too)

java.lang.Math is just a port of what the C math library does.

For C, I think it comes down to the fact that CPUs have special instructions that help compute pow for floating-point numbers (but not for integers).

Of course, the language could still add an int implementation. BigInteger has one, in fact. It makes sense there, too, because pow tends to result in rather big numbers.

ceil and floor by definition return integers, so how come they don't return ints?

Floating point numbers can represent integers outside of the range of int. So if you take a double argument that is too big to fit into an int, there is no good way for floor to deal with it.

Power function giving different answer than math.pow function in C

You are mixing up two different types of floating-point data. The pow function uses the double type but your loop uses the float type (which has less precision).

You can make the results coincide by either using the double type for your x, power and copyx variables, or by calling the powf function (which uses the float type) instead of pow.

The latter adjustment (using powf) gives the following output (clang-cl compiler, Windows 10, 64-bit):

3^22 = 31381059584.000000
3^22 = 31381059584.000000

And, changing the first line of your main to double x = 3, power = 1, copyx; gives the following:

3^22 = 31381059609.000000
3^22 = 31381059609.000000

Note that, with larger and larger values of n, you are increasingly likely to get divergence between the results of your loop and the value calculated using the pow or powf library functions. On my platform, the double version gives the same results, right up to the point where the value overflows the range and becomes Infinity. However, the float version starts to diverge around n = 55:

3^55 = 174449198498104595772866560.000000
3^55 = 174449216944848669482418176.000000
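The effect of rounding to float after every multiplication can be imitated in Python, whose own float type is a double. This is a sketch under assumptions: to_float32 and power_by_loop are invented names, the loop is a plain repeated multiplication rather than the code from the question, and the rounding goes through the struct module.

```python
import struct


def to_float32(value):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def power_by_loop(base, exponent, single_precision):
    result = 1.0
    for _ in range(exponent):
        result *= base
        if single_precision:
            # Round after every step, as storing into a C float variable would.
            result = to_float32(result)
    return result


exact = 3 ** 22
as_double = power_by_loop(3.0, 22, single_precision=False)
as_float = power_by_loop(3.0, 22, single_precision=True)
print(exact, as_double, as_float)
```

A float has a 24-bit significand, so 3^n stops being exactly representable from n = 16 onward, while a double holds every power up to 3^33 exactly.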

Am I going crazy or is Math.Pow broken?

You can use BigInteger.Pow. Or use my power method for long.

Why is Math.pow(int,int) slower than my naive implementation?

As others have said, you cannot just ignore the use of double, as floating point arithmetic will almost certainly be slower. However, this is not the only reason - if you change your implementation to use them, it is still faster.

This is because of two things: the first is that 2^2 (exponent, not xor) is a very quick calculation to perform, so your algorithm is fine to use for that - try using two values from Random#nextInt (or nextDouble) and you'll see that Math#pow is actually much quicker.

The other reason is that calling native methods has overhead, which is actually meaningful here, because 2^2 is so quick to calculate, and you are calling Math#pow so many times. See What makes JNI calls slow? for more on this.

How is pow() calculated in C?

If you're curious how the pow function might be implemented in practice, you can look at the source code. There is a kind of "knack" to searching through unfamiliar (and large) codebases to find the section you are looking for, and it's good to get some practice.

One implementation of the C library is glibc, which has mirrors on GitHub. I didn't find an official mirror, but an unofficial mirror is at https://github.com/lattera/glibc

We first look at the math/w_pow.c file which has a promising name. It contains a function __pow which calls __ieee754_pow, which we can find in sysdeps/ieee754/dbl-64/e_pow.c (remember that not all systems are IEEE-754, so it makes sense that the IEEE-754 math code is in its own directory).

It starts with a few special cases:

if (y == 1.0) return x;
if (y == 2.0) return x*x;
if (y == -1.0) return 1.0/x;
if (y == 0) return 1.0;

A little farther down you find a branch with a comment

/* if x<0 */

Which leads us to

return (k==1)?__ieee754_pow(-x,y):-__ieee754_pow(-x,y); /* if y even or odd */

So you can see, for negative x and integer y, the glibc version of pow will compute pow(-x,y) and then make the result negative if y is odd.
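The sign rule described here is small enough to sketch on its own. The Python function below is an illustration, not a transcription of glibc: pow_with_sign_rule is an invented name, math.pow does the work for the positive base, and finite nonzero inputs are assumed.

```python
import math


def pow_with_sign_rule(x, y):
    """Handle a negative base by computing pow(-x, y) and fixing the sign."""
    y = float(y)
    if x >= 0.0:
        return math.pow(x, y)
    if not y.is_integer():
        return float("nan")              # negative base, non-integer exponent
    magnitude = math.pow(-x, y)
    y_is_odd = math.fmod(y, 2.0) != 0.0
    return -magnitude if y_is_odd else magnitude
```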

This is not the only way to do things, but my guess is that this is common to many implementations. You can see that pow is full of special cases. This is common in library math functions, which are supposed to work correctly with unfriendly inputs like denormals and infinity.

The pow function is especially hard to read because it is heavily-optimized code which does bit-twiddling on floating-point numbers.

The C Standard

The C standard (n1548 §7.12.7.4) has this to say about pow:

A domain error occurs if x is finite and negative and y is finite and not an integer value.

So, according to the C standard, negative x should work as long as y is an integer value.

There is also the matter of appendix F, which gives much tighter constraints on how pow works on IEEE-754 / IEC-60559 systems.

Difference between the built-in pow() and math.pow() for floats, in Python?

Quick Check

From the signatures, we can tell that they are different:

pow(x, y[, z])
math.pow(x, y)

Also, trying it in the shell will give you a quick idea:

>>> pow is math.pow
False

Testing the differences

Another way to understand the differences in behaviour between the two functions is to test for them:

import builtins
import math
import traceback

inf = float("inf")
NaN = float("nan")

vals = [inf, NaN, 0.0, 1.0, 2.2, -1.0, -0.0, -2.2, -inf, 1, 0, 2]

# Collect every ordered pair of test values.
tests = set()

for vala in vals:
  for valb in vals:
    tests.add( (vala, valb) )
    tests.add( (valb, vala) )

for a,b in tests:
  print("math.pow(%f,%f)"%(a,b) )
  try:
    print("    %f "%math.pow(a,b))
  except Exception:
    traceback.print_exc()

  # The builtins module is the portable spelling of __builtins__.
  print("__builtins__.pow(%f,%f)"%(a,b) )
  try:
    print("    %f "%builtins.pow(a,b))
  except Exception:
    traceback.print_exc()

We can then notice some subtle differences. For example:

math.pow(0.000000,-2.200000)
    ValueError: math domain error

__builtins__.pow(0.000000,-2.200000)
    ZeroDivisionError: 0.0 cannot be raised to a negative power

There are other differences, and the test list above is not complete (no long numbers, no complex, etc...), but this will give us a pragmatic list of how the two functions behave differently. I would also recommend extending the above test to check for the type that each function returns. You could probably write something similar that creates a report of the differences between the two functions.
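As a starting point for such a report, the short sketch below records the result types and a few behaviours that differ between the two functions. The variable names are only for illustration.

```python
import math

builtin_result = pow(2, 3)            # int arguments stay exact
math_result = math.pow(2, 3)          # always converted to float
print(type(builtin_result), type(math_result))

big = pow(2, 100)                     # arbitrary-precision integer
modular = pow(3, 4, 5)                # three-argument form: (3 ** 4) % 5
complex_root = pow(-8.0, 1.0 / 3.0)   # built-in pow returns a complex number

try:
    math.pow(10, 400)                 # too large for a double
    overflow = "no error"
except OverflowError:
    overflow = "OverflowError"

try:
    math.pow(-8.0, 1.0 / 3.0)         # math.pow stays in the real domain
    domain = "no error"
except ValueError:
    domain = "ValueError"

print(big, modular, complex_root, overflow, domain)
```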

`math.pow()`

math.pow() handles its arguments very differently from the builtin ** or pow(). This comes at the cost of flexibility. Having a look at the source, we can see that the arguments to math.pow() are cast directly to doubles:

static PyObject *
math_pow(PyObject *self, PyObject *args)
{
    PyObject *ox, *oy;
    double r, x, y;
    int odd_y;

    if (! PyArg_UnpackTuple(args, "pow", 2, 2, &ox, &oy))
        return NULL;
    x = PyFloat_AsDouble(ox);
    y = PyFloat_AsDouble(oy);
/*...*/

The checks are then carried out against the doubles for validity, and then the result is passed to the underlying C math library.
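The conversion to double is visible from Python itself. In the sketch below (variable names are illustrative), an integer that does not fit in a double's 53-bit significand is silently rounded by math.pow, a Fraction is accepted because it can be converted to a float, and a string is rejected.

```python
import math
from fractions import Fraction

big_odd = 2 ** 53 + 1

exact = pow(big_odd, 1)            # stays an int, no precision lost
rounded = math.pow(big_odd, 1)     # big_odd is converted to a double first

half_squared = math.pow(Fraction(1, 2), 2)   # Fraction -> float -> 0.25

try:
    math.pow("2", 3)               # not convertible to a double
    string_arg = "no error"
except TypeError:
    string_arg = "TypeError"

print(exact, rounded, half_squared, string_arg)
```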

builtin `pow()`

The built-in pow() (same as the ** operator), on the other hand, behaves very differently: it uses the object's own implementation of the ** operator, which the end user can override if need be by replacing a number's __pow__(), __rpow__() or __ipow__() method.

For built-in types, it is instructive to study the difference between the power function implemented for two numeric types, for example, floats, long and complex.

Overriding the default behaviour

Emulating numeric types is described here. Essentially, if you are creating a new type for numbers with uncertainty, you will have to provide the __pow__(), __rpow__() and possibly __ipow__() methods for your type. This will allow your numbers to be used with the operator:

class Uncertain:
  def __init__(self, x, delta=0):
    self.delta = delta
    self.x = x
  def __pow__(self, other):
    return Uncertain(
      self.x**other.x, 
      Uncertain._propagate_power(self, other)
    )
  @staticmethod
  def _propagate_power(A, B):
    # First-order propagation for f = A**B:
    # df/dA = B*A**(B-1) and df/dB = A**B * ln(A)
    return math.sqrt(
      ((B.x*(A.x**(B.x-1)))**2)*A.delta*A.delta +
      (((A.x**B.x)*math.log(A.x))**2)*B.delta*B.delta
    )

In order to override math.pow() you will have to monkey patch it to support your new type:

def new_pow(a,b):
    _a = Uncertain(a)
    _b = Uncertain(b)
    return _a ** _b

math.pow = new_pow

Note that for this to work you'll have to wrangle the Uncertain class to cope with an Uncertain instance as an input to __init__()
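One possible way to do that wrangling is sketched below. It is a self-contained variant of the class, not the only design: __init__ copies an existing Uncertain instead of wrapping it, plain numbers are promoted on the fly, and __rpow__ covers the case where the left operand is a plain number. It assumes a positive base whenever the exponent carries an uncertainty, because the derivative with respect to the exponent is A**B * ln(A).

```python
import math


class Uncertain:
    def __init__(self, x, delta=0.0):
        if isinstance(x, Uncertain):       # copy instead of nesting
            x, delta = x.x, x.delta
        self.x = x
        self.delta = delta

    def __pow__(self, other):
        other = Uncertain(other)           # promote plain numbers
        value = self.x ** other.x
        # Contribution of the base's uncertainty: |B * A**(B-1)| * dA
        d_base = other.x * self.x ** (other.x - 1) * self.delta
        # Contribution of the exponent's uncertainty: |A**B * ln(A)| * dB
        d_exp = value * math.log(self.x) * other.delta if other.delta else 0.0
        return Uncertain(value, math.hypot(d_base, d_exp))

    def __rpow__(self, other):
        # Called for plain_number ** Uncertain(...)
        return Uncertain(other) ** self


cubed = Uncertain(2.0, 0.1) ** 3
two_to_the = 2 ** Uncertain(3.0, 0.1)
print(cubed.x, cubed.delta, two_to_the.x, two_to_the.delta)
```

With this in place, the new_pow wrapper above works whether its arguments are plain numbers or Uncertain instances.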
