What Would Be the Fastest Method to Test for Primality in Java

What would be the fastest method to test for primality in Java?

Here's another way:

boolean isPrime(long n) {
    if (n < 2) return false;
    if (n == 2 || n == 3) return true;
    if (n % 2 == 0 || n % 3 == 0) return false;
    long sqrtN = (long) Math.sqrt(n) + 1;
    // every prime > 3 has the form 6k - 1 or 6k + 1, so only those divisors need testing
    for (long i = 6L; i <= sqrtN; i += 6) {
        if (n % (i - 1) == 0 || n % (i + 1) == 0) return false;
    }
    return true;
}

and BigInteger's isProbablePrime(...) is valid for all 32-bit ints.
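
For instance, a minimal sketch of that approach (the helper name and the guard for n < 2 are my own; see the tests below for a certainty value that held up):

static boolean isPrimeViaBigInteger(int n, int certainty) {
    if (n < 2) return false; // 0, 1 and negative numbers are not prime
    return java.math.BigInteger.valueOf(n).isProbablePrime(certainty);
}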

EDIT

Note that isProbablePrime(certainty) does not always produce the correct answer. When the certainty is on the low side, it produces false positives, as @dimo414 mentioned in the comments.

Unfortunately, I could not find the source that claimed isProbablePrime(certainty) is valid for all (32-bit) ints (given enough certainty!).

So I performed a couple of tests. I created a BitSet of size Integer.MAX_VALUE/2 representing all odd numbers and used a prime sieve to find all primes in the range 1..Integer.MAX_VALUE. I then looped from i=1..Integer.MAX_VALUE to test that every new BigInteger(String.valueOf(i)).isProbablePrime(certainty) == isPrime(i).

For certainties 5 and 10, isProbablePrime(...) produced false positives along the way. But with isProbablePrime(15), no test failed.

Here's my test rig:

import java.math.BigInteger;
import java.util.BitSet;

public class Main {

    static BitSet primes;

    static boolean isPrime(int p) {
        // bit i of the BitSet represents the odd number 2*i + 1
        return p > 0 && (p == 2 || (p % 2 != 0 && primes.get(p / 2)));
    }

    static void generatePrimesUpTo(int n) {
        primes = new BitSet(n / 2);

        for (int i = 0; i < primes.size(); i++) {
            primes.set(i, true);
        }

        primes.set(0, false); // 1 is not a prime
        int stop = (int) Math.sqrt(n) + 1;
        int percentageDone = 0, previousPercentageDone = 0;
        System.out.println("generating primes...");
        long start = System.currentTimeMillis();

        for (int i = 0; i <= stop; i++) {
            previousPercentageDone = percentageDone;
            percentageDone = (int) ((i + 1.0) / (stop / 100.0));

            if (percentageDone <= 100 && percentageDone != previousPercentageDone) {
                System.out.println(percentageDone + "%");
            }

            if (primes.get(i)) {
                int number = (i * 2) + 1;

                // cross out all odd multiples of this prime
                for (int p = number * 2; p < n; p += number) {
                    if (p < 0) break; // overflow
                    if (p % 2 == 0) continue;
                    primes.set(p / 2, false);
                }
            }
        }
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("finished generating primes ~" + (elapsed / 1000) + " seconds");
    }

    private static void test(final int certainty, final int n) {
        int percentageDone = 0, previousPercentageDone = 0;
        long start = System.currentTimeMillis();
        System.out.println("testing isProbablePrime(" + certainty + ") from 1 to " + n);
        for (int i = 1; i < n; i++) {
            previousPercentageDone = percentageDone;
            percentageDone = (int) ((i + 1.0) / (n / 100.0));
            if (percentageDone <= 100 && percentageDone != previousPercentageDone) {
                System.out.println(percentageDone + "%");
            }
            BigInteger bigInt = new BigInteger(String.valueOf(i));
            boolean bigIntSays = bigInt.isProbablePrime(certainty);
            if (isPrime(i) != bigIntSays) {
                System.out.println("ERROR: isProbablePrime(" + certainty + ") returns "
                        + bigIntSays + " for i=" + i + " while it " + (isPrime(i) ? "is" : "isn't") +
                        " a prime");
                return;
            }
        }
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("finished testing in ~" + ((elapsed / 1000) / 60) +
                " minutes, no false positive or false negative found for isProbablePrime(" + certainty + ")");
    }

    public static void main(String[] args) {
        int certainty = Integer.parseInt(args[0]);
        int n = Integer.MAX_VALUE;
        generatePrimesUpTo(n);
        test(certainty, n);
    }
}

which I ran by doing:

java -Xmx1024m -cp . Main 15

Generating the primes took ~30 seconds on my machine, and the actual test of all i in 1..Integer.MAX_VALUE took around 2 hours and 15 minutes.

Algorithm for finding a prime with the least amount of computations

I would use the Miller-Rabin test, which can easily be made deterministic for numbers smaller than 341,550,071,728,321 (and 2^31 is much smaller than that).

Pseudocode: there are a number of different cases (a Java sketch of the last case follows the list).

  1. x smaller than 9: return x > 1 && ((x & 1) != 0 || x == 2)
  2. x smaller than about 200 (tweakable): use trial division (what you used)
  3. x smaller than 1373653: use Miller-Rabin with bases 2 and 3.
  4. x smaller than 4759123141 (that is, everything else): use Miller-Rabin with bases 2, 7 and 61.
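
Since the question is about Java, here is a minimal sketch of the last case, relying on the claim above that bases 2, 7 and 61 suffice below 4,759,123,141 (which covers every positive int); the method names are my own:

// Deterministic Miller-Rabin for 32-bit signed ints, assuming the
// witness set {2, 7, 61} from the list above.
static boolean isPrimeMillerRabin(int n) {
    if (n < 2) return false;
    for (int p : new int[] {2, 3, 5, 7, 61}) { // tiny primes and the witnesses themselves
        if (n == p) return true;
        if (n % p == 0) return false;
    }
    // write n - 1 as d * 2^s with d odd
    int d = n - 1, s = 0;
    while ((d & 1) == 0) { d >>= 1; s++; }
    for (int a : new int[] {2, 7, 61}) {
        long x = modPow(a, d, n);
        if (x == 1 || x == n - 1) continue; // this base says "probably prime"
        boolean composite = true;
        for (int r = 1; r < s; r++) {
            x = x * x % n;
            if (x == n - 1) { composite = false; break; }
        }
        if (composite) return false; // a witness of compositeness was found
    }
    return true;
}

// Square-and-multiply modular exponentiation; all values stay below 2^31,
// so the intermediate products fit comfortably in a long.
static long modPow(long base, long exp, long mod) {
    long result = 1;
    base %= mod;
    while (exp > 0) {
        if ((exp & 1) == 1) result = result * base % mod;
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}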

Fastest primality test

The only deterministic, polynomial-time algorithm for primality testing I know of is the AKS primality test (http://en.wikipedia.org/wiki/AKS_primality_test). However, there are a lot of very good randomized primality tests that are fast and have an extremely good probability of success. They usually work by detecting compositeness with exponentially good probability, so they either report that the number is definitely composite or tell you it is probably prime with very high confidence.
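
As a concrete illustration of that contract (the example numbers are my own), BigInteger.isProbablePrime returns false only when the number is definitely composite, and true when the number is prime with probability at least 1 - 2^-certainty:

import java.math.BigInteger;

public class ProbablePrimeDemo {
    public static void main(String[] args) {
        BigInteger prime = BigInteger.valueOf(2147483647L);     // 2^31 - 1, a Mersenne prime
        BigInteger composite = BigInteger.valueOf(2147483649L); // 2^31 + 1 = 3 * 715827883
        System.out.println(prime.isProbablePrime(20));          // expected: true
        System.out.println(composite.isProbablePrime(20));      // expected: false
    }
}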

Fastest prime test for small-ish numbers

Yes, you'll find with most algorithms that you can trade space for time. In other words, by allowing the use of more memory, you can greatly increase speed *a.

I don't actually know the Miller-Rabin algorithm but, unless it's simpler than a single shift-left/add and memory extraction, it will be blown out of the water by a pre-calculated sieve.

The important thing here is pre-calculated. It's a good idea, in terms of performance, to pre-calculate things like this, since the first million primes are unlikely to change in the near future :-)

In other words, create your sieve with something like:

unsigned char primeTbl[] = {0,0,1,1,0,1,0,1,0,0,0,1};
#define isPrime(x) ((x) < sizeof(primeTbl) ? primeTbl[x] : isPrimeFn(x))

with all the usual caveats about not passing things like a++ into macros. This gives you the best of both worlds, a blindingly fast table lookup for "small-ish" primes, dropping back to a calculation method for those outside the range.

Obviously you would write a program using one of the other methods to generate that lookup table - you don't really want to have to type it all in by hand.
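
Since the question is about Java, the same idea might look roughly like this there (a sketch with my own names; the table is built once at class-load time instead of being pasted into the source):

import java.util.BitSet;

// Precomputed sieve for "small-ish" numbers, falling back to trial division
// (or any other method) outside the table's range.
public class PrimeLookup {
    private static final int LIMIT = 4_000_000;
    private static final BitSet TABLE = new BitSet(LIMIT);

    static {
        TABLE.set(2, LIMIT); // mark 2..LIMIT-1 as candidates
        for (int i = 2; (long) i * i < LIMIT; i++) {
            if (TABLE.get(i)) {
                for (int j = i * i; j < LIMIT; j += i) {
                    TABLE.clear(j); // cross out multiples
                }
            }
        }
    }

    public static boolean isPrime(long n) {
        if (n < LIMIT) return n >= 0 && TABLE.get((int) n); // fast table lookup
        return isPrimeByTrialDivision(n);                   // fallback for larger inputs
    }

    private static boolean isPrimeByTrialDivision(long n) {
        if (n % 2 == 0) return false; // n >= LIMIT here, so n is never 2
        for (long d = 3; d * d <= n; d += 2) {
            if (n % d == 0) return false;
        }
        return true;
    }
}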

But, as with all optimisation questions, measure, don't guess!


*a A classic case of this was some trig functions I once had to write for an embedded system. This was a competitive contract bid and the system had a little more storage than CPU grunt.

We actually won the contract since our benchmark figures for the functions blew the competition away.

Why? Because we pre-calculated the values into a lookup table originally calculated on another machine. By judicious use of reduction (bringing the input values down below 90 degrees) and trig properties (the fact that cosine is just a phase shift of sine and that the other three quadrants are related to the first), we got the lookup table down to 180 entries (one per half degree).

The best solutions are those that are elegant and devious :-)


For what it's worth, the following C code will generate such a table for you, covering all the primes below four million (about 283,000 of them).

#include <stdio.h>

static unsigned char primeTbl[4000000];

int main (void) {
    int i, j;

    /* classic sieve: mark everything prime, then knock out 0, 1 and all multiples */
    for (i = 0; i < sizeof(primeTbl); i++)
        primeTbl[i] = 1;

    primeTbl[0] = 0;
    primeTbl[1] = 0;
    for (i = 2; i < sizeof(primeTbl); i++)
        if (primeTbl[i])
            for (j = i + i; j < sizeof(primeTbl); j += i)
                primeTbl[j] = 0;

    /* emit the table and the lookup macro as C source */
    printf ("static unsigned char primeTbl[] = {");
    for (i = 0; i < sizeof(primeTbl); i++) {
        if ((i % 50) == 0) {
            printf ("\n ");
        }
        printf ("%d,", primeTbl[i]);
    }
    printf ("\n};\n");
    printf ("#define isPrime(x) "
        "((x) < sizeof(primeTbl) ? primeTbl[x] : isPrimeFn(x))\n");

    return 0;
}

If you can bump up the primeTbl table to sixteen million entries (16M), you'll find that's enough to keep the prime count above a million (the first 1,031,130 primes).

Now there are ways to make that take less storage such as only storing odd numbers and adjusting the macro to take care of that, or using a bit mask instead of unsigned characters. I prefer simplicity of algorithms myself if the memory is available.
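
For instance, the odd-numbers-only trick in Java terms (a sketch of my own, essentially the same layout the BitSet test rig above uses, where bit i stands for the odd number 2*i + 1):

import java.util.BitSet;

// Half-size table: only odd numbers are stored; the evens are handled in code.
static BitSet oddSieve(int limit) {
    BitSet odds = new BitSet(limit / 2);  // bit i represents 2*i + 1
    odds.set(1, limit / 2);               // 3, 5, 7, ... start as candidates; bit 0 (the number 1) stays clear
    for (int i = 1; (long) (2 * i + 1) * (2 * i + 1) < limit; i++) {
        if (odds.get(i)) {
            int p = 2 * i + 1;
            for (long j = (long) p * p; j < limit; j += 2L * p) {
                odds.clear((int) (j / 2)); // cross out the odd multiples p*p, p*(p+2), ...
            }
        }
    }
    return odds;
}

static boolean isPrime(BitSet odds, int n) {
    if (n < 2) return false;
    if (n == 2) return true;
    return (n & 1) == 1 && odds.get(n / 2);
}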

Primality test. Is this program performing faster than all others?

Your code is doing a so-called trial division primality test, which is slow for big numbers: it takes O(sqrt(N)) primitive operations.

If you have really big numbers, 1024 bits or more, trial division will be way too slow.

There are two kinds of primality tests: deterministic and probabilistic. Trial division is one example of a deterministic algorithm. Deterministic algorithms are generally much slower than probabilistic ones.

Probabilistic algorithms are never completely certain that a number is really prime, but for any arbitrarily small chosen error probability P they can tell you, in very little computational time, that a number is prime with confidence at least 1 - P. Deterministic algorithms are always certain whether a number is prime or not.

There exist deterministic primality tests faster than yours, for example the Elliptic Curve Primality Test, which is fast but difficult to implement. You can also easily test any number with fast software like Primo, which is free but closed-source and available for Linux only (there is no Windows version at all).

I decided to implement one probabilistic primality test from scratch for you: the Fermat primality test. It is very fast and easy to implement in a few lines of code, which I did in the C++ code below. Each trial costs O(log2(N)) modular multiplications, which is blazingly fast even for 20000-bit numbers.

#include <cstdint>
#include <iostream>

using Word = uint32_t;
using DWord = uint64_t;

Word PowMod(Word a, Word b, Word const & c) {
    // https://en.wikipedia.org/wiki/Modular_exponentiation
    Word r = 1;
    while (b != 0) {
        if (b & 1)
            r = (DWord(r) * a) % c;
        a = (DWord(a) * a) % c;
        b >>= 1;
    }
    return r;
}

Word Rand(Word const & prev, Word const & begin, Word const & end) {
    Word constexpr magic_prime = uint32_t(-1) - 4;
    return Word((DWord(prev) * magic_prime) % (end - begin)) + begin;
}

bool IsFermatProbablePrime(Word const & n, int trials = 32) {
    // https://en.wikipedia.org/wiki/Fermat_primality_test
    if (n <= 16)
        return n == 2 || n == 3 || n == 5 || n == 7 || n == 11 || n == 13;
    Word witness = 2;
    for (int i = 0; i < trials; ++i) {
        witness = Rand(witness, 2, n - 2);
        if (PowMod(witness, n - 1, n) != 1)
            return false;
    }
    return true;
}

int main() {
    Word const prime_num = 3941362591U, comp_num = 2245837171U;
    std::cout
        << std::boolalpha
        << "num1 is prime: " << IsFermatProbablePrime(prime_num) << ", "
        << "num2 is prime: " << IsFermatProbablePrime(comp_num) << std::endl;
}

Output:

num1 is prime: true, num2 is prime: false

(and indeed the first number, prime_num, is prime, and the second, comp_num, is composite)

You can see that I used two type aliases, Word and DWord, like this:

using Word = uint32_t;
using DWord = uint64_t;

These aliases only allow you to check the primality of 32-bit numbers; in your case, to check big numbers, just replace them with:

using Word = BigInt;
using DWord = BigInt;

to use the BigInt class that you already use in your code.
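
If a BigInt class is not at hand, a rough Java counterpart of the same test using the JDK's BigInteger looks like this (my own translation: modPow plays the role of PowMod, and a java.util.Random witness replaces the hand-rolled Rand):

import java.math.BigInteger;
import java.util.Random;

public class FermatTest {

    // Fermat probable-prime test: "false" means a witness of compositeness was
    // found; "true" can still be fooled (e.g. by Carmichael numbers), just like
    // the C++ version above.
    static boolean isFermatProbablePrime(BigInteger n, int trials) {
        BigInteger two = BigInteger.valueOf(2);
        if (n.compareTo(two) < 0) return false;
        if (n.compareTo(BigInteger.valueOf(4)) < 0) return true; // 2 and 3
        if (!n.testBit(0)) return false;                         // even and greater than 2
        BigInteger nMinusOne = n.subtract(BigInteger.ONE);
        Random rng = new Random();
        for (int i = 0; i < trials; i++) {
            BigInteger a;
            do { // pick a witness in [2, n - 2]
                a = new BigInteger(n.bitLength(), rng);
            } while (a.compareTo(two) < 0 || a.compareTo(nMinusOne) >= 0);
            if (!a.modPow(nMinusOne, n).equals(BigInteger.ONE))
                return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isFermatProbablePrime(new BigInteger("3941362591"), 32)); // expected: true
        System.out.println(isFermatProbablePrime(new BigInteger("2245837171"), 32)); // expected: false
    }
}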

How does this primality test make the code 5000x faster?

Duh. The function isInIntSet was taking an intSet argument by value, so the entire set was being copied on every call. I meant to pass it by reference (intSet &set). That takes the search time down to 0.000003 seconds with an unordered_set.


