How to Implement an Efficient Infinite Generator of Prime Numbers in Python

How to make an infinite generator in python to all prime numbers?

You don't augment count after finding a prime, so you're always returning the same value (3).

Your prime generator is going to take longer and longer between each prime as you move forward.

Here's an infinite prime generator that is more efficient. It is inspired from the sieve of Eratosthenes but uses a dictionary to only propagate multiples as it reaches a non-prime number and moves the prime multiple to the next multiple that hasn't been flagged as non-prime yet:

def genPrimes():
yield 2 # get the first prime out of the way
skips = dict() # multiples to skip {Multiple:2xPrime}
multiples = ((p*p,2*p) for p in genPrimes()) # multiples of primes
skipMark,_ = next(multiples) # skipping coverage
N = 1 # prime candidate (odd numbers)
while True:
N += 2 # next prime candidate
if N >= skipMark: # extend skips coverage
skipMark,stride = next(multiples) # 1st multiple and stride
skips[skipMark] = stride
if N in skips: # not a prime (multiple of a prime)
stride = skips.pop(N) # get prime multiple steps
multiple = N + stride # advance skip to next multiple
while multiple in skips:
multiple += stride # not already skipped
skips[multiple] = stride
else: # N is prime
yield N # return it

oputut:

for p in genPrimes(): print(p)
2
3
5
7
11
13
17
19
23
29
31
37
41
43
47
53
59
...

The skips dictionary contains roughly one entry per √P (where P is the number of primes found so far) and doesn't require pre-allocating memory. This approach trades space for a gain in time.

Generating infinite prime number squence using generators

Stepping through one step at a time:

while True:
n = next(N)

n is 2.

    yield n
N = (i for i in N if i%n != 0)

This wraps N in a generator which removes values that are multiples of n. Note that we said multiples of n, not multiples of 2.

On the next loop, we grab the next element out of naturals(), getting 3, modulus it against n, which is 2, and get 1, which is not zero. So we assign 3 to n and yield it. We then wrap the previous N in another generator which does the same thing as the previous wrapper did, which slows it down but has no other effect.

Then, on the next loop, we grab the next element out of naturals(), getting 4, modulus it against n, which is 3, and get 1, which is not zero. Then we do the modulus again and get the same result. So we assign 4 to n and yield it...

Simple prime number generator in Python

There are some problems:

  • Why do you print out count when it didn't divide by x? It doesn't mean it's prime, it means only that this particular x doesn't divide it
  • continue moves to the next loop iteration - but you really want to stop it using break

Here's your code with a few fixes, it prints out only primes:

import math

def main():
count = 3
 
while True:
isprime = True
 
for x in range(2, int(math.sqrt(count) + 1)):
if count % x == 0:
isprime = False
break
 
if isprime:
print count
 
count += 1

For much more efficient prime generation, see the Sieve of Eratosthenes, as others have suggested. Here's a nice, optimized implementation with many comments:

# Sieve of Eratosthenes
# Code by David Eppstein, UC Irvine, 28 Feb 2002
# http://code.activestate.com/recipes/117119/

def gen_primes():
""" Generate an infinite sequence of prime numbers.
"""
# Maps composites to primes witnessing their compositeness.
# This is memory efficient, as the sieve is not "run forward"
# indefinitely, but only as long as required by the current
# number being tested.
#
D = {}
 
# The running integer that's checked for primeness
q = 2
 
while True:
if q not in D:
# q is a new prime.
# Yield it and mark its first multiple that isn't
# already marked in previous iterations
#
yield q
D[q * q] = [q]
else:
# q is composite. D[q] is the list of primes that
# divide it. Since we've reached q, we no longer
# need it in the map, but we'll mark the next
# multiples of its witnesses to prepare for larger
# numbers
#
for p in D[q]:
D.setdefault(p + q, []).append(p)
del D[q]
 
q += 1

Note that it returns a generator.

Python generator and set(generator) get different results

By switching to working by segments between squares of consecutive primes, creating these sets for each segment one after another.

For each segment you'll have to calculate the starting point of the enumeration of a prime's multiples, for each known prime which is not greater than the segment's top value (i.e. the next "core" prime's square).

The "core" primes, to get the squares of, you can get separately, independently, by a recursive application of the same algorithm.

An example of this approach (the separate primes supply that is) is How to implement an efficient infinite generator of prime numbers in Python?

To make it parallel, you'll need to find means to use the set in a shared fashion between all the enumerations, which each will set each of its enumerated multiples off in the same shared set. Order of operations is not important, as long as they are all finished. The access need not be guarded, as setting the same location off twice (or more) is perfectly fine.

This will also be very efficient.

prime number generator in python: accumulation of numbers

This should work faster:

while 1:
i=input('Enter the number for which all previous shall be tested for primality: ')
n=5
a=[2,3]
while n<=i:
n=n+1
isPrime = True
for b in a:
if n%b==0:
isPrime = False
break
if isPrime:
a.append(n)
print a

But I do not think you can get much faster that O(See comment of sebastian) except if you are using more advanced algorithms and numbers very huge 10**100.

It will always get slower with bigger numbers.

Generator function for prime numbers

You need to reset prime to True as the first statement inside the while block, not before it. As it is, once you hit a single composite number, prime will never be true again, so you'll never yield any more numbers.

Can I optimize my prime number summing function with generators?

Your focus on making a generator function is an example of the XY problem. You've decided that the solution to your code's performance problems is to use a generator, but that's not actually correct. When you get non-generator-related answers, you think they're not helpful, and the rest of us are just a bit confused about why generators are relevant in the first place.

Lets examine why you're having performance issues. The main problem is that your code takes O(n) time to determine if each number n is prime. You have to do this for each numbers from two up to whatever your limit is. This means the whole algorithm takes O(N**2) time where N is the largest number to check (e.g. two million). For a large N your code will take a very long time.

Using a generator for your primes won't, by itself, improve that. It will still take just as long to figure out if each candidate value is prime, and you still need to check all the same numbers if you stick with your current algorithm. At best it would be as good as adding the prime numbers immediately to a running sum, rather than putting them into a list and summing at the end. That is, it could save you memory, but not time.

The real way to greatly improve your performance is to use a smarter algorithm that does less work. There are a bunch of good ways to find primes in less time. Some can be implemented as generators, but in this situation, computing all the primes at once and using extra memory in exchange for better performance is probably a reasonable trade-off.

That's because your computer can hold billions of integers in memory at one time. Numbers less than a few billion use about 28 byes in Python, so two million of them takes around 56 MB, plus about 18 MB more for the list data structure. So you can do a memory intensive algorithm without needing to worry about your memory usage.

Here's a very fast implementation of the Sieve of Eratosthenes algorithm for computing all of the primes less than N in pure Python. The implementation was originally by Robert Williams Hanks in this answer, but this version was tweaked a bit by Bruno Astrolino to work a little more efficiently in Python 3.6+ in this answer.

from itertools import compress

def rwh_primes1v1(n):
""" Returns a list of primes < n for n > 2 """
sieve = bytearray([True]) * (n//2)
for i in range(3,int(n**0.5)+1,2):
if sieve[i//2]:
sieve[i*i//2::i] = bytearray((n-i*i-1)//(2*i)+1)
return [2,*compress(range(3,n,2), sieve[1:])]

You would want to run sum(rwh_primes1v1(2_000_000)). On my computer that takes about 30 milliseconds, compared to your code which takes 30 seconds (1000 times longer) for N=100_000 (a bound twenty times less). I wasn't willing to wait for the three hours or so the inefficient algorithm would need for N=2_000_000.

Note that if you really do want a generator that yields the primes for some other reason, there are some good implementations of infinite prime generators in the answers to this question. It's unlikely that using any of them for your summing problem is going to result in faster code than what I provided above until you get to such a large N that you can't fit the whole sieve in memory at once (and only some of the generators will help with that, some have significant memory overheads themselves).



Related Topics



Leave a reply



Submit