The point of the sieve of Eratosthenes algorithm is that you never need to do a division test: you eliminate values which cannot possibly be prime without ever testing whether or not they are prime.
If you are doing this:

“Then for each potential prime n I test it against all the previous primes up to sqrt(n).”
then you’re not using the sieve of Eratosthenes; you are using some other algorithm (likely a non-naive trial division, or possibly a wheel algorithm).
By the way, once you get past the first two primes 2, 3, you only need to test the number just before, and just after, each multiple of 6. That allows you to skip testing 2 out of 3 potential candidate primes instead of just 1 out of 2 (the even numbers).
(This is not the sieve of Eratosthenes either.)
def primes():
    # ``isprime`` is assumed to be a primality test defined elsewhere.
    yield 2
    yield 3
    i = 6
    while True:  # Run forever.
        a = i - 1
        b = i + 1
        if isprime(a): yield a
        if isprime(b): yield b
        i += 6
There are more complex wheel arrangements that can skip even more.
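For example, a 2-3-5 wheel keeps only the candidates coprime to 30, skipping 22 out of every 30 integers. Here is a rough sketch of the idea (the name wheel30_candidates is my own, and, just like the 6k±1 trick above, every candidate it yields still needs a separate primality test):

import itertools

def wheel30_candidates():
    # Yield 2, 3, 5, then every number greater than 5 that is coprime to 30.
    yield from (2, 3, 5)
    residues = (7, 11, 13, 17, 19, 23, 29, 31)
    for base in itertools.count(0, 30):
        for r in residues:
            yield base + r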
I recommend you read the Wikipedia article you linked to and look carefully at the animation. In my opinion, the animation could be improved, but it gives an idea of the process.
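For comparison, a minimal fixed-limit version of the real sieve might look something like this (the function name and the limit parameter are my own choices, and it assumes limit >= 2). Notice that it never divides; it only crosses off multiples:

def eratosthenes(limit):
    # Classic sieve: mark every multiple of each prime as composite.
    is_candidate = [True] * (limit + 1)
    is_candidate[0] = is_candidate[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if is_candidate[p]:
            # Cross off p*p, p*p + p, p*p + 2*p, ... with no division.
            for multiple in range(p * p, limit + 1, p):
                is_candidate[multiple] = False
    return [n for n, flag in enumerate(is_candidate) if flag]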
There is a version of Euler’s sieve popular in Haskell circles, where it is often wrongly described as the sieve of Eratosthenes.
primes = sieve [2..]
sieve (p : xs) = p : sieve [x | x <- xs, x `mod` p > 0]
That is delightfully concise, but hugely inefficient. Asymptotic behaviour is worse than trial division.
A Python version might be:
import itertools

def sieve():
    # Very slow, very inefficient version of Euler's sieve.
    nums = itertools.count(2)
    while True:
        prime = next(nums)
        yield prime
        # Lazily filter out all multiples of the prime just found.
        nums = filter(lambda v, p=prime: (v % p) != 0, nums)
I don’t know if it is possible to write an efficient Euler’s sieve version.
In my own limited testing, the fastest method I found to produce smallish primes was this version of the Croft Spiral.
import itertools

def croft():
    # Implementation is based on erat3 from here:
    # http://stackoverflow.com/q/2211990
    # and this website:
    # http://www.primesdemystified.com/
    # Memory usage increases roughly linearly with the number of primes seen.
    # dict ``roots`` stores an entry p**2:p for every prime p.
    for p in (2, 3, 5):
        yield p
    roots = {9: 3, 25: 5}  # Map d**2 -> d.
    primeroots = frozenset((1, 7, 11, 13, 17, 19, 23, 29))
    selectors = (1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0)
    for q in itertools.compress(
            # Iterate over prime candidates 7, 9, 11, 13, ...
            itertools.islice(itertools.count(7), 0, None, 2),
            # Mask out those that can't possibly be prime.
            itertools.cycle(selectors)
            ):
        # Using dict membership testing instead of pop gives a
        # 5-10% speedup over the first three million primes.
        if q in roots:
            p = roots[q]
            del roots[q]
            x = q + 2*p
            while x in roots or (x % 30) not in primeroots:
                x += 2*p
            roots[x] = p
        else:
            roots[q*q] = q
            yield q
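For example (assuming croft is defined as above and itertools is imported), you can slice off however many primes you need:

>>> list(itertools.islice(croft(), 10))
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]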
At some point, once you hit truly enormous candidates, it becomes impractical to store all the previously seen primes in order to perform trial division. At that point I guess the only practical method is to swap to a probabilistic method such as Miller-Rabin.
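A rough sketch of what that swap might look like (the function name, the particular list of small primes, and the default of 20 random rounds are my own choices here, not anything prescribed):

import random

def miller_rabin(n, rounds=20):
    # Probabilistic Miller-Rabin test: False means definitely composite,
    # True means "probably prime" (error rate at most 4**-rounds).
    if n < 2:
        return False
    # Trial division by a few small primes weeds out obvious composites.
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n == p:
            return True
        if n % p == 0:
            return False
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a witnesses that n is composite.
    return True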
If Tim Peters were around, he would probably argue in favour of just using Miller-Rabin (augmented by a couple of trial division tests on the small primes to weed out obvious composites) for everything. But where’s the fun in that?