Prime factorization algorithm

This is a code to find the prime factorization of consecutive numbers between 1 and n. It’s part of a larger program I’m working on. Since to find prime factorizations I only need to test for primes, for each number, starting from 1, in this program I check to see if the number is divisible by any of the primes in the primes list. If not, then it is added to the primes list. This way each larger number builds off previous primes found. Is there a way to make this algorithm faster?

primes = []

def factorize(n):
    global primes; global primitivesp_nos; x = []
    for i in primes:
        while n % i == 0: x.append(i); n = n//i
        if i * i > n: break
    if n > 1: x.append(n)
    if len(x) == 1: primes.append(x[0])
    return x  
print([factorize(i) for i in range(1, 10)]) 
#Output: [[], [2], [3], [2, 2], [5], [2, 3], [7], [2, 2, 2], [3, 3]]
  1. Use a prime sieve to generate candidate factors.
  2. Implement the algorithm in another language like C, and call it from Python via ctypes. Python is great at a lot of things, but number crunching isn’t where it excels.
1 Like

You’ll know about the Sieve of Eratosthenes? Each time you find a prime you eliminate all its multiples from further consideration, so they cost you almost nothing.

That only finds primes, but you want to list the factors, which is a nice twist. So suppose you start with:

a = [[i] for i in range(n)] 

Special treatment for a[0] and a[1], but then then for each k > 1 where len(a[k])==1, for all i a multiple of k where len(a[i])==1 you replace a[i] with [k, a[i//k]]. I think this means that by the time you have processed k=2, your list looks something like:

[[], [1], [2], [3], [2, [2]], [5], [2, [3]], [7], [2, [2, [2]]], [9], [2,[5]], ... [2,[9]], ... ]

Now why have I inserted a[i] and not flattened it? It is so that it occurs as a reference back to the earlier entry. The entry for a[8] took no more work than for a[4], and a[18] is still [2,[9]], but will magically become [2,[3,[3]]] as soon as we process a[3].

I have not tried this.

I wonder if there is a way to optimise out the len()==1 tests?

@abessman is right that Python is not made for flat-out speed, but your first step is always a good algorithm, and Python is great for exploring.

Ok, this is not quite right if you just do an assignment. You have to replace the contents of the list object a[i] without changing its identity or the references are still to the old value. a[i][:] = [k, a[i//k]] (I think).