Python // vs. divmod?

I have these functions that are intended to find all the prime divisors of every number from 2 to n:

from time import time 

start = time()

primes = [] 

def primedivs1(n): 
    global primes
    primedivs = []
    a = n
    for i in primes:
        while not a % i: 
            primedivs.append(i)
            a = a//i 
        if i * i > a:
            break
    if a > 1:
        primedivs.append(a)
        if a == n:
            primes.append(n)  
    return primedivs

print([primedivs1(i) for i in range(2, 10000)]) # test

end = time()

and

from time import time 

start = time()

primes = [] 

def primedivs2(n): 
    global primes
    primedivs = []
    a = n
    for i in primes:
        q, r = divmod(a, i) 
        while r == 0: 
            primedivs.append(i)
            a = q
            q, r = divmod(a, i)
        if i * i > a:
            break
    if a > 1:
        primedivs.append(a)
        if a == n:
            primes.append(n)  
    return primedivs

print([primedivs2(i) for i in range(2, 10000)]) # test

end = time()
      
print("Execution time 1: ", round(end - start, 2), "s")

In my original function, primedivs1, for each n, the function has to perform division twice with a % i (I presume that to find the remainder, you have to find the quotient) and a = a//i. Thus, I made a small edit to primedivs1 and got primedivs2, which uses divmod and therefore only needs to do one division since the quotient is saved with divmod.
Why is primedivs2 actually slower than primedivs1? I thought that since division is computationally expensive, it should be the other way around.

Also, is there some alternative to divmod that does the same thing but is faster?

Is it? I can’t find your times…

On average, it is slower.For example:|Column 1 | Column 2 | Column 3 | Column 4|
|— | — | — | —|
|Trial | primedivs1 | primedivs2 | |
|1 | 1.03 | 1.07 | |
|2 | 1.03 | 1.05 | |
|3 | 1.04 | 1.02 | |
|4 | 1.03 | 1.02 | |
|5 | 1.03 | 1.06 | |

Not by much, though.

Ok, and what makes you think that division is computationally expensive? If someone told you that, were they talking about Python? And about such small numbers?

From this website: Make your programs run faster: avoid expensive instructions - Johnny's Software Lab.

Mostly compared to multiplication/addition.

I get times like these (with x=1000):

26.6 ns  x * 3
38.0 ns  x // 3
27.9 ns  x % 3
28.1 ns  x + 3
83.5 ns  divmod(x, 3)

So division indeed seems a bit slower, but modulo doesn’t.

We’re talking about Python here. Where the actual CPU division is only a minor part of the whole operation.

And that page you linked also says “avoid function calls”. So… divmod?

benchmark
from timeit import repeat

for _ in range(3):
    for s in 'x * 3', 'x // 3', 'x % 3', 'x + 3', 'divmod(x, 3)':
        print(round(min(repeat((s+';') * 100, 'x = 1000', number=10000)) * 1000, 1), 'ns ', s)
    print()

Attempt This Online!

I guess I missed that. If not division, which operation is likely to take the most time in my program?

The printing of the large list.

What times do you get if you only compute the list but don’t print it?

For primedivs1: 0.02 s
For primedivs2: 0.03 s

When I change the range from [2, 10000] to something large like [2, 1000000], primedivs2 is significantly slower than primedivs1.

Is it? I can’t find your times…

Without divmod, it was around 4.5 seconds, but with divmod, it was over 7.

There are a number of differences between your programs you’re probably not yet aware of.

In the first you test for 0 via

 while not a % i:

but in the second

 while r == 0: 

Not the same! The second explicitly loads a 0 and asks for an equality comparison. The first way has fewer explicit operations, which means fewer trips around the “eval loop”, which in turn generally means “faster”.

The first’s

  while not a % i:

also saves actual, real work over the second’s

    q, r = divmod(a, i) 
    while r == 0: 

Why? Because the most common case is that the remainder isn’t 0, so the quotient isn’t needed. The first program doesn’t even compute the quotient then, but the second does anyway.

Something I don’t think has been mentioned yet is that divmod creates. and populates, a 2-tuple object with its result, and then q, r = ... unpacks that object and garbage collection recycles the now-useless 2-tuple object. All that takes longer than doing the arithmetic.

At a higher level, repeatedly testing for i * i > a does needless work. Faster to import isqrt from math, & start like so instead:

    limit = isqrt(a)
    for  i in primes:
        if i > limit:
            break

No multiplies in the loop then. But the limit should be updated too if a factor is found, so the body should be more like

    if not a % i:
        primedivs.append(i)
        a //= i
        while not a % i:
            primedivs.append(i)
            a //= i
        limit = isqrt(a) # only updated if `a` changed

Finally, simple arithmetic on floats and small ints runs very much faster under the PyPy implementation of Python, but that’s a whole different topic.

Have fun!