Unexpectedly large execution time

Hi,
I have a very simple piece of Python code, but I have run into a very strange issue that has baffled me. Can you find any explanation for why a single statement, inside a very large iteration (2000 * 500 * 750 * 7), would suddenly take a very long time to execute, up to an hour? (Note that the statement is a simple integer addition: m = m + 1.)

Some further inputs:
1- Note that I am using Python version 3.10.11
2- I have tested the code on two different computers but with the same results. Both computers were up to date and with plenty of resources.
3- No other processes were running on the computer.
4- Multiple runs show this anomaly at different iteration numbers, with different execution times.
5- The jump in execution time is always large: most of the time the statement takes about 0.003 ms on average, then all of a sudden it jumps to about an hour.
6- I even have tried to disable Python garbage collection but to no avail.
7- To measure the delay, I have used both “time” and “datetime”.
8- Finally, I have tested the code using both a “for” loop and a “while” loop, but got the same issue.

Here is the code itself, in case you are interested and want to try it on your computer (if you do not see the issue the first time, run the code again!):

from datetime import datetime
import gc

gc.disable()
m = 0
s = 2000
t = 500
j = 7
i = 750
all_multiplied = s * t * j * i  # 5,250,000,000 iterations
k = 0
# for k in range(all_multiplied):
while k < all_multiplied:
    start_time = datetime.now()
    m = m + 1
    end_time = datetime.now()
    elapsed_time = (end_time - start_time).total_seconds() * 1000
    if elapsed_time > 200.0:
        print("k =", k, "m =", m)
        print("Error1: exceeding limits for trial", i, "of 750")
        print(f"      Elapsed time: {elapsed_time} milliseconds")
        exit()
    elif 25.0 < elapsed_time <= 200.0:
        print("Warning: on iteration number", k, "of a total =", all_multiplied,
              f"      Elapsed time: {elapsed_time} milliseconds")
    if (m % 1000) == 0:
        print("m =", m)
    k = k + 1

Hey!
When you run this code, you measure the time it takes to compute m = m + 1, but the printing also takes time, so consider that it will affect the execution time.

As far as I understand, you don’t need to disable the gc, because you don’t create and free a lot of objects, so it shouldn’t affect you in any way. (I need to test that later to be sure.)

Also, I’d recommend using time.perf_counter() or time.perf_counter_ns() instead of datetime.

If you need performance for sums, consider using the built-in sum() function.

Thanks Damir,
However, the fact that printing or other statements take time is understood. What does not make sense is that the execution times stay within very reasonable boundaries the majority of the time, but then suddenly jump to very unexpected values.

You iterate sufficiently often (> 5 billion times) that m grows beyond the int digit size. Could it be that the increase happens around iteration 1_073_741_823?

If so, the explanation is Python’s unlimited int type. For smaller numbers, a single 30-bit ‘digit’ is used; larger numbers need more digits, and therefore more time.
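You can see that digit boundary directly with sys.getsizeof (a quick sketch of my own, not from the original code; the byte counts shown are what I’d expect on 64-bit CPython):

```python
import sys

# CPython stores small ints in 30-bit "digits"; the object grows by
# one digit each time the value crosses a power of 2**30.
for n in (2**30 - 1, 2**30, 2**60):
    print(n, "->", sys.getsizeof(n), "bytes")
```

The size jump at 2**30 confirms where a second digit is allocated, though the extra cost per addition is tiny, nowhere near an hour.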

Just a hunch (cannot replicate because I’m on mobile device)

After changing m % 1000 to m % 1_000_000, so as to print only about one line per second, I ran this to completion (without gc.disable(); 5250 “m =” lines) with 3.12.0a7 in IDLE on Win10, and there were no warning/error messages.

Do you also observe the output suddenly pause at this point and stay paused for that whole time?

Aside from that: even if it did consistently only take 0.003 ms (my machine is slower, but in the ballpark), we are talking about ~5 billion iterations here; the code would still take several hours to run, and apparently might not consistently reproduce the issue. It’s a little much to expect others to test this out.

What exactly is the goal here? For example, is this code the result of trying to get a minimal reproducible example for a larger problem? Timing code properly is hard; the timeit standard library module exists for that purpose.

Calling datetime.now() is extremely slow compared to simple integer arithmetic. I suspect that, whatever timing problems you’re having, they’re actually from the timing itself, not the incrementing.
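A rough way to see the relative costs (a sketch under the assumption of stock CPython; numbers will vary by machine): time a block of pure increments, then a block of datetime.now() calls.

```python
import time
from datetime import datetime

N = 100_000
m = 0

# Cost of the increment alone.
t0 = time.perf_counter()
for _ in range(N):
    m = m + 1
t_inc = time.perf_counter() - t0

# Cost of the timing call itself.
t0 = time.perf_counter()
for _ in range(N):
    datetime.now()
t_now = time.perf_counter() - t0

print(f"increment:      {t_inc / N * 1e9:.0f} ns per call")
print(f"datetime.now(): {t_now / N * 1e9:.0f} ns per call")
```

The datetime.now() call typically costs an order of magnitude more than the increment, so the loop is mostly measuring the measurement.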

The standard answer to performance problems almost always includes a profiler.

In your case, a line-oriented profiler would be best.
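A quick cProfile sketch of what that would look like (my illustration; true line-by-line profiling needs the third-party line_profiler package, while the standard library’s cProfile gives per-function totals):

```python
import cProfile
import io
import pstats
from datetime import datetime

def loop(n):
    # Scaled-down version of the original timing loop.
    m = 0
    for _ in range(n):
        start = datetime.now()
        m = m + 1
        datetime.now() - start
    return m

profiler = cProfile.Profile()
profiler.enable()
loop(100_000)
profiler.disable()

out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```

Even per-function totals would show whether the time is going into the arithmetic, the clock calls, or somewhere else entirely.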

I like the arbitrary precision int conjecture though.

I did, with latest release, as already reported. It took about 2 hours. No problems.

Hi Terry,
I have tested this code on two different Macs, not on Windows, yet the issue persisted quite repeatedly. I do not think it has to do with (m % 1000), or m = m + 1 for that matter. I suggest, if you do find the time, re-running the code multiple times; I think it will show up (unless Macs are different from PCs?!)

Hi Karl,
In response to your questions:
1- I have written the code such that if it encounters a long execution time, it exits, so the code does not proceed further. If the code ran in its entirety, it would take about 5 hours maximum. Yet even if you remove all the other lines and just leave the increment by itself, it definitely takes more than 5 hours, which is an indication that something is amiss.

2- There is no goal here to pursue, other than encountering an unexpected and unexplained situation which, at least, I am incapable of finding an answer for!

Hi Chris,
You are absolutely right that the issue is unrelated to m = m + 1. I have not tested it yet, but I am sure that even if I had written m = 0 in its place, the issue would still exist.

On datetime taking too long, I disagree. In fact, you can change the code so that the difference (elapsed_time) is printed for every iteration; you will see that it is minuscule. Yet all of a sudden, on a random iteration, it jumps to about 45 minutes to an hour!

Hi Daniel,
I have never used a profiler! As for arbitrary precision: well, m could be set to zero and it would not make a difference, I guess. More importantly, if it had to do with precision, then you would observe the issue at a very specific iteration every time you ran the code. Yet it happens quite randomly, at a different iteration each pass.

Hi Terry,
My suggestion?! Run the code multiple times to observe the issue (unless I am the unluckiest person to keep running into it! :smiley: )

Yep definitely. Or if you removed it altogether, which would probably be worth confirming.

It usually takes a short time, but it sometimes takes a long time. That sounds, to me, like it could be any of several things, all related to the datetime call:

  1. It’s not actually taking all that time, it’s a measurement error. You would know if this is the case because the messages just stop happening for half an hour or more; if they don’t, it’s some kind of measurement issue and not actually taking all that time. There are a lot of possibilities but they mostly come down to a change in the definition of “now”.
  2. The program stops running for a time (due to the system being busy, or something getting shut down, or maybe its output is going to a pipe and it gets full). It’s pure coincidence what point you happen to be at when that hits, and it’s nothing to do with the actual time taken.
  3. For some obscure reason, calculating the current time occasionally takes three parts of forever. REALLY hard to track down, but possible.

My recommendation? Switch out datetime.now() for time.monotonic() and see if the same thing happens. Then try time.perf_counter() and time.thread_time(), similarly. All of these return floating-point seconds, so you can count elapsed time the same way you’re currently doing, except you don’t need the total_seconds() method (it’d just be (end_time - start_time) * 1000 for milliseconds).

Pinning down exactly what’s happening will be much easier with that additional information. You can leave the m = m + 1 in there if you want the consistency.
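The suggested swap looks like this, as a sketch of the original loop (shortened here to a smaller iteration count; the threshold is unchanged):

```python
import time

m = 0
for k in range(1_000_000):
    start = time.monotonic()
    m = m + 1
    # Plain floats, so no total_seconds() needed.
    elapsed_ms = (time.monotonic() - start) * 1000
    if elapsed_ms > 200.0:
        print("k =", k, "m =", m, f"elapsed: {elapsed_ms} ms")
        break
```

Whether the spike still appears under a monotonic clock would distinguish a real stall from a wall-clock adjustment.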

Hi Chris,
Thanks for your insights. I need, however, to inform you that I have used both time.monotonic() and time.perf_counter() before, and yet the issue persists (I did not try time.thread_time()).

Also, as mentioned, no processes seem to be running in the background that would make the system busy. (In fact, if that kept happening, it should have affected all long-running Python code, which so far at least I have not observed.)

Not sure if this is the real culprit. At any rate, the code is at your disposal, and if you find the solution, I would be happy to hear from you.

Hi Steven
In fact, I added the disabling of garbage collection because I thought that might be the culprit. It really doesn’t matter whether you turn it on or off; the symptom still shows up.

At any rate, the entire code is at your disposal in case you wanted to experiment and validate my observations.

I haven’t been able to replicate your results though.

Hi Chris
Could the culprit be Mac vs PC?! It is very unlikely, but I don’t want to leave any stone unturned. Mind that I tested it on two different Macs with the same results.

I wish someone could come up with an explanation for this obscure issue. They can blame me for all I care, but at least give a reasonable explanation!

Try this out: