Unexpectedly large execution time

And THAT is why I wanted to see this with all the different clocks simultaneously, not just one clock.

It’s definitely sounding like your Mac is stopping the program from running for a while.

Do you mean that you yourself observed the program taking 10 hours? Or that you had the computer record the time by some means, and saw later that it took 10 hours?

Do you mean to include datetime.now() again, along with the current function time.monotonic()? Or did you have other functions in mind?

Yeah! I observed the program running for about 10 hours despite the fact that the program was telling me it took only half an hour!

I mean this:

because different clocks measure different things.

Hi Daniel,
Thank you for the reference. Maybe you can decipher a couple of lines in the suggested solution for me, as I have never used a line profiler or anything like it. Here is what your reference suggested for using the line profiler:

from line_profiler import LineProfiler
import random

def do_stuff(numbers):
    s = sum(numbers)
    l = [numbers[i]/43 for i in range(len(numbers))]
    m = ['hello'+str(numbers[i]) for i in range(len(numbers))]

numbers = [random.randint(1,100) for i in range(1000)]
lp = LineProfiler()
lp_wrapper = lp(do_stuff)
lp_wrapper(numbers)
lp.print_stats()

OK! So here is my understanding:

lp = LineProfiler() # This must be an instantiation of the line profiler
lp_wrapper = lp(do_stuff) # This must be telling the line profiler that I am interested \
                          # in recording statistics about the function 'do_stuff'
lp_wrapper(numbers) # Now I have no idea why we state this or what it does! \
                    # Why do we need it, and what happens if we never call it?

Maybe you can clarify my understanding of the above example.

So if I understand your suggestion, I just need to add time.process_time() to the code along with time.monotonic() and re-run it. Right?

Or add ALL of the clocks available to you.

Apologies, maybe this is a better reference:

You’ll probably want to put your code in a function (e.g., def main), and then apply the decorator to main.
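
To expand on the part you asked about: lp(do_stuff) returns a wrapped version of the function, and calling lp_wrapper(numbers) is what actually runs do_stuff while the profiler records per-line timings. If you never call the wrapper, the function never executes under the profiler, so lp.print_stats() would have nothing to report. Here is a minimal sketch of that applied to a main function (the loop body is just a placeholder for your real code):

from line_profiler import LineProfiler

def main():
    # placeholder for your real code
    k = 0
    while k < 1_000_000:
        k += 1

lp = LineProfiler()       # instantiate the profiler
lp_wrapper = lp(main)     # wrap main so its lines are timed
lp_wrapper()              # actually run main under the profiler
lp.print_stats()          # report per-line hit counts and timings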

I tried a few things using this code:

import time

start_process_time = time.process_time()
start_perf_counter = time.perf_counter()
start_monotonic = time.monotonic()
start_time = time.time()

try:
    while True:
        pass
except KeyboardInterrupt:
    print("process_time", time.process_time() - start_process_time)
    print("perf_counter", time.perf_counter() - start_perf_counter)
    print("monotonic", time.monotonic() - start_monotonic)
    print("time", time.time() - start_time)

The program keeps running until I interrupt it, and then prints the times. So what I’m looking at is not the difference in times between runs (it’s just however long I choose to let it run), but the difference between the results from the different clocks.

If I just run it for a short time, I get this:

process_time 30.64817
perf_counter 30.648469999898225
monotonic 30.648472583969124
time 30.647507905960083

They’re all basically the same value.

If I suspend the process in the middle of it, then continue it, then interrupt it, I get this:

process_time 15.989780999999999
perf_counter 30.943234625039622
monotonic 30.943238334031776
time 30.942265033721924

Here, process_time stops counting while it’s suspended, but the other clocks keep running.

Now, if I put the computer in sleep mode in the middle of it, I get this:

process_time 29.642594
perf_counter 29.93822104204446
monotonic 29.938223292003386
time 44.82132887840271

Here, perf_counter and monotonic also stop counting, and only time keeps running.

I’m wondering if your computer automatically went to sleep in the middle of running your code. It seems that would be consistent with the program taking a long time, but time.monotonic() showing a much shorter time.
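
Incidentally, Python can tell you which underlying OS clock each of these functions uses and whether it can be adjusted. Something like this (output varies by platform):

import time

# Ask CPython which OS clock backs each timing function
for name in ("time", "monotonic", "perf_counter", "process_time"):
    print(name, time.get_clock_info(name))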

I can’t really answer your question. Maybe it did, and that could be the culprit. I remember, however, that in one experiment I printed something relevant at the end of each iteration (so as to counteract the possibility that the computer had gone to sleep), and then redirected the print to a file. However, this file just became too big, and I had to reduce it by printing only every 1000 iterations or so. But even then, the issue persisted.
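
(In hindsight, timestamping each of those periodic lines, e.g. print(datetime.datetime.now(), i, flush=True) every 1000 iterations, would have made any long stall show up as a gap between consecutive timestamps in the file.)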

I am doing a number of runs in which I, too, have used different time functions, and I will share the results with everyone. So far I see datetime.now() truly representing the elapsed time, while the others do not.

I am just curious: if you run the whole thing on your computer, MULTIPLE TIMES, do you too encounter the same issue, or is it just me?! (If it is me, then I must be very unlucky!)

I saw nothing notable when doing this. But I suggest backing up and doing what should have been done from the start: produce a minimal example that shows the odd behavior. That doesn’t just mean getting rid of bizarre code, but also minimizing the amount of time people have to burn trying to reproduce your problem. Like so:

N = 10**9

import datetime
now = datetime.datetime.now

start = now()
print(f"{start} starting {N=:_}")

k = 0
while k < N:
    k += 1

end = now()
print(f"{end} elapsed {end-start}")

assert k == N

Your symptom has to do with wall-clock time, and that’s what datetime gives us. So use it. Quibbling over the details of high-performance fine-grained timers is pointless when you’re talking about hours of “lost” time.

Typical output:

2023-05-14 20:54:31.645496 starting N=1_000_000_000
2023-05-14 20:57:20.221634 elapsed 0:02:48.576138

Now for you to do: how small can you make the exponent in the first (N = ...) line before a problem becomes apparent? If it’s already apparent at the given N = 10**9, then you can save everyone else a world of wasted time by saying so.
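
If it helps, a sketch of how that sweep might look (the exponent range here is arbitrary):

import datetime

# Try successively larger N until the anomaly appears
for exp in range(6, 10):
    N = 10 ** exp
    start = datetime.datetime.now()
    k = 0
    while k < N:
        k += 1
    end = datetime.datetime.now()
    print(f"N=10**{exp}: elapsed {end - start}")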

BTW, there’s no reason yet to suspect this has anything to do with Python. I understand that you’re using Python, but there’s no reason given to believe you wouldn’t see similar things if you were running any other language.

Python itself is in fact an unlikely suspect. The program is doing “almost nothing”. Those who know CPython’s implementation can easily see that the loop is merely doing simple integer addition and a whole lot of dict lookups/updates (you’re accessing and rebinding module globals, which are stored in the module’s __dict__ using their string names as keys). Garbage collection is irrelevant, and all the data should easily fit in L1 cache.
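
If you’re curious, that globals point is easy to see for yourself. A small sketch (N reduced so it finishes quickly; the exact timings are machine-dependent):

import time

N = 10**8

def loop_locals(n):
    # k and n are fast function locals here, not dict entries
    k = 0
    while k < n:
        k += 1

start = time.perf_counter()
k = 0
while k < N:  # k and N are module globals, i.e. dict lookups
    k += 1
print("globals:", time.perf_counter() - start)

start = time.perf_counter()
loop_locals(N)
print("locals: ", time.perf_counter() - start)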

I thought that the code was fairly straightforward, with only a small number of lines, and I would be surprised if anyone would have become entangled or confused by the amount of code involved. At any rate, that is water under the bridge now…

In fact, datetime.now() was used from the very start. People, however, insisted that I use other timing functions, and that is what I did.

I think this is the end of this experiment for me. I have spent enough time on this and still cannot find a resolution in this thread, so I need to move on. Sometimes you just have to admit defeat… no need to beat a dead horse anymore.

It could very well be the case in other programming languages as well; I never claimed that this issue is relevant to Python only. Yet the issue was, and still is, inexplicable to me, and I don’t claim it has anything to do with CPython specifically either. The fact that the loop was doing a very silly thing is actually the point: it was made trivial on purpose, to see what could cause such a large delay. Garbage collection was turned on and then off based on recommendations from participants in this thread; again, this is how different theories are tested. Finally, the code was small enough, but even if it had not fit in the L1 cache, it certainly would have fit in the processor’s L2 cache or main RAM.

OK, here is the result of my last experiment for anyone who is interested:
1- I used three different clocks, namely datetime.now(), time.monotonic(), and time.process_time(). (A sketch of the harness appears after the results below.)

2- In all of my runs, time.monotonic() and time.process_time() were consistent with each other yet wrong compared to the wall clock. datetime.now(), however, was the one that showed when the code actually finished and agreed with the wall clock.

3- In all these experiments, the code completed in about 45 minutes on average. Here is the result of one of these runs:

All done! Number of iterations = 5250000001 , of the total= 5250000000
start time using datetime.now()= 2023-05-14 12:34:07.080544
end time using datetime.now()= 2023-05-14 13:24:06.610949
Elapsed time: 2999530.405 milliseconds when using datetime.now()

start time using time.monotonic()= 172552.919527151
end time using time.monotonic()= 173587.240459823
Elapsed time: 1034320.9326719807 milliseconds when using time.monotonic()

start time using time.process_time()= 0.071838
end time using time.process_time()= 1003.417505
Elapsed time: 1003345.667 milliseconds when using time.process_time()

4- The only outlier was this run, which took about 90 minutes:

All done! Number of iterations = 5250000001 , of the total= 5250000000
start time using datetime.now()= 2023-05-14 19:04:20.371047
end time using datetime.now()= 2023-05-14 20:33:46.885792
Elapsed time: 5366514.745 milliseconds when using datetime.now()

start time using time.monotonic()= 180923.678196712
end time using time.monotonic()= 182034.708197858
Elapsed time: 1111030.0011460204 milliseconds when using time.monotonic()

start time using time.process_time()= 0.069705
end time using time.process_time()= 1105.324087
Elapsed time: 1105254.3819999998 milliseconds when using time.process_time()
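
For reference, a minimal sketch of a three-clock harness along these lines (the busy loop below is a stand-in for the real workload; total matches the iteration count reported above):

import datetime
import time

total = 5_250_000_000

start_dt = datetime.datetime.now()
start_mono = time.monotonic()
start_proc = time.process_time()

# Stand-in for the real workload
k = 0
while k < total:
    k += 1

end_dt = datetime.datetime.now()
end_mono = time.monotonic()
end_proc = time.process_time()

print("datetime.now() elapsed:", end_dt - start_dt)
print("time.monotonic() elapsed (ms):", (end_mono - start_mono) * 1000)
print("time.process_time() elapsed (ms):", (end_proc - start_proc) * 1000)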

Thank you all for your time and inputs. I do appreciate that.
All the best.

I’m sorry you didn’t try to reduce the time needed to run this, as suggested. Requiring literally hours to run one sequence of tests vastly reduces what other people are willing to give to this. But note something from your last sequence anyway:

You already have a mystery here: the wall-clock time is already about 3x longer than the process time. In your “bad” run, the process time didn’t materially change, but the wall-clock time was roughly 5x longer (instead of 3x).

The real question isn’t why it’s sometimes 5x but usually 3x: the question is why it’s 3x to begin with. You surely don’t need a test that runs for 45 minutes to demonstrate this. For some reason(s), your OS is loath to give the process cycles to run in, in all cases.
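
One cheap way to watch this while the program runs (a sketch; the reporting interval is arbitrary): periodically print the ratio of CPU time to wall-clock time. For a pure CPU-bound loop on an otherwise idle machine it should stay close to 1.0; a ratio near 0.3 means the OS is giving you the CPU only about a third of the time.

import time

N = 1_000_000_000
start_wall = time.monotonic()
start_cpu = time.process_time()

k = 0
while k < N:
    k += 1
    if k % 100_000_000 == 0:
        wall = time.monotonic() - start_wall
        cpu = time.process_time() - start_cpu
        print(f"k={k:,} cpu/wall = {cpu / wall:.2f}", flush=True)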

Are you, perhaps, running lots of other CPU-bound processes at the same time? Have you tried on a machine running as little other stuff as possible?

On my box, a 4-core machine with only 1 core running full-time doing other stuff, elapsed wall-clock and process times are about the same. Which is what I expect.

Use Google. I don’t run Macs, but a very quick Google search said “App Nap” is a little-known “feature” of macOS since 2013, which puts applications to sleep if the OS thinks you’re not interacting with them. Which would be supremely relevant to what you’re seeing, if that’s what’s happening.
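
If that is what’s happening, running the script under macOS’s caffeinate utility (for example, caffeinate -i python3 yourscript.py) is supposed to keep the system from idling while the script runs, and would be a cheap way to test the theory.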

I’ve run the code multiple times now, and I had no issues. This is with the version of your code with no timing within the loop (only outside of it). It took about 7 or 8 minutes to finish each time. So, I’d have to say that this is only happening to you.

That said, I think the clue is in time.monotonic() showing much less time elapsed than the actual time. On macOS (other operating systems might differ), the clock used by time.monotonic() is system-wide, and it keeps going continuously as long as the system isn’t asleep. It looks like what must be happening is that your computer is going to sleep while the code is running. That’s what I believe, unless somebody knows another reason that time.monotonic() would stop incrementing for a period.
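
One way to test that theory directly (a sketch; the 5-second threshold is arbitrary): sample both time.time() and time.monotonic() once a second. If the machine sleeps, the wall clock keeps advancing while the monotonic clock does not, so the gap between the two deltas will jump on wake:

import time

prev_wall = time.time()
prev_mono = time.monotonic()

while True:
    time.sleep(1)
    wall, mono = time.time(), time.monotonic()
    # Wall-clock time that the monotonic clock did not account for
    gap = (wall - prev_wall) - (mono - prev_mono)
    if gap > 5:
        print(f"Possible sleep detected: ~{gap:.0f}s unaccounted for")
    prev_wall, prev_mono = wall, mono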

Maybe try this. It’s a port of your little CPython program to C. I’ve changed the constants a bit to get comparable execution duration.

#include <stdio.h>
#include <sys/time.h>


unsigned long timestamp()
{
    struct timeval tv;
    gettimeofday(&tv,NULL);
    unsigned long time_in_microseconds = 1000000 * tv.tv_sec + tv.tv_usec;
    return time_in_microseconds;
}

int main()
{
    long m = 0;
    long s = 20000;
    long t = 5000;
    long j = 70;
    long i = 75;
    long all_multiplied = s * t * j * i;
    long k = 0;
    while (k < all_multiplied)
    {
        unsigned long start_time = timestamp();
        m = m + 1;
        unsigned long end_time = timestamp();
        unsigned long elapsed_time_in_milliseconds = (end_time - start_time) / 1000;
        if (elapsed_time_in_milliseconds > 200)
        {
            printf("k = %ld, m = %ld\n", k, m);
            printf("Error1 exceeding limits for trail %ld of 750\n", i);
            printf("      Elapsed time: %lu miliseconds\n", elapsed_time_in_milliseconds);
            return(1);
        }
        else if (elapsed_time_in_milliseconds > 25 && elapsed_time_in_milliseconds <= 200)
        {
            printf(
                "Warning: On iteration number %ld of a total %ld, elapsed time: %lu milliseconds\n",
                k,
                all_multiplied,
                elapsed_time_in_milliseconds
            );
            fflush(stdout);
        }
        if ((m % 10000000) == 0)
        {
            printf("m = %ld (%f%%)\n", m, m * 100.0 / all_multiplied);
            fflush(stdout);
        }
        k = k + 1;
    }
    return 0;
}
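
(To try it, assuming the file is saved as port.c: compile and run with something like cc -o port port.c && ./port. The gettimeofday() calls inside the loop keep the compiler from optimizing the loop away.)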

The idea being:

  1. If it’s equally jittery, that means it’s not a CPython problem.
  2. If it doesn’t have the problem, that increases the likelihood that it is something about CPython on macOS.

I ran the CPython version of the program 50 times, and the C version of the program 50 times as well, all on an otherwise-quiescent Mac mini.

The CPython version exited with a failure status zero times in 50.

The C version exited with a failure status one time in 50.

I suspect the OP has (re)discovered that macOS is not a real-time operating system.
