Timer time.process_time() Discrepancy

Hi,

I am currently brushing up on measuring time intervals between processes. I am using the following function for test purposes (note that the time.clock() function has been deprecated and removed in Python 3.8):

import time

def timer(func, *args):
    # Call func 1000 times and return the total CPU time used, in seconds.
    start = time.process_time()
    for i in range(1000):
        func(*args)
    return time.process_time() - start

>>> timer(pow, 2,10000)
0.03125
>>> timer(pow, 2,8500)
0.015625
>>> timer(pow, 2,8500)
0.0
>>> timer(pow, 2,9500)
0.015625
>>> timer(pow, 2,8500)
0.015625
>>> timer(pow, 2,9500)
0.0

It appears as if the time interval results from the process_time() function in the time module are not consistent.

Can someone recommend a good, reliable, time-tested method for measuring time intervals?

Note: I am using IDLE Shell 3.12.0 on Windows 11 Home

Do you want to measure elapsed time or CPU time used?
CPU time is returned by time.process_time() (though time.process_time_ns() is recommended).
Elapsed time by time.monotonic_ns().

Also note that a benchmark needs to run for longer and average multiple runs to eliminate OS scheduling effects.
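
For example, here is a rough sketch measuring both clocks around the same loop (the loop count and the measure() name are just illustrative):

import time

def measure(func, *args, loops=100_000):
    # Run func(*args) in a loop and report both CPU time and elapsed (wall-clock) time.
    cpu_start = time.process_time_ns()
    wall_start = time.monotonic_ns()
    for _ in range(loops):
        func(*args)
    cpu_s = (time.process_time_ns() - cpu_start) / 1e9
    wall_s = (time.monotonic_ns() - wall_start) / 1e9
    return cpu_s, wall_s

cpu_s, wall_s = measure(pow, 2, 2000)
print(f"CPU: {cpu_s:.4f} s, elapsed: {wall_s:.4f} s")

If the machine is otherwise idle the two numbers will be close; if the process gets preempted or sleeps, the elapsed time grows while the CPU time does not.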

Thank you for responding to my post.

I would like to measure raw time (ns, ms, s, etc.), i.e. the actual time elapsed. FYI, I am on Chapter 21 of Learning Python. In the book, published in 2013, they used the time.clock() function in their examples. But as you know, and as I pointed out in my previous post, this has since been removed. So I am looking for a reliable alternative to follow along with.

Note the following different timing results despite the same values being passed in.

from time import perf_counter

def timer(func, *args):
    # Call func 1000 times and return the total elapsed (wall-clock) time, in seconds.
    startTime = perf_counter()
    for i in range(1000):
        func(*args)
    return perf_counter() - startTime

>>> timer(pow, 2,2000)
0.002466699999786215
>>> timer(pow, 2,2000)
0.0022763999995731865
>>> timer(pow, 2,2000)
0.0019138999996357597
>>> timer(pow, 2,2000)
0.0017414999992979574
>>> timer(pow, 2,2000)
0.0017892999994728598

Is there a reliable timing clock out there with consistent results and high precision?

Since clock timing tools are used to compare the performance of different code alternatives, and these tools appear not to provide consistent results, how can one make an educated decision based on unreliable tools?

Are you blaming perf_counter for the different times?

I am not blaming anything. I am looking for a reliable timing clock that provides consistent timing results.

But the way you’re saying it sounds like you’re blaming the different times on the timer function instead of on the fact that the same code run multiple times doesn’t always run equally fast.

From my previous post.

Subsequent posts are just documenting my observations when attempting to use different timing tools and noting their inconsistencies.

There is no reliable clock that can do that: this has very little to do with the way you measure the time the code takes, and much more to do with the fact that the code takes a different amount of time on each execution. The way to make this reliable is to repeatedly execute the code. Look into the timeit stdlib module.
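
For example, a minimal sketch with timeit.repeat (the statement and counts are just illustrative):

import timeit

# Time 1000 calls of pow(2, 2000), repeated 5 times; each entry is the seconds for one run of 1000 calls.
runs = timeit.repeat(stmt="pow(2, 2000)", number=1000, repeat=5)
print(min(runs))   # the least-disturbed measurement
print(runs)

timeit disables garbage collection during the measurement and uses time.perf_counter() under the hood, which is why it is the usual tool for this.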


From my posts, I just showed that running the code multiple times produced inconsistent results.

Exactly! That is expected. You then need to average the results.

Zoinks.

:hushed: … :disappointed_relieved:

Or use the minimum, as the timeit doc recommends. I mostly use a mix of both, like running 25 times (alternating between the alternative solutions) and showing mean and stdev of the fastest 5 times.
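
As a sketch of that kind of approach (25 and 5 match the numbers above; the statement is just a stand-in, and alternating between the real alternatives is left out for brevity):

import statistics
import timeit

# 25 repeats of 1000 calls each; keep the fastest 5 and summarize them.
runs = timeit.repeat(stmt="pow(2, 2000)", number=1000, repeat=25)
fastest = sorted(runs)[:5]
print(f"min of all runs : {min(runs):.6f} s")
print(f"mean of best 5  : {statistics.mean(fastest):.6f} s")
print(f"stdev of best 5 : {statistics.stdev(fastest):.6f} s")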

A modern CPU has many features that make reproducible timing nearly impossible. The best you can achieve is an average.

There are many layers of memory caching.
Branch prediction logic.
Each time there is an interrupt from hardware, your code stops running.
Each time the OS needs the CPU core your code is running on, your code stops running.
If the CPU is cold it runs faster, then slows down as it heats up; this is turbo boost.
And don't forget that the CPU clock has jitter added to it (spread-spectrum clocking) to prevent radio interference.


Wouldn’t they just put some type of Faraday cage around it?

Probably cheaper to do it via SW (jitter), as you stated, than to pay the added cost of a Faraday cage, I suppose.

By the way, I have never heard of someone purposefully adding jitter to anything. I thought this was a byproduct of natural physics at play.

Thank you for this recommendation. I will try something close to this for testing.

They’re perfectly consistent. They’re just low resolution, because that’s the API your operating system exposes to Python, which Python wraps up as process_time.
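
You can ask Python what it knows about each clock on your platform with time.get_clock_info(); note that the reported resolution may still be finer than the granularity at which the underlying OS counter actually ticks:

import time

# Show which OS facility backs each timer and the resolution Python reports for it.
for name in ("process_time", "perf_counter", "monotonic", "time"):
    info = time.get_clock_info(name)
    print(f"{name:13s} impl={info.implementation!r} resolution={info.resolution}")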

If the goal is simply to do performance timing of code, the standard library already includes the timeit module. Its purpose is to wrap up the best guesses about the right way to do it on your platform, along with minimizing the overhead of the actual timing code in order to focus on the time spent on the task itself. It is extremely well tested and has almost a 20-year history, being continually updated to reflect the consequences of other changes to Python.

You should use that. You should also read its documentation in order to understand why things are this way. I also have a blog article about it from last year.
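
As a quick sketch of the API (Timer.autorange() picks the loop count for you so that a run takes at least about 0.2 seconds):

import timeit

t = timeit.Timer(stmt="pow(2, 2000)")
number, total = t.autorange()   # (loop count chosen, total seconds for that many loops)
print(f"{number} loops in {total:.4f} s -> {total / number * 1e6:.2f} µs per call")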

I don’t understand. It seems that you think the clock must not be “reliable” because it does not give “consistent” results for the same code. But there are many reasons why, if you run the same code several times in a row, it would take a slightly different amount of time to complete each time. This can happen at the software level (keep in mind that there is both the Python interpreter itself and the bytecode of your compiled program), at the operating system level, and even at the raw hardware level. It has affected every program in every programming language for decades, and the situation has only gotten more complex over time.


Thank you for the in-depth response. I just have to adjust / expand my thinking is all. Coming from a HW background, checking/verifying a clock frequency to me means toggling an I/O bit and monitoring it on an oscilloscope. If I come back the next day and the measured clock period is not consistent, then that IC would be considered unreliable and/or defective.

Although we are not measuring the clock (period) here per se, we are using the clock to measure performance (i.e., the number of clock cycles required to execute a given job, for which the clock period is a determining factor).

Yes, in Python, I will have to consider all of the valid points that you have pointed out.

Thank you again. Much appreciated.

Even at a pure hardware level, it is vastly more complex than that on modern computers.

For example, if I make a few attempts to ask Linux the current speed of each core on my CPU:

$ grep "^[c]pu MHz" /proc/cpuinfo
cpu MHz		: 798.200
cpu MHz		: 798.214
cpu MHz		: 798.180
cpu MHz		: 798.172
$ grep "^[c]pu MHz" /proc/cpuinfo
cpu MHz		: 997.635
cpu MHz		: 994.702
cpu MHz		: 997.612
cpu MHz		: 996.495
$ grep "^[c]pu MHz" /proc/cpuinfo
cpu MHz		: 945.135
cpu MHz		: 870.794
cpu MHz		: 924.329
cpu MHz		: 1099.728
$ grep "^[c]pu MHz" /proc/cpuinfo
cpu MHz		: 3030.049
cpu MHz		: 3080.533
cpu MHz		: 3058.685
cpu MHz		: 3152.374

For the first three, I left the computer more or less idle; for the last, I had

$ python
Python 3.8.10 (default, Nov 22 2023, 10:22:35) 
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> while True: pass
... 

running in a separate terminal window. (Interestingly, all cores spin up like this even though the Python process will only use one at a time.)

This hardware is almost ten years old, by the way. It’s not a new phenomenon.

The actual clock itself - the underlying quartz crystal oscillator - is presumably very consistent, and self-correcting where necessary. (I guess that it oscillates many times faster than the CPU clock runs, so that the clock signal can be divided to create different target CPU clock rates.) But outside of that, all bets are off.

Yes, I am not disagreeing with you on the fact that modern computers (along with the added wrappings of SW) are much more complex. However, with the ICs that I have worked with, the internal RC clocks have specs of either 1% or 5%. As an example, if I take the measurement of 945 MHz as the reference and 1099 MHz as the maximum measured clock frequency (from one of your CPU measurements), this means a percent error from the spec of:

|945 − 1099| / 945 → 16.3 % (call the manager pronto, something is wrong here)

I just have to take into account all of the reasons you highlighted in your previous post.

Right. My argument is that there is also hardware (or at least firmware) translation between the clock and the CPU. For example, the Intel web site describes such throttling, although I couldn’t find technical details about how it’s implemented.