Clarification on cProfile Timing Resolution and the ".001 seconds" Clock Tick Statement

Hello!
While reading the cProfile documentation's Limitations section, I came across this statement:

The most obvious restriction is that the underlying “clock” is only ticking at a rate (typically) of about .001 seconds.

I’m trying to understand what this actually means.

From what I can see in the CPython source code, cProfile appears to use clock_gettime(CLOCK_MONOTONIC_RAW, ...) on Linux for time measurement. When I check the clock resolution with clock_getres on my system, it returns 1 ns, so I'm confused about why the documentation suggests a 1 ms resolution ("about .001 seconds"); that seems like a mismatch of several orders of magnitude.
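For reference, a minimal sketch of that same resolution check done from Python rather than C (assuming a Linux system where time.CLOCK_MONOTONIC_RAW is available):

import time

# Reported resolution of the raw monotonic clock, i.e. what clock_getres(2) returns.
# On many Linux systems this prints 1e-09 (1 ns).
print(time.clock_getres(time.CLOCK_MONOTONIC_RAW))

# What Python reports for the monotonic clock it exposes.
info = time.get_clock_info("monotonic")
print(info.implementation, info.resolution)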

Could anybody explain this mismatch?

Those are two separate properties of a clock:

  1. The rate at which it ticks.
  2. The precision of the reported time.

The clock is likely ticking every 1 ms, and the time is reported with 1 ns resolution.


Check timer source:
cat /sys/devices/system/clocksource/clocksource0/current_clocksource
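If you prefer to read that from Python, a small sketch (assuming the usual sysfs layout on Linux):

from pathlib import Path

# Standard sysfs location on Linux; the path may differ on unusual setups.
base = Path("/sys/devices/system/clocksource/clocksource0")
print("current:  ", (base / "current_clocksource").read_text().strip())
print("available:", (base / "available_clocksource").read_text().strip())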

Tick Rate
#include <stdio.h>
#include <time.h>

int main() {
    struct timespec t1, t2;

    /* Spin until the reported time changes, then print the difference. */
    clock_gettime(CLOCK_MONOTONIC, &t1);
    do {
        clock_gettime(CLOCK_MONOTONIC, &t2);
    } while (t1.tv_nsec == t2.tv_nsec && t1.tv_sec == t2.tv_sec);

    long delta_ns = (t2.tv_sec - t1.tv_sec) * 1000000000L + (t2.tv_nsec - t1.tv_nsec);
    printf("Observed tick: %ld ns\n", delta_ns);
    return 0;
}
Python equivalent:

import time

clock = time.CLOCK_MONOTONIC

t1 = time.clock_gettime_ns(clock)
while True:
    t2 = time.clock_gettime_ns(clock)
    if t2 != t1:
        break

delta_ns = t2 - t1
print(f"Observed tick: {delta_ns} ns")

@barry-scott @elis.byberi

Thank you very much! I now understand the difference between the clock tick rate and the reported resolution. In my environment, the observed tick is about 100 ns, which is much coarser than the 1 ns reported resolution. Still, there is an orders-of-magnitude (roughly 10,000x) gap between this measurement and the 1 ms figure in the docs, which leaves me confused.

h-ishida@umejuice:~/tmp/tmp$ cat tmp.py 
import time

for _ in range(10):
    clock = time.CLOCK_MONOTONIC
    t1 = time.clock_gettime_ns(clock)
    while True:
        t2 = time.clock_gettime_ns(clock)
        if t2 != t1:
            break
    delta_ns = t2 - t1
    print(f"Observed tick: {delta_ns} ns")
h-ishida@umejuice:~/tmp/tmp$ python3 tmp.py 
Observed tick: 390 ns
Observed tick: 151 ns
Observed tick: 150 ns
Observed tick: 90 ns
Observed tick: 111 ns
Observed tick: 110 ns
Observed tick: 90 ns
Observed tick: 80 ns
Observed tick: 100 ns
Observed tick: 100 ns

It seems you are on a Linux system. The default clock tick there is 1000 Hz.

You need to use a blocking sleep call to see the effect of the clock tick rate. Try asking for the time after a sleep of 0.00001. I expect you to see reported times about 1 ms apart, not 0.1 ms apart.

If you query the clock in a tight loop, you only get to see the sampling accuracy.
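Something along these lines (a rough sketch; the sleep durations are arbitrary) shows where sleep-based timing bottoms out:

import time

# Measure how long short sleeps actually take, for a range of requested durations.
for requested in (0.01, 0.001, 0.0001, 0.00001, 0.000001):
    t1 = time.clock_gettime_ns(time.CLOCK_MONOTONIC)
    time.sleep(requested)
    t2 = time.clock_gettime_ns(time.CLOCK_MONOTONIC)
    print(f"requested {requested:.6f} s, observed {(t2 - t1) / 1e9:.6f} s")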

I believe it should be “scheduler clock,” not just “clock.”

Scheduling latency
import time
import multiprocessing


def busy():
    # CPU-bound worker that never yields voluntarily.
    while True:
        pass


if __name__ == "__main__":
    clock = time.CLOCK_MONOTONIC
    t1 = time.clock_gettime_ns(clock)

    # Start a competing CPU-bound process, then spin until the reported
    # time differs from the timestamp taken above.
    p = multiprocessing.Process(target=busy)
    p.start()

    while True:
        t2 = time.clock_gettime_ns(clock)
        if t2 != t1:
            break

    delta_ns = t2 - t1
    print(f"Scheduling latency: {delta_ns} ns")

    p.terminate()

Result:
Scheduling latency: 5264307 ns

The current wording in the documentation is confusing. It should specify “scheduling clock” or “kernel timer,” rather than just “clock.”

I used this script to test the scheduler granularity. I named it tmpdir/t.py

import time
import sys

t1 = time.clock_gettime_ns(time.CLOCK_MONOTONIC)
time.sleep(float(sys.argv[1]))
t2 = time.clock_gettime_ns(time.CLOCK_MONOTONIC)
print('%f' % ((t2 - t1) / 1_000_000_000,))

Then I ran it on my Fedora 42 system, which has kernel 6.15.5-200.fc42.x86_64

$ python3 tmpdir/t.py 0.1
0.100258
$ python3 tmpdir/t.py 0.01
0.010096
$ python3 tmpdir/t.py 0.001
0.001079
$ python3 tmpdir/t.py 0.0001
0.000159
$ python3 tmpdir/t.py 0.00001
0.000079
$ python3 tmpdir/t.py 0.000001
0.000059
$ python3 tmpdir/t.py 0.0000001
0.000059

I thought the 0.001 s figure was true for Windows systems, but I see the same behavior with your script on my Windows 10 system: it doesn't get below 0.00006 s.
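For a quick cross-platform comparison, Python can also report what it believes about its own clocks (a sketch; the values vary by OS and Python build):

import time

# Implementation and resolution Python reports for its timing-related clocks.
for name in ("time", "monotonic", "perf_counter", "process_time"):
    info = time.get_clock_info(name)
    print(f"{name:>12}: {info.implementation}, resolution {info.resolution} s")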

Actually, I asked in a CPython issue, and according to a contributor, it seems the documentation is simply outdated.


It’s not exactly outdated; it’s still relevant, but it’s unclear what the section is trying to explain. It may require in-depth knowledge of implementation details, especially since it discusses limitations.

Scheduler kernel timers are still observable, as I demonstrated in my previous post using multiprocessing (or threading).

The purpose of the Limitations section is to convey that certain constraints exist due to software or hardware limitations, without going too deep into technical detail.

It seems more like a note to one’s future self, though. :slightly_smiling_face:

The Linux scheduler has had a lot of work done on it; that seems to be why the 1 ms tick is no longer a limit.

I would not be surprised to find similar improvements in scheduling on macOS and Windows 11.