Timer time.process_time() Discrepancy

Yeah, that’s not an error, that’s deliberate clock speed adjustment to balance performance and power consumption :slight_smile: It’s even more pronounced when you have a large number of cores, some with hyperthreading and some without, some performance and some eco, some cores binned higher and thus factory-configured to a higher max clock speed, and then run one Flash game on it. All the CPU cores end up running at different frequencies!

Right; and my machine only has four cores, which wasn’t even bleeding-edge ten years ago :wink:

Hi,

yes, I was not stating that it was an error. I was merely noting that if that had happened with an
IC that I am normally accustomed to working with, which generally has a 1% or 5% clock tolerance
depending on the version, then it would have been an error.

Hello,

here is another related timing query.

Why is there such a discrepancy between the two following timing measuring methods:

import time, timeit

def time_func(*func):
    start = time.perf_counter()
    func
    final = time.perf_counter() - start
    print('Total run time is: ', final)

def sum_lists():
    a = [1, 2, 3]
    b = [4, 5, 6]
    c = a + b

code_to_test = """
a = [1, 2, 3]
b = [4, 5, 6]
c = a + b
"""

time_func(sum_lists)                  # Method 1
print(timeit.timeit(code_to_test))    # Method 2

The results that I obtained were:

Total run time is:  6.00004568696022e-07  # Via method 1
0.16569090000120923                       # Via method 2

The discrepancy is a factor of roughly 0.1657 / 6.0e-07 ≈ 2.8e+05.

Why would there be such a difference? Shouldn’t they be within the same range?

Can I suggest learning a bit more Python before worrying about performance? The first method just doesn’t do anything.

1 Like

Hello,

I am actually learning Python from the book Learning Python, 5th Edition.
I am reading it from beginning to end, in a linear manner. I am currently on
the chapter where it discusses timing methods.

By the way, the function sum_lists() does add the two lists.
Run it by adding a print statement to it.

print('The value of c is: ', c)

Yes, but are you doing that? Read your code carefully.

First fix the bug in your code, as mentioned by Cornelius, and then read the documentation about the different clocks that Python provides. If you do not understand what each clock is measuring, you will always be confused.
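To spell out the bug being hinted at: a bare function name is just an expression that evaluates to the function object and is then discarded; only adding `()` actually runs the function. A minimal sketch (the `slow` function here is just an illustrative stand-in):

```python
import time

def slow():
    time.sleep(0.05)  # something measurably slow

# Referencing the function does NOT run it:
start = time.perf_counter()
slow          # evaluates the function object, then discards it
ref_time = time.perf_counter() - start

# Calling it does:
start = time.perf_counter()
slow()
call_time = time.perf_counter() - start

print(ref_time)   # tiny: nothing was executed
print(call_time)  # at least 0.05 s: the sleep actually ran
```

This is exactly why the first method above reports sub-microsecond times: it times the evaluation of the name `func`, not a call to it.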

Thank you.

Found the bug:

def time_func(func):
    start = time.perf_counter()
    func()
    final = time.perf_counter() - start
    print('Total run time is: ', final)

Test results:

The value of c is:  [1, 2, 3, 4, 5, 6]
Total run time is:  0.03026490000047488
0.13459729999885894

That still seems like a big difference, relatively speaking of course (a factor of more than four).

And now please read the documentation of timeit.

1 Like

Thank you both for the nudge. I now see that the timeit function has a default argument of number=1000000.

Thus, I manually set it to one to match the number of times the other method is run.

If I now run it with this code, the values are approximately equal (The print statements also took a lot of time to execute so I removed them from the body):

import time, timeit

def time_func(func):
    start = time.perf_counter()
    func()
    final = time.perf_counter() - start
    return final

def sum_lists():
    a = [1, 2, 3]
    b = [4, 5, 6]
    c = a + b

code_to_test = """
a = [1, 2, 3]
b = [4, 5, 6]
c = a + b
"""

print('Total time is: ', time_func(sum_lists))
print(timeit.timeit(code_to_test, number = 1))

The timing results:

Total time is:  1.1000010999850929e-06
2.4000037228688598e-06

So instead of making your own method good, you made your timeit usage bad …

From documentation:

 |  timeit(self, number=1000000)
 |      Time 'number' executions of the main statement.
 |
 |      To be precise, this executes the setup statement once, and
 |      then returns the time it takes to execute the main statement
 |      a number of times, as float seconds if using the default timer.
 |      The argument is the number of times through the loop, defaulting
 |      to one million.  The main statement, the setup statement and
 |      the timer function to be used are passed to the constructor.

The default timer for the timeit function is perf_counter, same as the function that I am using.

FUNCTIONS
    default_timer = perf_counter(...)
        perf_counter() -> float

        Performance counter for benchmarking.

repeat(stmt='pass', setup='pass', timer=<built-in function perf_counter>, repeat=5, number=1000000, globals=None)
    Convenience function to create Timer object and call repeat method.

timeit(stmt='pass', setup='pass', timer=<built-in function perf_counter>, number=1000000, globals=None)
    Convenience function to create Timer object and call timeit method.
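You can confirm in your own interpreter that the two clocks are literally the same object (true on Python 3.3 and later):

```python
import time
import timeit

# timeit's default clock is time.perf_counter itself, not a copy
print(timeit.default_timer is time.perf_counter)  # True
```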

Yes, and what does it measure? When, as in your example here, you only do the test once, how much overhead and what sort of overhead is there? Etcetera.

As Stefan pointed out, you started with timeit and with a hand-rolled one, and to make them a bit more similar, you worsened timeit to make it as meaningless as the one you’d hand-rolled.

The timeit module exists for a reason. Do you know what it does, and why you would want to not use it? Do you even know what kind of timescale you’re looking at here, and how much of the numbers you’re seeing is just noise?

1 Like

Are you stating that the setup time is so much greater than one iteration that it skews the final tally (i.e., the noise)? And that the reason for the default number value being so high (1000000) is so that collectively these iteration execution times make the setup time insignificant?

As per documentation:
To be precise, this executes the setup statement once …

The default timer returns seconds as a float.

The default timer being time.perf_counter().

Yes. And you’re adding overhead each call, which all has to be factored in.

The timeit module deals with all this.

You reported 0.13 seconds for a million iterations (which btw also includes “freeing” the lists of each previous iteration when you re-assign the variables). So 0.13 μs per iteration. Then you reported 2.4 μs for a single iteration (and that doesn’t include a freeing). So 0.13 μs for what you want to measure (and a little more, actually) and 2.27 μs overhead. Would you not say that 2.27 is so much greater than 0.13 that it skews the final tally?
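The amortization argument above can be sketched directly (absolute numbers vary by machine; only the relative magnitudes matter):

```python
import timeit

stmt = "a = [1, 2, 3]; b = [4, 5, 6]; c = a + b"

# One execution: the measurement is dominated by per-call overhead.
single = timeit.timeit(stmt, number=1)

# A million executions: the overhead is paid once and amortized away.
total = timeit.timeit(stmt, number=10**6)
per_iteration = total / 10**6

print(single)         # mostly overhead
print(per_iteration)  # much closer to the true per-statement cost
```

On a typical machine `per_iteration` comes out well below `single`, which is the whole point of the large default `number`.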

1 Like

Here’s a mostly proper/fair version, and the times are pretty much the same.

import time, timeit
from itertools import repeat

def time_func(func):
    start = time.perf_counter()
    func()
    final = time.perf_counter() - start
    print('Total run time is: ', final)

def sum_lists():
    for _ in repeat(None, 10**6):
        a = [1, 2, 3]
        b = [4, 5, 6]
        c = a + b

code_to_test = """
a = [1, 2, 3]
b = [4, 5, 6]
c = a + b
"""

time_func(sum_lists)                  # Method 1
print(timeit.timeit(code_to_test))    # Method 2

Attempt This Online!

Thank you Rosuav and Stefan2 for your kind help and setting me straight with my misunderstanding.

I now understand this issue a lot better.

Much appreciated.

1 Like

Hello,

just another follow-up on a related topic. As I was reading both the Learning Python, 5th Edition textbook and the Python documentation, I came across a subject that perhaps could be worded a bit more explicitly. This concerns the setup option for the repeat function in the timeit module.

From the book:
The timeit module also allows you to provide setup code that is run in the main statement’s scope, but whose time is not charged to the main statement’s total—potentially useful for initialization code you wish to exclude from total time, such as imports of required modules, test function definition, and test data creation. Because they’re run in the same scope, any names created by setup code are available to the main test statement; names defined in the interactive shell generally are not.

To specify setup code, use a -s in command-line mode (or several of these for multiline
setups) and a setup argument string in API call mode.

From documentation (help(timeit)):
-s/--setup S: statement to be executed once initially (default ‘pass’).
Execution time of this setup statement is NOT timed.

So, the book has the following example:

from timeit import repeat
min(repeat(number=1000, repeat=3,
           setup='from mins import min1, min2, min3\n'
                 'vals = list(range(1000))',
           stmt='min3(*vals)'))

I modified it to:

from timeit import repeat
min(repeat(number=1000, repeat=3,
           setup='print("Hello everyone!"); print("Goodbye!"); vals = list(range(1000))',
           stmt='min(*vals)'))

When I ran the code, the print statements were executed three times. I was expecting once, as stated by both the book and the documentation (…, statement to be executed once initially).
It appears that the setup code is executed once per repeat. In this example, repeat = 3,
and the print statements were executed 3 times. When I changed this value to two, the print statements
were executed twice.

Can the documentation be made a bit clearer, explicitly stating that the setup code is executed once for every repeat, even though its time is ‘not charged’ to the test code timing?

Am I reading/interpreting this incorrectly? If so, how?
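Your observation can be confirmed without print statements by counting setup executions. A small sketch (the `calls` dictionary is just an illustrative counter of my own, passed in through the `globals` argument available since Python 3.5):

```python
import timeit

calls = {"setup_runs": 0}

# repeat() calls Timer.timeit() three times, and each of those
# runs the setup statement once before its timed loop.
timeit.repeat(
    stmt="min(vals)",
    setup='calls["setup_runs"] += 1; vals = list(range(1000))',
    repeat=3,
    number=1000,
    globals={"calls": calls},
)

print(calls["setup_runs"])  # 3: once per repeat, not once overall
```

So the documented "executed once initially" is per Timer.timeit() call; repeat() simply makes that call `repeat` times.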