Different CPython memory management for loop iteration

Hello all, I’m experimenting with the instructions CPython executes. I wrote a simple program with a loop and used Intel Pin to analyze the instructions it runs, and I found something interesting.

Here is my code:

##### simple_mem.py #####
import sys
import ctypes as ct

addr = ct.c_int(0)

# temp_list = []
def address_write():
    for _ in range(int(sys.argv[1])):
        addr.value = 1
        # temp_list.append(addr)

address_write()

# To execute this script:
#   python3 simple_mem.py 500
  1. For a single run, I found that when the iteration count is larger than 256, the interpreter executes more binary (x86_64) instructions. More detail inside.
  2. Across multiple runs, the number of executed instructions can differ. The main cause is the function PyMem_Realloc. More detail inside.

Does anyone know the reason behind this? I’m curious about these results.


The main differences are:

Another difference for single run:

Difference for multiple runs:
some runs execute more PyMem_Realloc instructions:

As an optimisation, Python interns small-valued integers.
I am not sure of the full set of ints that are interned, but int values up to 256 are what affect your test.

If you run the loop with ints from range(1000, 1000 + int(sys.argv[1])), you should see no difference any more.
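If it helps, the interning behaviour is easy to observe directly. A minimal sketch (int(...) on strings is used to defeat constant folding; the exact cached range, -5 to 256, is a CPython implementation detail):

```python
# CPython caches small ints, so equal small values are the same object.
x = int("256")
y = int("256")
print(x is y)   # True: both names point at the cached 256 object

a = int("257")
b = int("257")
print(a is b)   # False: 257 is outside the small-int cache, so two objects
```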

I found this article that goes into more detail: Integer Interning in Python (Optimization)

I agree that small-valued numbers (≤ 256) are interned. But during the loop, the Python interpreter executes the _PyUnicode_Append function, which should be used to concatenate two Unicode strings or something like that. I don’t think we need this function here. Is there any way to monitor/trace the bytecode the interpreter executes? I want to know what’s going on inside the interpreter.
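For a first look at what the interpreter will run, the standard-library dis module shows the bytecode CPython compiles a function to. A sketch with a loop shaped like yours (exact opcode names vary between Python versions):

```python
import dis

def loop_like(n):
    # stand-in for address_write: a simple counted loop
    for _ in range(n):
        pass

# Prints one line per bytecode instruction, e.g. FOR_ITER for the loop itself
dis.dis(loop_like)
```

This only shows the static bytecode, not which instructions actually execute at runtime, but it is usually the quickest way to check whether an opcode you saw in a native trace plausibly comes from your code.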

Then check out the source code! You can get detailed traces of which functions are called, and you can go even further and use a debug build and a debugger to step through the execution.
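To go beyond static disassembly, you can also observe the opcodes actually executed with sys.settrace and the per-frame f_trace_opcodes flag (available since Python 3.7). A rough sketch that just counts executed instructions; note that opcode tracing is very slow, so only use it on small runs:

```python
import sys

def count_opcodes(func, *args):
    """Run func(*args) and return how many bytecode instructions executed."""
    n = 0

    def tracer(frame, event, arg):
        nonlocal n
        if event == "call":
            # Ask CPython to emit a trace event for every opcode in this frame
            frame.f_trace_opcodes = True
        elif event == "opcode":
            n += 1
        return tracer

    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)
    return n

def loop(k):
    x = 0
    for _ in range(k):
        x += 1
    return x

# More iterations execute more opcodes
print(count_opcodes(loop, 100))
```

Replacing the counter with a logger that records frame.f_lasti (and mapping offsets back through dis.get_instructions) would tell you exactly which opcodes run, which should help correlate against your Pin traces.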