Hello all, I’m looking into the instructions executed by the Python interpreter. I wrote a simple program with a loop and used Intel Pin to analyze the instructions executed, and I found something interesting.
Here is my code:
##### simple_mem.py ####
import sys
import ctypes as ct
import gc

gc.disable()
# from ctypes import c_int
addr = ct.c_int(0)
# temp_list = []

def address_write():
    for _ in range(int(sys.argv[1])):
        addr.value = 1
        # temp_list.append(addr)

address_write()
# to execute this script
python3 simple_mem.py 500
For a single run, I found that when the iteration count is larger than 256, more binary (x86_64) instructions are executed. More details inside.
For multiple runs, the number of instructions executed can differ between runs. The main reason is the function PyMem_Realloc. More details inside.
Does anyone know the reason behind this? I’m curious about these results.
As an optimisation, Python interns small integer values.
I am not sure of the full set of ints that are interned, but int values up to 256 are, and those are what affect your test.
If you run the loop with ints from range(1000, 1000 + int(sys.argv[1])) you should see no difference any more.
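For example, a variant of your script along these lines keeps every loop value outside the small-int cache (simple_mem_big.py is just a name I made up; untested sketch):

##### simple_mem_big.py ####
import sys
import ctypes as ct
import gc

gc.disable()
addr = ct.c_int(0)

def address_write():
    # CPython caches small ints (roughly -5..256); starting at 1000
    # keeps every value assigned in the loop outside that cache
    for i in range(1000, 1000 + int(sys.argv[1])):
        addr.value = i

address_write()

# to execute this script
# python3 simple_mem_big.py 500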
I agree that small-valued numbers (< 256) are interned. But during the loop, the Python interpreter executes the _PyUnicode_Append function, which should be used to concatenate two unicode strings or something like that. I don’t think we need this function here. Is there any way to monitor/trace the Python bytecode being executed? I want to know what’s going on inside the interpreter.
Then check out the source code! You have detailed traces of what functions are called, and you can even go further and use a debug build and a debugger to step through the execution.
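If you only want to see which bytecodes run, you don’t necessarily need a debug build. A rough sketch using the standard dis and sys modules (the tracer function and address_write below are my own example names; f_trace_opcodes needs Python 3.7+):

import dis
import sys

def address_write(n):
    x = 0
    for _ in range(n):
        x = 1

# Static view: disassemble the function's bytecode
dis.dis(address_write)

# Dynamic view: print each opcode as it executes
def tracer(frame, event, arg):
    frame.f_trace_opcodes = True          # request per-opcode events for this frame
    if event == "opcode":
        op = frame.f_code.co_code[frame.f_lasti]
        print(frame.f_lasti, dis.opname[op])
    return tracer

sys.settrace(tracer)
address_write(3)
sys.settrace(None)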