How to dynamically instrument Python code?

Hi, I am starting a project that aims to detect actual or potential race conditions in multithreaded Python code. To do that, I need to instrument, at least in a simple way, the load and store operations that Python performs and, where needed, collect symbol information, which a detector of my choice will then analyze.

I’m trying to figure out a way to dynamically instrument Python code for that purpose. As far as I understand, this would have to be done either at the interpreter level or just before it, at the bytecode level. So my question is: which path is more practical, and are there any tools or modules complete enough to perform this kind of instrumentation? If I did this at the interpreter level, it’s not clear to me how much effort it would take (i.e. would I have to heavily modify the interpreter source code?). Otherwise, if I change the bytecode, is there a convenient interface for doing that?
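To make the bytecode option concrete, here is a minimal sketch that only inspects bytecode with the standard `dis` module; it does not rewrite anything. The names `counter`, `increment` and `shared_accesses` are made up for illustration, and the set of opcode names is an assumption about which accesses matter for shared state. Opcode names can differ between CPython versions, so the set may need adjusting.

```python
import dis

counter = 0  # hypothetical shared state


def increment():
    global counter
    counter += 1


# Opcodes that read or write names which may be shared between threads.
# This selection is an assumption; local-variable opcodes are left out on purpose.
SHARED_OPS = {
    "LOAD_GLOBAL", "STORE_GLOBAL",
    "LOAD_NAME", "STORE_NAME",
    "LOAD_ATTR", "STORE_ATTR",
    "LOAD_DEREF", "STORE_DEREF",
}


def shared_accesses(func):
    """Return (opname, name) pairs for loads/stores of non-local names in func."""
    return [
        (ins.opname, ins.argval)
        for ins in dis.get_instructions(func)
        if ins.opname in SHARED_OPS
    ]


for opname, name in shared_accesses(increment):
    print(opname, name)
```

A listing like this shows where a bytecode-level tool would have to hook in: each reported instruction is a candidate point for recording an access to the named symbol.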

Another approach I have in mind, which is closer to static instrumentation, is to alter the AST using the ‘ast’ module. However, I’d like to look for a lower-level approach first and study the options from there.
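For comparison, a rough sketch of the AST route could look like the following. It rewrites a small source string so that every assignment to a plain name is followed by a call to a recording function. `SOURCE`, `_record`, `log` and `StoreLogger` are hypothetical names, and only `ast.Assign` nodes are handled; loads, attribute stores, augmented assignments and loop targets would need their own visitor methods.

```python
import ast

# Hypothetical code under test.
SOURCE = """
total = 0
for i in range(3):
    total = total + i
"""

log = []  # filled at run time with (kind, name, line) tuples


def _record(kind, name, lineno):
    """Hypothetical hook; a race detector would consume these events."""
    log.append((kind, name, lineno))


class StoreLogger(ast.NodeTransformer):
    """Insert a _record(...) call after each assignment to a plain name."""

    def visit_Assign(self, node):
        self.generic_visit(node)
        calls = []
        for target in node.targets:
            if isinstance(target, ast.Name):
                call = ast.Expr(value=ast.Call(
                    func=ast.Name(id="_record", ctx=ast.Load()),
                    args=[
                        ast.Constant(value="store"),
                        ast.Constant(value=target.id),
                        ast.Constant(value=node.lineno),
                    ],
                    keywords=[],
                ))
                calls.append(ast.copy_location(call, node))
        # Returning a list replaces the statement with several statements.
        return [node, *calls]


tree = StoreLogger().visit(ast.parse(SOURCE))
ast.fix_missing_locations(tree)
exec(compile(tree, "<instrumented>", "exec"), {"_record": _record})

for event in log:
    print(event)
```

Because the transformation works on source-level nodes, it does not depend on the bytecode emitted by a particular interpreter version, but it also sees only what the syntax tree exposes.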

Howdy Mohamad,

I am not sure it is a good idea to go below the AST, since the bytecode depends on the Python version, as far as I know…

But I am not an expert in this topic, so just take this as a hint about what to consider :slight_smile:

Cheers, Dominik