PEP 669: Low Impact Monitoring for CPython

nedbat · September 4, 2022, 7:29pm

This is a really interesting PEP. I apologize for taking so long to get to it. I have a number of comments/questions…

1. In the list of events, what is the logic for when they are named PY_* vs C_* vs no prefix? Why isn’t LINE called PY_LINE? Why isn’t PY_START called PY_CALL? I’m assuming there’s a reason, but it seems asymmetric on a first reading.
2. It took me a while to understand sys.monitoring.use_tool_id(id, name:str) -> None. Perhaps we can have fleshed-out docstrings for these functions. IIUC, use_tool_id means I want to claim an id, and I am associating name with it. I’m not sure what use will be made of name, though.
3. The pre-defined ids make some presumptions about the composability of tools. For example, they assume that I can’t coverage-measure a coverage tool. That is difficult, and coverage.py uses some tricks to accomplish it, but it’s valuable. Since there’s no enforcement of the idea that only one tool of each kind can be running at once, I suppose everything is fine, but I wonder if this idea will appear in other places with real consequences?
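To make points 2 and 3 concrete, here is a minimal sketch of claiming a tool id, assuming the sys.monitoring API as it appears in CPython 3.12 (the draft under discussion may differ). The helper name claim_free_tool_id is hypothetical; it sidesteps the pre-defined ids by taking whichever id is unused.

```python
import sys

def claim_free_tool_id(name):
    """Claim the first unused tool id and return it, or None if none is available.

    Hypothetical helper for illustration; assumes the CPython 3.12 sys.monitoring API.
    """
    mon = getattr(sys, "monitoring", None)
    if mon is None:
        # This interpreter predates sys.monitoring.
        return None
    # CPython 3.12 accepts tool ids 0 through 5. Names such as
    # mon.COVERAGE_ID are conventions for picking one, not enforced roles.
    for tool_id in range(6):
        if mon.get_tool(tool_id) is None:
            # Associates `name` with the id; raises ValueError if the id is taken.
            mon.use_tool_id(tool_id, name)
            return tool_id
    return None
```

In this version of the API the name can be read back with sys.monitoring.get_tool(tool_id), and the id is released with sys.monitoring.free_tool_id(tool_id).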
4. I should know what this means, but I don’t:

You won’t be able to register a C function pointer, but you can implement the PEP 590 vectorcall interface on the callable, for performance close to that of a raw function pointer.

Coverage.py uses C-implemented trace functions now. Will that still be possible?
5. This sentence could use some clarification:

If a callback function returns DISABLE, then that function will no longer be called for that (code, instruction_offset) until sys.monitoring.restart_events() is called.

5a. LINE takes (code, line_number) rather than (code, instruction_offset); I assume DISABLE will apply to those arguments also.

5b. The BRANCH event is called with (code, instruction_offset, destination_offset). If I return DISABLE, does that disable all (code, instruction_offset) events, or only those with the same three arguments (code, instruction_offset, destination_offset)? It won’t be useful to disable unless it’s the latter.
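As an illustration of the DISABLE behavior being asked about in 5a, the following sketch counts LINE events for one function, with and without returning DISABLE from the callback. It assumes the sys.monitoring API as it appears in CPython 3.12 (set_local_events, register_callback, restart_events); the names count_line_events and loop are hypothetical, and it does not settle the BRANCH question in 5b.

```python
import sys

def count_line_events(func, disable):
    """Run func() and return {line_number: LINE event count}, or None if unsupported.

    Hypothetical helper; assumes the CPython 3.12 sys.monitoring API.
    """
    mon = getattr(sys, "monitoring", None)
    if mon is None:
        return None
    # Pick any unused tool id so the sketch does not collide with a running tool.
    tool_id = next((i for i in range(6) if mon.get_tool(i) is None), None)
    if tool_id is None:
        return None

    counts = {}

    def on_line(code, line_number):
        counts[line_number] = counts.get(line_number, 0) + 1
        # Returning DISABLE asks not to be called again for this location
        # until restart_events() is called.
        return mon.DISABLE if disable else None

    mon.use_tool_id(tool_id, "disable-demo")
    try:
        mon.restart_events()  # re-enable anything disabled by an earlier run
        mon.register_callback(tool_id, mon.events.LINE, on_line)
        # Monitor only func's own code object, not the whole process.
        mon.set_local_events(tool_id, func.__code__, mon.events.LINE)
        func()
    finally:
        mon.set_local_events(tool_id, func.__code__, mon.events.NO_EVENTS)
        mon.register_callback(tool_id, mon.events.LINE, None)
        mon.free_tool_id(tool_id)
    return counts

def loop():
    total = 0
    for i in range(3):
        total += i
    return total
```

Comparing count_line_events(loop, disable=False) with count_line_events(loop, disable=True) shows the loop body line reported on every iteration in the first case and only once in the second, which is the property a line-coverage tool would rely on.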
6. In the Coverage Tools section:

Coverage tools need to track which parts of the control graph have been executed. To do this, they need to register for the PY_ events, plus JUMP and BRANCH.

I don’t understand why I would need the JUMP event? Maybe I’m not understanding the full implications of the events. Coverage.py watches line numbers being executed, and today tracks branches by remembering the previous line when a line is executed, and tracking the (previous, current) pairs of line numbers.

Perhaps JUMP and BRANCH make sense for instruction-offset-based tracing rather than line-based tracing? I’m happy to talk more about the way coverage.py currently works, and how it might work in the future.
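For readers unfamiliar with the (previous, current) approach, here is a much-simplified sketch of line-pair tracking using sys.settrace. It is not coverage.py's actual implementation; collect_arcs and sign are hypothetical names, and the sketch assumes a single non-recursive function.

```python
import sys

def collect_arcs(func, *args, **kwargs):
    """Call func and return the set of (previous_line, current_line) pairs seen in it."""
    arcs = set()
    target = func.__code__
    prev = None  # last line executed in the target function

    def tracer(frame, event, arg):
        nonlocal prev
        if frame.f_code is not target:
            return None  # do not trace other functions
        if event == "call":
            prev = None  # entering the function: no previous line yet
        elif event == "line":
            if prev is not None:
                arcs.add((prev, frame.f_lineno))
            prev = frame.f_lineno
        return tracer

    old = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args, **kwargs)
    finally:
        sys.settrace(old)  # restore whatever tracer was installed before
    return arcs

def sign(x):
    if x < 0:
        return "neg"
    return "non-neg"
```

Calling collect_arcs(sign, -1) and collect_arcs(sign, 1) yields different pairs starting from the `if` line, which is how a branch can be detected from line events alone, without JUMP or BRANCH.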