PEP 669: Low Impact Monitoring for CPython

scoder · November 11, 2023, 9:35pm

Hi, looks like I’m late to the party.

The PEP was accepted without any mention of a C-API and without a way to let users send events to listeners. This is unfortunate because it means that software that wants to mimic Python’s execution events (like Cython or other tools that execute code and want to support profiling or tracing) cannot do this. In addition, it seems that the implementation is not backwards compatible with the previous thread state behaviour, and thus events sent the old way no longer reach new-style listeners.

It’s probably too late to revive profiling and tracing support in Python 3.12, but I would like to see it re-enabled at least in 3.13.

For that, we need a C-API that allows injecting events into the system efficiently. It looks like this would require exposing the _PyInterpreterFrame, which is now hidden in pycore_frame.h. Code that wants to generate CPython-compatible events probably needs this. Overall, the event interface seems very much tied into CPython’s execution internals, which is really unfortunate since it’s part of the design. I see a couple of references to an “instruction offset” in the event arguments, which seems meaningless without byte code. Finally, supporting branching events (e.g. for coverage analysis) would also have been nice, but again, a “destination offset” is probably not easily provided.

Overall, it seems that this is yet another incarnation of the problem that CPython’s own C-API is not good enough to implement its own features.

How can we get the event sending side back to a usable state?

scoder · November 12, 2023, 8:24am

I looked some more into the details.

We probably don’t need frames. That’s great, because it’ll remove a lot of ugly complexity from Cython’s tracing/profiling code (in 5-10 years, when we drop Python 3.11 support).
We need a way to signal events. Since events have more than one signature, we might end up needing more than one C-API function for this, but we’ll see.
We need a way to map 3D source code positions (file name, line, character) to 1D integer offsets. Code objects help quite a bit, but branches might cross source file boundaries, so that’s more tricky. For most use cases, a mapping between (line, character) positions and an integer offset would probably suffice.
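As a rough illustration of the last point, a (line, character) to integer mapping could be as simple as a sorted table of known positions, with the index serving as the offset. This is only a Python sketch of the data structure, not a proposal for the C-API. The class `PositionTable` and its methods are hypothetical names.

```python
from bisect import bisect_left


class PositionTable:
    """Maps (line, column) positions to dense integer offsets and back."""

    def __init__(self, positions):
        # Deduplicate and sort so that offsets follow source order.
        self._positions = sorted(set(positions))

    def offset(self, line, column):
        """Return the integer offset of a known (line, column) position."""
        key = (line, column)
        index = bisect_left(self._positions, key)
        if index == len(self._positions) or self._positions[index] != key:
            raise KeyError(key)
        return index

    def position(self, offset):
        """Return the (line, column) position stored at an offset."""
        return self._positions[offset]


if __name__ == "__main__":
    table = PositionTable([(3, 4), (1, 0), (3, 0)])
    print(table.offset(3, 0), table.position(2))
```

Keeping one such table per code object would sidestep the file name dimension, at the cost of not covering branches that cross source files.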

I created issue #111997, “C-API for signalling monitoring events” (python/cpython on GitHub), to discuss the implementation.

nedbat · November 29, 2023, 10:53am

I am almost done updating coverage.py to use sys.monitoring for line measurement (branches are still in the future). One thing I noticed now in the API that seemed odd to me:

sys.monitoring.restart_events() → None

Enable all the events that were disabled by sys.monitoring.DISABLE for all tools.

Why does this affect all tools? Everything else is scoped to a particular tool id. It seems like a big hammer for me to restart events for everyone when I need to restart events for me.

markshannon · December 6, 2023, 11:53am

Yes, sys.monitoring.restart_events() is quite a large hammer.
It is designed for attaching a debugger and the like, where a clean start is needed.

OOI, what are you using it for?

nedbat · December 6, 2023, 12:11pm

In coverage.py for statement coverage, I am disabling line events once they have fired. But I also can stop coverage and re-start it, so I call restart_events when coverage is started to make sure I will get the correct events in the second coverage measurement.
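A minimal sketch of that pattern, assuming Python 3.12 or later. This is not coverage.py’s actual code: the names `on_line`, `target` and `demo` are made up for illustration, and the tool id is simply the predefined `COVERAGE_ID`.

```python
import sys

seen = []  # (function name, line number) pairs reported to the callback


def on_line(code, line_number):
    seen.append((code.co_name, line_number))
    # Returning DISABLE turns this location off for this tool.
    return sys.monitoring.DISABLE


def target():
    x = 1
    y = 2
    return x + y


def demo():
    """Run target() three times, restarting events before the third run."""
    mon = sys.monitoring
    tool_id = mon.COVERAGE_ID
    mon.use_tool_id(tool_id, "restart-sketch")
    mon.register_callback(tool_id, mon.events.LINE, on_line)
    mon.set_local_events(tool_id, target.__code__, mon.events.LINE)
    counts = []
    try:
        for restart_first in (False, False, True):
            if restart_first:
                # Re-enables disabled locations, for every tool.
                mon.restart_events()
            seen.clear()
            target()
            counts.append(len(seen))
    finally:
        # Undo the setup so the tool id is free again.
        mon.set_local_events(tool_id, target.__code__, 0)
        mon.register_callback(tool_id, mon.events.LINE, None)
        mon.free_tool_id(tool_id)
    return counts


if __name__ == "__main__":
    print(demo())
```

The second run should report nothing because every line was disabled during the first run, and the third run should report the same number of lines as the first. The `restart_events()` call in the middle is the part that is not scoped to a tool id.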

Soon, I will be supporting context measurement with sys.monitoring also. This lets you determine (for example) what tests covered what parts of the code. To do that, I’ll need to restart events when the test changes, so many times in a single process.

Perhaps it is fine, but it gave me pause when I saw it. I don’t know what else is using sys.monitoring, and my need to call restart_events is necessarily affecting those other tools. Is it hard to have restart_events scoped to a single tool, just as the rest of the API is?

One particularly tricky aspect of coverage.py is when it measures itself running its own test suite. I haven’t yet gotten to ensuring that works correctly with sys.monitoring, but I suspect there are some entanglements there as well.

zoranuri · May 6, 2024, 11:09am

I’ve been working with the sys.monitoring framework for the past few weeks, great stuff.

I’ve had issues using it with dynamically compiled expressions though (using the compile builtin), which are later run using eval or exec. Is monitoring expected to work on code objects created through compile?

guido · May 7, 2024, 12:06am

@zoranuri Could you show a simple example of something that doesn’t work as you expected?
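For illustration, the kind of minimal check being asked for might look like the sketch below, assuming Python 3.12 or later. It is not taken from the reporter’s code: the source string, the `"<dynamic>"` file name and the function name are made up, and `DEBUGGER_ID` is just a convenient predefined tool id.

```python
import sys

SOURCE = "a = 1\nb = a + 1\n"


def lines_seen_for_compiled_code():
    """Exec a compile()d code object and collect the LINE events it reports."""
    mon = sys.monitoring
    tool_id = mon.DEBUGGER_ID
    code = compile(SOURCE, "<dynamic>", "exec")
    seen = []

    def on_line(code_obj, line_number):
        # Only record events that belong to the dynamically compiled code.
        if code_obj is code:
            seen.append(line_number)

    mon.use_tool_id(tool_id, "compile-sketch")
    mon.register_callback(tool_id, mon.events.LINE, on_line)
    mon.set_local_events(tool_id, code, mon.events.LINE)
    try:
        exec(code, {})
    finally:
        # Undo the setup so the tool id is free again.
        mon.set_local_events(tool_id, code, 0)
        mon.register_callback(tool_id, mon.events.LINE, None)
        mon.free_tool_id(tool_id)
    return seen


if __name__ == "__main__":
    print(lines_seen_for_compiled_code())
```

Comparing the printed line numbers with the lines of `SOURCE` would show whether events reach the callback for code created through compile.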