PEP 669: Low Impact Monitoring for CPython

FWIW, I think coverage.py and CPython have progressed together to be able to efficiently measure branch coverage: Faster branch coverage measurement | Ned Batchelder

We need you to try it out though!


Is there a fundamental reason why an event such as PY_UNWIND is not available as a local event? The PY_RETURN event is somewhat incomplete as it doesn’t allow trapping the more general event of exiting from a function (be it with a result or with an exception). Having PY_UNWIND as a local event would make this more complete. For a more concrete use-case, one might be interested in “wrapping” around a specific function to react to when the function is entered (to report e.g. invocation arguments) and then exited (to grab either the return value or any exceptions that might have been thrown, plus the state of the frame locals).

Unwinding is part of exception handling, not normal execution. So, like RAISE, it doesn’t have a specific location in the code where it occurs.
Local events are tied to specific locations in the code, so can be instrumented.

It would be possible to generate an unwind point for functions, essentially wrapping all functions in a try-except, but it would bulk out the code object, slow down exception handling in the normal case and complicate the compiler.

These all feel like weak arguments to me

Unwinding is part of exception handling, not normal execution. So, like RAISE, it doesn’t have a specific location in the code where it occurs.

Unwinding involves the step of exiting from the current function/frame. Like when a new frame is created and pushed to the stack, its popping is also a well-defined event within the runtime. The fact that PY_START has a location within the code object to instrument is a mere accident of the implementation. It could have been implemented without any instrumentation by emitting an event as soon as the frame is created/executed.

It would be possible to generate an unwind point for functions, essentially wrapping all functions in a try-except, but it would bulk out the code object, slow down exception handling in the normal case and complicate the compiler.

I don’t think this is needed. I have drafted Make PY_UNWIND available as a local event by P403n1x87 · Pull Request #142179 · python/cpython · GitHub where I add a “local check” to monitor_unwind.

These all feel like weak arguments to me

You asked why PY_UNWIND is not available as a local event. I told you why.
I’m not arguing whether it should be a local event or not. I’m merely stating why it is that way.

The PEP was accepted over three years ago, so it is a bit late to change the PEP.

If you want to make non-trivial changes or improvements to the implementation, then opening an issue (not a PR first) on CPython is the best way to do it. As long as changes are backwards compatible we’re open to improvements. e.g. Add BRANCH_TAKEN and BRANCH_NOT_TAKEN events to sys.monitoring · Issue #122548 · python/cpython · GitHub


Ok, thanks for clarifying. I was just trying to state the case that PY_UNWIND makes sense as a local event and gauge whether there would be any push-back in making such a change.

First, let me apologize for not seeing this or participating in this sooner.

(On the other hand, I’ve been working on Python debuggers for 15 years, and other debuggers even longer than that, so it would have been nice to have gotten a ping or a mention about this at some point.)

I’ve been trying to update the trepan3k debugger for this new protocol. While PEP 669 is a great improvement over the 30 or so years before, I am still finding some serious challenges that I’d like to address.

For concreteness, the GitHub project I am using for this is the sys-monitoring-rewrite branch of pytracer. Right now, I have a few test programs that use the code at sys-montoring-rewrite/examples. [As a new user, I am not allowed to add more than two links in discussions.]

Let me state at the outset that what I have always strived for in the debuggers I’ve written is precision and transparency.

As an example of precision, these debuggers do not let you set a breakpoint at a line number in a file for which there can be no line number in the code. At some point after I had this in my debugger, pdb added a similar check (and probably still implements it) as a hack, by pattern-matching the text of the line with regular expressions to rule out things like blank lines and comments. In contrast, my debuggers have always done this by consulting the line number table of the code object found for the appropriate object indicated (often a file path). Going forward, I expect that users will be able to set breakpoints by specifying a code offset, line/column, or column span range.

As an example of transparency, see the section called about-debugging-overhead in the readthedocs document describing the debugger overhead problem. (Again, I am not allowed to give a direct link.)

Given this long introduction, now to the problem.

I am trying to implement in a fast, robust, and reliable way basic debugger primitives “step into” (gdb “step”), “step over” (gdb “next”), and “step out” (gdb “finish”). I don’t see an easy way to do this given the current interface.

To be fast and efficient, what you’d want to use is sys.monitoring.set_local_events. Inside the callback, local events are set on the code object of the current frame, based on the kind of stepping the user has asked for. It feels like the local events should be set per frame, so it would be great to have a version of set_local_events that works on a frame as well.

To implement “step in”, what I currently do is trap on a CALL event and set the local events mask for the code of the new frame there in the CALL event callback. And then, when a code-leaving event occurs, I need to clear the code events if there are no breakpoints in that code. (Breakpoint discussion will be coming up soon.)

The problem with this is that a frame can be exited via an exception. When that happens, there is no RETURN event. Furthermore, events like PY_UNWIND are not local events. Therefore, to catch those kinds of things, it has to be done as a global event, which, of course, defeats the efficiency benefits of local event handling.

Another problem with setting local events on code objects occurs when there is threading or recursion. In one thread I may want to “step out” of function foo, while in another thread I want to “step in” or “step over”. The same kind of problem occurs in a recursive call: in one frame of function foo I want to “step in”, but in another frame I want to “step out”.

These problems can be worked around by having a side table which records an events mask per frame (which is more stringent than events mask per thread and code object).
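Such a side table could look roughly like the sketch below. The class name, the mode constants and the recursive demo function are all hypothetical. In a real sys.monitoring callback the frame would presumably be obtained with sys._getframe(1), since the callback only receives the code object.

```python
import sys

STEP_IN, STEP_OVER, STEP_OUT = "step-in", "step-over", "step-out"

class FrameStepTable:
    """Per-frame stepping modes, kept outside the shared code object."""

    def __init__(self):
        # Keys are frame objects, so entries must be discarded when the
        # frame exits or the table will keep those frames alive.
        self._modes = {}

    def set_mode(self, frame, mode):
        self._modes[frame] = mode

    def discard(self, frame):
        self._modes.pop(frame, None)

    def should_stop_on_line(self, frame):
        # Only frames being stepped through line by line stop on LINE events.
        return self._modes.get(frame) in (STEP_IN, STEP_OVER)

    def __len__(self):
        return len(self._modes)

table = FrameStepTable()
decisions = []

def foo(depth):
    # Two live frames share foo's code object but want different behaviour.
    frame = sys._getframe()
    table.set_mode(frame, STEP_OUT if depth == 0 else STEP_IN)
    if depth < 1:
        foo(depth + 1)
    decisions.append((depth, table.should_stop_on_line(frame)))
    table.discard(frame)

foo(0)
```

The local events on foo's code object still fire for both frames; the table only lets the callback decide quickly which of them should actually stop.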

Again, right now this would be doable (if a little cumbersome) were it not for the problem of events which leave frames and are non-local.

But when combined with handling breakpoints, things get even more complicated.

Right now, the only way I see to handle breakpoints in code is to set event tracing on the code object it is in. In the callback, trigger on only those matching a breakpoint at a particular location. Again, we have some slowness here. This could be removed by adding a “BREAKPOINT” or “TRACE” opcode/instruction. For this, a mechanism would need to be added to allow this instruction to overwrite an instruction at any offset in the bytecode. The replaced instruction would need to be saved in a side table somewhere so that the replaced opcode could be run on “BREAKPOINT” or “TRACE” return.

However, right now this kind of thing does not exist. So essentially any code that has a breakpoint in it has to be treated like step in/over. This means that “step out” has to be slower.

I think I have figured out how I can implement “step into” so that it can recover when an exception in the called code bypasses the return-event handling needed to clear out the local code events that were set.

Inside a call, I can keep, on the side, a forward pointer from a calling frame to the frame it last called. This would normally be removed on a return event. But if the return event is skipped and a frame is stale, that forward pointer to the frame that was called will still be around. And then the local events on the code object for that frame can be cleared.

All local events will need to add a check for stale frames.

The debugger code is full of this kind of inelegance.