Allow FrameType to be instantiated

A frame has a stable C API to instantiate it: PyFrameObject *PyFrame_New(PyThreadState *tstate, PyCodeObject *code, PyObject *globals, PyObject *locals). Despite this, for some reason, FrameType is not allowed to be instantiated in Python. This creates problems when trying to (for example) control an exception traceback, which is common for debuggers, among other uses. It is also inconsistent with most other internal types that are exposed to Python and available in the types module.
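To illustrate the inconsistency, here is a small sketch that should run on stock CPython (3.8 or later for code.replace). The lambda and the empty dicts are only placeholders; the point is that function and code objects can be built from Python while FrameType refuses.

```python
import types

code = (lambda: None).__code__

# Other internal types can be created directly from Python.
func = types.FunctionType(code, {})
renamed = code.replace(co_name="renamed")

# FrameType cannot: the constructor raises TypeError on stock CPython.
try:
    types.FrameType(code, {}, {})
except TypeError as exc:
    frame_error = exc
else:
    frame_error = None

print(type(func).__name__, renamed.co_name, frame_error)
```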

The only reason I can think of is because PyFrame_New requires a PyThreadState object, which is not exposed to Python. However, there are several ways around this:

  • Run PyThreadState_GetUnchecked, raising RuntimeError (or another error) if NULL is returned
  • Expose frames’ PyThreadState to the interpreter
  • Pass another frame as a parameter, and get its thread state

This should be relatively trivial to implement and I can’t think of any downsides, backwards compatibility concerns, or maintenance overhead. I can probably write a PR if one of the solutions to the thread state parameter is decided.

(I originally opened this as issue #149844, but it was closed due to being a non-trivial change)

Or, if your goal is to make a variant of a frame, the frame itself could have a replace() method that changes other aspects but keeps the same thread state?

That would definitely make more sense than passing a frame as a parameter. Imo there should also be an instantiator in addition to that, because if the goal is to manipulate tracebacks, thread state doesn’t matter.

…now what?

A core dev will have to take notice. You may be able to get more attention to this issue if you provide a pull request.

By PR, I prefer that you open a PoC on your fork and link it here. We don’t want PRs without consensus (we already have enough PRs and otherwise they sit there).

But I can tag some core devs about that @ZeroIntensity @gaogaotiantian

Seems reasonable to me.

In a method, there’s always a thread state available (see the docs on thread state attachment). The best option is to just call PyFrame_New(_PyThreadState_GET(), ...) in the tp_init/tp_new.

By PR, I prefer that you open a PoC on your fork and link it here. We don’t want PRs without consensus (we already have enough PRs and otherwise they sit there).

That’s why I didn’t open a PR for it already (I saw that on the template), but I’ve started a PoC on my fork: Anonymous941/cpython at frame-instantiation (GitHub). It’s very WIP at the moment, but it lets you instantiate frames successfully.

>>> FrameType((lambda: None).__code__, {}, {})
<frame at 0x..., file '<python-input-...>', line 1, code <lambda>>
>>> FrameType((lambda: None).__code__, {}, {}, lasti=4).f_lasti
4
>>> FrameType((lambda: None).__code__, {}, {}, lasti=1000)
ValueError: lasti out of range
>>> FrameType((lambda: None).__code__, {}, {}, lasti=-1)
ValueError: cannot instantiate incomplete frame
>>> FrameType((lambda: None).__code__, {}, {}, lasti=1)
ValueError: lasti must be a multiple of 2
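The "multiple of 2" error follows from CPython's bytecode layout: since 3.6 every instruction occupies a 2-byte unit, so instruction offsets are always even. A quick check on stock CPython (no fork needed) is sketched below; the lambda is just an illustrative code object, and the exact instructions listed vary between versions.

```python
import dis

code = (lambda: None).__code__

# Collect the byte offset of every instruction in the code object.
offsets = [instr.offset for instr in dis.get_instructions(code)]

# Offsets are even and fall inside the raw bytecode.
all_even = all(offset % 2 == 0 for offset in offsets)
in_range = all(0 <= offset < len(code.co_code) for offset in offsets)

print(offsets, all_even, in_range)
```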

Current signature (pyi format):

from types import CodeType
from typing import Any


class FrameType:
    def __new__(
        cls,
        code: CodeType,
        globals: dict[str, Any],
        locals: dict[str, Any],
        *,
        back: FrameType | None = None,
        lasti: int | None = None
    ) -> FrameType:
        """
        Create a frame object.

        The thread state is set to that of the calling frame.
        """
        ...

    # unimplemented
    def replace(
        self,
        *,
        code: CodeType | None = None,
        globals: dict[str, Any] | None = None,
        locals: dict[str, Any] | None = None,
        back: (
            FrameType | None | NULL
        ) = NULL,  # because None is significant here, it will need to be another value
        lasti: int | None = None,
        keep_thread_state: bool = True # if false, passes the current thread state
    ) -> FrameType:
        """
        Return a copy of the frame object with new values for the specified fields.
        """
        ...
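The NULL placeholder for back could be expressed in Python with a private sentinel object. The sketch below shows the pattern on a hypothetical FrameSpec dataclass (a stand-in, not a real frame, and not the PoC's code); the names _UNSET and FrameSpec are assumptions made for illustration.

```python
from dataclasses import dataclass, replace as dc_replace
from typing import Any, Optional

# Hypothetical sentinel meaning "argument not passed"; distinct from None,
# which is a meaningful value for `back` (no previous frame).
_UNSET: Any = object()


@dataclass(frozen=True)
class FrameSpec:
    """Hypothetical stand-in for a frame, used only to show the pattern."""

    back: Optional["FrameSpec"] = None
    lasti: int = 0

    def replace(self, *, back: Any = _UNSET, lasti: Optional[int] = None) -> "FrameSpec":
        changes = {}
        if back is not _UNSET:  # None is applied; only the sentinel is skipped
            changes["back"] = back
        if lasti is not None:
            changes["lasti"] = lasti
        return dc_replace(self, **changes)


outer = FrameSpec()
inner = FrameSpec(back=outer, lasti=4)
kept = inner.replace(lasti=6)       # back is left untouched
cleared = inner.replace(back=None)  # back is explicitly reset to None
```

With this approach, omitting back keeps the existing value, while passing back=None clears it.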

Right now my code raises ValueError instead of instantiating a non-ready frame (lasti parameter too low), with a unique message. Am I correct in assuming that non-ready frames are never supposed to be exposed to Python? Otherwise you can cause assertion failures with things like raise Exception().with_traceback(TracebackType(None, non_ready_frame, 0, 0)). If so, maybe it should be documented somewhere.
Also, should it be something like RuntimeError since the arguments are technically correct, but the interpreter isn’t designed to handle them?

The frame constructor has more parameters:

  • builtins
  • name
  • qualname
  • defaults
  • kwdefaults
  • closure

You should accept them as keyword-only parameters.

Not saying that I don’t like the idea, but I’m curious as to what you really mean by “controlling an exception traceback”. Usually we just want to be able to read and format exception tracebacks. If you want to make modifications, there’s nothing in the traceback library that checks whether a frame is actually of FrameType, so you’re free to construct a “frame”, or reconstruct the whole stack, with your own compatible types through duck typing.
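A minimal sketch of that duck-typing approach, assuming traceback.StackSummary.extract only reads f_code and f_globals (plus f_locals when capture_locals is set) from each frame. The fake_frame object is a hypothetical stand-in built from SimpleNamespace; a real code object is reused so the sketch does not depend on which code attributes a given Python version reads.

```python
import traceback
from types import SimpleNamespace


def function():
    pass


# Hypothetical stand-in for a frame; it is not a FrameType instance.
fake_frame = SimpleNamespace(
    f_code=function.__code__,
    f_globals={},
    f_locals={},
)

# extract() accepts (frame, lineno) pairs, like those from walk_tb/walk_stack.
lineno = function.__code__.co_firstlineno + 1
summary = traceback.StackSummary.extract([(fake_frame, lineno)])
formatted = "".join(summary.format())
print(formatted)
```

This only covers formatting; it does not help where the interpreter itself insists on real frame objects.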

The reason I didn’t include them in the signature (not that I disagree) is because they’re not accessible to Python. As far as I’m aware, other than f_builtins the inaccessible parameters are always the same as f_code, so I’m not sure why they’re stored in the frame. Should we make them accessible (especially f_builtins)?

I didn’t know that traceback lets you do that, that’s a good point. I was referring to BaseException.with_traceback, which requires a TracebackType object, which in turn requires a valid FrameType. For example, a debugger might be debugging a particular function and want the stack trace to include a part of the function in the traceback (possibly after showing the formatted traceback to the user), but still allow the exception to be caught by other parts of the code. Here’s an example that’s currently working with my PoC fork:

from types import FrameType, TracebackType


def function():
    pass


mock_frame = FrameType(code=function.__code__, globals={}, locals={}, lasti=6)
mock_traceback = TracebackType(None, mock_frame, 6, 1)

raise Exception().with_traceback(mock_traceback)

Traceback (most recent call last):
  File "example.py", line 11, in <module>
    raise Exception().with_traceback(mock_traceback)
  File "example.py", line 5, in function
    pass
Exception
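For comparison, here is a sketch of what already works on stock CPython (3.7 or later): TracebackType can be instantiated, but only around a real frame. The frame is captured with sys._getframe, which is CPython-specific, and the SimpleNamespace stand-in is hypothetical, included only to show that the constructor rejects duck-typed frames.

```python
import sys
from types import SimpleNamespace, TracebackType


def function():
    # Return this call's own frame object; it remains usable after return.
    return sys._getframe()


real_frame = function()
tb = TracebackType(None, real_frame, real_frame.f_lasti, real_frame.f_lineno)

try:
    raise Exception("demo").with_traceback(tb)
except Exception as exc:
    caught = exc

# The raise statement prepends an entry for the raising frame; ours follows.
chained = caught.__traceback__.tb_next is tb

# A duck-typed stand-in is rejected by the TracebackType constructor.
try:
    TracebackType(None, SimpleNamespace(f_code=function.__code__), 0, 1)
except TypeError:
    rejected = True
else:
    rejected = False

print(chained, rejected)
```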

To be honest, most of the uses I can think of for instantiating frames are relatively niche, and it’s mainly for consistency with other internal types, along with the fact that it’s easy to implement and doesn’t break anything.

If you were allowed to construct a generator with a specific frame, or resume a frame, this would be much more useful imo. It would, among other things, allow copying generators without a C extension. That might be an interesting idea for a separate proposal if this one gets accepted.

I’m having trouble getting the thread state of an existing PyFrameObject. When passing the thread state, it ignores it if the GIL is enabled, so I can simply pass NULL in that case. However, without the GIL, it eventually calls _PyCode_GetTLBCFast. Is there a way to get the thread state back from a code entry?

Also, there’s a mistake in my last post. f_builtins is in fact accessible from Python.