When might `f_globals` and `f_builtins` not be a `dict`?

jeff5 · December 20, 2022, 12:42pm

The CPython bytecode interpreter takes the trouble to check that f_globals and f_builtins are both dictionaries here:
https://github.com/python/cpython/blob/ba8e30c56ba4833f572318e1cf4108d9f206d1a0/Python/ceval.c#L2996-L3004
It takes a slow path if they are not. In 3.11 these fields are behind the macros GLOBALS() and BUILTINS().

Under what circumstances is this alternative path needed? Has it any application?

In other places the interpreter seems happy to address f_globals through the PyDict_* API without an alternative path should some PyDict_Check() internal to the method then fail.
https://github.com/python/cpython/blob/ba8e30c56ba4833f572318e1cf4108d9f206d1a0/Python/ceval.c#L2907-L2916
It is difficult to see how it could get to be anything else in normal use. Even low-level constructors for the frame (_PyFrame_New_NoTrack) insist on checking. Where it comes from func_globals, that has already been checked, e.g. in PyFunction_NewWithQualName.

The interpreter provides an alternative path also in LOAD_NAME, after looking locally. This is one of the places f_globals is assumed to be dict (PyDict_GetItemWithError is used unchecked), but it allows for f_builtins not to be dict.

f_builtins seems also to be guarded at its origins. We check that the purported __builtins__ is a module. A module guards its dictionary in construction and makes __dict__ read-only so you can’t replace it.
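The read-only __dict__ claim can be checked from Python. This is a minimal sketch: the module name "demo" and the helper try_replace_dict are made up for illustration and do not come from CPython.

```python
import types

# "demo" is an arbitrary, hypothetical module name used only for this check.
mod = types.ModuleType("demo")


def try_replace_dict(module):
    """Return the exception type raised when rebinding module.__dict__, else None."""
    try:
        module.__dict__ = {}
    except Exception as exc:
        return type(exc)
    return None


outcome = try_replace_dict(mod)
print(outcome)                       # exception type raised by the assignment
print(type(mod.__dict__) is dict)    # the module's namespace is an exact dict
```

This only covers the module route to f_builtins; it says nothing about a non-module object bound to __builtins__.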

One explanation I gave myself is that the checks in the API are generally PyDict_Check, not an exact check: a sub-class of dict would pass. But the implementation of the PyDict_* methods goes straight to the built-in method. I’m pretty sure STORE_GLOBAL would ignore a sub-class definition of (say) __setitem__, even if I could inveigle my sub-class into a frame.

Interestingly, the checks on f_builtins in LOAD_GLOBAL and LOAD_NAME are PyDict_CheckExact, made before PyDict_GetItemWithError is applied, so a sub-class that overrides __getitem__ may be covered there.

I know ceval.c has had a lot of attention, so I start with the assumption everything in there that takes space or time is properly necessary.

Edit: I looked more carefully and see the check is exact in most places ceval.c does it.

kj0 · December 20, 2022, 1:34pm

Pre-existing code might monkey-patch __builtins__ with its own namespace, e.g.:

>>> class MyNameSpace:
...     def __getitem__(self, name):
...         return name
...
>>> __builtins__ = MyNameSpace()
>>> a
'a'

We can’t break backwards compatibility by not supporting them. So we need to check whether it’s a dict or something that just supports the PyObject_GetItem protocol.
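The REPL session above can also be tried as a script, by passing the custom object under the "__builtins__" key of the globals given to exec. This is a sketch under the assumption that exec picks up builtins from that key in the same way the REPL does; the names ns and result are illustrative only.

```python
class MyNameSpace:
    """Not a dict: supports item lookup only, returning the key itself."""

    def __getitem__(self, name):
        return name


# A real dict for globals, with a non-dict object standing in for builtins.
ns = {"__builtins__": MyNameSpace()}

# 'a' is found neither in locals nor in globals, so the lookup falls
# through to the builtins object and goes via its __getitem__.
exec("result = a", ns)

print(ns["result"])
```

Because the builtins object here answers every lookup, no NameError can arise inside the executed code; a namespace meant for real use would presumably raise KeyError for unknown names.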

jeff5 · December 20, 2022, 1:53pm

Ah, of course. I was thinking we would always get it from a module (and I was misreading _PyEval_BuiltinsFromGlobals).

What about f_globals?

kj0 · December 20, 2022, 4:56pm

I’m not sure. eval and exec both specify that globals must be a dictionary and not a subclass (see Built-in Functions — Python 3.11.1 documentation). They error when passing in ~a subclass~ (edit: I was wrong, they error when passing in a mapping).

However, based on Hyrum’s law, there is probably some use case somewhere that relies on this.

There might be a case for removing the check, but I would tread carefully. I haven’t given this too much thought so there might be a use case that I have overlooked.

kj0 · December 20, 2022, 6:27pm

Never mind, dict subclasses work, just not custom classes that implement the mapping protocol:

class SubClass(dict): pass


eval("1+1", SubClass())

So we can’t break this due to backwards compatibility.
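The contrast between a dict subclass and a plain mapping can be seen side by side. In this sketch, collections.UserDict stands in for "a custom class that implements the mapping protocol", and try_eval is a hypothetical helper introduced only to capture the outcome.

```python
from collections import UserDict


class SubClass(dict):
    pass


def try_eval(globals_obj):
    """Evaluate "1 + 1" with the given globals; return the value or the TypeError."""
    try:
        return eval("1 + 1", globals_obj)
    except TypeError as exc:
        return exc


sub_result = try_eval(SubClass())       # dict subclass: accepted
mapping_result = try_eval(UserDict())   # mapping that is not a dict: rejected

print(sub_result)
print(repr(mapping_result))
```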

jeff5 · December 20, 2022, 6:44pm

Thanks for considering this.

What I now think is that this double test is necessary because of the casts that follow:

            if (PyDict_CheckExact(GLOBALS())
                && PyDict_CheckExact(BUILTINS()))
            {
                v = _PyDict_LoadGlobal((PyDictObject *)GLOBALS(),
                                       (PyDictObject *)BUILTINS(),
                                       name);

but that in the slow path PyDict_GetItemWithError would have been safe:

            else {
                /* Slow-path if globals or builtins is not a dict */


                /* namespace 1: globals */
                v = PyObject_GetItem(GLOBALS(), name);
                if (v == NULL) {

The PyDict_* API is used in the very similar circumstances of LOAD_NAME, and in a couple of other places, for which f_globals has to be a dict or a sub-class.

Maybe the checks correspond to design assumptions made in _PyDict_LoadGlobal. Internally it calls _Py_dict_lookup without further checks, but most public API that relies on it makes only the inexact check, so an inexact check would have been consistent here. It seems to be accepted, for better or worse, that when the core handles objects, it may use the type-specific API and you won’t necessarily get the sub-class behaviours.

I’m not proposing a change to ceval.c, however. I’m trying to reproduce the interpreter in Java and it helps with both correctness and efficiency if I can strongly-type variables that may be PyObject * in CPython, but are in practice guaranteed to be something specific. I think I can for f_globals, but not for f_locals and f_builtins.

jeff5 · December 20, 2022, 6:58pm

Sorry, we crossed posts.

For my purposes a sub-class of dict is a dict, and I can type it as such in my implementation.

LOAD_GLOBAL differs from LOAD_NAME, STORE_GLOBAL and DELETE_GLOBAL in reverting to PyObject_GetItem on encountering a sub-class, where the others use PyDict_* methods. It’s not an obstacle for this question.
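A way to observe that difference from Python, as a sketch rather than a definitive test: Tracing is a hypothetical dict subclass that records which keys pass through its overridden __getitem__ and __setitem__. The expectation, based on the ceval.c code quoted in this thread, is that LOAD_GLOBAL reaches the override while STORE_GLOBAL does not.

```python
class Tracing(dict):
    """dict subclass that records the keys seen by its overridden methods."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)  # initial items bypass __setitem__
        self.reads = []
        self.writes = []

    def __getitem__(self, key):
        self.reads.append(key)
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        self.writes.append(key)
        super().__setitem__(key, value)


SOURCE = """
def f():
    global y
    y = x + 1   # LOAD_GLOBAL x, then STORE_GLOBAL y
f()
"""

g = Tracing(x=41)
exec(SOURCE, g)

# Read the stored value without going through the tracing override.
y_value = dict.__getitem__(g, "y")
print(y_value)
print("x" in g.reads)    # was the read of x routed through __getitem__?
print("y" in g.writes)   # was the write of y routed through __setitem__?
```

Module-level names such as f go through LOAD_NAME and STORE_NAME against the locals, which here is the same object as the globals, so they may also show up in the recorded lists. Only the entries for x and y bear on the LOAD_GLOBAL and STORE_GLOBAL comparison.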
