PEP 667 proposes replacing the dict returned by frame.f_locals with a write-through proxy.
The locals() function will still return a dict, but for functions it will be a snapshot of frame.f_locals.
This will fix a number of bugs and improve consistency.
I’d like to get this implemented for 3.13, so I’m soliciting feedback before I submit the PEP to the steering council.
+1, this would be a good improvement, particularly for debuggers/stepping tools.
You call it a “write-through proxy”, but it appears to also be a read-through proxy (looking at the Implementation section)? I’d prefer it to proxy both ways (caveat below), but I don’t see where the snapshot comes into it? You can always dict(locals()) to take a snapshot, no?
I’m not sure how much benefit it would be to cache the instance, but it seems we could break the cycle on exit from the frame and have it deallocate at that point? Perhaps that’s the time to also take a snapshot if there are still outstanding references to the proxy, so then we can free the frame but keep the last set of values in it. I could also see us just raising a warning at function exit if the proxy is still in use - the typical scenarios for this are to access/modify locals by name in the current function or something it calls, not to return them out (and again, dict(locals()) should solve the warning).
(It also seems like the cycle would exist as soon as you do x = locals() even without caching, but I assume you have a plan for that.)
Currently, outside of a debugger, writing to the locals() dict doesn’t modify the underlying frame. Since there is code that writes to locals(), we need locals() to be a snapshot to avoid strange errors.
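To make that concrete, here is a minimal sketch (the function name demo is made up for illustration) showing that a write to the locals() dict inside a function leaves the real variable alone. It assumes CPython with no trace function active.

```python
def demo():
    x = 1
    snapshot = locals()    # a plain dict describing the locals at this point
    snapshot["x"] = 2      # changes only the dict, not the variable x
    return x, snapshot["x"]


result = demo()
print(result)  # (1, 2)
```

Code that relies on this today would start mutating its own variables if locals() became a write-through proxy, which is the "strange errors" case.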
We need to cache the proxy only for the result of C calls to PyEval_GetLocals() which unfortunately returns a borrowed reference. Once the deprecation period of PyEval_GetLocals is over and it is removed, then we won’t need to cache the proxy.
Storing a reference to the proxy in a local variable will create a cycle, much like storing an exception in a local variable. There’s nothing that can be done about it.
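For anyone unfamiliar with the exception analogy, a rough sketch of that cycle follows. The names Payload, make_cycle and observe are hypothetical, and the behaviour shown assumes CPython's reference counting plus cyclic garbage collector.

```python
import gc
import weakref


class Payload:
    """Hypothetical object whose lifetime we watch."""


def make_cycle():
    payload = Payload()
    watcher = weakref.ref(payload)
    try:
        raise ValueError("boom")
    except ValueError as exc:
        # saved -> exc.__traceback__ -> this frame -> its locals -> saved
        saved = exc
    return watcher


def observe():
    was_enabled = gc.isenabled()
    gc.disable()  # keep the cyclic collector from running in between
    try:
        watcher = make_cycle()
        alive_before = watcher() is not None  # the cycle keeps the frame's locals alive
        gc.collect()                          # only the cycle collector frees them
        alive_after = watcher() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_before, alive_after


alive_before, alive_after = observe()
```

Storing the f_locals proxy in a local would tie the frame to itself in the same way, so the frame's locals would only be released by the cycle collector.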
Okay, so you’re calling it a write-through proxy because it’s already a read-through one, and that’s the change? Makes sense, though wouldn’t hurt to be explicit.
IIRC the high-level view here is that locals() (in a function) must keep making a copy because it’s always done that and people rely on it (and also on it being a dict, which is documented). Additionally, locals() in a function will also include any nonlocals (closure variables) referenced by the function.
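A small illustration of the closure-variable point, with made-up names (outer, inner, captured, unused); only names that the inner function actually references show up:

```python
def outer():
    captured = "outer value"
    unused = "never referenced by inner"

    def inner():
        own = len(captured)   # referencing captured makes it a closure variable
        return locals()

    return inner()


names = sorted(outer())
print(names)  # ['captured', 'own']
```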
OTOH we have more freedom for frame.f_locals, which has always had rather nebulous semantics and for most intents and purposes already felt like a full proxy (except when playing with threads, as described in CPython issue #74929, “Local variable assignment is broken when combined with threads + tracing + closures”). The plan is to make this a better proxy, whose effects are immediate (i.e. the actual “fast local” in the frame is modified before __setitem__ returns, and __getitem__ looks in the actual fast local).
Again, only in functions – in class and module scopes it’s the actual locals dict.
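For contrast with the function case, a quick sketch of class scope, where locals() is the real namespace and writes stick. The class name Settings and its attributes are invented for the example, and it assumes the default metaclass (no custom __prepare__).

```python
class Settings:
    locals()["timeout"] = 30                   # lands directly in the class namespace
    namespace_is_dict = type(locals()) is dict


print(Settings.timeout)            # 30
print(Settings.namespace_is_dict)  # True
```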
There’s another wrinkle, which is that historically, debuggers have allowed users to set “locals” in a frame that didn’t exist (i.e., for which no space in the underlying array of fast locals is allocated). In order to keep supporting this, the f_locals proxy will keep those in a separate “extra” dict, which should be allocated as needed.
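A toy model of that design, in pure Python, might look like the sketch below. ToyFrame and ToyLocalsProxy are hypothetical names, not the real implementation; the real proxy works on the C-level frame and also has to handle unbound locals, cells and the rest of the mapping API, all of which this ignores.

```python
class ToyFrame:
    """Hypothetical stand-in for a frame: a fixed array of fast-local slots."""

    def __init__(self, names, values):
        self.names = list(names)    # variable names known at compile time
        self.slots = list(values)   # one slot per name
        self.extras = None          # dict for other names, allocated on demand


class ToyLocalsProxy:
    """Reads and writes go straight to the frame; nothing is cached here."""

    def __init__(self, frame):
        self._frame = frame

    def __getitem__(self, name):
        frame = self._frame
        if name in frame.names:
            return frame.slots[frame.names.index(name)]
        if frame.extras is not None and name in frame.extras:
            return frame.extras[name]
        raise KeyError(name)

    def __setitem__(self, name, value):
        frame = self._frame
        if name in frame.names:
            # Real local: the slot is updated before __setitem__ returns.
            frame.slots[frame.names.index(name)] = value
        else:
            # No slot exists for this name: keep it in the "extra" dict.
            if frame.extras is None:
                frame.extras = {}
            frame.extras[name] = value


frame = ToyFrame(["x"], [1])
proxy = ToyLocalsProxy(frame)
proxy["x"] = 2   # write-through to the slot
proxy["y"] = 3   # stored in the lazily created extras dict
```

After these two writes, frame.slots holds the new value of x and frame.extras holds y, while the proxy itself carries no state beyond the frame reference.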
Let’s also not forget that there’s an alternative PEP, PEP 558. Historically the proposal there was rather different, though more recently it has evolved to be fairly similar. A possibly biased comparison is included in PEP 667.
Finally, @gaogaotiantian has written a prototype implementation for PEP 667, so if you’re interested you can try it out. (PEP 558 also has an implementation, linked from its Implementation section.)
Having transparent access to the locals is critical to the debugger. Currently there is more than one bug in pdb where you’ll lose your local variable changes if you do something that reads f_locals (and sometimes it’s unintentional).
The worst part is that there’s no way for the debugger to ensure that a variable change is kept, because the user can run arbitrary code which could happen to just read f_locals and erase all the local changes before we can “convert” locals to fast.
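The effect can be reproduced without pdb. The sketch below is illustrative only (demo and HAS_PEP_667 are made-up names) and assumes CPython; the version check reflects my understanding that PEP 667 semantics are available from 3.13 onwards.

```python
import sys


def demo():
    x = 1
    frame = sys._getframe()
    frame.f_locals["x"] = 2            # try to change the local through the frame
    seen_again = frame.f_locals["x"]   # a second, apparently harmless, read
    return x, seen_again


result = demo()
HAS_PEP_667 = sys.version_info >= (3, 13)
# With the old dict-based f_locals, the second read refreshes the dict from
# the fast locals and the write is lost; with a write-through proxy it sticks.
print(result, HAS_PEP_667)
```

On the old behaviour both values come back as 1; with the proxy both are 2.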
As far as I can tell, even though PEP 558 makes the semantics of locals() clear, it does not solve the problem for the debugger - it still can’t reliably change local variables for the users.
Section C-API → PyFrame_FastToLocals, etc. recommends replacing PyFrame_FastToLocals with PyFrame_GetLocals. However, the section above (C-API → PyEval_GetLocals) advises against using PyFrame_GetLocals.
Probably, the example:
PyObject *locals = PyFrame_GetLocals(frame);
if (frame == NULL)
    goto error_handler;
should be this instead:
PyObject *locals = PyEval_GetFrameLocals(frame);
if (frame == NULL)
    goto error_handler;
and a phrase:
should be modified to call PyFrame_GetLocals() instead
should be replaced with:
should be modified to call PyEval_GetFrameLocals() instead
If the expected maintenance overhead and security risk of the deprecated behavior is small (e.g. an old function is reimplemented in terms of a new, more general one), it can stay indefinitely (or until the situation changes).
Of course, that would be overridden if the SC accepts PEP 667 as it is. But I don’t see a reason to choose the minimum deprecation period.
Similar for PyEval_GetLocals & co. Using those can lead to bugs, but not necessarily. It creates a reference cycle, but IMO that’s not worth breaking code that works.
It does presumably need a pointer per frame to cache the result. That might be a good reason to get rid of it eventually, but why not something like 3.18? From PEP 387 again:
If the deprecated feature is replaced by a new one, it should generally be removed only after the last Python version without the new feature reaches end of support.
Calls to [PyFrame_FastToLocals, etc.] are no longer required. C code that directly accesses the f_locals field of a frame should be modified to call PyFrame_GetLocals() instead: [example]
This reads like a new change, but it’s already the case since Python 3.11. Consider linking to the What’s New entry instead of giving an example.
No special reason. I’ll remove the removal from the PEP.
Similar for PyEval_GetLocals & co.
PyEval_GetLocals is a special case as it has a cost to support it (we need an extra field in the frame object) and it is unsafe because it returns a borrowed reference.
So we do want to remove it. Let’s say that it might be removed in 3.15, that way we can remove if we really need to, or wait for 3.18 if it isn’t a burden.
FWIW I personally read the PEP’s conservative deprecation of the existing borrowing C APIs as okay in large part because it was fuzzy about when. Even if it says “3.18 at the latest”, I read anything so far out to honestly mean “if we still find reason not to by then, we’ll work through it and delay further as appropriate”, because that’s what I would expect any release manager at the time to ask for.
If you remove the removal from the PEP as suggested in your message above, I guess that removes the need to read between the lines.
C API WG says the PEP is fine.
There are some details of the API surface that aren’t explicit in the PEP that we want to get right, but they don’t need to be written in the PEP nor block the implementation.
Personally, I still recommend changing this bit to prevent confusion:
Calls to [PyFrame_FastToLocals, etc.] are no longer required. C code that directly accesses the f_locals field of a frame should be modified to call PyFrame_GetLocals() instead: [example]
This reads like a new change, but it’s already the case since Python 3.11. Consider linking to the What’s New entry instead of giving an example.
That said, I think PEP 667 does need an update to explicitly state how the Python level documentation for the locals() builtin will change in 3.13+ when the PEP is accepted and implemented (the current version of the PEP 558 documentation proposal applies to both PEPs, but the PEP 667 text doesn’t explicitly state that).
(Edited to add: I also just noticed that the PEP 667 comparison to PEP 558 is describing the version that existed prior to the 2021 semantic convergence, where 558 really did keep a separate snapshot of the local variables so it could make use of the regular builtin dictionary mapping API helper classes. I eventually accepted that PEP 667 was correct to assert that this implicit caching behaviour was impossible to reason about reliably, hence the convergence on PEP 667’s proposed Python level semantics.)
I don’t see why the PEP needs to include what the documentation should say.
Obviously the docs need updating, and should reflect the changes to the specification, but I don’t see why the changes to the docs need to be made explicit in the PEP.
I’ll add a note to the PEP stating the comparison with 558 is out of date.
If PEP 667 intends to leave the behaviour of locals() formally unspecified, then it doesn’t need to propose a change to the specification.
It may be I misunderstood the intended scope of PEP 667, in which case I would just rework PEP 558 to depend on PEP 667 and make 558 purely a spec update, relying on 667 for the implementation details (fixing the CPython bugs and formally making the updated behaviour part of the language spec are genuinely different questions, after all).
Python implementations aren’t required to provide a frame API, so not really.
No need to change 667 though, since the language spec change was always the main point of 558 anyway. Keeping the topics distinct also allows the SC to separate the “improve the reference implementation” question from the “make that improvement an expectation for all implementations” decision.