Hello, I’m the author of PEP 649. Thanks for taking the time to suggest your alternate proposal; I appreciate that you want to make Python better. Ultimately I don’t prefer your approach. I also believe some of your criticisms of PEP 649 are mistaken. Please see my comments below.
I concede that I don’t use metaprogramming in Python, and I’m not conversant in what might be common techniques in that world. I don’t really understand the problem you describe.
However, I copied and pasted your `class Bug: ...` sample into a local source file, changed the last line to print the annotations, and ran it under Python 3.11. It showed `val` was annotated with the `Correct` class. I then added `from __future__ import co_annotations` and ran it under the most recent version of my `co_annotations` branch (hash `63b415c`, dated April 19 2021). This also showed `val` was annotated with the `Correct` class. I checked, and yes, it was creating the `__co_annotations__` attribute, and when I ran that method manually it returned the `Correct` annotation for `val`. So, as far as I can tell, the code sample you claimed would fail under PEP 649 actually works fine.
I appreciate that metaprogramming is a complicated discipline, and I’m willing to believe that PEP 649 can cause observable and possibly undesirable behavior changes in metaprogramming. I’d be interested if you could construct a test case that does demonstrate different results with the current `co_annotations` tree, particularly if it’s a plausible example of real-world code rather than a contrived and unlikely one. (There have been a lot of changes proposed to PEP 649, but they wouldn’t affect this part of the mechanism, so the April 2021 version is fine to test against.)
No, but it would mean that classes that simultaneously shadow existing names and use those names as part of an annotation would have to find an alternate expression that resolves to their desired value, e.g.

```python
import builtins

class A:
    clsvar: builtins.dict[int, str]
    def dict(self): ...
```
Perhaps the inconvenience of fixing these sorts of sites is offset by the improved readability: the reader no longer has to replay the order of execution to work out “which” `dict` was being used in the annotation.
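The workaround is verifiable in current Python: because `builtins.dict` is an explicit attribute lookup, the annotation keeps its meaning regardless of when it is evaluated. Here is a runnable version of the example above, with an assertion added:

```python
import builtins

class A:
    clsvar: builtins.dict[int, str]  # unambiguous: always the builtin

    def dict(self):  # shadows the name "dict" in the class namespace
        ...

# The annotation resolved to the builtin dict, not the method.
assert A.__annotations__["clsvar"] == builtins.dict[int, str]
print(A.__annotations__["clsvar"])
```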
Unfortunately, it’s not as “reasonably simple” as you suggest.
First, you would need to create three new bytecodes, not one: the `LOAD_NAME` variant you suggest, but also equivalent variants of `LOAD_GLOBAL` and `LOAD_DEREF`. All local variables referenced in an annotation would have to be relocated out of fast locals and into a closure (as already happens in PEP 649), because the Python compiler doesn’t do dataflow analysis, and so doesn’t know at compile-time whether or not a particular local variable has been defined at any particular point.
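The dataflow point is easy to demonstrate with today’s Python: the compiler classifies a name as a fast local as soon as it sees any assignment to it anywhere in the function, without proving that the assignment runs before the use:

```python
def f(flag):
    if flag:
        T = int
    # T is compiled as a fast local because it is assigned *somewhere* in
    # this function; whether that assignment actually executed is only
    # knowable at runtime.
    return T

assert f(True) is int
try:
    f(False)
except UnboundLocalError:
    print("T was never bound")
```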
Also, managing the `ForwardRef` instances is going to add complexity. The `ForwardRef` needs to be the “stringizer” class, which means implementing every dunder method and creating new stringizers in a kind of crazy way. Rather than expose that behavior to users, I propose to make it a “mode” that you can switch on and off, and I would shut it off on all `ForwardRef` objects before returning them to users.
But tracking all the `ForwardRef` objects that could be created is tricky. Consider this example:

```python
import typing

class C:
    a: typing.ClassVar[undefined_a | undefined_b]
```
It isn’t sufficient to simply iterate over all the final values in the annotations dict and shut off “stringizer” mode on all the top-level `ForwardRef` objects you see. `ForwardRef` objects may be buried inside another object; in the above example, `ForwardRef('undefined_a | undefined_b')` would be stored inside a `typing.ClassVar`. It’s not reasonable to exhaustively recurse into all objects in an annotations dict to find all the `ForwardRef` objects that might be referenced somewhere inside.
Similarly, it’s not sufficient to simply remember every `ForwardRef` object constructed by the “fake globals” dict’s `__missing__` method. In the above example, `undefined_a | undefined_b` is a `ForwardRef` constructed by calling `ForwardRef('undefined_a').__or__(ForwardRef('undefined_b'))`. So this is a `ForwardRef` object created by another `ForwardRef` object, not by the `__missing__` method.
My plan for the implementation of PEP 649 is to create a list that tracks every `ForwardRef` created during the computation of an annotations dict. The list would be created by the “fake globals” environment, and every time anything creates a new `ForwardRef` object, whether via the `__missing__` method on the “fake globals” dict or via a dunder method on an existing `ForwardRef`, the new object would be added to the list and would also carry a reference to the list. After the `__annotate__` method returns, but before I return the new annotations dict to the user, I iterate over the list and deactivate “stringizer” mode on every `ForwardRef` object.
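Here is a minimal sketch of that registry idea (toy names, not the real PEP 649 implementation; a real stringizer implements every dunder method, while this one only needs `__or__` and `__call__` for the example):

```python
import typing

class Stringizer:
    # Toy stand-in for a ForwardRef in "stringizer" mode.
    def __init__(self, source, registry):
        self.__source__ = source
        self.__registry__ = registry
        self.__active__ = True
        registry.append(self)          # every new stringizer registers itself

    def __or__(self, other):
        # A dunder on one stringizer creates another stringizer, which also
        # lands in the shared registry; __missing__ never sees this one.
        rhs = getattr(other, "__source__", repr(other))
        return Stringizer(f"{self.__source__} | {rhs}", self.__registry__)

    def __call__(self, *args, **kwargs):
        ...                            # being callable satisfies typing's checks

class FakeGlobals(dict):
    # "Fake globals": unknown names become stringizers instead of NameError.
    def __init__(self, registry):
        super().__init__(typing=typing)
        self.registry = registry

    def __missing__(self, name):
        return Stringizer(name, self.registry)

registry = []
# Simulate evaluating:  a: typing.ClassVar[undefined_a | undefined_b]
value = eval("typing.ClassVar[undefined_a | undefined_b]", FakeGlobals(registry))

# The top-level value is a typing.ClassVar alias; the combined stringizer is
# buried inside it, and was created by __or__, not by __missing__.  The
# registry still caught all three, so the mode can be shut off on each one.
for proxy in registry:
    proxy.__active__ = False

print([p.__source__ for p in registry])
```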
I think this will work. But there’s an awful lot of magic behind the “stringizer” and the “fake globals” mode that permits it to work. I’m comfortable putting that magic into the Python library. I’m not comfortable building that much magic into the Python language.
I assume the “easy half way” you mention is your “`LOAD_NAME` which creates a `ForwardRef` for missing symbols” proposal, incorporating the “stringizer” functionality (which you called `DefRef` in your “Alternatives” thread).
I don’t like this approach. Also, it doesn’t currently satisfy all the use cases satisfied by PEP 649. The latter problem is fixable; I don’t think the former problem is.
The latter problem is simply that you provide no recourse for getting the “stringized” annotations. Runtime documentation users enjoy the “stringized” annotations provided by PEP 563. Also, I made sure PEP 649 supported “stringized” annotations as a sort of failsafe. I worry there may be users out in the wild who haven’t spoken up, who have novel and legitimate uses for “stringized” annotations. If we deprecate and remove the implementation of PEP 563, without providing an alternate method of producing “stringized” annotations, these users would have their use case taken away from them.
This part is relatively easy for you to fix: simply add to your proposal some mechanism that provides the “stringized” strings. Since you don’t propose writing the annotations into their own function like PEP 649 does, I assume you’d more or less keep the PEP 563 machinery around, but store its output under a different attribute (e.g. `__stringized_annotations__`). You might also have to propose a lazy-loading technology for it, as there was some concern that this approach would add memory bloat at Python runtime for an infrequently-used feature.
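One plausible shape for such lazy loading (all names here are hypothetical, not part of any PEP) is a non-data descriptor that computes the string form on first access and caches it, so objects that are never asked for it pay nothing:

```python
class lazy_stringized_annotations:
    # Hypothetical descriptor: computes the stringized annotations on first
    # access, then caches them in the instance dict.  repr() stands in for
    # the real source text a genuine implementation would preserve.
    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = {k: repr(v) for k, v in obj.__annotations__.items()}
        obj.__dict__[self.name] = value   # later lookups bypass the descriptor
        return value

class FuncLike:
    # Stand-in for a function-like object that carries annotations.
    __annotations__ = {"x": int, "return": str}
    __stringized_annotations__ = lazy_stringized_annotations()

f = FuncLike()
print(f.__stringized_annotations__["x"])
```

Because the descriptor defines only `__get__`, the cached dict in the instance `__dict__` shadows it on every later lookup, so the computation happens at most once per object.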
The reason why I still don’t like this approach: I think `o.__annotations__` should either return the values defined by the user, or fail noisily (e.g. with `NameError`). I don’t think it’s acceptable for the language to automatically and silently convert values defined by the user into proxy objects, and I consider the techniques necessary to create them too magical to define as part of the language. I don’t remember your exact words, but I dimly remember you described PEP 649’s approach of delaying evaluation as “surprising” or “novel”, which I interpreted as a criticism. That’s fair, but I consider silently replacing missing symbols with proxy objects far more “surprising” and “novel”, and I am definitely critical of that approach.
I’m the kind of guy who literally quotes the Zen Of Python when debating technical issues, and I suggest the Zen has guidance here:
Special cases aren’t special enough to break the rules.
Although practicality beats purity.
I wish annotations weren’t special enough to need breaking the rules. But the Python world’s experience with annotations over the last few years has shown that they are. Annotations are a complicated mess, and we’re far past being able to solve them with something simple. Here, unfortunately, practicality is going to have to beat purity.
I concede that PEP 649’s delayed evaluation of annotations is novel, and a small “breaking” of “rules”. But I consider it far less novel, and a much smaller infraction, than changing the language to silently and automatically construct proxy objects where it would otherwise raise `NameError`.
Also:
Errors should never pass silently.
Unless explicitly silenced.
This one PEP 649 obeys, and your proposal does not. I consider requesting the `SOURCE` or `FORWARDREF` format an explicit request to silence `NameError` exceptions; your proposal silently and implicitly catches those exceptions and swaps in a proxy object.
If you still prefer your proposal, that’s reasonable. But you’re going to have to write your own PEP, and you should probably do it soon. I already revised PEP 649 this week and resubmitted it to the Steering Council; they previously indicated they wanted to accept it, so it’s possible they could accept it very soon. (Although they said that before the recent revisions. It’s possible they’ll find something they don’t like, and reject the PEP or ask for changes, which could give you more time.) I suggest you write your PEP as a response to PEP 649, simply citing the material in it; that will make your PEP far shorter and faster to write.