PEP 649: Deferred evaluation of annotations, tentatively accepted

DavidCEllis · May 22, 2024, 3:49pm

I think one place where this is going to bite people (me at least) is if their code is still supporting 3.9 or earlier, where you can’t rely on cls.__annotations__ directly as it will be inherited. The PEP does call this out, but it’s both in the best practices guide as the way to support 3.9 and doing it this way works all the way through 3.13 and will only finally break with 3.14, so I’m not sure how unusual or uncommon it is (dataclasses worked this way up until 3.12 for instance).

I’m also not clear on how this will work in the case where you can currently look at __annotations__ in the namespace in __new__ of a metaclass, but I think I can see that typing.py is also going to have to solve this first.

Jelle · May 22, 2024, 4:41pm

Yes, that’s awkward and you’d probably need to branch on sys.version_info right now. inspect.get_annotations abstracts this away, but it was added in 3.10. Though it’s worth noting that Python 3.9 reaches its EOL in October 2025, which is also when Python 3.14 (which is where PEP 649 would be implemented) should be released.

We could backport inspect.get_annotations in typing_extensions to make this easier.
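The version branch mentioned above might look roughly like the following. This is only a sketch: get_own_annotations is a hypothetical helper name, and the fallback branch follows the "read the class's own __dict__" advice for 3.9 and earlier.

```python
import sys


def get_own_annotations(cls):
    """Return only the annotations defined directly on cls (hypothetical helper)."""
    if sys.version_info >= (3, 10):
        # inspect.get_annotations never returns a base class's annotations.
        import inspect  # imported lazily because of its import cost

        return inspect.get_annotations(cls)
    # On 3.9 and earlier cls.__annotations__ may be inherited, so read the
    # class's own namespace instead.
    return dict(cls.__dict__.get("__annotations__", {}))


class Base:
    x: int


class Child(Base):
    pass


print(get_own_annotations(Base))   # {'x': <class 'int'>}
print(get_own_annotations(Child))  # {}
```

A typing_extensions backport would presumably hide this branch behind a single function call.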

DavidCEllis · May 22, 2024, 5:13pm

That’s kind of what I’m planning to do.

While inspect.get_annotations does abstract some of this away, it’s both 3.10+ and it’s in inspect which I’ve not been using due to the import time^[1]. I kind of wish get_annotations was available from a much lighter weight module but I’ll deal with the additional complexity.

A backport probably isn’t a bad idea though.

[1] The module I’m using __annotations__ in has a ~1.5ms import time, vs ~20ms for inspect.

zrothberg · May 22, 2024, 8:49pm

I know I have used this previously when I needed one thing annotated in my code for runtime effects and a different thing picked up by the type checker. I would guard the fake annotation with if TYPE_CHECKING and then access the annotation dictionary directly, which my type checker ignored. I think there is some other usage for slots that crops up as an annoying edge case with default values.

I want to point out that I tested this previously; you can see the above comments for the non-trivial effects this has on metaprogramming. The main place this needs to be vetted isn’t just pydantic but code that subclasses pydantic, which is where most of my code and explanations above came from. Such subclasses are also extremely common. Database libs were the ones that did this the most in my experience, due to the need to add so many changes.

To do this you normally need to operate before pydantic processes your class body, which requires making changes to annotations before type.__new__ runs. That was not possible in the PEP at the time; I’m not sure if that has changed dramatically with the current implementation.

Not trying to beat a dead horse here, and I said my piece on this matter last year in the above comments, but I want to reiterate a few small points. If the goal here is to get this to work as the default quickly, I think some small modifications to the PEP would do so without changing the spirit of it or seriously negatively impacting performance.

Going to partial deferrals vs full deferrals would allow you to capture the variables without evaluating the code. This should result in nearly 100% of code that runs on previous versions without the annotations future continuing to give the same results. It should also not have a significant impact on memory. Given that the current proposal has shifted back to keeping forward references, new code that runs on 3.14+ can nearly always reproduce these forward references with pure Python backportable libs for <3.14 code bases. I have made multiple pure Python hacks of this; it is fairly straightforward to do. It would require extra code, but that code can be easily identified as you just wrap it in version checks.

There is likely some not entirely convoluted way to make a mapping-like object that can execute the deferred evaluation without needing to use a descriptor. I understand a descriptor will require less memory, but there needs to be a line somewhere between compatibility and performance, and this does not seem worth it. It would really only significantly affect the performance of runtime-checked code, as producing the dict one key at a time would be slower.

I think it is entirely possible to get 95% of the important parts here without needing to use a future flag or seriously impact current code.

Until circular reference issues and reducing the need to use if TYPE_CHECKING are fully addressed, PEP 649 is not really an alternative to PEP 563. There need to be further changes to address those problems. That should not prevent this from releasing. There just need to be more additions to deal with them. TYPE_CHECKING in particular is very bad for runtime type checking, as all that code is completely lost.
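As a rough illustration of the mapping-like object suggested above, here is a minimal sketch that evaluates each annotation only when its key is read. It assumes the annotations are available as source strings, which is not how PEP 649 stores them; LazyAnnotations is a hypothetical name used only for this example.

```python
from collections.abc import Mapping


class LazyAnnotations(Mapping):
    """Mapping that evaluates each annotation expression on first access."""

    def __init__(self, sources, namespace):
        self._sources = dict(sources)  # name -> annotation source text
        self._namespace = namespace    # globals used for evaluation
        self._cache = {}

    def __getitem__(self, key):
        if key not in self._cache:
            # KeyError for unknown keys; NameError while a name is undefined.
            self._cache[key] = eval(self._sources[key], self._namespace)
        return self._cache[key]

    def __iter__(self):
        return iter(self._sources)

    def __len__(self):
        return len(self._sources)


namespace = {}
annotations = LazyAnnotations({"count": "int", "owner": "User"}, namespace)

print(list(annotations))     # ['count', 'owner'], no evaluation needed
print(annotations["count"])  # <class 'int'>


class User:
    pass


namespace["User"] = User  # the forward reference becomes resolvable later
print(annotations["owner"] is User)  # True
```

Listing the keys never evaluates anything, which is the property that matters for code that only needs field names; the per-key evaluation is where the extra runtime cost would show up.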

barry · May 23, 2024, 10:24pm

I remember some of the SC discussions around this topic, and as I remember it there was a strong desire to rip the bandaid off and make the transition quickly. I’m really glad there’s a draft implementation available that folks can play with, and appreciate @Jelle’s test suite results. I don’t know if it’s ultimately going to be feasible but if we can get away with backporting get_annotations() into typing_extensions and get libraries to begin using that now then maybe we can address the majority of issues before 3.14 is released with PEP 649.

BTW, the Python release version number of 649 needs to be updated.

DavidCEllis · May 24, 2024, 5:30pm

I had a brief test of Jelle’s branch and had one question on something that’s been bugging me about the PEP.

It’s entirely possible there’s something I’ve missed, but is there a specific reason why the generated __annotate__ functions are not planning to support FORWARDREF or STRING formats?

Currently I have code that’s roughly like this, intended to create slots automatically from annotations:

class MakeSlotsMeta(type):
    def __new__(cls, name, bases, ns, **kwargs):
        if "__slots__" not in ns:
            cls_annotations = ns.get("__annotations__", {})
            ... # logic to handle string annotations
            # is_classvar is a helper (not shown) that detects ClassVar annotations
            cls_slots = [
                k
                for k, v in cls_annotations.items()
                if not is_classvar(v)
            ]
            ns["__slots__"] = cls_slots
            ...  # additional logic to store and remove attributes
        new_cls = super().__new__(cls, name, bases, ns, **kwargs)
        return new_cls

As expected this no longer works on the PEP branch as __annotations__ is not present. __annotate__ exists and is callable, but only with format=1, which will raise NameError if there are forward references. As far as I can tell, for this to work correctly under PEP 649 I will need to replicate the full logic of the ‘fake globals’ method that inspect.get_annotations is going to use^[1].

The PEP indicates that the ‘fake globals’ logic is only expected to work correctly on the __annotate__ methods Python generates. In fact it creates a specific opt-in flag for this case.

I’m trying to understand why - given this requirement - the ‘fake globals’ logic isn’t included in the generated __annotate__ functions, but instead requires a function from a separate import which then has to check this flag to see if it’s allowed to use ‘fake globals’, which are only expected to work on generated __annotate__ in the first place?
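For readers unfamiliar with the ‘fake globals’ idea, the sketch below shows the general technique on a plain expression string: names that cannot be resolved are replaced by placeholder objects instead of raising NameError. This is an illustration only. The PEP applies the idea to the compiler-generated __annotate__ function rather than to strings, and Placeholder, FakeNamespace and evaluate_leniently are hypothetical names.

```python
import builtins


class Placeholder:
    """Stand-in for a name that is not defined yet (hypothetical class)."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"Placeholder({self.name!r})"


class FakeNamespace(dict):
    """Resolve real names normally and hand out placeholders for the rest."""

    def __init__(self, real_globals):
        super().__init__()
        self._real_globals = real_globals

    def __missing__(self, name):
        if name in self._real_globals:
            return self._real_globals[name]
        if hasattr(builtins, name):
            return getattr(builtins, name)
        return Placeholder(name)


def evaluate_leniently(expression, real_globals):
    # The fake namespace is passed as *locals*, so every name lookup in the
    # expression goes through __missing__ first.
    return eval(expression, {}, FakeNamespace(real_globals))


print(evaluate_leniently("int", {}))        # <class 'int'>
print(evaluate_leniently("Undefined", {}))  # Placeholder('Undefined')
print(evaluate_leniently("dict[str, Later]", {}).__args__)
```

A real implementation also has to cope with attribute access, operators and calls on the placeholders, which is part of why it is only expected to work on annotation code the compiler itself generated.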

Ideally I would like to be able to use something like this inside the __new__ method:

cls_annotations = {}
annotate = ns.get("__annotate__")
if annotate:
    cls_annotations = annotate(2)  # NotImplementedError currently

[1] Or make a temporary class and call inspect.get_annotations on that, but as mentioned previously import inspect is too much of a relative performance hit.
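For comparison, here is a hedged sketch of how a metaclass might read annotations from the class namespace on both old and new versions. It assumes that the annotationlib module shipped in 3.14 provides get_annotate_from_class_namespace and call_annotate_function; those names should be checked against the released documentation. namespace_annotations and CollectFieldsMeta are hypothetical names.

```python
import sys


def namespace_annotations(ns):
    """Read annotations from a class-body namespace (hypothetical helper)."""
    if sys.version_info >= (3, 14):
        # Assumption: these annotationlib functions exist as named here.
        import annotationlib

        annotate = annotationlib.get_annotate_from_class_namespace(ns)
        if annotate is None:
            return {}
        return annotationlib.call_annotate_function(
            annotate, annotationlib.Format.FORWARDREF
        )
    # Before 3.14 the class body stores the dict directly in the namespace.
    return dict(ns.get("__annotations__", {}))


class CollectFieldsMeta(type):
    def __new__(mcls, name, bases, ns, **kwargs):
        # Inspect annotations before the class object exists.
        field_names = tuple(namespace_annotations(ns))
        cls = super().__new__(mcls, name, bases, ns, **kwargs)
        cls._field_names = field_names
        return cls


class Point(metaclass=CollectFieldsMeta):
    x: int
    y: "NotDefinedYet"  # quoted so the example also runs before 3.14


print(Point._field_names)  # ('x', 'y')
```

Only the annotation names are used here; deciding what to do with unresolved values (strings or forward-reference objects) is left to the caller.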

Jelle · May 24, 2024, 5:52pm

Implementing the “fake globals” logic in the generated __annotate__ methods would mean implementing it in C (or directly in bytecode), which would be a maintenance nightmare.

I think the idea is that you should almost never call __annotate__ directly, and instead rely on inspect.get_annotations. I did some preliminary work implementing this (Some work on inspect · python/cpython@c5b308b · GitHub); I plan to continue that and turn it into a PR once the current PR is merged or close to ready.

It sort of works already:

>>> def f(x: y): pass
... 
>>> inspect.get_annotations(f, format=inspect.SOURCE)
{'x': 'y'}
>>> inspect.get_annotations(f, format=inspect.FORWARDREF)
{'x': <inspect._ForwardRef object at 0x102e8d7c0>}
>>> inspect.get_annotations(f, format=inspect.VALUE)
Traceback (most recent call last):
  File "<python-input-4>", line 1, in <module>
    inspect.get_annotations(f, format=inspect.VALUE)
    ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jelle/py/cpython/Lib/inspect.py", line 365, in get_annotations
    ann = getattr(obj, '__annotations__', None)
  File "<python-input-1>", line 1, in <annotations of f>
    def f(x: y): pass
             ^
NameError: name 'y' is not defined

My implementation creates a helper function inspect._call_dunder_annotate(dunder_annotate, format) that takes an __annotate__ function and a format and applies it. I think we may want to make that into a public function to make it easier to call an __annotate__ function directly when desired.

I’m not a fan of the new flag (and I haven’t implemented it yet); I think we may be able to do without it.

DavidCEllis · May 24, 2024, 7:24pm

Understandable, although a shame. It seems messy that the PEP gives a formal definition of the function, but the actual methods people are most likely to encounter don’t implement most of it.

I don’t think get_annotations would work in this context without having to create a temporary class for it to analyse.

This would help, but I’m probably going to end up needing to copy/reimplement the relevant code if it’s in inspect. That module is particularly heavyweight, enough so that it led to dataclasses not being used in configparser despite being otherwise suitable.

I believe typing and inspect are going to need to share this format applying logic so is that going to be duplicated or will one begin to depend on the other?

[Edit: swapped two words around]

Jelle · May 24, 2024, 7:37pm

I’ve been toying with the idea of adding a separate stdlib module for annotation introspection (annotationtools?). Maybe that would be worth it?

typing under PEP 649 as accepted would likely have to import inspect, which is a shame; currently it goes out of its way to avoid importing inspect.

mikeshardmind · May 24, 2024, 7:46pm

Hypothetically, if this were to happen, what would be the intended way for libraries to not pay the import cost for inspect if they need something from typing? typing.TYPE_CHECKING implies importing typing. Not only that, but it also means the annotations are not resolvable at runtime, which might create tension between libraries with different goals.

DavidCEllis · May 24, 2024, 8:06pm

Unfortunately, looking at the current implementation, about half of the import time cost of adding inspect to typing will come either way with the support of format=SOURCE, as that relies fairly heavily on ast, unless this is deferred until someone asks for that format.

carljm · May 28, 2024, 12:45am

This seems worth it to me. Could the module simply be named annotations, or is that too confusable with from __future__ import annotations?

carljm · May 28, 2024, 12:49am

Given that format=SOURCE is intended for documentation generation tools, and thus likely to be the least widely used, and the least performance sensitive, it seems OK to me to defer heavy imports that only it needs until it is actually requested.
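Deferring the heavy import could be as simple as moving it into the code path that needs it. The sketch below is not the actual implementation; normalize_annotation_source and describe_annotation are hypothetical names, and ast.unparse requires Python 3.9 or later.

```python
def normalize_annotation_source(text):
    """Return annotation source text in ast's canonical formatting."""
    import ast  # deferred: only callers that want source strings pay for it

    return ast.unparse(ast.parse(text, mode="eval").body)


def describe_annotation(value, as_source=False):
    """Hypothetical entry point: the cheap path never touches ast."""
    if as_source:
        return normalize_annotation_source(value)
    return value


print(describe_annotation("List [ int ]"))                  # List [ int ]
print(describe_annotation("List [ int ]", as_source=True))  # List[int]
```

After the first call the module is cached in sys.modules, so repeated requests for the source format only pay the import cost once.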

ericvsmith · May 28, 2024, 5:10am

I agree that it would be nice to have a separate library. I’d like to remove inspect as a dataclasses dependency. Unfortunately I also use inspect to generate doc strings, but I’ve never looked at removing it for that purpose. Maybe I could defer the import and make doc strings opt-out.

But a separate library for annotation inspection would be a good start.

erlendaasland · May 28, 2024, 7:47am

annotools (in the tradition of itertools and functools)
annolib
annotationslib

DavidCEllis · May 28, 2024, 10:20am

I did look at this once when looking at dataclasses’ import time (before it used get_annotations). Rather than making them opt-out I ended up making it a descriptor that only performed the import if someone actually tried to look at the docstring^[1].

I don’t know if the current format of docstring is required for something^[2], otherwise you could also generate a different docstring in a way that didn’t require inspect at all.

When you put it like this inspect.annotations seems like an annoyingly appropriate name.

[1] Or not at all if being run in -OO mode.
[2] As far as I can tell it’s undocumented - there’s no mention of __doc__ or doc string/docstring in the dataclass documentation.

ilotoki0804 · September 23, 2024, 8:20am

Format identifiers are always predefined integer values.

inspect.VALUE = 1
inspect.FORWARDREF = 2
inspect.SOURCE = 3

Isn’t it better to use string constants like 'value', 'forwardref', and 'source' rather than integer constants? Is there any special reason why integer constants should be used?

Jelle · September 23, 2024, 12:07pm

Why is it better?

The draft implementation currently on the main branch uses an enum instead, though preserving the integer values in PEP 649; see annotationlib — Functionality for introspecting annotations — Python 3.14.0a0 documentation.
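An enum that keeps the integer values can be sketched with enum.IntEnum. The member values below are the ones quoted earlier in this thread; the released module may name or number them differently, so treat them as illustrative only.

```python
import enum


class Format(enum.IntEnum):
    # Values as listed in this thread; the released module may differ.
    VALUE = 1
    FORWARDREF = 2
    SOURCE = 3


print(Format.FORWARDREF == 2)        # True
print(Format(3) is Format.SOURCE)    # True
print(Format["VALUE"].name.lower())  # value
```

Members compare equal to plain integers, so existing calls that pass 1, 2 or 3 keep working, while the names give the readability that string constants would.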