Howdy howdy. I’ve been doing a lot more thinking about PEP 649 since the discussion topic from a few weeks back. I propose to revise some important details, described below.
One proviso before I begin. It’s a gnarly topic, and my proposal has evolved a lot, and I feel like it’s been a real struggle to get the solution right. There may be important details I omitted, or text I neglected to update. So if something seems awful, or doesn’t make sense, or I’ve been self-contradictory, please start by making a (kind) request for clarification. Who knows, it may be an honest mistake–or I might even have a good reason that I neglected to mention. Anyway, please hold off tearing my proposal apart until I’ve confirmed the detail was intentional, ok?
Now that I’ve–finally!–posted this, I’m going to start updating PEP 649 in parallel. There’s still a chance to get this into Python 3.12, I think, but time is running short. Hopefully we won’t uncover any more overlooked details requiring major course corrections…!
My thanks to Carl Meyer, Eric V. Smith, and Mark Shannon for participating in the private email thread leading to this post. Extra-special thanks go to Carl for his massive contribution to this discussion! He must have spent countless hours corresponding with me, and he made numerous good suggestions which I’ve incorporated. I’m grateful he was willing to contribute so much of his time and expertise. (He also saved me from some dumb blunders! Phew!)
Here goes!
Let’s start with the easy stuff: renaming some things. First up is `__co_annotations__`. It was a placeholder name, a relic from the very early days of PEP 649 when the attribute actually stored a code object. It’s long past time we gave it a better name. Since “maybe we should change the name” has been a TBD item in PEP 649 for more than a year, and nobody has suggested anything, it seems it’s up to me. My best idea is `__compute_annotations__`. If you have a better suggestion, go ahead and post it, and I’ll consider it.

For the rest of this post I’ll use the name `__compute_annotations__`, even in historical contexts, just for consistency’s sake.
In the last go-round I also proposed a `format` parameter for `inspect.get_annotations`. `format` could be one of three values: `VALUES`, `HYBRID`, and `STRINGS`. I want to amend these names too.

First, they should be singular: change `VALUES` to `VALUE`, and change `STRINGS` to `STRING`.
Second, I’m not convinced `STRING` is the best name. I picked it because PEP 563 called these “stringized” annotations. But the name by itself doesn’t convey much–it’s pretty generic. Admittedly this isn’t a major concern. But if we put our minds to it maybe we can arrive at something better. So far my best ideas are `SOURCE`, `CODE`, and `SOURCE_CODE`. I note none of these are strictly accurate; it isn’t actual source code, it’s reconstructed source code. But `RECONSTRUCTED_SOURCE` seems too long… and, maybe someday, it actually will be the original source code, which would render the word `RECONSTRUCTED` in the name anachronistic. `STRINGIZED` also seems a little too long, and would also be inaccurate if we ever switch to preserving the actual source code. So far no really great name has suggested itself to me.
And now I’m not so sure about the `HYBRID` name either. Maybe `PROXY` is better? I think the format name should describe the value(s) you get back, as opposed to describing a “mode” that `inspect.get_annotations` is operating in. And the objects it creates for undefined symbols are proxies for the actual values. I have yet another name to propose here, but I might have a different use for that name coming a little later in this proposal. I’ll tell you about it then.

(I wouldn’t want to rename `HYBRID` to `MOCK`. The `ForwardRef` proxy objects aren’t mock objects, they’re text representations of an expression you might be able to evaluate later. Although there are some similarities, they don’t really behave like actual mock objects like `unittest.mock.Mock`.)
I have polls at the bottom of this post so you can vote on all this renaming. If you have an alternate name suggestion, please put that in a comment by itself, and we can use Discuss comment “hearts” to count as votes for that name. (I’d just add the alternate suggestions to the poll itself, but Discuss doesn’t let you modify a poll once it’s been open for more than about fifteen minutes.)
For the rest of this document, I’ll use the singular versions of the names so far–`VALUE`, `HYBRID`, and `STRING`.

(Also: everywhere I talk about `inspect.get_annotations` in this document, assume I’m talking about `typing.get_type_hints` too. For example, `typing.get_type_hints` will also support the `format` parameter and all these formats. I’m hoping I can just reimplement `typing.get_type_hints` on top of `inspect.get_annotations` so I only have to do this work in one place.)
Second change: so far PEP 649 specifies that it will initially be gated by `from __future__`. Several people, including myself, now think that we should just pull the trigger and make it the default behavior in 3.12. What do you think? There’s a poll.
Third change: in the previous thread, Carl Meyer argued that there are use cases for requesting `HYBRID` format, then later evaluating the stringized values to get real values. Currently, users who enable PEP 563’s “stringized annotations”, then later try to evaluate those strings, have a lot of trouble evaluating them correctly. It can be hard to get the right `globals()` for an annotation, and handling closures correctly is nigh impossible. Carl wanted the placeholder values in `HYBRID` format to not simply be strings, but to be evaluatable objects that contain the strings and all the context needed to correctly evaluate them.
(If you aren’t current on the “stringizer” and “fake globals” runtime environment concepts this proposal is based on, please refer to my previous discussion thread where these concepts were first introduced.)
Here’s how I want this to look from the user’s perspective:
- These objects will be `typing.ForwardRef` objects. Actually I expect to hoist `ForwardRef` out of the `typing` module and put it somewhere else–possibly the `inspect` module, possibly into some internal module with an underscore. I don’t think we’ll need to reimplement it in C, but I’m leaving that as an open question for now. `ForwardRef` objects will internally contain a reference to the globals, locals, and closure information necessary to evaluate the string. (Handling closures is messy but doable; we’ll have to reconstruct a dict out of the closures, and load that information into locals when calling `eval`.)
- The API to ask a `ForwardRef` to evaluate itself should be: you call the `ForwardRef` object without arguments, as in, it supports `__call__`. This would evaluate the string and return the result. (Or it might raise an exception.) This is so obviously the correct API that no further discussion seems necessary. Currently `ForwardRef` objects do have an `_evaluate` method, but this is internal-only, unsupported, takes globals and locals arguments, and so on. (Maybe I can remove it when I add `__call__`, maybe not.)
- Internally, `ForwardRef` will be the “stringizer”–instances of the `ForwardRef` class will be doing the stringizing. This is necessary for `HYBRID` format; it’s not viable to build the values with “stringizer” objects, then replace them with `ForwardRef` objects at the last minute. There may be objects with arbitrary internal references to the “stringizer” objects, and it’s not reasonable to tear apart the constructed objects and replace their “stringizer” objects with `ForwardRef` objects.
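To make the evaluation API concrete, here’s a minimal sketch of a callable `ForwardRef`-style object. This is an illustration, not the proposed implementation; the class name `SimpleForwardRef` and its attributes are invented for this example, and the real object would also carry closure information.

```python
# A minimal sketch (not the real implementation) of a ForwardRef-style
# object that captures its evaluation context and supports __call__.
class SimpleForwardRef:
    def __init__(self, source, globals=None, locals=None):
        self.source = source                      # annotation source text
        self._globals = globals if globals is not None else {}
        self._locals = locals if locals is not None else {}

    def __call__(self):
        # Evaluate the stored expression in the captured context.
        # May raise NameError, just like evaluating the annotation would.
        return eval(self.source, self._globals, self._locals)

ref = SimpleForwardRef("list[int]", globals=globals())
assert ref() == list[int]
```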
But now we have a problem: when stringizing, `__call__` on a “stringizer” has to return a new “stringizer” object. How can `ForwardRef.__call__` return the real value to the user, but also return a “stringizer” (aka `ForwardRef`) when stringizing? It’ll have to have a special “stringizer” mode that’s off by default. There’ll be an internal, unsupported bit of API that lets you turn on “stringizer mode” on a `ForwardRef` object. Don’t worry, we’ll turn off “stringizer mode” before the user ever sees the object. Internally, the “fake globals” runtime environment will keep track of every `ForwardRef` object it creates. When `__compute_annotations__` finishes, it’ll iterate over all the `ForwardRef` objects and switch off “stringizer mode” on each one.
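Here’s a toy sketch of what “stringizer mode” could look like. Everything here is hypothetical (the real stringizer is `ForwardRef` itself, and it handles far more operations than this); the point is just that the same object builds source text while the mode is on, and evaluates that text once the mode is switched off.

```python
# Toy stringizer: while "stringizer mode" is on, operations build up
# source text and return new stringizers; with the mode off, calling the
# object evaluates the accumulated source (like a ForwardRef would).
class Stringizer:
    def __init__(self, source):
        self.source = source
        self.stringizer_mode = True   # internal, unsupported flag

    def __getattr__(self, name):
        # Attribute access while stringizing grows the source text.
        return Stringizer(f"{self.source}.{name}")

    def __call__(self, *args):
        if self.stringizer_mode:
            arg_text = ", ".join(
                a.source if isinstance(a, Stringizer) else repr(a)
                for a in args)
            return Stringizer(f"{self.source}({arg_text})")
        # Mode off: evaluate the reconstructed source.
        return eval(self.source)

s = Stringizer("len")
built = s("abc")                  # stringizing: returns a new Stringizer
assert built.source == "len('abc')"
built.stringizer_mode = False     # the cleanup pass flips the mode off
assert built() == 3
```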
And speaking of “fake globals”: this environment will also need “fake locals”, to catch class namespace lookups for annotated methods. We’ll also have to create a “fake closure tuple” for `__compute_annotations__` functions that use closures.

(I even considered creating “fake constants”, where we replace `co_consts` with `tuple(ForwardRef(repr(c)) for c in co_consts)`. But this would mean creating a modified version of the code object and running that. It’s probably easier to leave the constants alone. If we’re generating `STRING` format, and any of the values in the resulting annotations are constants (e.g. manually stringized type hints), we’ll just call `repr` on them during the same pass when we extract the strings from the `ForwardRef` objects.)
On to the fourth change. Previously I proposed `inspect.get_annotations` would accept a `format` parameter, specifying the format for the annotation values. So far I’ve proposed three of these “formats”–`VALUE`, `HYBRID`, and `STRING`. I now propose a fourth: `FORWARDREF`. This would be like `STRING` format, but instead of the annotation values all being strings, they’d all be `ForwardRef` objects. (In case it’s helpful, I first proposed this format in this discuss thread.)
How is this different from `HYBRID` format? In `HYBRID` format, if the annotation refers to a global or local that’s been bound, it uses the real value. It’s only when the expression uses an undefined global or local that we create a “stringizer” to represent that missing name. In `FORWARDREF` format, every global or local (or closure) would be replaced with a “stringizer”. Every value in the dict it returns would be a `ForwardRef` object, guaranteed, whereas in `HYBRID` format the values you get may differ depending on what’s been imported so far. (Even annotation values that were constants would get wrapped in `ForwardRef` objects, for consistency’s sake.)
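To illustrate the distinction, here’s a hypothetical sketch of what the two formats might return for the same function. None of this API exists yet; assume `Node` is a class that hasn’t been defined yet, while `int` is of course bound:

```
def f(a: int, b: Node): ...

inspect.get_annotations(f, HYBRID)
# {'a': int, 'b': ForwardRef('Node')}

inspect.get_annotations(f, FORWARDREF)
# {'a': ForwardRef('int'), 'b': ForwardRef('Node')}
```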
Why do I propose adding this? Because we get it for free. In order to compute `HYBRID` and `STRING` formats, `inspect.get_annotations` has to be able to create `ForwardRef` objects for every name. And in the case of `STRING` format, the last step, just before returning, is to extract the strings from the `ForwardRef` objects and build a dict mapping to those. So with our current approach, we literally have to compute `FORWARDREF` format in order to compute `STRING` format.
I can’t think of an implementation of `inspect.get_annotations` that could support `HYBRID` format where we don’t essentially get `FORWARDREF` format for free. Even if in the future we stored the annotation strings from the source code in the .pyc file somewhere, and so `STRING` format was produced in a completely different way, we’d still need to support `HYBRID` format, which means we’d still have all the code needed to support `FORWARDREF` format too. So, as long as we have to permanently support all the functionality behind this format, we might as well do the small amount of extra work to give it to the users, right?
I admit I haven’t come up with a convincing use case for it. The closest I get to a use case is, “it’s more consistent than `HYBRID`, but the values are evaluatable unlike `STRING`, and that seems like it could be useful.” But that’s pretty thin. So I’m not actually proposing adding it to PEP 649, per se. I included the proposal here so we could discuss it. There’ll be a poll about this at the end–should we add `FORWARDREF` or not? In particular, if you have a good use case for `FORWARDREF` format, please speak up!
For the rest of the document, I’ll describe `FORWARDREF` as if it’s an accepted part of the proposal. But to be clear: you shouldn’t interpret that as me trying to drum up support for it. I don’t really care whether we keep or reject `FORWARDREF`; I just want to do the right thing for Python users. If the community doesn’t want or need it, let’s reject it–that’s fine by me. I mean, hey, that reduces my workload! If only very slightly.
Now that I’ve introduced `FORWARDREF` format, let me stipulate that these formats will be defined as integer values, specifically:

```
VALUE = 1
HYBRID = 2
FORWARDREF = 3
STRING = 4
```

They won’t be `Enum` values, or strings, or instances of a custom class, etc.
The values are also guaranteed to be contiguous, and the `inspect` module will have attributes representing the minimum and maximum format values:

```
FORMAT_MIN = VALUE
FORMAT_MAX = STRING
```

This should prove useful for code working with different formats–more on this very soon.
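For example, because the formats are contiguous integers, code that wants to handle every format can iterate over a simple range. A small sketch, assuming the constants proposed above (none of which exist in the `inspect` module yet):

```python
# Stand-ins for the proposed inspect module constants.
VALUE, HYBRID, FORWARDREF, STRING = 1, 2, 3, 4
FORMAT_MIN, FORMAT_MAX = VALUE, STRING

# Contiguous integer values mean "all formats" is just a range.
all_formats = list(range(FORMAT_MIN, FORMAT_MAX + 1))
assert all_formats == [VALUE, HYBRID, FORWARDREF, STRING]
```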
Fifth, I previously asked the question: should `__compute_annotations__` be a public or private API? Nearly all respondents said it should be private, at least for now. Since then I’ve realized `__compute_annotations__` must be a public API, and for a very good reason: `functools.partial`, `attrs`, `dataclasses`, etc. Any code that wraps an existing class or function, returning a new object with the same or modified annotations, and which wants to support `HYBRID`, `FORWARDREF`, or `STRING` format, will have to write its own `__compute_annotations__`. And since you can write such code in pure Python, we need to support this API from Python.
Here’s where it gets a little messy. If we simply declared `__compute_annotations__` to be a public API, and otherwise kept the API and implementation the same as previously proposed, third-party implementations of `__compute_annotations__` would be maddening to write in Python. This is because of the “fake globals” runtime environment that makes `HYBRID`, `FORWARDREF`, and `STRING` formats possible. When run in this environment, these third-party `__compute_annotations__` functions couldn’t do any real work–because any global symbol they referenced would be replaced with a “stringizer”! They couldn’t evaluate global values, call functions in global modules, etc. All their globals would be fakes.
Now, they could sidestep this by smuggling in the globals they need as default parameters:
```
def __compute_annotations__(self, inspect=inspect):
    ...
```
Or they could do their work in a different method. Only the top-level call, the `__compute_annotations__` call itself, runs in the “fake globals” runtime environment. They could call a different method through `self`, which would be a real value (because it’s an argument), and that function wouldn’t run in a “fake globals” runtime environment:

```
def __compute_annotations__(self):
    return self.actual_compute_annotations()
```

This means `actual_compute_annotations` could be written conventionally–looking up global values, calling functions in libraries, etc. All its globals would be real, and it would run normally.
But now they have the opposite problem: if they compute the annotations in a different function like `actual_compute_annotations`, the `HYBRID`, `FORWARDREF`, and `STRING` formats wouldn’t render properly, precisely because they’re not running in the “fake globals” runtime environment. How can they compute these other formats?

There’s a straightforward solution to this, and it ties back neatly to the fact that these are wrappers: their annotations are defined on the object they’re wrapping. They can simply call `inspect.get_annotations` on the original object. That would produce the original annotations in the correct format, and they can then modify the result as needed. Easy peasy.
Except… how would it know which format to ask for? As previously defined, `__compute_annotations__` is never explicitly told what format it’s producing. It’s implicit in the runtime environment. True, you could use some coding tricks to sniff out what environment you’re running in, but even that is insufficient–`FORWARDREF` and `STRING` formats actually run in identical runtime environments. The difference between the two is in the cleanup pass run after `__compute_annotations__` returns. Unless we change the API, it’s literally impossible for `attrs` et al. to correctly support all formats.
It’s not a hard fix, but it feels like a big change: `__compute_annotations__` must itself take the `format` parameter, specifying `VALUE`, `HYBRID`, `FORWARDREF`, or `STRING` format. This allows third-party `__compute_annotations__` functions to handle any format, because now we explicitly tell them exactly what we want.
The `__compute_annotations__` functions generated by the CPython compiler won’t be sophisticated enough to handle `HYBRID`, `FORWARDREF`, and `STRING` formats themselves. They’ll only know how to compute `VALUE` format, aka real values. They’ll still get run in the “fake globals” runtime environment to produce the other formats. But I expect third-party `__compute_annotations__` functions to directly support every format. So here’s how the API should work: if `__compute_annotations__` supports the requested format, it must return a dict in that format, and if it doesn’t support that format, it must raise `NotImplementedError()`. The function would then get run in the “fake globals” runtime environment, requesting `VALUE` format.
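Sketched in code, the protocol for a third-party `__compute_annotations__` might look like this. This is an assumption-laden illustration: the constants are stand-ins for the proposed `inspect` attributes, the helper is invented, and a real wrapper would likely defer to `inspect.get_annotations` on the wrapped object rather than reading `__annotations__` directly.

```python
VALUE, HYBRID, FORWARDREF, STRING = 1, 2, 3, 4   # proposed constants

def make_compute_annotations(wrapped):
    """Build a __compute_annotations__ for a trivial wrapper of `wrapped`."""
    def __compute_annotations__(format):
        if format == VALUE:
            # Real values: ordinary code, ordinary globals.
            return dict(wrapped.__annotations__)
        # Unsupported format: decline, so the caller can fall back to
        # running us in the "fake globals" environment (or give up).
        raise NotImplementedError()
    return __compute_annotations__

def f(x: int) -> str: ...
compute = make_compute_annotations(f)
assert compute(VALUE) == {'x': int, 'return': str}
```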
Alas, the “fake globals” runtime environment is so obnoxious that we should never run any `__compute_annotations__` function in that environment unless it explicitly opts in. Carl had the best suggestion for this: add a new flag to `co_flags` (the code object bitfield) that specifies “this code object supports being run in a fake globals runtime environment”. It’d be inconvenient for pure-Python wrapper libraries to set this flag for their `__compute_annotations__` functions, but I think that’s for the best; I expect they’re going to be real code, with flow control and such, not a simple `return` statement. They’ll do most of their work by calling `inspect.get_annotations` on the thing they’re wrapping. (It may make sense for extension modules that create their own code objects to set the flag, I’m not sure.)
The logic inside `inspect.get_annotations` now works something like this pseudocode:

```
c_a = o.__compute_annotations__
try:
    return c_a(format)
except NotImplementedError:
    if not supports_fake_globals(c_a.__code__):
        return {}
    c_a_with_fake_globals = rebind_with_fake_globals(c_a, format)
    return c_a_with_fake_globals(VALUE)
```
In the general case, it does mean raising an exception, which is a little slow. But this code path isn’t used for `VALUE` format, and in any case I don’t expect folks are examining annotations in performance-sensitive code.
Bringing it all together, here’s the new API definition for `__compute_annotations__`:

```
__compute_annotations__(format: int) -> dict
```

Returns a new dictionary object mapping attribute/parameter names to their annotation values.

Takes a `format` parameter specifying the format in which annotation values should be provided. Must be one of the following:

- `inspect.VALUE`: Values are the result of evaluating the annotation expressions.
- `inspect.STRING`: Values are the text string of the annotation as it appears in the source code. May only be approximate; whitespace may be normalized, and constant values may be optimized.
- `inspect.FORWARDREF`: Values are `ForwardRef` expression proxy objects, containing the string of the annotation value as per `STRING` format. The `ForwardRef` objects contain references to all the context needed (globals/locals/closure) to evaluate themselves correctly.
- `inspect.HYBRID`: Values are real annotation values (`VALUE` format) for defined values, and `ForwardRef` proxies (`FORWARDREF` format) for undefined values. Real objects may be exposed to, or contain references to, `ForwardRef` proxy objects.

If `__compute_annotations__` doesn’t support the specified format, it must raise `NotImplementedError()`. `__compute_annotations__` must always support `VALUE` format; it must not raise `NotImplementedError()` when called with `format=VALUE`.

When called with `format=VALUE`, `__compute_annotations__` may raise `NameError`; it must not raise `NameError` when called requesting any other format.

If an object doesn’t have any annotations, `__compute_annotations__` should preferably be deleted or set to `None`, rather than set to a function that returns an empty dict.
Here’s what a `__compute_annotations__` function generated by the compiler would look like, if it was written in Python:

```
def __compute_annotations__(format):
    if format != 1:
        raise NotImplementedError()
    return { ... }
```

As mentioned before, the code object for this `__compute_annotations__` function would have the special “safe for fake globals” flag set.
Note that we compare `format` to the hard-coded value `1`. This is set in stone as the constant for `VALUE` format. There are various reasons it’s hard-coded here, but here’s the most important: when it’s run in a “fake globals” runtime environment, `__compute_annotations__` can’t look up `inspect.VALUE`… because it’d get a `ForwardRef`! (However, when `format` is not `1`, that means it’s not being run in a “fake globals” runtime environment, and therefore it’s safe to look up `NotImplementedError`. `__compute_annotations__` can rely on the fact that it’ll only be asked for `VALUE` format when run in a “fake globals” runtime environment.)
Also, to clarify a topic that came up in private discussions: `__compute_annotations__` functions generated by Python never cache anything. They recompute the annotations dict every time they’re called. This isn’t a requirement of the `__compute_annotations__` API; third-party `__compute_annotations__` functions can cache whatever they like. But the only caching of annotations defined in Python-the-language is the internal cache for the `__annotations__` property in functions, classes, and modules. If the internal cache for the `__annotations__` property is unset, and `__compute_annotations__` is set, and the user asks for `__annotations__`, the getter will call `__compute_annotations__` and cache and return the result.
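The caching behavior described above can be sketched as a property. This is a rough model, not CPython’s actual implementation; the class, the cache attribute name, and the stand-in `VALUE` constant are all invented for the example.

```python
VALUE = 1   # stand-in for the proposed inspect.VALUE constant

class AnnotatedThing:
    """Rough model of an object whose __annotations__ caches the result
    of calling __compute_annotations__."""
    def __init__(self, compute_annotations=None):
        self._annotations_cache = None            # internal cache, unset
        self.__compute_annotations__ = compute_annotations

    @property
    def __annotations__(self):
        if self._annotations_cache is None:
            if self.__compute_annotations__ is None:
                self._annotations_cache = {}
            else:
                # Call the compute function once, then serve the cache.
                self._annotations_cache = self.__compute_annotations__(VALUE)
        return self._annotations_cache

calls = []
def compute(format):
    calls.append(format)
    return {'x': int}

thing = AnnotatedThing(compute)
assert thing.__annotations__ == {'x': int}
assert thing.__annotations__ == {'x': int}
assert calls == [VALUE]   # computed only once, then served from the cache
```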
Finally, let’s consider what `__compute_annotations__` might look like for a wrapper object. For simplicity, I’ll contrive a super-simple example. This class is a clone of `functools.partial`, but it only handles wrapping one argument, which is always named `arg`:

```
def __compute_annotations__(self, format):
    ann = inspect.get_annotations(self.wrapped_fn, format)
    del ann['arg']
    return ann
```
Our third-party wrapper’s `__compute_annotations__` method doesn’t have to worry about running in a “fake globals” runtime environment, because it hasn’t set the special opt-in flag on its code object. But it also doesn’t need to implement any of the formats itself–it can rely on `inspect.get_annotations` to do all the hard work. All it really needs to do is adjust the computed annotations dict as needed, in this case removing the entry for `'arg'`. Happily this `__compute_annotations__` is forwards-compatible; if we add support for new formats in the future, it can rely on `inspect.get_annotations` to support that new format, and it doesn’t even need to change.

Of course, other wrappers may not be so lucky; they may need to modify annotation values, or add new ones. And they can’t do that for a new format they’ve never seen before. But defining `__compute_annotations__` as a public API with this interface at least gives third-party code the chance to do that work and fully support all formats. (As I always think of it: we’re giving them “a lever and a place to stand”. Hat tip to my old pal Archimedes!) I’m optimistic that currently-maintained third-party libraries will want to do this work and add first-class support for all annotation formats.
Oh, and, I defined `FORMAT_MIN` and `FORMAT_MAX` in case third-party code wants to pre-calculate all the formats for `__compute_annotations__`. This would permit them to iterate over all formats and cache the results.
One more messy topic: how should `inspect.get_annotations` behave when code manually modifies, overwrites, or deletes annotations?

Traditionally `__annotations__` wasn’t a special object. It was just an attribute that stored a reference to a dict, and user code could modify the dict as it saw fit. This leaves open the distinct possibility of the user manually changing the annotations on an object. User code could potentially:
- modify the dict, adding/removing/changing keys and values,
- set `o.__annotations__` to a new value (hopefully another dict!), or
- delete `o.__annotations__`.
If the user does any of these things, how should the output of `inspect.get_annotations` change?

First, I want to support this behavior as best I can. My starting goal is 100% backwards compatibility with existing code that manipulates `o.__annotations__`. Although I haven’t seen any code deleting `o.__annotations__`, I have seen reasonable code that overwrites or modifies `o.__annotations__`–and that code must continue to work. (So, I don’t propose changing `o.__annotations__` to a read-only dict, or preventing the user from overwriting or deleting the attribute, or anything else that would break existing code.)
However, if you manually change `__annotations__`, that means `__compute_annotations__` is now out-of-date. And there’s simply no viable way to automatically update `__compute_annotations__` to match. What should Python do?

Once again I refer to the Zen: “In the face of ambiguity, refuse the temptation to guess.” If the user modifies, deletes, or overwrites `o.__annotations__`, we don’t know whether or not the output of `o.__compute_annotations__` still matches the new annotations. Rather than keep it around, hoping that maybe it matches, `o` should drop its reference to `__compute_annotations__`. That way it can’t get called and we won’t generate stale values.
In the cases of overwriting or deleting `o.__annotations__`, we have it easy. `o.__annotations__` is already a property; we just make the “setter” and “deleter” methods on `o` drop its reference to its `__compute_annotations__`. This is the first component of our solution.

But we don’t have a reasonable way of detecting when the user modifies the `o.__annotations__` dict in place.
(Or do we? CPython 3.12 adds a new “watch” facility to PyDict, which lets a callback get notified any time a “watched” dict is modified. But this would be a pretty heavyweight solution. It’d require allocating memory for callback state for every `__annotations__` dict generated, to let the callback map the annotations dict back to `o`, which we’d then need to look up somehow. And even then it wouldn’t notify you if code mutated a mutable value inside the dict. In any case I don’t want to define the language to depend on this implementation feature–after all, other implementations of Python may not have such a facility. And it’s not defined as part of the language–it’s not exposed anywhere in the language or library. By the same token, I don’t want to define `o.__compute_annotations__` as returning a new subclass of dict that explicitly remembers when it’s been changed; I think this is too big and expensive for an incomplete solution, solving what is ultimately a small problem.)
What should we do? Let’s start by breaking down the problem into smaller chunks: what should `inspect.get_annotations` do for each of the supported formats?
For `VALUE` format, if `o.__annotations__` is set, `inspect.get_annotations(o, VALUE)` will simply return a copy of it. So if the user overwrites or modifies `o.__annotations__`, `VALUE` format will automatically reflect those changes. And if the user deletes `o.__annotations__`, `o` will drop its reference to `__compute_annotations__`, and `inspect.get_annotations(o, VALUE)` will return an empty dict–which would be the correct behavior. `VALUE` format already works fine in all scenarios.
What about `HYBRID` format? Consider this observation: if `o.__annotations__` is set to a value, that means the annotations dict must be computable–conceptually, all the values needed to compute the annotations are defined. Which means that if we computed `HYBRID` format right now, it would turn out identical to `VALUE` format! There wouldn’t be any undefined names we’d need to wrap with a `ForwardRef`.
Therefore, when you call `inspect.get_annotations(o, HYBRID)`, the first step is to see if `o.__annotations__` is set. If it is, return a copy of it, just like `VALUE` format does… because that’s the correct value. And if the user overwrites or modifies `o.__annotations__`, by definition `o.__annotations__` must be set. So in all scenarios where user code modifies annotations dicts, `HYBRID` format simply works the same as `VALUE`–which means it’s in good shape too.
(`HYBRID` format only tries running `o.__compute_annotations__` in a “fake globals” runtime environment if `o.__annotations__` isn’t defined, and if `o.__compute_annotations__(HYBRID)` doesn’t return a dict. Since this presupposes that `o.__annotations__` isn’t set, we simply can’t have the sticky problem of “the user modified the existing annotations dict”, by definition.)
It’s `STRING` and `FORWARDREF` formats where we run into a problem. We can’t return `o.__annotations__` like the other two formats. And we can’t simply turn the annotation values into strings with `repr`, like this:

```
return {k: repr(v) for k, v in o.__annotations__.items()}
```

because that computes the `repr` of the value of the annotation, rather than reproducing the original source code of the annotation. These two strings are often very, very different.
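A quick demonstration of how different they can be, using an annotation from `typing`:

```python
from typing import Dict, List

source_text = "List[Dict[str, int]]"   # the annotation as written in source
value = List[Dict[str, int]]           # the annotation after evaluation

# The repr of the evaluated value is not the original source text:
assert repr(value) == "typing.List[typing.Dict[str, int]]"
assert repr(value) != source_text
```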
If the user overwrites or deletes `o.__annotations__`, a request for `STRING` or `FORWARDREF` format will return an empty dict, which is correct. The real unsolved problem here is when the user modifies `o.__annotations__` in situ, then asks for `STRING` or `FORWARDREF` format. We don’t have a good way of detecting and handling this. In this case we’d call the out-of-date `__compute_annotations__` method and return stale data.
A cursory examination of code in the wild suggests this won’t be a major problem. Most of the time, third-party code that manually creates annotations overwrites `o.__annotations__`, or sets them on an object that didn’t define any annotations at compile time. That will all work fine. Code that modifies the existing `__annotations__` dict, on an object that had annotations defined at compile time, seems quite rare. In the discussion around this point, Carl found eight examples of existing code in published third-party libraries that modify `o.__annotations__` directly, including three in `attrs`. The good news: only one of the eight would actually result in stale data–the others would all produce correct results in practice. (And nope, the bad one wasn’t in `attrs`.)
I think we’ve now whittled this problem down to something small enough that we can just mention it in the `inspect.get_annotations` documentation, as follows:

> If you directly modify the `o.__annotations__` dict, by default these changes may not be reflected in the dictionary returned by `inspect.get_annotations` when requesting either `STRING` or `FORWARDREF` format. Rather than modifying `o.__annotations__` directly, consider replacing `o.__compute_annotations__` with a function that computes the annotations dicts with your desired values. Failing that, it’s best to overwrite `o.__compute_annotations__` with `None`, or delete `o.__compute_annotations__`, to prevent `inspect.get_annotations` from generating stale results for `STRING` and `FORWARDREF` formats.
Now, let’s bring all these semantics together, and write a simplified pseudocode version of `inspect.get_annotations`. I’ll elide a lot of border-case error handling code, and just concentrate on the main conceptual flow:

```
def get_annotations(o, format):
    if format == VALUE:
        return dict(o.__annotations__)
    if format == HYBRID:
        try:
            return dict(o.__annotations__)
        except NameError:
            pass
    if not hasattr(o, '__compute_annotations__'):
        return {}
    c_a = o.__compute_annotations__
    try:
        return c_a(format)
    except NotImplementedError:
        if not can_be_called_with_fake_globals(c_a):
            return {}
        c_a_with_fake_globals = make_fake_globals_version(c_a, format)
        return c_a_with_fake_globals(VALUE)
```
It seems important that `inspect.get_annotations` should never itself raise `NotImplementedError()`. For example, hand-written `__compute_annotations__` functions will often call `inspect.get_annotations` to actually calculate the annotations; they should be able to rely on `inspect.get_annotations` abstracting away this error state. Instead, whenever `inspect.get_annotations` is run on something where it can’t produce proper output, it returns an empty dict. This is already the defined API for `inspect.get_annotations`, and I think it should be preserved.
Finally: PEP 649 never specified how it would interact with PEP 563, “Postponed Evaluation of Annotations”, aka “stringized annotations”. If you activate `from __future__ import annotations`, should Python still generate `__compute_annotations__` functions? I think the answer is “no”. It would complicate the implementation of 649, and there’s no user benefit to delayed evaluation of hard-coded strings.
However, Carl had a novel suggestion here to make the transition from 563 to 649 easier–and it’s a good one. He proposed the following small hack: if `o` is an annotated object from a module that has `from __future__ import annotations` active, change `inspect.get_annotations(o, STRING)` to return the (stringized) annotations from `o`. This means that users currently relying on stringized annotations can immediately switch to calling `inspect.get_annotations(o, STRING)`, then turn off the `__future__` import at their leisure.
(You can’t directly detect whether or not a module has a particular `from __future__` feature enabled. But there’s a reliable indirect way to tell: `from __future__ import annotations` really does import an object called `annotations`, an instance of `__future__._Feature`. You can just check to see if that exists.)
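That check could look something like this. A sketch: the helper name is invented, but `__future__._Feature` and the module-level `annotations` binding are real.

```python
import __future__
import types

def module_has_stringized_annotations(module):
    # A module compiled under "from __future__ import annotations" has a
    # module-level name `annotations` bound to a __future__._Feature.
    feature = getattr(module, 'annotations', None)
    return isinstance(feature, __future__._Feature)

m = types.ModuleType('example')
assert module_has_stringized_annotations(m) is False
m.annotations = __future__.annotations     # simulate the __future__ import
assert module_has_stringized_annotations(m) is True
```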
This will have the curious side effect of making this expression true:

```
inspect.get_annotations(o, STRING) == inspect.get_annotations(o, VALUE)
```

when `o` is defined in a module with stringized annotations enabled. Otherwise this expression would never be true. It’s a little weird, but if we document it and explain our reasons, I think our users will thank us.
That’s the only change I plan to make in PEP 649 regarding PEP 563 and stringized annotations. I don’t plan to modify how stringized annotations work, and Python won’t generate `__compute_annotations__` functions for any of the objects in a module where `from __future__ import annotations` is active.
Polls
Should we rename `STRING` format?

- No, keep the name `STRING` format.
- Yes, change it to `STRINGIZED` format.
- Yes, change it to `SOURCE` format.
- Yes, change it to `CODE` format.
- Yes, change it to `SOURCE_CODE` format.
- Yes, but my vote is for a new name in the comments.
Should we rename `HYBRID` format?

- No, keep the name `HYBRID` format.
- Yes, change it to `PROXY` format.
- Yes, change it to `FORWARDREF` format. (I voted against the separate `FORWARDREF` format.)
- Yes, but my vote is for a new name in the comments.
Should PEP 649 initially be gated behind a `from __future__` declaration?

- No, it should be the default behavior immediately.
- Yes, let’s not make it default behavior right away.
Should `inspect.get_annotations` (and `__compute_annotations__`) support `FORWARDREF` format?

- No. Why add support for something nobody needs? YAGNI.
- Yes, we might as well / I have a good use case.