I discovered a few questionable behaviors recently around typing and annotations. In each case, backwards compatibility probably means we can’t change the behavior, so I am not proposing any concrete change, but I want to make more people aware of these issues.
The three issues are explained in more detail here:
I know I’ve hit the first behavior before and added a special case to my runtime type inspection logic to deal with it. I don’t think I’ve ever noticed the other two. For the second, I tend to merge the type annotations of parent classes anyway to collect all attributes, while the third feels even harder to hit naturally.
I would generally be happy to have behaviors like the first one “fixed”, so that annotation values don’t come with extra undocumented rules.
I think the second and third behaviors are intentional and arguably justified.
TypedDicts get their “base class” annotations because they don’t actually have base classes (other than dict); their “inheritance” is rather a syntactic shortcut for constructing a new independent TypedDict that gets the fields from the “bases”. The “bases” aren’t actually found in the MRO of the new TypedDict, so if the annotations weren’t copied, it would be difficult/unintuitive to actually introspect the full type signature of a TypedDict at runtime. (Technically it would be possible via __orig_bases__ but that’s pretty obscure.)
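This is easy to observe at runtime. A small sketch (class names here are made up for illustration):

```python
from typing import TypedDict

class Base(TypedDict):
    x: int

class Child(Base):
    y: str

# Base is not a real runtime base class of Child: the MRO only
# contains dict (and object).
assert Base not in Child.__mro__

# The declared bases survive in __orig_bases__ (on recent CPython
# versions), which is the obscure introspection route mentioned above.
print(getattr(Child, '__orig_bases__', None))

# Because Base is not in the MRO, the parent's fields are copied into
# the child's __annotations__, keeping the full signature introspectable.
assert {'x', 'y'} <= set(Child.__annotations__)
```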
(I’m not sure that this is the best/only way for TypedDict to be implemented at runtime, but it’s what we have.)
For the third, it seems reasonable that annotations of things that aren’t just a name wouldn’t go into __annotations__, since it’s not clear what the key would be. Maybe a single parenthesized name could be special-cased, but it’s not clear who would practically benefit.
The first behavior seems to me like an outright bug that I would prefer to just fix, but the backward-compat consequences might indeed be too great.
That’s true, I hadn’t realized that the declared bases of a TypedDict don’t actually end up in its __bases__. A problem with merging the annotations is that robust runtime introspection tools need to know which scope each annotation came from (in order to resolve string annotations). Fortunately, the first behavior makes this a little easier, because string annotations in TypedDicts get converted to ForwardRefs that are aware of their defining module. If this weren’t the case, it would not be possible to resolve string annotations in TypedDict fields at all.
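For reference, the first behavior looks roughly like this (names made up; observed on CPython 3.10–3.13, where a string annotation in a TypedDict body is eagerly converted to a ForwardRef at class creation time):

```python
from typing import TypedDict

Alias = int

class TD(TypedDict):
    f: 'Alias'   # written as a string on purpose

ann = TD.__annotations__['f']
# Instead of the plain string 'Alias', this is a ForwardRef that
# remembers the module it was defined in, which is what makes
# resolving the string annotation from another module possible.
print(ann)
```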
Of course, the implementation of PEP 649 will make this a lot more robust.
I agree that it makes sense that annotations for a.b: T or a[b]: T are not stored anywhere. What I find questionable is that (a): T is also discarded: I would have expected it to be equivalent to a: T. The implementation even needs a special case to avoid storing the annotation here.
Actually, there’s a subtle bug related to this. Consider the following example:
`module1.py`:

```python
from typing import TypedDict

A = int

class TD1(TypedDict):
    f: 'A'
```

`module2.py`:

```python
from typing import get_type_hints

from module1 import TD1

A = str

class TD2(TD1):
    pass

TD2.__annotations__
#> {'f': ForwardRef('A', module='module1')}
get_type_hints(TD2)
#> {'f': str}
```
While it might seem impossible for f to be resolved as str, given that the forward ref has the correct module set, this is a consequence of the backwards-compatibility trick in get_type_hints:
This means that when calling get_type_hints(TD2), the localns ultimately passed to eval (inside ForwardRef._evaluate) is the globalns of module2 (containing A = str) and the globalns is the one from __forward_module__ (module1). Because locals take priority over globals, A resolves to str.
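The priority inversion can be reduced to a one-liner. A simplified stand-in, where two plain dicts play the roles of the modules’ namespaces:

```python
globalns = {'A': int}   # stands in for module1.__dict__ (A = int)
localns = {'A': str}    # stands in for module2.__dict__ (A = str)

# This mirrors the eval() call inside ForwardRef._evaluate:
# locals shadow globals, so 'A' resolves to str, not int.
assert eval('A', globalns, localns) is str
```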