What is the right way to extract PEP 593 annotations

What is the right way to extract PEP 593 annotation information attached with Annotated[] from a class/type object?

I want to annotate the members of a class to indicate how to process objects of the class. Ideally I’d like to use a TypedDict as the base class for this.

There’s a suggestion here to use a mix of typing.get_origin and typing.get_args which looked promising.

But this doesn’t seem to play nicely with PEP 655 Required[] and NotRequired[]. Specifically PEP 655 states:

Required[] and NotRequired[] can be used with Annotated[], in any nesting order

This means that get_origin could return Required and the Annotated might come from get_args, or vice versa, or any mix of those.

I can write a function to iterate down those three specific wrappers, but this leaves me with two concerns:

  • Is this list of wrappers a closed set? Will future versions of Python add more? PEP 593 was Python 3.9 and PEP 655 was Python 3.11, so I guess this is really not future-proof.
  • Am I re-inventing the wheel? PEP 593 seems to talk about how to annotate but doesn’t describe how to extract information from annotations. Am I missing something obvious here?

The iteration may look something like this:

from typing import NotRequired, Required, get_args, get_origin

while get_origin(foo) in (Required, NotRequired):
    foo = get_args(foo)[0]

But this will obviously break the moment a PEP adds a fourth thing.

Worked Example:

from typing import TypedDict, Annotated, NamedTuple, Required

class ExtractFrom(NamedTuple):
    name: str
    source_id: int

class UsefulInfo(TypedDict):
    a: Required[Annotated[int, ExtractFrom("x", 1)]]
    b: Annotated[Required[int], ExtractFrom("y", 2)]
    c: Annotated[int, ExtractFrom("z", 3)]

Given this definition, all I really want is code that will search for ExtractFrom instances and give me a dictionary:

    {
        "a": ExtractFrom("x", 1),
        "b": ExtractFrom("y", 2),
        "c": ExtractFrom("z", 3),
    }

I feel like I must be over thinking this and missing something obvious.


Required/NotRequired are two things that may wrap Annotated. There are a couple more: Final/ClassVar are also allowed to wrap Annotated, and the upcoming ReadOnly will be another. There was a time when Annotated[Final[int], ...] was rejected but Final[Annotated[int, ...]] worked. There was an issue to allow Annotated to stay outermost, so for recent Python versions I think you could require that Annotated is top level, although nested Annotated is also valid. What do you want to do with Annotated[Annotated[Final[int], ...], ...]?

I’m unaware of a better way than get_origin/get_args and writing helpers that recursively normalize/adjust. I mostly just strip Required/Final/ClassVar away when trying to find Annotated metadata.

As a note, your definition c: Annotated[ExtractFrom("z", 3)] is invalid for type checkers. Annotated must have 2+ arguments, where the first is a type.

Thanks for confirming my fear that there are others and that there will be more.

As I pointed out, the problem with this is that you need to know the definitive list of these. Let’s call them “meta annotations” (I don’t see a name for them). When I’m recursing through them, I need to know when I’ve reached the bottom.

That is to say, if I get to foo == Required[] then I should recurse with foo = get_args(foo)[0]. However, if I get to foo == list[] I should stop recursing.

But how can I possibly write an if statement like that? The list of “meta annotations” is growing with each version of Python, and I can’t know every type. So in the context of this if statement there is no difference between list[] and Required[] that will not age out with every new version of Python.

These are flattened by the runtime, and the equivalence is specified in PEP 593:

Nested Annotated types are flattened, with metadata ordered starting with the innermost annotation

However you are right that it’s possible to construct something that isn’t flat:

Annotated[Required[Annotated[int, "a"]], "b"]

Oops. I’ve edited to correct the typo.

I wouldn’t worry about enforcing correct nesting yourself; that way you don’t actually need to know which special forms can contain an Annotated and which ones can’t. Type checkers and linters will catch incorrect nestings for you.

So you just keep recursing until you get an empty tuple back from get_args and pick up all the instances of Annotated along the way.

Or you could just be pragmatic and force users of your library to always use an outermost Annotated. You don’t have to be nice and support absolutely everything; there are various other restrictions on runtime annotations anyway, such as not supporting type-checking-only symbols in forward references unless you somehow inject those yourself into globals/locals.

If you do this you’ll need a minimum Python version. 3.11 is the earliest version to allow Annotated at the top, as it was fixed here. If you support 3.9/3.10 you’ll need to handle Final[Annotated[…]] or just forbid Final being used with Annotated.

This is a fair restriction; I’m just noting it because my current codebase is on 3.9 and deals with this stuff.


But how can I possibly write an if statement like that? The list of “meta annotations” is growing with each version of Python, and I can’t know every type. So in the context of this if statement there is no difference between list[] and Required[] that will not age out with every new version of Python.

I think it’s reasonable to have a small, roughly annual update adding a new special form when needed, and to just specify which forms are supported. Most runtime type checkers specify which special forms they support, and it’s normal that the newest forms may take some time to handle.

It feels strange to me that, given the original intention of Annotated, there isn’t a built-in helper to handle this type of logic. If the helper were built in, it could be maintained along with additions from new PEPs.

At the moment the standard library provides the core typing constructs. Runtime typing is a supported use case, but it is mainly done outside the standard library, so most of these utilities live in individual runtime type tools. Runtime typing is also relatively new and still somewhat unstable. In my experience, when upgrading Python versions the most likely place I’ll see interesting behavior changes or test failures is runtime type manipulation. I try to keep most of my code to public APIs, but there are a few places (forward-ref evaluation/manipulation) that need more, or that just have interesting behavior changes.

I’ll note that once a year, while pretty good, is not the only time typing constructs evolve. New typing features can be available in typing-extensions before the next Python release. It’s pretty typical for a typing feature to go PEP → implemented in typing-extensions → implemented in the standard library → released.

For this particular issue, if you limit yourself to 3.11+, a fine solution is to just say that you only support an outer Annotated and that nesting isn’t handled at all. That should reasonably cover future special forms.

edit: I’d personally be happy if the standard library gained a couple of runtime type utilities or made a few things public. That’s maybe better for a separate topic, along with specific examples of which utilities. I find it unlikely it would handle too much for you, so as to leave it easier for other parts of the ecosystem to develop on their own timeline, but adding the 2–4 most common runtime type needs feels fair. Also, I say this as someone who is not a typing maintainer.

We’re generally open to adding new utilities to typing to make introspection easier and less error-prone.

In this case, I could see a case for a new function, let’s call it strip_annotations().

It could work like this:

assert strip_annotations(Required[int]) == (int, [(Required, ())])
assert strip_annotations(Annotated[Required[int], "something"]) == (int, [(Annotated, ("something",)), (Required, ())])

Looking at the code, it seems the knowledge of what is and isn’t an annotation is already encoded in a function named _strip_annotations(); however, its return value is almost the exact opposite of the strip_annotations() proposed above.

On first glance, the existing function could be modified to return the annotations it is stripping off.

However, it raises a question for some edge cases, including unions. The existing function currently strips off annotations, so:

Annotated[int, "foo"] | Annotated[str, "bar"]

becomes:

int | str

If the function were modified to return the stripped annotations, I’m not sure how it could express the fact that "foo" is attached to int and "bar" is attached to str.

Okay, I’ve submitted a feature request, and for now the closed set of Required, NotRequired will work until Python 3.13.

The reason this is so difficult is that the typing system is being misused by Required and NotRequired (and Final and a few others). Those belong to the variable being annotated, not to the type itself; it doesn’t make sense for them to be part of the type hierarchy. This is part of what my suggestion here was supposed to solve. With that syntax you would instead write int @ Required() @ Something(), and you couldn’t nest those annotations inside the type system (since they don’t belong there).

I’d argue something different. Required and NotRequired seem to be considered part of the annotations system (see here). To my mind, the simple mistake was making them something in their own right. They should have just been syntactic sugar for adding markers to the annotations.

That is, I’d argue Required[int] and Annotated[int, Required] should have resulted in the same thing. Then the rule on collapsing annotations would have flattened them all into one top level container.


Required[Annotated[int, ExtractFrom("x", 1)]]

would have become:

Annotated[int, ExtractFrom("x", 1), Required]

This would take away all the searching and always leave the Annotated at the top unless the user did something crazy like:

Union[Annotated[int, "foo"], Annotated[str, "bar"]]

I was going to make a post about how type checkers are supposed to ignore arguments after the first, but it turns out PEP 593 makes no such requirement. I think I agree with you both; it’s only tangentially related to the type system and maybe should have been introduced as an annotation.

Yep, exactly. I might have worded that a bit too strong.

The only reason we really need syntactic sugar here is that Annotated is way too verbose for many. That is another motivation for my @ syntax proposal.

This is another point my proposal targets: the @ syntax would only be valid at the top level. It conceptually isn’t attached to a type, but to the variable.

And inherently, using Annotated makes the other use cases second-class users of annotations.

In a completely parallel topic, and yet totally the same one: since Python 3.10, annotations are not inherited. There is a long discussion of the topic on issue 88067.

Is there a “recommended” way to find ALL the annotations?

Am I missing something obvious here?!

My first, second, third, and fourth readings of PEP 593 leave me with the impression that it’s about allowing frameworks to attach additional metadata that can be collected at runtime. E.g., major use cases along the lines of SQLAlchemy and Pydantic could use it to map attributes onto external data sources.

I guess I’m a little stunned that issue 88067 apparently dropped inheritance without discussing the obvious question of how to collect metadata for inherited attributes.

I’m left wondering if I’ve completely misunderstood something.

(@Jelle ?)

You can look at the source code of typing.get_type_hints: you essentially have to look at the MRO of the class to collect all the annotations; there’s not really any way around this. So it’s a similar situation to cls.__dict__.keys() vs. dir(cls), where you need higher-level introspection functions to collect all the data.
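A stripped-down version of that MRO walk (roughly what get_type_hints does internally, minus forward-reference evaluation; the classes are just for demonstration):

```python
def all_annotations(cls: type) -> dict:
    """Merge __annotations__ along the MRO, most-derived class winning."""
    merged = {}
    for base in reversed(cls.__mro__):
        # Read each class's own __dict__ so nothing is picked up twice.
        merged.update(base.__dict__.get("__annotations__", {}))
    return merged

class Base:
    x: int

class Child(Base):
    y: str

assert all_annotations(Child) == {"x": int, "y": str}
```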

PEP 649 should help with that a little bit, by adding additional modes to typing.get_type_hints so you can still use it when your annotations contain forward references that cannot be resolved.

@Daverball Thanks. So typing.get_type_hints(include_extras=True) is similar to inspect.get_annotations() except that it follows the MRO. Curious that it’s not mentioned in Annotations Best Practices — Python 3.12.2 documentation

Funny. I’ve seen a bunch of other discussions with many people missing the function. Perhaps because it’s in typing instead of inspect, a lot of people miss it.

inspect.get_annotations is the more general-purpose function, since annotations don’t have to be used for type hints; the feature predates the optional type system. Although at this point they might as well be typing-only, since type checkers require the use of Annotated for interoperability.

All typing related introspection is in the typing module, such as typing.get_origin, typing.get_args and typing.is_typeddict.

@Daverball Thanks! Context here is helpful.

inspect.get_annotations is the more general purpose function

At grave risk of sounding argumentative, I certainly think you are right about the intent. But…

…I think 88067 changed things to make that less true, or perhaps failed to maintain it.

In almost every case I can think of, when you want to know about annotations on a class, you’d want to know about them for the fully constructed class, combining all its ancestor classes along the MRO.

The only contexts I can think of where you’d want just the immediate class at runtime, and not its ancestors, would be custom metaclasses (and perhaps, just maybe, custom __new__ methods). That rather leaves inspect.get_annotations() in the “special case” category rather than “more general”. Though I do take your point about “more general” meaning specifically “more than just typing”.

This all keeps failing the principle of least astonishment for me, and I can’t quite figure out if that’s because I have a warped perception of what this is all intended for, or if it is genuinely all a bit crooked itself.

Well, the most common runtime usage of class annotations I can imagine, dataclasses and similar constructs, specifically does not want the rest of the MRO, AFAIK.