100% agree. I always have to alias Literal as sz for numpy.
I've been thinking about this approach more and been working on a dummy implementation to see how things shake out. Overall I think the AST-based approach is really useful, but there are some annoying wrinkles to work out. Here's the overall idea I have in mind now:
There's a new AST value added to the Format enum, and the auto-generated __annotate__ functions now look like this:
```python
def __annotate__(format):
    if format == Format.VALUE:
        return {
            "a": int,
            "b": str | int,
        }
    if format == Format.AST:
        import ast
        return {
            "a": ast._build((26, "int", 1)),
            "b": ast._build(...),  # some more complicated tuple
        }
```
That is, if the AST format is specified, we build essentially the same dict, but instead of actually evaluating the values we build the AST from a constant tuple that contains the necessary data.
And finally, we add two new helper functions to the typing module. The first calls an annotate function with the AST format, modifies the returned AST in some fashion to support future typing semantics, and then resolves that AST to a value/string/forward ref. The second uses the same mechanism to evaluate a string using type-expression semantics.
If users then want to get type annotations, they can use the first helper method and get the same kind of results as with the annotationlib functions, but the annotations can use typing-specific semantics. And if users want to use these semantics somewhere other than in an annotation or type alias, they can write eval_type("SomeTypeExpr"). In most cases users wouldn't even have to explicitly call the eval function, since library functions that expect type forms often already support stringified versions.
I don't think that it's necessary to return a new kind of object that can look up names in the annotated object's scope, since we can get (most of) that info from the annotate function. The relevant global namespace is stored in a dunder, and since the annotations themselves can't define any locals, the only relevant ones are cell vars, which are also stored in a dunder. Of course, this has the weakness that we can't look up names from stringified annotations. I'm not sure if there is a need to do that, though, since AFAIK the need for them is gone with delayed annotations. So if a user is having trouble with these name lookups, they should be able to just un-stringify the annotations and everything works.
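This works because annotate functions are ordinary function objects, so the namespaces needed to resolve names are already attached to them; a quick illustration of the two dunders in question:

```python
def outer():
    cellvar = int  # a cell variable referenced by the inner function

    def annotate(format):
        return {"a": cellvar}

    return annotate

annotate = outer()
# Module globals are reachable through the function itself...
print(annotate.__globals__ is globals())      # True
# ...and captured cell variables through its closure.
print(annotate.__code__.co_freevars)          # ('cellvar',)
print(annotate.__closure__[0].cell_contents)  # <class 'int'>
```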
Since most of the negative feedback here has revolved around the specific new semantics I originally proposed, I'd now limit this to just the general ideas and mechanics of typing-specific semantics. There seems to be broad support for that idea even without an immediate payoff for builtin types or similar.
Being able to evaluate annotations in a typing-specific context could iron out some of the subtle differences that already exist depending on syntax.
For example, the union syntax and typing.Union don't produce the same objects at runtime when one of the items in the union is unknown, and so annotationlib doesn't assume the syntax implies a union. If these were evaluated with that assumption, then it would be possible to get the actual union in both cases.
```pycon
>>> from annotationlib import get_annotations, Format
>>> from typing import Union
>>> from pprint import pp
>>>
>>> class Example:
...     syntax: str | unknown
...     subscript: Union[str, unknown]
...
>>> pp(get_annotations(Example, format=Format.FORWARDREF))
{'syntax': ForwardRef('__annotationlib_name_1__ | unknown', is_class=True, owner=<class '__main__.Example'>),
 'subscript': str | ForwardRef('unknown', is_class=True, owner=<class '__main__.Example'>)}
```
If extra formats were added to annotate, I would however like them to also solve the issue I brought up before with regard to the need to create new __annotate__ functions[1].
like the one in dataclasses where I accidentally broke certain dataclasses in 3.14.1
I wasn't familiar with this issue before, so I hope I'm understanding it correctly now. It's that autogenerated functions like the dataclass dunders have problems with their __annotate__ functions, since they run into issues if the annotations they are trying to modify can't be evaluated properly. So for example, if a dataclass looks like this:
```python
@dataclass
class Thingy:
    a: int
    b: WillNeverBeDefined
```
then __init__.__annotate__ can't just fall back to Thingy.__annotate__ and modify the returned dict, since b's annotation breaks things?
If my understanding is correct, the proposed AST format should already fix that. The autogenerated __annotate__ could then call get_annotations(Thingy, Format.AST) (regardless of which format it got passed), proceed to exclude arguments and modify the dictionary as needed, and then resolve the AST into the actually requested format. To make that easier, it might be best to add a helper function to annotationlib. 90% of the functionality is already there, since the forwardref and string formats already do essentially that, but with the extra work of having to first generate the AST using the fake-globals trick.
It's a bit more like this:
```python
@dataclass
class Example:
    a: list[Example]
    b: NeverDefined = field(init=False)
```
Currently (in 3.14.2, now that the bug I'd unfortunately introduced has been resolved) you can't get the VALUE annotations for Example.__init__, because the annotation for b won't resolve, even though it is not in the annotations for __init__.
The current FORWARDREF format is no good for creating VALUE annotations because it attempts to resolve as far as it can: for example, a will be list[ForwardRef("Example", ...)] and not ForwardRef("list[Example]", ...). list here is an easy example, but the ForwardRef could be anywhere in an arbitrary container.
The goal is to be able to collect objects from get_annotations(cls, format=Format.EVALUATE_LATER) at the time the dataclass is constructed, and to use those to make a new __annotate__ function. The generated __annotate__ should not need to refer back to the annotations from the class as they will have already been gathered when the dataclass was constructed. You would collect these objects in a dict, as you would in 3.13 or earlier, but instead of attaching to __annotations__[1], you would attach make_annotate_function(annos) to __annotate__.
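A hedged sketch of what make_annotate_function could look like. The deferred annotations are modeled here as plain strings rather than the proposed unevaluated objects, and the format constants are stand-ins for annotationlib's Format members; note that already-resolved objects (such as the {"return": None} entry mentioned below) can simply be passed through:

```python
VALUE, STRING = 1, 4  # stand-ins for Format.VALUE / Format.STRING

def make_annotate_function(annos, globalns):
    # annos maps names to either deferred (string) annotations or
    # already-resolved objects such as {"return": None}.
    def __annotate__(format):
        if format == STRING:
            return {name: v if isinstance(v, str) else repr(v)
                    for name, v in annos.items()}
        if format == VALUE:
            return {name: eval(v, globalns) if isinstance(v, str) else v
                    for name, v in annos.items()}
        raise NotImplementedError(format)
    return __annotate__

annotate = make_annotate_function({"a": "int", "return": None}, {})
print(annotate(VALUE))  # {'a': <class 'int'>, 'return': None}
```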
Edit: I'll also note that the logic for dataclasses is relatively simple; it's more annoying to do it for something like attrs, which also has the annotations change due to the presence of converters.
as attrs does for example; dataclasses used to write them into the source code.
What is the AST format missing that EVALUATE_LATER would provide? We can resolve AST nodes to values, forward refs and strings. So why can you not build the annotate function from the dict of AST nodes? Or is this not about the underlying functionality and more about making sure the utility functions that do that are added to the annotationlib module?
It's largely about making sure the utility functions are there.
The requirement is only that the objects received are not evaluated and are able to be evaluated correctly later. The DEFERRED (or EVALUATE_LATER) format as proposed in the other post planned to use the unevaluated ForwardRef objects largely because they already exist inside call_annotate_function.
Format.AST may be fine for this; my point was more that the generated __annotate__ wouldn't be calling get_annotations itself. It would use a dictionary of annotations collected in this new format and then just call some evaluate method on each to return the annotations in the requested format when the annotate function is called.
The only other thing is that you might need to make an annotate function from objects that are already resolved, such as the {"return": None} annotation we add for the __init__ function. I'm not sure how that would work with this AST format, as I haven't looked too closely yet. If that's not a problem then fine.