I’m not sure whether you’re saying that the difference in behaviour when arguments are forward references should be documented, or that @ should make the assumption.
My view is that Format.FORWARDREF should be conservative and only resolve if it can be done in such a way that, when the ForwardRef objects are resolved it will give the same result as Format.VALUE[1].
I’ll note that although I also don’t make this assumption in reannotate, I do let the user subsequently assume, for a deferred annotation, that all names are types. With get_origin and get_args this makes it possible to extract the other types from the | syntax:
>>> from annotationlib import Format
>>> from reannotate import get_deferred_annotations, get_args, get_origin
>>> class Example:
...     a: unknown | str | int
...
>>> a_anno = get_deferred_annotations(Example)['a']
>>> print(a_anno)
DeferredAnnotation('unknown | str | int')
>>> a_anno.evaluate(format=Format.FORWARDREF)
ForwardRef('unknown | __annotationlib_name_1__ | __annotationlib_name_2__', is_class=True, owner=<class '__main__.Example'>)
>>> get_origin(a_anno).evaluate(format=Format.FORWARDREF)
<class 'typing.Union'>
>>> for arg in get_args(a_anno):
...     print(arg.evaluate(format=Format.FORWARDREF))
...
ForwardRef('unknown', is_class=True, owner=<class '__main__.Example'>)
<class 'str'>
<class 'int'>
[1] Except for some special corner cases like str(undefined), which fail due to implementation details. (Note the use of parentheses, not square brackets!)
David, you make a compelling case for a more powerful annotationlib. Native operators further expose the shortcomings of the current FORWARDREF approach; this already affects unions.
I want to verify my CPython prototype exposes everything reannotate needs. I am ready to help drive these paths to ensure the ecosystem has the structural tools it needs:
New Format: Add Format.TYPE_STRUCTURE to resolve operators while proxying missing names.
Upstreaming: Help upstream reannotate’s logic into CPython.
Would it perhaps be possible to restrict Python, so that putting random objects into type annotations results in Undefined Behaviour?
This wouldn’t break Python’s “you’re allowed to do whatever you want, we’re all consenting adults” principle as I see it, because another way to phrase that is “FAFO”.
I feel like Python has hit the point where an excess of flexibility starts hurting ergonomics. Even if there are some genuine use cases that such a restriction on Defined Behaviour would ruin, I think it would be an improvement for most programmers.
Personally I don’t really want to see a proliferation of Format values that all do slightly different things.
Initially the functionality of reannotate was proposed as a new Format.DEFERRED value, the intent being that any other forms of evaluation could be handled as operations on the DeferredAnnotation objects rather than all being added as new Formats to get_annotations.
Note that despite this demonstration, I’m still against adding the @ syntax for Annotated. Other than avoiding the current import time penalty for import typing I don’t think it’s an improvement over the explicit Annotated[T, ...] syntax.
There’s no guarantee that new syntax avoids or defers an import. I would expect this doesn’t avoid, but could, in most cases, result in deferring. 3.12+ class generic syntax implicitly imports typing.
I think the ideal would be for any typing constructs that are tied to syntax to be lifted out of typing and into types, and to purposefully keep them to minimum behavior, with all introspection or validation related functionality kept elsewhere and done lazily (don’t validate unless the annotation is evaluated as a type), but that’s a separate issue overall.
I guess I should have said potentially avoiding the import time penalty. Yes, it’s not a general guarantee but without it there’s even less of a purpose to my mind as I think the proposed syntax is less clear.
I’m actually a little surprised by the generic syntax example, I would have expected it to only need _typing.
I believe no one had the appetite to try rewriting some of the complex typing.py internals such as these in C, which is why the C code sometimes calls into typing.py.
I may be missing something, but is import time penalty an issue now that we have lazy imports and deferred annotations? If you don’t care about runtime introspection, then no matter if the evaluation of the @ operator results in importing typing, it won’t happen at runtime; if you do, then typing/annotationlib will have to be imported and import time will not be the bottleneck.
My assumption is that if you’re using Annotated there’s a reasonable chance it’s for a tool that’s doing something with the information at runtime.
As such the @ operator, implemented in a similar way to Union, could avoid the import, although if you need to use something from typing to work with it then yes, it is irrelevant that the syntax avoided the import. annotationlib would probably be required either way but is a little lighter than typing[1].
Otherwise yes, you can lazy from typing import Annotated in 3.15 to also avoid the import. Or put it behind a TYPE_CHECKING block to really avoid the import and make things more difficult for runtime evaluation.
[1] That said, in my classbuilder I still defer importing it unless standard annotations fail and I need forward references.
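To make the TYPE_CHECKING point concrete, here is a minimal sketch of what a runtime tool runs into. The function name command and the helper try_resolve are made up for illustration, and the annotation is written as a string so that defining the function does not itself need the name:

```python
from typing import TYPE_CHECKING, get_type_hints

if TYPE_CHECKING:
    # Only static type checkers follow this branch; at runtime it is skipped,
    # so the name Annotated never lands in this module's globals.
    from typing import Annotated


def command(timeout: "Annotated[int, 'seconds']" = 10) -> None:
    """Hypothetical function with a string annotation."""


def try_resolve(func):
    """Return the resolved hints, or the NameError raised while resolving."""
    try:
        return get_type_hints(func, include_extras=True)
    except NameError as exc:
        return exc


result = try_resolve(command)
print(type(result).__name__, result)
```

The import is avoided in the annotated module, but the introspecting tool has to supply the missing name itself (or fall back to forward references), which is the "more difficult for runtime evaluation" part.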
That’s right. When I implemented it I had the vague intention to move more of it to C later, but I haven’t felt a strong motivation to move it and nobody else has stepped up to do the work.
Still thinking about whether I want to sponsor this PEP. I’m not sure the need is strong enough.
In the “Union as X|Y” PEP, the first reason they give for the change was to make it easier to read. I think the @ syntax mostly fulfills that. It’s not as elegant as | (or &/~ for the intersection proposal), but it’s probably the best key (though, I think an argument can be made for ?).
Presumably this wasn’t an issue for using | for unions (and I’m guessing it was more likely for someone to add an __or__ method to the metaclass than __matmul__). I’m sure someone could use some GitHub-fu to figure out if someone is doing this.
I’m not really sure how a readability argument stands without seeing some real examples that it makes clearer. As the new syntax it would be on the proposal to demonstrate this. Also unlike | I don’t believe there’s prior art from other languages for this?
No, this is technically also an issue for unions, however they were added before PEP-649 annotations. Format.FORWARDREF tries not to make assumptions about undefined objects. This can mean that order changes how the references evaluate.
For example:
from annotationlib import get_annotations, Format
from pprint import pp
from typing import Union

class Example:
    a: Path | str | bytes
    b: str | bytes | Path
    c: Union[Path, str, bytes]
    d: Union[str, bytes, Path]

pp(get_annotations(Example, format=Format.FORWARDREF))
In this case b, c and d are unions, but a is a ForwardRef: Path is undefined, and annotationlib doesn’t assume that it is a type, so it can’t assume __or__ will produce a union.
In terms of readability, the fact that the actual type directly follows the name helps a lot. This is especially true for “large” metadata that spans multiple lines.
When writing CLI tools with Typer I often end up with code like this:
@cli.command()
def some_command(
    …
    timeout: int @ typer.Option(
        '--timeout',
        '-t',
        help='Timeout in seconds.',
    ),
)
It’s now much clearer what the type of timeout is, in addition to saving 3 lines and an indentation level[1].
Especially when working with other developers who aren’t familiar with advanced Python typing concepts, Annotated[…] is very unintuitive, and masks the “basic” type annotation that they might otherwise be familiar with.
Before PEP 593 (Annotated), libraries like Typer, Pydantic, and FastAPI used the “direct assignment” pattern:
def some_command(
    ...
    timeout: int = typer.Option(
        10,
        '--timeout',
        '-t',
        help='Timeout in seconds.',
    ),
):
While there’s not a lot of “visual clutter” it just hurts my head to read this and requires the type-checker to specifically know about your library.
I think we should offer something like:
def some_command(
    ...
    timeout: int @ typer.Option(
        '--timeout',
        '-t',
        help='Timeout in seconds.',
    ) = 10,
)
Given the massive popularity of these libraries, asking users to adopt the obscure Annotated syntax is a big ask. Without a more natural shorthand, users will likely continue to favor the “direct assignment” hack simply because it stays out of the way.
I did some rudimentary tests with my Python prototype. Typer, Pydantic and annotated-types all seem to work right out of the box. Here’s a pydantic/annotated-types example:
from typing import Annotated
from annotated_types import MinLen
from pydantic import BaseModel, Field

class User(BaseModel):
    uid: int @ Field(gt=0, description="Primary Key")
    tags: tuple[str @ MinLen(3), ...] @ Field(max_length=10) = ()
    status: str @ Field(description="Account status") = "active"
I actually prefer the “direct assignment hack” in these cases. Both from a runtime simplicity and a readability standpoint.
From the development point, my case is more like dataclasses where people have wanted similar things. In those cases it’s much less work[1] to get the metadata directly from an object than it is to extract it from the annotations namespace.
From the readability point my issue is that it makes it harder for me to spot the default value. In the first example I can see ‘timeout’, ‘int’ and ‘10’ all relatively close together. In the second it’s pushed further away to the end of the option line. I have the opposite “hurts my head” experience.
One thing I’d like to see is how this is supposed to look for multiple pieces of metadata over multiple lines. I actually think the original proposal to use a list may have been better for that case as it doesn’t rely on the metadata object having its own parentheses and would avoid having lines that look like ) @ metadata( in order to chain things.
[1] I’m not just talking about writing the logic to do this, but the fact that there’s just more to do at runtime to get the same information.
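As a rough illustration of the "more to do at runtime" point, the sketch below collects the same metadata both ways. Option is a made-up stand-in for something like typer.Option, and the two helper functions are only illustrative, not how any of these libraries actually do it:

```python
import inspect
from dataclasses import dataclass
from typing import Annotated, get_args, get_origin, get_type_hints


@dataclass(frozen=True)
class Option:
    """Hypothetical metadata object (not the real typer.Option)."""
    flag: str
    default: int = 0


def direct(timeout: int = Option("--timeout", default=10)): ...


def annotated(timeout: Annotated[int, Option("--timeout")] = 10): ...


def options_from_defaults(func):
    # The metadata object *is* the default value, so one pass over the
    # signature is enough.
    params = inspect.signature(func).parameters
    return {n: p.default for n, p in params.items() if isinstance(p.default, Option)}


def options_from_annotations(func):
    # The annotations must be evaluated first, then each Annotated alias
    # unpacked and its metadata filtered.
    found = {}
    for name, hint in get_type_hints(func, include_extras=True).items():
        if get_origin(hint) is Annotated:
            for meta in get_args(hint)[1:]:
                if isinstance(meta, Option):
                    found[name] = meta
    return found


print(options_from_defaults(direct))
print(options_from_annotations(annotated))
```

Both return a dict keyed by parameter name, but the second path has to resolve every annotation before it can even look for metadata.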
Thanks everyone for the feedback. Following the consensus here, we’ve settled on the bracket-free T @ Meta syntax and formalized it into a Pre-PEP draft.
You can read the spec and test the implementation right now:
Live WASM CPython Demo: here (if it’s unstable, I can create a MyBinder notebook)
Implementation Status:
CPython Core: Zero grammar changes. We are simply reusing the existing @ operator. By implementing __matmul__ on type, type objects (like int) return the alias, while scalars (like 5) don’t, fully preserving existing matrix multiplication (like NumPy).
Forward Refs: Patched annotationlib to correctly resolve annotations around undefined values as Annotated forward refs (using a new Format.FORWARDREF_STRUCTURAL).
Type Checkers: Working prototypes built for MyPy and PyRight.
Migration: Prototyped a Ruff auto-fix rule to upgrade existing Annotated[T, Meta] codebases.
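For anyone who wants to play with the operator semantics without the patched interpreter, the idea can be approximated in pure Python with a metaclass. This is only a sketch under assumptions: AnnotatableMeta and Timeout are invented names, and the real prototype puts __matmul__ on type itself, which can’t be done from Python code:

```python
from typing import Annotated, get_args, get_origin


class AnnotatableMeta(type):
    """Hypothetical metaclass emulating the proposed type.__matmul__."""

    def __matmul__(cls, metadata):
        # Assumption: T @ meta produces the same alias as Annotated[T, meta].
        return Annotated[cls, metadata]


class Timeout(int, metaclass=AnnotatableMeta):
    """Stand-in for a builtin type such as int."""


alias = Timeout @ "seconds"
print(get_origin(alias), get_args(alias))

# Instances (scalars) don't see the metaclass method, so @ on values is
# untouched and still fails for plain ints as it does today.
scalar = Timeout(5)
try:
    scalar @ "seconds"
except TypeError:
    scalar_supports_matmul = False
else:
    scalar_supports_matmul = True
print(scalar_supports_matmul)
```

Because the method lives on the metaclass, only the class object responds to @; objects that define their own __matmul__ (NumPy arrays, for example) keep their existing behaviour.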
I’m curious why none of the alternatives discussed in Dedicated syntax for `Annotated` were considered in the PEP under Alternatives Considered. Especially the favoured style of foo: [int, "something"] seems to address a few of the issues raised here, doesn’t require overloading an operator, naturally composes across multiline statements and is, imo, easier to read because it’s clearly scoped through the brackets.
Especially when adding multiple annotations, the @ syntax gets difficult to comprehend: name: str @ Field(min_length=3, max_length=50) @ Alias("Name") = "Anonymous" is a lot harder to parse for me than name: [str, Field(min_length=3, max_length=50), Alias("Name")] = "Anonymous", as the latter is clearly grouped by its brackets.
Even if not chosen, it would be nice to see this included in the PEP as a rejected alternative, with reasoning as to why it was rejected.
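For what it’s worth, the two spellings should carry the same runtime information, assuming each @ simply wraps its left operand in Annotated (that is my reading of the proposal, not something confirmed here). typing already flattens nested Annotated, which this small sketch shows with made-up Field and Alias classes:

```python
from dataclasses import dataclass
from typing import Annotated, get_args


@dataclass(frozen=True)
class Field:
    """Hypothetical metadata class (not pydantic's Field)."""
    min_length: int
    max_length: int


@dataclass(frozen=True)
class Alias:
    """Hypothetical metadata class."""
    name: str


# What str @ Field(...) @ Alias(...) would build if each @ wraps in Annotated.
chained = Annotated[Annotated[str, Field(3, 50)], Alias("Name")]

# What the bracketed list form would presumably map to.
bracketed = Annotated[str, Field(3, 50), Alias("Name")]

print(get_args(chained))
print(chained == bracketed)
```

So the difference between the two proposals is purely about how the source reads, not about what tools would see at runtime.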