An alternative to Annotated

This is a half-baked idea I got from reading the recent discussions about PEP 727. One of the big concerns raised there is that using Annotated means any use of Doc requires a type hint, even if one isn’t otherwise wanted, and more generally that annotations are supposed to be type information, which documentation isn’t really.

My suggestion would be to use an AFAIK currently unused syntax in annotations for non-typing purposes, specifically function calls:

def create_user(name: Doc("The User's name")): ...

If you want both type hints and non-typing annotations, the @ operator could be overloaded to create an AnnotationList class (or an Annotated instance):

def create_user(name: str @ Doc("The User's name")): ...

The stdlib would gain a new module, named for example annotations, with a few tools:

A BaseAnnotation class that would act similarly to BaseException: it would store the passed args and kwargs and have a repr that looks similar to a dataclass repr, i.e. such that the instance can be reconstructed from it. Doc would then be a subclass (maybe also in the same module) that would verify that only one argument was passed.

An AnnotationList class (probably not a subclass of BaseAnnotation) that would be constructed by a __matmul__/__rmatmul__ overload on BaseAnnotation, such that arbitrary annotations can be chained.

A function get_annotation or similar that takes a (tuple of) subclass(es) of BaseAnnotation and an arbitrary entry of an __annotations__ dict (or an entire dict/object with annotations?) and extracts the corresponding annotation instance(s), or returns None/raises an exception. This would deal with the different forms the passed-in annotation can take: a string, a type hint with no annotations, a direct instance of a BaseAnnotation subclass, or an AnnotationList instance. (I didn’t think about how this would play with delayed annotation evaluation; this might need to be changed.)

Either here or in typing there would be a magic AnnotationAlias type marker like the TypeAlias marker, with basically the same semantics. However, TypeAlias would not carry over the new annotations, but only the actual type (for static type checkers at least), so that an @ Doc() on a TypeAlias would document the alias and not the parameters where the new name is used.
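
To make the three runtime pieces above a bit more concrete, here is a minimal sketch of how they could fit together. Nothing in it exists in the stdlib: BaseAnnotation, AnnotationList, Doc and get_annotation are only the hypothetical names from this post, the attribute names (type_hint, annotations, text) are my own assumptions, and string annotations and delayed evaluation are deliberately not handled.

```python
class AnnotationList:
    """Hypothetical pairing of an optional type hint with non-typing annotations."""

    def __init__(self, type_hint, annotations):
        self.type_hint = type_hint  # None when no type hint was given
        self.annotations = tuple(annotations)

    def __matmul__(self, other):
        # Chaining: (str @ Doc("...")) @ SomethingElse(...)
        if isinstance(other, BaseAnnotation):
            return AnnotationList(self.type_hint, self.annotations + (other,))
        return NotImplemented


class BaseAnnotation:
    """Hypothetical base class: stores args/kwargs, like BaseException stores args."""

    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs

    def __repr__(self):
        parts = [repr(a) for a in self.args]
        parts += [f"{k}={v!r}" for k, v in self.kwargs.items()]
        return f"{type(self).__name__}({', '.join(parts)})"

    def __matmul__(self, other):
        # Doc("a") @ Other("b"): two annotations and no type hint
        if isinstance(other, BaseAnnotation):
            return AnnotationList(None, (self, other))
        return NotImplemented

    def __rmatmul__(self, other):
        # str @ Doc("a"): `type` defines no __matmul__, so Python falls back to this
        return AnnotationList(other, (self,))


class Doc(BaseAnnotation):
    """Accepts exactly one argument, the documentation text."""

    def __init__(self, text):
        super().__init__(text)

    @property
    def text(self):
        return self.args[0]


def get_annotation(kind, annotation):
    """Return the first instance of `kind` found in `annotation`, else None."""
    if isinstance(annotation, AnnotationList):
        candidates = annotation.annotations
    elif isinstance(annotation, BaseAnnotation):
        candidates = (annotation,)
    else:
        candidates = ()  # plain type hint or unevaluated string: nothing to extract
    for candidate in candidates:
        if isinstance(candidate, kind):
            return candidate
    return None


def create_user(name: str @ Doc("The user's name"), age: Doc("Age in years"), email: str): ...


hints = create_user.__annotations__
print(get_annotation(Doc, hints["name"]).text)
print(hints["name"].type_hint)
print(get_annotation(Doc, hints["email"]))
```

The important detail is that str @ Doc(...) only works because the lookup falls through to Doc.__rmatmul__; any type hint object that defined its own __matmul__ would take priority over it.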

Static type checkers/linters should assume that any call syntax in an annotation is not a type hint and ignore it. AFAIK that wouldn’t cause any conflicts. The type checkers can also check that the calls are correct Annotation subclasses with correct parameters.

This proposal would in the long term mean that Annotated could be deprecated, since its purpose would be fully covered by this syntax.

For example, ctypes could use the syntax like this:

class S(ctypes.Structure):
    a : int @ CType(ctypes.c_int)
    b : CType(ctypes.c_char_p)

This would still mean that non-typing users of annotations are second class, but it would allow usage of annotations without any type hints without conflicting with type checkers. If no type is specified in an annotation, type checkers should assume Any.

(Specifically the @ syntax was also half-proposed in this comment, but I came up with it independently.)

Do note this redefines what @ is for.

That by itself doesn’t seem a big objection – type annotations reuse all kinds of operations, including x[y] and x | y.

I would say that this is of course partially true. However, the fact that @ is also used for decorators, that @ is not used by anything in the base language/stdlib, and that other languages already use @ for other things does lower the importance of this IMO. If others agree that this is a major problem, a different binary operator could also be chosen. I don’t think any of the already established meanings really fit. + or & are the closest, but neither really maps onto this “combine unrelated things” operation.

I also considered just using list or tuple syntax. However, both of those seemed a bit too confusing, since they would use , as the separator, which would potentially be confusing in function headers. They also wouldn’t result in a clearly differentiated type.

I am very honored to get a response from important core devs this early 🙂

Just a drive-by comment that it is used for the “matmul” operator: see PEP 465 (A dedicated infix operator for matrix multiplication), the operator module documentation (Standard operators as functions), and the Data model chapter of the Python 3.12.1 documentation.

Which isn’t implemented by any type in the stdlib AFAIK. But it is clearly documented for that purpose, that is true.

This is a bit OT in that it’s not about this proposal but it’s an alternative solution (which would work in other contexts as well): Python could formally define support for doc comments.

Doc comments have the advantage that they don’t break any existing syntax. They also prevent a set of confusing interactions because they can’t be modified at runtime (runtime modification can have its uses, but I think it’s fairly rare).

I think the main difference between comments and other proposals is that finding them is based on the tokenizer rather than the parser. I’m not sure if that’s a big obstacle.
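
As a rough illustration of what “based on the tokenizer” could mean in practice, here is a small sketch using the stdlib tokenize module. The #: prefix is only an assumed convention for this example (no doc comment syntax has been specified anywhere in this thread), and find_doc_comments is a made-up helper name.

```python
import io
import tokenize


def find_doc_comments(source, prefix="#:"):
    """Map line number -> text of every comment that starts with `prefix`."""
    found = {}
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # COMMENT tokens never reach the parser/AST, only the token stream.
        if tok.type == tokenize.COMMENT and tok.string.startswith(prefix):
            found[tok.start[0]] = tok.string[len(prefix):].strip()
    return found


SOURCE = """\
def create_user(
    name,  #: The user's name
    age,   # ordinary comment, ignored
):
    ...
"""

doc_comments = find_doc_comments(SOURCE)
print(doc_comments)
```

A tool would still have to associate each comment with the parameter on the same line, which is the part the parser would normally do for annotations.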

I don’t see how doc comments would solve the problem of annotations being reserved purely for typing. Did you mean just as an alternative to PEP 727?

Maybe I didn’t make it clear enough that my idea isn’t just about documentation, but also about other uses of annotations that are currently in a weird situation where standard editors will tell users that they are using the language feature incorrectly.

Ah yeah, I was thinking that documentation and typing shouldn’t really be connected, except to the extent that documentation generators should include the types when they’re present.

True, but at least | aligns closely with its non-typing purpose.

This last comment involving | made me realize that we actually have a precedence problem where a: str | int @ Doc() would be parsed as a: str | (int @ Doc()).

Technically, this could be dealt with by overloading | for the AnnotationList class to “invert” the precedence manually or by requiring users to write (str | int) @ Doc().

Neither of these options seems perfect to me (with a preference for the latter: explicit is better than implicit). Using a different binary operator isn’t a solution, since | has the lowest precedence excluding the comparison and boolean logic operators.
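
The grouping can be checked without defining any classes by looking at the AST. This is just a quick sketch using the stdlib ast module; Doc is an unresolved name here, since nothing is evaluated.

```python
import ast

# Without parentheses, @ binds tighter than |, so | ends up as the outer operation.
implicit = ast.parse("str | int @ Doc()", mode="eval").body
print(type(implicit.op).__name__, type(implicit.right.op).__name__)

# With explicit parentheses, @ is the outer operation and the union is its left operand.
explicit = ast.parse("(str | int) @ Doc()", mode="eval").body
print(type(explicit.op).__name__, type(explicit.left.op).__name__)
```

The first line prints BitOr MatMult and the second MatMult BitOr, i.e. only the parenthesized form attaches Doc() to the whole union.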

Edit: I was off-topic. I migrated this comment to the original megathread

OK, it appears I really need to create examples independent of PEP 727. I am only using Doc so much because it’s the most recently discussed. Other suggestions that have been made, like those for dataclasses and ctypes, would also benefit from this proposal.

Unless I am misunderstanding what you are saying and it applies generally and not just to Doc.

If @ has a special syntax in annotations, would it conflict with using __matmul__ to evaluate type annotations with variadic generics (PEP 646)? Expanding the example in PEP 646, the current form of array multiplication would look like:

class Array(Generic[DType, *Shape]):

    def __matmul__(self, other: Array[DType, *OtherShape]) -> Array[DType, *Shape[:-1], *OtherShape[1:]]: ...

But with a subclass of TypeVarTuple, the following makes intuitive sense:

class ShapeTuple(TypeVarTuple):

    def __matmul__(self, other: ShapeTuple) -> ShapeTuple:
        return TypeVarTuple(self[:-1], other[1:])

class Array(Generic[DType, *Shape]):
    def __matmul__(self, other: Array[DType, *OtherShape]) -> Array[DType, Shape @ OtherShape]: ...

Granted, currently “TypeVarTuples cannot be split”. But as the following section notes, “We plan to introduce these in a future PEP”.

Hi guys, there hasn’t been much movement here since December, but the OP pointed me to the syntax proposed here as an option for an idea I have proposed for a small enhancement to the dataclass machinery, here: Dataclasses - Sentinel to Stop creating “Field” instances

Seeing its potential, I have implemented a proof of concept which allows declaring the attributes of a dataclass using @ to annotate them and influence the final form of each attribute.

Github Gist: Dataclass with Field Annotations using @

This code

    @at_dataclass
    class A:
        a: int
        b: int @ KW_ONLY = 25
        c: int @ NO_INIT = 5
        d: list[str] @ NO_INIT_FACTORY = list
        e: int @ NO_INIT | Dummy() | Dummy() = 0
        f: int @ [NO_INIT, Dummy(), Dummy()] = 1
        g: int @ NO_FIELD = 7

would translate to this

    @dataclass
    class A:
        a: int
        b: int = field(kw_only=True, default=25)  # or declared after _: KW_ONLY
        c: int = field(init=False, default=5)
        d: list[str] = field(init=False, default_factory=list)
        e: Annotated[int, Dummy(), Dummy()] = field(init=False, default=0)
        f: Annotated[int, Dummy(), Dummy()] = field(init=False, default=1)
        # The following attribute which was NO_FIELD is not managed by `dataclass`
        g: int = 7

The @ syntax would seem to add clarity and readability.

| is used as one of the two options to separate annotations because it won’t throw a SyntaxError. The other option is to use a list as shown with attribute f.

The syntax allows whitespace in between @ and the declarations and also @NO_INIT with no intervening whitespace.

Personally I see the value in the proposal made by Cornelius.

Best regards