Is `Annotated` compatible with `type[T]`?

Jelle · February 5, 2024, 12:43am

Adding TypeForm (under whatever name) would definitely need a PEP.

rijenkii · February 5, 2024, 6:18pm

I have used this form quite often with pydantic for defining custom validators:

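# Imports these snippets appear to rely on (pydantic v2 and the shortuuid package);
# _PushBody, used further down, is defined elsewhere and not shown here.
from typing import Annotated, Any
from uuid import UUID

import shortuuid
from pydantic import (
    Base64Bytes,
    BaseModel,
    BeforeValidator,
    Field,
    Json,
    PlainValidator,
    TypeAdapter,
)

def _shortuuid_validator(value: Any) -> UUID: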
    value = TypeAdapter(
        Annotated[str, Field(min_length=22, max_length=22)]
    ).validate_python(value)
    return shortuuid.decode(value)


type ShortUUID = Annotated[UUID, PlainValidator(_shortuuid_validator)]

class ArtifactObject(BaseModel):
    name: Annotated[str, Field(pattern=r"^[a-zA-Z0-9_\-]+$")]
    path: Annotated[str, Field(pattern=r"^[a-zA-Z0-9_\-/\\]+$")]


type ArtifactLongStr = Annotated[
    ArtifactObject,
    BeforeValidator(
        lambda x: (lambda y: {"name": y[0], "path": y[1]})(
            TypeAdapter(
                Annotated[str, Field(pattern=r"^[a-zA-Z0-9_\-]+:[a-zA-Z0-9_\-/\\]+$")],
            )
            .validate_python(x)
            .split(":"),
        ),
    ),
]

type ArtifactShortStr = Annotated[
    ArtifactObject,
    BeforeValidator(
        lambda x: (lambda y: {"name": y, "path": y})(
            TypeAdapter(
                Annotated[str, Field(pattern=r"^[a-zA-Z0-9_\-]+$")],
            ).validate_python(x),
        ),
    ),
]

class PushPayload(BaseModel):
    body: Annotated[
        Json[_PushBody], BeforeValidator(TypeAdapter(Base64Bytes).validate_python)
    ]

These examples work correctly with pydantic, but now fail to typecheck.

Jelle · February 5, 2024, 6:30pm

As I wrote before, this simply means that pydantic has an incorrect annotation in their codebase. You should report it to them.

rijenkii · February 6, 2024, 6:21am

For anyone interested, I have created a ticket in pydantic repo: `TypeAdapter` should accept `Annotated[T, ...]` · Issue #8735 · pydantic/pydantic · GitHub.

adriangb · February 8, 2024, 3:46pm

Jelle, what would the right type annotations be?

from typing import Annotated, TypeVar

T = TypeVar("T")

def accepts_typeform(x: Annotated[T, ...]) -> T:
    raise NotImplementedError

x2 = accepts_typeform(Annotated[int, ...])
reveal_type(x2)

MyPy says object: mypy Playground

Pyright says Any:

information: Type of "x2" is "Any"

Ultimately I think what Pydantic is trying to convey here is quite clear (I’m happy to clarify if it’s not) and this is a common runtime typing use case (Pydantic is not the only library that does this, as already pointed out above).

We’re happy to use whatever solution the typing community suggests but IMO saying that the usage is incorrect and converting previously working use cases to errors by updating libraries and previously unclear docs is a breaking change that should not be made without a proposed solution.

mikeshardmind · February 8, 2024, 4:14pm

Was covered earlier in the thread

Annotated wasn’t specified to be used as a value expression that’s equivalent to itself in an annotation context (nor does doing this make sense when you consider the difference between annotations and values that are compliant with those annotations)

Pydantic (and other libraries using it as a value this way) were relying on something that was never specified, and only worked because it wasn’t being rejected rigorously. This isn’t a change in what was specified, but clearing up what was already specified at the same time that type checkers are working on being more specification-compliant. I don’t think it’s reasonable for every use that isn’t specified to be a reason to stop improving the specification and type checker behavior.

The most cynical view here, taking at face value what you’ve said about what should be considered breaking, would mean that any large enough library that relied on unspecified behavior could dictate that the unspecified behavior must become the behavior, even to the detriment of other considerations, bypassing the process for adopting intended changes to the type system.

MegaIng · February 8, 2024, 4:30pm

It was covered in the sense of “there currently is none, here is a potential suggestion we could add”, unless I missed a comment. Without TypeForm, there is currently no way of specifying that a function should accept Annotated as a runtime value without falling back to Any/object.

Jelle · February 8, 2024, 4:32pm

I pretty much agree with @mikeshardmind here, but a few additions.

The right annotation is Any, because the type system currently does not have a construct to support what you want. For that to change, somebody needs to write a PEP about the TypeForm proposal. (Or something else: TypeForm is a solution that @davidfstr suggested on the mypy issue tracker previously, but if someone writes a PEP to cover this use case, they’re obviously not bound to follow this early proposal exactly.) TypeForm[T] would be an annotation that can match anything that is itself valid as an annotation (e.g., Annotated[int, "whatever"], str | int, list[int], etc.).

I had to read through the documentation for a while to see this, but it seems to me that TypeForm matches what you want here. Your documentation already has a note saying that mypy rejects TypeAdapter(Union[str, int]) (correctly: Union is also not compatible with type). It would be different if what you accepted was really just class objects or class objects wrapped in Annotated, but that doesn’t seem to be the case.
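A minimal standard-library sketch of the runtime side of this point: plain classes are instances of type, while forms such as Annotated[...] and Union[...] are ordinary objects that are not. The variable names are illustrative only, and nothing here reflects how any particular type checker is implemented.

```python
from typing import Annotated, Union

# Special forms evaluate to ordinary runtime objects, not classes.
annotated_form = Annotated[int, "whatever"]
union_form = Union[str, int]

# Record whether each object is an instance of `type`.
is_class = {
    "int": isinstance(int, type),
    "Annotated[int, ...]": isinstance(annotated_form, type),
    "Union[str, int]": isinstance(union_form, type),
}
print(is_class)
```

Only the plain class passes the isinstance check, which is the runtime counterpart of saying that these forms are not compatible with type[T].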

adriangb · February 8, 2024, 4:36pm

Pydantic and the ecosystem we’ve created around Annotated are by far the largest users of the feature (as far as I know). And the spirit of Annotated was being able to attach runtime information to types without changing how the type checker interprets them. Pydantic and co are very much using Annotated in that spirit; I feel that saying otherwise is focusing on technical details (from specs and docs that are, IMO, somewhat unclear even to advanced Python users). So if there is a technical reason why what Pydantic does is wrong, I propose that type checkers hold off on tightening up the strictness until TypeForm or some other long term solution lands. Making Annotated unusable for most of its users seems like a mistake even if it is technically correct.

There’s also the case of unions. I understand those are also not technically a type, but the whole special form thing again seems like it’s hurting useful real world use cases that thousands of users have for the sake of being technically correct.

mikeshardmind · February 8, 2024, 4:52pm

But pydantic is attaching it to values in all of the examples that don’t work, not to types.

There’s nothing here that’s forbidding this much more common form, taken from an above example:

from typing import Annotated

from pydantic import BaseModel, Field

class Model(BaseModel):
    field: Annotated[int, Field(ge=0)]

The above is still entirely specification compliant; this other example from above, on the other hand, is not:

from typing import Annotated

from pydantic import Field, TypeAdapter 

ta = TypeAdapter(Annotated[int, Field(ge=0)])

As a value, it should be fine to have a definition of TypeAdapter that takes the type and the extra data as separate arguments, but pydantic tried to do something that wasn’t specified rather than help write the specification that would allow it. I’m inclined to believe that this was not an intentional move, but a result of misunderstanding.

The below shows a use that would require a change for pydantic, but that is currently possible within specification and type-safe.

ta = TypeAdapter(int, constraints=[Field(ge=0)])

If you don’t want to change your API, I believe you can inform your users of the situation, recommend pinning mypy/pyright/etc to a version before the stronger enforcement, and work on helping propose TypeForm so that your users can later unpin. I don’t think there’s a strong desire to have a runtime breaking change here, and I don’t think it’s necessary.
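As a rough illustration of the "type and extra data as separate arguments" idea, here is a standard-library sketch. SplitAdapter, its constraints parameter, and as_annotated are hypothetical names made up for this example; they are not part of pydantic, and a plain string stands in for something like Field(ge=0).

```python
from typing import Annotated, Any, Generic, Iterable, TypeVar, get_args

T = TypeVar("T")


class SplitAdapter(Generic[T]):
    """Hypothetical adapter; the name and signature are illustrative only."""

    def __init__(self, tp: type[T], constraints: Iterable[Any] = ()) -> None:
        # The class and its metadata arrive separately, so `tp` really is a class.
        self.tp = tp
        self.constraints = tuple(constraints)

    def as_annotated(self) -> Any:
        # Rebuild the Annotated form internally, where it is just a runtime value.
        if not self.constraints:
            return self.tp
        return Annotated[(self.tp, *self.constraints)]


# A plain string stands in for real constraint objects here.
adapter = SplitAdapter(int, constraints=["ge=0"])
rebuilt = adapter.as_annotated()
print(get_args(rebuilt))
```

The caller never passes an Annotated object where type[T] is expected, yet the library can still reconstruct the same form for its own runtime use.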

mdrissi · February 8, 2024, 5:13pm

I think making the typing spec more precise when usable alternatives exist is fine. When the alternative is a future PEP, then it effectively is a breaking change (of admittedly underspecified behavior), restricting code that worked for a long time previously with a type checker. I have similarly had code that used some type forms at runtime, like unions, and that pyright supported for over a year. There were occasionally even bug reports on this topic in the past, before the typing spec existed (Pyright does not evaluate type correctly when calling a generic class constructor with a union of classes · Issue #5022 · microsoft/pyright · GitHub), where pyright stayed lax mainly to allow this use case.

TypeForm, in the optimistic case, is at least a couple of months out, and regardless, a decision based on an unknown future state isn’t that helpful to current usage. I would strongly prefer for this to be intentionally underspecified today; when TypeForm or a similar construct exists, we can update the spec to restrict type.

adriangb · February 8, 2024, 7:11pm

I’m happy to write a PEP for TypeForm. Luckily if a PEP gets approved it could go in typing_extensions and thus not require waiting for a new Python version.

In the meantime, can we have a bit of a compromise and not tighten up the type checkers and specs until there’s a long term solution available? I’m not asking for changes in behavior, just to keep it as is a while longer, maybe under a feature flag. I understand that our usage may not be technically correct but I think that was never clear and this usage is now extremely widespread. It’d be a mistake to on average hurt the user experience for Python users for the sake of being technically correct, that’s not the Python ethos in my view.

Tinche · February 8, 2024, 7:20pm

There’s already some effort underway at TypeForm[T]: Spelling for regular types (int, str) & special forms (Union[int, str], Literal['foo'], etc) · Issue #9773 · python/mypy · GitHub, maybe you can coordinate? And yeah, please, let’s get this done.

Typing folks in this thread have made the point that “an object compatible with type[T] should be an instance of type”. I’m curious: why is this important in practice, why is knowing that something is an instance of type useful?

mikeshardmind · February 8, 2024, 7:20pm

Is there a reason users can’t pin to an older version of mypy/pyright prior to this being stricter until TypeForm rather than holding up other progress in clearing up the specification? Nothing about this requires a runtime change that would break users here.

Daverball · February 8, 2024, 7:23pm

I don’t understand why in the meantime you can’t just go the other way, change your signature, and accept a few false negatives. Your users will no longer see any false positives, and once TypeForm is a thing you can tighten things back up again, i.e. change the second overload on TypeAdapter to:

    # This second overload is for unsupported special forms (such as Union). `pyright` handles them fine, but `mypy` does not match
    # them against `type: type[T]`, so an explicit overload with `type: T` is needed.
    @overload
    def __init__(  # pyright: ignore[reportOverlappingOverload]
        self,
        type: Any,
        *,
        config: ConfigDict | None = ...,
        _parent_depth: int = ...,
        module: str | None = ...,
    ) -> None:
        ...

You could even go one step further and add an annotation for self with TypeAdapter[Any], so you won’t get type checkers complaining about unbound generics.
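A self-contained sketch of that overload shape, using only the standard library. LooseAdapter is a hypothetical stand-in, not pydantic's TypeAdapter, and how each type checker resolves these overloads (or whether it flags them as overlapping) is not verified here; the comments only state the intended result.

```python
from typing import Annotated, Any, Generic, TypeVar, overload

T = TypeVar("T")


class LooseAdapter(Generic[T]):
    """Hypothetical stand-in for a TypeAdapter-like class."""

    # Precise overload: a real class binds T.
    @overload
    def __init__(self, tp: type[T]) -> None: ...

    # Fallback overload: anything else is accepted and T is pinned to Any.
    @overload
    def __init__(self: "LooseAdapter[Any]", tp: Any) -> None: ...

    def __init__(self, tp: Any) -> None:
        self.tp = tp


precise = LooseAdapter(int)                   # intended: LooseAdapter[int]
loose = LooseAdapter(Annotated[int, "meta"])  # intended: LooseAdapter[Any]
```

The trade-off is the one described above: special forms stop producing false positives, at the cost of losing precise inference for them until something like TypeForm exists.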

mikeshardmind · February 8, 2024, 7:34pm

The answer is rooted in the difference between an annotation constraining the expectations of a program for what can be provided as a value and the concrete type of a value at runtime. Because we have determined that in an annotation, given a type (i.e. x: int, the annotation being int), that the value must be an instance of the type, the form type[int] allows specifying when you actually want the type. This is useful in everything from composition of generics, validation tools, abstract math libraries, to code that needs to handle custom wire protocols. Knowing the type of something allows for strong metaprogramming, but there is a difference between a runtime type and concepts which exist in the type system to specify things like Unions, where you instead have a set of possible types.

In the set-theoretic model (which hasn’t been formally adopted, but which the original PEPs specifying the type system are very closely related to), you really have these special forms of sets of types, that describe an expectation that can be fulfilled by instances of some number of types, which may be wider than just the first set of types via subtyping relations where those are allowed.

If I might presume something here for a moment, it seems like what you’re getting at is a question like “why can’t type be used to interact with typing special forms”, and this has a much simpler answer that isn’t as deeply rooted in theory. x: type[T] implies that x can be constructed as a type to create T, can be introspected as if it was the class T, and so on. This isn’t true of special forms in typing, and you need to access the inner members and introspect them, then compose that information as appropriate to the form. But as has been shown here, being able to express that a function takes one of these special forms is still an entirely valid case, we just need a better way to express it.
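To illustrate "access the inner members and introspect them", here is a small sketch built on typing.get_origin and typing.get_args. The unwrap helper is a made-up name for this example and handles only plain classes, Annotated, and Union written via typing.Union; anything else raises.

```python
from typing import Annotated, Any, Union, get_args, get_origin


def unwrap(form: Any) -> tuple:
    """Return (classes, metadata) for a plain class, Annotated, or Union form."""
    if isinstance(form, type):
        # A real class can be used directly, as type[T] promises.
        return (form,), ()
    origin = get_origin(form)
    if origin is Annotated:
        # First argument is the wrapped form, the rest is metadata.
        inner, *metadata = get_args(form)
        classes, inner_meta = unwrap(inner)
        return classes, inner_meta + tuple(metadata)
    if origin is Union:
        # A union describes a set of possible types; visit each member.
        classes, metadata = (), ()
        for member in get_args(form):
            member_classes, member_meta = unwrap(member)
            classes += member_classes
            metadata += member_meta
        return classes, metadata
    raise TypeError(f"unsupported form: {form!r}")


print(unwrap(Annotated[int, "m"]))
print(unwrap(Union[int, str]))
```

A function like this has to accept objects that are not classes, which is exactly the parameter type that currently can only be spelled as Any or object.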

mdrissi · February 8, 2024, 7:36pm

Because these libraries often have runtime type checking as a core usage. My own usage of this pattern is for a function similar to trycast that also does some config deserialization. It is intended as a type safe json.loads.

Using Any works with the type checker but effectively hurts the core intent of working well with type checking.
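A minimal sketch of what a "type safe json.loads" can look like when the target is a plain class. The loads_as function is a hypothetical name for illustration; it is not trycast's API and only checks the top-level value.

```python
import json
from typing import TypeVar

T = TypeVar("T")


def loads_as(tp: type[T], raw: str) -> T:
    """Parse JSON and check the top-level value against a plain class."""
    value = json.loads(raw)
    if not isinstance(value, tp):
        raise TypeError(f"expected {tp.__name__}, got {type(value).__name__}")
    return value


port = loads_as(int, "8080")  # a checker can infer int from the argument
print(port)
```

With tp: type[T] the return type follows from the argument. Once the parameter has to be Any so that forms like Union or Annotated are accepted, that link between argument and result is lost, which is the cost being described.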

On the TypeForm PEP discussion: I reached out to @davidfstr earlier this week to continue that work. One other person recently asked to pick up that work, so I am currently checking with them first on their plans. It may be better to continue this line of actual work on the PEP in a separate topic/channel.

edit: I’ll add that the timing here seems unnecessarily strict, given that this ambiguity was relied on for years. If defining the spec is the goal, then give a 6 month timeline for the TypeForm PEP, as right now it just is not ready, and adding pressure here will not make a PEP land in a week.

mdrissi · February 8, 2024, 7:39pm

This behaves rather poorly with pyright in particular for VS Code users, as it requires pinning the VS Code extension as a whole, which normally auto-upgrades today. I occasionally get questions about this confusion, where the user’s IDE and CI disagree because pyright is awkward to pin. And it still leaves a large number of users who are unaware of the difference: their pydantic or similar library code was previously accepted in VS Code and now starts showing errors, when it is unlikely the average user will know much about the difference between type and TypeForm.

mikeshardmind · February 8, 2024, 8:11pm

Well, there is the option of specifying non type-constraints separately from types when not in an annotation context in the interim as well.

I don’t think that’s sustainable, let alone realistically possible at a specification level. Maybe various type checkers can decide for themselves to allow this in the short term with some sort of warning (rather than immediately being an error to match specification), but what you’re asking seems to me like intentionally specifying that you can temporarily do something which was never allowed by the specification just because it was unclear to some and not being enforced by other tools, and that will massively complicate other ongoing work wherever it can interact with Annotated (which would be anywhere with such a specification).

adriangb · February 9, 2024, 4:11pm

I think I’m having trouble seeing the other side of the argument because I don’t know what the goal of this spec tightening is from a user’s perspective. Could someone give me a simple example of something your average Python user would do that is greatly improved by this change?