Specify that `assert_type()` checks for equivalence

We added assert_type() before we created the “Type system concepts” page that puts the type system on a firmer foundation. As a result, the spec for assert_type() is a bit vague; it says that type checkers “should emit an error if the value is not of the specified type”.

Now that the spec provides these concepts, it makes sense to specify that assert_type() should check for equivalent types. Two static types are equivalent if they contain the same set of values; two gradual types are equivalent if they materialize to the same set of static types.

This was previously discussed in the thread “Is `assert_type` checking for exact match, or a subtype relationship?” (post #16 by erictraut), where @erictraut already suggested using equivalence.

Note though that existing type checkers don’t necessarily use equivalence. Two obvious parts of the type system where it’s easy to express types that are equivalent but spelled differently are structural types (TypedDicts and Protocols) and unions (if a union contains both a type T and a subtype U of T, it is equivalent to T). Here is a test case:

from typing import assert_type, Protocol, TypedDict, Literal

class TD1(TypedDict):
    a: int

class TD2(TypedDict):
    a: int

class P1(Protocol):
    a: int

class P2(Protocol):
    a: int

def func(td1: TD1, p1: P1, i: int, int_str: int | str) -> None:
    assert_type(td1, TD2)
    assert_type(p1, P2)
    assert_type(i, int | Literal[1])
    assert_type(int_str, int | bool | str)

Only mypy currently passes this test case. Pyright rejects all assert_type() calls; pyrefly passes only the int | Literal[1] test; pycroscope accepts only the TypedDicts; ty rejects the TypedDict (probably because it hasn’t implemented TypedDict support yet).

Proposed spec change: “Specify that assert_type() uses equivalence” (python/typing PR #2011 by JelleZijlstra).

3 Likes

I agree that assert_type should use type equivalence, so I support the proposed spec change. However, it’s important to note that while type equivalence is an easy concept to define, it can be very difficult to compute programmatically. I think it’s an NP-complete problem in the general case.

I’m not sure what that means for conformance testing. We can probably always find more examples of types that are provably equivalent but fail to be computed as equivalent by one or more type checkers. For example, here are some cases where mypy fails to detect equivalence (as does pyright). I was able to generate these in just a few minutes off the top of my head. I could likely generate dozens more examples if I took more time.

from typing import Callable, assert_type

def func2(
    c1: Callable[[float], int] | Callable[[int], float],
    l1: list[float],
    t1: tuple[int | str],
    t2: tuple[int, ...],
    t3: tuple[*tuple[int, ...], int],
) -> None:
    assert_type(c1, Callable[[int], float])
    assert_type(l1, list[float | int])
    assert_type(t1, tuple[int] | tuple[str])
    assert_type(t2, tuple[()] | tuple[int, *tuple[int]])
    assert_type(t3, tuple[int, *tuple[int]])

It feels somewhat arbitrary to choose a subset of cases and include them in the conformance tests. Is there a way we could be more principled about what to include? Perhaps we could define a subset of equivalence classes or patterns that we want to mandate for all type checkers?
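
For instance, a mandated subset might start with equivalences that are just different spellings of the same union, which today’s type checkers generally already handle (a hypothetical illustration, not something proposed in the PR):

from typing import Optional, Union, assert_type

def simple_equivalences(a: int | str, b: Optional[int]) -> None:
    assert_type(a, str | int)        # union member order doesn't matter
    assert_type(a, Union[int, str])  # Union[...] vs. | syntax
    assert_type(b, int | None)       # Optional[int] vs. int | None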

3 Likes

I agree with Eric - explicitly specifying equivalence is good, but I’m unsure about corner-case tests. Most of the time the X and Y types passed to assert_type match exactly, and it’s used as a way to introspect a type and confirm that it didn’t change somewhere before it reached the assert_type call, without any complications in the Y value.
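
For illustration, the typical usage I have in mind looks something like this (parse_port is a made-up example):

from typing import assert_type

def parse_port(value: str) -> int:
    return int(value)

port = parse_port("8080")
assert_type(port, int)  # the asserted type is spelled exactly as the inferred type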

Does anyone know whether there have been any reports to type checkers from users hitting these kinds of corner cases?

I think that the definition we are looking for here is “gradually equivalent” rather than simply “equivalent”, since I assume we want assert_type to also work for types that are not fully static (contain Any as a component).
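
A minimal sketch of what that would mean in practice (f is just a hypothetical function):

from typing import Any, assert_type

def f(x: list[Any]) -> None:
    assert_type(x, list[Any])  # OK: the two gradual types are equivalent
    assert_type(x, list[int])  # error: list[int] is one materialization of list[Any],
                               # but the two types are not equivalent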

That’s true; equivalence is difficult to decide in all cases. It would be nice to have test cases to show that type checkers use equivalence and not some full notion of consistency, but I’d be OK leaving those cases out of the test suite.

As a type checker author I would consider any failure on these to be a bug, though, if possibly a low-priority one.

(Note that your test case has a bug; the t3 case is incorrect. I think you meant assert_type(t3, tuple[int, *tuple[int, ...]]).)
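
To spell out the corrected case (a sketch; as noted earlier in the thread, current checkers may still fail to recognize this equivalence):

from typing import assert_type

def g(t: tuple[*tuple[int, ...], int]) -> None:
    # Both spellings describe "a tuple of one or more ints": at least one int,
    # plus any number of additional ints.
    assert_type(t, tuple[int, *tuple[int, ...]])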

I noticed ty uses the term “gradual equivalent” internally, but it’s not in the spec. We defined “equivalent” for gradual types too (Glossary — typing documentation):

Two fully static types A and B are equivalent if A is a subtype of B and B is a subtype of A. This implies that A and B represent the same set of possible runtime objects. Two gradual types A and B are equivalent if all materializations of A are also materializations of B, and all materializations of B are also materializations of A.

I see, thanks. I’d forgotten that was the phrasing in the spec. That’s too bad, I consider that a (minor) deficiency in the spec, that we don’t clearly separate these relations with a different term.

I think all of these examples look right to me (with Jelle’s correction to t3), except the first one.

I don’t think the type Callable[[float], int] | Callable[[int], float] is equivalent to the type Callable[[int], float]. The former type includes some callables which can accept float arguments, which are excluded by the latter type.

(And I definitely agree that a complete understanding of type equivalence is not feasible.)

I believe it’s correct, though it depends on how the float-int promotion is implemented. If we interpret float in a type expression to actually mean the static type float | int (as I think we should), then a Callable[[float], int] is really a Callable[[float | int], int]. That’s a subtype of Callable[[int], float], because its parameter type is a supertype of int (parameters are contravariant) and its return type is a subtype of float. Since one member of the union is a subtype of the other, the union collapses to Callable[[int], float].
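
A small sketch of that subtyping step, relying on parameter contravariance and return-type covariance (impl is a made-up function):

from typing import Callable

def impl(x: float) -> int:
    return round(x)

# impl accepts every int (its parameter type is wider) and returns an int
# (a subtype of float), so it can be used where a Callable[[int], float] is expected:
f: Callable[[int], float] = impl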

1 Like

Ah, you’re right of course. Contravariance is confusing :slight_smile:

What’s especially funny is that I got this one wrong, but ty gets it right.

2 Likes

What is the use case for assert_type? I’ve typically seen it used within typechecker test suites to assert on expected types, where matching on the exact nominal type is more useful behaviour than trying to compute type equivalence. Similarly, if end-users are adding it to their code it’s usually to verify that the typechecker’s inference matches their expectation of what something’s type is; again, they are more likely to have an exact type in mind than a mental model of type equivalence.

Use cases are primarily in testing type checker behavior, e.g. typeshed’s tests that make sure that stubs are interpreted as intended; the conformance test suite; type checkers’ own tests.
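
For example, a typeshed-style test case might be as simple as this (illustrative only):

import os
from typing import assert_type

assert_type(os.getpid(), int)  # checks that the stub declares the intended return type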

I don’t know what exactly it means to check for the “exact nominal type”. Type checkers may have various internal representations for types; the notion of “equivalence” in the spec is more precisely defined and doesn’t depend on exactly how types are represented internally.

Ah, I was thinking mostly of the typecheckers’ own tests, where the internal representation was what was being tested. For testing typechecker behaviour from the outside I agree equivalence is a useful concept.

It might even be worse than that: the problem has only two possible answers, “equivalent” and “not equivalent”, and verifying an answer seems to be as difficult as coming up with one in the first place. So it probably falls into one of the even scarier classes, like co-NP or EXPTIME, that even a quantum computer won’t be able to help with :sweat_smile:.

But considering the use-cases of assert_type (i.e. mostly type-testing, as others have also noted), the input sizes are likely to be very manageable, generally speaking. So the theoretical time-complexity is probably not something to worry about too much :person_shrugging: .