Cast syntax for static typing

Why not declare cache as:

cache: dict[tuple[Type[T], T], A[T]]

?

That gives

$ mypy t.py
t.py: note: In class "A":
t.py:18:50: error: Missing type parameters for generic type "A"  [type-arg]
        cache: dict[tuple[Type[Hashable], Hashable], A] = {}
                                                     ^
Found 1 error in 1 file (checked 1 source file)

In this instance mypy does accept that. I suspect that is actually a bug in mypy, though, because it makes no sense for a class attribute to have an unbound typevar like that. The typevar T parametrises instances, but cache is a class attribute and hence shared by all instances regardless of what T they were created with. With that change I can then change the return line to

            return cache[(int, 1)]

and mypy will pass that without error even though it fails with TypeError at runtime. I think maybe mypy just isn’t clever enough to spot that these are bogus hints, so it lets them pass. It seems to me that this exploits a bug to get the same effect as # type: ignore.

Looking back at my actual code, cache is a module-level global rather than a class attribute, so the typevar T isn’t available where it is defined, but I could maybe move cache to a class attribute as in this demonstration code. The error that mypy reports if cache is global also shows why it shouldn’t be accepted for a class attribute, though:

$ mypy t.py
t.py:7:24: error: Type variable "t.T" is unbound  [valid-type]
    cache: dict[tuple[Type[T], T], A[T]] = {}
                           ^
t.py:7:24: note: (Hint: Use "Generic[T]" or "Protocol[T]" base class to bind "T" inside a class)
t.py:7:24: note: (Hint: Use "T" in function signature to bind "T" inside a function)
Found 1 error in 1 file (checked 1 source file)

Putting cache inside a class apparently “binds” T but there isn’t really any distinction between a class attribute and a global from a typing perspective so I think this is a deficiency of the checker.

Measurable doesn’t necessarily mean significant, so I’d like to see the code and the figures. And then there would be three questions:

  1. If the true cost is constructing the type object, can that be broken out into a variable above the loop?
  2. How bad is it to use # type: ignore? In your example here, I would expect basically no consequences at all, since it’s a return statement and you have type-annotated the function itself, so any caller will still have type information.
  3. Exactly how costly is the type cast, compared to (say) changing “try: return cache[key]” to “if key in cache: return cache[key]”? Just to try to get an idea of what scale we’re even talking here.
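To get a rough sense of scale for questions 1 and 3, something like the following timeit sketch could be used. The cache contents, the function names and the List[int] cast target are all made up for illustration and are not the code under discussion; the numbers will vary by machine and Python version, so treat this as a way to measure rather than as a result.

```python
import timeit
from typing import Dict, List, Tuple, cast

# Hypothetical stand-in for the real cache: a single entry that always hits.
cache: Dict[Tuple[type, int], List[int]] = {(int, 1): [1, 2]}
key = (int, 1)

# Question 1: the type object built once, outside the hot path.
IntList = List[int]


def lookup_try():
    try:
        return cache[key]
    except KeyError:
        return None


def lookup_membership():
    if key in cache:
        return cache[key]
    return None


def lookup_cast_inline_type():
    try:
        # The subscription List[int] is evaluated on every call.
        return cast(List[int], cache[key])
    except KeyError:
        return None


def lookup_cast_hoisted_type():
    try:
        return cast(IntList, cache[key])
    except KeyError:
        return None


CALLS = 100_000
timings = {
    fn.__name__: min(timeit.repeat(fn, number=CALLS, repeat=5))
    for fn in (
        lookup_try,
        lookup_membership,
        lookup_cast_inline_type,
        lookup_cast_hoisted_type,
    )
}
for name, seconds in timings.items():
    print(f"{name}: {seconds / CALLS * 1e9:.1f} ns per call")
```

Taking the minimum of several repeats is the usual way to reduce noise from other processes. This only measures the cache-hit path; a miss would need a separate measurement.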

You’re asking for a feature that is ONLY of value in the extremely narrow intersection of a number of requirements. To have even a single compelling use-case, you need to show that those requirements really do intersect; to have a compelling argument overall, you need to show that they intersect frequently. I remain unconvinced.

Yes, with --disallow-any-generics. I had initially been testing this with ClassVar[dict[...]] (which doesn’t allow type variables), as the cache is not an instance attribute.

But the point is, only T (and Type[T]) need to be hashable, and you aren’t mapping key to an arbitrary hashable value, but to an instance of A.

Perhaps a cleaner solution is to just preprocess out calls to cast? I imagine a third-party tool could parse the AST and implement something like that easily enough.
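A minimal sketch of what such a preprocessing step might look like, using only the standard library ast module. StripCast and strip_casts are hypothetical names, not an existing tool. The matching is purely by name (cast(...) or typing.cast(...)), so it assumes cast has not been aliased or shadowed, and ast.unparse needs Python 3.9 or later.

```python
import ast


class StripCast(ast.NodeTransformer):
    """Replace cast(T, expr) and typing.cast(T, expr) with plain expr."""

    def visit_Call(self, node):
        # Visit children first so nested casts are stripped as well.
        self.generic_visit(node)
        func = node.func
        is_cast = (isinstance(func, ast.Name) and func.id == "cast") or (
            isinstance(func, ast.Attribute)
            and func.attr == "cast"
            and isinstance(func.value, ast.Name)
            and func.value.id == "typing"
        )
        # Only rewrite the plain two-positional-argument form.
        if is_cast and len(node.args) == 2 and not node.keywords:
            return node.args[1]
        return node


def strip_casts(source: str) -> str:
    tree = StripCast().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


example = "x = cast(int, cache[key]) + typing.cast(int, cast(object, y))"
print(strip_casts(example))
```

Note that ast.unparse discards comments and original formatting, so a real tool would more likely compile the transformed tree directly (for example from an import hook) than write source back out.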

(It’s a little strange to me that people want “optimal runtime behaviour” and types that can operate at compile time and… well, Python.)

You are right. I showed the hint as

cache: dict[tuple[Type[Hashable], Hashable], Hashable] = {}

I should have typed it as

cache: dict[tuple[Type[Hashable], Hashable], A[Hashable]] = {}

It is not incorrect to say that the values are Hashable (because A is hashable), but A[Hashable] is the most precise type that we can use. There is no correct hint, though, that would let a typechecker understand that when doing A(1) the dict lookup will return an A[int] rather than an A[str].

@storchaka suggested

cache: dict[tuple[Type[T], T], A[T]] = {}

This looks sort of correct but it does not really mean what is needed here. This says that there is a single type T such that all values of the dict are of type A[T]. It does not convey that when the key is of type tuple[Type[str], str] the corresponding value will be of type A[str], which is what is needed for the code in the __new__ method to be correct with respect to its own annotations.

I don’t know what mypy thinks that type annotation means but it causes it to accept invalid code:

from __future__ import annotations

from typing import TypeVar, Generic, Type, Hashable, cast, Any

T = TypeVar('T', bound=Hashable)

class A(Generic[T]):

    __slots__ = ("value",)

    value: T

    cache: dict[tuple[Type[T], T], A[T]] = {}

    def __new__(cls, value: T) -> A[T]:
        key = (type(value), value)
        cache = cls.cache
        try:
            # Here we return A[int] rather than A[T]
            return cache[(int, 1)]
        except KeyError:
            obj = super().__new__(cls)
            obj.value = value
            return cache.setdefault(key, obj)


# mypy allows this thinking that astr is A[str]
aint = A(1)
astr = A("a")
print(aint.value + 2)
print(astr.value + "b") # fails at runtime

That gives:

$ mypy t.py
Success: no issues found in 1 source file
$ python t.py
3
Traceback (most recent call last):
  File "t.py", line 31, in <module>
    print(astr.value + "b") # fails at runtime
TypeError: unsupported operand type(s) for +: 'int' and 'str'

Thank you, I see the problem now.

I’ll note that inside __new__, reveal_type shows cache as having type dict[tuple[type[Any], Any], A[Any]]. I don’t know if that’s a consequence of it technically being hinted as an instance attribute? (If you try to access A.cache outside the class statement, for example, you get an “Access to generic instance variables via class is ambiguous” error.)

Good point. In my original code cache is a global variable but ClassVar should be used here if it is a class variable. With ClassVar the annotation is correctly rejected:

$ mypy t.py
t.py: note: In class "A":
t.py:13:5: error: ClassVar cannot contain type variables  [misc]
        cache: ClassVar[dict[tuple[Type[T], T], A[T]]] = {}
        ^
Found 1 error in 1 file (checked 1 source file)

That error matches my expectation that T should be considered unbound in the class body.

So we’re back to the fact that no fully correct type hint exists for cache here. We can use

cache: ClassVar[dict[tuple[Type[Hashable], Hashable], A[Hashable]]] = {}

but then all uses of cache are rejected:

$ mypy t.py
t.py: note: In member "__new__" of class "A":
t.py:20:20: error: Incompatible return value type (got "A[Hashable]", expected "A[T]") 
[return-value]
                return cache[key]
                       ^~~~~~~~~~
t.py:24:20: error: Incompatible return value type (got "A[Hashable]", expected "A[T]") 
[return-value]
                return cache.setdefault(key, obj)
                       ^~~~~~~~~~~~~~~~~~~~~~~~~~
t.py:24:42: error: Argument 2 to "setdefault" of "MutableMapping" has incompatible type "A[T]";
expected "A[Hashable]"  [arg-type]
                return cache.setdefault(key, obj)
                                             ^~~
Found 3 errors in 1 file (checked 1 source file)

Either cast or type: ignore is needed to satisfy the checker that __new__ respects its stated types for parameters and return value. That’s not a surprise because doing something similar in C would require the cache to store a union, pointer cast, void pointer or something along those lines as well.
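For concreteness, here is a sketch of how the cast variant might look with the ClassVar[... A[Hashable]] hint from above. It is one possible arrangement rather than the original code: annotations are quoted instead of using the __future__ import, and the miss path stores with plain assignment rather than setdefault so that only one cast is needed there. The runtime behaviour is easy to check, but I have not run this exact file through mypy, so take the typing side as an assumption.

```python
from typing import ClassVar, Generic, Hashable, TypeVar, cast

T = TypeVar("T", bound=Hashable)


class A(Generic[T]):

    __slots__ = ("value",)

    value: T

    # T is erased here: no hint can say "the value for a key of type
    # tuple[Type[X], X] is an A[X]".
    cache: "ClassVar[dict[tuple[type[Hashable], Hashable], A[Hashable]]]" = {}

    def __new__(cls, value: T) -> "A[T]":
        key = (type(value), value)
        try:
            # Correct only because of how keys are built; the checker
            # cannot verify this, hence the cast.
            return cast("A[T]", cls.cache[key])
        except KeyError:
            obj = super().__new__(cls)
            obj.value = value
            cls.cache[key] = cast("A[Hashable]", obj)
            return obj


aint = A(1)
astr = A("a")
print(aint.value + 2)
print(astr.value + "b")
print(A(1) is aint)
```

The casts are confined to __new__, so callers still see A[int] or A[str] from the signature. Replacing each cast with a # type: ignore comment on the same line would be the other option mentioned above.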