Whether private members should affect PEP 695's type-var variance

Yes, you can construct examples that are safe for the consumer of the API, but I fail to see how that is relevant.

How is the type checker supposed to know whether you want type safety on the internal dict in the implementation to be enforced, unless you explicitly tell it? If you don’t want type safety, then use Any. Making private members implicitly unsafe is to the detriment of everyone who wants type safety in their implementation.

I want internal type safety. When I write value = self.__cache[key], I need to be sure that key has been typed as K and that value will be typed as V. Typing __cache as dict[Any, Any] would ruin this type safety. But when __cache is not exposed outside the class, it should not affect the variance of K or V on the class.
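
For illustration, here is a minimal sketch of what gets lost. LazyMapAny is a hypothetical variant of the LazyMap implementation shown further down, with the cache typed as dict[Any, Any]; an implementation bug then slips through silently because the value slot is Any.

from collections.abc import Callable
from typing import Any


class LazyMapAny[K, V]:
    def __init__(self, getter: Callable[[K], V], /) -> None:
        self.__getter = getter
        self.__cache: dict[Any, Any] = {}

    def __getitem__(self, key: K, /) -> V:
        try:
            return self.__cache[key]  # returns Any, silently treated as V
        except KeyError:
            value = self.__getter(key)
            self.__cache[key] = key  # bug: caches the key instead of the value, yet no error is reported
            return value

With a properly typed dict[K, V] cache, that last assignment would be flagged, because K is not assignable to V.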

I’m just skimming this discussion for interest, and in isolation this example seems fine to me. But you don’t give an example of what goes wrong with it. The original issue was around variance, which I’ll freely admit is something I find incredibly confusing. But I can’t see how to construct something that’s “wrong” for this example, just knowing that “it’s about variance”.

Can you show an example of something that goes “wrong” with this LazyMap example?

You said initially that “both arguments of dict are invariant”. I don’t honestly understand why this is (I could probably work it out if I worked through some examples, so don’t worry about explaining) but I don’t see why I’d expect LazyMap to be any different, if that’s the case.

You can achieve this by declaring K and V manually with the variance you want. Auto variance should be safe by default; if you want to do unsafe things, then you have to be explicit, and I don’t think that is too much to ask. PEP 695 was supposed to be a simplification for the most common cases; it was never intended to entirely replace manually specified TypeVar variance, since that is not possible.

That being said, I think it would be nice to extend TypeVars to allow combining auto variance with either covariance or contravariance, so you would get an error if the auto variance disagrees with your declared variance, which you could then explicitly ignore in order to force your declared variance.

Then we could extend PEP 695 to set the covariant flag in addition to the auto variance flag for any type vars with the suffix _co, and the contravariant flag in addition to auto variance for the suffix _contra. This would strike a good balance between taking advantage of PEP 695 syntax and preserving type safety. The behavior of type vars without either of those suffixes would remain the same, although this would leave a gap in the specification for when you want to force invariance.
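
A rough sketch of what that could look like (purely hypothetical; neither form carries this meaning today):

# Old syntax: combining an explicit variance flag with auto variance is currently
# not allowed, so the proposed form is shown commented out:
#     V_co = TypeVar("V_co", covariant=True, infer_variance=True)

# PEP 695 syntax: the `_co`/`_contra` suffixes are just names today; under the
# proposal they would additionally assert that the inferred variance matches.
class Example[K_contra, V_co]:
    def __getitem__(self, key: K_contra, /) -> V_co: ...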

A good idea.

from typing import Callable

# `V` in the next line is inferred as invariant by Pyright, which is our main concern in this example.
class LazyMap[K, V]:  
    __getter: Callable[[K], V]
    __cache: dict[K, V]

    def __init__(self, getter: Callable[[K], V], /) -> None:
        self.__getter = getter
        self.__cache = {}

    def __getitem__(self, key: K, /) -> V:
        try:
            value = self.__cache[key]
        except KeyError:
            value = self.__getter(key)
            self.__cache[key] = value
        return value


m0: LazyMap[object, int] = LazyMap(id)
m1: LazyMap[object, object] = m0  # Reported.

The last line gets reported because V is currently inferred as invariant and object does not equal int. But what would actually be unsafe about it? Everything you will ever get out of the map is an int, which is also an object.

class A: ...
class B(A): ...
class C(B): ...

d0: dict[B, B]

d0 = {}
d1: dict[A, B] = d0 # Dangerous.
d1[A()] = B()
fake_b = next(iter(d0.keys()))  # Type inferred as `B`, but is actually `A`.
assert isinstance(fake_b, B)  # Boom.

d0 = {}
d2: dict[C, B] = d0  # Dangerous.
d0[B()] = B()
fake_c = next(iter(d2.keys()))  # Type inferred as `C`, but is actually `B`.
assert isinstance(fake_c, C)  # Boom.

d0 = {}
d3: dict[B, A] = d0  # Dangerous.
d3[B()] = A()
fake_b = next(iter(d0.values()))  # Type inferred as `B`, but is actually `A`.
assert isinstance(fake_b, B)  # Boom.

d0 = {}
d4: dict[B, C] = d0  # Dangerous.
d0[B()] = B()
fake_c = next(iter(d4.values()))  # Type inferred as `C`, but is actually `B`.
assert isinstance(fake_c, C)  # Boom.

Thanks, I understand now. Can’t you just explicitly declare V with the right variance, as @Daverball said here? You’re using dynamic features of Python here, and I don’t think it’s unreasonable to expect to have to be explicit.

At least that’s what I’ve always been told when I complain that it’s hard to annotate functions that accept duck-typed arguments correctly :grinning:

For dict, it’s members like __setitem__ that make V invariant instead of covariant and members like keys and items that make K invariant instead of contravariant. These members don’t exist on LazyMap in my example.
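
Roughly, auto variance is derived from where each type variable appears across the class’s members; a sketch with made-up class names:

from collections.abc import Iterator


class ReadOnly[K, V]:
    def __getitem__(self, key: K, /) -> V: ...
    # K appears only in parameter positions -> inferred contravariant
    # V appears only in return positions    -> inferred covariant


class DictLike[K, V]:
    def __getitem__(self, key: K, /) -> V: ...
    def __setitem__(self, key: K, value: V, /) -> None: ...  # V now also appears in a parameter position
    def keys(self) -> Iterator[K]: ...                        # K now also appears in a return position
    # K and V each appear in both positions -> both inferred invariant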

The PEP 695 style of type vars currently doesn’t support explicit variance. There is already a proposal for it: Proposal: Optional Explicit covariance/contravariance for PEP 695 - Ideas - Discussions on Python.org.

This is explicitly unsafe due to the mutability.

If we extend your example by just three more lines, the issue becomes clear:

k = object()
m1[k] = object()
x = m0[k]  # x is an instance of object, but not an int, breaking expectations of m0

LazyMap doesn’t support __setitem__, so no.

*sigh* It doesn’t, but the point I’m getting at still applies, because the inner dict does. You can’t freely change that from the outside while reusing the type vars publicly.

A more detailed example that actually breaks it is incoming in a moment.

from typing import Callable

# `V` in the next line is inferred as invariant by Pyright, which is our main concern in this example.
class LazyMap[K, V]:  
    __getter: Callable[[K], V]
    __cache: dict[K, V]

    def __init__(self, getter: Callable[[K], V], /) -> None:
        self.__getter = getter
        self.__cache = {}

    def __getitem__(self, key: K, /) -> V:
        try:
            value = self.__cache[key]
        except KeyError:
            value = self.__getter(key)
            self.__cache[key] = value  # This line right here precludes it
        return value


m0: LazyMap[object, int] = LazyMap(id)
m1: LazyMap[object, object] = m0
# m1.__getitem__ internally calls self.__cache's __setitem__, whose value type was bound to int

Your internal use of __setitem__ here prevents this from being provably safe. There’s some fun I could get into with this, but we’re going in circles here. If you are using a dict as a mutable mapping, dicts are invariant for a reason. It being internal doesn’t change that.

Yes, but nobody is forcing you to use PEP 695. You can still define TypeVars the old way, with the variance you want. I agree it would be nice if you could declare variance with PEP 695 syntax, but it was never intended as a complete replacement for the old way of declaring type variables, so you will just have to put up with sometimes still needing to declare a TypeVar manually.
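
For reference, a sketch of what that manual declaration could look like for LazyMap; note that the type checker will largely trust the declared variance here, which is exactly the trade-off being discussed:

from collections.abc import Callable
from typing import Generic, TypeVar

K_contra = TypeVar("K_contra", contravariant=True)
V_co = TypeVar("V_co", covariant=True)


class LazyMap(Generic[K_contra, V_co]):
    def __init__(self, getter: Callable[[K_contra], V_co], /) -> None:
        self.__getter = getter
        self.__cache: dict[K_contra, V_co] = {}

    def __getitem__(self, key: K_contra, /) -> V_co:
        try:
            return self.__cache[key]
        except KeyError:
            value = self.__getter(key)
            self.__cache[key] = value
            return value


m0: LazyMap[object, int] = LazyMap(id)
m1: LazyMap[object, object] = m0  # accepted, since the variance is declared explicitly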

You not wanting to do that is not enough of a reason to make auto variance less type safe.

Please show me how that would be less type safe.

from collections.abc import Callable
from functools import lru_cache


class LazyMap[K, V]:  
    __getter: Callable[[K], V]

    def __init__(self, getter: Callable[[K], V], /) -> None:
        self.__getter = lru_cache(getter)

    def __getitem__(self, key: K, /) -> V:
        return self.__getter(key)

m0: LazyMap[object, int] = LazyMap(id)
m1: LazyMap[object, object] = m0  # this is fine

FWIW, simply not using a dict in a way that attaches to your public types removes the invariance. You’re no longer telling the type checker that your internal type is invariant and depends on the public generics. It’s not that what you were doing is unsafe per se; it’s that you told the type checker you were using one thing, and it detected an incompatibility. “Fixing” that detection would require the type checker to know all possible ways the code will be used, which can’t happen, since static analysis isn’t guaranteed the full code graph the way a compiler checking this would be.

Just because it is type safe in one narrow example doesn’t change the fact that all of your other examples weren’t type safe. Do you expect the type checker to perform a deep introspection of your entire implementation just to broaden the variance in case you don’t use a private member in an invariant way? Do you want to wait minutes for the type checker to do its job instead of seconds?

To put it another way: you have not demonstrated how the type checker is supposed to distinguish between the case where choosing the wrong variance would hide an error in the implementation and the case where it would be fine to choose wrong because only a safe subset of operations has been performed.

The variable being private does not prevent the class’s implementation from performing unsafe operations.

In this example, can calling m1.__getitem__ ever actually get you into trouble? You can’t pass in a value of type V or anything that would let __getitem__ know it was being called as a LazyMap[object, object] instead, so there’s nothing you can do to make m0.__getitem__ return a value of the wrong type (unless you directly access the private member m1.__cache).

So for example, would it be possible to add a new feature Private[] to be used as follows:

from typing import Callable, Private

class LazyMap[K, V]:  
    _getter: Private[Callable[[K], V]]
    _cache: Private[dict[K, V]]

    def __init__(self, getter: Callable[[K], V]) -> None:
        self._getter = getter
        self._cache = {}

    @Private
    def _update_cache(self, key: K, value: V) -> None:
        self._cache[key] = value # "Safe" use of private member bypassing variance

    def __getitem__(self, key: K, /) -> V:
        try:
            value = self._cache[key]
        except KeyError:
            value = self._getter(key)
            self._update_cache(key, value) # Safe call to private method
        return value

m0: LazyMap[object, int] = LazyMap(id)
m1: LazyMap[object, object] = m0
m1['foo'] # Safe: returns an int, which is a valid object
m1._cache['bar'] = object() # Unsafe: accessing private member
m1._update_cache('bar', object()) # Unsafe: calling private method

This would allow safe type checking for both library authors and library users without any hacks or workarounds, and would solve the problem in this thread of being able to correctly infer variance (private members would be ignored when calculating variance). It would also let you actually validate your assumptions that your type can be used safely with the variance you want - if you just force a particular variance with a TypeVar and then # type: ignore any associated type warnings, you have to rely on hoping you reasoned about it correctly. But with Private the type checker can actually check these assumptions for you.

The only final question is - is this actually safe? Or is there still some way you can break m0 by accessing its public interface through m1? I think there’s actually a bug in both mypy and pyright, because the way I thought you might try breaking this is by adding:

    def update_getter(self, getter: Callable[[K], V]) -> None:
        self._getter = getter

Then you could do m1.update_getter(lambda x: x) to break m0. But I think the presence of this method should actually force V to be invariant (well, contravariant - but invariant when combined with the other definitions). To see why, try this class:

from typing import Callable, Generic, TypeVar, reveal_type

T_co = TypeVar('T_co', covariant=True)

class Wrapper(Generic[T_co]):
    def __init__(self, value: T_co):
        self._value = value

    def get(self) -> T_co:
        return self._value

    #def set(self, value: T_co) -> None:
    #    self._value = value

    def set_from(self, fn: Callable[[], T_co]) -> None:
        self._value = fn()

def f(w: Wrapper[object]):
    reveal_type(w.get())
    #w.set('foo')
    w.set_from(lambda: 'foo')

x: Wrapper[int] = Wrapper(1)
f(x)
print(x.get())

Both mypy and pyright report no errors here, despite the fact that x.get() returns the string 'foo'. However if you uncomment the Wrapper.set() function, they both complain about T_co being used non-covariantly.

The reason this is not a solution is the same as the reason why your last example does not raise any type errors, even though it arguably should: it would require type checkers to check multiple levels deep, i.e. they would have to look at the implementation of a method in addition to its signature in order to detect variance issues. While that is certainly possible, the dependency chain can be taken arbitrarily far; when do you want the type checker to stop following it?
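
To make the depth problem concrete, here is a hypothetical sketch in the same style as the Wrapper example above. Just like that example, this should pass both checkers silently, because nothing in the public entry point writes the type variable directly; a checker that wanted to catch it would have to follow the private helpers all the way down to the actual write, and that chain can be made arbitrarily long.

from collections.abc import Callable
from typing import Generic, TypeVar

T_co = TypeVar("T_co", covariant=True)


class DeepWrapper(Generic[T_co]):
    def __init__(self, value: T_co) -> None:
        self._value = value

    def get(self) -> T_co:
        return self._value

    def set_from(self, fn: Callable[[], T_co]) -> None:
        self._step1(fn)        # no write of T_co visible at this level

    def _step1(self, fn: Callable[[], T_co]) -> None:
        self._step2(fn)        # still no visible write

    def _step2(self, fn: Callable[[], T_co]) -> None:
        self._value = fn()     # the write that makes the declared covariance unsound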

It is not realistic to perform bidirectional inference that deep at reasonable speed, so you would be sacrificing performance for the benefit of catching the small number of false positives/negatives introduced by shallow inference. So in cases like this it’s up to us to spot these problems and change the variance of the type vars accordingly. That is the price you pay for getting near-instant feedback from mypy.

Other type checkers in the future may make a different trade-off and perform more deep inference at the cost of speed.