Whether private members should affect PEP 695's type-var variance

That example works because, by using a protocol without __setitem__ as the type hint, you’re promising the type checker that new items won’t be set, while also communicating to all consumers via the annotation that setting items is not part of the object’s contract (and the type checker will yell at you if it sees items being set). With that promise in place, the invariance forced by having both __setitem__ and __getitem__ is no longer there.

The internal dict there would otherwise be invariant, and it shares the parent class’s type vars. The protocol structurally defines what’s allowed to be used, and because __setitem__ is not present, it carries the promise that __setitem__ won’t be called; without that promise, the type checker doesn’t know it won’t be used. (This isn’t a compiled language being checked at compile time; static analysis can’t know everything here.)

They may not be technically private, but they are treated as private.

As I just confirmed, neither Mypy nor Pyright supports accessing such members with their mangled names like “_Class__name”, so it should be safe to say such members would not be accessed outside their enclosing classes.
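A minimal sketch of the mechanics (the class and attribute names here are illustrative): at runtime the mangled name is still reachable, but per the observation above, type checkers such as Mypy and Pyright reject the access.

```python
class Example:
    def __init__(self) -> None:
        self.__secret = "internal"  # name-mangled to _Example__secret at runtime

e = Example()
# Runtime access through the mangled name succeeds:
print(e._Example__secret)  # prints: internal
# ...but Mypy and Pyright reportedly flag the line above as an
# attribute error, so static analysis treats the member as
# inaccessible from outside the class.
```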

While __dictionary is private, how it is used doesn’t affect variance of K or V, even if __setitem__ gets called. Class type-vars are already treated as invariant for internal usages.

I’ve bolded the relevant section of what’s already been said. You are exposing the internals in your types. Either don’t do that, or accept that you are the one who tied the variance together like that.

If the dict is truly an internal implementation detail that you don’t want to deal with checking, its type information shouldn’t be leaking out. You can leave it typed as dict[Any, Any] and rely on it being internal and only accessed as supported; Any coerces to the specific type implicitly just fine. As written, you’ve specified that this is related to the exposed type and have created the connection, so the type checker is correctly informing you of this issue.

You can’t remove checking this (as a blanket action on the type system as a whole) without breaking libraries that want their own internals checked too. In my own code and code that I maintain, I’d happily use the protocol if that was the intent because it means my own code is checked that the important invariants are upheld internally.

This version is checked so that internal use actually conforms to the exposed, intended use:
from typing import Protocol

class SupportsGetItem[K, V](Protocol):
    def __getitem__(self, key: K, /) -> V: ...

class Translator[K, V]:
    __dictionary: SupportsGetItem[K, V]  # whatever is assigned to this, only using `__getitem__` is supported

    def __init__(self, dictionary: dict[K, V], /) -> None:
        self.__dictionary = dictionary

    def __getitem__(self, key: K, /) -> V:
        return self.__dictionary[key]
This version decouples the internal use from the public interface, but loses the ability for the type checker to enforce that the internal use matches the exposed types:
from typing import Any

class Translator[K, V]:
    __dictionary: dict[Any, Any]

    def __init__(self, dictionary: dict[K, V], /) -> None:
        self.__dictionary = dictionary

    def __getitem__(self, key: K, /) -> V:
        return self.__dictionary[key]

Both options are available in the type system currently, and it’s not necessary to remove the ability for people to check things they want checked.

I’m still not sure whether you would agree that __dictionary in my example is private and will not get accessed outside the implementation of Translator. If the answer is no, that is the basic point on which we disagree.

With all due respect, whether we agree on __dictionary being accessible outside and the meaning of private in python isn’t going to change the situation here, as libraries and their authors are type checker users too. The ability to have internals checked for consistency with the exposed API is a feature for many of those users, not a bug. The type system also gives you ways to choose how rigidly that API boundary is defined.

With that said, it can be accessed outside of the translator in at least two ways. One of them, as you pointed out, isn’t supported by some type checkers already (Mypy and Pyright aren’t the only type checkers in the ecosystem). I disagree with the type checkers adopting this behavior, but it isn’t relevant to the overall answer here.

The other is that you are accepting a mutable reference in __init__. There is no guarantee the caller is not still doing other things with the same dict.

Taking a step back for a moment, I think there are two distinct issues here:

  1. What’s the correct behavior for type checkers?
  2. How do we make it easy for users to express what they want checked?

Right now, I believe the answer to the first one is “the type checker is correct for the case at hand, and there are several other ways to express it depending on your intent”.

The latter has been a larger, recurring topic with a lot of concerns. Variance and API boundaries not being “easy” was definitely brought up there, and I think this is a real thing we need to address.

I’m not sure I agree that a protocol with a single method counts as significant effort, but I have a feeling the effort here is less the protocol itself and more the process of determining that it was needed and why. If that’s accurate, I think a more productive way forward would be pushing type checkers to have built-in explanations of what caused a specific variance to be inferred. This is a not-uncommon pain point in the type system, and even those with experience in it, or with similar yet different languages (with slightly different concerns), get surprised by its effects at times.

Sadly, it cannot always be worked around with protocols.

import copy

class Translator[K, V]:
    __dictionary: dict[K, V]

    def __init__(self, dictionary: dict[K, V], /) -> None:
        self.__dictionary = dictionary

    def __getitem__(self, key: K, /) -> V:
        value = self.__dictionary[key]
        value = copy.copy(value)
        self.__dictionary[key] = value
        return value

__dictionary is accessed as a mutable mapping internally, so its type can at best be abstracted as a mutable-mapping protocol with both __getitem__ and __setitem__ on it, and V will still be invariant instead of covariant if the other conditions remain the same.

Why would you want to reassign the value in the dictionary if you always return a fresh copy? The assignment is redundant in this case, unless you want the reference count for the original object to decrease. As soon as you set values in the implementation, it simply is no longer covariant; there are no ifs and buts about it. If you want to ignore type safety, you have to do it manually and at your own peril.

If your aim is to create a one-time copy of the elements so you don’t keep a reference to the original objects, there are other covariant protocols you can use, such as SupportsItems[K, V] and Iterable[tuple[K, V]].

Please don’t pay too much attention to method implementations as they are not important for type-var variances.

Here I’m giving a more practical example:

from typing import Callable

class LazyMap[K, V]:
    __getter: Callable[[K], V]
    __cache: dict[K, V]

    def __init__(self, getter: Callable[[K], V], /) -> None:
        self.__getter = getter
        self.__cache = {}

    def __getitem__(self, key: K, /) -> V:
        try:
            value = self.__cache[key]
        except KeyError:
            value = self.__getter(key)
            self.__cache[key] = value
        return value

Yes, you can construct examples that are safe for the consumer of the API, I fail to see how that is relevant.

How is the type checker supposed to know whether or not you wanted type safety on the internal dict in the implementation to be enforced or not unless you explicitly tell it to? If you don’t want type safety then use Any. Making private members implicitly unsafe is to the detriment of everyone that wants type safety in their implementation.

I want internal type safety. When I write value = self.__cache[key], I need to be sure that key has been typed as K and value will be typed as V. Typing __cache as dict[Any, Any] would ruin this type safety. But when __cache is not exposed outside the class, it should not affect the variance of K or V on the class.

I’m just skimming this discussion for interest, and in isolation this example seems fine to me. But you don’t give an example of what goes wrong with it. The original issue was around variance, which I’ll freely admit is something I find incredibly confusing. But I can’t see how to construct something that’s “wrong” for this example, just knowing that “it’s about variance”.

Can you show an example of something that goes “wrong” with this LazyMap example?

You said initially that “both arguments of dict are invariant”. I don’t honestly understand why this is (I could probably work it out if I worked through some examples, so don’t worry about explaining) but I don’t see why I’d expect LazyMap to be any different, if that’s the case.

You can achieve this by declaring K and V manually with the variance you want. Auto variance should be safe by default; if you want to do unsafe things, then you have to be explicit, and I don’t think that is too much to ask. PEP 695 was supposed to be a simplification for the most common cases; it was never intended to replace manually specifying TypeVar variance entirely, since that is not possible.

That being said, I think it would be nice to extend TypeVars to allow combining auto variance with either covariance or contravariance, so you would get an error if the auto variance disagrees with your declared variance, which you could then explicitly ignore to force your declared variance.

Then we could extend PEP 695 to set the covariant flag in addition to the auto-variance flag for any type vars with the suffix _co, and the contravariant flag in addition to auto variance for the suffix _contra. This would strike a good balance between taking advantage of PEP 695 syntax and keeping type safety. The behavior of type vars without either of those suffixes would remain the same, although this would leave a gap in the specification for when you want to force invariance.

A good idea.

from typing import Callable

# `V` in the next line is inferred as invariant by Pyright, which is our main concern in this example.
class LazyMap[K, V]:  
    __getter: Callable[[K], V]
    __cache: dict[K, V]

    def __init__(self, getter: Callable[[K], V], /) -> None:
        self.__getter = getter
        self.__cache = {}

    def __getitem__(self, key: K, /) -> V:
        try:
            value = self.__cache[key]
        except KeyError:
            value = self.__getter(key)
            self.__cache[key] = value
        return value


m0: LazyMap[object, int] = LazyMap(id)
m1: LazyMap[object, object] = m0  # Reported.

The last line gets reported because V is currently inferred as invariant and object doesn’t equal int. But what would be unsafe about it? Everything you get from the map is an int, which is also an object.

class A: ...
class B(A): ...
class C(B): ...

d0: dict[B, B]

d0 = {}
d1: dict[A, B] = d0 # Dangerous.
d1[A()] = B()
fake_b = next(iter(d0.keys()))  # Type inferred as `B`, but is actually `A`.
assert isinstance(fake_b, B)  # Boom.

d0 = {}
d2: dict[C, B] = d0  # Dangerous.
d0[B()] = B()
fake_c = next(iter(d2.keys()))  # Type inferred as `C`, but is actually `B`.
assert isinstance(fake_c, C)  # Boom.

d0 = {}
d3: dict[B, A] = d0  # Dangerous.
d3[B()] = A()
fake_b = next(iter(d0.values()))  # Type inferred as `B`, but is actually `A`.
assert isinstance(fake_b, B)  # Boom.

d0 = {}
d4: dict[B, C] = d0  # Dangerous.
d0[B()] = B()
fake_c = next(iter(d4.values()))  # Type inferred as `C`, but is actually `B`.
assert isinstance(fake_c, C)  # Boom.

Thanks, I understand now. Can’t you just explicitly declare V with the right variance, as @Daverball said here? You’re using dynamic features of Python here, and I don’t think it’s unreasonable to expect to have to be explicit.

At least that’s what I’ve always been told when I complain that it’s hard to annotate functions that accept duck-typed arguments correctly :grinning:

For dict, it’s members like __setitem__ that make V invariant instead of covariant and members like keys and items that make K invariant instead of contravariant. These members don’t exist on LazyMap in my example.

The PEP 695 form of type vars currently doesn’t support explicit variance. There is already a proposal for it: Proposal: Optional Explicit covariance/contravariance for PEP 695 - Ideas - Discussions on Python.org .
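Until then, the pre-PEP 695 spelling still allows declaring variance explicitly. A sketch (the names are illustrative, and some type checkers may still warn about a mutable attribute typed with a covariant variable):

```python
from typing import Callable, Generic, TypeVar

K = TypeVar("K")                        # left invariant here
V_co = TypeVar("V_co", covariant=True)  # variance declared explicitly

class LazyMap(Generic[K, V_co]):
    def __init__(self, getter: Callable[[K], V_co], /) -> None:
        self.__getter = getter
        self.__cache: dict[K, V_co] = {}

    def __getitem__(self, key: K, /) -> V_co:
        try:
            return self.__cache[key]
        except KeyError:
            value = self.__cache[key] = self.__getter(key)
            return value

m0: LazyMap[object, int] = LazyMap(id)
m1: LazyMap[object, object] = m0  # accepted: V_co is declared covariant
```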

This is explicitly unsafe due to the mutability.

If we extend your example by just three more lines, the issue is clear

k = object()
m1[k] = object()
x = m0[k]  # x is an instance of object, but not an int, breaking expectations of m0

LazyMap doesn’t support __setitem__, so no.