Proposal: Optional Explicit covariance/contravariance for PEP 695

PEP 695 introduced a new type parameter syntax, where the variance of each parameter is inferred implicitly by the type checker. (PEP 695 – Type Parameter Syntax | peps.python.org)

I see two main reasons why an option to state the desired variance explicitly would be beneficial:

Reason 1. Implicit variance can be changed by accident. Author A might write a class with the intent that it is covariant in a certain parameter T. Later, Author B adds an extra method to the class, inadvertently making it invariant in T. This has downstream effects on user C, who imports the class. The change might pass silently if there are no checks that verify the covariance. Being able to state the variance explicitly is an elegant way to have it verified automatically by the type checker.

Reason 2. Self-explanatory documentation. When generating documentation with a tool like Sphinx, or simply calling help(SomeClass) or SomeClass? in a Jupyter notebook, the covariance/contravariance information should be visible. Previously, this was achieved by the T_co, T_contra naming convention. However, since PEP 695 does not use this convention, I presume the intention is to deprecate it.
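As a rough illustration of what is visible at runtime today: type variables created with typing.TypeVar carry `__covariant__` and `__contravariant__` flags, and on Python 3.12+ PEP 695 parameters set `__infer_variance__` instead. The helper `describe_variance` and the classes `Source`, `Sink` and `Cell` below are hypothetical names made up for this sketch; they are not part of any library.

```python
from typing import Generic, TypeVar

T_co = TypeVar("T_co", covariant=True)
T_contra = TypeVar("T_contra", contravariant=True)
T = TypeVar("T")  # no flag: declared invariant in the pre-PEP 695 spelling


class Source(Generic[T_co]):
    def get(self) -> T_co: ...


class Sink(Generic[T_contra]):
    def put(self, item: T_contra) -> None: ...


class Cell(Generic[T]):
    def get(self) -> T: ...
    def put(self, item: T) -> None: ...


def describe_variance(cls: type) -> dict[str, str]:
    """Hypothetical helper: report what each type parameter declares at runtime."""
    result: dict[str, str] = {}
    for param in getattr(cls, "__parameters__", ()):
        # PEP 695 parameters (Python 3.12+) only say "infer it"; the actual
        # variance is known to the type checker, not to the runtime.
        if getattr(param, "__infer_variance__", False):
            result[param.__name__] = "inferred"
        elif getattr(param, "__covariant__", False):
            result[param.__name__] = "covariant"
        elif getattr(param, "__contravariant__", False):
            result[param.__name__] = "contravariant"
        else:
            result[param.__name__] = "invariant"
    return result
```

A documentation tool could report declared variance this way for the old spelling, but for PEP 695 classes it would only ever see "inferred", which is the gap described in Reason 2.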


Proposal

Explicit is better than implicit.

The suggestion is to use the unary plus (__pos__) and minus (__neg__) operators to create explicit variance annotations:

class ClassA[T1, -T2, +T3](list[T1]):
    def method1(self, a: T2) -> None:
        ...

    def method2(self) -> T3:
        ...

Here, we explicitly mark T2 as contravariant and T3 as covariant. Later, if someone were to add a method to ClassA that violates this variance, we get direct feedback from the type checker. The plus and minus symbols are already widely used for this purpose in the type theory literature.
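For comparison, the same intent can already be written with the pre-PEP 695 spelling, where variance is declared on each TypeVar. This is only a sketch of the existing mechanism, not of the proposed syntax; the names `T2_contra` and `T3_co` follow the usual naming convention and are chosen for illustration.

```python
from typing import Generic, TypeVar

T1 = TypeVar("T1")                                   # invariant
T2_contra = TypeVar("T2_contra", contravariant=True)  # would be written -T2
T3_co = TypeVar("T3_co", covariant=True)              # would be written +T3


class ClassA(list[T1], Generic[T1, T2_contra, T3_co]):
    def method1(self, a: T2_contra) -> None:
        ...

    def method2(self) -> T3_co:
        ...
```

With this spelling, type checkers compare the declared variance against how each type variable is used in the class body, which is the check the proposal wants to keep available under the new syntax.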

In regard to explicit variance, the PEP authors write (PEP 695 – Type Parameter Syntax | peps.python.org):

We considered adding syntax for specifying whether a type parameter is intended to be invariant, covariant, or contravariant. The typing.TypeVar mechanism in Python requires this. A few other languages including Scala and C# also require developers to specify the variance. We rejected this idea because variance can generally be inferred, and most modern programming languages do infer variance based on usage. Variance is an advanced topic that many developers find confusing, so we want to eliminate the need to understand this concept for most Python developers.

However, the point here is to make it an optional feature, which can be used to make the annotation more explicit and to avoid having to add extra tests that validate the variance assumptions.

Note: Annotations without a prefix would be treated as specified by PEP 695: in particular, there is no explicit annotation for invariant types. These are superfluous, since adding variance to an invariant type only widens the acceptable replacement types.

One could argue it’s not that intuitive to read but in my opinion, you will only use this feature if you already understand variance, in which case the syntax will be intuitive.

Considering the syntax is simple, unlikely to conflict with future features, and in the case of advanced users, will prevent potential bugs that are hard to fix in the future, I like this.

I have no idea how easy it would be to implement, but at first glance it doesn’t seem that difficult.

You’ll only write this syntax if you understand variance. How likely is it that developers will encounter this syntax in (for example) libraries that they use, or PRs submitted to their project? That’s a genuine question - I have no idea how likely this is. But I do know that I’ve never managed to get covariance and contravariance straight in my mind, so this syntax is essentially line noise to me.

I’m not sure this should be necessary. Variance can already be detected from how the type variable is used, which is statically available information. At best, you give people a way to add redundant information; at worst, you allow them to be wrong, creating a mismatch between what the variance should be and what it is, possibly obscuring variance-related issues. I’d prefer that the tooling that checks for variance-related issues got better at detecting them; that burden should not fall on users.

Could you please give examples where this isn’t the case, and where this new syntax would be required to fix an API?

My guess would be if a type-variable for a function parameter is inferred as invariant in one release of a library, allowing nominal subtypes to be passed in, but the library dev intends to use that argument contravariantly in a future non-breaking release.


Is this example valid?

v1.0

def sync[-T: Animal](animals: list[T]):
    for animal in animals:
        animal.colour = red

v1.1

def sync[-T: Animal](animals: list[T]):
    for animal in animals:
        animal.colour = red
    animals.append(Cat(colour=red))

Without this syntax, a user could pass in a list[Dog] in v1.0, but that would become invalid in v1.1. With this syntax, the user must type their input as list[Animal] (or a nominal supertype of Animal).
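The underlying hazard in the v1.1 body can be shown without any new syntax. The sketch below assumes simple `Animal`, `Dog` and `Cat` classes and a function name, `repaint_and_extend`, invented for illustration. Because list is invariant, type checkers reject the call at the bottom; at runtime nothing stops it, and the list of dogs ends up containing a cat.

```python
class Animal:
    def __init__(self, colour: str = "grey") -> None:
        self.colour = colour


class Dog(Animal): ...


class Cat(Animal): ...


def repaint_and_extend(animals: list[Animal]) -> None:
    for animal in animals:
        animal.colour = "red"
    # Appending is what makes a list[Dog] argument unsafe.
    animals.append(Cat(colour="red"))


dogs: list[Dog] = [Dog(), Dog()]
# Type checkers flag this call: list[Dog] is not a list[Animal].
repaint_and_extend(dogs)
```

If the function only iterated, annotating the parameter as collections.abc.Sequence[Animal] would accept list[Dog], since Sequence is covariant.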

I don’t know of any, I’m not sure there are any. The sentence you quoted is from the PEP.

Static information is not always available. For example, when importing a class from a library and calling help(TheClass) or TheClass? in a Jupyter notebook. Or when running a documentation builder like Sphinx. In such scenarios, having the variance information available at a glance is beneficial. Running the static analysis when calling help(TheClass) seems overcomplicated to me.

Obviously, optional features are not necessary. But they can add quality of life and additional safety layers, like the example given in the OP.

The main alternative I see at the moment is the T_co / T_contra naming convention. Not sure what the plans of the type-checkers are for the new syntax in this regard.

They do come at a cost, and I pointed out that one of those is potentially safety in my original response.

  1. either special meaning in the parser for ± here, or these are now infix ± on types, which could block off future syntax.

  2. “and at worst, you allow them to be wrong, creating a mismatch between what the variance should be and what it is, possibly obscuring variance-related issues.”

I don’t particularly think that help(class) providing variance info in a Jupyter notebook is a compelling case. Why would this be useful information in this context? What safety would having this information provide that isn’t better provided by tools made to handle questions of variance?

I feel similarly about Sphinx, but there it could actually be solved by a Sphinx plugin. That avoids the situation where a programmer who got the variance wrong makes the documentation wrong, or where the annotation isn’t updated to match a change by someone who knew enough to fix an issue but not that the fix altered the type variable’s variance, which is a complex topic in typing.

I can definitely see your point. It could be the case that the syntax is used by someone who understands variance, but at a later date that same code (or code that depends on it) will be maintained by someone who does not understand it.

The problem with variance is that if you break it, you may not see the problem until a later date, at which point it may be too late to change it back without breaking interfaces. So if someone accidentally breaks variance, it seems preferable that they be prevented from doing so, instead of letting them do so and face the consequences later.

I’ll give a minimal example of what I mean:

Suppose I create the following interface

class AnimalCreator[T: Animal]:
    def create_animal(self) -> T:
        ...

T should be covariant, meaning that if I create the following function:

def populate_zoo_cage[T: Mammal](creator: AnimalCreator[T]):
    print(f"Populating with {creator.create_animal()}")

I am able to pass AnimalCreator[Cat]
(co-variance means that if Cat subtypes Mammal, then AnimalCreator[Cat] subtypes AnimalCreator[Mammal])

So far everything is fine. I created my function and I’m using it in a lot of places.

Now let’s say someone was not careful and added the following function to the interface:

class AnimalCreator[T: Animal]:
    def create_animal(self) -> T:
        ...

    def is_healthy(self, animal: T) -> bool:
        ...

At first glance there is no problem. In fact, my populate_zoo_cage function will behave exactly the same at runtime. However, a type checker may now report an error:

creator: AnimalCreator[Cat]

# This is an error. AnimalCreator[Cat] is no longer a subtype of AnimalCreator[Mammal]
populate_zoo_cage(creator)

This is because, from the perspective of the type checker, the function is allowed to do the following in its implementation:

def populate_zoo_cage[T: Mammal](creator: AnimalCreator[T]):
    ...
    creator.is_healthy(Dog())
    ...

These kinds of errors are hard to notice, and by the time they surface it may be too hard to revert the change to the interface. I think that even though this is an advanced subject, it would be preferable for the next developer to see this error upon adding the new function to the interface rather than realizing the mistake only in hindsight.
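For reference, this is roughly how the same interface behaves when the variance is declared up front with the existing TypeVar flags. It is a sketch under some assumptions: `CatCreator` is a hypothetical concrete subclass, and `populate_zoo_cage` here takes `AnimalCreator[Mammal]` directly (rather than being generic) and returns the message, so that the subtyping relationship is what makes the call valid.

```python
from typing import Generic, TypeVar


class Animal: ...


class Mammal(Animal): ...


class Cat(Mammal): ...


T_co = TypeVar("T_co", bound=Animal, covariant=True)


class AnimalCreator(Generic[T_co]):
    def create_animal(self) -> T_co:
        raise NotImplementedError

    # Adding the method below would be reported by type checkers right here,
    # at its definition, because a covariant type variable may not appear
    # as a parameter type:
    #
    # def is_healthy(self, animal: T_co) -> bool: ...


class CatCreator(AnimalCreator[Cat]):
    def create_animal(self) -> Cat:
        return Cat()


def populate_zoo_cage(creator: AnimalCreator[Mammal]) -> str:
    # Accepting AnimalCreator[Cat] here relies on covariance.
    return f"Populating with {type(creator.create_animal()).__name__}"
```

The difference from inferred variance is where the error shows up: in the class that was changed, instead of at some distant call site.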

Returning to your point, the syntax will only be used by people who understand the subject, and yet it will keep future maintainers from making these kinds of mistakes even if they don’t yet understand it.

I’m in support of this. [Feature Request] TypeVar variance definition alias · Issue #813 · python/typing · GitHub seems like a related discussion. Personally, though, I think the only use cases I’d currently have for this would be for wrangling things around the type system.

From discussions with Jelle on the topic, I think the advice was just drop the prefix. I’ve been working on something that will make the parameter names useable at runtime so they become part of API and having this strange sort of Hungarian notation for the variance doesn’t really sit well with me.

Honestly, that API (which admittedly is made up, and therefore lacking context to help) is simply too complex for me to follow. My immediate reaction (in a code review) would be to say that it needs to be simplified.

But again, I have to reiterate, I’m not maintaining million-line Python projects being worked on by big teams of developers. Maybe this sort of thing makes more sense there (although if it does, that suggests that Java might not be as terrible as I assume it is, either :slightly_smiling_face:)

How often would this be specified in practice? If it’s only used in the standard library and a vanishingly small number of classes where the author wants to ensure variance, then why not just use an actual symbol like typing.covariant[...] and typing.contravariant[...]? It would be easier to search for when it does come up, easier to read, harder to miss, and so rare anyway.

You may also want to use + and - for something more common?

Also, using keywords can be a good way to test this idea out to see if it works in practice. The symbols can be added to typing_extensions and be made to work with type checkers, and then you can verify how often they’re used in practice.
