Proposal: Enhancing Type Safety for `__set_name__` in Descriptors

Abstract:
The current typing specification does not address the case where the owner type annotated on a descriptor's __set_name__ method differs from the class in which the descriptor is actually assigned.
I propose a new static type check to address this problem.

Motivation:
A Python descriptor can implement __set_name__(self, owner, name, /).
This method is called when the owner class is created, and it receives the owner class and the name of the attribute the descriptor was assigned to.
The parameters of __set_name__ can be annotated like those of any other method.
However, under the current typing specification, a static type checker does not report a mismatch between the class in which the descriptor is assigned and the type annotated for owner.
If type checkers performed this check, a descriptor could be reused across different classes in a type-safe manner.
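
For readers less familiar with the hook, here is a minimal runtime sketch of when __set_name__ runs and what it receives. The names Recorder and Host are hypothetical and exist only for this illustration.

```python
class Recorder:
    """Hypothetical descriptor that records how __set_name__ was called."""

    def __set_name__(self, owner: type, name: str, /) -> None:
        # Called once, while the owner class is being created.
        self.owner = owner
        self.name = name


class Host:
    field = Recorder()  # triggers Recorder.__set_name__(Host, "field")


# Look the descriptor up in the class namespace directly.
recorder = Host.__dict__["field"]
print(recorder.owner is Host, recorder.name)  # True field
```

The owner annotation in this sketch is plain type; the proposal concerns what a type checker should do when that annotation is narrower than the class being created.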

Summary Examples:
Define abstract classes and/or protocols before defining the descriptor.
Annotating the second parameter of __set_name__ with type[...] of one of these classes constrains the classes in which the descriptor may be assigned.

from abc import ABC, abstractmethod
from typing import Protocol, Self


class MyProto(Protocol):
    def method1(self) -> str:
        ...


class MyAbstract(ABC):
    @abstractmethod
    def method2(self) -> int:
        ...


class Descriptor1:
    def __get__(self, instance, owner, /) -> Self | None:
        ...

    def __set_name__(self, owner: type[MyProto], name: str, /) -> None:
        ...


class Descriptor2:
    def __get__(self, instance, owner, /) -> Self | None:
        ...

    def __set_name__(self, owner: type[MyAbstract], name: str, /) -> None:
        ...


class ConcreteA(MyAbstract):
    def method1(self) -> str:
        return "hello from ConcreteA!"

    def method2(self) -> int:
        return 1234

    d1 = Descriptor1()  # OK
    d2 = Descriptor2()  # OK


class ConcreteB(MyAbstract):
    def method2(self) -> int:
        return 5678

    d1 = Descriptor1()  # Type checker error: `ConcreteB` is NOT a structural subtype of `MyProto`
    d2 = Descriptor2()  # OK


class ConcreteC:
    def method1(self) -> str:
        return "hello from ConcreteC!"

    d1 = Descriptor1()  # OK
    d2 = Descriptor2()  # Type checker error: `ConcreteC` is NOT a nominal subtype of `MyAbstract`


class ConcreteD:
    def method3(self) -> float:
        return 3.14

    d1 = Descriptor1()  # Type checker error: `ConcreteD` is NOT a structural subtype of `MyProto`
    d2 = Descriptor2()  # Type checker error: `ConcreteD` is NOT a nominal subtype of `MyAbstract`

Related information:

Any opinions are welcome. I hope this proposal will be beneficial for the Python community.

Is this something that needs to be taken care of in the typing standards somewhere? Shouldn’t this just be something type checkers can implement of their own accord?

Indeed, unlike PEP 695 or PEP 604, this can be realized without changing the language specification.
Therefore, a type checker's developer community or maintainers could implement it if they wanted to.
However, there may be other developers like me who use both mypy and pyright and want this feature in both.
I have also seen discussions at the Typing Council along the lines of “the interpretation of xxx should be unified across type checkers” after each type checker had implemented its own interpretation of xxx.

I would like to know the community’s opinion on whether this feature should be left to the discretion of each type checker, as part of the flexibility of the typing ecosystem, or whether a standard specification should be agreed on and adopted across type checkers.

Thank you.

Ultimately this is part of Python’s object model, so it really doesn’t need to be part of the type specification, since the language and its object model already specify at which point __set_name__ gets called and with which arguments.

So I’d really consider it a type checker bug if they don’t invoke __set_name__ for class level assignments, as they already do for __get__ and __set__ for instance/class level access.

But I can also see why this is perhaps not high on the list of priorities for maintainers of type checkers, since doing anything more complex than using the name in a __set_name__ is generally frowned upon and you’d already get a type error if your __get__ was annotated correctly. So all you’re really changing is how soon you know about the error and it doesn’t help you inside stub files anyways, since you don’t know if there is an actual assignment at runtime, so you can’t emit the error there.

In my understanding, standardization is generally done if different type checkers have different interpretations of how something should behave, i.e. if it is a point of divergence in what library authors need to annotate, for example. But I don’t think there is much to be discussed here. All it primarily needs is someone to do the work to actually implement this. You should ask the maintainers of the type checkers you care about directly, but I don’t think either mypy or pyright is going to be opposed to this if it is just a small addition.

Writing standards alone does not implement stuff in the type checkers. That still requires work done by someone, and in fact the same amount of work whether or not there is a standard to look at (assuming the expected behavior is obvious enough, which it is here). The fastest way to get this check to be added is for you to add it yourself.

I agree this doesn’t look like it requires a spec change. The spec also doesn’t say that type checkers should verify that in a for loop, the object being iterated over is iterable. That falls under the general idea that type checkers should model the runtime behavior and flag cases where they can see that something will fail. This example seems to fall under the same category.
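
To make the for loop comparison concrete, here is a small sketch. The function total is hypothetical; the point is that a type checker is expected to flag the loop because it models the runtime failure, not because the spec lists such a rule.

```python
def total(values: int) -> int:
    result = 0
    # A type checker is expected to flag this loop: int is not iterable.
    for value in values:
        result += value
    return result


# At runtime the same mistake surfaces as a TypeError.
try:
    total(3)
except TypeError as exc:
    error = str(exc)

print(error)
```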

I’d encourage you to contribute an implementation to type checkers that you are interested in.

This is a pretty obscure part of the object model. I’ve never seen it used in a code base.

Support for __set_name__ has been requested in the mypy issue tracker, but in the past four years it has received only two upvotes (thumbs-up reactions). And no one has ever requested this support in the pyright issue tracker.

This has helped my understanding.
Code that uses for loops is clearly far more common than code that defines custom descriptors.
If even something that common is left to each type checker without being written into the typing specification, I fully understand that a specification for __set_name__ does not need to be added or changed.

Thank you.

Thank you everyone.

I made this proposal because there was a descriptor in my project that I wanted to reuse in a type-safe manner.
Once the project I’m currently working on settles down, I’m considering proposing/contributing to the type checker I use.
Also, if there is someone who can implement this feature in any type checker on my behalf, I would be happy to cooperate.

By the way, just in case this got lost along the way: if your descriptor is a true descriptor and implements at least one of __get__/__set__/__delete__, you can currently annotate the owner/instance parameter of those methods with your Protocol. You will then get an error if you try to access/set/delete the descriptor on an object that doesn’t satisfy the Protocol.

So you get almost the same level of safety[1] this way.


  1. and in some cases actually more safety, since __set_name__ doesn’t take into account potentially unsafe subclassing

I understand that such errors can be raised at runtime in those situations.
I made this proposal thinking that if a type checker could flag such code through static analysis before execution, it could prevent bugs efficiently.

Thank you.

I misunderstood your post as referring to runtime behavior and wrote my reply based on that misunderstanding. Let me correct and add to that.

Indeed, even in static analysis, if the owner/instance is not a subtype of the protocol when __get__ is called, the type checker reports an error; pyright, for example, emits reportAttributeAccessIssue.

However, I thought that it would be more useful if it could be discovered earlier, at the point where __set_name__ is hooked.

Thank you.

The one code base that I thought might use __set_name__, since the library relies heavily on descriptors, is param, and they seem to sidestep that hook completely and get the owner and name through a different mechanism.

They do have this note