Issue: There are many ways to realize attributes, and they are mutually incompatible under the type system.
When annotating an attribute foo: Foo in a class or protocol, often all I care about is that getattr(obj, "foo") succeeds and produces an instance of Foo. Now, Python offers quite a few different ways to achieve this:
1. regular instance attributes
2. class attributes (ClassVar)
3. @property
4. @cached_property
5. custom descriptors
These are not interchangeable when using type hints. But there should be a way to type hint foo in a parent class that is compatible with the duck-typing assumption isinstance(getattr(obj, "foo"), Foo).
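For concreteness, here is a minimal sketch of the five variants above (Foo is a stand-in class); at runtime each of them makes getattr(obj, "foo") return a Foo, yet type checkers treat the declarations differently:

from functools import cached_property
from typing import ClassVar

class Foo: ...

class WithAttribute:
    def __init__(self) -> None:
        self.foo: Foo = Foo()          # 1. regular instance attribute

class WithClassVar:
    foo: ClassVar[Foo] = Foo()         # 2. class attribute

class WithProperty:
    @property
    def foo(self) -> Foo:              # 3. read-only property
        return Foo()

class WithCachedProperty:
    @cached_property
    def foo(self) -> Foo:              # 4. computed once, then cached on the instance
        return Foo()

class FooDescriptor:
    """A custom descriptor that always hands out a Foo."""
    def __get__(self, obj: object, objtype: type | None = None) -> Foo:
        return Foo()

class WithDescriptor:
    foo = FooDescriptor()              # 5. custom descriptor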
Introduce a special form Attribute (and maybe also MutableAttribute) so that
class A:
    foo: Attribute[Foo]

is compatible with (1)-(5) above. (Or alternatively, make foo: Foo the generic case and foo: Attribute[Foo] the case that is specifically satisfied by a regular instance attribute only.)
Use overloading of __getattr__ with Literals (ugly, imo):
class A:
    @overload
    def __getattr__(self, key: Literal["foo"]) -> Foo: ...
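For reference, a runnable version of that workaround (Foo and Bar are stand-in classes; checkers that support overloading __getattr__ on Literal keys resolve the attribute accesses below):

from typing import Any, Literal, overload

class Foo: ...
class Bar: ...

class A:
    @overload
    def __getattr__(self, key: Literal["foo"]) -> Foo: ...
    @overload
    def __getattr__(self, key: Literal["bar"]) -> Bar: ...
    def __getattr__(self, key: str) -> Any:
        if key == "foo":
            return Foo()
        if key == "bar":
            return Bar()
        raise AttributeError(key)

a = A()
foo_value: Foo = a.foo  # resolved through the Literal["foo"] overload
bar_value: Bar = a.bar  # resolved through the Literal["bar"] overload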
Well, this is already supported, but PEP 544 makes it clear that foo: Foo in a Protocol does specify a mutable instance attribute:
To distinguish between protocol class variables and protocol instance variables, the special ClassVar annotation should be used as specified by PEP 526. By default, protocol variables as defined above are considered readable and writable. To define a read-only protocol variable, one can use an (abstract) property.
Consequently, type checkers will flag it if a subclass implements it otherwise. Here is a compatibility chart (mypy 1.7.1 / pyright 1.1.339); a minimal example of the kind of mismatch being flagged is sketched below the chart.
[Compatibility chart: rows are the subclass implementation (attribute, classvar, property, cached_property), columns are the parent declaration (attribute, classvar, property, cached_property); each cell gives the mypy and pyright verdicts.]
EDIT: mypy results table was accidentally transposed… (mypy-playground)
Edit: The table was updated and the text below reflects a previous version.
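To make the kind of mismatch the chart records concrete, here is a minimal sketch (mine, not taken from the chart data): a parent class declares a plain attribute and a subclass overrides it with a read-only property, which mypy, for instance, reports as overriding a writeable attribute with a read-only property.

class Parent:
    foo: int

class Child(Parent):
    @property
    def foo(self) -> int:   # flagged: a plain (writeable) attribute is overridden
        return 42           # by a read-only property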
Consider this example:
from typing import Protocol

class Bar(Protocol):
    foo: Foo

class AttrBar:
    def __init__(self, foo: Foo):
        self.foo = foo

class PropBar:
    def __init__(self, foo: Foo):
        self._foo = foo

    @property
    def foo(self) -> Foo:
        return self._foo
It makes sense to me that both AttrBar and PropBar duck-type as a Bar, so to me pyright's behaviour in the "attribute" column is weird. I guess you can say that because you can't set PropBar.foo it's not compatible, but that seems to be orthogonal in a sense? I'm not very good at type theory, so it would be nice if someone could explain in detail how attributes and properties differ for a type checker.
The problem is that "readonly" isn't exactly part of the type - the value isn't "readonly", it's the place the value is stored that has that property. So yes, it's somewhat orthogonal. But (as far as I know - I'm far from an expert) Python's type system can't express the idea that an attribute (or indeed any "location") can be readonly - leading to this sort of dilemma where you either have to be unnecessarily precise, or you can't express what you really mean (which is "I will never write to this and the value in it has type Foo").
Just to make this clear: this proposal is not about the ability to specify read-only variables; in fact, it is about the ability to write type hints that are agnostic about the writeability of a variable.
I think this is actually intended behavior (save for maybe cached_property). property in a Protocol sort of already means "I only care about being able to read this attribute"; that's why it's compatible with both a regular attribute and a ClassVar downstream, since either of those will still be readable. At least as long as you don't also define a setter in the Protocol. I'm not really sure how it behaves then.
What is a lot more frustrating to me personally is how this interacts with custom descriptors: while this workaround for the lack of a ReadOnly in Protocol works for some cases, it usually does not work for custom descriptors, which is where I'd really like to be able to use it.
Take for example a SQLAlchemy model vs. a NamedTuple or a dataclass. There's no way to write a Protocol that will accept both a Mapped[T] and T when all you care about is getting a T when accessing that attribute on an instance.
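A simplified sketch of that situation, with a hand-rolled descriptor standing in for a Mapped[T]-style attribute (the real SQLAlchemy types are not used here): reading .id yields an int at runtime in both classes, yet a Protocol written for a plain attribute only accepts one of them, which is exactly the complaint above.

from typing import Generic, Protocol, TypeVar

T = TypeVar("T")

class Column(Generic[T]):  # stand-in for a Mapped[T]-like descriptor
    def __init__(self, value: T) -> None:
        self._value = value

    def __get__(self, obj: object, objtype: type | None = None) -> T:
        return self._value

class HasId(Protocol):
    id: int  # "all I care about is reading an int"

class PlainModel:
    def __init__(self) -> None:
        self.id = 1

class DescriptorModel:
    id = Column(2)  # declared via the descriptor, still read as an int

def read_id(obj: HasId) -> int:
    return obj.id

read_id(PlainModel())       # accepted
read_id(DescriptorModel())  # works at runtime, but type checkers reject it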
1. A protocol that says you're only allowed to read from the property, but classes are considered to support the protocol even if they declare a setter for the property, and
2. A protocol that classes only satisfy if they prohibit writing.
I was thinking of (1), which is (I think) what you mean by agnostic, rather than (2). But "readonly" may not be the best way of describing it, I agree. I hadn't appreciated that your question was specifically about how type checkers decide if a type satisfies the protocol.
Or am I still misunderstanding, and there's something apart from the question of whether a class satisfies a given protocol that matters here?
That's it. The only detail that still matters is the distinction between what happens on the type vs. what happens on the instance. pyright rejects overriding a property with an attribute because, when querying the type, they will return different things. For instance,
class A(Protocol):
    @property
    def foo(self) -> int:
        return 42
actually makes two promises: if isinstance(obj, A), then isinstance(obj.foo, int); and if issubclass(typ, A), then isinstance(typ.foo, property).
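A small runtime illustration of the second promise: swapping the property for a plain attribute changes what class-level access returns, even though instance-level access looks the same.

class WithProperty:
    @property
    def foo(self) -> int:
        return 42

class WithAttribute:
    foo: int = 42

print(type(WithProperty.foo))   # <class 'property'> (the descriptor itself)
print(type(WithAttribute.foo))  # <class 'int'>
print(WithProperty().foo)       # 42
print(WithAttribute().foo)      # 42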
What I want is the ability to write a Protocol class HasFoo that captures the structural type encoding the set of all runtime values which satisfy the condition:
isinstance(obj, HasFoo) if and only if hasattr(obj, "foo") and isinstance(obj.foo, Foo)
Without any additional assumptions about the writeability of foo or what happens when trying to access foo on a type instead of an instance.
Technically that hasattr check can't be evaluated statically (you could define a __getattr__ that returned a foo attribute only on a Tuesday…), but I think it should be possible to come up with a statically checkable condition that is close enough for all practical purposes.
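For comparison, the runtime condition itself is straightforward; a static HasFoo would have to approximate something like this check (Foo is a stand-in class):

class Foo: ...

def satisfies_has_foo(obj: object) -> bool:
    # The runtime meaning the proposed HasFoo protocol is meant to capture:
    # reading obj.foo succeeds and the result is a Foo instance.
    return hasattr(obj, "foo") and isinstance(obj.foo, Foo)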
For regular classes I would consider it correct [1]. Although Protocol is a bit different in my mind, since it only needs to be structurally compatible, and IIRC the mypy docs explicitly mention the use of property in a Protocol as a stand-in for a read-only marker, so I don't think the argument holds as long as you actually special-case property in Protocol. [2]
But there are other reasons why we need something like a Readable anyway, e.g. to accurately represent a whole bunch of types defined using the C-API, since there it's possible to have actual read-only attributes that aren't properties. So the current workaround of annotating those attributes as property isn't fully type safe.
In ABCs it could be useful as well, if you want to be looser about the contract for what subclasses have to implement to be considered compatible.
I think this seems right, except FWIW there are a couple of places where the type system does recognize something at least similar to this distinction. There's the Final qualifier, which isn't about the value of a variable but says that you can't store something in the same name later, and similarly for @final on methods/classes. There's also a still-under-discussion PEP to mark keys of a TypedDict as read-only, but TDs are a special case.
It's arguably a little bit weird to have these "type qualifiers", or whatever you want to call them, that aren't describing the type of a variable/value, including other qualifiers like ClassVar or @deprecated etc. So maybe we don't want to add tons of them for every possible situation, but it's been done and type checkers have implemented the logic to understand those things.
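For illustration, the Final qualifier mentioned above constrains the name rather than the value stored in it:

from typing import Final

MAX_RETRIES: Final[int] = 3  # the stored value is an ordinary int ...
MAX_RETRIES = 5              # ... but type checkers reject this rebinding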
I think it's easy to achieve what you want if you don't insist that foo must be an attribute. Just make it a zero-arg method, e.g.
class A(Protocol):
    def foo(self) -> int: ...
Methods are "immutable" by nature, so there are no assignability issues. As long as a class defines a 0-arg, int-returning foo() method, it will be considered an instance of A. The only thing that's lost is that you'll need to access the data via obj.foo() instead of obj.foo, which, IMHO, is a very minor syntactic annoyance.
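A quick usage sketch of that workaround (class names are mine): anything with a zero-arg, int-returning foo() method satisfies the protocol, regardless of how the value is produced.

from typing import Protocol

class A(Protocol):
    def foo(self) -> int: ...

class Stored:
    def __init__(self, value: int) -> None:
        self._value = value

    def foo(self) -> int:
        return self._value

class Computed:
    def foo(self) -> int:
        return 40 + 2

def read(obj: A) -> int:
    return obj.foo()

print(read(Stored(1)), read(Computed()))  # both are accepted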
That's fine as a workaround if you are designing a new API from scratch, but often you try to either be backwards compatible or be compatible with multiple dataclass-like objects that come from various other libraries, where you don't have control over how they look: they may use regular attributes, but they may also use some custom descriptor. You have no easy way to be compatible with all of them, even though it should be really easy if you only care about an attribute being readable and containing a certain type.
I am of the opinion that PEP 705's ReadOnly annotation should be extended to support this use case. It is not in the current PEP only in order to keep things in standalone pieces.
There's a subtle but significant difference between ReadOnly and Readable. ReadOnly to me says the implementation is not allowed to make this attribute writeable, while Readable leaves that option open. We also have ReadableBuffer vs ReadOnlyBuffer.
A rarer but maybe still significant use case, for a Protocol that should be allowed to be contravariant, would be a Writeable modifier.
What you are suggesting is the opposite of what's desired here. I want to be able to write Protocols that are flexible enough not to care about implementation details (like attribute vs. property). This is crucial in order to be able to type-hint generic functions that can interact with classes from different libraries without having to write tons of overloads. For example, I may have a protocol like
class SupportsShape(Protocol):
    shape: tuple[int, ...]  # note: usually not writeable
That can be used for numpy.ndarray / pandas.DataFrame / torch.Tensor, etc. If some library decides to implement shape as a property, this Protocol suddenly doesn't match anymore.
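A self-contained sketch of that failure mode (Array and Frame are made-up stand-ins, not the real libraries): both expose a readable shape of the right type, but only the plain-attribute implementation is accepted against the Protocol as written.

from typing import Protocol

class SupportsShape(Protocol):
    shape: tuple[int, ...]

class Array:  # plain attribute
    def __init__(self, shape: tuple[int, ...]) -> None:
        self.shape = shape

class Frame:  # read-only property
    def __init__(self, shape: tuple[int, ...]) -> None:
        self._shape = shape

    @property
    def shape(self) -> tuple[int, ...]:
        return self._shape

def ndim(obj: SupportsShape) -> int:
    return len(obj.shape)

ndim(Array((2, 3)))  # accepted
ndim(Frame((2, 3)))  # works at runtime, but rejected statically: the protocol
                     # attribute is treated as writable, Frame.shape is read-only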
One way to think about it is that these modifiers can be translated into knowledge about the class's __getattr__ and __setattr__ methods (*):
Note: I abbreviate Literal["foo"] as "foo", otherwise the table gets too wide.
Modifier           | self.__getattr__       | self.__setattr__
foo: Readable[T]   | (name: "foo") -> T     | (unspecified)
foo: ReadOnly[T]   | (name: "foo") -> T     | (name: "foo", val: Never) -> None
foo: Writeable[T]  | (unspecified)          | (name: "foo", val: T) -> None
foo: WriteOnly[T]  | (name: "foo") -> Never | (name: "foo", val: T) -> None
foo: Mutable[T]    | (name: "foo") -> T     | (name: "foo", val: T) -> None
(*) This holds if Never is interpreted as the true, uninhabitable bottom type (uninhabitable means that no instances can exist, i.e. calling __setattr__ with Literal["foo"] and T is equivalent to raising an exception). It has come to my attention that, unfortunately, Never is not considered uninhabitable by Python's type checkers, so possibly there needs to be another PEP to introduce a true bottom type that is uninhabitable.
Example of applying these principles:

class A:
    foo: Readable[int]
    bar: ReadOnly[bool]
    baz: Mutable[str]
From a type-theory POV, this should be translatable to overloads on __getattr__ and __setattr__.
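A sketch of one possible translation of this example, following the table above (Readable, ReadOnly and Mutable are the proposed, not-yet-existing modifiers; typing.Never requires Python 3.11+):

from typing import Any, Literal, Never, overload

class A:
    # Reads: foo, bar and baz are all readable with their declared types.
    @overload
    def __getattr__(self, name: Literal["foo"]) -> int: ...
    @overload
    def __getattr__(self, name: Literal["bar"]) -> bool: ...
    @overload
    def __getattr__(self, name: Literal["baz"]) -> str: ...
    def __getattr__(self, name: str) -> Any:
        raise AttributeError(name)

    # Writes: foo (Readable) is left unspecified, bar (ReadOnly) only accepts
    # Never and therefore can never actually be assigned, baz (Mutable) accepts str.
    @overload
    def __setattr__(self, name: Literal["bar"], value: Never) -> None: ...
    @overload
    def __setattr__(self, name: Literal["baz"], value: str) -> None: ...
    def __setattr__(self, name: str, value: Any) -> None:
        super().__setattr__(name, value)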
EDIT: For ReadOnly, the __setattr__ might actually be better represented by (name: "foo", val: Never) -> None than (name: "foo", val: T) -> Never. This still prevents calling obj.foo = ..., but at the same time allows contravariant overriding, so that a subclass could replace a ReadOnly variable with a Mutable variable.
@alicederyn I wonder if this can somehow be combined with PEP 705; the only essential difference is that you want to apply these constraints to __getitem__ and __setitem__ rather than __getattr__ and __setattr__. This could be special-cased for TypedDict: I guess the way it works is that metaclasses can decide what they want to do with these annotations, so for mapping-like containers they could translate them into constraints on __getitem__ and __setitem__ instead.
I wonder if there is a possibility for something similar to dataclass_transform that would allow this to be a general concept, so that type checkers do not need to special-case TypedDict as much.