Should structural subtyping consider `__getattr__`?

grievejia · March 18, 2025, 9:16pm

Consider the following code snippet:

from typing import Protocol

class P(Protocol):
    x: int

def f(p: P) -> None: ...

class C:
    def __getattr__(self, name: str) -> int: ...

f(C())  # Should this be an error or not?

Type checkers diverge here today: Mypy and Pyre think this is fine, while Pyright reports an error. Looking at the spec, I was not able to find any text that explicitly defines what should happen. One section briefly mentions how __getattr__ affects attribute access, but it doesn’t specify how that should interact with structural subtyping.
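
To make the distinction concrete, here is a minimal sketch that separates plain attribute access from protocol matching. The read_x helper and the len(name) body are made up for illustration; only P and C come from the snippet above.

```python
from typing import Protocol


class P(Protocol):
    x: int


class C:
    # Only called when normal attribute lookup fails.
    def __getattr__(self, name: str) -> int:
        return len(name)  # arbitrary body, chosen for illustration


def read_x(p: P) -> int:
    return p.x


c = C()

# Plain attribute access: C declares no "x", so this goes through __getattr__.
direct = c.x

# Protocol matching: this call is where the type checkers disagree.
# At runtime it succeeds, because p.x inside read_x also reaches __getattr__.
via_protocol = read_x(c)
```

Both accesses behave the same at runtime, so the open question is purely about what the static check should accept.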

Personally I don’t really have a strong opinion either way – I think one could come up with reasonable justification for both directions (i.e. including or excluding the effect of __getattr__ on protocol matching). But I wonder what other folks think about it and whether this kind of choice should be pinned by the spec.

erictraut · March 18, 2025, 9:58pm

I don’t have a strong opinion one way or the other. I could make arguments for both, although I lean slightly toward disallowing the use of __getattr__ to satisfy a structural subtype.

If we decide to standardize this, I think it should be consistent. It looks like mypy and pyre are inconsistent currently. They allow __getattr__ to be used for a structural subtype when it appears in a class or a metaclass, but they don’t allow it to be used in a module. Pyright consistently disallows it in all of these cases.

# my_module.py
def __getattr__(name: str) -> int:
    return 0

# test.py
import my_module
from typing import Protocol

class P(Protocol):
    x: int

class Impl:
    def __getattr__(self, name: str) -> int:
        return 0

class Meta(type):
    def __getattr__(cls, name: str) -> int:
        return 0

class MetaImpl(metaclass=Meta):
    pass

p1: P = MetaImpl # pyright: Error
p2: P = Impl() # pyright: Error
p3: P = my_module # mypy, pyre, and pyright: Error
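
For reference, a single-file sketch of the runtime side of the same three cases. The in-memory module built with types.ModuleType is my stand-in for my_module.py (an assumption made so the sketch needs no second file), and the results dict is only there for illustration.

```python
import types


# Stand-in for my_module.py, built in memory instead of imported from disk.
my_module = types.ModuleType("my_module")


def _module_getattr(name: str) -> int:
    return 0


# Module-level __getattr__ is looked up in the module's own namespace.
my_module.__getattr__ = _module_getattr


class Impl:
    def __getattr__(self, name: str) -> int:
        return 0


class Meta(type):
    def __getattr__(cls, name: str) -> int:
        return 0


class MetaImpl(metaclass=Meta):
    pass


# None of the three objects declares "x"; each access falls back to the
# relevant __getattr__ hook.
results = {
    "metaclass": MetaImpl.x,
    "instance": Impl().x,
    "module": my_module.x,
}
```

All three lookups resolve the same way at runtime, which is part of why treating them differently in a type checker looks inconsistent.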

Is your question based on a real-world use case, or is it purely theoretical? I’ve searched the mypy and pyright issue trackers, the typing GitHub forum, the historic typing-sig archive, and this forum, and I can’t find any cases where someone has brought up this issue. Based on this, it may be reasonable to conclude that no one is relying on this behavior (one way or the other), so maybe it’s fine for us to leave this unspecified for now.

If you have a solid use case in mind and would like to see this standardized, then I think it’d be fine to propose wording for the typing spec.

grievejia · March 18, 2025, 10:13pm

Yeah I agree the behavior should be consistent across classes, metaclasses and modules.

The question was not based on real-world usage. It came up when I tried to add support for __getattr__ to Pyrefly. Happy to leave it open for now, until a future issue forces the decision!

randolf-scholz · March 19, 2025, 1:46pm

It would be useful if this were part of a more general principle that applies across the board and not just specifically to __getattr__. The following two basic principles come to mind:

1. x: Proto = Impl() is allowed if the type checker can prove that Impl satisfies Proto.
2. x: Proto = Impl() is allowed if the type checker cannot disprove that Impl satisfies Proto.

A concrete example where this may be useful is the pandas library: for a pd.DataFrame, the columns can be accessed as attributes if their names are valid Python identifiers. One may want to use a Protocol to describe DataFrames with a certain schema.
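
A rough sketch of that idea, without depending on pandas itself. Frame is a hypothetical stand-in that exposes columns as attributes through __getattr__, and HasPrices, total, and the column names are likewise made up for illustration.

```python
from typing import Protocol


class HasPrices(Protocol):
    """Schema: a frame with 'price' and 'quantity' columns."""
    price: list[float]
    quantity: list[int]


class Frame:
    """Hypothetical stand-in for a DataFrame (not the pandas API)."""

    def __init__(self, columns: dict[str, list]) -> None:
        self._columns = columns

    def __getattr__(self, name: str) -> list:
        # Only reached when normal lookup fails; expose columns as attributes.
        columns = self.__dict__.get("_columns", {})
        try:
            return columns[name]
        except KeyError:
            raise AttributeError(name) from None


def total(frame: HasPrices) -> float:
    return sum(p * q for p, q in zip(frame.price, frame.quantity))


frame = Frame({"price": [2.0, 3.5], "quantity": [3, 2]})

# Whether a type checker should accept this call is the question in this thread.
order_total = total(frame)
```

At runtime the call works as long as the columns exist. The static question is whether a type checker should trust __getattr__ to provide them.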
