Full structural subtyping at runtime

maggyero · March 11, 2022, 11:42pm

S is a structural subtype of T if S includes the method names and types of T, or structural subtypes thereof.

But I noticed that the structural subtype checks at runtime implemented in the __subclasshook__ method by the abstract base classes of the collections.abc module and the typing.Protocol abstract base class only check method names and not method types (i.e. method signatures), which makes the checks incomplete. I would expect either full structural subtype checks at runtime, or no checks at all. So this is my question: why are we ignoring method signatures in structural subtype checks at runtime?

Here is the relevant __subclasshook__ method implementation in the collection.abc module (typing.Protocol has a similar implementation, albeit more complex as it automatically generates __subclasshook__ for its subclasses):

def _check_methods(C, *methods):
    mro = C.__mro__
    for method in methods:
        for B in mro:
            if method in B.__dict__:
                if B.__dict__[method] is None:
                    return NotImplemented
                break
        else:
            return NotImplemented
    return True

class Hashable(metaclass=ABCMeta):
    __slots__ = ()
    @abstractmethod
    def __hash__(self):
        return 0

    @classmethod
    def __subclasshook__(cls, C):
        if cls is Hashable:
            return _check_methods(C, "__hash__")
        return NotImplemented

Jelle · March 11, 2022, 11:49pm

I wasn’t around when this decision was made, but I think it was a good call. Full structural compatibility checks are difficult to get right and better left to external tools that can evolve faster than CPython.

guido · March 11, 2022, 11:57pm

Because the method signatures are not available at runtime. (Or if they are, through some function in the inspect module, accessing them is prohibitively expensive and nobody has tried to see if this might even have the right semantics.)

maggyero · March 12, 2022, 12:55am

Extracting method signatures with the inspect module would indeed be expensive. About the semantics, doesn’t mypy already implement it for static structural type checking (i.e. protocols)?

A simple example that checks method names and signatures but does not support variance of method parameter types and return types (i.e. it does not support depth subtyping but only width subtyping):

def interface(cls):
    interface = {}
    for cls in reversed(cls.__mro__[:-1]):
        for name, value in vars(cls).items():
            if name in ['__init__', '__new__']: continue
            if callable(value): interface[name] = inspect.signature(value)
    return interface

class Iterable(abc.ABC):
    @abc.abstractmethod
    def __iter__(self): pass
    @classmethod
    def __subclasshook__(cls, subclass):
        return interface(cls).items() <= interface(subclass).items()

class A:
    def __iter__(self): pass

class B:
    def __iter__(self, foo, bar): pass

assert isinstance(A(), Iterable)
assert not isinstance(B(), Iterable)

Jelle · March 12, 2022, 1:01am

Mypy does it, but mypy isn’t part of the stdlib. Your sample is incorrect because implementing classes can have additional parameters or more permissive types. Getting that right is complicated, and the stdlib isn’t the right place to develop a type checking algorithm.

EpicWink · March 12, 2022, 1:35am

On a note philosophical now, I get the impression that Python specifically doesn’t care about any typing (by default) and parameters (outside of actually calling) at runtime, and so checking types/parameters of concrete subclasses’ methods would go against that idiom.

We already have external tools like MyPy and PyCharm which do a fantastic job of type-checking for users which want it, but still allow the flexibility for other users with text editors to not have to care

maggyero · March 12, 2022, 2:10pm

That is why I am surprised that we are trying to perform runtime checks of method names with __subclasshook__ since it is a form of structural type checking (though incomplete). It makes the dynamic view inconsistent with the static view of structural subtyping (which is fully checked in static type checkers like mypy). Shouldn’t we instead drop any structural type checking at runtime for now?

steve.dower · March 14, 2022, 5:30pm

The runtime type checking isn’t intended for verification (“has the caller passed the right thing”) so much as identification (“what has the caller passed me”). It doesn’t need to be anywhere near as thorough for that.

Structural pattern matching is the logical extension of __subclasshook__, so once you understand how the former is intended to be applied, you can work backwards to see how the latter was designed to achieve the same thing.

Static type checking (such as mypy) doesn’t have equivalent pre-existing features in the Python language. None of the language/library has been built around “did the caller pass the right thing” ^[1], and if you try and view existing language features through this lens, you’ll just make the wrong inferences.

For example, making an API that can handle either a dict-like object or an instance of a particular class could use runtime type identification to choose a slightly different behaviour (such as converting the dict into a “real” object before proceeding). This is different from using it to raise an error because the caller didn’t name everything right, or they didn’t use exactly your type instance but made their own that looks the same.

Some development has been built around the similar concept of “I don’t know what to do with this thing that you passed me”, but this is subtly different from type checking. ↩︎

maggyero · March 20, 2022, 10:20pm

It does indeed seem that Python is built around partial structural type checking at runtime.

For instance, as @mjpieters rightly pointed out in a Stack Overflow comment, the abc.ABCMeta metaclass only enforces method names. When the abc.ABCMeta metaclass is instantiated to a class, the abc.ABCMeta.__new__ method sets an __abstractmethods__ attribute on the class to a frozenset of the method names (not the methods signatures) of the class that have an __isabstractmethod__ attribute set to True . Then, when the class is instantiated, the object.__new__ method checks that the class has no __abstractmethods__ attribute set to a non-empty iterable of str since only non-abstract classes can be instantiated:

>>> import abc
>>> class A(metaclass=abc.ABCMeta):
...     @abc.abstractmethod
...     def f(self): pass
... 
>>> A()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class A with abstract method f
>>> class B(A):
...     def f(self, x, y, z): pass  # non-matching method signature
... 
>>> B()  # yet the class can be instantiated despite the fake override
<__main__.B object at 0x1082d0610>