How should type checkers handle a program like this?
from typing import reveal_type

class X(int, str):
    pass

def f(x: int):
    if isinstance(x, str):
        reveal_type(x)
Existing behavior
At runtime, the class definition fails, and no object can be an instance of both int and str.
However, pyright currently allows the class definition and reveals Type of "x" is "<subclass of int and str>".
Mypy raises some errors about incompatible methods for the class definition and (with --warn-unreachable) claims that Subclass of "int" and "str" cannot exist: would have incompatible method signatures. While this subclass indeed can't exist, mypy's algorithm here is wrong; it will claim that many classes can't exist when in fact they can.
Ty, Astral's in-development type checker, points out that Class will raise TypeError at runtime due to incompatible bases: Bases int and str cannot be combined in multiple inheritance, and reveals Never after the isinstance() call. This is correct and consistent with the runtime, but ty's implementation relies on hardcoded knowledge of a number of standard library classes.
At runtime, every class in CPython must be backed by a C struct. For classes implemented in Python, this struct generally has the same size (unless they define __slots__), but classes defined in C usually each define their own struct. To construct a subclass, CPython must create a layout that is compatible with all the bases, which is generally only possible if at most one of the bases is a class with a custom layout (known as a "solid base"); I'll present a more precise definition below. Ty is attempting to mirror this mechanism in its implementation.
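The layout conflict can be observed directly on CPython. The sketch below uses a hypothetical helper, try_define (not part of any library), to attempt a class definition and report the resulting TypeError message, if any. It assumes CPython; other implementations may word the error differently or apply different restrictions.

```python
def try_define(name, bases, namespace=None):
    """Hypothetical helper: try to create a class from `bases`.

    Returns the TypeError message if creation fails, or None on success.
    """
    try:
        type(name, bases, namespace or {})
    except TypeError as exc:
        return str(exc)
    return None


class A:
    __slots__ = ("a",)  # non-empty __slots__ gives A its own instance layout


class B:
    __slots__ = ("b",)  # same for B


class Plain:
    pass  # ordinary Python class with no custom layout


# Two C-implemented classes with different structs cannot be combined.
print(try_define("X", (int, str)))

# Two classes that each define non-empty __slots__ conflict in the same way.
print(try_define("AB", (A, B)))

# Only one base has a custom layout here, so the class can be created.
print(try_define("AP", (A, Plain)))
```

The first two calls report a layout conflict, while the last succeeds because only one base has a custom layout.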
Why does this matter?
Knowing that a subclass of two classes can't exist is primarily important for detecting unreachable code. If a type checker can detect unreachable code in a program, that often means the programmer made some incorrect assumption, so it's good to provide more tooling to help type checkers find unreachable code.
I am also thinking about adding support for intersection types. With intersection types, it's important to be able to reduce uninhabited intersections (such as int & str) to Never. Therefore, a better way to detect incompatible bases would be a good complement to intersection support.
Proposal
We add a new decorator @typing.solid_base, which can be applied to class definitions. Semantics are as follows.
Every class has a single solid base. It is determined as follows:
- A class is its own solid base if it has the @solid_base decorator, or if it has a non-empty __slots__ definition.
- Otherwise, if there is a single base class, the solid base is the base's solid base.
- Otherwise, determine the solid bases of all base classes. If there is only one, that is the solid base. If there are multiple, but one is a subclass of all others, the solid base is the subclass. Otherwise, the class cannot exist.
Type checkers should raise an error for class definitions that do not have a valid solid base, and simplify intersections of nominal classes without a valid solid base to Never. If they warn about unreachable code, they should use this mechanism to detect unreachable branches.
Example:
from typing import solid_base

@solid_base
class Solid1:
    pass

@solid_base
class Solid2:
    pass

@solid_base
class SolidChild(Solid1):
    pass

class C1:  # solid base is `object`
    pass

# OK: solid bases are `Solid1` and `object`, and `Solid1` is a subclass of `object`.
class C2(Solid1, C1):  # solid base is `Solid1`
    pass

# OK: solid bases are `SolidChild` and `Solid1`, and `SolidChild` is a subclass of `Solid1`.
class C3(SolidChild, Solid1):  # solid base is `SolidChild`
    pass

class C4(Solid1, Solid2):  # error: no single solid base
    pass
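The determination rules in the proposal can be sketched as a small runtime function. This is only an illustration of the rules as stated, not how any type checker implements them: the solid_base decorator below is a stand-in that records classes in a hypothetical registry (typing.solid_base does not exist today), and solid_base_of and has_nonempty_slots are names made up for this sketch.

```python
_SOLID = set()  # hypothetical registry standing in for the proposed decorator


def solid_base(cls):
    """Stand-in for the proposed @typing.solid_base; only records the class."""
    _SOLID.add(cls)
    return cls


def has_nonempty_slots(cls):
    """True if the class itself (not a parent) defines a non-empty __slots__."""
    slots = cls.__dict__.get("__slots__", ())
    if isinstance(slots, str):
        slots = (slots,)  # a bare string names a single slot
    return len(tuple(slots)) > 0


def solid_base_of(cls):
    """Return the solid base of `cls`, or raise TypeError if there is none."""
    # Rule 1: decorated classes and classes with non-empty __slots__.
    if cls in _SOLID or has_nonempty_slots(cls):
        return cls
    if not cls.__bases__:
        return cls  # `object` is its own solid base
    # Rules 2 and 3: collect the distinct solid bases of all base classes.
    candidates = {solid_base_of(base) for base in cls.__bases__}
    # A valid result must be a subclass of every other candidate.
    for cand in candidates:
        if all(issubclass(cand, other) for other in candidates):
            return cand
    raise TypeError(f"{cls.__name__} has no single solid base")


@solid_base
class Solid1: pass

@solid_base
class Solid2: pass

@solid_base
class SolidChild(Solid1): pass

class C1: pass
class C2(Solid1, C1): pass
class C3(SolidChild, Solid1): pass
class C4(Solid1, Solid2): pass  # definable at runtime: the stand-in enforces nothing

for c in (C1, C2, C3):
    print(c.__name__, "->", solid_base_of(c).__name__)
```

Calling solid_base_of(C4) raises TypeError under this sketch, matching the error a type checker would report. Builtins such as int are not registered here, so the sketch would treat them as having object as their solid base; under the proposal they would carry the decorator in stubs.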
Discussion
- The exact rule is a bit more complicated (see above), but a good practical rule of thumb is: if a class has @solid_base, then any child classes of that class can't inherit from any other class with @solid_base.
- I used the name "solid base" because that's what CPython calls it internally (code link). The term doesn't currently appear in CPython output anywhere, so we could choose a different name for the typing decorator if we wanted, but "solid base" seems like a pretty good name to me.
- There are some other reasons why a class would not be able to exist, such as incompatible metaclasses. Type checkers should use those too, but I'm focusing more narrowly on incompatible instance layouts here.
- CPython doesn't directly expose whether or not a class is a solid base, but it's mostly possible to reconstruct it by looking at some attributes of the type object. I implemented this in pycroscope; if the solid_base decorator is added to the type system, we could add the same logic to tools like stubtest to validate the presence of the decorator in stubs.
- @solid_base should usually be applied to classes implemented in C, and therefore it should be used in stubs. However, I think we should also allow it in implementation files as a way to allow users to restrict double inheritance in their class hierarchies. It's not a common ask but I think it can be useful; for example, this might help the first post in this issue.
- One concern might be that we're encoding CPython-specific implementation details into the type system. However, this specific implementation detail is quite stable in CPython, and PyPy has a similar (but apparently stricter) restriction (example issue with some discussion).
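For contrast with layout conflicts, the metaclass case mentioned in the discussion points can be shown with a short example. The class names here are made up for illustration; the failure is a separate check from the instance-layout rule and would not be covered by the proposed decorator.

```python
class MetaA(type):
    pass


class MetaB(type):
    pass


class A(metaclass=MetaA):
    pass


class B(metaclass=MetaB):
    pass


# Neither metaclass is a subclass of the other, so no metaclass can be
# chosen for C and the class statement fails before any layout check matters.
try:
    class C(A, B):
        pass
except TypeError as exc:
    error = str(exc)
else:
    error = None

print(error)
```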
Is this a useful thing to add to the type system?