Type hints for bool vs int overload

It only violates the LSP if you consider Just (a Protocol) to be a subtype of object. Since it’s a Protocol, I don’t see how that can be the case: Protocols can only inherit from other Protocols, which excludes object. That’s why I believe the @override should be removed, and that the reported error is a false positive.

Most of the functions in numpy have a dtype parameter. If you pass dtype=bool, the result has scalar type np.bool (assuming import numpy as np); if you pass dtype=int, the scalar type of the return type is np.int_. But np.bool and np.int_ don’t have the same subtyping relation as builtins.bool and builtins.int. Without Just[int], I’d be forced to annotate the scalar type resulting from dtype=int as np.bool | np.int_. This is problematic for functions that accept np.int_ but reject np.bool as input, such as np.subtract.
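For contrast, here is the builtins relationship that numpy’s scalar types do not mirror, sketched with stdlib-only checks:

```python
# In the builtins, bool is a genuine subtype of int at runtime,
# so every bool value is also an int value.
assert issubclass(bool, int)
assert isinstance(True, int)
assert True + 1 == 2  # bools participate freely in int arithmetic

# numpy's scalar types don't mirror this: np.bool is not a subclass
# of np.int_ (not asserted here to keep the example stdlib-only).
```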

I agree.

I also don’t consider this a type-checker bug. I only brought it up to illustrate that there are scenarios where types don’t behave like “sets of possible runtime objects a type represents” (Type hints for bool vs int overload - #47 by carljm).

As I’ve said in the beginning of this post, I don’t think that Just violates the LSP, and therefore think that Just is sound.

Don’t get me wrong; I truly want this to be the case.

Another example that comes to mind is typing.TypeGuard. And I’m guessing that there’s also some way that typing.Any can be (mis)used that shows this.

My point is that the third overload isn’t unreachable, which shows that type-checkers don’t follow your (imo very reasonable) description of what a type is.

No, I don’t think it shows that. I think it shows rather that you have a misunderstanding of the semantics of @overload in the Python type system. An overload is not a runtime match/case. If a call fails to match the overload requiring arg: A and also fails to match the overload requiring arg: B, that does not imply that the argument cannot be an object that inhabits A or B. It only shows that the static type we have for the argument is not sufficiently precise for us to know that it must inhabit A, or that it must inhabit B. That’s very different, and it means that a subsequent overload for arg: A | B is not unreachable at all, and should not be unreachable.

This is the same reason why there must be rules about overlapping overloads.


All values in the Python type system (including any whose types match a protocol definition) must subclass from object. That means all protocol instances are object instances. As a consequence, any protocol is assignable to object, and it must conform to the interface of object. If any class (including a protocol) overrides some aspect of the object contract in an incompatible way, it’s an LSP violation.
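A quick runtime illustration of that point (the class names here are invented): anything that matches a protocol is still an ordinary object instance, so it carries object’s interface along with it.

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class Greeter(Protocol):
    def greet(self) -> str: ...

class Hello:
    def greet(self) -> str:
        return "hi"

h = Hello()
assert isinstance(h, Greeter)         # matches the protocol structurally
assert isinstance(h, object)          # ...and is still an object instance
assert isinstance(h.__hash__(), int)  # inherits object's contract, e.g. __hash__
```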

The consequences of this LSP violation can be seen in several of the examples provided above.


I just looked up object in pyright’s typeshed and it gives me this:

```python
# excerpt from builtins.pyi in typeshed; imports added here for context
import sys
from collections.abc import Iterable
from typing import Any, Self, SupportsIndex

class object:
    __doc__: str | None
    __dict__: dict[str, Any]
    __module__: str
    __annotations__: dict[str, Any]
    @property
    def __class__(self) -> type[Self]: ...
    @__class__.setter
    def __class__(self, type: type[object], /) -> None: ...
    def __init__(self) -> None: ...
    def __new__(cls) -> Self: ...
    # N.B. `object.__setattr__` and `object.__delattr__` are heavily special-cased by type checkers.
    # Overriding them in subclasses has different semantics, even if the override has an identical signature.
    def __setattr__(self, name: str, value: Any, /) -> None: ...
    def __delattr__(self, name: str, /) -> None: ...
    def __eq__(self, value: object, /) -> bool: ...
    def __ne__(self, value: object, /) -> bool: ...
    def __str__(self) -> str: ...  # noqa: Y029
    def __repr__(self) -> str: ...  # noqa: Y029
    def __hash__(self) -> int: ...
    def __format__(self, format_spec: str, /) -> str: ...
    def __getattribute__(self, name: str, /) -> Any: ...
    def __sizeof__(self) -> int: ...
    # return type of pickle methods is rather hard to express in the current type system
    # see #6661 and https://docs.python.org/3/library/pickle.html#object.__reduce__
    def __reduce__(self) -> str | tuple[Any, ...]: ...
    def __reduce_ex__(self, protocol: SupportsIndex, /) -> str | tuple[Any, ...]: ...
    if sys.version_info >= (3, 11):
        def __getstate__(self) -> object: ...

    def __dir__(self) -> Iterable[str]: ...
    def __init_subclass__(cls) -> None: ...
    @classmethod
    def __subclasshook__(cls, subclass: type, /) -> bool: ...
```
This is incompatible with many types: e.g. object is hashable, but many types are not. I could list many more examples, but in general the type annotations here do not match many widely used types in Python. Maybe your answer is that those are all LSP violations, but if so, then to me that says we somehow need a type system that can handle LSP violations.
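For instance (using a made-up class): defining __eq__ without __hash__ implicitly sets __hash__ to None, so instances are unhashable at runtime even though object’s stub promises `__hash__(self) -> int`.

```python
class Point:
    def __init__(self, x: int, y: int) -> None:
        self.x, self.y = x, y

    # Defining __eq__ without __hash__ sets __hash__ = None on the class,
    # making instances unhashable, contrary to object's stub.
    def __eq__(self, other: object) -> bool:
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

assert Point.__hash__ is None
try:
    hash(Point(1, 2))
    raised = False
except TypeError:
    raised = True
assert raised  # hash() fails despite object.__hash__'s declared int return
```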

Issues with Hashable are already being discussed here: `__hash__`, `__eq__`, and LSP, and there’s a viable resolution that doesn’t require throwing away the LSP.

The same goes for pretty much any other example you can pick there.

Imprecision elsewhere in object’s type doesn’t mean we should throw away the concept of the LSP; it just means there are other places where we know there’s unsoundness, because the type system can’t express everything in accurate detail.

It does mean that we shouldn’t advise people to build with a reliance on the parts that are currently unsound.