How should C-extensions deal with `__eq__`?

Tensor libraries like numpy or torch override the == operator to act element-wise on the array, in accordance with PEP 207. However, there is no right-hand-side version __req__ of the __eq__ dunder method (per the data model, __eq__ is its own reflection), so it currently seems impossible to annotate the Tensor classes in stubs in a way that accurately reflects their runtime behavior.

Example:

from typing import assert_type

import torch

x = torch.randn(3)
assert_type(x == 0, torch.Tensor)  # ✅
assert_type(0 == x, torch.Tensor)  # ❌ got bool, not Tensor
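
For anyone who wants to see the runtime side without installing torch, here is a minimal pure-Python sketch. `Vec` is a hypothetical stand-in I made up for a tensor type, not anything from numpy or torch; it only illustrates why `0 == x` ends up calling the right operand's `__eq__` at runtime.

```python
class Vec:
    """Hypothetical stand-in for a tensor type with element-wise ==."""

    def __init__(self, *items):
        self.items = list(items)

    def __eq__(self, other):
        # Element-wise comparison against a scalar; anything else is declined.
        if isinstance(other, (int, float)):
            return [item == other for item in self.items]
        return NotImplemented


x = Vec(0, 1, 0)

left = x == 0   # calls Vec.__eq__(x, 0) directly
right = 0 == x  # int.__eq__(0, x) declines, so Python retries with Vec.__eq__(x, 0)

# int itself does not know how to compare with Vec:
int_answer = (0).__eq__(x)
```

Both `left` and `right` are element-wise lists here, while `int_answer` is `NotImplemented`, which is the part a signature-only view of `int.__eq__` cannot express.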

How to resolve this inconsistency in the type system?

  • leave it as is, and recommend that users run a linter that suggests replacing Number == Tensor with Tensor == Number.
  • allow annotating __req__ and __rne__, but only inside stubs.
  • something else?

Moreover, how should a type-checker know when to use the __eq__ of the left operand, since generally __eq__ is annotated to allow comparisons with arbitrary objects?
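
To make the question concrete, this is roughly the order in which the interpreter tries things for `left == right`, as I understand the data model. It is a simplified sketch, and `emulate_eq` and `Loud` are names I made up for illustration; the real logic lives in the interpreter and is not exposed as a function.

```python
def emulate_eq(left, right):
    """Approximate the order in which `left == right` tries __eq__ methods."""
    tried_right = False

    # If the right operand's type is a proper subclass of the left operand's
    # type, the right operand is asked first.
    if type(right) is not type(left) and issubclass(type(right), type(left)):
        tried_right = True
        result = type(right).__eq__(right, left)
        if result is not NotImplemented:
            return result

    # Normal case: ask the left operand.
    result = type(left).__eq__(left, right)
    if result is not NotImplemented:
        return result

    # The left operand declined: ask the right operand (__eq__ is its own reflection).
    if not tried_right:
        result = type(right).__eq__(right, left)
        if result is not NotImplemented:
            return result

    # Everyone declined: == falls back to an identity comparison.
    return left is right


class Loud:
    """Hypothetical class whose __eq__ returns a non-bool value."""

    def __eq__(self, other):
        return "Loud.__eq__"


int_vs_float = emulate_eq(1, 1.0)    # int declines, float answers
int_vs_str = emulate_eq(0, "a")      # both decline, identity fallback
int_vs_loud = emulate_eq(0, Loud())  # int declines, Loud answers
```

The point is that whether the left operand's `__eq__` "wins" depends on it returning something other than `NotImplemented`, and that is not visible in a signature that accepts `object`.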


My first reaction is that this is a type checker bug: the type checker infers a different type for the expression 0 == x than its actual runtime type.

Unfortunately, it’s a difficult bug for type checkers to resolve, since the actual semantics of __eq__ are complex and can’t be fully encoded in the method signature using the current type system. So a fully correct solution likely requires changes to the type system. There have been a few other recent threads ("Encoding mypy's "overlapping types" in the type system" and "`__hash__`, `__eq__`, and LSP") dealing with similar issues. I don’t have a solution, but I think someone needs to look at all the related issues together and try to come up with a solid one.

I don’t think adding a typing-only __req__ method is a good solution; that would create a second complicated way of representing __eq__ semantics that’s different from the runtime semantics. It’s better for the type system to accurately reflect how things work at runtime. Also, the mention of C extensions in the header here seems like a red herring; the same thing can come up with objects implemented in Python.


To add to this, the actual __eq__ behavior raises further issues that neither of those threads covers. A subclass on the right-hand side can steal priority from the LHS in all of the rich comparison methods, not just __eq__. In some cases the type checker also doesn’t know an operand’s exact type, only some compatible type, which may have extended or changed the behavior of __eq__ relative to the lower bound.
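
A small sketch of the subclass-priority point, using two made-up classes (`Base` and `Derived` are hypothetical, and the string return values are only there to show which method ran):

```python
class Base:
    def __eq__(self, other):
        return "Base.__eq__"

    def __lt__(self, other):
        return "Base.__lt__"


class Derived(Base):
    def __eq__(self, other):
        return "Derived.__eq__"

    def __gt__(self, other):
        # __gt__ is the reflection of __lt__.
        return "Derived.__gt__"


# The right operand is an instance of a proper subclass of the left operand's
# type, so its methods are tried first even though Base defines both operators.
eq_result = Base() == Derived()
lt_result = Base() < Derived()

# With the operands swapped, the left operand is asked first as usual.
swapped_eq_result = Derived() == Base()
```

So a checker that only looks at `Base.__eq__` or `Base.__lt__` for the left operand would pick the wrong method in the first two cases.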

To me, part of the problem is that it’s somewhat unclear how some special methods should be annotated. Equality should never fail, and object.__eq__ requires via LSP that the parameter be object. But it’d be more useful to annotate the types of objects that an equality comparison can actually succeed with. For the implementation of the method, you’d want the parameter to be object, so that type checkers warn you if you haven’t added checks to ensure that mismatched types simply compare unequal.

Potentially you could use an overload, first with the valid types returning Literal[True], second with object returning Literal[False] or NotImplemented. But that’d be incredibly annoying boilerplate, even if type checkers supported it.
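
For illustration, the overload idea might look something like the sketch below. `Meters` is a hypothetical class, and I've used `bool` rather than `Literal[True]` for the first overload, on the assumption that two instances of the right type can still compare unequal. This runs fine, but I'm not claiming any current type checker accepts or makes use of these overloads.

```python
from typing import Literal, overload


class Meters:
    """Hypothetical value type, used only to show the overload pattern."""

    def __init__(self, value: float) -> None:
        self.value = value

    # Overload 1: the types for which the comparison is meaningful.
    @overload
    def __eq__(self, other: "Meters") -> bool: ...

    # Overload 2: everything else is simply unequal.
    @overload
    def __eq__(self, other: object) -> Literal[False]: ...

    # Implementation: takes object, so the isinstance check is required.
    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Meters):
            return NotImplemented
        return self.value == other.value


same = Meters(1.0) == Meters(1.0)
different = Meters(1.0) == Meters(2.0)
mismatched = Meters(1.0) == "1.0"  # both sides decline, identity fallback
```

Even in this tiny case that is three signatures for one method, which is the boilerplate problem mentioned above.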