Probably subclass-builtin (FURB189) | Ruff
It is in preview, so you’d have to run with --preview for now. ruff check --select=FURB189 --preview file.py
Ruff for VS Code is still on v0.7.1, so this rule isn’t yet available for me, but this issue isn’t exclusive to builtins:
class FooBase:
def __init__(self, bar=0):
self.bar = bar
def __ne__(self, value):
return not isinstance(value, FooBase) or self.bar != value.bar
class Foo(FooBase):
def __eq__(self, value):
return self is value
def __hash__(self):
return id(self)
>>> f1, f2 = Foo(), Foo()
>>> f1 == f2
False
>>> f1 != f2
False
Note that a user writing a subclass can’t easily determine which of the 2 methods is implemented. Let alone be aware of this problem.
Having that not isinstance(value, FooBase) condition in __ne__ probably just means that the class doesn’t support subclassing if this was in real code. If you mark FooBase as @final, then mypy and pyright will catch it.
Interesting. I think it’s a carefully crafted run-time bug in the program’s logic, not detectable by static typing analysis or enforcing linting rules. Is that your point?
If you’re going to add customised weird implementations of methods like __eq__, or any other hacks, extensive tests also need to be added to guarantee all behaviours are as intended. Don’t rely on Ruff et al. to check for every possible bug under the sun.
Or is there some compiler, e.g. Rust or other tool that would catch this before run-time, in any language?
Not per se, I believe you can just check that if you overwrite __eq__(), __ne__() should also be overwritten if defined in a super class.
Very true It’s just that this could have slept under my radar if I hadn’t used != in my tests. Although Ruff should probably warn me if __hash__() is overwritten, but not __eq__().
Have I not understood the need to override __ne__? If that method is deleted, the equality operations work more normally:
class FooBase:
def __init__(self, bar=0):
self.bar = bar
class Foo(FooBase):
def __eq__(self, value):
return self is value
def __hash__(self):
return id(self)
f1, f2 = Foo(), Foo()
assert not f1 == f2
assert f1 != f2
Ah. So, I’m not saying duplicate keys are a good idea (far from it! ) , but was the original issue in the str sub class because str does something custom in str.__ne__ ?
This allows k1 == k2 and k1 != k2 (hopefully that was the intent!):
class DuplicateKey(str):
"""A key that can appear multiple times in a dictionary.
>>> k1, k2 = DuplicateKey('key'), DuplicateKey('key')
>>> {k1: 'value 1', k2: 'value 2'}
{'key': 'value 1', 'key': 'value 2'}
"""
_count = 0
_counted = False
def __eq__(self, value):
return self is value
def __hash__(self):
if not self._counted:
self._counted = True
self._count = DuplicateKey._count
DuplicateKey._count += 1
return hash((self._count, str(self)))
__ne__ = object.__ne__
k1, k2 = DuplicateKey(), DuplicateKey()
k1, k2 = DuplicateKey('key'), DuplicateKey('key')
assert not k1 == k2
assert k1 != k2
print({k1: 'value 1', k2: 'value 2'}) # Expected: {'key': 'value 1', 'key': 'value 2'}
The __ne__ method follows automatically from __eq__ only if __ne__ isn’t already defined in a superclass. So, if you’re inheriting from a builtin, it’s best to override both.
This is true only if the value is the object itself.
There is no bug in your subclass. The only issue is with the statement: ‘A key that can appear multiple times in a dictionary.’ These instances may have the same string values, but they are two separate instances. I am not aware of any case where two different instances share the same memory, except for primitive data types.
Yeah, I feel this is the right answer. What OP is considering a bug is that __ne__ is not the opposite of __eq__, but the language allows defining both, so this can happen if you are not careful in how you define both (or maybe by design in some libraries).
I should have used an abstract example instead of a hack to have duplicate keys (the discussion wasn’t about that):
class FooBase:
def __ne__(self, other):
return not isinstance(other, FooBase)
class Foo(FooBase):
def __init__(self, value):
self.value = value
def __eq__(self, other):
return isinstance(other, Foo) and self.value == other.value
But no your solution isn’t ideal for me. By subclassing string I don’t need special handling in the encoder of my json library and it works with a DuplicateKey type from another library without needing to register it somehow.
There’s no syntactic restriction that prevents JSON objects from having duplicate keys.
But programming languages usually don’t support this, to preserve the exact structure of the file additional keys are converted to DuplicateKeys, which are a duplicate of the regular key:
So what you’re saying is, you want to be able to generate malformed JSON without having to specially code it. That’s a pretty weird thing to do, so I don’t really see why it’s a problem to have to explicitly write a method to do it (rather than just messing with equality and subclassing).