We’re having problems implementing __hash__ for a dataclass that has subclasses. Investigation led to the following lines in dataclasses.py:
class_hash = cls.__dict__.get('__hash__', MISSING)
has_explicit_hash = not (class_hash is MISSING or
                         (class_hash is None and '__eq__' in cls.__dict__))
It seems dataclasses decides whether a class has an explicit hash by examining only the immediate class, and not its MRO. This is at the very least inconsistent, as in other places dataclasses.py does examine the MRO to reach similar decisions (e.g., slots).
Would this be considered a bug? Would a PR that changes it be considered?
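For readers who want to see what that check does in isolation, here is a minimal sketch that copies the two quoted lines into a standalone function. The MISSING sentinel below is a stand-in, and Base and Child are hypothetical plain classes, not anything from dataclasses.py.

```python
MISSING = object()  # stand-in for the sentinel used in dataclasses.py


def has_explicit_hash(cls) -> bool:
    # Same test as the quoted lines: only cls.__dict__ is consulted,
    # so anything inherited through the MRO is invisible here.
    class_hash = cls.__dict__.get('__hash__', MISSING)
    return not (class_hash is MISSING or
                (class_hash is None and '__eq__' in cls.__dict__))


class Base:
    def __hash__(self) -> int:
        return 0


class Child(Base):
    pass


print(has_explicit_hash(Base))          # True
print(has_explicit_hash(Child))         # False
print(Child.__hash__ is Base.__hash__)  # True: inherited, but not "explicit"
```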
No, this is expected behavior: a subclass can change whether or not it’s safe to hash instances of that type, regardless of what its superclasses implement or don’t implement.
If this check doesn’t do what you want because you are in some interesting edge cases, you should manually provide the correct parameters.
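As a rough illustration of providing the behavior manually, the sketch below shows two options on hypothetical classes: re-declaring __hash__ in the subclass body so that it counts as explicit, or passing unsafe_hash=True so that dataclass generates a field-based hash. The class names are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class A:
    a: int = 1

    def __hash__(self) -> int:
        return hash(("A", self.a))


@dataclass
class KeepsParentHash(A):
    # Re-declaring __hash__ in the class body makes it explicit for this
    # class, so dataclass leaves it alone.
    __hash__ = A.__hash__


@dataclass(unsafe_hash=True)
class GeneratedHash(A):
    # dataclass writes a new field-based __hash__; A.__hash__ is not reused.
    pass


print(KeepsParentHash.__hash__ is A.__hash__)  # True
print(GeneratedHash.__hash__ is A.__hash__)    # False
print(hash(KeepsParentHash(3)) == hash(A(3)))  # True
```

Which option fits depends on whether the parent’s hash is still consistent with the subclass’s __eq__; that is a judgment the subclass author has to make.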
But also, what is your situation? Can you provide a small example where it behaves incorrectly?
Ok, but this is so much of a toy example that I still don’t understand why you want this.
The reason this doesn’t work is because despite there not being any new attributes, B generates a new __eq__. Is that the behavior you want to see changed?
Also, don’t forget that without frozen=True, it is strictly unsafe to support hashing.
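To make the point about the generated __eq__ concrete, here is a minimal sketch of that kind of situation. It is not the example from the thread; A and B below are hypothetical stand-ins with a single field.

```python
from dataclasses import dataclass


@dataclass
class A:
    a: int = 1

    def __hash__(self) -> int:
        return hash(self.a)


@dataclass
class B(A):
    # No new fields, but @dataclass still generates a fresh __eq__ for B
    # and, with the default parameters, sets B.__hash__ to None.
    pass


print('__eq__' in B.__dict__)  # True
print(A.__hash__ is None)      # False
print(B.__hash__ is None)      # True

try:
    hash(B())
    b_is_hashable = True
except TypeError:
    b_is_hashable = False
print(b_is_hashable)           # False
```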
Feel free to add frozen to the example, it doesn’t change the issue.
When you say ‘this is expected behavior’ (that is, dataclass ignoring inheritance for one particular method but not for others), is that because it was actually discussed? Was there any agreement that this is the right behavior?
It is not dataclasses “ignoring inheritance”; dataclasses is just respecting the behavior of Python here. Remove @dataclass from the example and instead manually define __eq__ in both A and B, and you will observe the exact same behavior.
Note that if you use @dataclass(eq=False) on B, then it will inherit the __hash__ method.
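A short sketch of the eq=False case, using the same hypothetical A and B shape as elsewhere in this thread: B gets no __eq__ of its own, so nothing clears the inherited __hash__.

```python
from dataclasses import dataclass


@dataclass
class A:
    a: int = 1
    b: int = 2

    def __hash__(self) -> int:
        return hash((self.a, self.b))


@dataclass(eq=False)
class B(A):
    # With eq=False no __eq__ is generated for B, so both __eq__ and
    # __hash__ come from A.
    pass


print(B.__hash__ is A.__hash__)  # True
print('__eq__' in B.__dict__)    # False
print(B(1, 2) == B(1, 2))        # True, via the __eq__ inherited from A
```

This only makes sense when B adds no fields that should take part in comparison, since the inherited __eq__ compares A’s fields only.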
The dataclasses behaviour tries to emulate what Python would do if the __eq__ method had been written in the class body, even though dataclasses actually adds it after the class has been defined.
from dataclasses import dataclass

def eq_method(self, other):
    if type(self) is type(other):
        return self.a == other.a and self.b == other.b
    return NotImplemented

@dataclass
class A:
    a: int = 1
    b: int = 2

    def __hash__(self) -> int:
        return hash((self.a, self.b))

class B(A):
    __eq__ = eq_method

class C(A):
    pass

C.__eq__ = eq_method  # This is roughly how dataclass adds the __eq__ method

print(B.__hash__)  # None
print(C.__hash__)  # <function A.__hash__ at 0x76deda1a0220>
Dataclasses has to do the check and clear out the hash itself because it wants to look like you’ve written class B and not class C even though the way it works is more like class C.