Hashable data types

Why lists are called hashable? What are the other examples of hashable data?

Actually, lists are NOT hashable! For more see:

Why is a list not hashable? This is very briefly explained in the glossary link that @ullix posted. The main stumbling block is that lists are mutable containers. Tuples are immutable containers, string are also immutable containers - both can be used as hash keys, but lists cannot.

Why can mutable objects not be used as hash keys? A hash table is a table, an associative array. In a normal array the integer indices allow you to access its elements. In a hash table it’s the keys that act as indices. Those keys (the objects you put in the table) first need to be mapped to integers – which is what the hash(some_object) function does.

Cannot we implement a hash function for lists to generate hash keys? Yes, we can. We can simply define a special __hash__ function in the class definition of our custom lists. But could that work? There are only two possible ways: the __hash__ function either needs to rely on the contents of the list or it does not. And there you run into (big) trouble.

For example – here is a piece of code that at first seems to work, but quickly gets mired in quicksand.

>>> class X(list):
...     def __hash__(self):  
                # a hash function that relies on the contents of the list...
...             return sum(self)   
... 
>>> x = X([1,2,3])
>>> y = X([4,5,6])
>>> dd = {x: 1, y: 2}   # define a dict (hash table)
>>> dd
{[1, 2, 3]: 1, [4, 5, 6]: 2}  # it works...
>>> dd[x]   # it really does...
1
>>> dd[y]
2
>>> x.append(3)  # ok, but what happens when x changes...
>>> x
[1, 2, 3, 3]
>>> dd
{[1, 2, 3, 3]: 1, [4, 5, 6]: 2}  # still works?
>>> dd[x]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: [1, 2, 3, 3]       
>>> dd[X([1,2,3,3])  # one more try
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: [1, 2, 3, 3]

So, if you rely on the contents of the object (or also if the object is mutated and its hash value changes), you will no longer be able to retrieve it from the hash table, even though it may appear to be in there…
The hash table itself will no longer be able to find the object back. In fact I wonder what the internal state of it is.

On the other hand, if the __hash__ function would not rely on the content of the list (which in a way is the list), then hashing that object, adding it to an associative array, makes no sense…

Objects have no contents, but instances of object are hashable, because they are immutable and no two instances of object compare as equal. I say this just to point out that lists are not hashable not because they don’t define a hash function, but because they actively disable their inherited hashing from object by setting the __hash__ attribute to None.

>>> list.__hash__ is None
True
1 Like

I actually wasn’t aware of this! Thanks for pointing this out. – I think it’s still a good question to then ask once more why is a hash function not defined and even disabled. I tried to say something about that underlying why question.

Absolutely! I wasn’t trying to contradict anything you did say, just clarifying a point I felt you left unsaid.

1 Like