hash(None) Mk.2

At least you are making some sort of argument that I can respond to. You really should have started with that.

If an operation returns a constant result (as can be observed from the source code, which is open), running it by definition conveys no information to an attacker. I don’t need to be a security expert to know that. If anything, it is the default object hash being returned on statically allocated objects that’s the security risk, since it basically tells you where in memory the Python binary was loaded.
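To make the memory-layout point concrete, here is a minimal sketch. It assumes, as I understand the CPython source, that the default object hash is the object's address rotated right by 4 bits, and that `id()` returns that address. The helper names `rotate_pointer` and `recover_address` are made up for illustration and are not part of any API.

```python
import sys

# Pointer width of this build (64 on a typical 64-bit interpreter).
WORD_BITS = sys.maxsize.bit_length() + 1


def rotate_pointer(addr: int, bits: int = WORD_BITS) -> int:
    """Rotate `addr` right by 4 bits inside a `bits`-wide word.

    Hypothetical helper mirroring what the default pointer hash is
    assumed to do in CPython.
    """
    mask = (1 << bits) - 1
    return ((addr >> 4) | (addr << (bits - 4))) & mask


def recover_address(hash_value: int, bits: int = WORD_BITS) -> int:
    """Undo rotate_pointer by rotating left by 4 bits."""
    mask = (1 << bits) - 1
    h = hash_value & mask  # also maps a negative hash back to unsigned
    return ((h << 4) | (h >> (bits - 4))) & mask


obj = object()
# On CPython these two values are expected to match; other interpreters
# are free to do something else entirely.
print(hex(id(obj)), hex(recover_address(hash(obj))))
```

If the assumption holds, anyone who can observe such a hash can work back to an address, which is exactly what a constant return value cannot leak.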

  • Somebody has to implement it.
    I did.

  • Somebody has to document it.
    Not too sure about that. I mean I wrote a line about it in the blurb. But since no one depends on this behavior for their code before or after the change, nothing bad would happen if we don’t tell them about it. It is also not a change to the requirements.

  • Somebody has to write tests for it.
    How? None is only equal to itself, right? So as far as requirements go, its hash can be any value that stays constant throughout the run. Pretty sure a literal int32_t constant does that. The only test we could write given the requirements is an assertion that hash(None) == hash(None), which is tautologically correct for a literal constant.

  • Those tests have to run every time we run the test suite, and on the CI server.
    I don’t think there is value in running code that checks tautologies in CI.
    Do you have tests today that check hash(None) == hash(None)? No? But what if there’s a bug in Py_HashPointer? It actually does something less trivial than returning a constant.
    If you don’t test it now, why test it after the change?
    If you do test it now, what other tests are needed?

  • Now it is a feature of the language that every other Python interpreter has to implement.
    I disagree. There is no change to the requirements. They can implement it however they like.

  • And that people have to learn.
    No, people can stay oblivious to it.
    People also have to learn that hash(None) is not a constant. Some are very surprised by it. I know I was, and I’ve talked with enough other people to tell you I wasn’t the only one that was surprised.
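For anyone who wants to see both points for themselves, here is a rough sketch. The helper name `hash_none_in_fresh_interpreter` is made up for illustration. Whether the two runs print the same value depends on the interpreter version and on whether the operating system randomizes load addresses, so I am deliberately not claiming an output.

```python
import subprocess
import sys


def hash_none_in_fresh_interpreter() -> int:
    """Start a new interpreter process and return its hash(None)."""
    result = subprocess.run(
        [sys.executable, "-c", "print(hash(None))"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


# Within one run the hash has to be stable. This is the tautological
# check discussed above, and it passes for any sane implementation.
assert hash(None) == hash(None)

# Across runs, nothing in the requirements promises the same value.
first = hash_none_in_fresh_interpreter()
second = hash_none_in_fresh_interpreter()
print("same across runs:", first == second)
```

The in-process assertion is the only thing the requirements let you test; the cross-process comparison is the part that tends to surprise people.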

To a Python dev who has known about it for 10 years, I’m sure it seems like an obvious thing, though.

  • Every extra branch or line of code adds more places that bugs can occur.
    This is true for any change, including mine. It is hard to see where a bug could hide in a function that just returns a constant, but in general you are right.

  • In most cases, new features add code to the interpreter, making it bigger. If that feature isn’t generally useful, it becomes just bloat.
    Something like 3 lines of bloat, but yes, the cost is not zero.

  • Once we’ve given None a constant hash, what about other singletons? In a month, or a year, somebody will be back with a feature request to make Ellipsis or NotImplemented behave like None.
    None is different from the others because of how Optional is defined.
    Not that I even think it’s bad for other sentinel values to hash to constants. They probably should, it just doesn’t matter in practice.

  • And every new feature comes with some risk: if we make a mistake, some unforeseen problem occurs because of this (I can’t see what that might be, but that’s the problem with unforeseen problems) we’re stuck with it for a long deprecation period before we can remove it.
    There is no change to the requirements and therefore, as a special case of that, no new feature.

But I also understand the general sentiment here, which is why this thread should die. You will go on believing whatever it is you want to believe. This is a terrible change. These are not the droids you were looking for.
