hash(None) Mk.2

Some costs are one-off, up-front costs. Some are ongoing costs.

  • In this specific example, address randomization is used for security reasons. Is it safe to change the hash of None to not rely on its address? Don’t know. A security expert will need to consider it.

  • Somebody has to implement it.

  • Somebody has to document it.

  • Somebody has to write tests for it.

  • Those tests have to run every time we run the test suite, and on the CI server.

  • Now it is a feature of the language that every other Python interpreter has to implement.

  • And that people have to learn.

  • Every extra branch or line of code adds more places that bugs can occur.

  • In most cases, new features add code to the interpreter, making it bigger. If that feature isn’t generally useful, it becomes just bloat.

  • Once we’ve given None a constant hash, what about other singletons? In a month, or a year, somebody will be back with a feature request to make Ellipsis or NotImplemented behave like None.

  • And every new feature comes with some risk: if we make a mistake, or some unforeseen problem occurs because of this (I can’t see what that might be, but that’s the problem with unforeseen problems), we’re stuck with it for a long deprecation period before we can remove it.

These costs might be small. Okay, that’s great! That is a point in your proposal’s favour. But balanced against the (hypothetically) small costs is that the benefit is likewise small.

As I pointed out in the other thread, you have at least three options for avoiding this issue, and even if we agree to the proposal you won’t get your ultimate aim – consistent set iteration order across separate runs.

You might get something which looks like consistent set order by accident, but it will be unsafe and could be broken at any time. In a year, you’ll be back complaining that despite None’s consistent hash, a bug fix point release broke your set order consistency, and we’ll say “We told you so!” and then we’ll need to have another forty- or fifty-post thread about why sets are unordered and why it’s okay to change set iteration order in a bug fix release.
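To make that concrete, here is a rough sketch (my own illustration, not part of the proposal) of one reason a constant hash for None doesn't buy you stable set order by itself: str hashes are salted per process unless PYTHONHASHSEED is pinned, and set iteration order follows from the hashes. The helper name `run_with_seed` and the sample string are made up for the example.

```python
import os
import subprocess
import sys


def run_with_seed(seed: str, snippet: str) -> str:
    """Run `snippet` in a fresh interpreter with PYTHONHASHSEED=seed (hypothetical helper)."""
    env = dict(os.environ, PYTHONHASHSEED=seed)
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


SNIPPET = "print(hash('spam'))"

# Same seed twice: the str hash is reproducible across separate runs.
hash_seed_1 = run_with_seed("1", SNIPPET)
hash_seed_1_again = run_with_seed("1", SNIPPET)

# Different seed: the str hash changes, so any set order built on it can too.
hash_seed_2 = run_with_seed("2", SNIPPET)

print("same seed, same hash:     ", hash_seed_1 == hash_seed_1_again)
print("different seed, same hash:", hash_seed_1 == hash_seed_2)
```

Even with the seed pinned, the order you observe is still an implementation detail of the set type rather than a promise, which is the point above.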

So we have to make a judgement. Small cost, versus small benefit. Which wins? If the cost is higher than the benefit, then this proposal makes Python worse rather than better. If you could find even one single core developer who is willing to champion a PEP, you might have a chance.

Or it might be that in a month or a year or five years, some core developer will be annoyed enough by None hashing under address randomization that he or she will just go ahead and implement it, vindicating your position, and you can come back and tell us “Told you so!”.

The process is not perfect, and it is often annoyingly conservative. I’ve seen many proposals get rejected for many years (e.g. the ternary if operator) until something changes and we suddenly accept it.

But that’s the thing: errors of omission (failed to add a good feature) are much less important than errors of commission (added a bad feature where the costs are higher than the benefit).
