Why is it better to write `if a is None` instead of `if a == None`?

Lucas_Malor · December 31, 2023, 4:35pm

As title. I know the first style is recommended by PEP 8, and when I write the second style, PyCharm gives me

PEP 8: E711 comparison to None should be 'if cond is None:'

But why is it better to write is? IMHO the fact that None is a singleton is an implementation detail, and there’s no explanation of this “rule” in PEP 8.

Maybe it’s because __eq__ can be overloaded?

barry-scott · December 31, 2023, 4:44pm

Yes, it is because __eq__ may be defined.

Using a is None has no side effects.
Using a == None will run a.__eq__ if defined.
Worse, if a.__eq__ does not allow for None being the other object, you can see tracebacks.
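
A minimal sketch of both points, assuming a hypothetical Money class (not from this thread) whose __eq__ takes for granted that the other operand is also a Money:

```python
class Money:
    """Hypothetical class whose __eq__ forgets that `other` may be None."""

    def __init__(self, cents):
        self.cents = cents
        self.eq_calls = 0  # counts how often __eq__ runs

    def __eq__(self, other):
        self.eq_calls += 1                 # side effect: user code runs
        return self.cents == other.cents   # AttributeError if other is None


a = Money(500)

# Identity check: never calls __eq__, never raises.
is_none = a is None

# Equality check: runs Money.__eq__, which fails on None.
try:
    a == None  # noqa: E711 (deliberate, to show the problem)
    raised = False
except AttributeError:
    raised = True

print(is_none, a.eq_calls, raised)
```

The identity test leaves eq_calls untouched; only the == comparison executes user code and hits the traceback.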

kknechtel · December 31, 2023, 4:47pm

Not really - it was consciously implemented so that the a is None code would work. Otherwise the alternative would have to do a type check or something like that, which is nowhere near as easy within the existing framework.

Yes, as Barry said.

Stefan2 · December 31, 2023, 6:14pm

It’s not.

Lucas_Malor · December 31, 2023, 11:18pm

I know the docs; let me rephrase: it should be an implementation detail.

But since you can’t trust __eq__, I suppose this is the way to go.

cameron · January 1, 2024, 12:57am

Pointedly, __eq__ might match/not-match some other object. Generally
things are not ==None, but since you’re using a’s __eq__ method,
it might do something you hadn’t planned for.

a is None is succinct, reliable, and well understood as an idiom.
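
As an illustration of a match you hadn’t planned for, here is a small sketch with a hypothetical Wildcard class (an assumption for this example) that claims to equal everything:

```python
class Wildcard:
    """Hypothetical object whose __eq__ says yes to everything."""

    def __eq__(self, other):
        return True


def describe(value):
    # Lets value's own __eq__ decide whether it "is" None.
    return "missing" if value == None else "present"  # noqa: E711


def describe_safely(value):
    # Identity test: only the real None object passes.
    return "missing" if value is None else "present"


w = Wildcard()
result_eq = describe(w)
result_is = describe_safely(w)
print(result_eq, result_is)
```

Such objects are not purely theoretical: unittest.mock.ANY in the standard library compares equal to everything by design.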

Cheers,
Cameron Simpson cs@cskk.id.au

CAM-Gerlach · January 1, 2024, 3:28am

On the basis of what, sorry? None is defined in not only the docs but in the language specification as being a singleton built-in constant:

This type has a single value. There is a single object with this value. This object is accessed through the built-in name None.
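
A quick way to see that statement in practice (a sketch run against CPython; the variable names are just for illustration): every route that might be expected to produce a second None hands back the same object.

```python
import copy
import pickle

NoneType = type(None)


def returns_nothing():
    pass  # falling off the end of a function returns None


candidates = [
    NoneType(),                        # calling the type returns the existing None
    copy.copy(None),
    copy.deepcopy(None),
    pickle.loads(pickle.dumps(None)),  # round trip through serialization
    returns_nothing(),
]

all_same = all(obj is None for obj in candidates)
print(all_same)
```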

I’m curious what the rationale would be for changing this now.

Given that per the language definition, None is a singleton built-in constant, identity rather than equality checking is semantically the most appropriate operation to use. Additionally, there is a performance advantage, with equality checking being substantially slower.

For example, in Python 3.11, comparing even trivial instances of the built-in types (str, int, object(), etc) to None is nearly 50% slower for equality vs. identity:

$ python -m timeit --setup "obj = 'test'" -- "obj is None"
5000000 loops, best of 5: 43.3 nsec per loop

$ python -m timeit --setup "obj = 'test'" -- "obj == None"
5000000 loops, best of 5: 64.8 nsec per loop

While for a simple custom class:

class Spam():
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if self is other:
            return True
        if type(other) is not type(self):
            return False
        return other.value == self.value

It is around 7 times slower:

$ python -m timeit --setup "from temp import Spam; obj = Spam(42)" -- "obj is None"
5000000 loops, best of 5: 43.3 nsec per loop

$ python -m timeit --setup "from temp import Spam; obj = Spam(42)" -- "obj == None"
1000000 loops, best of 5: 304 nsec per loop

Even for the best case, objects that are None, there is still around a 40% speed penalty due to the extra lookup and logic:

$ python -m timeit --setup "obj = None" -- "obj is None"
5000000 loops, best of 5: 43.3 nsec per loop

$ python -m timeit --setup "obj = None" -- "obj == None"
5000000 loops, best of 5: 60.1 nsec per loop

On older versions of Python, e.g. 3.7, the speed penalty is even greater, e.g. nearly 15x for trivial classes with a custom __eq__:

$ python -m timeit --setup "import temp; obj = temp.Spam(42)" -- "obj is None"
10000000 loops, best of 5: 36.9 nsec per loop

$ python -m timeit --setup "import temp; obj = temp.Spam(42)" -- "obj == None"
500000 loops, best of 5: 522 nsec per loop

Rosuav · January 1, 2024, 3:41am

When it’s nothing more than a performance question, though, we tend to use == comparisons (see eg interning of strings and small ints). But the question of whether to use is None or == None is more one of semantics. We usually DON’T want to allow the other operand to decide whether or not it’s equal to None, since that only creates confusion.

Lucas_Malor · January 1, 2024, 9:06am

I want to say in advance I understood it’s better to write is None instead of == None. So, since you have to write is None, it’s good that in the docs None is specified as singleton.

Mine was only a generic consideration. In general, IMHO the fact that an object is a singleton should be considered an implementation detail, and end users should not rely on this. For example 1 is 1 is true, but its use is discouraged.

Lucas_Malor · January 1, 2024, 9:08am

Side note: can a PEP be modified? Can I open a bug asking to make more explicit the fact that it is better to write x is None because of the __eq__ problem?

Rosuav · January 1, 2024, 9:51am

There’s a huge difference between singletons by design and singletons for performance, though. 1 is 1 only works in CPython because small integers are cached; and actually, 12345 is 12345 will (happen to) work, because integer literals are shared within a compilation unit. But modules are singletons by design. Suppose I do something like this:

import subprocess
import some_module

try: some_module.do_stuff()
except subprocess.CalledProcessError: pass

then I expect that my idea of subprocess.CalledProcessError is the exact same one that some_module would be raising - that is, that the subprocess module is a single module, and we’re looking at two references to the same thing. If modules were NOT singletons, this would entirely break, and we’d have to identify exceptions by some kind of name instead.

This is not merely an implementation detail. It’s a vital part of the definition of module importing. And that’s true of a lot of other singletons too - one way or another, they’re being tested for identity, and equality simply wouldn’t work.
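
A small sketch of that guarantee, using only the standard library. The raise_elsewhere function is a hypothetical stand-in for some_module.do_stuff():

```python
import importlib
import subprocess
import sys

# A repeated import is served from the sys.modules cache, so every
# reference names the same module object.
again = importlib.import_module("subprocess")
same_module = again is subprocess and sys.modules["subprocess"] is subprocess

# Hence the exception class is the same object whoever looks it up.
same_class = again.CalledProcessError is subprocess.CalledProcessError


def raise_elsewhere():
    # Stand-in for code in another module doing its own import.
    import subprocess as sp
    raise sp.CalledProcessError(1, ["false"])


try:
    raise_elsewhere()
    caught = False
except subprocess.CalledProcessError:
    caught = True

print(same_module, same_class, caught)
```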

Stefan2 · January 1, 2024, 4:06pm

In the common (and asked about in the title here) if context it seems to be even more extreme, as is gets faster and == gets slower:

13.4 ns  x is None
29.1 ns  x == None
 9.1 ns  if x is None: pass
33.1 ns  if x == None: pass
Python: 3.11.4 (main, Sep  9 2023, 15:09:21) [GCC 13.2.1 20230801]

The benchmark script:

from timeit import repeat
import sys

for c in 'x is None', 'x == None', 'if x is None: pass', 'if x == None: pass':
    t = min(repeat(c, 'x = "test"', repeat=100))
    print(f'{t*1e3:4.1f} ns ', c)
print('Python:', sys.version)

Attempt This Online!

CAM-Gerlach · January 1, 2024, 11:45pm

Right: performance is only a compelling rationale on its own if there aren’t clear correctness and/or semantic reasons one way or the other, and here there are, as I mentioned first (if relatively briefly, since you and others had covered this aspect well). I mentioned it because it only further strengthens the case for comparing to None by identity rather than equality.

Indeed, with interning, there’s still a similar performance gap when comparing by equality rather than identity (the performance gain being in reducing the cost of object creation rather than comparison), but using is as a micro-optimization would of course be both semantically inappropriate and logically incorrect, as it relies on the explicitly implementation-dependent rather than language-guaranteed interning behavior.
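
To make the contrast concrete, a short sketch: the integers are built at run time so the compiler cannot share a single constant, and the result of the identity test is deliberately not assumed.

```python
# Two equal ints created separately at run time.
a = int("12345")
b = int("12345")

equal = a == b       # guaranteed by the language: True
identical = a is b   # implementation detail: may be True or False

print("equal:", equal)
print("same object:", identical)
```

Only the first result can be relied upon; the second depends on the interpreter’s caching and may differ between implementations and versions.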

In my view, the order of cause and effect is somewhat reversed. It was considered most logical for None to be a built-in constant of the language rather than a non-singleton, as redefining it would be rather pathological, and therefore it is guaranteed as such by the language spec/docs and conforming implementations. As a consequence, is is the most correct and semantically-appropriate test for determining if an object is None. If the former were not the case, you’d have a lot more to worry about than just __eq__, e.g. anyone/thing could replace None with any arbitrary object, which would not only not compare identical, but may not compare equal as well, or may raise an error on comparison, or have arbitrary side effects, or…

This consideration is only true for cases when it is an implementation detail, versus a guarantee of the language. None, True, False, and (barring hackery) imported modules, are all guaranteed by the language to be singletons, and thus can and, when semantically appropriate, should be relied upon. By contrast, interning, or e.g. set or (before 3.7) dict order is an interpreter-specific implementation detail, and should not. The generic consideration here is that one can and (when appropriate) should rely on language guarantees, but not on implementation details (barring extenuating circumstances), same as UB in C/C++.

See what I wrote in PEP 1 for the general answer to this question. As to PEP 8 specifically, it would be up to the active authors/maintainers of that PEP (Guido, Barry, etc), along with general concurrence of the PEP Editors and Python core devs, in the case of a clarification/explanation of an existing rule (a more substantive change to the guidance itself would presumably require a more formal discussion and debate).

In my view, that specific issue is simply a side effect of the fundamental semantic rationale: None is a built-in constant singleton, and thus is is the most correct and appropriate comparison. Upon checking the PEP text, this is in fact what is currently stated:

Comparisons to singletons like None should always be done with is or is not, never the equality operators.

Stefan2 · January 2, 2024, 6:18am

That’s btw because the common if x is None: has special bytecode support. Even the compiler says using is is a good idea!

Abbreviated dis outputs:

----------------------------------------
x is None
----------------------------------------
  1           2 LOAD_NAME                0 (x)
              4 LOAD_CONST               0 (None)
              6 IS_OP                    0

----------------------------------------
x == None
----------------------------------------
  1           2 LOAD_NAME                0 (x)
              4 LOAD_CONST               0 (None)
              6 COMPARE_OP               2 (==)

----------------------------------------
if x is None: pass
----------------------------------------
  1           2 LOAD_NAME                0 (x)
              4 POP_JUMP_FORWARD_IF_NOT_NONE     2 (to 10)

----------------------------------------
if x == None: pass
----------------------------------------
  1           2 LOAD_NAME                0 (x)
              4 LOAD_CONST               0 (None)
              6 COMPARE_OP               2 (==)
             12 POP_JUMP_FORWARD_IF_FALSE     2 (to 18)
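
One way to reproduce listings like these is sketched below. The opnames helper is a name made up for this example, and the opcode names it prints differ between CPython versions, so the 3.11 output above should not be expected elsewhere.

```python
import dis


def opnames(source):
    """Return the opcode names the compiler emits for a statement."""
    code = compile(source, "<example>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)]


for src in ("x is None", "x == None",
            "if x is None: pass", "if x == None: pass"):
    print(f"{src!r:24} {opnames(src)}")
```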

Attempt This Online!

Lucas_Malor · January 3, 2024, 9:32am

I completely agree

Ok, but PEP 8 says "Comparisons to singletons like None should always be done with is". It does not specify that the singletons must be singletons defined by the language. An example showing that 1 is 1 is bad would improve understanding, without involving complicated considerations about __eq__.

I agree in general, but PEP 8 is special, since it’s more a guide to how to code in a clear manner.
Maybe I’m asking too much, but wouldn’t it be better to have a version of PEP 8 in the docs, so that modifications to the style of writing and clarifications are easier?

Oh. And how can I disturb all these people?

Side note: regarding the speed considerations, IMHO the fact that is is faster is only a happy coincidence. It’s better to write 1 == 1 instead of 1 is 1 even if the latter is faster.

Stefan2 · January 3, 2024, 10:22am

Since when is 1 a singleton? (Since what CPython version)?

MegaIng · January 3, 2024, 10:32am

There are different “categories” of objects that can be checked for with is:

singletons (None, NotImplemented, ...)
Enums (bool, i.e. True/False, stdlib Enum subclasses)
constants inside a single code object ((1,2,3) is (1,2,3) will be True)
interned strings or ints

Only for the first two of those categories does it make semantic sense to use is instead of ==: There is a guarantee that there is only a single instance of that value. For all of these, there is technically a performance benefit.
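
A brief sketch of the first two categories, assuming a hypothetical Color enum: however the value is obtained, it is the same object, so an identity test is meaningful.

```python
from enum import Enum


class Color(Enum):  # hypothetical enum for illustration
    RED = 1
    GREEN = 2


# Enum members: lookup by value or by name yields the one member object.
by_value = Color(1) is Color.RED
by_name = Color["RED"] is Color.RED

# bool: every true/false result is one of the two objects True and False.
bool_identity = bool(1) is True and (2 > 3) is False

# NotImplemented: int.__eq__ returns the one sentinel for foreign types.
not_impl = (1).__eq__("a") is NotImplemented

print(by_value, by_name, bool_identity, not_impl)
```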

Rosuav · January 3, 2024, 10:42am

Small integers are cached/interned in CPython for performance. That’s not the same as being singletons though. The term ‘singleton’ generally precludes anything where there are other instances of the type, so True and False technically don’t count, though if you call them a pair of “doubletons” and treat them the same way, it becomes just a matter of terminology.

But I would definitely say that 1, even if interned, is not a singleton. It so happens that there will (normally) only ever be one instance of an integer with that value, but this is not the same thing. Plus, there are lots of objects that compare equal to 1, including True, any instance of the float value 1.0, any Fraction with value 1/1, etc. It would be quite unusual to require both that it be an instance of int itself (and not a subclass) AND that its value be equal to 1, so there’s no real reason to compare with is, which is an even stronger check.
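
A short sketch of that point: several distinct objects compare equal to 1, and none of them is the int object itself.

```python
from decimal import Decimal
from fractions import Fraction

one = 1
candidates = [True, 1.0, Fraction(1, 1), Decimal(1)]

# Every candidate is equal to 1...
all_equal = all(c == one for c in candidates)

# ...but none of them is the int object bound to `one`.
any_identical = any(c is one for c in candidates)

print(all_equal, any_identical)
```

An is test would reject all of these, which is rarely what code comparing against 1 actually wants.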

(Side note: It would be perfectly valid to implement Python without actually having “real objects” for most integers. You could do something whereby most objects are stored as pointers, but if the low bit is set, all the other bits represent the actual value of the integer, and then effectively all integers up to ±2**62 would be interned. Are they now singletons? No. They don’t even exist - they’re entirely mythical objects!)

Stefan2 · January 3, 2024, 10:47am

It’s really a question for the one who does consider 1 to be a singleton, i.e., Lucas.

Stefan2 · January 3, 2024, 11:02am

(Not saying you can’t reply, just saying that it won’t work if you don’t even think that)