I am trying to understand why the following statement returns true.
a = 10000
b = 10000
a == b # True
Here is my thought process: Both a and b are integers. Integer is a class that extends object. Object contains an __eq__ method that is implemented using the is operator, like this: True if x is y else NotImplemented. We cannot use this, since we are interested in comparing the values of the a and b objects, not their identity. So the Integer class overrides (in C?) object’s __eq__ method and returns a boolean (True in this case) by comparing the values of a and b. (How we access the values of a and b is not my interest, as the docs say there is no canonical way to represent the value of an object.) Did I get the flow correctly?
I thought I did until I read the following line in the doc. This line implies we are still using Object’s comparison method.
Most other built-in types have no comparison methods implemented, so they inherit the default comparison behavior.
Source: 6. Expressions — Python 3.12.0 documentation
What may not be obvious is that whereas other languages/implementations regard an identifier or variable-name as a label representing an address in memory (containing a value), Python’s identifiers are pointers to a value/space in memory. Thus, using your example, once “10,000” is held in-store, the pointers “a” and “b” can both point to the same memory address. Thus, we can short-cut any equality/identity comparison by looking at the two id()-s. They will match because they are aliases to the same memory-location, and the two values will match, by-definition.
This point is made a few paragraphs later (than the earlier quotation):
The default behavior for equality comparison (== and !=) is based on the identity of the objects. Hence, equality comparison of instances with the same identity results in equality, and equality comparison of instances with different identities results in inequality. A motivation for this default behavior is the desire that all objects should be reflexive (i.e. x is y implies x == y).
For some lighter reading about identity and equality:
Mike (Mouse vs the Python) Driscoll: Python 101: Equality vs Identity: https://www.blog.pythonlibrary.org/2017/02/28/python-101-equality-vs-identity/
and Trey (Python Morsels) Hunner: Equality vs identity in Python
The confusion is slightly compounded by using integers as an example. I have it in my head (apologies: have not found a web.ref to illustrate/prove) that a group of smaller integer values are retained in RAM, by some implementations, as a speed-up.
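A small experiment can illustrate this. The sketch below assumes CPython; the results of the identity checks are implementation details, so no particular output is promised for them. The integers are built with int(...) at run time so that the compiler cannot merge equal literals into a single constant, and all variable names are just for illustration.

```python
# Build the integers at run time so equal literals are not merged by the compiler.
a = int("10000")
b = int("10000")

values_equal = (a == b)   # value comparison: True for equal integers
same_object = (a is b)    # identity comparison: depends on the implementation

# Small integers are where caching may kick in (an implementation detail).
small_a = int("100")
small_b = int("100")
small_same_object = (small_a is small_b)

print("a == b:", values_equal)
print("a is b:", same_object)
print("small_a is small_b:", small_same_object)
```

Whatever the two "is" lines print, the == line does not depend on them, which is the distinction between equality and identity.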
That is what I was trying to point out by saying Object’s __eq__ method uses identity as equality.
I skimmed the two articles and I don’t think they address my question.
The Python docs say the opposite. Here is a quote:
An object’s identity never changes once it has been created; you may think of it as the object’s address in memory. The ‘is’ operator compares the identity of two objects; the id()
Source: 3. Data model — Python 3.12.0 documentation
I believe you’ve got it right. In the doc-page you cited, it actually describes how different built-in types implement comparison operators. For numeric types, it says: “they compare mathematically (algorithmically) correct without loss of precision”, which basically means that the numbers compare by value. So when they later say “Most other built-in types have no comparison methods implemented”, this does not include numbers, for which the comparison methods are implemented as the (mathematical) value comparison.
Yes, this is implemented in C, by the function long_richcompare in cpython/Objects/longobject.c.
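The same thing can be checked from the Python side without reading any C. This is a minimal sketch, assuming CPython; Plain is a hypothetical class made up for the comparison, and it defines no comparison methods of its own.

```python
class Plain:
    """Hypothetical class that defines no comparison methods of its own."""


# int carries its own __eq__ in its namespace; Plain does not and inherits object's.
int_has_own_eq = "__eq__" in vars(int)
plain_has_own_eq = "__eq__" in vars(Plain)

# Two distinct int objects with the same value compare equal (by value).
a = int("10000")
b = int("10000")
int_result = (a == b)

# Plain falls back to the default, identity-based comparison.
p, q = Plain(), Plain()
plain_same = (p == p)    # same object on both sides
plain_other = (p == q)   # two distinct objects

# Calling object's __eq__ directly on two distinct objects gives NotImplemented.
default_result = object.__eq__(p, q)

print(int_has_own_eq, plain_has_own_eq)
print(int_result, plain_same, plain_other)
print(default_result)
```

So int is one of the types that overrides the default, while a class like Plain is the "inherit the default comparison behavior" case from the docs.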
Why it's called "long"
In Python 2, int and long were different types:
- int denoted the architecture-native integers (like 32-bit ones, depending on the machine) like in C/C++,
- and long denoted arbitrary precision “long” integers, that is mathematical integers.
In Python 3, only long survived, but it was renamed to int, which now denotes arbitrary precision integers.
Thank you for the dig. Can you give me a 10,000-foot view of how you figured it out? The file looks daunting.
What you’d be looking for is the PyTypeObject struct, which is what a class object actually is on the C level. As you can see, it’s got a whole pile of function pointers (and pointers to additional function pointers), which correspond to all the special methods that can be defined. In this case we want tp_richcompare, which actually implements all the equality/comparison operations in one function.
I’m not sure I would be able to recreate all steps now.
Roughly, it was some googling and searching/digging through the cpython files for something like “comparison operators in built-in objects”. Eventually I found the do_richcompare function in cpython/Objects/object.c, which redirects to type-specific comparison operators, and the tp_richcompare field mentioned above.
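For anyone who finds the C daunting, here is a rough Python model of what that dispatch does for ==. It is a simplified sketch, not the actual CPython source: sketch_eq is a hypothetical helper, it only covers equality, and it calls __eq__ on the types where the C code calls tp_richcompare.

```python
def sketch_eq(v, w):
    """Simplified, hypothetical model of how `v == w` gets dispatched."""
    tried_reflected = False

    # If the right operand's type is a proper subclass of the left's, it goes first.
    if type(v) is not type(w) and issubclass(type(w), type(v)):
        tried_reflected = True
        result = type(w).__eq__(w, v)
        if result is not NotImplemented:
            return result

    # Otherwise ask the left operand's type.
    result = type(v).__eq__(v, w)
    if result is not NotImplemented:
        return result

    # Then give the right operand's type a chance, unless it was already asked.
    if not tried_reflected:
        result = type(w).__eq__(w, v)
        if result is not NotImplemented:
            return result

    # Nobody knew how to compare: fall back to identity.
    return v is w


anything = object()
print(sketch_eq(int("10000"), int("10000")))  # int compares by value
print(sketch_eq(1, 1.0))                      # int declines, float answers
print(sketch_eq(anything, anything))          # identity fallback, same object
print(sketch_eq(object(), object()))          # identity fallback, distinct objects
```

For two ints the second step already answers, so the identity fallback at the end is never reached.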
In any case, it was a fun experience researching cpython-internals, so thanks for the ask.