Flow analysis of equality comparison of Integer class

I am trying to understand why the following statement returns true.

a = 10000
b = 10000

a == b # True

Here is my thought process: Both a and b are integers. Integer (int) is a class that extends object. object has an __eq__ method that is implemented using the is operator, like this: True if x is y else NotImplemented. We cannot use this, since we are interested in comparing the values of the a and b objects, not their identity. So the int class overrides (in C?) object’s __eq__ method and returns a boolean (True in this case) by comparing the values of a and b. (How we access the values of a and b is not my concern, as the docs say there is no canonical way to access an object’s value.) Did I get the flow right?
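
A quick way to check this flow from the interpreter, as a minimal sketch: the integers are built with int("10000") (my choice, not part of the original question) so that the compiler cannot merge two equal literals into one object.

```python
# Build the values at run time so each name gets its own int object.
a = int("10000")
b = int("10000")

# int defines its own __eq__ instead of inheriting object's.
print("__eq__" in vars(int))            # is __eq__ defined directly on int?
print(int.__eq__ is not object.__eq__)  # is it a different method from object's?

# int's method compares values.
print(int.__eq__(a, b))

# object's method only knows about identity; for two distinct objects
# it gives up and returns NotImplemented.
print(object.__eq__(a, b))
```

The last call is the default that int would fall back to if it did not override __eq__.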

I thought I did until I read the following line in the docs. This line implies we are still using object’s comparison method.

Most other built-in types have no comparison methods implemented, so they inherit the default comparison behavior.

Source: 6. Expressions — Python 3.12.0 documentation

What may not be obvious is that whereas other languages/implementations regard an identifier or variable name as a label representing an address in memory (containing a value), Python’s identifiers are pointers to a value/space in memory. Thus, using your example, once “10,000” is held in memory, the names “a” and “b” can both point to the same object. When they do, any equality comparison can be short-cut by looking at the two id()s: if they match, the names are aliases for the same memory location, and the two values match by definition. (Whether two separately written literals actually share one object is an implementation detail, so matching id()s are not guaranteed.)
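
To illustrate names as references, here is a small sketch. The names a, b and c and the use of int("10000") are my own choices for illustration; building the value at run time avoids the compiler merging equal literals.

```python
a = int("10000")
b = a                # a second name bound to the same object
c = int("10000")     # a separately created object with an equal value

# Aliases share one identity, so the comparison can short-cut.
print(a is b, id(a) == id(b))

# Equal values do not require equal identities.
print(a == c)

# Whether c happens to be the same object as a depends on the implementation.
print(a is c)
```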

This point is made a few paragraphs later (than earlier quotation):

The default behavior for equality comparison (== and !=) is based on the identity of the objects. Hence, equality comparison of instances with the same identity results in equality, and equality comparison of instances with different identities results in inequality. A motivation for this default behavior is the desire that all objects should be reflexive (i.e. x is y implies x == y).
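
The default described in that quotation can be seen with a class that defines no comparison methods. This is a minimal sketch; Plain and Point are hypothetical classes made up for the illustration.

```python
class Plain:
    """No __eq__ defined, so the default from object is inherited."""

    def __init__(self, value):
        self.value = value


class Point:
    """Overrides __eq__ to compare by value, as int does."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self.value == other.value


p, q = Plain(10000), Plain(10000)
print(p == p)   # same identity, so equal
print(p == q)   # different identities, so unequal despite equal contents

r, s = Point(10000), Point(10000)
print(r == s)   # value comparison, independent of identity
```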

For some lighter reading about identity and equality:
Mike (Mouse vs the Python) Driscoll: Python 101: Equality vs Identity: https://www.blog.pythonlibrary.org/2017/02/28/python-101-equality-vs-identity/
and Trey (PythonMorsels) Hunner: Equality vs identity in Python (Python Morsels)

The confusion is slightly compounded by using integers as an example. I have it in my head (apologies: I have not found a web reference to illustrate/prove it) that a group of small integer values is kept cached in memory by some implementations as a speed-up.
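
For what it is worth, CPython does keep preallocated objects for the integers -5 through 256; this is an implementation detail, not a language guarantee. The sketch below only prints the identity results rather than promising them, and the variable names are made up for the illustration.

```python
# Values are built at run time so the compiler cannot merge equal literals.
small_x, small_y = int("7"), int("7")
large_x, large_y = int("10000"), int("10000")

# Equality is by value in both cases.
print(small_x == small_y, large_x == large_y)

# Identity depends on the implementation's caching.
print("small ints share an object:", small_x is small_y)
print("large ints share an object:", large_x is large_y)
```

Either way, == gives the same answer, which is why identity should not be used to compare numbers.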

That is what I was trying to point out by saying Object’s __eq__ method uses identity as equality.

I skimmed the two articles and I don’t think they address my question.

The Python docs say the opposite. Here is a quote:

An object’s identity never changes once it has been created; you may think of it as the object’s address in memory. The ‘is’ operator compares the identity of two objects; the id()
Source: 3. Data model — Python 3.12.0 documentation

I believe you’ve got it right. In the doc-page you cited, it actually describes how different built-in types implement comparison operators. For numeric types, it says: “they compare mathematically (algorithmically) correct without loss of precision”, which basically means that the numbers compare by value. So when they later say “Most other built-in types have no comparison methods implemented”, this does not include numbers, for which the comparison methods are implemented as the (mathematical) value comparison.
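
A few checks of "compare by value, without loss of precision", as a sketch. The specific numbers are my own picks; 2**53 + 1 is used because it is an integer that a float cannot represent exactly.

```python
from fractions import Fraction

# Different numeric types compare by mathematical value.
print(1 == 1.0)
print(Fraction(1, 2) == 0.5)

# The int is not silently converted to a float first: converting this
# value would round it, yet the comparison still sees the difference.
big = 2**53 + 1
as_float = float(big)       # rounds to the nearest representable float
print(as_float == 2**53)    # the rounded float equals 2**53 exactly
print(big == as_float)      # so it is not equal to the original int
```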

Yes, this is implemented via C-functions long_compare and long_richcompare in cpython/Objects/longobject.c.

Why it's called "long"

In Python 2, int and long were different types:

  • int denoted the architecture-native integers (like 32-bit ones, depending on the machine) like in C/C++,
  • and long denoted arbitrary precision “long” integers, that is mathematical integers.

In Python 3, only long survived, but it was renamed to int, which now denotes arbitrary precision integers.

Thank you for the dig. Can you give me a 10,000-foot view of how you figured it out? The file looks daunting.

What you’d be looking for is the PyTypeObject struct, which is what a class object actually is on the C level. As you can see it’s got a whole pile of function pointers (and pointers to additional function pointers), which correspond to all the special methods that can be defined. In this case we want tp_richcompare, which actually implements all the equality/comparison operations in one function.
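
From the Python side, that single C function shows up as six separate special methods. The sketch below lists them for int; the tuple name RICH_COMPARISON_METHODS is made up for the illustration, and the printed type name is a CPython detail that other implementations may report differently.

```python
RICH_COMPARISON_METHODS = ("__lt__", "__le__", "__eq__", "__ne__", "__gt__", "__ge__")

for name in RICH_COMPARISON_METHODS:
    method = vars(int)[name]   # looked up directly on int, not inherited
    # Print the method name, the kind of object wrapping the C slot,
    # and the result of calling it on two equal integers.
    print(name, type(method).__name__, method(10000, 10000))
```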

I’m not sure I would be able to recreate all the steps now. 😂
Roughly, it was some googling and searching/digging through the cpython files for something like “comparison operators in built-in objects”. Eventually I found the do_richcompare function in cpython/Objects/object.c, which redirects to type-specific comparison operators, and the tp_richcompare field mentioned above.
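
As a rough Python-level sketch of that dispatch, for == only: rich_eq is a hypothetical helper written for this post, not a CPython function, and it simplifies the real C code, which handles all six operators.

```python
def rich_eq(v, w):
    """Simplified model of how `v == w` is dispatched."""
    tv, tw = type(v), type(w)
    tried_reflected = False

    # A proper subclass on the right gets the first chance to answer.
    if tv is not tw and issubclass(tw, tv):
        tried_reflected = True
        result = tw.__eq__(w, v)
        if result is not NotImplemented:
            return result

    # Ask the left operand's type.
    result = tv.__eq__(v, w)
    if result is not NotImplemented:
        return result

    # Ask the right operand's type, unless that already happened.
    if not tried_reflected:
        result = tw.__eq__(w, v)
        if result is not NotImplemented:
            return result

    # Nobody knew how to compare: fall back to identity.
    return v is w


print(rich_eq(10000, int("10000")))   # int's own comparison answers
print(rich_eq(1, 1.0))                # int declines, float answers
print(rich_eq(object(), object()))    # identity fallback
```

For two ints, the very first call to int's comparison returns a result, so the identity fallback is never reached.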

In any case, it was a fun experience researching cpython-internals, so thanks for the ask.
