Magic functions to C functions in CPython

I am looking into the CPython implementation and learned how Python handles operator overloading (for example, comparison operators) through the richcmpfunc tp_richcompare; field in the _typeobject struct, where the type is defined as typedef PyObject *(*richcmpfunc) (PyObject *, PyObject *, int);. Whenever one of these operators is applied to a PyObject, the interpreter calls the type’s tp_richcompare function. My question is this: in Python we use magic methods such as __gt__ to override these operators. So how does that Python code get converted into a C tp_richcompare function that is used everywhere a comparison operator is interpreted for a PyObject?

My second question is a more general version of this: when code in one language (here Python) overrides things (operators, hash, etc.) that are interpreted in another language (C, in the case of CPython), how does the implementation call the function defined in the first language? As far as I know, the generated bytecode is a low-level, instruction-based representation (essentially an array of uint8_t).

Another example is __hash__, which is defined in Python but is needed by the C implementation of the dictionary during lookup (lookdict). Again, a C function of type typedef Py_hash_t (*hashfunc)(PyObject *); is used everywhere a hash is needed for a PyObject, but how __hash__ gets translated to this C function is a mystery to me.

If the rich comparison methods are overridden, then the type’s tp_richcompare slot gets set to slot_tp_richcompare(). This function supports calling the rich comparison method of the type. For example:

>>> class C: pass
...
>>> hex(id(C))
'0x229c9aa3500'
>>> DebugBreak() # Windows: break into an attached native debugger

Initially tp_richcompare is inherited from object, i.e. object_richcompare() in C.

0:000> ?? *((python312!PyTypeObject *)0x229c9aa3500)->tp_richcompare
<function> 0x00007ffd`63f50110
 _object*  python312!object_richcompare+0(
        _object*,
        _object*,
        int)

If __lt__() gets defined, then tp_richcompare gets changed to slot_tp_richcompare().

>>> C.__lt__ = lambda s, o: True
>>> DebugBreak()
0:000> ?? *((python312!PyTypeObject *)0x229c9aa3500)->tp_richcompare
<function> 0x00007ffd`63f57600
 _object*  python312!slot_tp_richcompare+0(
        _object*,
        _object*,
        int)

@eryksun I want to know how code written in Python under __gt__ etc. is converted into a function in C.

Maybe you’re missing the other side of the problem, in terms of how the interpreter handles calling the type’s tp_richcompare slot function. The COMPARE_OP bytecode operation is implemented via PyObject_RichCompare(), which does most of the work in do_richcompare().
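To see the bytecode side of this for yourself, the standard dis module can list the instructions generated for a comparison. This is only a small illustration; the function name less_than is made up for the example, and the exact opcode names and arguments vary between CPython versions.

```python
import dis

def less_than(a, b):
    # A plain comparison; the compiler emits a compare instruction for it.
    return a < b

# Collect the opcode names the compiler generated for the function body.
ops = [ins.opname for ins in dis.get_instructions(less_than)]
print(ops)
```

On recent CPython releases the list should include an instruction whose name starts with COMPARE_OP, which is the operation that ends up in PyObject_RichCompare().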

@eryksun that part is fine. I just wanted to know exactly how the Python code written in the magic methods gets called from these rich-compare slot functions. I know I may sound weird, but I have been stuck on this for a long time. I just want to know how that translation happens. Or is there no translation at all, and the bytecode is simply executed inside these slot functions to produce the output?

Python code is compiled to bytecode that gets executed by an interpreter. The CPython interpreter is written in C, which is compiled to machine code that gets executed by the CPU. As I said above, the implementation of the bytecode instruction COMPARE_OP calls do_richcompare(), which calls the type’s tp_richcompare slot. In my first message, I gave you a link to the implementation of slot_tp_richcompare(), which is the tp_richcompare slot function for dynamic types. It maps the compare operation to one of the rich comparison method names (e.g. "__lt__" or "__gt__"), looks up the method on the type, and calls it with the self and other objects passed as parameters.
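As a rough mental model, the dispatch described above can be sketched in pure Python. This is not the real implementation: slot_tp_richcompare() is written in C and uses CPython’s internal method lookup and call machinery, while the names slot_tp_richcompare_sketch and NAME_FOR_OP here are invented for the illustration. The operation codes are assumed to follow the Py_LT through Py_GE constants (0 through 5).

```python
# Operation codes, mirroring Py_LT .. Py_GE in CPython's C headers.
Py_LT, Py_LE, Py_EQ, Py_NE, Py_GT, Py_GE = range(6)

# Hypothetical table: maps an operation code to the method name to look up.
NAME_FOR_OP = ["__lt__", "__le__", "__eq__", "__ne__", "__gt__", "__ge__"]

def slot_tp_richcompare_sketch(self, other, op):
    # Look the method up on the type, not on the instance.
    method = getattr(type(self), NAME_FOR_OP[op], None)
    if method is None:
        return NotImplemented
    # Calling a Python-level method here just runs its bytecode
    # in the interpreter; nothing is translated to C.
    return method(self, other)

class C:
    def __lt__(self, other):
        return True

c = C()
lt_result = slot_tp_richcompare_sketch(c, object(), Py_LT)
gt_result = slot_tp_richcompare_sketch(c, object(), Py_GT)
print(lt_result, gt_result)
```

Here lt_result comes from the lambda-style __lt__ defined on C, while gt_result falls through to the default inherited from object, which reports NotImplemented.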

Thanks @eryksun. With your last answer, and looking back at the attached code snippets, it now makes sense. I believe my confusion came from half-baked knowledge about bytecode generation for functions. I was thinking the method would be transpiled to C or something weird, but it’s really just the appropriate bytecode for the magic function being executed on the VM.
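The __hash__ case from the original question works the same way: for a class that defines __hash__ in Python, CPython installs a C slot function (slot_tp_hash) in tp_hash that calls back into the Python method. A small experiment makes the callback visible from the Python side; the class name Key and the calls list are made up for this illustration.

```python
calls = []  # records every time the Python-level __hash__ runs

class Key:
    def __init__(self, value):
        self.value = value

    def __hash__(self):
        calls.append(self.value)
        return hash(self.value)

    def __eq__(self, other):
        return isinstance(other, Key) and self.value == other.value

d = {Key(1): "one"}   # inserting the key hashes it once
found = d[Key(1)]     # the lookup hashes the probe key once
print(found, calls)
```

Both the insert and the lookup happen inside the C dictionary code, yet each one ends up running the Python __hash__ above.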