In short, should using Py_Is and other similar macros/functions be strongly preferred to their explicit variants?
I see a couple of reasons why this would be better practice (as opposed to explicitly typing out the contents of the macro):
“Plenty of people learn the internals from reading the CPython source, so it would be nice to make it easier on them :)” - @ZeroIntensity
There is always the possibility of new additions that would need to break the equivalence of == and Py_Is, such as "Backquotes for deferred expression". Adhering to the C API conventions would make things smoother if something unexpected like that comes up.
I am not suggesting modifying existing code.
But I think it might be worth considering using Py_Is rather than == going forward, in the same spirit as using Py_TYPE rather than its explicit variant.
I consider Py_Is stillborn, like the iso646.h macros or trigraphs. A change that makes Py_Is and == non-equivalent would break not only CPython code, but every single Python extension. It would also likely require changing assignment and comparison with NULL, so PyObject *x = foo() and if (x) would need to be rewritten with some new macros. This is such a large breaking change that it is equivalent to rewriting all extensions from C into another programming language. It would kill the Python ecosystem.
The main case where I see it as better is when you’re rewriting Python code in C. It’s more clear in that case that you actually want is and not equality, but otherwise I use ==.
Just chiming in to say that for PyPy this API would be extremely useful, because PyPy’s “is” is not implementable with a pointer comparison on the C level (due to unboxing we need to compare integers, floats, etc by value). Right now, C extension code that compares pointers is subtly broken and cannot be fixed by us.
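To make the unboxing point concrete, here is a minimal standalone sketch in C. It does not use PyPy's or CPython's real object layout: `ToyObject`, `ToyKind`, and `toy_is` are hypothetical names invented for illustration. It only models the idea described above, that two distinct C pointers can refer to what the language considers the same integer, so a raw pointer comparison and an identity function can disagree.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy object; NOT the layout of any real interpreter. */
typedef enum { TOY_INT, TOY_OTHER } ToyKind;

typedef struct {
    ToyKind kind;
    long value;   /* only meaningful when kind == TOY_INT */
} ToyObject;

/* Identity as an implementation with unboxed ints might define it
 * (an assumption for illustration): ints are "the same object" when
 * their values match; everything else falls back to the address. */
static bool toy_is(const ToyObject *a, const ToyObject *b)
{
    if (a == b) {
        return true;
    }
    if (a == NULL || b == NULL) {
        return false;
    }
    if (a->kind == TOY_INT && b->kind == TOY_INT) {
        return a->value == b->value;
    }
    return false;
}
```

In this model, two separately boxed copies of 42 have different addresses, so `&a == &b` is false while `toy_is(&a, &b)` is true. Extension code that compares with == cannot observe the second answer, which is the gap an identity function is meant to close.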
I’m aligned with Peter’s remark that extension module developers often look to the CPython repo for inspiration. I think it would be worth recommending these (and similar APIs that are agnostic to CPython implementation details) for the C code in Modules/[2].
This is true for any semantic change to any public C API, so I consider this comment off-topic and borderline FUD. Perhaps you are misunderstanding the intentions of the API. Please see @cfbolz’s comment on the original issue for why this API has a purpose.
No other change to the public C API has such an enormous effect. For now, using Py_Is is nothing but code obfuscation. I discourage using it and will not approve any PR that uses it.
If PyPy has issues with using == for identity checks, they have a large issue with the C API.
They do have a large issue - it’s called movable objects. A big part of the reason we can’t even begin to experiment with movable objects, or indirection via handles, is because we use the PyObject pointer as the identity.[1]
I’m not a fan of overly strict guidelines anywhere, but I can certainly see the value in at least allowing people to use Py_Is where it makes sense in their code. Forbidding it is more limiting to ourselves than anyone else.
[1] We can try these under C++, because we can override == there, but not in C.
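The point about handles and movable objects can be sketched with a small standalone model in C. Everything here is hypothetical (`ToyHandle`, `toy_open`, `toy_relocate`, `toy_handle_is`) and is not the HPy or CPython API. It only illustrates why identity has to go through a function once the value the extension holds is no longer the object's address.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_HANDLES 16

/* Hypothetical handle: an index into a runtime-owned table. */
typedef struct { int slot; } ToyHandle;

static void *toy_table[TOY_MAX_HANDLES];
static int toy_used = 0;

/* Open a new handle for obj; slot is -1 if the table is full. */
static ToyHandle toy_open(void *obj)
{
    ToyHandle h = { -1 };
    if (toy_used < TOY_MAX_HANDLES) {
        toy_table[toy_used] = obj;
        h.slot = toy_used++;
    }
    return h;
}

/* The runtime moves an object: only the table is updated,
 * so handles held by extension code stay valid. */
static void toy_relocate(void *from, void *to)
{
    for (int i = 0; i < toy_used; i++) {
        if (toy_table[i] == from) {
            toy_table[i] = to;
        }
    }
}

/* Identity check: compare what the handles refer to, not the handles. */
static bool toy_handle_is(ToyHandle a, ToyHandle b)
{
    if (a.slot < 0 || b.slot < 0) {
        return false;
    }
    return toy_table[a.slot] == toy_table[b.slot];
}
```

Two handles opened for the same object get different slots, so comparing the handles directly says "different" while `toy_handle_is` says "same", and that answer survives a call to `toy_relocate`. In plain C there is no way to make == give that result, which is the limitation the footnote refers to.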
I think this is a thorny problem (how friendly do we want to be for PyPy) but it’s too early for threats like “I won’t approve”. Let’s keep the discussion open.
I think the way to support PyPy and GraalPython is with HPy or something like it.
Adding macros like Py_Is is the worst of both worlds. It doesn’t by itself allow C extensions to work with PyPy or GraalPython, but it does obfuscate the C code, and it uses a name that might have been useful with an HPy-like interface in the future.
HPy has an HPy_Is(ctx, x, y) function. Using Py_Is(x, y) eases the migration to HPy: it’s easier to replace Py_Is() with HPy_Is() than to go through all x == y and x != y comparisons to check whether objects are being compared.
I understand that changing every existing == to Py_Is is way too disruptive in most cases. This is particularly true for code that is part of CPython, which is not going to be used in alternative Python implementations anyway.
However, there is still value in having the function as part of the API, for several reasons:
code generators like Cython can generate it
when an extension module reports a bug when running on PyPy that is caused by using ==, there is a way to fix it that works on all implementations.
Also, I wanted to give some more background on how PyPy does things: right now, we mostly don’t break == in C, at the cost of a much less efficient emulation of the C-API. There is one situation where we do break ==, but that happens only for immutable builtin types, where users mostly already know that you shouldn’t depend on identity. Python’s is deals with that case correctly, but when using == there is no way to achieve this fix in C extensions.
The research I recall from extreme programming is that it is better to only code what you need and not attempt to predict what you may need in the future. The research shows it is wasted effort as the predictions are almost always wrong.
When HPy is ready I plan to port to it, but I will not change code beforehand in an attempt to make the port possibly easier.