Cython and Numpy like to access the internals of CPython data structures and make assumptions about the semantics of struct fields.
This is problematic for (at least) two reasons:
We can’t change anything for fear of breaking third-party code.
We break third-party code, despite being careful, because our understanding of the semantics of C structs differs from that of Cython/Numpy.
Therefore, I propose adding a lot of inline static C functions to control access to C structs.
These would be guarded by
static inline int _PyLong_IsNegative(PyLongObject *l)
{
    return l->long_value.ob_size < 0;
}
Initially, most (or all) of these functions won’t have normal API equivalents, in order to avoid making the C-API even larger than it currently is.
We could add the slower stable versions, if there is demand for them.
When we change the internals of PyLongObject again, there would be no need for Cython/Numpy to change their code, although they would need to be recompiled.
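To make the recompile-only point concrete, here is a minimal sketch. It uses a mock struct, not the real PyLongObject, and every name in it (MockLong, MockLong_Init, MockLong_IsNegative, MOCK_NEW_LAYOUT) is hypothetical. The caller, count_negatives, only goes through the accessor, so switching layouts with -DMOCK_NEW_LAYOUT needs a rebuild but no source change.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyLongObject; NOT the real CPython struct. */
#ifdef MOCK_NEW_LAYOUT
/* Layout B: the sign is the low bit of a tag word, the digit count is the rest. */
typedef struct {
    size_t tag;              /* (ndigits << 1) | is_negative */
    unsigned int digits[4];
} MockLong;

static inline void MockLong_Init(MockLong *l, size_t ndigits, int negative)
{
    l->tag = (ndigits << 1) | (negative ? 1u : 0u);
}

static inline int MockLong_IsNegative(const MockLong *l)
{
    return (int)(l->tag & 1);
}
#else
/* Layout A: a negative size field means a negative number. */
typedef struct {
    ptrdiff_t size;
    unsigned int digits[4];
} MockLong;

static inline void MockLong_Init(MockLong *l, size_t ndigits, int negative)
{
    l->size = negative ? -(ptrdiff_t)ndigits : (ptrdiff_t)ndigits;
}

static inline int MockLong_IsNegative(const MockLong *l)
{
    return l->size < 0;
}
#endif

/* "Third-party" code: it never touches struct fields directly,
 * so it compiles unchanged against either layout. */
static inline int count_negatives(const MockLong *values, size_t n)
{
    int count = 0;
    for (size_t i = 0; i < n; i++) {
        count += MockLong_IsNegative(&values[i]);
    }
    return count;
}

/* Small usage example; passes under both layouts. */
static inline void example_usage(void)
{
    MockLong v[2];
    MockLong_Init(&v[0], 1, 0);   /* one digit, non-negative */
    MockLong_Init(&v[1], 2, 1);   /* two digits, negative */
    assert(count_negatives(v, 2) == 1);
}
```

Code that read the size field directly would silently get the wrong answer under layout B, which is the kind of breakage described above.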
This is a limited solution to a limited problem. For a more general solution, see PEP 689 -- Unstable C API tier.
Could we add these as a new header file that has to be included explicitly?
February 7, 2023, 1:11pm
How is this different from the unstable API tier?
We #include longintrepr.h in Python.h, so all code already has access to the internals.
Adding a new file makes the new functions less discoverable.
This is for a limited use case that we need to fix really soon.
February 7, 2023, 1:18pm
So, just time?
The unstable API PR is up for review: gh-101101: Unstable C API tier (PEP 689) by encukou (python/cpython pull request #101102).
It’s missing the Devguide docs, but that should be fine here.
TBH I think having different APIs is, in general, a bad idea.
Multiple ABIs, sure. Trading performance for portability.
But there should only be one API, IMO.
Maybe we should narrow the scope here.
UNSTABLE_PYLONG_ABI as it is the access to PyLongObject that we need to fix.
February 7, 2023, 1:34pm
And yet, here you’re suggesting a new, exclusive API. What am I missing?
February 7, 2023, 1:41pm
Why make these internal APIs (= with underscore) instead of public ones?
IMO, having a rich Python C API solves most of the “hiding away internals” problem in the most effective way. If extensions don’t need to reach into internal struct fields to have a fast way to determine, e.g., whether a Python int is negative, we’d come closer to putting an abstraction layer between core Python and the huge set of Python extensions, allowing the internals to move forward independently of the extensions’ use of the API.
Long term, there is probably no reason not to have these functions as part of the API.
But for 3.12, at least, we want to keep these semi-private until we are sure that we have the API right.
February 7, 2023, 1:47pm
That’s what the unstable tier is for. PEP 689 even has this example:
If PEP 590’s “provisional” _PyObject_Vectorcall was added today, it would be initially named
That’s an unstable API, which is different from this use case (an unstable ABI).
We can have a stable API built on a stable or an unstable ABI, depending on compile-time flags.
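A rough sketch of what that could look like, using a mock type rather than anything from CPython. All names here (DemoInt, DemoInt_IsNegative, DemoInt_IsNegative_Func, DEMO_UNSTABLE_ABI) are made up for illustration. The caller always writes DemoInt_IsNegative(x); the flag only decides whether the struct layout is baked into the caller or hidden behind a function call.

```c
#include <assert.h>

/* Hypothetical stand-in type; NOT a real CPython struct. */
typedef struct {
    long size;   /* negative size means negative number */
} DemoInt;

/* Out-of-line version, exported by the "runtime". A caller that only uses
 * this symbol depends on the function's signature, not on the layout. */
int DemoInt_IsNegative_Func(const DemoInt *x);

#ifdef DEMO_UNSTABLE_ABI
/* Fast path: the field access is compiled into the caller, so the caller
 * must be rebuilt whenever the layout changes. */
static inline int DemoInt_IsNegative(const DemoInt *x)
{
    return x->size < 0;
}
#else
/* Portable path: same API name, but it forwards to the exported function,
 * so a layout change only requires rebuilding the runtime. */
static inline int DemoInt_IsNegative(const DemoInt *x)
{
    return DemoInt_IsNegative_Func(x);
}
#endif

/* "Runtime" side: the only place that has to know the layout
 * when the portable path is used. */
int DemoInt_IsNegative_Func(const DemoInt *x)
{
    return x->size < 0;
}

/* Caller code is identical under both settings of the flag. */
static inline void example_caller(void)
{
    DemoInt minus_three = { -3 };
    DemoInt five = { 5 };
    assert(DemoInt_IsNegative(&minus_three));
    assert(!DemoInt_IsNegative(&five));
}
```

In a real split the portable path would see only an opaque declaration of the struct; the sketch keeps everything in one file so it compiles on its own.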
February 7, 2023, 2:19pm
I was referring to this kind of “semi-private”, provisional API.
February 8, 2023, 11:44am
I guess there was a misunderstanding somewhere. What are the stability expectations you need?
I thought it’s:
ABI can change up to the first release candidate, then is stable in patch releases
API can change up to the first release candidate, then is stable in patch releases
That’s the regular ABI and unstable API. What should be different here?