Python ABIs and PEP 703

IMO the nogil project is likely to fail if we don’t offer a “simple” solution for distributing C extensions that work on both the regular (GIL) Python build and the nogil Python build.

I do not think having to build an extra wheel will cause the nogil project to fail, even if it adds some inconvenience for package maintainers.

However, I am concerned that tying PEP 703 implementation to other issues like fixing the C API in general or requiring the stable ABI will cause the project to fail.

What I don’t get is: how can a C extension maintainer support nogil and Python 3.12 with the stable ABI?

This is discussed in the second paragraph of this section. They would need to build an additional wheel per platform.

What happens when a stable-ABI extension compiled on Python 3.12 is used on a Python 3.13 nogil build?

It would not load.

Honestly, the nogil change sounds big enough to motivate creating a new abi4 version

I don’t think calling these ABI changes abi4 solves any problems; it only introduces problems. The proposal adds a few functions to the stable ABI whose functionality already exists (as direct field access). The proposed ABI changes to cp313-abi3 keep the ABI compatible with older Python versions.

1 Like

Will I be able to build and install --disable-gil next to --no-disable-gil in parallel?

As a packager of popular packages with abi3 wheels (who is very excited about the glorious nogil future)-- my view is that as long as:

a) This is a one-time thing (i.e., there are abi3 wheels targeting minimum versions through 3.12, and then 3.13+)
b) pip correctly selects between them (including older pip versions)

Then we’re probably ok. We cannot afford to do a build for every single Python version, but we can do 2x versions.

This is purely from a “package maintainer burden” perspective, I’m not thinking through the other considerations here.

2 Likes

I think the answer is “only to the same extent that you can install multiple versions of Python next to each other”. There is no plan for a unified installer.

Fair enough. It’s actually pretty easy to do so for pythonX.Y on systems I care about, much harder for pythonX.Y.Z (e.g. you need to use different installation directories when installing from source). gil vs no-gil installations of the same pythonX.Y will likely also require different installation directories and can’t live side-by-side.

4 posts were split to a new topic: Naming the Python binary when --disable-gil is set

Depends on what you mean by “in parallel”. You could do a “fat” wheel where you put two .so/.dll files next to each other for each build as long as we make sure the file names we search for in Python itself won’t clash. Otherwise it’s like installing any other package unless you want to change the import system.

Pip’s logic uses packaging.tags, which calculates the list of acceptable wheel tags. The code specifically related to the stable ABI is covered in:

So old versions of pip today will consider an abi3 wheel compatible as long as it targets the version of Python being installed for, or any older version. If, at some magical point in time, the abi3 wheel tag comes to mean something different for Python 3.13 than it did for all previous versions, then that will only work with new pip versions. If you change the ABI version (i.e., abi4), then those wheel files will simply be ignored by old pip versions.

1 Like

Just to give more information on the use cases you need to cover:

I maintain a stable-ABI wheel which we distribute to our customers, so they can run on anything from Python 3.7 up. (I know 3.7 is now EOL, but it is still supported by some of our supported Linux distros, and there isn’t a lot of advantage in changing to 3.8+).

I am currently working on switching to use Python 3.11 to build this wheel. I do not expect to be able to use Python 3.13 for at least two years (possibly quite a bit longer).

I don’t actually care about no-GIL builds at the moment, which I currently regard as very experimental, so my requirements are:

  • It must be possible to load a stable-ABI .so which targets python 3.7+ into a “with GIL” build of Python 3.13+.
  • Pip must successfully install a cp37-abi3 wheel into a with-GIL build of Python 3.13+.
  • Pip must refuse to install a cp37-abi3 wheel into a “no GIL” build of Python 3.13+
  • There must be a plan for how I can support both GIL and no-GIL builds of Python in the future. This would either involve building two wheels (GIL and no-GIL or cp37/cp313), or building a “fat wheel” which contains both shared objects, and loads one or the other conditionally.

Other thoughts:

It is a real shame that when the stable ABI was defined, it wasn’t done properly, with all interactions with the python run-time done via function calls and completely opaque pointers. However, until we get the Software Development Time Machine™ working, it’s too late for that.

If we are going to have to make the change to make incref and decref be via function calls, I think we should take the opportunity to make everything else be via function calls too, so that we don’t have to go through this ever again.

2 Likes

This is where I would like the API/ABI to go:

  • Stable ABI should have first-class support for non-C languages, which can only access exported symbols, not macros or inline functions.
  • Stable ABI should be implementable by non-CPython implementations, which might not use refcounting or immortality.

I plan to discuss this at the sprint & come up with a high-level direction proposal. It might turn out we don’t want to go this way, in which case the rest of this post is moot, but if those sound like good ideas, read on :‍)


Some rules of thumb follow. Of course Py_REFCNT & co. are very special and can bend these, but only if necessary.

  • All API functions should be exported as proper symbols (perhaps in addition to macros/inlines for C)
  • All functions should be able to signal failure. (This allows us to deprecate them cleanly – with a runtime warning that -Wall may turn into an exception.)

With that, I’d change your proposal to add a bunch of fully public functions (a rough header sketch follows the list):

  • PyObject_Refcnt: returns SOME_HUGE_VALUE if refcounting does not apply, returns -1 with an exception set on error (even though CPython will currently never raise here)
  • int PyObject_SetRefcnt: returns 1 on success, 0 if it did nothing (refcounting does not apply), -1 with an exception set on error. (Users must handle the 0 case, but AFAIK this function is mainly used for happy fast paths where you can switch to the slow path)
  • PyObject_IsImmortal: returns 0 for non-immortal (incl. if the implementation doesn’t use immortality), 1 for immortal, and -1 with an exception set on error
  • PyObject_GetObSize: returns -1 & sets an exception on error. (I prefer the weird-sounding ObSize to avoid confusion with PyObject_Size which calls __len__.)
  • int PyObject_SetObSize: returns -1 & sets an exception on error
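
To make that concrete, here is a rough header sketch. None of these functions exist in CPython today; the names come from the list above, and the exact signatures (in particular the Py_ssize_t types and parameter names) are my guesses:

```c
/* Sketch only: proposed additions to the stable ABI, not existing
 * CPython API.  Signatures are guesses based on the list above. */

/* Refcount of `op`; SOME_HUGE_VALUE if refcounting does not apply;
 * -1 with an exception set on error. */
Py_ssize_t PyObject_Refcnt(PyObject *op);

/* 1 on success; 0 if it did nothing (refcounting does not apply);
 * -1 with an exception set on error. */
int PyObject_SetRefcnt(PyObject *op, Py_ssize_t refcnt);

/* 1 if `op` is immortal; 0 if not (including when the implementation
 * does not use immortality); -1 with an exception set on error. */
int PyObject_IsImmortal(PyObject *op);

/* ob_size of a variable-sized object; -1 with an exception set on error. */
Py_ssize_t PyObject_GetObSize(PyObject *op);

/* 0 on success; -1 with an exception set on error. */
int PyObject_SetObSize(PyObject *op, Py_ssize_t size);
```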

In version-specific builds, these can be inline functions with assert(result != -1), so compilers can remove users’ error handling.
(And yes, I know, users will forget to add the error handling if it does nothing. That’s a bug to be fixed in their code.)
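
For illustration, one of them could look roughly like this in a version-specific build. This is a sketch; it assumes the 3.12 internal _Py_IMMORTAL_REFCNT marker and that the check can never actually fail inside CPython:

```c
#include <Python.h>   /* also pulls in <assert.h> */

/* Hypothetical version-specific (non-limited-API) definition.  In CPython
 * itself this can never fail, so the assert documents that the -1 error
 * branch stable-ABI users must write is dead code here.
 * _Py_IMMORTAL_REFCNT is a 3.12 internal detail, used only for the sketch. */
static inline int
PyObject_IsImmortal(PyObject *op)
{
    int result = (Py_REFCNT(op) == _Py_IMMORTAL_REFCNT);
    assert(result != -1);
    return result;
}
```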

Note how PyObject_SetRefcnt turns what a C programmer would expect to be a compile-time error into a runtime one. This is what the stable ABI should generally do. (We can of course also add compile-time deprecations/errors later, for users who happen to recompile.)
Note that Py_INCREF cannot raise. A moving GC might need it to “pin” the object (possibly with high overhead, which is OK for the stable ABI). But I can’t think of a similar hack for PyObject_SetRefcnt, so let’s allow it to raise.

(That was all assuming replacements for Py_REFCNT and Py_SET_REFCNT are necessary for stable-ABI extensions. Perhaps we can avoid including them at all?)


IMO, one point is missing from the proposal: PyObject should become opaque, so e.g. all uses of obj->ob_refcnt need to switch to Py_REFCNT(). That’s the biggest pain point in the stable ABI, and if we’re doing any kind of compatibility break we should include this one.
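
To illustrate the migration (Py_REFCNT() and Py_SET_REFCNT() already exist today; the break is only that the field itself would stop being visible):

```c
#include <Python.h>

/* Before: direct field access -- requires a non-opaque PyObject and bakes
 * the object header layout into the compiled extension. */
static void
bump_refcount_old(PyObject *obj)
{
    obj->ob_refcnt++;
}

/* After: accessor-based -- keeps working even if the object header layout
 * changes (for example under a no-GIL build). */
static void
bump_refcount_new(PyObject *obj)
{
    Py_SET_REFCNT(obj, Py_REFCNT(obj) + 1);
}
```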

Like Victor, I would like to call the new ABI abi4. But here’s the twist: we can make abi4 extensions compatible with abi3, as long as they don’t use the newly added functions like Py_REFCNT/PyObject_Refcnt!

Here’s the situation I’d like to get:

  • Which CPython can use an extension with the given ABI:

    |                | 3.8-abi3 | 3.12-abi3 | 3.13-abi3 | 3.8-abi4 | 3.12-abi4 | 3.13-abi4 |
    |----------------|----------|-----------|-----------|----------|-----------|-----------|
    | CPy 3.8        | ✓        | ✘         | ✘         | ✓*       | ✘         | ✘         |
    | CPy 3.12       | ✓        | ✓         | ✘         | ✓*       | ✓*        | ✘         |
    | CPy 3.13       | ✓        | ✓         | ✓         | ✓        | ✓         | ✓         |
    | CPy 3.13-nogil | ✘**      | ✘**       | ✘**       | ✓        | ✓         | ✓         |

    *) packaging tools will need to rename the extension file on wheel install, so that the old Python recognizes it
    **) cannot work as it possibly uses ob_refcnt directly

  • Which CPython can compile an extension with the given ABI:

    |                | 3.8-abi3 | 3.12-abi3 | 3.13-abi3 | 3.8-abi4 | 3.12-abi4 | 3.13-abi4 |
    |----------------|----------|-----------|-----------|----------|-----------|-----------|
    | CPy 3.8        | ✓        | ✘         | ✘         | ✘        | ✘         | ✘         |
    | CPy 3.12       | ✓        | ✓         | ✘         | ✘        | ✘         | ✘         |
    | CPy 3.13       | ✓        | ✓         | ✓         | ✓        | ✓         | ✓         |
    | CPy 3.13-nogil | ✘        | ✘         | ✘         | ✓        | ✓         | ✓         |

Eventually we drop the abi3 series (and force everyone off ob_refcnt), but that’s a separate discussion.


And for clarity:

Yes. The only remaining things that aren’t function calls are the refcount, ob_type, and ob_size.
(The other non-opaque structs are the interop-related Py_buffer and various “blueprint” structs like PyType_Spec – of which one, PyModuleDef, is a PyObject which will cause some trouble for abi4.)

5 Likes

I don’t think we should worry too much about whether old versions of pip work properly for the experimental, --disable-gil build of CPython 3.13. As it is, old versions of pip frequently do not work with new versions of Python. For example, pip==23.1.1 and older (from just 5 months ago) will break if installed in CPython 3.13 (missing pkgutil.ImpImporter).

  1. 3.13-abi3 doesn’t use ob_refcnt directly. That was changed to use a function call in 3.12. If we can make abi4 work with GIL and --disable-gil then we can probably do the same for 3.13-abi3.
  2. You don’t state it explicitly, but I guess the idea is that refcounting in abi4 could use the older Py_IncRef function call, which is available in CPython 3.8 (a sketch follows below the list).
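
To illustrate item 2: refcounting through exported functions rather than macros is already possible today, since Py_IncRef() and Py_DecRef() have been exported functions for a long time. This is only a sketch of the idea, not a statement of how abi4 would be specified:

```c
#include <Python.h>

/* Illustration only: take and drop a reference through exported functions
 * instead of the Py_INCREF/Py_DECREF macros, so the extension binary never
 * touches ob_refcnt itself. */
static PyObject *
keep_a_reference(PyObject *obj)
{
    Py_IncRef(obj);     /* exported function, available in old CPython versions */
    return obj;
}

static void
drop_a_reference(PyObject *obj)
{
    Py_DecRef(obj);     /* function-call counterpart of Py_DECREF */
}
```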

Ooof, I’m not sure how to address PyModuleDef.

1 Like

Pip’s policy is that you should always upgrade to the latest version. I think it’s fine if an older version breaks for newer Python. But I’m less sure that silently installing the wrong package is acceptable. People do use older versions of pip, and loud breakages aren’t the same as silent errors. But we (pip) don’t have a specific policy on this.

I don’t personally have a strong opinion on this, but I do think that if we want people to try out the free-threaded builds, having to debug ABI compatibility issues like this is likely to be rather off-putting for them.

2 Likes

As a package maintainer, I beg of folks: pip silently doing the wrong thing creates a flood of issues for packages like pyca/cryptography.

5 Likes

Py_INCREF doesn’t use ob_refcnt, but the field is exposed and can still be used directly. Disallowing that (along with ob_type/ob_base/ob_size, and stuff like PyObject_HEAD and anything else that needs PyObject struct size) is the necessary API break. (edit: API break, not just ABI)

Well, PyModuleDef should stop being a PyObject, but the details might be off-topic in this thread.

1 Like

I think this is a critical missing piece, so I’d appreciate any suggestions on how to handle it.

1 Like

Do you suggest that users would check PyObject_Refcnt and take different code paths depending on whether reference counting applies? Is this meant to just distinguish immortal objects, or distinguish whether the underlying runtime uses reference counting?

Do you actually need and want to distinguish those situations? Could the API be made generic such that the user does not have to worry about and cannot abuse implementation details, such as immortal objects, or in general which memory management strategy the runtime uses?

Let’s first ask the question: what is the use case for PyObject_Refcnt in third-party (non-CPython) code? The only one I can think of is checking whether an object is shared or not, and in that case PyObject_Refcnt(x) == K (where K may be 1 or 2 depending on the context) is sufficient and will continue to work OK.
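
As a concrete sketch of that pattern, using today’s spelling Py_REFCNT and assuming the caller passes in a reference it owns:

```c
#include <Python.h>

/* Takes ownership of `buf`; returns an object that is safe to mutate.
 * Sketch of the "is this object shared?" idiom only, not a recommendation. */
static PyObject *
unshared_copy(PyObject *buf)
{
    if (Py_REFCNT(buf) == 1) {
        return buf;                     /* sole owner: reuse in place (fast path) */
    }
    PyObject *copy = PyObject_CallMethod(buf, "copy", NULL);
    Py_DECREF(buf);                     /* drop our reference to the shared original */
    return copy;                        /* may be NULL if copy() failed */
}
```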

2 Likes

What is that useful for?

I assume this is all about third-party code; I don’t see a reason why CPython would internally make a call (to a non-inline function) for something like this. In general, the public API for extending (C)Python should be separate from the internal APIs used inside CPython: they have different purposes, different guarantees, and different lifespans, and mixing the two is IMO the source of some of these issues.

For example, to decide whether to allow mutation, or perhaps for copy-on-write optimizations.
I’m not saying it’s good practice to rely on Python refcounting for that, btw.

Right. Checking whether an object is shared or not, by comparing the refcount to some K, sounds like you’re deep enough in the internals to want the version-specific ABI, at least.

2 Likes

OK. Starting from:

  • If PyObject becomes opaque, PyModuleDef_Base must lose its PyObject_HEAD so that users can define it.
    It follows that PyModuleDef must not be cast to PyObject (and we can bikeshed on how to document/enforce that for users).
    FWIW, in CPython, the only use for it being a PyObject is determining whether the PyInit_* function returned a def (multi-phase init) or a complete module object (single-phase init).

The design space here is pretty big and needs more thought than I can give it right now, but a solution exists:

  • PyModuleDef_Init can return a newly allocated PyObject that wraps the provided PyModuleDef. (This object will leak, but possibly only once per PyModuleDef per process, if it’s stored & reused.)
    To allow users to avoid that, we could make CPython look for a symbol named PyModDef_<name> before it looks for PyInit_<name>. PyModDef_<name> would contain the module definition rather than a function that returns it. (And so users get to write less boilerplate, not just avoid a small leak!) A sketch of the two spellings follows below.
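
To make the comparison concrete, here is a sketch of today’s init boilerplate next to the hypothetical exported-definition spelling; the PyModDef_<name> convention is only the idea from the bullet above, not anything that exists:

```c
#include <Python.h>

static PyModuleDef examplemodule = {
    PyModuleDef_HEAD_INIT,
    .m_name = "example",
};

/* Today: an exported init function hands the definition to CPython. */
PyMODINIT_FUNC
PyInit_example(void)
{
    return PyModuleDef_Init(&examplemodule);
}

/* Hypothetical alternative from the bullet above: export the definition
 * itself under an agreed-upon name and let CPython find it directly.
 * (PyModDef_<name> is not real CPython API.)
 *
 * PyModuleDef PyModDef_example = { ... };
 */
```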

There might be better ways with fewer downsides.

1 Like