Making `PyObject` opaque in the limited API

Hello,
I’ve spent some time gathering thoughts about a new version of the stable ABI, and I’d like to start some focused discussion around it.

There are a lot of things to design and discuss (EOL dates, file/wheel names, and so on), but I think deciding what to do with the PyObject struct is on the critical path.

To keep lengthy background info out I’ll take the following for granted; if you disagree then ping me and I’ll write a longer post for separate discussion. (Otherwise I’ll save it for the PEP.)

  • Having an opt-in, long-term-stable ABI is good :)
  • We need a stable ABI for free-threaded builds.
  • If we break the stable ABI, we should keep API compatibility, as much as possible. Ideally, users just rebuild existing code with different settings.

The main problem

Exposed PyObject and PyVarObject are the major known issue in stable ABI that can’t be solved with deprecations or API-compatible changes. The fixed layout prevents some features/optimizations (as seen in the free-threaded builds where the struct is entirely different); almost everything else in stable ABI (functions, interop structs, etc.) can be deprecated but kept ABI-compatible.

Making PyObject opaque has two parts. The relatively easy one is disallowing access to the members (ob_refcnt, ob_type, ob_size). We (mainly Victor, thanks!) have been adding function accessors to these; we can add a few more and leave the rest as Python API (e.g. instead of Py_SET_TYPE, setattr the __class__).

The hard part is that sizeof(PyObject) is unknown.
This means you can’t embed PyObject in custom class structs, as in “The Basics” in the tutorial:

typedef struct {
    PyObject_HEAD
    /* Type-specific fields go here. */
} CustomObject;

This is a giant API compatibility break, on par with disallowing static types.

PEP 697 (Limited C API for Extending Opaque Types) adds rudimentary API for opaque types, but it was only added in 3.12, and it was meant as a first step, not really to support making all objects opaque.
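To make the opaque approach concrete, here is a minimal toy model in plain C. It is a sketch, not the CPython API: `ToyObject`, `toy_alloc`, `toy_get_data` and `CustomData` are all hypothetical names. The extension half only sees an incomplete type, so it cannot embed the header or use `sizeof` on it; its own data lives after the header, at an offset only the runtime half knows.

```c
#include <stddef.h>
#include <stdlib.h>

/* ---- What the extension sees: an incomplete type plus accessors ---- */
typedef struct ToyObject ToyObject;        /* opaque: no sizeof, no members */

ToyObject *toy_alloc(size_t extra_bytes);  /* header + zeroed extension data */
void *toy_get_data(ToyObject *obj);        /* pointer to the extension data */

/* Extension state lives in its own struct; no header is embedded. */
typedef struct {
    int counter;
} CustomData;

ToyObject *custom_new(void) {
    return toy_alloc(sizeof(CustomData));
}

int custom_increment(ToyObject *obj) {
    CustomData *data = toy_get_data(obj);
    return ++data->counter;
}

/* ---- What the runtime hides: this layout may change between builds ---- */
struct ToyObject {
    long refcnt;
    const void *type;
};

/* Round the header size up so the extension data is suitably aligned. */
static size_t toy_data_offset(void) {
    size_t align = _Alignof(max_align_t);
    return (sizeof(struct ToyObject) + align - 1) / align * align;
}

ToyObject *toy_alloc(size_t extra_bytes) {
    ToyObject *obj = calloc(1, toy_data_offset() + extra_bytes);
    if (obj != NULL) {
        obj->refcnt = 1;
    }
    return obj;
}

void *toy_get_data(ToyObject *obj) {
    return (char *)obj + toy_data_offset();
}
```

In the real API, PEP 697 covers the same ground with a negative `basicsize` in `PyType_Spec` and `PyObject_GetTypeData`; the extra function call on every data access is the cost that embedding `PyObject_HEAD` avoids.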

A more complete solution would be Mark’s Grand Unified Python Object Layout. I guess that will become a priority.

Possible solutions

Let me brainstorm some ways we can solve this.

Just hide it

The easiest for CPython devs would be to stop exposing PyObject in limited API 3.14. This means stable ABI 3.14+ would be compatible with both regular and free-threaded builds. Build tools won’t need to change. But users would need to redo how they define types, and until they do, they won’t be able to use the 3.14 additions to the limited API.

Split the limited ABI

We could make several variants of stable ABI, something like:

  • “abi3” – sizeof(PyObject) is fixed. This is the current limited API: API-compatible with existing code, and ABI-compatible with all (3.2+ [1]) non-free-threaded builds.
  • “abi3t” – sizeof(PyObject) is fixed, but higher. This is also API-compatible with existing code; ABI-compatible with all free-threaded builds (3.14+?).
  • “abi4” – PyObject is opaque. This will not be API-compatible (i.e. needs changing the source), but compiled extensions can be ABI-compatible with all builds (even old ones, though most extensions would need 3.12+ API for custom types).

That means that stable-ABI projects will have two options:

  • Build twice, for abi3 and abi3t. No source changes needed.
  • Port to abi4, and build once.

I consider building twice OK: I’ve been convinced that stable ABI shouldn’t be about building exactly once, but about building for a range of versions. See cryptography which currently builds for cp37-abi3 and cp39-abi3. Other projects might build a version-specific (optimized) wheel for, say, cp313 and cover the rest with abi3.
(There exist projects where introducing a “for loop” in the build system would be problematic; those would need to choose just one of the three options.)
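As a rough illustration of why abi3 and abi3t would need separate builds of the same source, the sketch below compiles one custom struct against two object headers. Both headers are simplified assumptions for this example (field names and widths are made up, not the real CPython definitions); the only point is that the offset of `value` gets baked into the compiled code and differs between the two.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative header for a default build: refcount plus type pointer. */
typedef struct {
    intptr_t refcnt;
    const void *type;
} ToyHeaderDefault;

/* Illustrative header for a free-threaded build: more per-object state.
 * Field names and widths are assumptions made for this sketch. */
typedef struct {
    uintptr_t owner_thread;
    uint32_t flags_and_lock;
    uint32_t local_refcnt;
    intptr_t shared_refcnt;
    const void *type;
} ToyHeaderFreeThreaded;

/* The same extension source, compiled once against each header. */
typedef struct { ToyHeaderDefault head; int value; } CustomDefault;
typedef struct { ToyHeaderFreeThreaded head; int value; } CustomFreeThreaded;

/* Offset of `value` as it would be hard-coded into each binary. */
size_t value_offset_default(void) {
    return offsetof(CustomDefault, value);
}

size_t value_offset_free_threaded(void) {
    return offsetof(CustomFreeThreaded, value);
}
```

A binary built with the first offset would read the wrong bytes on an interpreter using the second header, which is why the source stays the same but the build has to be repeated.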

Freeze PyObject size

A cop-out would be to change PyObject size once and make it again part of stable ABI. This change, and any future ones, would break the ABI and require users to rebuild – but we could say in advance that it could happen sometimes, and add versioning & version checking to support it.

This is roughly equivalent to the “Split the limited ABI” option but removing abi3 without any deprecation (and not introducing abi4 at all).
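The “versioning & version checking” part could look something like the sketch below. Everything here is hypothetical (`ToyAbiInfo`, the constants, and the check itself are invented for illustration, not an existing or proposed CPython API): the extension records the layout it was compiled against, and the runtime refuses to load it on a mismatch.

```c
#include <stddef.h>

/* Hypothetical ABI descriptor exported by an extension. */
typedef struct {
    unsigned layout_version;  /* bumped whenever the object header changes */
    size_t header_size;       /* header size the extension was built against */
} ToyAbiInfo;

/* Values for the running interpreter (made up for this sketch). */
#define TOY_RUNTIME_LAYOUT_VERSION 2u
#define TOY_RUNTIME_HEADER_SIZE ((size_t)32)

/* Returns 1 if the extension may be loaded, 0 if it must be rebuilt. */
int toy_abi_is_compatible(const ToyAbiInfo *ext) {
    return ext->layout_version == TOY_RUNTIME_LAYOUT_VERSION
        && ext->header_size == TOY_RUNTIME_HEADER_SIZE;
}
```

A check like this turns a silent memory-layout mismatch into a clear import-time failure, at the price of forcing a rebuild whenever the version is bumped.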


Extra problem: PyModuleDef is a static PyObject

Another effect of making PyObject opaque is that users can’t define statically allocated objects. There’s not much reason to do that in the stable ABI, except for one that every extension needs: PyModuleDef. (Which, if you ask me, should never have been a PyObject, but here we are.)

To solve this, we can add a new non-object variant of PyModuleDef, called e.g. PyModuleSpec, and a PyModuleSpec_ToObject function to allocate a PyModuleDef and fill it up.

PyModuleDef and PyModuleDef_Init can then be soft-deprecated.
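A toy model of that split, in plain C with hypothetical names throughout (`ToyModuleSpec`, `ToyModuleObject` and the two functions are stand-ins, not the proposed API): the static part is plain data with no object header, and the object is allocated by the runtime on request.

```c
#include <stdlib.h>

/* ---- Extension-visible declarations ---- */
typedef struct ToyModuleObject ToyModuleObject;  /* opaque object */

/* Plain data, no object header, so it can be statically allocated. */
typedef struct {
    const char *name;
    const char *doc;
} ToyModuleSpec;

ToyModuleObject *toy_module_spec_to_object(const ToyModuleSpec *spec);
const char *toy_module_name(const ToyModuleObject *module_def);

/* ---- Extension side: a static spec, converted at init time ---- */
static const ToyModuleSpec example_spec = { "example", "An example module." };

ToyModuleObject *example_init(void) {
    return toy_module_spec_to_object(&example_spec);
}

/* ---- Runtime side: the object layout stays private ---- */
struct ToyModuleObject {
    long refcnt;
    const ToyModuleSpec *spec;
};

ToyModuleObject *toy_module_spec_to_object(const ToyModuleSpec *spec) {
    ToyModuleObject *def = malloc(sizeof *def);
    if (def != NULL) {
        def->refcnt = 1;
        def->spec = spec;
    }
    return def;
}

const char *toy_module_name(const ToyModuleObject *module_def) {
    return module_def->spec->name;
}
```

Because the spec carries no header, its size does not depend on the interpreter build, so the same compiled extension works whatever the object layout turns out to be.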


  1. (We might switch to smaller support windows, so “3.2+” might turn out to be e.g. “3.10-3.20”; I’d like to keep details of that out of this discussion.) ↩︎


Yeah, just to state this explicitly from the PyCA maintainers’ perspective: as long as we have O(1) builds, that’s OK. What we can’t/won’t do is O(n), where we need new builds for every Python release.

My main piece of substantive feedback is that for those of us working in non-C extension modules (Rust, in our case), API compatibility doesn’t do a ton, so figuring out what the desired ABI story is is incredibly important to us.


I’m way outside the realm here but why not just have the sizeof the PyObject be the first field in the PyObject? That way, a user can fetch that at runtime. I’m not really sure how that could be a compile time check though.

Or just add a pointer at the end to an opaque ‘extended’ struct then freeze PyObject’s size at that point.


Nothing you say seems too unreasonable.

I know originally there was talk about trying to make Stable ABI builds that would be compatible with Python ~3.8+ and freethreaded. I’ve become convinced this would be a mistake (just because PyList_GetItemRef is so essential to making things thread safe). So I’m comfortable that this idea seems to have been dropped. Having a separate “freethreaded stable” build seems reasonable in the medium-term.

Is it completely impossible for abi3t to support all Python 3.14+ builds (e.g. by making the non-freethreaded build use PyObject+a bit of padding)?

I suspect it’d be a while before Cython was able to support abi4. That’s fine if abi3 (and maybe abi3t) continues to work. I don’t think there’s anything fundamentally incompatible between PEP 697 and how we do things, just a lot of small changes.

It’s almost certainly worth making sure that abi4 includes all the breaking changes that people have been waiting to do. I’m sure you know this already though.


One further thought: how much value is abi3t adding here? Should freethreaded Python + stable ABI just require abi4?


I aim to treat C as the language for describing the API/ABI, and for the implementation on the CPython side. You need to know C to translate to Rust, but you should be OK with the straightforward parts of C. I like to think the rest of the C API WG is on board, more or less enthusiastically.
I don’t actively use other languages though[1]; tell me when you see a rough edge!

For the universally compatible build, #define PyList_GetItemRef PyObject_GetItem.
That is, the proposed abi4 can be compatible with 3.8. Whether it’s a good idea is another question – but that’s more about how long our support windows should be, which I’d like to avoid in this topic.

I’m also sure I forgot something. If you want to make a checklist for the upcoming PEP(s), now is the time!

It’s API-compatible: you’ll need to recompile with different settings[2], but you shouldn’t need any code changes.

The point is to avoid extensions fetching directly from memory, so we can do things like add new fields.
Adding a constant value to every Python object would be quite wasteful of memory, and something similar is needed for every superclass, not just object. So, storage size info is attached to the class. [3]


  1. except ctypes↩︎

  2. for Rust it should be something like: “with an updated PyO3, or whatever else defines PyObject memory layout and build settings for you” ↩︎

  3. See PEP 697 – if everyone used that we wouldn’t have the issue, but, not many projects can switch to a slower, 3.12+ API. ↩︎

ABI changes are few and far between, and abi4 appears to have the potential to be stable for a very long time.

That said, couldn’t there be a change to the Python life cycle (number of years supporting a version, cadence of new versions, …) around special moments like the potential abi4 transition point? This might alleviate some of the maintenance concerns.