Idea: Tracking ABI additions and changes for better pre-release wheel management

I would like to initiate a discussion on the idea of tracking ABI additions and changes in CPython, as suggested by Nick Coghlan (@ncoghlan) in a previous thread about recognizing, managing, and installing non-ABI-stable Python wheels created with alpha/beta versions of Python.

Background
The main issue is that wheels for new Python versions are often not immediately available, causing delays in the adoption of new Python features and improvements. This is due to the ABI’s potential to change during the alpha/beta phase. To address this problem, I would like to discuss a system to manage wheels on PyPI created with non-ABI-stable Python versions, extending the wheel-building period from the current 2-month release candidate phase to the full 12-month development cycle.

Proposal
The proposal involves tracking two new numbers in CPython:

  1. Last ABI addition version
  2. Last ABI change version

At build time, a wheel would record the first number, which bounds the set of APIs the wheel might have used. At install time, that recorded value would be checked against the second number: if the wheel was built against an ABI that’s newer than the last breaking change, then it isn’t at risk of ABI-break induced crashes.
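To make the check concrete, here is a rough sketch of how an installer might compare the two numbers. Everything in it is an assumption for illustration: the names `InterpreterABI` and `wheel_is_safe` are hypothetical, the ABI versions are modelled as plain increasing integers, treating "built at exactly the last breaking change" as safe is my reading, and the second condition (the interpreter must not be older than the wheel's build) is something I inferred rather than part of the original suggestion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterpreterABI:
    """Hypothetical pair of counters exposed by the running interpreter."""
    last_abi_addition: int  # most recent version that added something to the ABI
    last_abi_change: int    # most recent version that changed (broke) the ABI


def wheel_is_safe(wheel_abi_addition: int, interp: InterpreterABI) -> bool:
    """Decide whether a pre-release wheel can be used on this interpreter.

    wheel_abi_addition is the "last ABI addition version" recorded when
    the wheel was built (hypothetical metadata field).
    """
    # No breaking change happened after the wheel was built.
    # Assumption: a build made at the breaking version itself counts as safe.
    no_break_since_build = wheel_abi_addition >= interp.last_abi_change
    # Assumption (not in the proposal): the interpreter must already provide
    # every addition the wheel could have used.
    has_all_additions = wheel_abi_addition <= interp.last_abi_addition
    return no_break_since_build and has_all_additions


# Interpreter whose last breaking change was in version 4 and whose last
# addition was in version 7.
current = InterpreterABI(last_abi_addition=7, last_abi_change=4)
for recorded in (3, 5, 8):
    print(recorded, wheel_is_safe(recorded, current))
```

In this toy model, a wheel recorded at 3 predates the break at 4 and is rejected, a wheel recorded at 5 is accepted, and a wheel recorded at 8 is rejected because it may rely on additions this interpreter does not have.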

Benefits
By tracking ABI additions and changes, pre-release wheels can be useful in development without depending on full test coverage. This will allow for better management of pre-release wheels and speed up the adoption of new Python features and improvements.

Next steps

  1. Continue discussing the idea and gather feedback from the community.
  2. If there is general support, write a PEP outlining the proposed changes.
  3. Implement the ABI number change on the CPython side.
  4. Implement a separate change on the packaging side for wheels to record the appropriate number.

I would love to hear everyone’s thoughts and opinions on this idea!

I can, and do, test against alpha and beta releases so that my projects are ready for a final build.
But until CPython is released, I cannot build final kits and trust that my testing is reliable.

How does the ABI tracking help?

For many projects with light dependencies it’s relatively easy to build them all from source. But building NumPy, Cython, or packages in the geospatial field can be more complex, or just time-consuming, especially in CI.

Allowing pre-release wheels to be uploaded earlier lets downstream dependents of those projects start testing new Python versions without having to build all of their dependencies from source.

I go into much more detail in this thread: Create system to recognize, manage and install non-ABI-stable Python wheels (created with alpha/beta versions of Python)


So you have wheels built against beta builds that may or may not work correctly against the final CPython release. That is risky in my view.

If you could delay the general release of the final CPython build, but make that build available for building wheels, that could work.

A big issue would be formalizing what exactly counts as an ABI change.
There’s abidump, but it’s Linux-only, and it often reports false positives so the Release Manager reviews it manually (when the ABI is frozen).

If the changes are too frequent – close to every Alpha/Beta – the system would bring no benefit.

IMO, porting NumPy/SciPy to stable ABI or HPy, and using version-specific API only for optional version-specific speedups (released as wheels in addition to more universal ones), might very well end up being less work overall.
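For reference, the two kinds of wheels can be told apart by the ABI tag in the filename, which is the second-to-last dash-separated field according to the wheel filename convention. The sketch below is only an illustration: the helper names `abi_tag` and `uses_stable_abi` and the filenames are made up, and it relies only on string handling (Python 3.9+ for `str.removesuffix`).

```python
def abi_tag(wheel_filename: str) -> str:
    """Return the ABI tag field of a wheel filename.

    Wheel filenames end in {python tag}-{abi tag}-{platform tag}.whl,
    so the ABI tag is the second-to-last dash-separated field.
    """
    stem = wheel_filename.removesuffix(".whl")
    return stem.split("-")[-2]


def uses_stable_abi(wheel_filename: str) -> bool:
    """True if the wheel declares the stable ABI (abi3)."""
    # Compressed tag sets join several tags with "." in one field.
    return "abi3" in abi_tag(wheel_filename).split(".")


# Hypothetical filenames: one universal stable-ABI wheel and one
# version-specific wheel of the same made-up project.
print(uses_stable_abi("example_pkg-1.0-cp38-abi3-manylinux_2_17_x86_64.whl"))
print(uses_stable_abi("example_pkg-1.0-cp312-cp312-win_amd64.whl"))
```

An `abi3` wheel like the first one keeps working across later CPython versions, which is why it would sidestep the pre-release problem for any project able to restrict itself to the stable ABI.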

That’s what the Release Candidates are for. Those are ABI-compatible with the final release, so you have a delay of 2 months.

I have always thought of the release candidates as the last chance to find bugs and have them fixed, and that only bugs will be fixed, with no feature-related changes.

I’m experienced enough to know that even seemingly trivial fixes can cause issues.
That’s why I wait for the final release.

Do I have 2 months from the last RC until that last RC becomes the final release?

No. But a 3.x.0 release typically has fewer changes than the 3.x.1 after it.
If no one tested with the RCs, bugs would only get fixed after the final release. A delay wouldn’t help.