I would like to initiate a discussion on the idea of tracking ABI additions and changes in CPython, as suggested by Nick Coghlan (@ncoghlan) in a previous thread about recognizing, managing, and installing non-ABI-stable Python wheels created with alpha/beta versions of Python.
The main issue is that wheels for new Python versions are often not immediately available, causing delays in the adoption of new Python features and improvements. This is due to the ABI’s potential to change during the alpha/beta phase. To address this problem, I would like to discuss a system to manage wheels on PyPI created with non-ABI-stable Python versions, extending the wheel-building period from the current 2-month release candidate phase to the full 12-month development cycle.
The proposal involves tracking two new numbers in CPython:
- Last ABI addition version
- Last ABI change version
Wheel building could then capture the first number (indicating all the APIs that might have been used in the wheel), and wheel consumption could check it against the second number: if the wheel was built against an ABI that's newer than the last breaking change, then it isn't at risk of ABI-break induced crashes.
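To make the comparison concrete, here is a minimal sketch of the proposed check. The function name, the version tuples, and the idea of recording them in wheel metadata are all illustrative assumptions; nothing here exists in CPython or the wheel spec today:

```python
# Hypothetical sketch of the proposed compatibility check.
# "built_against" would be recorded in the wheel at build time (the
# interpreter's "last ABI addition" version); "last_abi_change" would be
# published by the interpreter installing the wheel.

def wheel_is_safe(built_against: tuple, last_abi_change: tuple) -> bool:
    """A wheel built after the interpreter's last breaking ABI change
    cannot crash due to that change, even if new APIs were added since."""
    return built_against >= last_abi_change

# Wheel built against 3.13.0a4, interpreter's last ABI *change* was in
# 3.13.0a2: the wheel postdates every breaking change, so it is safe.
print(wheel_is_safe((3, 13, 0, "a", 4), (3, 13, 0, "a", 2)))  # True

# Wheel built against 3.13.0a1, but a breaking change landed in 3.13.0a2:
# the wheel is at risk of ABI-break induced crashes.
print(wheel_is_safe((3, 13, 0, "a", 1), (3, 13, 0, "a", 2)))  # False
```

Plain tuple comparison works here because pre-release version components sort element by element, which is also why bumping the numbers "to the encoded release number" keeps the check cheap.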
By tracking ABI additions and changes, pre-release wheels can be useful in development without depending on full test coverage. This will allow for better management of pre-release wheels and speed up the adoption of new Python features and improvements.
1. Continue discussing the idea and gather feedback from the community.
2. If there is general support, write a PEP outlining the proposed changes.
3. Implement the ABI number change on the CPython side.
4. Implement a separate change on the packaging side for wheels to record the appropriate number.
I would love to hear everyone’s thoughts and opinions on this idea!
For many projects with light dependencies, it's relatively easy to build them all from source. But building NumPy, Cython, or wheels in the geospatial field can be more complex, or just time-consuming, especially in CI.
Allowing pre-release wheels to be uploaded earlier lets downstream dependents of those projects start testing new Python versions without having to build all dependencies from source.
A big issue would be formalizing what exactly counts as an ABI change.
There’s abidump, but it’s Linux-only and often reports false positives, so the Release Manager reviews its output manually (once the ABI is frozen).
If the changes are too frequent – close to every Alpha/Beta – the system would bring no benefit.
IMO, porting NumPy/SciPy to stable ABI or HPy, and using version-specific API only for optional version-specific speedups (released as wheels in addition to more universal ones), might very well end up being less work overall.
That’s what the Release Candidates are for. Those are ABI-compatible with the final release, so you have a delay of 2 months.
Folks that need the higher performance versions would still face the same problems they do today, though.
As far as the two proposed numbers go, I think even the “ABI addition” number would increment more slowly than the Python version number, since adding new C API functions usually meets objections: the preference is to access interpreter functionality via the existing general-purpose APIs unless there’s a clear performance or usability benefit in expanding the C API.
The “ABI change” number should be updated even less frequently, since we actively avoid changing public struct layouts and function signatures. As more structs become opaque, future updates to this value should become even rarer.
From an ongoing maintenance point of view, the two numbers shouldn’t be that much harder to maintain than the opcode magic number (easier in some ways, since they would be bumped at most once per release, and if they do get bumped, it would just be to the encoded release number rather than to an arbitrary value).
Exposing the numbers in the sys module would only need to happen once, and even exposing them in the docs could probably be automated.
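By analogy with existing release metadata such as `sys.version_info`, the exposure might look something like this sketch. The attribute names and every value below are invented for illustration; neither attribute exists in any CPython release:

```python
# Hypothetical sketch of publishing the two numbers in the sys module,
# modelled on sys.version_info. All names and values here are invented.
import sys
from types import SimpleNamespace

# Stand-in for what a future interpreter could publish.
fake_sys = SimpleNamespace(
    version_info=sys.version_info,
    last_abi_addition=(3, 13, 0, "a", 3),  # hypothetical attribute
    last_abi_change=(3, 13, 0, "a", 1),    # hypothetical attribute
)

# The invariant the thread predicts: breaking changes are rarer than
# additions, so the "change" number never runs ahead of the "addition"
# number.
assert fake_sys.last_abi_change <= fake_sys.last_abi_addition
print(fake_sys.last_abi_addition, fake_sys.last_abi_change)
```

Since both values would only ever be bumped to an encoded release number, they stay trivially comparable with plain tuple ordering, much like the opcode magic number comparison mentioned above.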
Making use of the new numbers in wheel compatibility metadata would be a separate follow-up project that could only be considered if CPython decides to track and publish the values in the first place.