Create system to recognize, manage and install non-ABI-stable Python wheels (created with alpha/beta versions of Python)

I would like to open a discussion about recognizing, managing, distributing and installing non-ABI-stable Python wheels (a.k.a. wheels created with alpha/beta versions of Python), with the goal of speeding up wheel deployment for new Python versions.

In this opening post I will expand on the current situation and problem, propose a potential solution and give some context. But I would mainly like to open a discussion on this topic.

The current situation and problem

Wheels are extremely important for the Python ecosystem, and especially the scientific subsection. Newer Python versions offer many benefits, like new features and faster performance. With the release of a new Python version, for many packages wheels are unfortunately not immediately available for that version.

This is a problem with many aspects, and this feature request proposes a system to partly but significantly improve the situation.

The Python annual release cycle, as described in PEP 602, has 3 phases. In the 7-month Alpha phase new features are added and bugs are fixed. The 3-month Beta phase allows for bug fixes. The 2-month release candidate phase is not clearly described in PEP 602, but in practice only essential bugs and security issues are fixed. Historically, it also starts the ABI-stable phase for that minor Python version (Python 3.10.0rc1 | 3.11.0rc1).

This 2-month phase is where the wheel-building race currently starts for many projects. And while two months should be more than enough for a single project, many projects depend on many other projects. On the scientific stack, for example, Cython needs to come in first, then NumPy, then SciPy and Pandas, then scikit-learn and statsmodels. And if your project depends on one of the latter, this two-month window gets really short.
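To make the chain effect concrete, here is a small sketch that groups the packages named above into "waves" that can only start once the previous wave has published wheels. The dependency edges are a simplified illustration based on the order described above, not an exact picture of each project's build requirements, and `release_waves` is a name made up for this example.

```python
from graphlib import TopologicalSorter

# Simplified, illustrative edges: package -> packages that must have wheels first.
DEPENDS_ON = {
    "Cython": set(),
    "NumPy": {"Cython"},
    "SciPy": {"NumPy"},
    "Pandas": {"NumPy"},
    "scikit-learn": {"SciPy"},
    "statsmodels": {"SciPy", "Pandas"},
}


def release_waves(depends_on):
    """Group packages into waves; a wave can start only after the previous one."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = release_waves(DEPENDS_ON)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

Even in this simplified picture there are four sequential waves that all have to fit into the same two-month window.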

In this window, not only do wheels need to be successfully built for the new Python version, they also need to be included in a tagged release. For many projects, this means backporting to a maintenance branch and tagging a new patch release.

Since amending PEP 602 would be a major undertaking and have many side-effects on the ecosystem, this feature request proposes a system for managing wheels on PyPI created with non-ABI-stable Python versions.

A potential solution

If wheels built with non-ABI-stable Python releases were handled differently in pip, it would be possible to allow (and encourage) uploading those wheels without letting users accidentally install them. In such a system, non-ABI-stable wheels should (by default) only be installed on the exact Python release on which they were built. Options could then be added to loosen this strict version requirement for development purposes (including testing and CI).

The steps to create such a system could look like this:

  1. The exact Python version with which a wheel is built needs to be recognizable. Some help from wheel is needed here, and maybe the wheel specification needs to be modified.
  2. pip should (by default) install wheels generated with an Alpha or Beta Python version only on that exact Alpha or Beta Python version.
  3. A CLI option and/or environment variable needs to be introduced to loosen the strict version requirement above, to make it more usable for CI configurations. This could be called --experimental for example. Options could be 0 (the default), which only installs the wheels on the exact same Python version. 1 could mean only install wheels that are generated with the exact same or an earlier Python version (so on Python 3.12b1 wheels from Python 3.12b0 can be installed, but not vice versa), while 2 could mean install wheels from all Python versions with the same minor version (so install all 3.12 wheels).
    3.1. When option 1 or 2 is used and the wheel was not built with the exact same Python version, a note/warning should always be printed.
  4. Documentation and tests need to be added.
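The decision logic of steps 2, 3 and 3.1 could be sketched roughly as follows. This is not existing pip behaviour: the function names `may_install` and `needs_warning` are invented for illustration, versions are assumed to be written like 3.12.0b1, and restricting level 1 to the same minor version is an assumption on top of the description above.

```python
import re
from typing import NamedTuple

# Ordering of release levels: alpha < beta < release candidate < final.
_LEVELS = {"a": 0, "b": 1, "rc": 2, None: 3}
_PATTERN = re.compile(r"(\d+)\.(\d+)(?:\.(\d+))?(?:(a|b|rc)(\d+))?")


class PyVersion(NamedTuple):
    major: int
    minor: int
    micro: int
    level: int
    serial: int


def parse(version: str) -> PyVersion:
    """Turn '3.12.0b1', '3.12b1' or '3.12.0' into a comparable tuple."""
    match = _PATTERN.fullmatch(version)
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    major, minor, micro, level, serial = match.groups()
    return PyVersion(int(major), int(minor), int(micro or 0),
                     _LEVELS[level], int(serial or 0))


def may_install(wheel_built_with: str, running: str, experimental: int = 0) -> bool:
    """Decide whether a pre-release wheel is acceptable on the running Python."""
    built, current = parse(wheel_built_with), parse(running)
    if experimental == 0:  # default: exact same Python version only
        return built == current
    if (built.major, built.minor) != (current.major, current.minor):
        return False  # assumption: never cross minor versions
    if experimental == 1:  # same or earlier pre-release
        return built <= current
    if experimental == 2:  # anything with the same minor version
        return True
    raise ValueError(f"unknown experimental level: {experimental}")


def needs_warning(wheel_built_with: str, running: str, experimental: int) -> bool:
    """Step 3.1: warn whenever a wheel from a different version is accepted."""
    accepted = may_install(wheel_built_with, running, experimental)
    return accepted and parse(wheel_built_with) != parse(running)


print(may_install("3.12.0b1", "3.12.0b2"))                  # default level 0
print(may_install("3.12.0b1", "3.12.0b2", experimental=1))
print(needs_warning("3.12.0b1", "3.12.0b2", experimental=1))
```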

Then, some optional, but useful extensions for the ecosystem could be implemented:

  1. The PyPI website could display an experimental tag in the Download files tab, together with the exact Python version with which the wheel was generated.
  2. cibuildwheel could be updated to handle this functionality and allow control over it.

With this implemented, the wheel-building period could be extended from the current 2-month release candidate phase to the full 12-month development cycle of Python, which would hugely increase wheel-readiness for a new stable Python version.

Alternative Solutions

Amending PEP 602 would be a major undertaking and have many side-effects on the ecosystem, so is not preferable.

Apart from extending the wheel-distribution period, another angle could be a speedup in getting wheels on PyPI. One option for this could be allowing (periodically) generated wheels from a development branch. This way dependent projects don’t have to wait for a backport and stable release to be tagged. This solution could work in conjunction with the proposed solution in this feature request.

Additional context

Thanks to cibuildwheel, and its increasing adoption, Python 3.11 wheels are in better shape than with earlier Python releases. Still, it’s a race against the clock:

  • Cython has had Python 3.11 wheels since 0.29.30, released on May 17, 2022, still in the beta period. The next release was 0.29.32 on July 19th, also in the beta period. No wheels have been released since then, so a warning that we’re testing with Cython wheels built on a beta would be useful (as proposed in 3.1).
    • With the system above, releasing experimental wheels in the Alpha phase could also become more feasible.
  • NumPy had their first 3.11 wheels with the 1.23.2 release on August 14th, just a week into the release candidate phase (Aug 8). This started the cycle for other projects, like Pandas and SciPy.
  • Pandas followed with their wheels on September 19th with the 1.5.0 release.
  • SciPy followed with their 1.9.2 release, on October 8th.
  • statsmodels and scikit-learn don’t have 3.11 wheels available.

Discussion

So these are just some thoughts and ideas; with this opening post I would like to start the discussion on how we can improve this situation. I look forward to hearing everyone’s thoughts!

So, a couple of thoughts: we definitely can’t touch the CPython release cycle – that’s for CPython developers and the Steering Council to decide. :slight_smile:

And, secondly, we can likely achieve this by having alpha/beta release specific ABI tags (part of the wheel compatibility tags, see the packaging.tags documentation).

That would let builders tag things appropriately and let installers use them in non-rc installs.

Just a quick thought for now, but I think the basic idea should be handled by having new compatibility tags that reflect the “pre-release ABI” somehow. This may need changes to CPython to report when the ABI changes, or it may be enough to have a “cp312b1” tag in addition to the normal ones.
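As a rough illustration of the “cp312b1” idea, the sketch below derives such a tag from the fields that `sys.version_info` already exposes. The tag format and the function name `interpreter_tag` are assumptions for this example; no such tag is defined in the current compatibility tag specification.

```python
import sys
from types import SimpleNamespace

# Only alpha and beta get a suffix; the ABI is treated as frozen from rc1 on.
_LEVEL_LETTER = {"alpha": "a", "beta": "b"}


def interpreter_tag(version_info=sys.version_info) -> str:
    """Return e.g. 'cp312b1' for 3.12.0b1, and plain 'cp312' for rc/final."""
    base = f"cp{version_info.major}{version_info.minor}"
    letter = _LEVEL_LETTER.get(version_info.releaselevel)
    if letter is None:  # 'candidate' or 'final'
        return base
    return f"{base}{letter}{version_info.serial}"


# Stand-ins for sys.version_info on a beta and on a release candidate.
beta = SimpleNamespace(major=3, minor=12, releaselevel="beta", serial=1)
candidate = SimpleNamespace(major=3, minor=12, releaselevel="candidate", serial=1)

print(interpreter_tag(beta))
print(interpreter_tag(candidate))
print(interpreter_tag())  # the interpreter running this script
```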

I’d also suggest that we don’t host these wheels on the main PyPI. Otherwise, we’ll either end up with an overhead of storing pre-release wheels forever, or we’ll need to invent some mechanism for deleting “obsolete” pre-release wheels from PyPI.

Actually, what’s wrong with simply having a pypi-312b1 index that people can point to in order to use pre-release wheels? They could be published with cp312 tags, and there would be no clash because using them would be opt-in. If we wanted to go this far, installers like pip could automatically add an --extra-index-url when run with a pre-release Python interpreter.
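A sketch of what “automatically add an --extra-index-url” could look like from the installer's side. The index URL pattern is purely hypothetical (no such index exists), `pip_install_args` is an invented helper, and limiting it to alpha and beta interpreters is an assumption. Only the `--extra-index-url` option itself is an existing pip feature.

```python
import sys
from types import SimpleNamespace

# Hypothetical URL pattern for a per-pre-release index.
INDEX_TEMPLATE = "https://example.invalid/pypi-{tag}/simple/"


def pip_install_args(packages, version_info=sys.version_info):
    """Build the argument list for 'pip install', adding the extra index
    only when running on an alpha or beta interpreter."""
    args = ["install"]
    if version_info.releaselevel in ("alpha", "beta"):
        letter = version_info.releaselevel[0]  # 'a' or 'b'
        tag = f"{version_info.major}{version_info.minor}{letter}{version_info.serial}"
        args += ["--extra-index-url", INDEX_TEMPLATE.format(tag=tag)]
    return args + list(packages)


beta = SimpleNamespace(major=3, minor=12, releaselevel="beta", serial=1)
final = SimpleNamespace(major=3, minor=12, releaselevel="final", serial=0)

print(pip_install_args(["scipy"], beta))
print(pip_install_args(["scipy"], final))
```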

To answer my own question here, having additional copies of PyPI is a non-trivial amount of work. That said, we don’t need a “full” PyPI instance: PyTorch publishes wheels on a separate index, and that seems to work fine for them. But I don’t think we should assume that hosting (essentially temporary) pre-release wheels on PyPI is without cost, either. I’d like to hear what the PyPI admins think.

Another option would be to promote the idea of pushing wheels for the Python alphas and betas and using the build tag to shadow the older versions. Could specify the Python version w/o decimal points, e.g. 30110c2. That would allow 30110f to represent the final wheel for the final release of Python 3.11.0. This does assume, though, people are consistently running the newest version of CPython when using wheels during the alpha and beta cycles.
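To show why tags like 30110c2 and 30110f would shadow each other in the intended order: the wheel filename specification sorts a build tag as a two-item tuple of its leading digits (as an int) and the remainder (as a str). The sketch below applies that rule to the example tags; the alpha and beta tags are made up to extend the pattern.

```python
import re


def build_tag_key(tag: str):
    """Sort key for a wheel build tag: (leading digits as int, rest as str)."""
    match = re.fullmatch(r"(\d+)(.*)", tag)
    if match is None:
        raise ValueError(f"a build tag must start with a digit: {tag!r}")
    return int(match.group(1)), match.group(2)


tags = ["30110f", "30110b3", "30110c2", "30110a7"]
ordered = sorted(tags, key=build_tag_key)
print(ordered)  # oldest first; the last entry takes precedence
```

One caveat of this scheme: the remainder is compared as a string, so a two-digit serial such as "b10" would sort before "b2".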

Hi @abravalheri, @jaraco, @agronholm, @joerick, @henryiii, @uranusjr, @ncoghlan, @dstufft,

As maintainers and/or listed experts on the pip/packaging/wheel(-building) projects, I’m very curious what your view is on this topic of pre-release wheel distribution!

Devpi provides a trivially easy way to mirror but extend PyPI… they even host a version for public use at https://m.devpi.net. I’ve used that to publish non-default experimental releases. PyPA could host an official mirror or maybe just configure and designate one there for alpha/beta periods.

FWIW, it’s not necessary to mirror PyPI – pip already allows for --extra-index-url which could be configured to do the right thing here, and so do various other installers. One reason to have these wheels be a part of Simple index rather than some sort of mirror/extra index page is that it’d provide an “out-of-the-box” experience for these wheels – just using pip install scipy will install the relevant release of scipy for this Python alpha/beta, along with its dependencies.

That said, indexes like https://pypi.anaconda.org/scipy-wheels-nightly/simple/, https://download.pytorch.org/whl/ already exist – adding “just” the compatibility tags might be good-enough to enable the workflows that @EwoutH’s mentioned; albeit requiring an alternative index that’s not PyPI which requires additional configuration to be passed into the installer.

A couple of additional thoughts on this… Wherever this is hosted, we might want to have some sort of automated deletion policy for these wheels, say deleting them after a year or 18 months – even if storage is cheap, these wheels are basically useless after less than a year and they’ll bloat JSON/HTML responses for the page and, last I checked, bandwidth isn’t cheap (in $ as well as download times). :slight_smile:
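Such a deletion policy could be as simple as an age cutoff. A minimal sketch, assuming the index can provide an upload time per file; the filenames, the `expired_wheels` helper and the 548-day approximation of 18 months are all made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def expired_wheels(uploads, now, max_age=timedelta(days=548)):
    """Return the filenames uploaded longer ago than max_age (about 18 months).

    uploads maps a wheel filename to its upload time.
    """
    cutoff = now - max_age
    return sorted(name for name, uploaded in uploads.items() if uploaded < cutoff)


# Hypothetical pre-release wheels with their upload times.
uploads = {
    "example_pkg-1.0-cp311b4-cp311b4-linux_x86_64.whl":
        datetime(2022, 7, 1, tzinfo=timezone.utc),
    "example_pkg-1.1-cp312b1-cp312b1-linux_x86_64.whl":
        datetime(2023, 8, 1, tzinfo=timezone.utc),
}
now = datetime(2024, 6, 1, tzinfo=timezone.utc)

to_delete = expired_wheels(uploads, now)
print(to_delete)
```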

The way I read it, you are proposing to have several races instead of just one. Each time a new alpha/beta comes out, all the projects in the dep chain need to make new releases with new wheels. What am I not getting?

@hroncok You make an excellent point. I’ve made some assumptions - implicitly - in the proposal about ABI stability and testing workflows, so let’s make those explicit (because that’s better!).

Wheels are used for multiple things. End users use them, but so do CI workflows, including testing. We don’t want end users ending up with non-ABI-stable wheels, because they will use them for a longer time and things could break later. So that’s one reason why uploading non-ABI-stable wheels is not encouraged on PyPI.

CI workflows, however, typically use a wheel only once. They pull the latest wheel, build and test something, and throw the whole environment away (except when cached, but that’s not that common). For those workflows it doesn’t really matter if the wheel is ABI-unstable: these are testing workflows meant to catch errors early and prepare for new software versions. Failing sometimes (in a useful way) is what they are designed for.

If we strictly allow wheels to only be used on the exact same Python alpha/beta version, that could indeed lead to the CI chain breaking every time a new alpha/beta is released, leading to multiple races. However, (I assume) most of the time the ABI does not change so significantly that it instantly breaks all builds. So we might be able to use a wheel, with caution, even if there is a minor ABI change in it.

If, for example, Cython deploys alpha wheels and NumPy starts deploying pre-release wheels on the first beta, then SciPy and Pandas could have wheels on the second or third beta, and statsmodels and scikit-learn before the last beta. Then on the release candidate they all churn their CIs one final time, and all packages building on top of those have a full two months to get their wheels ready.

There might be an ABI break on some beta somewhere in the chain, but that’s most likely a small fix, after which everything runs smoothly again.

So the question I really would like to investigate in this thread, is:

  • “How can we lengthen the preparation and wheel deploying time for a new Python version, considering it’s a chain of dependencies and the ABI not being perfectly stable?”

Thanks. Now I understand the idea better.

(BTW Cython has a pure Python wheel ready for this, so it actually works right away since alpha one. But I realize other dependencies in the chain don’t.)

Since the root problem is the dependency chain that needs to be updated for a downstream package to test against prerelease Python, I’m not quite sure the proposal would be very effective in practice. While it does provide a system for projects to aim for, it still creates several races. Those new races are ahead of the main release, which solves one problem, but having multiple races creates a new one: since each prerelease Python is not ABI compatible with another, it’s likely to result in projects building with different cadences and a situation that downstream packages find difficult to follow.

Perhaps a system similar to piwheels would be more helpful? Instead of coding the logic into the package manager (pip), we come up with a central effort to build a bunch of packages for certain Python prereleases (maybe following GitHub Actions?), make those wheels available somewhere (alternate index), and write up some instructions on how a project can pip install against those wheels in their CIs for people to use. I’m not sure whether any additional logic in pip’s wheel discovery would be needed or not to make this work reliably, but we can find this out incrementally once we actually have the wheels and index set up. This also gives us a concrete action item: let’s find resources to host those wheels, and start building them right now. The rest can be figured out by experimenting.

The underlying problem here feels similar to the problem that bytecode versioning handles for the Python compiler: allowing consuming systems to make the distinction between “This is allowed to change” and “This actually did change”.

At the moment, CPython doesn’t publish an ABI revision number that’s distinct from the CPython version number, so consuming systems are forced to make the pessimistic assumption that the ABI changes any time it is allowed to change (i.e. every pre-release version).

ABI additions (which existing projects necessarily aren’t using) are also distinct from ABI changes (which are the things which can cause previously built binary wheels to crash).

So while it wouldn’t represent a solution on its own, it feels like a technical process tweak on the CPython side to publish two new numbers would be useful:

  • last ABI addition version
  • last ABI change version

Wheel building could then capture the first number (indicating all the APIs that might have been used in the wheel) for wheel consumption to check against the second number (if the wheel was built against an ABI that’s newer than the last breaking change then it isn’t at risk of ABI-break induced crashes)
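A sketch of how an installer could use the two numbers. CPython publishes neither of them today, so everything here is hypothetical: the numbers are modelled as plain increasing integers, a breaking change is assumed to also bump the addition number, and the extra check that the wheel does not need additions the running interpreter lacks goes beyond the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterpreterAbi:
    """The two hypothetical numbers an interpreter would publish."""
    last_addition: int  # revision in which the ABI last gained something
    last_change: int    # revision in which the ABI last changed incompatibly

    def __post_init__(self):
        # Assumption: a breaking change also counts as an addition.
        if self.last_change > self.last_addition:
            raise ValueError("last_change must not exceed last_addition")


def wheel_is_safe(built_against: int, running: InterpreterAbi) -> bool:
    """built_against is the last_addition number recorded at build time."""
    no_break_since_build = built_against >= running.last_change
    nothing_missing = built_against <= running.last_addition  # extra assumption
    return no_break_since_build and nothing_missing


beta1 = InterpreterAbi(last_addition=5, last_change=3)
beta2_additive = InterpreterAbi(last_addition=6, last_change=3)
beta2_breaking = InterpreterAbi(last_addition=6, last_change=6)

wheel = beta1.last_addition  # wheel built on beta 1 records revision 5
print(wheel_is_safe(wheel, beta2_additive))  # only additions since the build
print(wheel_is_safe(wheel, beta2_breaking))  # breaking change since the build
```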

Without those numbers, any system based on publishing pre-freeze binary wheels would be plagued by the assumption that test suite coverage is going to be good enough to pick up problems.

Thanks for your thoughts Nick, I think you make a great observation that tracking ABI additions and changes is fundamental for pre-release wheels to be useful in development without depending on full test coverage.

What would the route to (proposing) such a process tweak on the CPython side look like?

You would probably start up a discussion on the specific topic somewhere here on discuss.python.org. Then assuming general support, write a PEP. This would require the ABI number change on the CPython side and then a separate change on the packaging side for wheels to record the appropriate number.
