Speculative: Wheel 2.0 and migration strategies

If we were to design a wheel 2.0 format that packed data files in a nested archive, how would we handle a migration to such a wheel 2.0 format?


This is sort-of a follow up to “Improving wheel compression by nesting data as a second .zip” and “Making the wheel format more flexible (for better compression/speed)”.

The other two threads tended to lean more towards a discussion of specifics of the format, compression algorithms and compression characteristics of the nested archive. I would prefer if we don’t discuss that here. This topic is rather about the community management and implementation concerns around a backwards-incompatible format change.


To be clear: There are no actual plans for designing/adopting a wheel 2.0 format at this time. This is mostly a discussion to try and figure out if it’s even feasible for us to roll out such a backwards-incompatible format change for wheels.

Like wheel 1, support it in the tools but wait a year before allowing them on pypi. Outdated tools will print a reasonable error when reading the version number from the WHEEL metadata.

I think I agree. I’m looking more at the “what” than the “how”, but the way I look at it is:

  1. Get support in consumers (pip, etc), and get that out to users as the first priority.
  2. Get support in producers (build backends, basically). This can go in parallel with (1), but isn’t as high priority.
  3. Allow distribution of v2 wheels, by allowing them on PyPI.

Wheel consumers are likely to need to support v1 wheels essentially forever, unless we come up with some “auto upgrade” process for the whole of PyPI. And even then, we’d have to make it available so people could convert private indexes.

I don’t see any obvious reason why we’d want to let projects choose what wheel version to produce (except by the brute force approach of pinning their build backend version). And by implication, I don’t see any need for backends to support producing both wheel versions. It’s possible I’m being naïve here, though.

I’m not sure what communication or publicity will be needed. With the sort of transition I suggest above, everything might be pretty seamless, so it’s possible all we’ll need is something like “you might like to know this is going on behind the scenes”. I’d be amazed if the reality is that simple, though. There are probably a bunch of custom workflows or scripts that depend on all sorts of details we haven’t anticipated. What we don’t want to happen, though, is for things like that to leave us with a situation where there’s a bunch of users demanding the ability to opt out of wheel v2 “until they are ready”.

We could also provide converters, especially if everything in the 2.0 wheel was representable in the 1.0 wheel. In the same way that we have converters for bdist_wininst or egg.

The different rollout schedules are a pretty solid reason. Pinning the build backend to a version before it’s updated would be an okay workaround, but would also slow down the overall migration.

Without some kind of compelling feature, this is exactly what will happen.

And “pip won’t install the old ones anymore” isn’t a compelling feature :wink:

I’m not sure I follow. Can you explain in a bit more detail? We may be working on different assumptions.

As I understand it, the compelling features of Wheel 2.0 are likely to be technical, of interest only to packaging specialists. So for packaging users, we would want to migrate to 2.0 seamlessly, ideally in a way that means they neither know nor care that the migration has happened.

There’s always going to be a transition - “it says package X has no wheels”, “upgrade pip and it’ll see the new wheels”. And there will be holdouts - “I don’t want to upgrade pip”. I’m fine with people who remain on sufficiently old versions of pip being inconvenienced (we explicitly don’t support anything but the latest version of pip).

But as the topic title says, this is all speculative at the moment. I don’t have any real answers here, just assumptions. If Wheel 2.0 is expected to have actual benefits and/or implications for project maintainers, then I’d say that we ought to be explicit and clear about what those impacts are early in the design process, precisely so that we can factor them into any transition plans.

I think the only way to make a transition like this work is for pip to have near universal support (i.e. all in-use versions have 2.0 support), and PyPI and other third-party feed providers support the new format. Without that, publishers will need to pin to an older version.

In particular, you can’t have a transition point for backends that is before PyPI supports publishing them and pip supports installing them. Otherwise, literally everyone will pin to an old backend and potentially never update again.

The sooner we agree on a mechanism whereby Wheel 2.0 files can be identified by pip, the sooner we can roll out informative warnings and at least get those universally accessible. At that point, at least users will get a clear “you need to update pip” warning rather than whatever they’d see today (broken install?). That’ll give us the best chance to enable any new format and have publishers be able to use it.

Uh, yes, this was stupid of me. PyPI needs to be before backends. I did put pip (consumers in general) first. I don’t consider “all in-use versions have 2.0 support” to be an issue, as we don’t support older versions of pip, and our response to anyone complaining that pip doesn’t recognise a 2.0 wheel would be “please upgrade pip”. So IMO, the natural delays involved in the rollout I suggested would be sufficient here.

That would be useful, but to be perfectly honest, pip’s behaviour in this area isn’t good anyway. We often report a bland “no files available” when files are available but are rejected for some reason (often incompatible Python version requirements). That’s not an excuse for giving bad messages when version 2.0 wheels are ignored, but it would call into question any claims that poor reporting was a showstopper…

That may be your response, but it’s the publisher’s response that I’m worried about. “Can you please publish a Wheel 1.0 as well as we aren’t able to update pip yet” is likely to be a common request, and I’m sure at least a few will just decide it’s easier to pin their backend so that their wheels work everywhere (unless, as I mentioned above, there’s some really beneficial feature in Wheel 2.0, in which case they’ll happily tell their users to figure out how to update pip :wink: ).

Sure, the show will go on. No reason we can’t try and be better this time though :slight_smile:

Doesn’t such a mechanism already exist and is implemented? The WHEEL metadata file in the dist-info contains a Wheel-Version key, and per the Wheel spec, installers should warn if the minor version is higher than the maximum they support, and must raise an error if the major version is higher. So long as installers (pip, installer, etc) are following the spec, and the error they raise is suitably clear and informative, we should already be covered on this point without any additional action.
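To make the rule concrete, here is a minimal sketch of the check as I read the spec: warn on a newer minor version, fail on a newer major version. This is not pip's actual code; `check_wheel_version`, `UnsupportedWheel` and the `SUPPORTED` constant are names made up for illustration.

```python
from email.parser import Parser

# Assumption: the highest Wheel-Version this hypothetical installer understands.
SUPPORTED = (1, 0)


class UnsupportedWheel(Exception):
    """Raised when a wheel's major version is newer than the installer supports."""


def check_wheel_version(wheel_metadata_text, supported=SUPPORTED):
    """Return a warning string (or None) for the contents of a WHEEL file.

    Raises UnsupportedWheel if the major version is newer than `supported`.
    """
    # The WHEEL file is a set of "Key: value" headers, so the stdlib email
    # parser can read it.
    raw = Parser().parsestr(wheel_metadata_text).get("Wheel-Version")
    if raw is None:
        raise UnsupportedWheel("WHEEL file has no Wheel-Version key")
    version = tuple(int(part) for part in raw.strip().split("."))
    if version[0] > supported[0]:
        # Newer major version: the installer must refuse to install.
        raise UnsupportedWheel(
            f"Wheel-Version ({raw}) is newer than this installer supports; "
            "please upgrade your installer"
        )
    if version > supported:
        # Same major, newer minor: install, but warn.
        return f"Installing from a newer Wheel-Version ({raw})"
    return None
```

The error text here deliberately says “please upgrade your installer”, since that is the action the user actually needs to take.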

Indeed, I tested pip (22.3.1), and I do get a yellow warning (four of them to be exact) when installing a wheel with a higher minor version:

$ pip install ~/Downloads/pyroma-4.1-py3-none-any/pyroma-4.1-py3-none-any.whl
Processing c:\users\c. a. m. gerlach\downloads\pyroma-4.1-py3-none-any\pyroma-4.1-py3-none-any.whl
  WARNING: Installing from a newer Wheel-Version (1.9)
Requirement already satisfied: pygments in c:\miniconda3\envs\tools\lib\site-packages (from pyroma==4.1) (2.13.0)
[More requirements here...]
Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in c:\miniconda3\envs\tools\lib\site-packages (from packaging>=19.0->build->pyroma==4.1) (3.0.9)
WARNING: Installing from a newer Wheel-Version (1.9)
Installing collected packages: pyroma
  WARNING: Installing from a newer Wheel-Version (1.9)
  WARNING: Installing from a newer Wheel-Version (1.9)
Successfully installed pyroma-4.1

and with a higher Wheel major version, it errors out immediately with a red warning:

$ pip install ~/Downloads/pyroma-4.1-py3-none-any/pyroma-4.1-py3-none-any.whl
Processing c:\users\c. a. m. gerlach\downloads\pyroma-4.1-py3-none-any\pyroma-4.1-py3-none-any.whl
ERROR: pyroma has an invalid wheel, pyroma's Wheel-Version (2.0) is not compatible with this version of pip

Are you suggesting something different here?

Nope, that seems perfectly sufficient. I just hadn’t tested it, and if anyone mentioned it in this thread then I missed it.

A wheel 2.0 would be smaller and faster. In a previous thread (no longer hosted of course),

Since this idea represents exactly the same information (down to the individual ZipInfo objects in the nested zip, compared to even using a nested tar), this is like an encoding change - you could losslessly translate back to the old format and use the old tools and vice versa.
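As a rough illustration of the “encoding change” point, here is a sketch of one direction of such a translation, from a nested layout back to a flat zip. The layout is purely an assumption for the example (an outer zip holding the dist-info files plus one inner archive called `data.zip`); no such format has been specified, and `flatten_wheel` and `NESTED_NAME` are hypothetical names. A real converter would also have to rewrite Wheel-Version and RECORD, which this skips.

```python
import io
import zipfile

# Hypothetical member name for the inner archive; not part of any spec.
NESTED_NAME = "data.zip"


def flatten_wheel(nested_wheel: bytes) -> bytes:
    """Translate the assumed nested layout into a flat, wheel-1.0-style zip."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(nested_wheel)) as outer, \
            zipfile.ZipFile(out, "w") as flat:
        for info in outer.infolist():
            if info.filename == NESTED_NAME:
                # Copy every member of the inner archive, reusing its ZipInfo
                # so names, timestamps and permissions carry over.
                with zipfile.ZipFile(io.BytesIO(outer.read(info))) as inner:
                    for inner_info in inner.infolist():
                        flat.writestr(
                            inner_info,
                            inner.read(inner_info),
                            compress_type=zipfile.ZIP_DEFLATED,
                        )
            else:
                # Everything else (e.g. the dist-info files) is copied as-is.
                flat.writestr(
                    info, outer.read(info), compress_type=zipfile.ZIP_DEFLATED
                )
    return out.getvalue()
```

The reverse direction would be the same loop run the other way, which is what makes a proxy or converter plausible for this kind of change.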

I don’t know how many people have trouble using the current version of wheel for size and speed reasons. But pypi’s bandwidth would get a large percentage of the benefit if the only wheel 2.0 package was tensorflow.

The less developed ideas would have something to do with representing new features, for example a more flexible installation scheme (fix the data-files feature) or maybe something to do with linking to system packages.

I’d really like for this topic to not become a discussion of what we’d change in a wheel 2.0. We have two other topics for that discussion – let’s keep any details/discussion about what could change in those topics.


I think there’s consensus that:

  • 3 tooling categories are involved: installers, build backends and package indexes.
  • 2 user groups are involved: consumers and publishers.
  • installers need to be first within the tooling ecosystem to implement support.
  • For things to not be disruptive for consumers, they’ll need to be on a new-enough version of their installers to support installing from said wheel. The best way to deal with this is time – letting the newer version of the installers propagate through the ecosystem.

It’s not clear to me from context and this discussion whether build backends or package indexes should come first. The obvious thing here is that the build backends need to be able to build the new format before the publishers can publish it to the package index – but that doesn’t mean that they need to do it by default IMO. Personally, I’m fine with deferring to each build backend to decide how they’ll handle the transition, and I imagine that all of them would change the default after PyPI adds support, while still being able to allow building Wheel 2.0 wheels before PyPI supports it (for other package indexes).

So… here’s more leading questions:

  1. How do folks feel about coupling the format change in build-backends + package-indexes with a Python version? :slight_smile:
  2. What is a reasonable timeframe after the first release of (say) pip with wheel v2 support for a publisher to start publishing using it?

My thoughts:

  1. It’s… an interesting idea; especially if we do it after a year+ of the format being supported in installers. It gives us the new format organically when the “new” Python version is being bootstrapped, and there’s no major backwards-compatibility concerns (py3-none-any wheels are the only thing I can think of, and the answer there is for the publisher to either build with an older version of Python, or push users to upgrade their installers). I imagine that the installers would still support installing wheel v1 anyway though, so it’s not going to be a strong argument.
  2. @dholth said 1 year. I’m guessing 18 months. That said, seeing how pip versions roll out for PyPI (based on what-performed-request, not which-version-of-pip-was-downloaded) would be a good proxy/metric to guesstimate this.
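A quick sketch of the kind of metric I mean for the second point: take the User-Agent of each request and compute what share of pip requests come from a version at or above the first one with v2 support. pip's User-Agent begins with `pip/<version>`; the function name and any sample data you feed it are made up for illustration, and real numbers would have to come from PyPI's request statistics.

```python
import re

# Matches the leading "pip/<major>.<minor>" of a pip User-Agent string.
_PIP_UA = re.compile(r"^pip/(\d+)\.(\d+)")


def share_supporting(user_agents, first_supporting):
    """Fraction of pip requests made by pip >= first_supporting.

    `first_supporting` is a (major, minor) tuple; non-pip clients are ignored.
    """
    versions = []
    for ua in user_agents:
        match = _PIP_UA.match(ua)
        if match:
            versions.append((int(match.group(1)), int(match.group(2))))
    if not versions:
        return 0.0
    supporting = sum(1 for v in versions if v >= first_supporting)
    return supporting / len(versions)
```

Tracking that fraction over time after the first supporting release would tell us when it crosses whatever threshold we decide is “enough”.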

Hmm… this error message should be updated – “invalid wheel” is the wrong language to use here.

I can – a user shows up on the issue tracker (without mentioning that they’re using $lts-distro) who can’t use this open source project on it due to having an old version of pip; and the project has a compelling reason to try and support them. Granted, the brute-force approach works here. :slight_smile:

As for the “type 1 change” (no new features, just compression) wheel 2, you could have the build frontend “build” (on PyPI) do the format change. You could even point old pip at a local proxy that translated pypi’s wheel 2’s to local wheel 1’s, to be able to install wheel 2’s with old installers.

As for the “type 2 change” (new features that cannot be represented in wheel 1), the build backend would need to care.

  1. -1 on this. It’s nothing to do with the Python version, we shouldn’t artificially tie it like this. Also, this would just add one more roadblock for projects releasing wheels for a new Python version in a timely manner - “we can’t release wheels for Python 3.12 yet, as our build backend doesn’t support Wheel 2.0”.
  2. I honestly have no idea. I don’t personally think it needs to be long, as we expect people to upgrade pip pretty eagerly. But I’ve already been told off once today for that view, so I’ll let the people who want to support their users on old pip versions give their views on this.

Here’s a question. Do we anticipate projects releasing multiple wheel versions? They wouldn’t be able to for a wheel 1.0 → 1.1 transition, as the filename would be the same. Do we want to require that the wheel version is part of the filename[1], so that multiple versions can be published? I’m personally -1 on such a requirement, even if we end up with 2.0 including a filename format change for other reasons.

I would personally much prefer to encourage, or even require, that projects only release one wheel version. PyPI will explode if something like pyAgrum-nightly started publishing twice as many wheels…
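To illustrate the filename point: the current naming convention is `{distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl`, and none of those components carries the wheel format version. A small sketch (the helper name is made up, and this is not a full validator):

```python
def split_wheel_filename(filename):
    """Split a wheel filename into its components (no validation of each part)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        # The optional build tag sits between the version and the Python tag.
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of components in {filename!r}")
    return {
        "distribution": name,
        "version": version,
        "build": build_tag,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }
```

Two wheels for the same release and tags that differ only in Wheel-Version would therefore get the same filename, so they could not both be uploaded.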


  [1] Implied or explicit.

I expect we won’t change the file naming scheme (otherwise it won’t be a wheel 2.0, but a new format), so no.

So, it seems like we can do this, albeit with a migration timeline and some details to figure out still around the specifics of how we’d deal with the adoption curve/pain.

Oh, I just realised that I hadn’t pushed back on this idea. :sweat_smile:

I don’t know the distinction between technical features / non-technical features in a file format – but, I do feel that doing only things like compression changes or whatnot is a waste of the opportunity to, for example, fix the weirdness around extras, or add default optional dependencies, or add metadata/semantics for GPU/CPU specialisation etc. I don’t wanna build a bucket list here (again, specifics of what we change can be discussed in other topics!) but doing a major version bump gives us the opportunity to add and change things, and it may be worthwhile to solve a few problems together.

That said, down that path lies PEP 426/Python 3000, so we’d also want to be wary of trying a “let’s fix everything in one go”. :slight_smile:

Agreed. I’m on record as saying I’m confused by the distinction between major and minor versions in file format specifications, but if we take the view you stated elsewhere, that minor versions can only remove things, and all additions are major version bumps, then I think we need to accept that either we bundle as much as possible into Wheel 2.0, or we have rather a lot of major version bumps, one for each change (or group of changes) in your bucket list. I’m not sure which of those two options is the more palatable to me…

FYI Hatchling was designed with build targets changing over time in mind so it has the concept of versions already :slightly_smiling_face:

I’ve been enormously frustrated with what seems to be a “no incremental improvements” policy.

We could do pretty well with only zstandard (if we can get it past the stdlib) and symlinks (to avoid duplicating shared libraries just to provide ld.so, ld.so.1, ld.so.1.1 aliases). Even if only the 10 largest projects on pypi used it, we could probably save 50% of pypi’s total bandwidth (do the math?), and some end users might like it too.

Some of the things you mentioned (extras) probably go in a metadata spec and would be independent of the wheel spec.
