PEP 777: How to Re-invent the Wheel

To hammer this in further: filtering at the resolver level by wheel major version is not the same as filtering by Python version, in terms of compatibility and user expectations. The Python version is about compatibility with the environment being installed into. The wheel version is about compatibility with tooling that has nothing to do with the runtime and that is expected to be updated. pip only supports its latest version, uv isn’t written in Python, and poetry, hatch, and pdm don’t typically get installed into the environments they manage.

In terms of features people want, here’s how to do it with the existing wheel format, using the tar.zst support from the PEP as an example.

  1. Define a new wheel major version.
  2. Add a field to the metadata in that wheel major version for capabilities.
  3. New capabilities can be added in any wheel version; they can only be removed in major versions.
  4. Specify that, from that wheel major version forward, an installer must check that it supports every listed capability as defined.
  5. The tar.zst capability says that there is a single archive file in .tar.zst format in the wheel that contains the files to be spread.
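
The steps above could be sketched roughly as follows. This is only an illustration of the idea, not anything defined by the PEP or implemented by an existing installer: the capabilities field, the `"tar.zst"` capability string, and the `check_wheel` helper are all hypothetical names.

```python
from typing import List

# Hypothetical sketch: none of these names come from a spec or a real installer.
SUPPORTED_MAJOR = 2                     # highest wheel major version this installer understands
SUPPORTED_CAPABILITIES = {"tar.zst"}    # capabilities this installer implements


class UnsupportedWheel(Exception):
    """Raised when the installer cannot safely unpack a wheel."""


def check_wheel(wheel_version: str, capabilities: List[str]) -> None:
    """Fail loudly if the wheel needs something this installer lacks."""
    major = int(wheel_version.split(".")[0])
    if major > SUPPORTED_MAJOR:
        raise UnsupportedWheel(f"wheel major version {major} is not supported")
    if major < 2:
        return  # version 1 wheels predate the (assumed) capabilities field
    missing = sorted(set(capabilities) - SUPPORTED_CAPABILITIES)
    if missing:
        raise UnsupportedWheel(f"missing capabilities: {', '.join(missing)}")
```

With this shape, a capability added in a later minor version still produces a clear error on installers that don't implement it, without needing another major version bump.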

I’m sorry, I don’t think I can engage further on this point. I’ve seen a lot of condescension and general lack of respect, and it’s exhausting. I need to step away from this discussion thread for a bit.

There’s a fundamental disagreement about whether raising an error when a 2.0 wheel is published is better than installing an older version. I don’t think we’re going to convince each other one of us is right.

I’ll take some time to think if there is a mechanism to allow for changes we want to see while keeping .whl.

Please take your time. I’m sorry if I’ve contributed to that with some of my argumentation. Speaking only for myself, but also assuming good faith in others: it hasn’t been my intent, and I think everyone involved only wants to improve the packaging ecosystem.

I also see that as the main place where we’re currently not getting agreement at all, but it is probably the biggest way this will impact end users if the rest of the process works as intended and we roll things out in a structured way, working alongside installers on this.

Thank you. I think the process on this is fundamentally a hard conversation with all of the considerations, and this is trying to create a roadmap for the future when we already have a roadmap built into the wheel format, so it’s tough to weigh whether changing the roadmap is beneficial without bringing back into scope things this pep is trying not to handle immediately.

I’m going to comment here, just to provide some background, but as @emmatyping needs to take a break from this discussion, I’ll just say what I want to add and leave it at that - I don’t want to contribute to a huge discussion to catch up on when Ethan picks this up again.

You’re absolutely right, there’s a fundamental question here - why does the PEP assume that the currently mandated behaviour, for installers to raise an error and fail if they encounter a wheel major version that they don’t support, is the wrong thing to do?

It’s certainly arguable that it actually is the right approach - after all, it was included in the original wheel spec in good faith, as a versioning strategy. However, in many discussions about adding new features to the wheel format, “bumping the wheel version will cause installers to fail” was commonly viewed as a problem that made a straightforward version bump problematic to the point of derailing the whole proposal. I’m afraid I don’t have links to specific examples, and I don’t have the time right now to search old threads to find some. Proposals that I do know ultimately failed because the breakage a major version bump would require wasn’t acceptable were symbolic links in wheels, and the discussions about better compression within wheels.

I don’t know how much background you have in the history of attempts to add new features to the wheel format, and I don’t want to presume, but in my view, bitter experience has adequately demonstrated that the current “fail if you encounter an unknown version” strategy isn’t a viable way of enabling progress with the wheel format. You could claim that we’ve never actually tried a version bump, to confirm it’s unworkable, and that’s a fair point - but I don’t know how we could do that (or even why we need to) when the mere existence of the current requirement has been the explicit reason that progress has stagnated until now.

But it’s not good that a PEP gets based on being a fix for something that “everyone knows” is a problem, but which isn’t articulated properly. It is a lot of work to pull together evidence that the existing approach is problematic - because that evidence is scattered among literally years of long and complex discussions. So I can understand the temptation to just assume the motivation is clear to everyone and move on. I suspect the PEP does actually need to collect and present the evidence that the existing versioning approach has blocked multiple attempts to add new features to the wheel format. While that won’t necessarily change the position of the PEP, or your opinion of that position, it will at least give you the reasoning that led to where we are now (if this post hasn’t already done that…).

I also don’t want to get too far into the back and forth right now, and am working with someone else to put together a comparison of options that includes using the status quo.

This would be my stance, and this post has not changed it. I think that the mechanism that exists is a good one, that we have reason to think the silent fallback is bad from other places it currently happens, and that people are so afraid of trusting users to understand why updates are necessary, if presented with the information, that we aren’t even trying to design the feedback they would need for this to be a smooth transition. The assumption that it will be painful is self-fulfilling if it prevents people from working on solutions that can be smooth.

I can’t find it now, but I thought there was a reference in this thread or somewhere else that discussed the distribution of pip versions that are out in the wild, pulling down packages–did I dream that? Is there data on this available?

The reason I ask is because it seems like one concern is that currently pip (and uv) will error at wheel versions greater than 1, but don’t say anything about what to do. This is reasonable because there isn’t anything one can do about it, but as preparation for a future bump in the wheel version, it seems fine to add a message suggesting that they update their installer. pip issue here, draft pr here.

Obviously there’s a sort of inconsistency there: you try to install a wheel, it fails and tells you to update, but there’s no updated version that will successfully install it for you. I’m going out on a limb and suggesting that since there’s no such thing as a v2.0 wheel this is not an important problem, and it’s better to push the update message out now in anticipation of future version bumps.
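
For illustration, here is a minimal sketch of what such a check with an upgrade hint could look like. It is not pip's or uv's actual code: `check_wheel_version`, `UnsupportedWheelVersion`, and the message wording are hypothetical. It assumes only what the wheel spec already says about `Wheel-Version` in the WHEEL file (warn on a newer minor version, fail on a newer major version).

```python
from email.parser import Parser
from typing import Optional

SUPPORTED = (1, 0)  # wheel spec version this hypothetical installer implements


class UnsupportedWheelVersion(Exception):
    """Raised when the wheel's major version is newer than the installer supports."""


def check_wheel_version(wheel_file_text: str) -> Optional[str]:
    """Return a warning for a newer minor version, raise for a newer major one."""
    raw = Parser().parsestr(wheel_file_text)["Wheel-Version"]
    if raw is None:
        raise ValueError("WHEEL file has no Wheel-Version field")
    parts = raw.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    if major > SUPPORTED[0]:
        # The extra sentence is the whole point of the suggestion above.
        raise UnsupportedWheelVersion(
            f"Wheel-Version {raw.strip()} is not supported by this installer. "
            "Try upgrading your installer to a release that supports it."
        )
    if (major, minor) > SUPPORTED:
        return f"Wheel-Version {raw.strip()} is newer than this installer fully supports."
    return None
```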

To answer my own question, this data is indeed available from the PyPI BigQuery table. Here is a brief summary of pip downloads for the past 30 days broken down by major version, going back to 18 when CalVer started. There were a small number of downloads from earlier versions, all less than 0.02% except for v9 which was 1.4% of the total for some reason.

How I’d summarize this plot:

  • The majority (55%) of downloads come from the most recent major version
  • If you include 2023 you cover 71% of downloads; >= 2022 covers 81%, >= 2021 covers 93%, and >= 2020 covers 98%.
  • So, adding sufficient documentation/warnings to pip, etc now would make it a little simpler to bump the wheel major version in as little as a year or two from now.

The query I used:

-- Count downloads per pip version over the last 30 days.
SELECT
    COUNT(*) AS num_downloads,
    details.installer.version AS `version`
FROM `bigquery-public-data.pypi.file_downloads`
WHERE details.installer.name = 'pip'
    AND DATE(timestamp)
        BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
        AND CURRENT_DATE()
GROUP BY `version`
ORDER BY `version` DESC
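
The query returns one row per full pip version, so the per-major-version percentages need one more aggregation step. A minimal sketch of that step, assuming the rows have been exported as `(version, num_downloads)` pairs with missing versions already filtered out; the function name is hypothetical, and the numbers in the usage comment are made up for illustration, not real download counts.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def cumulative_share_by_major(rows: Iterable[Tuple[str, int]]) -> List[Tuple[int, float]]:
    """Return (major, share of downloads from that major version or newer)."""
    totals: Counter = Counter()
    for version, downloads in rows:
        major = int(version.split(".")[0])  # "24.1.2" -> 24
        totals[major] += downloads
    grand_total = sum(totals.values())
    shares = []
    running = 0
    # Walk from the newest major version to the oldest, accumulating downloads.
    for major in sorted(totals, reverse=True):
        running += totals[major]
        shares.append((major, running / grand_total))
    return shares


# Made-up example input:
# cumulative_share_by_major([("24.2", 50), ("24.0", 10), ("23.3.1", 20), ("9.0.3", 20)])
```

Parsing the major version as an integer also avoids the string ordering of the `version` column, where "9.0.3" would sort above "24.2".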

All I mean is that until we know we have a mechanism in place to change the feature set while keeping a static file extension we don’t know that, in fact, a static file extension is workable. I’m okay with saying, “we hope so,” and going on faith that we will work it all out, but I just wanted to call out that’s what we’re going with.

I don’t think that’s necessarily true if you define that the suffix search must work in reverse order from the back of the file to the defined way to find the start of the version.

The majority (55%) of downloads come from the most recent major version

While I’m sure it doesn’t change the overall argument that people who are using PyPI as their index tend to be on fairly new versions of pip, I would just point out that pip follows a form of CalVer, and the most recent major version is 24.2, and versions 24.0, 24.1, 24.1.1, and 24.1.2 are not considered part of the most recent major version.

Yeah fair enough. It wouldn’t be that difficult to plot this per-month instead but I don’t think it changes the overall conclusions.

Other than the idea to embed a binary field at the start of the file (and require custom parsers to load them), the proposal seems to be to keep the mechanism the same as what currently exists, but change the extension once to let installer tools know that we’re actually planning to use that mechanism at some point in the future (and use it as a bit of an excuse to make the resolver take wheel versions into account, rather than just the installer).

Otherwise, we would drop the wheel version field entirely (rather than moving it from the WHEEL file to the METADATA file).

This is the big issue for me. I can’t make sense of keeping it mostly the same while, as part of the upgrade, directly creating a situation of silent behavior that the community has already acknowledged is not good for user experience, even given the history of people being unwilling to progress the wheel format.

  • The PEP claims there won’t be a need for doing this again, but the process is largely the same as what we have now.
    • We haven’t tried just using the mechanism we already have.
    • If we don’t trust installers to handle it now, what’s going to change next time?
    • There’s nothing technical that forces installers to be version aware going forward, even though it is possible to force installers to be version aware on a technical level if we can’t trust installers to follow a spec on a social level.
  • The only behavioral change is saying resolvers should preemptively be filtering, which raises two issues, though one is an issue of “should they?” and the other of “why can’t we do this less disruptively?”:
    • We have strong indicators, from other situations where this already happens, that resolvers ignoring compatible files because the resolver or installer doesn’t know how to handle them is not good for users. It creates more issues (on an issue tracker) for library authors if it results in falling back to a source distribution, and more issues for users if it causes them to only see outdated versions of the library.
    • We can augment resolver knowledge with index metadata.
  • Discarding a spec for a near identical one to specifically subvert a compatibility mechanism that has not been shown to be technically flawed sends a very strong message about how much people can trust packaging specifications.

I’d rather leave all possible improvements off the table until we can fix the social issue of why we don’t trust the existing tools to be able to follow the specification and inform users to upgrade.

I believe that the lack of a stronger technical solution to these problems as part of the PEP means we need to ensure the social issue that has gotten us here doesn’t repeat. If we can do that, we probably don’t need this PEP and can go back and revive the old improvements it is intended to enable. If we can’t do that, we need a more forceful technical solution that enforces that versioning is respected going forward.

If this means we need a process that involves tying wheel major versions being valid to tooling adoption, that’s still going to be better. If we need to tie it to Python releases, that’s still going to be better.

I’m not 100% done exploring a comparison, but I couldn’t come up with any theoretical change that adds features to the wheel format and is only possible via 777’s changes. The one thing 777 could allow, but doesn’t take a stance on, is changing the naming format of the wheel to be structured differently, and I would rather augment resolver knowledge in other ways with index metadata, something 777 itself suggests anyway.

Some pip messages point to URLs like:

  • https://pip.pypa.io/warnings/backtracking
  • https://pip.pypa.io/warnings/venv
  • https://pip.pypa.io/warnings/enable-long-paths

Maybe an unexpected Wheel-Version should point to a similar URL? The recommendation there (to update or not) could then be adjusted in the future also for old installers.

Given that this is a standards compliance issue, rather than a pip specific one, such a document should be hosted somewhere else (probably in the packaging guide) so that other installers like uv can link to it as well.

But the important thing is what would that document say? Do we simply say “upgrade your installer” and offer no solution for people who can’t do that? Or do we try to offer help to the user to get things working again?

There’s also the question of what scenarios we consider worth addressing. Suppose there’s a new wheel version that adds symlinks, and binary wheels that include C code need that feature to do the .so file versioning thing that Linux does. So a project uploads a sdist, a pure Python wheel, and a wheel containing accelerated C code for certain platforms. The accelerated wheels use the new wheel version, but the pure Python one doesn’t need the feature and so is a version 1 wheel. In that situation, how does the user get a valid install (which would use the pure Python code if their installer didn’t understand wheels with symlinks)? Do we simply say “sorry, you’re out of luck”? Do we tell the user to ask the project not to do this? Do we pretend we hadn’t thought of this possibility and just not mention it?

Symlinks might not be the best example. They could be designed to not need a major version, as optional support via a supplementary metadata file: the archive would point multiple real paths at the same data section, and the supplementary metadata would say how to change the duplicated files to symlinks, hardlinks, or either at the tool’s discretion.
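
A rough sketch of the installer side of that idea, under stated assumptions: the supplementary metadata is modelled as a plain mapping from each duplicated path to its canonical path, and `apply_link_metadata` is a hypothetical helper, not part of any existing tool or spec. An installer that ignores the metadata simply keeps the duplicated real files, which is why no major version bump would be needed. Path validation (for example rejecting entries that escape the install root) is left out for brevity.

```python
import os
from pathlib import Path
from typing import Dict


def apply_link_metadata(root: Path, links: Dict[str, str], mode: str = "symlink") -> None:
    """Replace duplicated files under root with links to their canonical copy.

    links maps a duplicated relative path to the canonical relative path.
    mode is "symlink" or "hardlink", chosen at the tool's discretion.
    """
    for duplicate, canonical in links.items():
        dup_path = root / duplicate
        target = root / canonical
        # Only link files that really are duplicates of the canonical copy.
        if dup_path.read_bytes() != target.read_bytes():
            raise ValueError(f"{duplicate} is not a copy of {canonical}")
        dup_path.unlink()
        if mode == "hardlink":
            os.link(target, dup_path)
        else:
            # Use a relative link so the installed tree stays relocatable.
            dup_path.symlink_to(os.path.relpath(target, dup_path.parent))
```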

Just say “update your package management tools”. Without the ability to time travel, that’s always going to be required. In terms of ensuring that’s a viable option for users, I don’t have complete answers here. I have given it some thought, but this becomes a social and specification compliance issue and requires buy-in from stakeholders.

We could tie wheel formats to tool adoption and suggest that projects should (not must) prefer waiting to move to a new major wheel version until 2 of the currently maintained installers support it, and that at least 1 would need to have a reference implementation for the features forcing the upgrade of the major version. (Typing does something similar for its new major features.)

We could also say that public indexes should not allow uploads of new wheel formats until 1 month after their adoption, and loop PyPI into this more strongly if there is appetite to ensure an additional time barrier here.
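
The two suggestions above amount to a simple gate that an index or a project could evaluate. A minimal sketch, with everything in it assumed for illustration: `may_publish` is a hypothetical helper, the installer names in the usage note are placeholders, and "1 month" is approximated as 30 days.

```python
from datetime import date, timedelta
from typing import Dict


def may_publish(
    wheel_major: int,
    installer_support: Dict[str, int],
    accepted_on: date,
    today: date,
    min_installers: int = 2,
    waiting_period: timedelta = timedelta(days=30),
) -> bool:
    """Check both suggested conditions for publishing a new wheel major version.

    installer_support maps an installer name to the highest wheel major
    version its current release can install.
    """
    supporting = [
        name for name, max_major in installer_support.items()
        if max_major >= wheel_major
    ]
    enough_tools = len(supporting) >= min_installers
    waited_long_enough = today >= accepted_on + waiting_period
    return enough_tools and waited_long_enough
```

For example, with `{"installer-a": 2, "installer-b": 2, "installer-c": 1}` and a format accepted on 1 January, the gate for wheel major version 2 would open on 31 January.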

I wrote up an appendix about this as part of the PEP: Appendix: Analysis of Installer Usage on PyPI | peps.python.org

I think my view on this is that breaking even 7% of pip downloads is an issue (that’s still millions of downloads that could fail), so we probably need to be more careful about how we publish wheels that are incompatible.

Thank you for sharing the background! I will definitely look at issues with the existing mechanism (see e.g. below example), and try to find in past discussions if the current specification was listed as a reason a feature was tabled.

The current specification says nothing about what installers should do when encountering an incompatible wheel other than aborting. I don’t think it is fair to say that pip and uv are out of spec by not suggesting users upgrade. There’s no guarantee that a newer version of an installer will actually be able to handle a newer wheel format’s features. The environment the tool runs in can have an impact too.

I very much agree linking to some webpage that can be a living document is important here if we do not introduce resolver behavior related to wheel versions. However, I think it might be useful to have tool specific pages so that relevant implementation tracking issues can be added. If pip latest doesn’t support the latest wheel spec, I’d like to be able to subscribe to the issue so I know when it does!

I don’t think there’s a way to address all scenarios while keeping the explanation simple enough for users not to throw their hands up in the air and give up if we keep the status quo and rely on documenting workarounds.

A slightly modified example from yours which I’m thinking of:

Assume some library uses pure Python code and ships a small wheel (10kB), but uses some accelerated code which is much larger (50MB). It is perfectly valid for them to release the pure Python wheel using wheel 1 without compression for maximal compatibility, while releasing the larger accelerated code wheels tagged to specific platforms with some advanced compression introduced in wheel 3 so their users get the advantages of faster download times. Let’s assume for argument’s sake that the library author also does not publish an sdist.

A user, even on an up-to-date installer, but with a CPython not built with the advanced compression, will be unable to install this package as far as I can tell. Resolvers will always pick a more specific wheel for the user’s platform, and the user would continue to get an error at install time. I’m not aware of any pip flag that exists today that we could even document for the user to use to get out of this situation. --platform, as far as I can tell, won’t help because if the user specifies --platform none-any, the platform specific wheels are compatible. You could maybe hack around this by passing --abi to some tag outside of all the platform specific wheels listed. That seems incredibly hacky.

And we can’t solve this by mandating all wheels are the same version, because most features will be optional, and that wheel 1 wheel in the above example could just as easily be a wheel 2 wheel, just without compression.
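
The scenario can be made concrete with a deliberately simplified model of wheel selection. This is not pip's or uv's real algorithm: the `Wheel` class, the numeric `specificity` ranking, the `"advanced-compression"` feature name, and both selection functions are hypothetical, and the filenames are invented for the example.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Wheel:
    filename: str
    specificity: int            # higher means a more specific platform tag
    wheel_major: int
    features: FrozenSet[str]    # optional format features the wheel relies on


def pick_most_specific(wheels: List[Wheel]) -> Wheel:
    """Status quo in this model: only compatibility tags are considered."""
    return max(wheels, key=lambda w: w.specificity)


def pick_installable(
    wheels: List[Wheel], max_major: int, available: FrozenSet[str]
) -> Optional[Wheel]:
    """Alternative: drop wheels the installer cannot unpack before ranking."""
    usable = [
        w for w in wheels
        if w.wheel_major <= max_major and w.features <= available
    ]
    return max(usable, key=lambda w: w.specificity) if usable else None


PURE = Wheel("lib-1.0-py3-none-any.whl", 0, 1, frozenset())
ACCEL = Wheel(
    "lib-1.0-cp312-cp312-manylinux_2_28_x86_64.whl",
    1,
    3,
    frozenset({"advanced-compression"}),
)
```

In this model, `pick_most_specific([PURE, ACCEL])` always returns the accelerated wheel, which then fails at install time on an interpreter without the compression support, while `pick_installable([PURE, ACCEL], 3, frozenset())` falls back to the pure Python wheel.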

I think this question is for future PEPs that introduce new features, in their “Backwards Compatibility” section. Right now, it might as well just be a page that says “whoa, where’d you find that wheel?”

That’s an extremely fair point. My perspective on this is that if tools are aborting for a known case regarding compatibility, then not providing that information to the user is not in service of the user, but that isn’t currently a specified behavior.

I think this falls under things we can augment the current mechanism with without changing the format or discoverability by resolvers, but it is certainly something that could lean in multiple directions.

Right, I view compatibility here in sort of 2 phases.

  1. The package is compatible with the environment it is to be installed into.
  2. The installer is compatible with the distribution format.

My primary concern for a bad user experience by having the resolver automatically skip over incompatible wheels is when the wheel would be compatible with the environment, and the only issue is an out-of-date tool that either vendors all of its dependencies and should never be a compatibility blocker to update (pip), or may not even be installed in the environment a solution is for (uv, hatch, poetry, pdm).

If the packaging guide had a canonical table for tools to link back to, one that is freely updated to match package management tools’ support for wheel versions and features, we could ensure that the user knows when upgrading can solve this, but it still requires that their tools are updated.

In the case where there is no update available yet, we probably want a supported way for users to find the last version that their tools do support, but I would hope we can get enough coordination of future wheel PEPs so that reference implementations are available before the major version is accepted to minimize the window of time that packages could be in this state.
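
As a sketch of how a tool might consume such a table, assuming a purely hypothetical format: `SUPPORT_TABLE` maps a tool name to the first release of that tool supporting each wheel major version. The tool name `exampletool` and its version numbers are invented and say nothing about any real installer.

```python
from typing import Dict, Tuple

Version = Tuple[int, int]

# Hypothetical table contents: tool -> {wheel major -> first supporting release}.
SUPPORT_TABLE: Dict[str, Dict[int, Version]] = {
    "exampletool": {1: (1, 0), 2: (5, 0)},
}


def upgrade_advice(tool: str, tool_version: Version, wheel_major: int) -> str:
    """Classify the situation so the tool can show the right message."""
    first_supporting = SUPPORT_TABLE.get(tool, {}).get(wheel_major)
    if first_supporting is None:
        # No release handles this format yet: upgrading cannot help, so the
        # user needs a way to find the last package version they can install.
        return "no-release-yet"
    if tool_version >= first_supporting:
        return "supported"
    return "upgrade"
```

The three outcomes map onto the cases discussed above: nothing to do, "upgrade your tool", and the window where no upgrade exists yet.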

I just did a little bit more digging, as there’s another very pertinent bit of information we can get from BigQuery: the Python version itself[1].

I think the upshot of this is a bit rosier: if you only consider versions of Python that are currently supported (3.9+), 99% of them are using pip 21 or higher. And if you consider the versions that will be supported next October (3.10+), 99% of them are using pip 22+. Of course this isn’t surprising, since a new install of Python will be using the latest version of pip.
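
The conditional percentages here can be computed from rows that carry both versions. A minimal sketch, assuming the extended query's results are exported as `(python_version, pip_version, num_downloads)` tuples; the function name is hypothetical.

```python
from typing import Iterable, Tuple


def share_with_min_pip(
    rows: Iterable[Tuple[str, str, int]],
    min_python: Tuple[int, int],
    min_pip_major: int,
) -> float:
    """Among downloads from Python >= min_python, return the share using pip >= min_pip_major."""
    in_scope = 0
    new_enough = 0
    for python_version, pip_version, downloads in rows:
        py = tuple(int(part) for part in python_version.split(".")[:2])
        if py < min_python:
            continue  # unsupported Python: outside the population of interest
        in_scope += downloads
        if int(pip_version.split(".")[0]) >= min_pip_major:
            new_enough += downloads
    return new_enough / in_scope if in_scope else 0.0
```

Calling it with `min_python=(3, 9)` and `min_pip_major=21` corresponds to the first figure quoted above.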

I don’t know if this is the official policy, but I think it’s pretty reasonable for PEPs to only consider backwards compatibility for actively maintained releases.


  1. basically just adding details.python to the previous query
