PEP 817 - Wheel Variants: Beyond Platform Tags

Creating new variants isn’t necessarily the same as uploading new variants. We could have the situation where someone forgot to upload all variants at once. It’s also possible that different people are building different variants (because of the different hardware requirements) and uploading them at different times, even though they’re all working from the same sources.


I once had a situation where my internet connection suddenly became flakey as I was making a release. The sdist was the first file uploaded but wheel uploads kept failing and it took a few hours to get them all up. Anyone doing pip install foo in that time would have seen many kinds of failures depending on whether they had C compilers and did or did not have libfoo.

I haven’t had that kind of problem since using trusted publishing (direct from GitHub to PyPI), but the non-atomic release issue is a big one. I think PEP 694 would solve this fully and, with proper access controls, would also reduce supply chain attack risk: with trusted publishing, everything about dispatching or rejecting the release has to be controlled on the GitHub side, and there are no other controls on the PyPI side.

I can certainly imagine a situation where PyPI ends up having the variants.json file but not the wheels that it lists.


Agreed. Even though PEP 694 is not required for Wheel Variants to operate, I think it will be a significant quality-of-life improvement for package maintainers and publishers.

Just to specify what would happen in this scenario:

  1. It would pick the best variant available - if any is compatible.
  2. If no compatible variant is available => pick a non-variant wheel (if available).
  3. If no compatible wheel at all is available (as in your use case above), it would either pick the sdist (if you didn’t block it) or fall back to the previous release.
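
To make the fallback order above concrete, here is a minimal sketch in Python. It is not taken from the PEP or from any installer: the `Candidate` class, its fields (`kind`, `compatible`, `rank`), the two `pick` functions and the example filenames are all hypothetical, and a real installer also weighs platform tags, version constraints and more.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """One file of a release (hypothetical model, for illustration only)."""
    filename: str
    kind: str                # "variant", "wheel" (non-variant) or "sdist"
    compatible: bool = True  # assumed to be computed elsewhere
    rank: int = 0            # lower is better; only used for variants


def pick_from_release(files: Sequence[Candidate],
                      allow_sdist: bool = True) -> Optional[Candidate]:
    # 1. Best compatible variant wheel, if any.
    variants = [f for f in files if f.kind == "variant" and f.compatible]
    if variants:
        return min(variants, key=lambda f: f.rank)
    # 2. Otherwise a compatible non-variant wheel.
    for f in files:
        if f.kind == "wheel" and f.compatible:
            return f
    # 3. Otherwise the sdist, unless the user blocked it.
    if allow_sdist:
        for f in files:
            if f.kind == "sdist":
                return f
    return None


def pick(releases: Sequence[Sequence[Candidate]],
         allow_sdist: bool = True) -> Optional[Candidate]:
    """`releases` is ordered newest first; fall back to older releases."""
    for files in releases:
        choice = pick_from_release(files, allow_sdist)
        if choice is not None:
            return choice
    return None


# A half-uploaded 2.0 release: only the sdist has made it to the index.
releases = [
    [Candidate("foo-2.0.tar.gz", "sdist")],
    [Candidate("foo-1.9-py3-none-any.whl", "wheel"),
     Candidate("foo-1.9.tar.gz", "sdist")],
]
print(pick(releases).filename)                     # foo-2.0.tar.gz
print(pick(releases, allow_sdist=False).filename)  # foo-1.9-py3-none-any.whl
```

With sdists allowed, the half-uploaded release resolves to its sdist; with sdists blocked, the sketch falls back to the previous release's wheel.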

As said above, this use case is technically supported (with some caveats):

  1. Variant configuration is specified inside pyproject.toml and is static across all variants.
    If the variant configuration does not change, you don’t have to worry about it because the variants.json is directly derived from pyproject.toml (not from the variants uploaded to the index).

EDIT: The variant configuration does not specify which variants are available / built. Please refer to the PEP: pyproject.toml: variant project-level data table

  2. If you end up with a different pyproject.toml per variant… Now it gets funky …
    1. If you change the dependency list => you will panic both uv and poetry. They both assume dependencies are identical across all wheel artifacts. This has nothing to do with variants and is already true today. In most cases, a new variant will require some new dependencies to be added; if that’s the case, it should be avoided at all costs. And it’s not a variant problem, it’s a “don’t publish a wheel with different dependencies than the others” problem - variant or not - post-published or all-at-once published.

    2. Now if you want to post-publish a new variant with strictly identical dependencies:
      Two scenarios:
      A. The variant configuration (inside pyproject.toml) doesn’t need to be modified => no problem (see my first paragraph).
      Example: You forgot to publish / forgot to build / build crashed and you fixed it / etc.
      B. You need to add a new provider / specify a new variant => you need to modify pyproject.toml => you need to modify variant.json => you should not do that

TL;DR: If you need to change the pyproject.toml in any way => NO, otherwise YES
Which means, if you already built them (with identical pyproject.toml) but forgot to upload some, there should be no problem.
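
The “identical dependencies” condition above can be checked before uploading a late wheel. This is a minimal sketch, assuming you have already extracted the METADATA text from each wheel’s *.dist-info directory. The helper names `requires_dist` and `dependencies_match` and the example filenames are made up for illustration, and the comparison is on verbatim Requires-Dist lines rather than normalised requirements.

```python
from email.parser import Parser


def requires_dist(metadata_text: str) -> frozenset:
    """Collect the Requires-Dist entries from a core-metadata document."""
    msg = Parser().parsestr(metadata_text)
    return frozenset(v.strip() for v in (msg.get_all("Requires-Dist") or []))


def dependencies_match(metadata_by_wheel: dict) -> bool:
    """True if every wheel declares exactly the same set of dependencies."""
    distinct = {requires_dist(text) for text in metadata_by_wheel.values()}
    return len(distinct) <= 1


base = (
    "Metadata-Version: 2.1\n"
    "Name: foo\n"
    "Version: 1.0\n"
    "Requires-Dist: numpy>=1.0\n"
)
# Hypothetical late-published wheel that adds a dependency.
late = base + "Requires-Dist: some-extra-runtime\n"

print(dependencies_match({"foo-1.0-a.whl": base, "foo-1.0-b.whl": base}))  # True
print(dependencies_match({"foo-1.0-a.whl": base, "foo-1.0-c.whl": late}))  # False
```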

What do you mean by “block it” here?

As far as I know I can choose only whether to upload the sdist or not. Do you mean something different from just not uploading it?

I meant that pip install --only-binary :all: <package> essentially blocks the sdist. Not sure if “blocked” was the right wording.


People read the absolute minimum they need to get going, and go from there. If I’ve heard that pip is the Python package manager, and I’ve used other package managers in the past, I’m not going to go through all possible Python package management documentation (while trying to sift through all the slop), but I’m going to run pip --help and go from there.

If the defaults are bad, people are going to get bad results. If people get bad results, they aren’t suddenly going to sift through all the documentation to get things perfect. They are going to do the absolute minimum to resolve their immediate problem and get going.

Requiring people to read more documentation doesn’t make them wiser. It only means they will be more frustrated, and take more shortcuts. And supply chain attacks are much more likely when people are tired, and are asked to keep looking for solutions, where the space is ripe for them to find bad advice.

And I don’t mean to insult anyone by claiming they don’t want to read. We’re living in times where ability to spend a lot of time and energy on reading documentation is a privilege.


This may be a bit of a tangent, but I think your two paragraphs there hint at some of the underlying issues here. I think many Python end users would be surprised to learn that pip does not consider itself responsible for the security of all its vendored code. And yet people do trust pip — not (I suspect) for any technical reason, but mainly because it comes with Python. In other words, the status quo may already violate the trust expectations you describe.

So from the perspective of user trust, it may not be possible to make things more trustworthy for the average user without additional work for pip maintainers. I can certainly understand why pip maintainers can’t or won’t take on that work, all I’m saying is that if it’s still the case that pip may contain untrusted code, I’m not sure anything about variant providers is going to make a huge difference. People may already be trusting pip more than they should even without variants.


Yes it would!

Well, I guess that’s true for the human, but good documentation does help your clone army, er, coding agents. So keep writing great documentation!


If one extra command in a “how to install this” is “a lot of time and energy reading documentation”, then I think the users you are trying to support come at too high a cost to everyone else with the proposed solution.

The default behavior the pep is pushing for is less secure than what we currently have, with harder to reason about consequences, and you’ve written off things I see as possible to abuse as “moot because nobody should do it.” Forbid the behavior if you’re unwilling to consider what would happen and think nobody should be doing it.

We can have better resolution of variants without the problems people have brought up, but it’s hard to actually move toward something of that nature with the pep authors being unwilling to budge on any aspect of what they’ve come up with and saying users who won’t even read a simple quickstart provided by a library are more important than the security concerns.

One of the following would be enough:

  • We standardize variants centrally, rather than having external providers. We update the definitions of them at least as often as vendored dependencies could have reasonably been updated. No third party code is recommended for this.
  • We allow users to opt in to other code being run as part of the package management process.

Note: this is a comment on behalf of several PEP authors who are most active in this review process this week, to summarize our common position on the security/vendoring/UX discussion.

One of the key goals of our PEP is to make pip install torch, uv pip install torch, pip install vllm, etc. “just work” for prominent libraries that take advantage of modern hardware. And yes, that includes other installers too - there’s just no good shorthand for <installer-name> <default-install-command> <package-name>.

UX is critically important (nod to Warsaw’s fourth law, nice @barry) in our opinion. We want the default install experience to be selecting the optimal wheels for the system/environment it’s running on. We know it’s possible after spending a lot of time[1] iterating on design and prototypes, and the tradeoffs seem acceptable after careful consideration. They may turn out to require changes to the PEP (e.g., use of a curated allowlist, a new packaging-like central library, or drop or modify MUST/SHOULD wording) to be able to come to a consensus, but we’re convinced it’s the right goal. As PEP authors, we therefore choose a design that achieves that key UX goal[2]. We are not ready to compromise on that goal right now based on feedback received so far from a relatively small number of voices, while from our design work and a lot of experimentation[3] we still believe that the current design is feasible and the best approach presented so far.

@oscarbenjamin thank you for the thoughtful posts. We wanted to specifically address this one point. It’s possible you are right here, but we think it’s too early to lower our ambition level or change the design again. We also think that explicitly aiming for the intended end state rather than for some intermediate rollout state is the right thing to do. We can always do what you suggest later, if necessary.

End user opt-in design suggestions

A number of suggestions around letting users opt into using variant wheels have been made[4]. These do not achieve the goal of making default install commands “just work”. Therefore, in the next PEP update, we will put these suggestions into the Rejected Alternatives section of the PEP as a single “end user opt-in designs” entry, with our reasoning expressed as clearly as we can[5]. It’s always possible to revisit that decision in the future; for now, however, that’s our stance. And as PEP authors, that is our choice to make. We’ll take new input after that next update, possibly in a dedicated thread.

Next steps on security/vendoring/UX

We will take some actions that have been suggested already in the thread:

  1. Reach out to other installer tool authors to get more input (@konstin already did for PDM, Poetry, Hatch and Pip).
  2. Update the PEP with some diagrams and worked examples to make it easier to understand what is actually proposed.
  3. Update the PEP with sections “expected changes from PyPI and index servers”, “expected changes from install tools”, “expected changes from build backends”.[6]

Reviewing other important parts of the design

We’d like to move on to reviewing other parts of the design. There are a lot of other parts to the design that are important, would greatly benefit from community review, and are orthogonal to the security/vendoring/UX choices.

We’d like to review and get agreement on the general mechanisms of the PEP such as:

  • the PEP 517-style variant provider interface
  • the priorities and sorting logic for variants
  • the Ahead-of-Time (AoT) providers concept
  • the expressibility of wheel properties: whether we capture all platform combinations without being too expressive for resolvers
  • the abi_dependency special namespace

Any feedback on those topics would be much appreciated.

We also realize that fast-moving DPO threads aren’t ideal for everyone. We’d welcome feedback sent to us directly, or shared on other channels like the PyPA Discord. We appreciate some comments we already got outside of DPO, like some small mistakes we should fix and a question on filename length.


  1. Hard to be exact, but on the order of 2 person-years worth of effort ↩︎

  2. Arguments for better UX being better for security - or conversely “compromising UX is compromising security” - can also be made, because people will default to doing insecure stuff (like enabling all providers). ↩︎

  3. All done in the open, including prototype implementations, the prototype index and more specific discussions on an array of topics - there’s a lot of research we have done, much of which doesn’t fit in the (already longest) PEP. ↩︎

  4. Some abstract ones, and some concrete like $ [uv] pip install --enable-providers torch ↩︎

  5. A number of arguments have already been given, both in this thread and the previous one. ↩︎

  6. TBD where exactly. This PEP is already the longest PEP ever it seems, so we may have to use more appendices or separate docs to keep the content navigable ↩︎


Thank you for this clear statement of your position. I still disagree with the position that variant processing must be “on by default”, but I’ll respect your wishes and leave those views for another time.

One comment I’d make relating to something you said in a footnote:

TBD where exactly. This PEP is already the longest PEP ever it seems, so we may have to use more appendices or separate docs to keep the content navigable

I would strongly recommend that instead of simply adding more information, you spend some time working out how to streamline the PEP and reduce its size. As it is, the PEP is extremely difficult to read and understand. The “Motivation” section, for example, is huge - and frankly, I think everyone likely to read the PEP is fairly comfortable with the idea that the current state of affairs is far from ideal, so most of that content is unnecessary. Similarly, the “Rationale” section is supposed to explain why you made the design choices you did. But it reads more like a summary of the design, with too little detail to work as a specification, but too much to clearly state your reasoning. And conversely, the “Specification” section seems to have too little detail.

As an example, there’s a statement “installers SHOULD query the variant provider to verify whether a given wheel’s properties are compatible with the system”. I tried to work out how that query would be coded, and I ended up skipping all over the specification section, coming away none the wiser. I couldn’t even work out how to find out what properties a wheel has, given just the variant label from the wheel filename (I deliberately ignored the *.dist-info/variant.json file, as I’d been explicitly told earlier in the thread that I should not have to download the wheel, or a per-wheel metadata file, to do variant matching).

I’ll be honest, without some sort of rewrite like I describe above, I can’t even clearly identify the general mechanisms of the PEP, much less review and comment on them :slightly_frowning_face:

Please understand, I’m extremely appreciative of the amount of work that has gone into this proposal. I just wish that we’d been able to flag issues like this before that amount of effort had been invested, because we now have a “sunk cost” issue, where there’s likely to be a reluctance to extensively rewrite what took so long to develop[1].


  1. IMO, there’s a certain irony in the fact that the authors insist that making variants opt-in is a bad UI, while still making participation in the development of the PEP opt-in for interested participants, by having it occur off the default Discourse forum… ↩︎


Mike, mischaracterising the position of the PEP authors this way is taking the discussion in an unnecessarily antagonistic direction. The PEP has changed substantially from its previous iteration, so the authors are clearly taking feedback on board, even if the attempted resolutions aren’t necessarily what the folks providing the feedback were hoping for.

There are a lot of competing demands here, and different aspects of the proposal are aimed at different elements of the audience. Consider two of the most significant aspects:

  • The “variant selection should just work optimally by default” design goal is aimed primarily at folks dipping their toes into Python development spaces with complex hardware requirements for the first time. The status quo is that these folks either face multi-gigabyte downloads with lots of hardware support and complicated runtime dispatch logic, or they get smaller downloads that run even slower because they’re not making effective use of the user’s available hardware. Those users that ask the question “How do I make this better?” (or have someone more experienced providing that guidance) do find options for doing so in the relevant documentation, but actually setting things up to be optimal is fiddly, fragile, and frankly concerning from a supply chain security perspective (due to either combining multiple indexes, providing more vectors for typosquatting and other such attacks, or introducing intermediaries that have an opportunity to alter the provided software).

  • the “separation of expertise” established by the provider plugin model is designed to allow packaging tool maintainers to focus on providing a trusted delivery channel from package authors to package users, while delegating the question of “which variant is most optimal for this target environment?” to folks that better understand each defined axis of variation. There are then two primary potential vectors for establishing trust in those variant selectors: direct to end users (such as through trust on first use mechanisms with dynamic provider plugins) and indirectly via packaging tool maintainers (such as the vendoring model now suggested in the PEP, or a common library like variantlib). The level of trust actually needed isn’t particularly unusual (it’s no worse than any other packaging tool dependency), the only thing that makes it controversial is that it’s new code that may need to run during binary-only installation (but that’s also true for something like a packaging tool’s choice of UI library).

The PEP does blend concrete interoperability requirements with non-normative recommendations for tooling authors, but the latter aspect is largely in response to concerns raised with the previous iteration where folks were struggling to picture what the actual UX might look like (and were legitimately concerned about the security horror show that could result from implementing dynamic plugins in a way that allowed them to run for installation commands that currently avoid arbitrary code execution).


Both the “Rationale” and “Specification” sections underwent major shuffling as a result of the PEP editing process. It was pointed out that “Specification” had too much detail, and that “Rationale” was hard to comprehend without understanding the design in the first place. Therefore the PEP was redesigned so that “Rationale” can provide sufficient context to understand the problem without having to cross-reference “Specification” a lot, and “Specification” was limited to the bare details covering the implementation.

I admit that it is still possible that some things are not specified in great enough detail. We will be glad to improve them, but please bear in mind that going back to the form that was explicitly rejected during the PEP editing process would be counterproductive at best, and I honestly doubt that it would improve understanding.

Understood. But if you don’t mind, can I ask who was involved in the PEP editing process? I don’t recall it myself, so I assume I wasn’t (or my memory is just bad!) It may be that you’re now seeing views from a different type of audience, and that needs to be incorporated.

I sympathise with the difficulties here (I’m terrible at writing good documents) but it’s a complex PEP and explaining it well will be crucial to its success.

Also, I’ll reiterate my point that, as a pip maintainer, I couldn’t find the information I would need to implement the requirement “installers SHOULD query the variant provider to verify whether a given wheel’s properties are compatible with the system”. That seems to me to be an unarguable failure of the PEP to provide necessary information to key audience members.


It’s all been happening in public: PEP 817: Wheel Variants: Beyond Platform Tags by DEKHTIARJonathan · Pull Request #4740 · python/peps · GitHub

We’re going to try to improve this but as you’ve pointed out this is an awfully complex document, and we literally need time to figure out how to improve it.

If you’re looking for an immediate answer, then the index-provided {name}-{version}-variants.json file provides the same data as variant.json in the wheel (except that it combines data from all variants), and you’d use that to map the labels into properties.
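
As a rough illustration of that label-to-properties lookup, here is a minimal Python sketch. The JSON layout is an assumption on my part (a top-level `variants` object keyed by label, each holding namespace -> feature -> list of values), as are the `namespace :: feature :: value` string form, the function name `properties_for_label`, and the example label and namespace. Check the PEP’s schema before relying on any of it.

```python
import json


def properties_for_label(variants_json_text: str, label: str) -> list:
    """Flatten one label's nested properties into sorted property strings.

    Assumes (not confirmed here) a layout like:
    {"variants": {"<label>": {"<namespace>": {"<feature>": ["<value>", ...]}}}}
    """
    data = json.loads(variants_json_text)
    nested = data["variants"][label]  # KeyError if the label is not listed
    return sorted(
        f"{namespace} :: {feature} :: {value}"
        for namespace, features in nested.items()
        for feature, values in features.items()
        for value in values
    )


# Hypothetical content of an index-provided {name}-{version}-variants.json.
example = json.dumps({
    "variants": {
        "x86v3": {"x86_64": {"level": ["v3"]}},
        "null": {},
    }
})

print(properties_for_label(example, "x86v3"))  # ['x86_64 :: level :: v3']
print(properties_for_label(example, "null"))   # []
```

The point of the sketch is only that the lookup needs the one index-level file plus the label from the wheel filename, with no wheel download.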

As far as I can tell, nothing substantial has changed in terms of taking the security feedback given. It still allows any package to act as a provider, and for a provider to run whatever code it wants.

While there is a recommendation that installers install vendor-specific providers, there is no restriction in the PEP that installers only run vendored providers, so the vendoring “solution” isn’t solving the arbitrary code problem.

There is this statement:

All the tools that need to query variant providers and are run in a security-sensitive context, MUST NOT install or run code from any untrusted package for variant resolution without explicit user opt-in. Install-time provider packages SHOULD take measures to guard against supply chain attacks, for example by vendoring all dependencies.

But the PEP authors have given no way to actually do this, given that their stated goal during discussion is to not have that opt-in, even though the opt-in reflects the PEP’s own acknowledgement of different security contexts.

The authors conflate different levels of trust in some places, but acknowledge that tools may run in different security contexts than the Python application that will be run in the environment being set up. It’s not enough that a package is installed in the Python environment to determine that it is trusted as part of the installation process.

It also intentionally leaves out of scope the ability to lock packages that include variants in a lockfile.

I’m sorry if my concerns here have come across too sharply, but it’s hard to have to keep reiterating that these concerns exist and haven’t been suitably covered by those insisting they have.


This doesn’t sound like a specification problem; it sounds like installer UX. Installers should be able to choose their own default level of risk, and their own configurability. Why try to force it into a specification? Spend the energy on controlling your own environment, and contributing reports of actual malicious behaviour in public releases.

They shouldn’t have that statement IMHO. “Tools may (and probably should) provide user-controlled options and configuration to disable variant resolution, thereby avoiding execution of potentially untrusted code.”

There seems to be plenty of good in here that’s being held up by installer UX discussion, which can absolutely be resolved after the interoperability aspects are settled. It seems this discussion could be helped by focusing on specifying the aspects that package developers, distributors and installers need to agree on with each other, rather than trying to then dictate how those things interact with the humans who are using them.


@steve.dower The PEP itself says things like

For a consistent experience between tools, variant wheels SHOULD be supported by default. Tools MAY provide an option to only use non-variant wheels.

as well. Paired with the comments in this thread about intended end state, the authors are attempting to specify a default experience that all tools should have, and it’s one that as proposed comes at the cost of security.


They should absolutely describe the intended default experience (though I’m not a fan of using specification language for it), but the cost of in-depth security has to be carried by the ones who care about it.

Most users who have already chosen to install a package have expressed their trust for anything that package does at that point. They fully intend to run arbitrary code on their machine, there’s nothing to secure here, it’s game over already. Whether that package runs arbitrary code at install time or import time is irrelevant to these users.

Those of us who care about install time should choose/create installer tools that let us flick the switches (ideally in a global configuration that I can deploy to hundreds of machines, rather than having to teach hundreds of my colleagues to pass certain command line options). If a tool isn’t suitable, then ban it in our organisations.

But we don’t have any right to burden every single other tool with those constraints. And all you’re achieving by trying to do so is forcing those regular users into less secure solutions, where they learn that downloading and executing random shell scripts to install tools is “normal”, rather than staying within the part of the chain that we do have some control over.
