PEP 817 - Wheel Variants: Beyond Platform Tags

I think you’re making a valid point (except for it being one per version, as was already pointed out). However, I would like to make it clear that usually the overhead will be relatively low (i.e. one extra request for a relatively small file), and it will ramp up only in a corner case.

One “optimizing” assumption (that’s not explicitly noted in the PEP, I think) is that variant processing (and therefore fetching the JSON file) can be deferred pretty late into the pipeline. An installer can first fetch the distribution list, filter it per version specifiers, wheel tags and other parameters it uses right now, order by versions, and only then fetch the JSON file for the newest version (that remained after the initial filtering).

Now, in the general case this will be the only file fetched, as at least one wheel will be suitable: either one of the regular variants, the null variant or the fallback regular wheel. I suppose there could be a corner case here if someone publishes a package that has no suitable variant at all. Say, if someone makes a package that has only a CUDA variant; then no variant would match and the installer could indeed keep trying older versions (and fetching their JSON files) until it processes all versions or finds a version predating wheel variants. I think we could address this corner case by adding a requirement that “the installer SHOULD NOT try older versions if none of the variants for the newest version that is otherwise suitable for installation is compatible”.
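To make the deferral concrete, here is a minimal sketch of that pipeline (all names and data shapes are illustrative, not from the PEP or any real installer):

```python
# Hypothetical sketch of deferred variant processing: the variants JSON
# is fetched only for the newest version that survives the ordinary
# filtering, so in the common case exactly one extra request is made.

def pick_wheel(files, spec_ok, tags_ok, fetch_variants_json, variant_ok):
    """Return (version, wheel name) for the best compatible wheel, or None."""
    # 1-2. Filter the already-fetched distribution list with the
    # existing rules (version specifiers, wheel tags, ...).
    candidates = [f for f in files if spec_ok(f["version"]) and tags_ok(f["tags"])]
    # 3. Order the surviving versions, newest first.
    for version in sorted({f["version"] for f in candidates}, reverse=True):
        # 4. Only now fetch the (small) variants JSON for this version.
        variants = fetch_variants_json(version)
        for f in candidates:
            if f["version"] == version and variant_ok(f, variants):
                return version, f["name"]
        # Corner case discussed above: nothing matched for the newest
        # otherwise-suitable version; a "SHOULD NOT try older versions"
        # rule would stop here rather than loop on fetching more JSON.
        break
    return None
```

In the common case the loop body runs once and performs a single extra request; only the corner case would cause further fetches without the SHOULD NOT rule.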

Could you explain the issue here in more detail? Is this a case where pip eventually finds a suitable version, or passes through all versions before failing? On what basis are the versions rejected? I suppose it could be a problem if pip actually needed to fetch the JSON file for every version, but it sounds a bit like a corner case of a corner case. Not saying we shouldn’t look into addressing it somehow, though.

Indeed, that’s a good question. To be honest, perhaps I wrongly extrapolated from our efforts on backwards compatibility and transition periods. I suppose we can’t guarantee that a particular tool will implement the standard, but then we can’t guarantee that it will follow the recommendation and basically discourage people from using said tool as a result.

Well, this I can answer: because of being unaware of its limitations :-).

I don’t think they’re expected to have higher security risks than pip itself or its vendored dependencies, and are likely to have lower (for example, I can’t think of a case when they’d need Internet access).

Here is an attempt that aims to capture all the important steps at a high level - I hope it helps. I’m sure this can be made more detailed for particular scenarios, done for building variant wheels, etc. - but let’s start with the 10,000 foot view of “install package <pkgname> please”:

I’ll emphasize that these are the conceptual steps added to the installation process for all installers that want to implement this design. We do have actual prototype implementations for uv and pip that can be used to answer more detailed questions.

One reason is that variants may not be useful to everyone. For the people that need variants, it can be extremely important, but if the current set of wheel discriminators is all you need, then you won’t care about – or perhaps even know about – variants. Think things like web development where you don’t care about GPUs or need to eke out the last drops of performance based on your CPU microarchitecture.

Thanks. That’s incredibly useful - it’s not how I had understood the process at all. It would be great if it could be added to the PEP.

May I ask what part was highlighted the most? It would help a lot to identify which parts could be explained better.

Sometimes an image can be worth a thousand words.

We’ll refine it - if needed - over the next few weeks and add it to the next iteration of the PEP, it’s a great proposal.

I couldn’t find this specified in the PEP, but regardless this is a problem, as package authors can, and do, upload new wheels some time after the initial release (e.g. to support more architectures).

Honestly, I feel like these hosted variant JSONs are more like a new index response page than a separate file.


Regarding the diagram, my eyes will thank you if you added it as an SVG like in other PEPs, to benefit from dark mode.

Thanks a lot @notatallshaw for your encouragements and the remark on the regex.

Just tried it, and as you can see, it behaves exactly as expected (it doesn’t match).

I hope the diagram just published above by @rgommers helps a lot to address this, let us know if it’s still unclear.

Well, this file does not have to be immutable; we actually developed the tooling to create iterative updates of this file. The index should probably reject conflicting changes: you can add new configurations, but you cannot change a previously released configuration without creating problems (wheels are immutable).

Now, I do believe the use case you mention is not a great idea anyway. Many package managers (uv being one of them) assume that dependencies are identical across all packages, and releasing a new “architecture” very often means slightly different dependencies, so you would almost certainly break a not-insignificant part of the ecosystem by doing that.

It’s unfortunately not a “standard” but very much a widespread assumption.

The safe solution would probably be to release a new “build number” of the same version; then you can modify the variants.json all you want.
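For reference, that build number lives in the optional build-tag segment of the wheel filename (per the wheel spec: `name-version[-build]-python-abi-platform.whl`). A minimal sketch of pulling it out:

```python
def wheel_build_tag(filename):
    """Return the optional build tag of a wheel filename, or None.

    Wheel filenames are name-version[-build]-python-abi-platform.whl;
    the build tag, when present, must start with a digit. This is a
    simplified illustration, not a full parser.
    """
    parts = filename[: -len(".whl")].split("-")
    # Six dash-separated fields means the optional build tag is present.
    return parts[2] if len(parts) == 6 else None
```

So re-uploading `pkg-1.0-1-py3-none-any.whl` alongside the original `pkg-1.0-py3-none-any.whl` is the mechanism being suggested here.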

@oscarbenjamin

In general it is known that pip install <arbitrary text> is not secure because if there are sdist-only packages then pip downloads and builds them and that means arbitrary code execution.

Thank you for the in-depth message. I couldn’t agree more with everything you said.

I never understood why “vendoring providers” was actually a valid answer to the security problem.

  • If you don’t specify anything (no --only-binary or similar) you are already accepting to execute arbitrary code on your machine. And in that case I don’t understand or agree that variant providers pose any more of a risk to the user than an sdist or a build backend.

  • If you do specify “no remote code execution” then there needs to be a path that takes no provider at all. And we actually mentioned this in the Out of Scope Features

The format of a static file to select variants deterministically

@konstin actually had a proposal that we decided to leave outside of the PEP, as a “tool implementation detail”.

For me - not speaking for the other co-authors - I never understood how “vendoring” was fixing anything. I do not understand how variant providers pose a greater risk than enabling sdist & any build-backend by default.

I quite firmly believe that any solution that forces the user to opt-in will virtually defeat the purpose for most users.

  • People don’t read documentation, even if they should.
  • People will expect to obtain a package for platform A and obtain the non-variant, creating frustration and additional GitHub issues across the open source community.
  • People will get variants transitively by dependency and have no idea they needed to look at the doc of a dependency to include --variant-plugin=X,Y,Z

If vendoring is the way - so be it - but I don’t understand why it’s necessary given that I don’t agree there’s any change in the security model of pip or other package managers.

And it’s not without its own problems as @pf_moore mentioned:

  • more work for package manager maintainers (arguably not too much, but still, it’s valuable time)
  • any security update would need to wait for the next release cycle instead of being updated / yanked immediately.

That being said, I am inclined to really consider any proposal or idea that puts User Experience as a priority and will not force everyone to read the documentation before installing a package.
Especially since you might get a variant transitively, and consequently may not even realize that package-X was shipped as a variant.

It is an important answer to many security problems. You weaken the PEP significantly by saying this.

PEP 694 may help with this, as it would allow you to create a publishing session, into which you’d upload all the wheels you want (including variants) and your comprehensive variants.json file before atomically publishing everything in the session.

Thanks all for the additional information and clarification. I’m going to avoid making individual replies to reduce the chance of fractured threaded discussions, so in my generalizations of multiple posts let me know if I missed an important nuance.

Having vendoring act as a security check is not where I left the prior thread. I had stated in that thread that I think it’s important that existing tools start as opt-in, and could transition to opt-out over some period of time, to give security-conscious users time to enable force-opt-in options. It was not clear to me that any consensus had formed, or that vendoring had become a linchpin of this proposal.

Otherwise, I think most take issue with my post in a way that can be summed up from this line in the PEP’s abstract:

The goal is for the obvious installation commands ({tool} install <package>) to select the most appropriate wheel, and provide the best user experience.

With that goal in mind, I think it’s important that as much of the logic for variant selection as possible go into a standardized, and maintained, library that pip and other tools require or vendor. It’s not clear to me if variantlib is intended to be that, or if it’s just a PoC implementation.

And then, as a maintainer of pip, my main issue with the vendoring approach is that it moves the problem of “blessing” specific providers from users, who aren’t experts in providers, to Python packaging install maintainers, who aren’t experts in providers either. Each Python package installer will need to make its own choice (pip, PDM, Poetry, etc.). If blessing and vendoring providers is the security solution, I think this also needs to be done in a standardized library, where a mechanism for choosing good providers to bless can be defined. Whether that’s the same library as the variant selection or not is unclear to me.

packaging has already been mentioned, so putting on my packaging maintainer hat[1]: a core design principle is to be sans-IO as much as possible. Can the whole variant selection logic be expressed without IO? If so, that would be great anyway.
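As an illustration of what a sans-IO shape could look like (entirely hypothetical; not variantlib's actual API): the caller performs all IO (fetching variants.json) and all plugin execution (detecting supported properties), and the library only does pure selection over the data it is handed:

```python
# Hypothetical sans-IO variant selection. The function touches no
# network, filesystem, or plugins - it is a pure function over inputs
# the caller assembled, which is what makes it easy to test and vendor.

def select_variant(variant_table, supported_properties):
    """Pick the best variant label, or None if nothing matches.

    variant_table: mapping of variant label -> list of required
        properties, ordered from most to least preferred (as the
        caller derived from the fetched variants.json).
    supported_properties: set of property strings the environment
        supports, as reported by the caller-invoked provider plugins.
    """
    for label, required in variant_table.items():
        if all(prop in supported_properties for prop in required):
            return label
    return None
```

A null variant would simply be an entry with an empty requirements list, so it matches any environment and acts as the fallback when ordered last.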

A pip prototype has been mentioned - is it GitHub - wheelnext/pip at 2025-05_PyCON? Or is there a newer version?


  1. Bearing in mind I’ve not discussed this with any other packaging maintainer. ↩︎

I really don’t agree. It’s a solution, but I disagree that variant providers are any more of a problem than sdists or build backends.

And I do not understand why it’s fine to have default-ON for sdists/build backends but not for variant providers.

If one is dangerous, all of them are. It’s perfectly fine to make the point that all of them are dangerous, but I don’t see anyone deeply concerned and actively trying for years to change the default behavior of package managers. So I suppose it’s not that important / concerning.

My only point is consistency: either it’s dangerous, and then all these cases are dangerous and should be treated with the same level of care; or it’s not enough of a concern to sacrifice UX over, and then we are back to vendoring not being necessary.

Or potentially I’m missing something and the risk surface of variant plugins is significantly greater and I’m totally missing the point.

It doesn’t feel like the security concerns brought up in previous rounds were understood given the solution provided and the default experience suggested.

The BLAS/LAPACK version matching is also still not adequately explained as to why it is an issue. Bundled native dependencies were meant to be isolated with wheels, right? This seems like something has gone wrong and implementation details wheels were meant to package neatly are leaking, not like something that should require variants. If this is not the case, then why can’t another wheel provide BLAS/LAPACK directly? If it is the case, why are we looking to support breaking that isolation?

If the above isn’t an issue, we’re back to just hardware detection, and that should be easier to specify simply, with regular updates being easier to do while properly accounting for other concerns like security.

Yes it absolutely was built with this intent (if the community decides to adopt it)

We put great care into having very extensive testing, a lot of input validation, and all the classes and tooling that installers & build backends could need (or at least that we could think of), while minimizing dependencies to allow easy vendoring.

And yes, the fork of pip that you mention was a (fully functional) attempt at variant support. It was not intended to be perfect, but rather to prove that it works and show how it could work.

Branch: pep-xxx-wheel-variants (we shall rename it :sweat_smile:)

I appreciate that that was your preferred solution, but you seemed to be more or less alone in that - the majority of security-conscious folks wanted no third-party code execution without opt-in, ever, period. That sentiment was strong and broad enough that it seemed close enough to consensus to treat as a non-negotiable requirement. Hence we adopted it.

The key security concern was “no automatic third-party code execution”, which was understood and addressed. If you have new or different security concerns, it’d be nice to articulate them.

There is no “version matching” specifically for BLAS and LAPACK mentioned in the PEP, I think? The motivation and examples given are a prominent case of build variants - it’s important enough that (for example) NumPy and SciPy already ship multiple wheels for the same platform[1]; they’re just not user-selectable now. When done as wheel variants, they will be.

No isolation is broken, one can still vendor a BLAS library the same way, or depend on a BLAS-in-a-wheel package.

It is an issue. And there are other such dependencies or build variants (OpenMP, two choices for C++ string ABIs, etc.) as well as Package ABI matching. Without going into detail of any of them: there are important non-hardware use cases.

Even if it were just hardware detection, trying to separately standardize CUDA, then ROCm, then Intel XPUs, then Huawei Ascend, then whatever accelerators come next isn’t very realistic - the pace of change is way higher than for CPUs, and it’s only accelerating.

On top of that, it isn’t really hardware-only even for what I’m listing here, since there’s a software driver component to that hardware support. See the NVIDIA/CUDA example in the PEP: it has both sm_arch (hardware architecture) and cuda_version_*_bound (software).
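To illustrate the hardware/software mix, a variant's properties could look roughly like this (a hypothetical fragment built from the property names mentioned above, not copied from the PEP):

```json
{
  "variants": {
    "cu129": {
      "nvidia": {
        "sm_arch": ["90_real"],
        "cuda_version_lower_bound": ["12.0"]
      }
    }
  }
}
```

Here `sm_arch` describes the GPU hardware architecture, while the `cuda_version_*_bound` properties constrain the installed driver/toolkit software - both have to match for the variant to be selected.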

What you’re suggesting is in the Rejected Ideas (An approach without provider plugins), it doesn’t seem workable or future-proof-able a la manylinux.


  1. E.g., see the duplicate sets of wheels for macOS at numpy · PyPI ↩︎

Frankly, I’m concerned that the sentiment is skewed by the population of people who are willing to persistently repeat their points in a 200+ post thread, and not by the community of Python package installer tool maintainers.

But regardless, as a pip maintainer, the concerns I express remain the same: there needs to be a solution that does not involve the pip maintainers blessing which providers are good or not. And in general, if this choice can be made in a common place, then it will serve all Python-based package tooling.

I personally don’t dislike the idea of a “blessed list” of providers that all package managers are able to use.

The only reason I tend to be a little bit hesitant is the increased bureaucracy it would generate, but if we make the requirements simple and measurable, with minimal process, I think it might just be the best “in between”.

I think it’s a bit hard from our position to propose the idea and assign the job of maintaining the list to some person, group of people, or entity - unless said group is willing to agree to do it.

So, just thinking out loud here, but it could be a requirements file or lock file that lists blessed providers.

Then there might not have to be a requirement for tools to vendor the providers, just that the providers must appear in that requirements or lock file and must be opt-out by default; tooling can then choose whether vendoring makes sense or not.

It could be stored and maintained in variantlib, and then it’s up to variantlib’s maintainers to figure out how to add or remove providers.

I personally share that concern. However, until there’s a major change in how Python packaging governance works in practice, and decisions get made fully async on marathon DPO threads, I also see no realistic way of avoiding that problem here - we need close enough to full consensus that discussion wraps up and a PEP can be approved.

Maybe in a year from now, in case PEP 772 gets accepted and a steering council gets elected, finds its feet, and starts meeting, listening, and gently steering, the dynamics will change. But we can’t just park this kind of work for a year for this to maybe change.

I think @konstin expressed a similar feeling with his uv hat on. If installer maintainers are all happy with something like that, I think it makes sense to aim for that.

What precisely counts as “third party” here?

Speaking for pip, I do not consider vendoring to change the status of code from “third party” to “first party”.
