Selecting variant wheels according to a semi-static specification

Teaching the installer to pick between variant builds with more granularity is a step towards simplifying things for end users. It would at the very least let all of the variants of a large package that can’t be hosted easily on PyPI be hosted in one other place, so a user would only need to run their installer with --extra-index-url pointing to that other hosting location. Then if we can get the wheel sizes down to a reasonable limit, maybe they can all end up on PyPI. If not, maybe there’s room in the ecosystem for another hosting site that focuses on those larger packages.

1 Like

That suggests the user has to install cudf-cu12 manually, in order to specify the necessary config settings? Which seems like a suboptimal user experience, as depending on cudf-cu12 without that manual intervention wouldn’t work properly.

But I’m way out of my depth here, so I guess if it works, that’s fine.

As I understand it, the wheel that’s built is chosen by the build backend checking various platform details. If those details can change (for example, the user installs a new graphics card) the choice that was made when the wheel was built will now be incorrect. But the installer won’t re-invoke the build system (it has a cached wheel, and it’s allowed to use that) so the user gets an inappropriate binary.

1 Like

OK, I understand now, and I think we’re worried about 2 different use cases.

The most common case, I think, that we want to deal with is the installer selecting a pre-built binary wheel from somewhere (PyPI, or another package index). Those of us distributing these wheels do not want our users to have to compile them during installation at all.

Even in the compile-from-source case, it seems like either the user will be sophisticated enough to somehow force a rebuild (clear the cache, pass a flag to the installer, etc.) or the installer itself could (eventually) be sophisticated enough to recognize that selector values added to a wheel filename during an earlier build are no longer correct for the host and therefore the cache contents aren’t a match and a build is needed.

Either way, I think we can solve for the pre-built wheel case first and then consider the build-from-source case separately. Unless they’re intimately linked in a way that I’m not understanding?

Two completely different proposals are being discussed in the same thread. The OP proposal is not about ever building wheels but Paul and Mike are now discussing a possible approach that NVIDIA might use that has not been spelled out explicitly anywhere previously.

Probably it would be better to have a separate thread about how the “build backend approach” works.

2 Likes

Sorry, I think I’m being messy by mixing up ideas about how things are today and how they might be.

Right now, today, nothing detects CUDA version. It is on the user to specify it, and they have to specify it somehow for all of their packages that use CUDA.

So why does cudf-cu12 use the build backend? Because the wheels are hosted externally. The build backend is some sleight of hand to save the user from needing to use --extra-index-url.

Any idea about dispatch among implementations is speculative.

In my ideal world, the user would first configure their installer (let’s say pip) and change some setting that sets a preference for NVIDIA gpus to be used. The user then installs something like JAX or PyTorch, each of which indicate some support for NVIDIA gpus. These dispatch to an NVIDIA-provided package that inspects hardware. The hardware metadata is returned to the installer for JAX or PyTorch, which then use it to map to their known distributions.
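
To make that concrete, here is a purely hypothetical sketch of the dispatch idea; the HardwareProvider protocol, the describe() hook, and the variant-mapping metadata are all invented for illustration and don’t exist anywhere today:

# Hypothetical installer-side dispatch: an installer configured to prefer
# NVIDIA GPUs asks registered hardware providers for facts, then maps them
# onto the variants a project says it supports. None of these hooks exist.
from typing import Protocol


class HardwareProvider(Protocol):
    def describe(self) -> dict[str, str]:
        """Return hardware facts, e.g. {"gpu_vendor": "nvidia", "cuda": "12"}."""
        ...


def select_variant(providers: list[HardwareProvider],
                   known_variants: dict[str, str]) -> str | None:
    # known_variants stands in for per-project metadata, e.g.
    # {"nvidia": "jax-cuda12", "amd": "jax-rocm", "": "jax-cpu"}.
    facts: dict[str, str] = {}
    for provider in providers:
        facts.update(provider.describe())
    vendor = facts.get("gpu_vendor", "")
    return known_variants.get(vendor, known_variants.get(""))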

cudf-cu12 is a hack to work around limitations described in What to do about GPUs? (and the built distributions that support them) - #64 by msarahan

The ideal situation is to have just cudf with variant dispatch and eliminate the hacks when they can be safely removed.

The design that you have is close to what I imagined.

Firstly, I am not the person you need to negotiate with. I am just trying to help you to propose something that is more likely to be successful.

A long term goal for many people involved in Python packaging is to be able to have static resolution and lockfiles. Any dynamic dependency behaviour flies in the face of that and all previous efforts on this front have ultimately failed.

I don’t know whether my OP proposal here is acceptable to those who want more static resolution or lockfiles etc: that is why I opened this thread. As I see it the OP proposal stretches as far as possible toward satisfying the concerns of those who want static resolution while still preserving the unavoidable dynamic component. If that is not far enough in the static direction then all attempts at dynamic resolution are doomed.

3 Likes

Nor do I. I don’t even know if the original selector package proposal is acceptable or not. My personal concern is around implementability and how proposals will work with the resolver algorithms in pip and uv. I don’t know what other concerns people might bring up (you make a good point about lockfiles, that’s something that might need to be considered).

My feeling is that someone’s going to need to produce a prototype implementation of any proposal (possibly two, nowadays, to ensure that both pip and uv can support it). The evidence from the selector package discussions suggests that simply asking for feedback on a proposal just ends up with everyone talking in circles.

Maybe I’m being pessimistic here, though. Is this something that’s going to be discussed at the packaging summit at PyCon? Having a face to face discussion (if enough of the interested or affected parties are present) may be a much better way of moving things forward.

4 Likes

It doesn’t look like this topic made it on the schedule. I will be at the summit, along with Ethan Smith, a colleague from NVIDIA who is also working on this problem. We’ll be eager to discuss this.

It did! You got a bad link. :sweat_smile:

1 Like

Excellent! Hopefully someone will write up the discussion for people who can’t be present.

2 Likes

Actually, looking closer at it, this is significantly different from what I was expecting. Steps 1-3 are what I expected but step 4 is not. I was imagining that it works like:

  1. User does pip install cudf
  2. pip downloads cudf-0.6.1.tar.gz from PyPI
  3. The build backend is nvidia-stub so pip installs that and asks it for a wheel.
  4. The nvidia-stub build backend detects the CUDA version and produces a dummy wheel with a dynamically generated requirement like cudf-cu12 == 0.6.1.
  5. Then pip goes on to download and install a cudf-cu12 wheel from PyPI.

Essentially this is the selector-package idea but using a PEP 517 sdist as the selector package.
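
To be concrete about what I had pictured for step 4, here is a rough sketch of such a stub backend. Everything in it is hedged: the detection logic, the names and the generated requirement are placeholders rather than what NVIDIA actually ships, and a real backend would also need build_sdist and the other hooks:

# Hypothetical PEP 517 stub backend: detect the CUDA driver, then emit a dummy
# wheel whose only content is a dynamically generated Requires-Dist line.
# Names and versions are placeholders; build_sdist etc. are omitted.
import base64
import ctypes
import hashlib
import os
import zipfile

NAME, VERSION = "cudf", "0.6.1"


def _detect_cuda_major(default: int = 12) -> int:
    # Ask the NVIDIA driver (if present) for its CUDA version via libcuda.
    try:
        libcuda = ctypes.CDLL("libcuda.so.1")
        version = ctypes.c_int()
        if libcuda.cuDriverGetVersion(ctypes.byref(version)) == 0:
            return version.value // 1000
    except OSError:
        pass
    return default


def _record_hash(data: bytes) -> str:
    digest = hashlib.sha256(data).digest()
    return "sha256=" + base64.urlsafe_b64encode(digest).rstrip(b"=").decode()


def build_wheel(wheel_directory, config_settings=None, metadata_directory=None):
    cuda = _detect_cuda_major()
    dist_info = f"{NAME}-{VERSION}.dist-info"
    files = {
        f"{dist_info}/METADATA": (
            "Metadata-Version: 2.1\n"
            f"Name: {NAME}\n"
            f"Version: {VERSION}\n"
            # The dynamically generated requirement, based on the detected driver.
            f"Requires-Dist: {NAME}-cu{cuda} (=={VERSION})\n"
        ).encode(),
        f"{dist_info}/WHEEL": (
            "Wheel-Version: 1.0\n"
            "Generator: nvidia-stub-sketch\n"
            "Root-Is-Purelib: true\n"
            "Tag: py3-none-any\n"
        ).encode(),
    }
    filename = f"{NAME}-{VERSION}-py3-none-any.whl"
    with zipfile.ZipFile(os.path.join(wheel_directory, filename), "w") as whl:
        record = []
        for path, data in files.items():
            whl.writestr(path, data)
            record.append(f"{path},{_record_hash(data)},{len(data)}")
        record.append(f"{dist_info}/RECORD,,")
        whl.writestr(f"{dist_info}/RECORD", "\n".join(record) + "\n")
    return filename

pip would then resolve the generated cudf-cu12 == 0.6.1 requirement from an index as in step 5.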

What you are actually proposing is that rather than having a dynamically generated requirement the build backend would just download a cudf-cu12 wheel from somewhere and then present it to pip as cudf-0.6.1-*.whl. From an implementation perspective this works if you ignore people wanting to do offline builds from the sdist: a frontend like pip doesn’t really care how the backend “builds” a wheel and the wheel is allowed to be specific to the machine on which it is being installed.

I imagine though that having the build backend download the files itself (from where?) outside of pip’s control will upset many people for security reasons like supply chain attacks etc.

4 Likes

Yup! The plan is to do the same thing as last year and take extensive notes at the event for folks who aren’t in the room.

1 Like

This is where it goes back to the original reason this shim was created - to smooth over the need to provide --extra-index-url. Aside from the argument being something that users just don’t like to specify, especially for transitive dependencies, it also carries concerns about dependency confusion attacks, because there is no notion of index priority.

The build backend shim is nice in that it downloads from a particular URL directly, so it closes that dependency confusion attack vector. @emmatyping created this approach at NVIDIA, so maybe he’ll have more to add or will correct me (maybe in a separate topic?)

I don’t want to belabor any point about that implementation any further. I’m embarrassed that I didn’t mention sooner that we consider it a temporary solution, not one that we’d propose for long-term adoption. Hopefully we can iron out something better in the short term. Thanks for helping to seed the discussion with your proposal, Oscar.

2 Likes

I think this is going to be the hardest thing to agree on. Currently, installers (by which I mean “pip and uv”, but it’s more a matter of principle than a design quirk) won’t search for packages anywhere except PyPI and indexes that the user has explicitly opted into. PEP 470 is fairly old now (10 years!) but it provides some useful background into the problems with silently looking on external locations.

Rather than trying to formalise a way for packages to silently redirect to external hosting, I think we’d be better off finding a way to allow selecting variants from within the files available on the configured indexes. That means people wanting GPU-accelerated packages (for example) would still need to specify an additional index which hosted them[1], but they wouldn’t need to work out which index to use based on their hardware specification. And it’s perfectly possible to set the extra index in your config, so it’s a one-off “set and forget” action[2].


  1. unless splitting packages into variants made hosting everything on PyPI a practical option, of course ↩︎

  2. assuming you trust the extra index - but that’s precisely the point here, trust is a decision you need to make yourself ↩︎

1 Like

Returning to the original topic, although this thread has been a bit hijacked now…

Yes, that’s true. You could have something like pip install python-flint+x86_64_v4 and similar in requirements.txt or anywhere else a requirement goes.

The question then is how do you satisfy that requirement when all you have is the sdist? Presumably you would need a way to ask the build backend to produce a wheel with the x86_64_v4 extra platform tag.

If you could have extra platform tags like this in the wheels then you could use them to encode the fact that there is an ABI dependency between PyPI wheels by adding a tag like +pypi to denote specifically the wheels that are built for PyPI. Locally built wheels would not match the +pypi extra platform tag. Then an installer could know if a PyPI package’s wheels are not expected to be compatible with locally built wheels of its dependencies.

I think this is where I’m getting confused by the proposals here. Can I take a step back for a moment?

Currently, because of the file naming standards, a given version of a project can only have one sdist. It can have multiple wheels, which are distinguished by one or more of platform, abi, Python version and build number. None of the distinguishing factors currently encode the variants we’re discussing here (such as extended instruction set support or GPU details).

There are two distinct problems that I believe need to be addressed:

  1. How to allow wheels that differ only in what variant(s) they support.
  2. How to request that a sdist is built with a particular variant supported.

The simplest way of handling (1) is going to be to add one or more new tags to the wheel filename specification. We could simply redefine the platform tag to allow more fine-grained distinctions, but that’s problematic for various reasons, not least of which is backward compatibility. Any solution to (1) is going to mean a new version of the wheel spec, although the change might be relatively minor.

For (2), though, there’s a much bigger issue, because the build system interface currently has no API for specifying variants to the backend. Config settings could be used for this, but even that would require work to standardise a subset of the possible settings. The key thing here is that if a user requests an install of package foo, to support (say) x86_64_v4, and there are either no binary builds available or the user has explicitly requested a build from source, then the installer has to have a backend-agnostic means of passing the request to support x86_64_v4 to the build system.
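
To illustrate that gap rather than propose a solution: PEP 517 already gives frontends a config_settings channel, but the key names are entirely backend-specific, so the “variant” key in the sketch below is made up and means nothing to any existing backend.

# Sketch only: a frontend passing a (non-standard, invented) "variant" config
# setting through the existing PEP 517 hook. Standardising what such a key
# means is exactly the missing piece described in (2).
import importlib


def build_variant_wheel(backend_spec: str, wheel_dir: str, variant: str) -> str:
    # PEP 517 backend specifications look like "module" or "module:object".
    module_name, _, obj_path = backend_spec.partition(":")
    backend = importlib.import_module(module_name)
    for attr in filter(None, obj_path.split(".")):
        backend = getattr(backend, attr)
    # Today each backend decides what (if anything) this setting means.
    return backend.build_wheel(wheel_dir, config_settings={"variant": variant})


# e.g. build_variant_wheel("setuptools.build_meta", "dist/", "x86_64_v4")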

Assuming we solve (1) and (2), we still need to address the UI aspect of the functionality - how does the user say what variants they want, how do packages specify dependencies that need to be a particular variant, are variants inherited (if A depends on B, and the user asks for variant 1 of A, does that imply that variant 1 of B is also needed?) etc. Having said that, UI can’t be left to last because it’s what the user deals with. But equally, UI discussions without reference to the underlying architecture are incredibly difficult to follow.

I think we need to flesh out the use cases in more detail. We need to get to the point where we can answer:

  • Precisely what command does a user issue to install a package, assuming they have a system that supports a given variant?
  • What options does the user have, and how do they specify those options? “Give me a generic build”, “assume I have v2 rather than v4”, seem like reasonable possibilities here.
  • How does the package author define and build their project? How do they say what variants are supported? What wheels can/should they build?
  • Are there any publishing issues? In principle, I don’t think there are, but in practice GPU builds seem to be big enough that putting everything on PyPI is impractical. I think that specifying an extra index URL is a sufficient answer here, but I don’t want to prejudge the issue.

As far as I know, we have two pretty concrete use cases:

  • python-flint, where the variants relate to the level of extended CPU instruction set that is needed.
  • Packages that use the GPU, where the variants are what GPU (and programming model, if that’s the best way to describe what CUDA is…) the user has.

Can we maybe flesh out those two use cases in a bit more detail, for the non-experts trying to follow along?

If I’ve misunderstood or misrepresented anything in my comments, please let me know. I’m very aware of how limited my knowledge is here.

2 Likes

Sure, although I was thinking more about the resolver/selector, than about a user specifying the value. For example, if the x86_64_version variable was exposed as a marker, then the metadata about each wheel could be served by the package index and the installer could just evaluate the expression without having to have the extra TOML file. It would need to know how to get the value of the x86_64_version variable, but the selector dependencies could be metadata provided by the package index, too.

Exposing the variable in marker expressions would also allow for different requirements of the package based on that marker, so the install requirements for python-flint could include a rule like:

some-optimized-lib; x86_64_version == "x86-64-v4"

(I probably got that syntax wrong, but hopefully you get the idea.)
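
For what it’s worth, the existing marker machinery already evaluates environment-dependent variables like platform_machine; what’s missing is a way to add a new variable such as x86_64_version, which packaging’s parser currently rejects. A minimal illustration of the existing mechanism only:

# Illustration of today's marker evaluation; x86_64_version is NOT a
# recognised variable, so an analogous standard variable is used instead.
from packaging.markers import Marker

marker = Marker('platform_machine == "x86_64"')
print(marker.evaluate())  # evaluated against the running interpreter/platform
print(marker.evaluate({"platform_machine": "aarch64"}))  # or an explicit environment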

If you’re building from source, presumably you want to pass the compiler flags in, and if these tags were a convenient (and standard) way to do that then it would solve one of the other issues we have today, which is that everyone is hand-coding the way to configure build flags for their package.

I think I like the idea. I’m not sure about saying “this is ABI-compatible with everything on PyPI” specifically, but something along those lines could be useful. I think we can do a lot to improve the current situation without solving for that specific problem in the first iteration, though.

In my mind the interpretation of the extra platform tag is scoped to the distribution/version in question. You wouldn’t say “I want all +pypi wheels” but rather “I want a numpy+pypi wheel” and exactly what “+pypi” means in the context of numpy is just something that numpy decides.

Hypothetically numpy could build wheels for PyPI that bundle OpenBLAS etc., and the wheels could be named like:

numpy-2.0.0-cp312-cp312-win_amd64+pypi.whl

Building numpy locally would not add this +pypi tag, at least by default, but cibuildwheel could be configured to add it specifically for the wheels that numpy would put on PyPI.

Then when scipy builds wheels for PyPI, it installs the numpy PyPI wheel, builds against it in an ABI-dependent way (e.g. linking the bundled OpenBLAS directly or something), and adds a requirement like:

requires = ["numpy+pypi == 2.0.0"]

The meaning of this requirement is that this scipy wheel requires a numpy 2.0.0 wheel that has the +pypi extra platform tag, which a locally built numpy wheel would not have. The +pypi part of the requirement applies only to numpy, not to any other dependencies. Other projects can freely define whatever tagged variants they want, but the scope of the meaning of the tags is always limited to a single distribution.
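
Since this +tag syntax is not valid PEP 508, an installer experimenting with the idea would presumably have to split the tag off before normal requirement parsing. A sketch, purely as illustration of the scoping (the function and the syntax are both hypothetical):

# Hypothetical parsing of the "name+tag" requirement syntax floated above;
# the tag is split off before handing the remainder to packaging.
import re
from packaging.requirements import Requirement


def parse_tagged_requirement(req: str) -> tuple[Requirement, str | None]:
    match = re.match(r"^\s*([A-Za-z0-9._-]+)\+([A-Za-z0-9._]+)(.*)$", req)
    if match is None:
        return Requirement(req), None  # no extra platform tag present
    name, tag, rest = match.groups()
    return Requirement(name + rest), tag


# parse_tagged_requirement("numpy+pypi == 2.0.0") would give a plain
# Requirement for numpy==2.0.0 plus the "pypi" tag to match against filenames.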

I don’t think I understand enough about the mechanics of how an installer interacts with PyPI to be able to reply to this… I thought that there was a predefined set of markers. Is there already a way to dynamically declare new markers like x86_64_version? If not, then where could the project put that information if it is not in the wheel-selector.toml file?

1 Like

This is basically like extras except that it selects/builds a different wheel rather than just pulling in extra dependencies. Maybe reusing extras for this somehow is a better approach…