Thanks! I’m not looking to sell a certain approach. I appreciate your proposal here. Let me lay out how I understand our current approach:
- A user wants to install cudf, which implements GPU support for dataframe (Pandas) operations. They can’t install cudf alone, because we currently put the CUDA version into the package name. So the user needs to install cudf-cu12.
- If you look at the release page for cudf-cu12 on PyPI, you’ll see a tiny sdist.
- The sdist depends on a build backend, called nvidia-stub:

  ```toml
  [build-system]
  requires = ["nvidia-stub"]
  build-backend = "nvidia_stub.buildapi"
  ```
- nvidia-stub contains the lookup and download logic, returning a wheel that matches what the “build” of the sdist is expected to produce. This wheel then gets installed. This is the part that essentially behaves like the setup.py-style arbitrary code that you say people have been trying to kill off.
It sounds like this is what you gathered/expected/feared (though I’d certainly appreciate confirmation), and I’m very grateful for your feedback.
I’d like to try to understand what degree of dynamic behavior might be palatable. Am I reading your example correctly in that its constraints on expected values are what improve it relative to “setup.py arbitrary code or things like it” (let’s call this ACE for short)? You still have some ACE in your design, but it is not involved in downloading anything. It only serves to map some system property to some pre-determined tag.
Can we establish some boundaries, then?
- ACE may not write any files to the filesystem (including downloading files)
- ACE may open and execute shared libraries (but these may not modify the filesystem)
- ACE is limited to returning tags that match pre-defined values
Is that a reasonable start?
On the flip side, I’ll mention some expectations that I have perceived from the corporate community:
- They don’t want to use indexes external to PyPI, but doing so has been a functional necessity. As you know well, PyPI’s support for variant packages is poor.
- They really want things to be on PyPI, because it dramatically simplifies instructions for customers and reduces friction with the major projects (e.g. PyTorch, TensorFlow, JAX).
- They need to operate on their release cycles. If it takes days or weeks to get a size override approved, there’s a LOT of angst going on inside the company.
Some of this is not directly related to our current discussion, but I bring it up to give context, especially around the question of external repos. We can’t get rid of them soon enough, but they are a necessary evil at the moment, and they’ll remain necessary until 100% of our software can be on PyPI.