Selecting variant wheels according to a semi-static specification

Yes, that is basically what I meant. Python’s packaging sort of has the notion that retrieving the python_flint-0.6.0-cp312-cp312-win_amd64.whl wheel from PyPI is equivalent to retrieving python-flint-0.6.0.tar.gz instead and then invoking its build backend. The resulting wheels would have the same name but are not actually equivalent, because of the bundled dependencies. This distinction does not matter for pip if it is just trying to install the project locally: if it managed to build the wheel then the necessary libraries must be available, and the extra work that I do to make the PyPI wheels portable probably isn’t needed on the target system.

So there are two types of wheels:

  • Portable PyPI wheels, which should work on any system that satisfies the constraints (OS etc.) implied by the wheel filename.
  • Non-portable wheels, which are built by a tool that naively invokes the PEP 517 interface.

From my perspective as a package author, a PEP 517 frontend generates non-portable wheels; cibuildwheel is the tool that I use to make the portable PyPI wheels.

I ran cibuildwheel locally a few times when first getting a configuration together, but I haven’t run it since. That is all just configured in CI and I hope it never breaks… Likewise, I never try to make the portable wheels myself locally. Producing the wheels that are suitable for PyPI is not something that we should expect tools like pip to be able to do as part of installing packages into an environment, and it is also not something that we really need to worry about “end users” doing.

There is less of a problem of automatic detection when building non-portable wheels from the sdist: the backend can detect the CPU, CUDA version etc. The PEP 517 interface allows the backend to pull in whatever dependencies it needs, such as cuda_selector (provided it is acceptable to run cuda_selector in an isolated venv), and to run whatever code it wants. This is in fact the exact mechanism that NVIDIA now proposes to use, because it already provides the flexibility to do all the auto-detection they need.
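A minimal sketch of what that could look like in a backend, assuming a hypothetical cuda_selector package (get_requires_for_build_wheel and build_wheel are the real PEP 517 hook names; everything else here is invented):

# In a custom build backend module: ask the frontend to install the
# detection helper into the isolated build venv, then use it.
def get_requires_for_build_wheel(config_settings=None):
    return ["cuda_selector"]  # hypothetical detection package

def build_wheel(wheel_directory, config_settings=None, metadata_directory=None):
    import cuda_selector  # available thanks to the hook above
    variant = cuda_selector.detect()  # invented API, e.g. returns "cu12"
    ...  # compile for the detected variant and write the wheel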

Most build backends generate generic (e.g. x86_64) wheels by default even when building on the target machine. If we give up the pretence that a build backend generates portable wheels, then there’s no reason we couldn’t just compile everything with -march=native, which is how you tell gcc “I don’t care about portability: use every available feature of the exact CPU in this machine”.
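As a sketch of what opting in could look like with setuptools (the extension module here is invented):

# Hypothetical setup.py fragment for a deliberately non-portable build.
from setuptools import Extension, setup

setup(
    ext_modules=[
        Extension(
            "examplepkg._core",  # invented extension module
            sources=["src/_core.c"],
            # -march=native: use every feature of the build machine's CPU,
            # so the wheel is only guaranteed to work on that machine.
            extra_compile_args=["-march=native"],
        )
    ],
)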

I think it is reasonable for now to ignore the variants when building from sdist and just say that for e.g. pip’s purposes if the build succeeds then it should be fine.

I also think that it is reasonable to punt on the discussion of how variant wheels do actually get built. In practice this is something that package authors would do by configuring cibuildwheel somehow and passing some settings through to their build backend.

4 Likes

Actually I’m not quite sure how the marker idea is expected to work. What we want here is to select between wheels for the same distribution but markers are just used to express that a requirement for a distribution is conditional on something.

Is it that you have something in the sdist metadata somewhere like

[wheel-marker-requirements]
cu11 ; cuda_version == '11'
cu12 ; cuda_version == '12'

and then pip install cudf sees that and knows that it needs to check cuda_version somehow and then use that to select one of the two variant wheels:

cudf-0.6.0-cp312-cp312-win32+cu11.whl
cudf-0.6.0-cp312-cp312-win32+cu12.whl

So the advantage of using requirements syntax is just that an installer already has a parser and evaluator for that syntax? I guess it also means that you can easily combine other markers like OS etc. in the logic.
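To make the selection step concrete, here is a rough sketch (the variant table and the cuda_version detection are invented; note that packaging’s marker parser currently rejects unknown variables like cuda_version, so this just rolls its own trivial matching):

# Hypothetical variant table, as it might appear in sdist metadata.
VARIANTS = {
    "cu11": ("cuda_version", "11"),
    "cu12": ("cuda_version", "12"),
}

def detect_environment() -> dict:
    # Stand-in for whatever mechanism actually probes the system.
    return {"cuda_version": "12"}

def pick_variant(env: dict):
    for suffix, (var, value) in VARIANTS.items():
        if env.get(var) == value:
            return suffix  # e.g. selects cudf-0.6.0-cp312-cp312-win32+cu12.whl
    return None

print(pick_variant(detect_environment()))  # -> "cu12"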

I don’t see how you would use this marker in normal requirements like in pyproject.toml to express the conditionality unless you do it by having separate distributions so cudf has to be a dummy package with requirements like:

requires = [
    "cudf-cu11 ; cuda_version == '11'",
    "cudf-cu12 ; cuda_version == '12'",
]

and then you would have to have separate cudf-cu11 and cudf-cu12 distributions. It would be better if this could be done without needing separate distributions.

Yes, I was thinking of separate distributions, so that tensorflow would have tensorflow, tensorflow-cpu, tensorflow-cuda11, and tensorflow-cuda12 all as packages. The user would only specify tensorflow, and the marker would determine which of the three underlying choices to pick.

TensorFlow has sort of done this in the past, varying the details of how it gets installed. There was a tensorflow-macos package which was basically tensorflow for M1 (not for x86 Macs, confusingly). Even tensorflow-gpu existed as a separate package at times: tensorflow-gpu · PyPI.

I wonder whether this could be done by simply having an extension mechanism for markers, so that if an installer encounters an unknown marker, it consults some sort of “registry” to find a plugin that provides that marker, and calls the plugin to calculate the value?

That may have to be installer-dependent (the standard says something like “installers MAY provide a mechanism to add extra markers…”) as I’m not sure how we’d standardise something that was (efficiently) accessible from both Python and Rust. Maybe it could be a static JSON file, and the user has to generate it with a tool they run on their system? I don’t know.
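A sketch of the static-file idea, with the path and schema invented: a detection tool the user runs once writes the values, and any installer (Python or Rust) just reads the JSON:

import json
from pathlib import Path

# Invented location; a real spec would need to pin this down per platform.
MARKER_FILE = Path.home() / ".config" / "pip" / "extra-markers.json"

def load_extra_markers() -> dict:
    """Return extra marker values written earlier by a detection tool."""
    try:
        return json.loads(MARKER_FILE.read_text())
    except FileNotFoundError:
        return {}  # no extra markers configured

# The detection tool might have written: {"cuda_version": "12"}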

I don’t believe this is true. Allowing more fine-grained platform tags requires updating PyPI to allow publishing them (perhaps on a per-package basis, enabled by manual request?), and packaging.tags to report them, but there is no reason to change the spec.

If your list of supported tags looks like ['...-win32_cu12', '...-win32', '...-any'] instead of just ['...-win32', '...-any'], then you’ll simply pick up the most targeted build for your platform. We already have ways/ideas of handling platform tags manually in builders,[1] installers and lockers.

Having some way to inject additional tags[2] for packaging.tags to return, either manually or automatically, ought to handle the rest.

It also handles the “build the most specific wheel for my platform by default” idea that was suggested, and allows setting a non-specific dependency on a package that will result in getting the most-specific files to install.
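A rough sketch of that injection, reusing the cu12 suffix from above (packaging.tags is real; the injection policy is invented):

from packaging.tags import Tag, sys_tags

def tags_with_custom(suffix: str = "cu12"):
    """Yield a more specific variant of each platform tag before the standard set."""
    standard = list(sys_tags())
    for tag in standard:
        if tag.platform != "any":
            # e.g. win32 -> win32_cu12; the more targeted tag sorts first
            yield Tag(tag.interpreter, tag.abi, f"{tag.platform}_{suffix}")
    yield from standard  # the standard tags remain as the fallback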


  1. At least those which support some amount of cross-compiling. ↩︎

  2. So I can say win32_sse4, win32_cu12 and if a package happens to have provided both then I’ll get whichever I specified first. ↩︎

1 Like

Reusing the parser is one big benefit, yes.

That would be one way. Making that useful would require restructuring existing packages, though, so it’s not the main way this would be useful.

Another would be to say something like: when $project is built for CUDA, it has extra dependencies that don’t apply when it isn’t built for that accelerator. Something like

requires = [
  "cuda-special-sauce ; accelerator == 'cuda'",
]

It’s like “extras” combined with the automatic nature of other marker values like python version, so the consumer of the package doesn’t have to mention it explicitly.

I like the generalization. I don’t know about the registry. Where would plugins be registered, and how is that managed? As a PyPA package, maybe?

Using a dependency mechanism avoids having to set up a central registry. There are downsides, of course, like multiple packages testing the same aspects of the install environment and possibly colliding. But I could see a registry process introducing more friction to adoption at community scale and more burden on someone to maintain it.

My impression (from the outside) is that manual processes like approving organizations and size limit overrides on PyPI are already overwhelming the people signed up to do that. One of our goals for this change should be to avoid as much manual intervention as possible.

How much do we want to squeeze into the platform tags’ semantics? If I wanted a way to express that wheels built for Fedora link to libraries in system packages instead of bundling those libraries, would that make sense as a tag?

How is the precedence ordering of those injected tags managed? The original post gave an example of preferring different levels of acceleration among several that might be compatible, for example. I know platform tags have that ability, but if the new tags are lumped in together with all of the other tags, how is that ordering managed?

1 Like

Yes, you’d just entirely replace the default linux platform tag with your own made-up one. Then you patch[1] the packaging.tags vendored inside your pip to add that custom tag first, and probably default to your own index rather than PyPI (which does not allow custom platform tags - by policy, not by specification/technical necessity).

It’s hard to quote footnotes, but that’s where I suggested in my post that it would be a user-specified order (i.e. they can edit their own config file). Distros/installers could add them by default, or a tool could be used to detect the set that should be in there, but provided it can be user managed, we don’t have to preemptively invent the full set of tags before shipping something useful.

My assumption is that most packages would not be looking at supporting the Cartesian product of all possible tags, but would rather pick the one that matters most to them. For example, numpy is not going to specialise on CUDA version, so none of their packages will have a cu... in the platform tag, and the relative ordering of the cu... tag vs. the sse3 tag is irrelevant. Packages that want to rely on multiple tags probably need to invent a new one specific to them, or split their package into parts in a way that separates the tags.

Either way, if they publish with a “standard” tag, they’ll be installed, and can offer a more optimised one for users who have opted in. They could even omit a standard tag and so require that users specify (again, possibly by running a tool that sets it for them) - something that 100% relies on having CUDA might do this, for example.


  1. Because there’s currently no other way. I’m proposing adding a better way. ↩︎

1 Like

This is a good use case to call out.

Another benefit of using a separate file is that the rules can be defined with some sort of precedence ordering that’s independent of the file naming. Precedence is important for the case where multiple matches might work, and the installer has to choose between them (just as it does with platform tags today).
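For illustration only (the schema is entirely invented), such a rules file might boil down to something like:

# First matching rule wins, mirroring how platform tag order works today.
RULES = [
    {"priority": 0, "require": {"cuda_version": "12"}, "prefer": "cu12"},
    {"priority": 1, "require": {"cuda_version": "11"}, "prefer": "cu11"},
    {"priority": 9, "require": {}, "prefer": None},  # fallback: plain wheel
]

def choose(env: dict):
    for rule in sorted(RULES, key=lambda r: r["priority"]):
        if all(env.get(k) == v for k, v in rule["require"].items()):
            return rule["prefer"]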

To see this in context, try this command:

> python3 -c "import pip._vendor.packaging.tags as tags; print(*tags.sys_tags(), sep='\n')"

Now imagine having an extra tag at the top of that list which will get a more specific version of a package if available, but will fall back to the next tag if not.

On my Linux system this outputs 959 tags. If we add new tag modifiers like cu12, avx512 etc., are they going to combine like a Cartesian product with the existing tags, so that the total number of tags grows exponentially, like 959 × 2^n where n is the number of modifiers?

I don’t think this is something we want to require users to configure. We should allow them to override default behavior, but the original proposal presents a way to automate the selection and precedence ordering, and I think we should keep that automation as a goal because it will provide a better experience for most users.

I do agree we don’t need to come up with all of the new tags now. In fact, I think we want to assume arbitrary values, because we won’t be able to predict all of the ways we may need to select variants in the future. So we should be looking for an approach that allows for extensibility without having to change the standard or the implementation of the installer in the future.

Tags may meet both of those criteria by allowing plugins to provide new tags and manipulate the order of the tag set used by the installer. I’m not entirely convinced that it’s possible to have two tag provider plugins do that safely without getting in each other’s way (a CUDA tag provider and a Fedora tag provider, for example).

I do think the rule-based approach avoids that plugin collision problem because it allows for filtering and sorting based on orthogonal variables independently.

1 Like

Oof, really? I only have about 40 on Windows, and 1/3rd have no platform (any) and so aren’t relevant here anyway.

I assume yours is multiplied by every possible glibc version for manylinux? That will certainly be an issue regardless. Then again, we’re only going to multiply this by the number of modifiers (I specifically said they wouldn’t cross with each other, so it’s 959*n, not 959*2**n, and if you want some particular crossover then make a new modifier for it), and the list is going to be intersected with a list of filenames that is already fully known, so it shouldn’t get exponentially bad.
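Sketched with invented filenames, the intersection looks like this, which is why the candidate list stays manageable:

# The index listing is already known, so the installer walks its ordered
# supported-tag list and stops at the first tag that any filename matches.
available = {"cp312-cp312-win32_cu12", "cp312-cp312-win32"}  # from the index
supported = ["cp312-cp312-win32_cu12", "cp312-cp312-win32", "py3-none-any"]

for tag in supported:  # most specific first
    if tag in available:
        print("selected:", tag)  # -> cp312-cp312-win32_cu12
        break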

Sure, my point is just that “user configurable” ensures it can also be automated or preconfigured by a distributor, whereas starting from one of those other two options may result in a solution that can’t support either of the others.

I’m also fully aware that pip will only support plugins under extreme protest, so I’m not assuming that a plugin-based solution will be viable. So by saying “the tool reads the tags from a file” and hand-waving how that file comes into existence, we head towards something that might actually be acceptable on all sides.

Yes:

$ python3 -c "import pip._vendor.packaging.tags as tags; print(*tags.sys_tags(), sep='\n')" | grep -- 'cp312-cp312' 
cp312-cp312-manylinux_2_35_x86_64
cp312-cp312-manylinux_2_34_x86_64
cp312-cp312-manylinux_2_33_x86_64
cp312-cp312-manylinux_2_32_x86_64
cp312-cp312-manylinux_2_31_x86_64
cp312-cp312-manylinux_2_30_x86_64
cp312-cp312-manylinux_2_29_x86_64
cp312-cp312-manylinux_2_28_x86_64
cp312-cp312-manylinux_2_27_x86_64
cp312-cp312-manylinux_2_26_x86_64
cp312-cp312-manylinux_2_25_x86_64
cp312-cp312-manylinux_2_24_x86_64
cp312-cp312-manylinux_2_23_x86_64
cp312-cp312-manylinux_2_22_x86_64
cp312-cp312-manylinux_2_21_x86_64
cp312-cp312-manylinux_2_20_x86_64
cp312-cp312-manylinux_2_19_x86_64
cp312-cp312-manylinux_2_18_x86_64
cp312-cp312-manylinux_2_17_x86_64
cp312-cp312-manylinux2014_x86_64
cp312-cp312-manylinux_2_16_x86_64
cp312-cp312-manylinux_2_15_x86_64
cp312-cp312-manylinux_2_14_x86_64
cp312-cp312-manylinux_2_13_x86_64
cp312-cp312-manylinux_2_12_x86_64
cp312-cp312-manylinux2010_x86_64
cp312-cp312-manylinux_2_11_x86_64
cp312-cp312-manylinux_2_10_x86_64
cp312-cp312-manylinux_2_9_x86_64
cp312-cp312-manylinux_2_8_x86_64
cp312-cp312-manylinux_2_7_x86_64
cp312-cp312-manylinux_2_6_x86_64
cp312-cp312-manylinux_2_5_x86_64
cp312-cp312-manylinux1_x86_64
cp312-cp312-linux_x86_64

There are 34 different manylinux tags plus the linux_x86_64 one, i.e. 35 platform tags in total. You can multiply that by the 27 combinations of Python version and ABI:

$ python3 -c "import pip._vendor.packaging.tags as tags; print(*tags.sys_tags(), sep='\n')" | grep -- 'manylinux_2_5_x86_64'
cp312-cp312-manylinux_2_5_x86_64
cp312-abi3-manylinux_2_5_x86_64
cp312-none-manylinux_2_5_x86_64
cp311-abi3-manylinux_2_5_x86_64
cp310-abi3-manylinux_2_5_x86_64
cp39-abi3-manylinux_2_5_x86_64
cp38-abi3-manylinux_2_5_x86_64
cp37-abi3-manylinux_2_5_x86_64
cp36-abi3-manylinux_2_5_x86_64
cp35-abi3-manylinux_2_5_x86_64
cp34-abi3-manylinux_2_5_x86_64
cp33-abi3-manylinux_2_5_x86_64
cp32-abi3-manylinux_2_5_x86_64
py312-none-manylinux_2_5_x86_64
py3-none-manylinux_2_5_x86_64
py311-none-manylinux_2_5_x86_64
py310-none-manylinux_2_5_x86_64
py39-none-manylinux_2_5_x86_64
py38-none-manylinux_2_5_x86_64
py37-none-manylinux_2_5_x86_64
py36-none-manylinux_2_5_x86_64
py35-none-manylinux_2_5_x86_64
py34-none-manylinux_2_5_x86_64
py33-none-manylinux_2_5_x86_64
py32-none-manylinux_2_5_x86_64
py31-none-manylinux_2_5_x86_64
py30-none-manylinux_2_5_x86_64

That gives 27 * 35 == 945 combinations, with the remaining ones (presumably the *-none-any tags) bringing the total up to 959.

Got it, that’s useful to know. Is there background somewhere for that preference? (Maybe an old thread?)

Tags have proven to be far less scalable than we’d thought when we did the original design. They work, but IMO we should be cautious about adding more tags simply because of the scalability implications. Apart from anything else, trying to decide the correct ordering of 1000-odd tags (much less N times that) is going to be tricky and error-prone.

To be more precise, pip doesn’t support plugins because we have no supported API. So plugins can’t actually communicate with the main pip code in a way that will be stable across pip versions.

That’s not to say that we’d have a problem with calling a well-defined hook interface (much like we do with build backends). However, any such interface would need to consider that not all installers are written in Python any more, so firing up a Python interpreter to call a hook might be more overhead than we want (this will end up somewhere in the resolver, which is pretty performance sensitive, but maybe it can happen once before the resolver starts?)

The main one is Create a supported "high level" programmatic API for pip · Issue #3121 · pypa/pip · GitHub. There’s also How to create plugin for PIP · Issue #3999 · pypa/pip · GitHub. As I said above, though, plugins are very different than “calling a standardised hook provided by another package”. The main issue with the latter is simply standardising something.

1 Like

Interesting. I’ve been thinking of both hooks and plugins as implementation details of the same pattern (define an API to provide dynamic information into the selection process and then allow for multiple independent implementations). It’s useful to know that you’re thinking of those implementation details as significant.

The main point in my view is that a hook can be implemented without needing any access to the caller beyond what the caller passes to it, and what it returns. I guess it’s sort of like a plugin implemented via a message-passing architecture.

There are all sorts of details needed to make a hook architecture work, though. Apart from the interface needing to be clearly defined, it may also need to be language independent[1], there needs to be a way for callers to find the hooks, etc., etc. You also need to build in versioning and extensibility, because there’s a good chance you won’t get it perfect the first time. So it’s possible, but not easy.


  1. I keep mentioning uv, but it’s a great example of why you mustn’t assume pip is the only client ↩︎

That seems like a good pattern to follow, similar to the build system hooks mentioned earlier.

I worry about a tendency to go too far in accepting additional complexity in requirements, though. If we focus on general behavior, and a reference implementation in pip (or something that pip vendors), and include the message passing as a requirement, then uv or other tools written in languages other than Python could come up with an interface layer to drive those same plugins without us having to describe how that would work in detail.

So, for example, if we say that one requirement is that a target package can express a selection-time dependency on another selector package, that’s metadata that any installer could cope with, just as it does with other types of dependencies.

Then we can say that the interface for passing data back and forth to the selector package code must use simple types. Maybe limited to things that can be encoded natively in JSON, like the build system hooks?

With those constraints, if the selector package presents a literal plugin, an installer written in Python can just pass data to the callable and an installer written in another language can use a shim application to do the same thing. If the Python installer wanted to avoid any side-effects from loading a plugin, it could also use a shim application.
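A rough sketch of that shim, where everything (module name, get_variables hook, return shape) is invented apart from the JSON-in/JSON-out constraint:

import json
import subprocess
import sys

def run_selector(module: str) -> dict:
    """Run a selector package in a subprocess and parse its JSON output."""
    code = (
        f"import json, {module} as sel; "
        "print(json.dumps(sel.get_variables()))"  # hypothetical hook name
    )
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, check=True,
    )
    return json.loads(proc.stdout)

# e.g. run_selector("cuda_selector") might return {"cuda_version": "12.4"}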

Does JSON encoding work? What sort of data should the selector be given, and what can it return?

In Oscar’s original design the input was nothing and the output was a single string value. I would slightly extend that to pass the name of the needed variable somehow, or to at least let the selector report a mapping of variable names to values so that the same package could feed back multiple criteria.

I think we’ll want at least strings and version numbers as values, so a selector can check a type of support (think of most of the current platform tags) as a singleton or enum or can specify a version or version range of something that’s supported (maybe a specific version of an extension library is needed).
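For the version case the packaging primitives already exist; a small sketch, with the selector output and the rule invented:

from packaging.specifiers import SpecifierSet
from packaging.version import Version

reported = {"cuda_version": "12.4"}  # hypothetical selector output
wanted = SpecifierSet(">=12,<13")    # hypothetical rule shipped with the wheel

print(Version(reported["cuda_version"]) in wanted)  # True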

Do we need anything else to provide an initially useful selector API?