How does this interact with existing installations? Wheel Variant Support - WheelNext makes it sound like variant.json is part of the install; is the variant used tracked, and what happens when the system changes and such a variant is no longer a valid choice?
Do they? The BLAS library is a private implementation detail of either library. If BLAS symbols are hidden [1], things should ideally work fine.
The most idiomatic and user-friendly solution for this is to adopt dynamic dispatch rather than ship different packages for different CPU support levels.
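For illustration, here is a minimal sketch of that pattern: detect CPU features once at import time and bind the fastest available implementation. The `/proc/cpuinfo` probe is Linux-specific and the `dot_avx2` path is a stand-in for an accelerated build; real projects typically do this in compiled code via CPUID or compiler function multi-versioning.

```python
# Sketch: runtime CPU-feature dispatch instead of shipping per-CPU packages.
# Linux-only probe for illustration; falls back to the generic path elsewhere.

def _cpu_flags():
    """Return the set of CPU feature flags, or an empty set if unknown."""
    try:
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()

def dot_generic(a, b):
    return sum(x * y for x, y in zip(a, b))

def dot_avx2(a, b):
    # Placeholder: a real build would call into an AVX2-compiled extension.
    return sum(x * y for x, y in zip(a, b))

# Pick the best implementation once, at import time.
dot = dot_avx2 if "avx2" in _cpu_flags() else dot_generic
```

One wheel then serves every CPU level, at the cost of shipping all code paths in a single artifact.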
I gladly admit my incompetence on this, but do you have a pointer to this? This official doc seems to say otherwise:
Also note that there are two CUDA APIs (and libraries): the CUDA runtime API and the CUDA driver API. According to the link above:
The CUDA Driver API has a versioned C-style ABI, which guarantees that applications that were running against an older driver (for example CUDA 3.2) will still run and function correctly against a modern driver (for example one shipped with CUDA 11.0)
And perhaps even if they are not, given CPython's use of RTLD_LOCAL… ↩︎
Not to pick on this specific comment, but I want to mention that one of the goals of the WheelNext effort was to pull together a bunch of different stakeholders that care about this problem: to demonstrate that it wasn't just one organization or entity advocating for these changes. There are a lot of different companies, individuals, and (by extension) packages involved. We don't even publish any such packages, but a non-trivial portion of the issues that we get in uv stem from the problems that variants are trying to solve. Even if we just limit it to the GPU use-case, these packages are hugely popular (and I believe, though obviously can't prove, that there would be more of them if it wasn't so hard to package and distribute, e.g., vllm for a variety of different GPU architectures, declare dependencies based on the GPU architecture, enable users to install the "right" build without a deep understanding of Python packaging, etc.).
I don't fully follow this point. Why do you need to run code to look at the provider package? The providers tend to be pure Python; they are not themselves variant-enabled, at least for the proof of concepts that were used in the PyTorch experiment. I don't know why they would be.
…and I don't think pip defaults can change anytime soon, because high-profile packages still have some old dependencies that don't provide wheels. I'm sorry I don't have a specific example handy, but I do recall seeing pip build an sdist recently when I was installing dependencies for some test suite.
I could have worded this better. At the end of the day, the issue here is that this undermines an expectation held in many places: not only have people expected that downloaded remote code would not be executed, but that expectation was part of the push for wheels in the first place.
The rest of that comment goes over how, given the stakeholders at issue, this could seemingly have been done statically, and I don't see any compelling argument here as to why it couldn't. Doing this at the resolver level, rather than allowing packages to execute code to decide what to install, also allows the resolver to bail out earlier when there are conflicting requirements like incompatible BLAS linking, because that information is just there upfront.
I think it's "possibly okay", but still a suboptimal compromise on the promises of wheels, to do this if it's opt-in; and I worry a little bit that repeated prompts like this will lead to people ignoring them.
Sure, we could have rebooted the discussion once again. However, I don't believe that restarting it without any new data was likely to bring a different outcome. And at least I don't consider myself capable of working out a solution to such a complex problem, and anticipating all the potential issues, in a purely theoretical way. Even if that were possible, I don't believe it would be the most efficient way.
I understand your sentiment, and in fact I share it often. Say, when people give me complex patches without any prior discussion, and I am torn between accepting a patch I disagree with or discarding all the work the user already put into it. However, in this case I do believe prototyping was the appropriate course of action, and all of us have done so with the awareness that the proposal may require significant changes, or even be entirely unsuccessful. And as Jonathan already pointed out, while working on the prototype we've hit many issues that we did not anticipate, nor found mentioned in the previous discussions.
Yes, discussing on top of an existing prototype changes the flow of discussion. But more importantly, it means that we have a much better understanding of the problem scope, which means that in some cases we will be able to answer concerns and ideas with actual experience and results, rather than theorizing. It also means that the discussion can be more focused, which in turn increases the chances of reaching a consensus rather than diverging in multiple incompatible directions, and of it being open to more people who simply can't dedicate that much time to reading all the possible angles that could come up in a more generic discussion.
To be honest, I've been thinking about this a lot, and the problem is complex. The proposal is designed as a "backwards compatible" extension to the wheel format. By that, I mean that:
Wheels that do not use variants do not change at all. There is no reason to change the wheel version there, as it would break backwards compatibility; and we do aim to let packages provide non-variant wheels as a fallback for package managers that do not implement or enable variant wheels.
Wheels that use variants change the filename, and as Jonathan pointed out, we have specifically tested that all the common tools will reject them as having an invalid filename. There is no backwards compatibility to be preserved here, and it is implicitly broken by the changed filename anyway.
That considered, I believe that we can raise the wheel version, but I'm not entirely convinced it's actually necessary or desirable. If we were to raise the version, then at least during some extensive transitional period regular wheels would still have to be published with the old version; possibly "forever", given there is really no benefit to breaking backwards compatibility for regular wheels.
Another aspect is that there are other open suggestions for improving the wheel format (symlinks, new compression algorithms), and if we were to raise the wheel version, we might actually include them as well. Even if it meant that these new features would only be available to wheel variants (since using them would break backwards compatibility), some of the packages needing variants would also benefit from them (e.g. because they're huge, so they'd use a better compression).
And as a fun idea, variants could then actually be used to combine these new features with backwards-compatible regular wheels: say, a package could provide a "null" variant using symlinks, with a no-symlink regular wheel fallback.
Yes, it is. However, I don't think it is possible to provide a good solution to more fine-grained CPU architecture specifications within the framework of tags and installers.
The way I see it, platform tags work because Python knows the platform it's running on. And we can reasonably assume that it knows it, because, well, it needs to actually run on it. With CPU architecture versions, it's not that simple.
Installers would need to know what architecture version they're running on, and what versions are compatible with it. They would need to either get that information from Python itself (like they can for platform tags), or carry their own implementation.
Having the code for that in the Python interpreter does not sound like a good idea. CPython can happily run on an x86-64-v4 CPU, even be compiled with -march=x86-64-v4 or equivalent, without realizing at the code level that such an architecture level exists. We would have to maintain the detection logic in CPython, backport updates and fixes, and users on earlier Python versions would be unable to install wheels that are actually valid for their platform. While new architecture versions are relatively rare, and using an older version of Python is a bad idea anyway, I don't think this justifies rejecting valid wheels.
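To make the maintenance burden concrete, here is roughly what such detection logic looks like. The flag sets below are illustrative subsets of the x86-64 psABI level definitions, not the complete lists, and every new level would mean another table to keep correct and in sync across tools.

```python
# Sketch: mapping detected CPU flags to x86-64 microarchitecture levels.
# Flag sets are ILLUSTRATIVE SUBSETS of the psABI definitions, not complete.

V2 = {"cx16", "popcnt", "sse3", "sse4_1", "sse4_2", "ssse3"}
V3 = V2 | {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "movbe"}
V4 = V3 | {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}

def x86_64_level(flags):
    """Return the highest x86-64 level whose required flags are all present."""
    if V4 <= flags:
        return "x86-64-v4"
    if V3 <= flags:
        return "x86-64-v3"
    if V2 <= flags:
        return "x86-64-v2"
    return "x86-64-v1"
```

Every installer (or CPython itself) would have to carry a table like this for each architecture family, and update it whenever a new level is defined.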
Having it in the installer is more realistic, but it means every installer would have to repeat the same logic, and keep it updated and in sync. People would have to upgrade installers to install new wheels; and while admittedly that's a good idea anyway, and it is definitely easier than updating the Python interpreter itself, it's still likely to cause some annoyance. It also increases the likelihood of subtle bugs, such as a particular CPU being detected incorrectly, and of inconsistent behavior across different tools.
Both approaches also generally assume dumping the maintenance burden on people who often don't have the relevant hardware, detailed architecture knowledge, or even the interest in actively maintaining frameworks for correctly detecting the version of half a dozen different architectures.
I think the provider plugin approach is better suited to the task, as it nominally defers the task to a single package that will presumably be maintained by someone who has the hardware, the knowledge, and an actual interest in it. This package will be released independently of the installers, and (at least within the current framework of the proposal) automatically installed in the newest version available (modulo version constraints), therefore enabling users to take advantage of newer architecture versions faster.
Of course, you could combine the plugin approach with platform tags. However, I think that introduces an unnecessary special case, given that the proposed framework can handle it equally well in a generic way.
It says that it can be a string of up to 8 ASCII characters. How would I encode a package for a CUDA version, an OpenBLAS version, maybe CPU extensions? cuXX-blasYY-AVXzz is already too long. [1]
Coming up with novel encoding schemas for each package and project is still part of the problem I mention: I want to be able to find out what a wheel (variant) is for without having to resort to external documentation, which might disappear or not even publicly exist in the first place.
ignore the use of hyphens as a separator, used only for illustrative purposes ↩︎
Pathological example where this isn't the case: foo-1.0-1none-any-any-12.whl. Does that have a build number of 1none or a variant spec of 12?
Do all tools ignore invalid wheel filenames? I know I used to write a lot of scripts that bulk-processed the contents of PyPI, and I'm pretty sure I never wrote any code to skip invalid wheel filenames. I definitely had code that simply counted hyphens, and then split the filename depending on whether there were 5 or 6. That code would break horribly when it encountered the new scheme. I might have had something to log an error and continue, but that's not the same as ignoring them.
There really is. "A wheel filename contains 5 or 6 hyphens", for example. Or "the tags are the last 3 hyphen-separated fields". Not every tool uses packaging to parse wheel filenames, and the standards don't require that they do.
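A sketch of the kind of hyphen-counting parser described above. The `cu126` variant label is hypothetical, but the failure mode is the interesting part: appending one extra field makes a no-build-tag wheel look like a wheel with a build tag, so the parser misassigns fields silently rather than rejecting the name.

```python
# Sketch: a naive but historically spec-conforming wheel-filename parser
# that splits on hyphens and dispatches on the field count.

def parse_wheel(name):
    parts = name[: -len(".whl")].split("-")
    if len(parts) == 5:
        dist, version, py, abi, plat = parts
        build = None
    elif len(parts) == 6:
        dist, version, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected wheel filename: {name}")
    return dist, version, build, py, abi, plat

parse_wheel("foo-1.0-py3-none-any.whl")      # fine: build is None
parse_wheel("foo-1.0-1-py3-none-any.whl")    # fine: build is "1"

# A hypothetical variant filename misparses SILENTLY: "py3" lands in the
# build slot and "cu126" is taken for a platform tag.
parse_wheel("foo-1.0-py3-none-any-cu126.whl")
```

With both a build tag and a variant label (7 fields), the same parser raises instead, so existing tools fail in two different ways depending on the filename.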
I'm not trying to make things difficult here, but this is definitely something that was discussed some time back on Discourse, and the basic conclusion was that a lot of people felt that the existing versioning system for wheels was inadequate, and we were "stuck" with the current spec until we found a way out of that problem. I'm not seeing anything here that explains how those concerns were addressed, and I don't feel that "we tried things out and it looks like it's OK" really resolves that problem.
More generally, I suppose I wonder if something along the lines of the PEP 780 "ABI features" proposal would work here (or what the reasons are that it's been considered and won't work).
If we have a set of environment markers defined for all the reasonable "variant" axes (e.g. GPU runtime API, GPU driver API, CPU instructions(?), OpenBLAS(?)), and a way of specifying that a wheel requires W, X, Y, and Z in the METADATA, could we make this work statically? It might require range-downloading METADATA files for several wheels, if we're not able to put all of this into the filename, but I would still strongly prefer a static solution for "variants".
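A rough sketch of what that static matching could look like, with entirely hypothetical marker names (`cpu_features`, `cuda_driver_api_min`); the point is only that selection reduces to comparing declared requirements against a locally detected table, with no package code executed.

```python
# Sketch: static variant matching against hypothetical marker names.
# SYSTEM would be populated once by local hardware detection.

SYSTEM = {
    "cuda_driver_api": (12, 6),          # detected driver API version
    "cpu_features": {"avx2", "fma", "sse4_2"},
}

def wheel_ok(requires):
    """Return True if this system satisfies a wheel's declared requirements."""
    # Every CPU feature the wheel needs must be present locally.
    if not requires.get("cpu_features", set()) <= SYSTEM["cpu_features"]:
        return False
    # "No newer than the provided version": the wheel declares the minimum
    # driver API it was built against, and newer drivers remain compatible.
    minimum = requires.get("cuda_driver_api_min")
    if minimum is not None and SYSTEM["cuda_driver_api"] < minimum:
        return False
    return True
```

A resolver could evaluate this for each candidate wheel's METADATA and pick the most specific match, entirely from declarative data.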
This preference for static is not just for security reasons but also to simplify the mental model: I can work through the steps a "resolver" takes by looking at what my computer offers in terms of supported APIs, and what various wheels require, and match the two up. With arbitrary code, I need to know what is going on case-by-case, and potentially deep in the dependency tree. A static approach means that the algorithm is set by a standards process, and is easier to explain and comprehend.
I agree with this. While I'm perfectly happy to concede that this should never happen in practice, a dynamic plugin-based system needs to consider what happens when (for example) a plugin does something pathological like reporting a different answer every third invocation (which could wreck tools that cache plugin results). This is a real concern for me: we've had so much trouble dealing with sdist builds because setuptools runs arbitrary code at build time, and it's almost impossible to say anything meaningful about sdist builds as a result. It's not even just about malicious code; developers have some insanely weird setup.py scripts out there, which we can't just dismiss because they do the job they were built for. (Obvious xkcd link: xkcd: Workflow)
FWIW, pip does not exactly do that; it has a regex for wheel filenames that is not spec-compliant.
We've deprecated invalid names that match this regex for a few releases now; the plan is to ignore all invalid names in pip 25.3.
But there is of course a long tail of older pip installations out there. It's probably worth checking whether these variant names match the regex pip uses.
As I said, it's not just about pip and uv. We can change to be compatible with the new scheme, and compatibility issues with older versions of pip/uv can be addressed. It's about other tools that, in good faith, wrote spec-conforming parsers of their own to do a specific, limited job. We shouldn't casually break such tools.
We can complain all we like that the wheel spec was too permissive, and we should have placed tighter limits on what the filename format was, or on how consumers had to handle invalid formats. But we didn't, and we can't just pretend otherwise. We can even tighten up the spec. But that's simply taking the compatibility hit now, in order to make future work easier, and it doesn't absolve us of the need to provide a transition process for affected users.
I've not done a thorough reading of the spec with regard to what consumers should do with invalid names; are you sure it doesn't tell consumers how to behave for invalid names?
But I don't disagree with you; my point was actually that we should also be considerate of popular tools that are not spec-compliant, like pip, so as not to break user workflows.
I also suspect the answer is that most people will just cram the "just make it work" flag into their config, get grumpy that they had to do that, and then forget about it. Which pushes me to think that opt-out might be the right approach unless we can think of a way to square the round peg. Otherwise it feels a bit like we're compromising usability for ideological reasons?
But if folks think that most folks won't just do that, I'm interested to hear that too. Maybe I'm wrong?
I suspect that what Barry was getting at is that "code that has to be executed" is code that has to be executed, whether it's part of a hypothetical selector package, part of a setup.py, part of a build backend, or part of a hypothetical variant plugin.
I think the issue is that packages that are helped by variants tend to also be larger, and if they tried to do a single wheel that did dynamic dispatch they'd likely be well past the limits PyPI has for a single wheel.
@pitrou Regarding CUDA versions, you're likely right that it's more like the CPU feature problem (building against an older API version means being compatible with more target environments, but also means missing out on newer hardware features and running slower as a result), so the constraint is "no newer than the provided version" rather than "must be built against exactly the provided version".
My interpretation was based on the way folks have been using venvstacks, which isn't necessarily representative of what the underlying APIs permit.
Regardless, the point of the selector module concept is to allow the cuda project to be the one that publishes those variant selection rules (including the runtime hardware interrogation capabilities), rather than build tools and installers having to implement that functionality independently of each other.
I'll just quickly note, as I've seen this line of reasoning mentioned a few times and I didn't address it, that my points earlier in the thread about opt-in vs. opt-out aren't about satisfying most users.
They're about users who have done an analysis of what they consider safe, and who would have that analysis invalidated with no warning.