Except where specifically noted below, local version identifiers MUST NOT be permitted in version specifiers, and local version labels MUST be ignored entirely when checking if candidate versions match a given version specifier.
There are a lot of details (which I haven’t read fully), but including local versions in ordered comparisons is explicitly prohibited here and here. Also, the PEP states that local version identifiers should not be used in published files, but only for locally patched private versions. Does the torch example imply that this restriction is not being followed in practice?
However, despite all of the above, I have no personal issue with making a change like this. I just think we need to make sure we follow the correct process if we plan on changing either the letter or the spirit (or, as it seems in this case, both) of a published standard.
If nothing else, someone should do some research to see if the reasons for the restrictions in PEP 440 were recorded anywhere.
It’s also ignored by quite a few projects publishing nightlies on public index servers for other projects to reuse. For example, from anaconda.org/scipy-wheels-nightly/:
I never noticed before that + is technically invalid. It doesn’t help that the examples given in PEP 440 make it seem like + is the right choice for tacking on things like a CUDA version identifier or a git hash for dev builds. If there’s going to be an edit of PEP 440, those two use cases could perhaps be addressed explicitly. For the git hash the replacement is clear (e.g. for pandas, dev0+1185.g384a60 becomes dev1185.g384a60), because nightlies have the .dev versions to hide other stuff behind. For the PyTorch example it’s less clear: if you have 1.0.0 wheels for, say, three flavors (CPU, CUDA 10.2, CUDA 11.7), I don’t see what PEP 440 wants you to do here.
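As a rough illustration (using the packaging library; the 1.5.0 base version below is made up), this is where the commit information lives in the two forms:

```python
from packaging.version import Version

nightly = Version("1.5.0.dev0+1185.g384a60")  # commit count and hash in the local label
print(nightly.dev, nightly.local)             # 0 '1185.g384a60'

# With the build count moved into the dev number, it stays in the public
# version, so ordered comparisons can see it:
print(Version("1.5.0.dev1185") > Version("1.5.0.dev1184"))  # True
```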
PEP 440 wants you (rightly or wrongly) not to use version numbers to encode variants of the same version. Local versions were meant for downstream consumers to patch something while still recording that they patched it, not for the project itself to use.
Note, this is also why it disallows the use of local version numbers in specifiers. The idea was that you should not be depending on a local version, but rather they were intended to make version numbers for downstream patched projects that, when already installed, would still match their “compatible” version from upstream, while obviously recording that “foo 1.0” with XYZ patches is not the same as “foo 1.0” from PyPI.
Hmm, that seems like a problem. Unless I’m missing it, PEP 440 doesn’t actually state this explicitly. If you depend on something that you cannot encode in a version, like CUDA, and you’re forced to create a new package name each time there’s a new CUDA version, that is quite a bit of overhead. Plus potential issues like name squatting or malicious package uploads to PyPI for the names you’re using on your own public index.
So the proposal in this thread to change PEP 440 sounds good to me.
Local version identifiers are used to denote fully API (and, if applicable, ABI) compatible patched versions of upstream projects. For example, these may be created by application developers and system integrators by applying specific backported bug fixes when upgrading to a new upstream release would be too disruptive to the application or other integrated system (such as a Linux distribution).
Isn’t a dependency on something like a specific version of CUDA ABI incompatible? Assuming so, I’d say that this definition excludes the GPU use case. Even if you argue that two builds with different CUDA dependencies count as “ABI compatible”, I’d still argue that a different dependency like this isn’t what the PEP means when talking about a “patched version”.
I don’t recall GPU-type variants specifically being discussed when PEP 440 was created (I’m not sure such extensions even existed at that time!) but I am pretty sure that the intention was very definitely not to act as a way of encoding things that fall into the realm of compatibility tags.
Of course, things have moved on since then, and what made sense then might not do so now, so it’s not unreasonable to reconsider things, but I believe that @dstufft is correct about the existing intention behind local versions.
FWIW, those versions are not invalid – they just mean that you can’t publish to pypi.org with them. To be clear, this topic is not about versions like 1.0.0+local.version.label. Those are and will stay valid.
This is about version specifiers.
Today, PEP 440 allows ==1.0.0+123.local.version.label and !=1.0.0+123.local.version.label, but you can’t do >=1.0.0+123.local.version.label. That is different from the “real world” version parsing in pip, which does allow you to do that and (largely) has the “right” semantics as well.
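To make that concrete, here’s a small sketch using the packaging library (behaviour as of the strict PEP 440 parsing in packaging 22.x):

```python
from packaging.specifiers import SpecifierSet, InvalidSpecifier

SpecifierSet("==1.0.0+123.local.version.label")  # accepted
SpecifierSet("!=1.0.0+123.local.version.label")  # accepted

try:
    SpecifierSet(">=1.0.0+123.local.version.label")  # local label + ordered operator
except InvalidSpecifier as err:
    print(err)  # rejected: local versions aren't allowed with >=, <=, >, <, ~=
```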
My opinion™: They should have separate packages for each flavour and recommend that users install those via extras with package[cpu] or package[cuda102] etc. (with the extras pulling in the flavour-specific code), and use runtime checks to ensure that they’re pulling in the right pieces. With that, libraries should depend on just plain package directly and end users will have control over which flavour they’re using.
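A minimal sketch of that layout, with entirely made-up names (package, package-cpu, package-cuda102, package-cuda117) and setuptools-style metadata, just to show the shape of the idea:

```python
from setuptools import setup

setup(
    name="package",
    version="1.13.1",
    # The flavour-specific code lives in separate distributions;
    # the extras only select which one gets pulled in.
    extras_require={
        "cpu": ["package-cpu==1.13.1"],
        "cuda102": ["package-cuda102==1.13.1"],
        "cuda117": ["package-cuda117==1.13.1"],
    },
)
```

Libraries would then depend on plain package, and end users would pick a flavour with something like pip install "package[cuda117]".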
Good point. Is there a compelling use case for allowing local versions in ordered comparisons? If we’re saying that the usage of local versions to encode GPU stuff isn’t intended usage, then the torch>=1.13.1+cu117 use case is no longer particularly compelling…
Local version specifiers have the wrong semantics for “variants” anyway. The problem is that local versions are arbitrary strings that anyone is allowed to make up without any coordination, and then they’re compared more-or-less lexicographically (for lack of anything better to use).
So, a dependency on torch>=1.13.1+cu117 would be satisfied by all of these real versions that the torch project distributes:
torch 1.13.1+cu117
torch 1.13.1+cu118
torch 1.13.1+rocm3.7
torch 1.14.0+cpu
(And it’s just luck that lexicographically cpu is smaller than cu and rocm, or else cpu variants would be preferred over gpu variants.)
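A quick sketch with the packaging library shows the effect: an ordered comparison against a local version matches every one of the variants above, and the ordering between variants is purely lexicographic.

```python
from packaging.version import Version

wanted = Version("1.13.1+cu117")
for candidate in ["1.13.1+cu117", "1.13.1+cu118", "1.13.1+rocm3.7", "1.14.0+cpu"]:
    print(candidate, Version(candidate) >= wanted)  # all True

print(Version("1.13.1+cpu") < Version("1.13.1+cu117") < Version("1.13.1+rocm3.7"))  # True
```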
Similarly, it doesn’t make sense to say that somepackage 1.2.3+facebook.2 can be upgraded to 1.2.3+google.1 because it’s a better match for >= 1.2.3+facebook.1. Those are good names for locally patched internal versions, but >= doesn’t do anything sensible with them.
Really the only operations that make sense for local variants are == and !=, and those are already allowed.
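The same lexicographic behaviour shows up with downstream-patch-style labels (again a sketch with the packaging library):

```python
from packaging.version import Version

print(Version("1.2.3+facebook.2") < Version("1.2.3+google.1"))   # True ("f" sorts before "g")
print(Version("1.2.3+google.1") >= Version("1.2.3+facebook.1"))  # True, so >= would "match"
```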
My opinion is that local version numbers are a poor substitute for variants, and if we want to support variants like pytorch needs (and I think there is a good argument that we do) then we should add something that actually supports it well.
In fact, @njs did a great job of explaining why they’re a poor substitute as I was writing this.
Edit: Just to add to it, the thing you’d really want to be able to say is something like “torch>=1.14.0 AND it must have a +cu118 variant”, which is still not a good fit for local versions.
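Roughly, the semantics you’d want look like the manual filter below (a sketch only; the candidate versions are made up and there is no specifier syntax for this today):

```python
from packaging.version import Version

candidates = ["1.13.1+cu118", "1.14.0+cpu", "1.14.0+cu118", "1.14.1+cu118"]

matches = [
    c for c in candidates
    if Version(c) >= Version("1.14.0") and Version(c).local == "cu118"
]
print(matches)  # ['1.14.0+cu118', '1.14.1+cu118']
```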
I’m inclined to say let’s not allow this since it doesn’t really make sense nor really do what the user wants (it’d allow version 1.13.1+duxxx to be installed). And we should come up with something that’s capable of selecting variants instead.
For PyTorch’s use case, I think virtual dependencies are the most suitable solution.
I still think reified extras (so tensorflow[cu117] can be a real package that really participates in dependency resolution) plus a fixed Provides-Dist (so tensorflow[cu117] can satisfy dependencies that request tensorflow) would be the obvious solution for variants.
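Roughly, the metadata for such a wheel might look like the sketch below. This is purely illustrative: a Name containing [ and ] is not valid under today’s naming rules (as the next post points out), the version number is made up, and Provides-Dist is an existing core metadata field that installers largely ignore at the moment.

```
Name: tensorflow[cu117]
Version: 2.11.0
Provides-Dist: tensorflow
```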
(This probably needs to be split into a different thread.) By a “real package” do you mean tensorflow[cu117] is a separate wheel with different metadata? I’m not sure I get how this would work, since it’d require a lot of changes in the ecosystem to accommodate [ and ] in the package name. Or is this metadata not part of the main tensorflow package, so that pip install-ing tensorflow, tensorflow[cu117], and tensorflow[whatever] would install different dependencies but each produce a working Tensorflow installation?
Retreating from the boil-the-ocean topic of thinking about a better new solution for build variants / CUDA: it still seems that this was a “let’s clean this up because legacy” change in packaging 22.0, and once pip vendors that it’s going to break PyTorch pretty badly for very little real-world gain. Extras as they stand have downsides, while the current solution has worked for years. Is that really the way you want to go?