Let's permit `+local.version.label` in version specifiers

pradyunsg · January 14, 2023, 12:15pm

As some of you have likely noticed, packaging removed non-PEP backed parsing of versions and version specifiers in 22.0.

Local version labels in version specifiers · Issue #661 · pypa/packaging · GitHub is a user report about `torch>=1.13.1+cu117` no longer being a valid version specifier after this change.

I don’t see good reasons to disallow local version labels, and it’s relatively straightforward to map the well-defined semantics for comparing local version labels onto most of the operators. Permitting this would mean that we break fewer user workflows when we eventually switch to PEP-backed parsing in widely adopted tools such as pip (which is currently blocked on a bit of a migration project: Upgrade the vendored `packaging` to 22.0+ · Issue #11715 · pypa/pip · GitHub).

Does anyone have arguments against allowing this?

pf_moore · January 14, 2023, 2:58pm

This is a change to PEP 440, which says

Except where specifically noted below, local version identifiers MUST NOT be permitted in version specifiers, and local version labels MUST be ignored entirely when checking if candidate versions match a given version specifier.

There are a lot of details (which I haven’t read fully), but including local versions in ordered comparisons is explicitly prohibited here and here. Also, the PEP states that local version identifiers should not be used in published files, but only for locally patched private versions. Does the torch example imply that this restriction is not being followed in practice?

However, despite all of the above, I have no personal issue with making a change like this. I just think we need to make sure we follow the correct process if we plan on changing either the letter or the spirit (or, as it seems in this case, both) of a published standard.

If nothing else, someone should do some research to see if the reasons for the restrictions in PEP 440 were recorded anywhere.

pradyunsg · January 14, 2023, 4:03pm

They are using it in their own index server, not on PyPI. PyPI disallows releases with local version labels.

pf_moore · January 14, 2023, 4:26pm

Cool. I wasn’t sure because PEP 440 says

Local version identifiers SHOULD NOT be used when publishing upstream projects to a public index server

and the torch server is a public index in that sense. But it’s not a big deal in practice.

rgommers · January 14, 2023, 6:41pm

It’s also ignored by quite a few projects publishing nightlies on public index servers for other projects to reuse. For example, from anaconda.org/scipy-wheels-nightly/:

matplotlib-3.7.0.dev1366+g235b01f439-cp311-cp311-manylinux_2_17_i686.manylinux2014_i686.whl
numpy-1.25.0.dev0+363.gbb2769e12-pp38-pypy38_pp73-macosx_10_9_x86_64.whl
pandas-2.0.0.dev0+1185.g384a603b14-cp39-cp39-win32.whl

I never noticed before that + is technically invalid. It doesn’t help that the examples given in PEP 440 make it seem like + is the right choice to tack on things like a CUDA version identifier or a git hash for dev builds. If there’ll be an edit of PEP 440, those two use cases could perhaps be added explicitly. For the git hash the replacement is clear (e.g. for pandas, dev0+1185.g384a60 becomes dev1185.g384a60). That’s because they have the .dev versions to hide other stuff behind. For the PyTorch example it’s less clear. If you have 1.0.0 wheels for say three flavors (CPU, cuda 10.2, cuda 11.7), I don’t see what PEP 440 wants you to do here.

dstufft · January 14, 2023, 6:55pm

PEP440 wants you (wrongly or rightly) to not try and use version numbers to encode variants of the same version. Local versions were meant for downstream consumers to patch something, while still recording that they patched it, not for the project itself to use.

dstufft · January 14, 2023, 7:01pm

Note, this is also why it disallows the use of local version numbers in specifiers. The idea was that you should not be depending on a local version, but rather they were intended to make version numbers for downstream patched projects that, when already installed, would still match their “compatible” version from upstream, while obviously recording that “foo 1.0” with XYZ patches is not the same as “foo 1.0” from PyPI.
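A minimal sketch of that matching rule for `==`, as PEP 440 describes it: a specifier without a local label ignores the candidate's label, while a specifier with one requires an exact label match. The helper names are made up for illustration, and the comparison is plain string equality with no version normalization or wildcard handling.

```python
def public_part(version: str) -> str:
    """Return the version with any '+local' label removed."""
    return version.split("+", 1)[0]


def matches_exact(candidate: str, wanted: str) -> bool:
    """Simplified `==` matching (no normalization, no wildcards)."""
    if "+" in wanted:
        # The specifier names a local version: the label must match too.
        return candidate == wanted
    # The specifier is a public version: ignore the candidate's label.
    return public_part(candidate) == wanted


# A downstream-patched build still satisfies a plain "==1.0" requirement...
print(matches_exact("1.0+xyz", "1.0"))
# ...but the upstream build does not satisfy a requirement on the patched one.
print(matches_exact("1.0", "1.0+xyz"))
```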

rgommers · January 14, 2023, 7:57pm

Hmm, that seems like a problem. Unless I’m missing it, PEP 440 doesn’t actually state this explicitly. If you depend on something that you cannot encode in a version, like CUDA, and you’re forced to create a new package name each time there’s a new CUDA version, that is quite a bit of overhead. Plus potential issues like name squatting or malicious package uploads to PyPI for the names you’re using on your own public index.

So the proposal in this thread to change PEP 440 sounds good to me.

pf_moore · January 14, 2023, 8:33pm

Local version identifiers are used to denote fully API (and, if applicable, ABI) compatible patched versions of upstream projects. For example, these may be created by application developers and system integrators by applying specific backported bug fixes when upgrading to a new upstream release would be too disruptive to the application or other integrated system (such as a Linux distribution).

Isn’t a dependency on something like a specific version of CUDA ABI incompatible? Assuming so, I’d say that this definition excludes the GPU use case. Even if you argue that two builds with different CUDA dependencies count as “ABI compatible”, I’d still argue that a different dependency like this isn’t what the PEP means when talking about a “patched version”.

I don’t recall GPU-type variants specifically being discussed when PEP 440 was created (I’m not sure such extensions even existed at that time!) but I am pretty sure that the intention was very definitely not to act as a way of encoding things that fall into the realm of compatibility tags.

Of course, things have moved on since then, and what made sense then might not do so now, so it’s not unreasonable to reconsider things, but I believe that @dstufft is correct about the existing intention behind local versions.

pradyunsg · January 14, 2023, 8:37pm

FWIW, those versions are not invalid – they just mean that you can’t publish to pypi.org with them. To be clear, this topic is not about versions like 1.0.0+local.version.label. Those are and will stay valid.

This is about version specifiers.

Today, PEP 440 allows ==1.0.0+123.local.version.label and !=1.0.0+123.local.version.label but you can’t do >=1.0.0+123.local.version.label. That is different from the “real world” version parsing in pip, which does allow you to do that and (largely) had the “right” semantics as well.
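To make the distinction concrete, here is a rough sketch of a checker that accepts a `+local` label only after `==` and `!=`. It is not the `packaging` implementation: the regular expression is deliberately loose, the names are hypothetical, and it ignores `===`, wildcards, and validation of the public part.

```python
import re

# Operators for which PEP 440 permits a "+local" label in the specifier.
LOCAL_ALLOWED = {"==", "!="}

# Deliberately loose: an operator, a public part starting with a digit,
# then an optional local label made of alphanumeric segments.
SPEC_RE = re.compile(
    r"^(==|!=|~=|<=|>=|<|>)\s*"
    r"([0-9][^+\s]*)"
    r"(\+[A-Za-z0-9]+(?:[._-][A-Za-z0-9]+)*)?$"
)


def accepted_by_pep440(spec: str) -> bool:
    """Return True if the specifier passes this simplified PEP 440 check."""
    match = SPEC_RE.match(spec.strip())
    if match is None:
        return False
    operator, _public, local = match.groups()
    # A local label is fine only with the (in)equality operators.
    return local is None or operator in LOCAL_ALLOWED


for spec in ("==1.0.0+123.local.version.label",
             "!=1.0.0+123.local.version.label",
             ">=1.0.0+123.local.version.label"):
    print(spec, accepted_by_pep440(spec))
```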

pradyunsg · January 14, 2023, 8:57pm

My opinion^TM: They should have separate packages for each flavour and recommend that users install those via extras with package[cpu] or package[cuda102] etc (with the extras pulling in the flavour-specific code) and use runtime checks to ensure that they’re pulling in the right pieces. With that, libraries should depend on just plain package directly and end users will have control over which flavour they’re using.

pf_moore · January 14, 2023, 9:13pm

Good point. Is there a compelling use case for allowing local versions in ordered comparisons? If we’re saying that the usage of local versions to encode GPU stuff isn’t intended usage, then the torch>=1.13.1+cu117 use case is no longer particularly compelling…

njs · January 14, 2023, 9:38pm

Local version specifiers have the wrong semantics for “variants” anyway. The problem is that local versions are arbitrary strings that anyone is allowed to make up without any coordination, and then they’re compared more-or-less lexicographically (for lack of anything better to use).

So, a dependency on torch>=1.13.1+cu117 would be satisfied by all of these real versions that the torch project distributes:

torch 1.13.1+cu117
torch 1.13.1+cu118
torch 1.13.1+rocm3.7
torch 1.14.0+cpu

(And it’s just luck that lexicographically cpu is smaller than cu and rocm, or else cpu variants would be preferred over gpu variants.)

Similarly, it doesn’t make sense to say that somepackage 1.2.3+facebook.2 can be upgraded to 1.2.3+google.1, because it’s a better match to >= 1.2.3+facebook.1. Those are good names for locally patched internal versions, but >= doesn’t do anything sensible with them.

Really the only operations that make sense for local variants are == and !=, and those are already allowed.
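The ordering described above can be sketched in a few lines. This follows PEP 440's rules for local labels as I understand them (segments split on `.`, numeric segments compared as integers and sorted above text, text compared case-insensitively); the function name is made up for illustration and this is not the `packaging` code.

```python
def local_sort_key(label: str):
    """Sort key for a local version label, following PEP 440's ordering."""
    key = []
    # "-" and "_" are treated as equivalent to "." between segments.
    for segment in label.replace("-", ".").replace("_", ".").split("."):
        if segment.isascii() and segment.isdigit():
            # Numeric segments compare as integers and sort above text.
            key.append((1, int(segment), ""))
        else:
            # Everything else compares as a case-insensitive string.
            key.append((0, 0, segment.lower()))
    return tuple(key)


labels = ["cu118", "rocm3.7", "cpu", "cu117"]
print(sorted(labels, key=local_sort_key))

# The vendor names alone decide the order here, not the patch numbers.
print(local_sort_key("google.1") > local_sort_key("facebook.2"))
```

Sorting puts `cpu` first and `rocm3.7` last, and `google.1` compares greater than `facebook.2`, which is the behaviour that makes `>=` meaningless for these labels.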

dstufft · January 14, 2023, 9:40pm

My opinion is that local version numbers are a poor substitute for variants, and if we want to support variants like pytorch needs (and I think there is a good argument that we do) then we should add something that actually supports it well.

In fact @njs did a great job at explaining why they’re a poor substitute as I was writing this

Edit: Just to add on to it, the thing you’d really want to be able to say is something like torch>=1.14.0 AND it must have a +cu118 variant or something, but still not a good fit for local versions.
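As a purely hypothetical sketch of that "version floor AND named variant" idea: the public version is compared as an ordered value while the variant is matched only by equality. Nothing like `select` exists in any packaging tool today, the `1.14.0+cu118` candidate is invented for the example, and the parser only handles plain numeric releases.

```python
def parse(version: str):
    """Split 'X.Y.Z+label' into a release tuple and a label (or None)."""
    public, _, local = version.partition("+")
    release = tuple(int(part) for part in public.split("."))
    return release, (local or None)


def select(candidates, minimum: str, variant: str):
    """Keep candidates at or above `minimum` whose label equals `variant`."""
    floor, _ = parse(minimum)
    chosen = []
    for candidate in candidates:
        release, label = parse(candidate)
        # Order by release only; the variant is an equality test.
        if release >= floor and label == variant:
            chosen.append(candidate)
    return chosen


available = [
    "1.13.1+cu117",
    "1.13.1+cu118",
    "1.13.1+rocm3.7",
    "1.14.0+cpu",
    "1.14.0+cu118",  # hypothetical, added for the example
]
print(select(available, "1.14.0", "cu118"))
```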

uranusjr · January 16, 2023, 6:28am

I’m inclined to say let’s not allow this since it doesn’t really make sense nor really do what the user wants (it’d allow version 1.13.1+duxxx to be installed). And we should come up with something that’s capable of selecting variants instead.

For Pytorch’s use case, I think virtual dependencies are the most suitable solution.

njs · January 16, 2023, 7:37am

I still think reified extras (so tensorflow[cu177] can be a real package, that really participates in dependency resolution) + a fixed provides-dist: (so tensorflow[cu177] can satisfy dependencies that are requesting tensorflow) would be the obvious solution for variants.

uranusjr · January 16, 2023, 8:15am

(This probably needs to be split into a different thread.) By a “real package” do you mean tensorflow[cu117] is a separate wheel with different metadata? I’m not sure I get how this would work, since it’d require a lot of changes in the ecosystem to accommodate [ and ] in the package name. Or would this metadata be part of the main tensorflow package, so that pip install-ing tensorflow, tensorflow[cu117], and tensorflow[whatever] would install different dependencies but each produce a working Tensorflow installation?

pradyunsg · January 16, 2023, 10:16am

Neato - I’m going to go ahead and wrap this discussion up by saying that we’re not going to permit this then.

I’m tentative about how this is going to disrupt users, but we can cross that bridge when we get there.

rgommers · January 16, 2023, 11:54am

Retreating from the boil-the-ocean topic of thinking about a better new solution for build variants / CUDA: it still seems that this was a “let’s clean this up because legacy” change in packaging 22.0, and once pip vendors that it’s going to break PyTorch pretty badly for very little real-world gain. Extras as they stand have downsides, while the current solution has worked for years. Is that really the way you want to go?

rgommers · January 16, 2023, 11:56am

Or is this the thing where it doesn’t actually break much? It’s really hard to tell from the bug report.