The packaging module has a bug: it raises an error when comparing non-version values with valid version values in dependency specifiers (#774). This behaviour differs from the Dependency Specifiers spec, but discussion of the issue on GitHub has revealed that implementing the spec as-is would give counter-intuitive results when comparing the order of non-standard version values. So some discussion is needed to decide how to bring packaging in line with the spec, and whether the spec should be changed.
The bug
Say you want your package to require a dependency only on macOS 12 (Darwin 21). Your package metadata might contain:
Requires-Dist: requests; platform_system == 'Darwin' and platform_release < '22'
On macOS, import platform; print(platform.release()) prints a value such as 22.1.0 (a valid packaging version), whereas on Linux the platform release may be 6.10.14-linuxkit, which isn’t a valid packaging version.
The spec requires that comparing two valid version values use the normalised version comparison rules. When one or both are not valid versions, regular Python operator rules are used (so lexicographic ordering of strings).
packaging currently raises an error when the right-hand side of an expression is a valid version and the left-hand side is not. So when evaluating this example on Linux, platform_release takes on the non-version value and packaging evaluates '6.10.14-linuxkit' < '22', which raises an error.
Reproduction instructions
For example, installing this pyproject.toml project with pip currently crashes:
# pyproject.toml
[project]
name = "version-comparison-example"
version = "1.0.0"
dependencies = [
"requests; platform_system == 'Darwin' and platform_release < '22'",
]
[build-system]
requires = ["hatchling >= 1.26"]
build-backend = "hatchling.build"
$ python -c 'import platform; print(platform.release())'
6.10.14-linuxkit
$ touch version_comparison_example.py
$ pip install .
[...]
ERROR: Exception:
Traceback (most recent call last):
[...]
pip._vendor.packaging.version.InvalidVersion: Invalid version: '6.10.14-linuxkit'
$ pip --version
pip 25.1.1 [...]
Similarly, using an extra name like v0 that is a valid version also crashes (example project config), as the package metadata ends up containing a specifier like this:
Requires-Dist: typing-extensions; extra == 'v0'
This v0 value is treated as a valid version and compared against non-version values, which crashes. (I opened a PR to fix this (#883), which is how I came to this general issue.)
The spec’s behaviour
The spec requires that marker expressions (e.g. platform_release < '22') are compared as normalised versions when both the marker’s constant value and the variable parse as a packaging version. When one or both do not parse as versions, the values are compared using the normal Python operator behaviour.
Spec's Wording
Comparisons in marker expressions are typed by the comparison operator and the type of the marker value. The <marker_op> operators that are not in <version_cmp> perform the same as they do for strings or sets in Python based on whether the marker value is a string or set itself. The <version_cmp> operators use the version comparison rules of the Version specifier specification when those are defined (that is when both sides have a valid version specifier). If there is no defined behaviour of this specification and the operator exists in Python, then the operator falls back to the Python behaviour for the types involved. Otherwise an error should be raised.
- packaging (and projects using it, like pip and most Python packaging tools) raises an error instead of using regular Python operators
- The uv project has its own implementation of dependency specifiers, and it follows the spec (uv can install the problematic example projects above)
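The fallback rule quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: parse_release and compare are hypothetical names, and parse_release is a deliberately tiny stand-in for real version parsing that accepts only plain dotted numeric releases (so values like 6.7.0+gentoo, which are valid versions, are not handled here).

```python
import operator
import re

# Hypothetical stand-in for real version parsing: it accepts only plain
# dotted numeric releases such as "22" or "22.1.0".
_RELEASE = re.compile(r"[0-9]+(\.[0-9]+)*")

_OPS = {
    "<": operator.lt, "<=": operator.le, ">": operator.gt,
    ">=": operator.ge, "==": operator.eq, "!=": operator.ne,
}


def parse_release(value):
    """Return a tuple of ints for a dotted numeric release, or None."""
    if _RELEASE.fullmatch(value) is None:
        return None
    return tuple(int(part) for part in value.split("."))


def compare(lhs, op, rhs):
    """Compare as versions when both sides parse, else as plain strings."""
    left, right = parse_release(lhs), parse_release(rhs)
    if left is not None and right is not None:
        # Pad with zeros so that "22" and "22.0.0" compare equal.
        width = max(len(left), len(right))
        left += (0,) * (width - len(left))
        right += (0,) * (width - len(right))
        return _OPS[op](left, right)
    # Fallback described by the spec: ordinary Python string comparison.
    return _OPS[op](lhs, rhs)


print(compare("21.6.0", "<", "22"))            # both sides parse: version comparison
print(compare("6.10.14-linuxkit", "<", "22"))  # left side does not parse: string comparison, no error
```

With this rule the Linux value no longer raises; it is simply compared as a string.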
The issue with the spec’s behaviour
Implementing the spec’s behaviour results in counter-intuitive comparison results when comparing the order of version-like (but not valid packaging version) values.
The version of a distribution (Python package) is required to satisfy the version spec’s rules, but the values of several environment markers are provided by the OS or system software, or are not expected to be versions (e.g. extras).
Indeed, the spec has a table listing the environment marker names and their types (String, Version, etc.), and only python_version, python_full_version and implementation_version are defined to be Version; the rest are String (or sets of strings).
However, the marker expression rules described above treat any value as a version if it parses as a version, which may happen to be the case (as in the macOS platform_release example and the extra example).
This results in most version comparisons using intuitive ordering based on the numeric value of version components, but a hard cliff occurs where values are not valid versions, and the comparison instead uses regular character-by-character string ordering:
- platform_release >= '20':
  - intuitively evaluates false when platform_release is 6.7.0 or 6.7.0+gentoo, as these are valid versions,
  - counter-intuitively evaluates true when platform_release is 6.7.0-gentoo, as this is not a valid version and so '6' >= '2' is evaluated, which is true.
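The cliff can be reproduced with plain Python, without any packaging library. In this small sketch, as_release is a hypothetical helper that turns a dotted numeric string into a tuple of ints, standing in for numeric version ordering.

```python
def as_release(value):
    """Hypothetical helper: '6.7.0' -> (6, 7, 0), for numeric ordering."""
    return tuple(int(part) for part in value.split("."))


# Numeric ordering: 6 is less than 20, so the comparison is false.
print(as_release("6.7.0") >= as_release("20"))  # False

# String ordering: only the first characters decide, and '6' sorts after '2'.
print("6.7.0-gentoo" >= "20")  # True

# The same effect on two version-like strings.
print("3.9" > "3.10")                          # True
print(as_release("3.9") > as_release("3.10"))  # False
```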
There are some examples of version-to-version and non-version-to-version equality and ordering comparisons in the tests for PR #883.
Equality comparisons (==, !=, etc.) are also affected by this, but it’s not really a problem in practice, because the difference in behaviour is limited to normalisation of version components, which is applied in version-to-version comparisons but not otherwise. E.g. 'v8-dev' == 'v8-dev.0' is true because these are versions, but 'v8-foo' == 'v8-foo.0' is false because they’re not versions.
Previous packaging module behaviour
As noted in pypa/packaging#774, packaging<22 (and pip<24.1) had a lenient LegacyVersion parsing fallback for when a value failed to parse as a version. However, when I read more of the code and tested marker evaluation, I found that the lenient version comparison was not used when evaluating markers: comparisons involving LegacyVersion always evaluated false.
And even if it were used, LegacyVersion had an epoch of -1 (vs >=0 for regular Version), so comparisons of non-versions with versions would always evaluate the non-version as smaller.
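To illustrate why an epoch of -1 makes every non-version sort below every version, here is a simplified sketch. The sort_key function is hypothetical and is not packaging's actual key; it only assumes that the epoch is the first element of the comparison key, so Python's tuple ordering decides on it before looking at anything else.

```python
import re

_NUMERIC = re.compile(r"[0-9]+(\.[0-9]+)*")


def sort_key(value):
    """Hypothetical comparison key with the epoch as its first element."""
    if _NUMERIC.fullmatch(value):
        # Regular version: epoch 0, then the numeric release.
        return (0, tuple(int(part) for part in value.split(".")))
    # Legacy fallback: epoch -1, then the raw string.
    return (-1, (value,))


# The epochs differ, so the rest of the key is never compared.
print(sort_key("6.10.14-linuxkit") < sort_key("22"))  # True
print(sort_key("99-custom") < sort_key("1"))          # True, even though 99 > 1
```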
Although it wasn’t used to intuitively evaluate non-version marker expressions, the presence of this lenient version parsing logic meant that in past versions packaging (& pip & Co.) would not raise an error when evaluating a non-version-to-version expression. The current behaviour seems to be a result of removing the legacy lenient parsing, which now results in the non-version-to-version cases raising rather than always evaluating false.
What to do?
In summary:
- The spec requires non-version-to-version markers to be evaluated as plain Python strings
- packaging currently raises an error evaluating non-version-to-version markers
- uv implements the spec’s behaviour
- Comparing the order of version-like strings with regular lexicographic string ordering results in unhelpful/counter-intuitive order, like 3.9 being greater than 3.10
- It’s not obvious to an average user under what circumstances versions will be compared with version semantics vs string semantics, and it’s also not under their control, as the logic depends on the value found in the environment
How should this situation be resolved?
- Update packaging to match the spec and accept the ordering issues with externally-sourced environment markers such as platform_release
- Update the spec(s) to improve handling of non-standard versions
- Something else?
Some pragmatic temporary workarounds/mitigations have also been proposed in #774 to avoid raising an error by evaluating marker expressions lazily. Maybe we could exclude environment names like extras from version parsing to avoid failing there.
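One possible reading of lazy evaluation is short-circuiting: in an "and" expression, stop at the first false clause so that a later clause that would raise is never evaluated. That reading is an assumption here, and the details proposed in #774 may differ. The sketch below uses hypothetical clause functions, with a ValueError standing in for the error packaging currently raises.

```python
def system_is_darwin(env):
    return env["platform_system"] == "Darwin"


def release_before_22(env):
    # Stand-in for a strict version comparison: int() raises ValueError
    # on a component such as "14-linuxkit".
    release = tuple(int(part) for part in env["platform_release"].split("."))
    return release < (22,)


def evaluate_and_eager(clauses, env):
    """Evaluate every clause first, then combine the results."""
    return all([clause(env) for clause in clauses])


def evaluate_and_lazy(clauses, env):
    """Stop at the first false clause; later clauses are never evaluated."""
    return all(clause(env) for clause in clauses)


linux = {"platform_system": "Linux", "platform_release": "6.10.14-linuxkit"}
clauses = [system_is_darwin, release_before_22]

print(evaluate_and_lazy(clauses, linux))  # False, and no error is raised

try:
    evaluate_and_eager(clauses, linux)
except ValueError as exc:
    print(f"eager evaluation failed: {exc}")
```

This only avoids the error when an earlier clause already decides the result; an expression consisting of the problematic comparison alone would still fail.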