I am trying to fully implement arbitrary equality in packaging and I got stuck on whether the versions should be case insensitive or not.
This part of the spec is to support legacy versions that do not conform to the “modern” specification, as evidenced by this line:
This operator is special and acts as an escape hatch to allow someone using a tool which implements this specification to still install a legacy version which is otherwise incompatible with this specification.
So investigating how prior implementations worked, both LegacyVersion (implemented in 2014 in packaging) and parse_version (implemented in setuptools in 2005) are case insensitive. And prior to that StrictVersion and LooseVersion (implemented in CPython in 1998) only accepted lower case.
To clarify the situation I would like to change the following sentence:
Arbitrary equality comparisons are simple string equality operations which do not take into account any of the semantic information such as zero padding or local versions.
To:
Arbitrary equality comparisons are simple case-insensitive string equality operations which do not take into account any of the semantic information such as zero padding or local versions.
I’m not a Unicode expert so I don’t know, but all prior implementations relied on comparison via str.lower, so whatever best describes that.
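For illustration, here is a minimal sketch of what a str.lower-based arbitrary equality check amounts to. The function name `arbitrary_equal` is hypothetical and is not packaging's API; it only assumes that both sides are lowercased before a plain string comparison.

```python
def arbitrary_equal(candidate: str, spec_version: str) -> bool:
    """Hypothetical helper: compare two version strings, ignoring case only."""
    # Nothing is parsed or normalized beyond lowercasing, so zero padding
    # and local versions are not interpreted: "1.0" and "1.00" stay different.
    return candidate.lower() == spec_version.lower()


print(arbitrary_equal("1.0A1", "1.0a1"))  # differs only in case
print(arbitrary_equal("1.0", "1.00"))     # differs in zero padding
```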
The regular version specification only allows for ASCII to be used, but this is a specific escape hatch from the regular version specification where previously arbitrary strings were allowed.
The real question is whether two different implementations need to give the same results. I think it goes without saying that they do for an interoperability standard.
In which case, we need to be sure there is no ambiguity - and it’s worth remembering that not all tools are written in Python, so even “use str.lower” isn’t as precise as we might hope.
Case sensitivity is much more unambiguous, so is there a real-world situation where we need case insensitivity?
Packaging currently implements case insensitive comparisons, so either it would need to be changed to match the standard, or the standard needs to be changed to match the way this has been done for the last 20 years. As mentioned above, this is only for legacy (not valid) versions anyway, so preserving legacy behavior makes sense. In the packaging repo, Damian has made 2 PRs: one adds tests showing the current behavior, and one changes the behavior to match the spec. This is really about which one is correct.
(The spec doesn’t really state whether it needs to be case sensitive or case insensitive currently, though case sensitive is somewhat implied - if it is case sensitive, then I think the spec should still be updated to clarify that.)
Sure, I am 100% open to less ambiguous language, I bring str.lower up only as how it always has been implemented, even today in packaging, not as how it should be written into the spec. So whatever the language is must not contradict str.lower.
The real world use case is matching against any version published before the switch to modern PEP-440 style versions that is not PEP-440 compliant, which the spec specifically calls out. Looking at the history of version parsers prior to PEP-440, they were all case insensitive. So specifying case sensitivity would break what appears to be the main purpose of arbitrary equality. If your version happens to be PEP-440 compliant then it is also matched case insensitively anyway.
IMO, the spec doesn’t imply case insensitivity at all. It says to use “simple string equality operations”, which to me is case sensitive. Having said that, “we’ve been doing case insensitive checks for 20 years” is a reasonable argument for not changing things.
I do think that if we choose case insensitivity, we need to be very explicit what that means. And I don’t think that we can appeal in that case to “keep doing what we’ve done for the past 20 years”, as Python’s behaviour has changed in that time - str.upper() is Unicode-aware nowadays, and it wasn’t in Python 2.
So IMO, case sensitivity is the easy path - it’s arguable that it is what the standard always said, and we can simply clarify the language. Whereas case insensitivity would need a formal specification of what we mean - taking into account all the complexities of Unicode. So that would likely need a full PEP (although hopefully a small and relatively non-controversial one). And who knows whether people have strong opinions about some of the edge cases here?
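To make the ambiguity concrete, here is a small sketch comparing three plausible ways of implementing "case insensitive" in Python 3. The helper names are made up for this example. All three agree on ASCII input, but they disagree once non-ASCII text such as "ß" is involved.

```python
def eq_lower(a: str, b: str) -> bool:
    # Compare after str.lower()
    return a.lower() == b.lower()


def eq_upper(a: str, b: str) -> bool:
    # Compare after str.upper()
    return a.upper() == b.upper()


def eq_casefold(a: str, b: str) -> bool:
    # Compare after str.casefold(), Python's Unicode case folding
    return a.casefold() == b.casefold()


pairs = [("1.0A1", "1.0a1"), ("straße", "STRASSE")]
for a, b in pairs:
    # Print the verdict of each strategy for the same pair of strings
    print(a, b, eq_lower(a, b), eq_upper(a, b), eq_casefold(a, b))
```

For the ASCII pair all three strategies return True. For the second pair, lowercasing keeps "ß" as-is and so reports a mismatch, while uppercasing and case folding both expand it to a double "s" and report a match.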
Does Rust have an “uppercase” function that exactly follows what Python’s str.upper() does?
Conceded, but equally, do we have any evidence that anyone has ever used arbitrary equality in a context where case insensitivity was needed? I wouldn’t be surprised if all of the real use cases used lowercase throughout (both in published versions and in constants used in version comparisons).
Actually, does PyPI (or any other index server) even allow non-PEP 440 versions these days? If Python packages with non-standard versions no longer exist, how much do we even care? I explicitly don’t care about using arbitrary equality to compare against version strings that aren’t Python package versions - PEP 440 isn’t the right tool for that job in the first place.
Actually, there’s another option here. We could simply state that the behaviour of arbitrary equality is not specified by the standard except on ASCII strings, where it is a case insensitive string comparison. That might be the simplest practical option. If anyone’s using non-ASCII legacy versions, they’d need to speak up.
I like that option a lot. It’s simple and well-defined, and won’t run into trouble specifying some exact algorithm which may not match Python’s str.lower.
So we would change
Arbitrary equality comparisons are simple string equality operations which do not take into account any of the semantic information such as zero padding or local versions.
to, perhaps?
Arbitrary equality comparisons are simple string equality operations which do not take into account any of the semantic information such as zero padding or local versions. The comparison is case-insensitive for ASCII letters and undefined for any non-ASCII text.
Notably, I think we have to call it out as explicitly undefined in order for str.lower() to still be allowed as an implementation. Without that, and ascii text understood as case insensitive, I would assume that a correct implementation carefully does not upper, lower, or casefold arbitrary unicode text.
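As a rough sketch of the "careful" reading, an implementation could fold only the ASCII letters A-Z and leave every other character untouched. The names `ascii_lower` and `arbitrary_equal_ascii` are hypothetical. On pure ASCII input this gives the same answer as a str.lower()-based check; on non-ASCII input the two can differ, which is the part the proposed wording would leave open.

```python
import string

# Translation table that maps only ASCII A-Z to a-z; all other
# characters, including non-ASCII letters, pass through unchanged.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lower(text: str) -> str:
    """Lowercase ASCII letters only (hypothetical helper)."""
    return text.translate(_ASCII_LOWER)


def arbitrary_equal_ascii(a: str, b: str) -> bool:
    """Case-insensitive for ASCII letters, exact match for everything else."""
    return ascii_lower(a) == ascii_lower(b)


print(arbitrary_equal_ascii("1.0A1", "1.0a1"))
# U+0130 (capital I with dot above) is left alone by ascii_lower,
# whereas str.lower() rewrites it.
print(ascii_lower("\u0130") == "\u0130", "\u0130".lower() == "\u0130")
```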
I prefer “unspecified” over “undefined”, as I could see people reading “undefined” as meaning “you need to raise an exception”. That’s nitpicking, but isn’t that what standards are about?
I like your wording otherwise - although to be clear, it is technically still a spec change, not a clarification.
Right, when I started implementing arbitrary equality on arbitrary strings it came as a surprise to me that the existing check was case insensitive, as I could not read the spec[1] in an explicit way that allows for case insensitivity. This left me stuck with whether to break backwards compatibility or not.
What changed my mind is not just that checks have been case insensitive for 20 years, but that the spec calls out its own intention “to still install a legacy version”, and to match the behavior of legacy systems you must use case insensitivity.
I am 100% onboard with that, my intent in starting this thread was for a simple language clarification that matched existing and legacy implementations of arbitrary equality.
What I don’t want to do is create a new specification that doesn’t match legacy behavior, and my concern about going down the path of carefully defining a Unicode collation is that this is exactly what would happen.
I’ll post this in a few other Python packaging community spaces and wait a week or so for any objection, and then make sure a PR is raised for the clarifying language suggested by @sirosen and refined by @pf_moore and link it back to this discussion.
[1] Actually it does use the term “semantic information”, which isn’t defined or used anywhere else in the spec. One could make an argument that case sensitivity isn’t “semantic information”, but due to the unclearness of what that means I don’t want to start that discussion.
Thanks. I’ll note that as I said before, this is a change that does have the potential to affect interoperability, and so it needs to go through the standard process:
If a change being considered this way has the potential to affect software interoperability, then it must be escalated to the Packaging category of the Python.org Discourse for discussion, where it will be either approved as a text-only change, or else directed to the PEP process for specification updates.
I’m happy to consider this discussion the escalation referred to in that process, and I think that your proposed next steps are sufficient for due diligence on this. Specifically, I don’t think this warrants a PEP. So unless any objection comes up as feedback, I’m happy to approve this as a text-only change.
To complete the loop, when you raise the PR, please post the PR link here and ping me for formal acceptance of the change. I’ll also review the PR - I don’t know if I have authority to merge spec changes, but if I don’t, I’ll approve it and I’m sure one of the PUG editors can do so.
backward compatibility is important because the spec explicitly says that’s the intent of arbitrary equality
The spec fully achieves the kind of backwards compatibility that it is aiming for, with plain string equality. That is enough to “allow someone using a tool which implements this specification to still install a legacy version which is otherwise incompatible with this specification”.
The case for case-insensitivity is based around the behaviour of existing tools (or as it turns out, perhaps of tools that once existed). That is the relevance of discussing the behaviour of those tools.
Alas, using PyPI, and particularly limiting to top 15k, is an extremely narrow check of the Python packaging ecosystem, and only really shows you the rules that PyPI enforces.
For example, in pip we’ve had multiple issues raised of packages using legacy versions this year. These come from conda packaged versions of Python packaging metadata, Linux redistribution versions of Python packaging metadata, and private indexes and tooling.
So I don’t personally find this heuristic argument that strong.
I don’t see how you arrive at this conclusion, legacy version tooling all expected “FOO” and “foo” to be the same version.
FWIW, it’s still true in existing tools, here is SpecifierSet in packaging (same holds for Specifier and Requirement):
So saying that the spec only allows for case sensitivity and changing it in packaging would be a backwards incompatible change for packaging, with over 10 years of precedent in packaging and over 20 years of precedent in the tooling the versioning was derived from, and against the intent of the spec.
Let me flip the question: What motivation do you have to break 20 years of backwards compatibility in tooling? Especially when it seems the main thrust of your argument is that no one is using it.
I’ll just add a note here, the discussion seems to have focused on versions not defined by PEP 440, which is fair because it is the most thorny issue.
But the spec I’m asking to be clarified affects arbitrary equality as a whole: the version that you pass to arbitrary equality may be PEP 440 compliant or not, and this equally affects PEP 440 versions. And packaging has always supported arbitrary equality in some form, from its inception to today.
Data from pypi was mostly just for interest, I am not using it to make a case in either direction.
I don’t see how you arrive at this conclusion, legacy version tooling all expected “FOO” and “foo” to be the same version.
Whether or not the spec achieves its aim that “a tool which implements the specification” can “install a legacy version which is otherwise incompatible with this specification” can be answered without reference to legacy tooling.
You are saying that some legacy tooling supports a usage that the spec does not describe, and you want the spec explicitly to require this behaviour of all compliant tooling.
(Actually, writing that… I realise that arguably the case-insensitive behaviour is _already_ compliant with the spec today. It would only be contrary to the spec in case versions “foobar” and “FooBar” both existed and then it failed to distinguish them.
i.e. what you are really proposing is that legacy versions “foobar” and “FooBar” can never co-exist. In the nature of legacy versions - this probably is not something that can reasonably be required, even if it does turn out to be true).
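A small sketch of that collision, assuming a str.lower-based comparison and a made-up list of published versions. The helper name `matching_versions` is hypothetical.

```python
def matching_versions(available: list[str], wanted: str) -> list[str]:
    """Hypothetical helper: versions that a case-insensitive '===wanted' selects."""
    return [v for v in available if v.lower() == wanted.lower()]


# Made-up index contents in which both spellings were published
available = ["foobar", "FooBar", "1.0"]

# Both spellings are selected, so the specifier cannot tell them apart
print(matching_versions(available, "foobar"))
```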
My objection to the change is really only that changing specs is in general undesirable.
I don’t agree with your reframing of this, but under this reframing it is the same way that 1.0B1 and 1.0b1 can’t “co-exist” (because they are considered the same version).
The spec on arbitrary equality may actually already call this out, depending on how you interpret the phrase “semantic information”: if case sensitivity is not “semantic information” then the spec could already be read to say that it is case insensitive. But as I pointed out before (in a footnote), the unclearness of this phrase makes it difficult to draw any implications.
Unless some specific point is brought up where non-PEP 440 versions vs. PEP 440 versions are distinguishable in the proposed clarification, I’m going to stop engaging on the discussion of non-PEP 440 version support, because whether non-PEP 440 versions are supported or not is not actually related to the language clarification I’m asking for.
The prior discussion on Unicode was relevant because technically non-PEP 440 versions could be non-ASCII, and case insensitivity can be ill defined for non-ASCII encodings, whereas PEP 440 versions can only be ASCII.
The proposed language clarification is just as much about whether ===1.0a1 and ===1.0A1 are considered the same specifier or not.
One of the points of having a living spec vs. a historical PEP document is to allow for language clarifications, especially when the spec is unclear or contradicts the tooling it was based on, for example here is a prior clarification discussion: Are Developmental releases a type of pre-release?