How to address formerly unknown incompatibilities

I’m trying to figure out how to update dependencies for already released packages.

I have a very concrete example that should be a simple but potentially very common use case.

The green test runner was compatible with all recent Python versions in green 3.x, with no upper bound on the version of Python. When Python 3.12.0 was released, it introduced slight changes that broke green, so we made a new 4.0.0 release of green, dropping older versions of Python and adding compatibility with Python 3.12.0.

All is fine, but if someone tries to install green 3.x with Python 3.12 it will not work, so we would like to mark older versions of green on pypi.org as explicitly incompatible with Python 3.12.

Furthermore, Python 3.12.1 introduced a regression in the unittest package that broke green and other test frameworks. As such we are releasing green 4.0.1 with python_requires = >=3.8, !=3.12.1, which means that if you are on Python 3.12.1, pip will just install green 4.0.0, which does not help. We could yank green 4.0.0, but then pip would just install green 3.x, which is even more incompatible in practice.
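To make the failure mode concrete, here is a minimal sketch using the packaging library (not pip itself), assuming green 4.0.0 declares something like python_requires = >=3.8:

from packaging.specifiers import SpecifierSet

# Hypothetical Requires-Python metadata; the value for 4.0.0 is an assumption.
green_4_0_1 = SpecifierSet(">=3.8,!=3.12.1")
green_4_0_0 = SpecifierSet(">=3.8")

interpreter = "3.12.1"
print(interpreter in green_4_0_1)  # False -> pip skips green 4.0.1 ...
print(interpreter in green_4_0_0)  # True  -> ... and falls back to green 4.0.0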

I’m a contributor on green, which is why I use it as an example, but many other Python packages have similar issues.

In practice, what happens is that many package maintainers artificially put upper limits on dependency versions that have not even been released, even though those versions will not necessarily be incompatible, and this causes pip to become unable to resolve dependency conflicts, to the point that it blocks the application of security fixes in some cases.

I tried to find other topics and reports on this, so this might be a duplicate, but I think this is an important issue.

This is a related conversation:

6 Likes

There is currently no way to modify metadata for released packages. I’m confident this has been discussed here before, but I can’t find anything at the moment.

EDIT: The closest issue I can find is Permit Project Maintainers to Modify Project-wide Metadata · Issue #4816 · pypi/warehouse · GitHub.

A bug found after release, where it’s valuable to fix an older release version, is usually addressed with a patch or a hotfix (incrementing the patch version or adding a hotfix suffix).

I wonder if it could be a post-release rather than a bug-fix release, if it is just about fixing packaging metadata.

https://packaging.python.org/en/latest/specifications/version-specifiers/#post-releases
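For reference, a post-release sorts immediately after the release it amends and before the next patch release, which is easy to check with the packaging library:

from packaging.version import Version

# Post-releases slot in between the base version and the next patch release.
print(Version("4.0.0") < Version("4.0.0.post1") < Version("4.0.1"))  # True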

2 Likes

I would like to highlight a particularly common related scenario in which users end up with broken environments, which could be mitigated by a solution that allows some kind of metadata update post-release:

  1. It’s not considered good practice for libraries to add upper bounds.
  2. A new release of a dependency breaks the library.
  3. The library releases a new version that temporarily adds an upper bound until a proper fix can be released.
  4. The installer (Pip, Poetry, etc.) finds an incompatibility and therefore backtracks on the library version.
  5. The older versions of the library do not specify an upper bound on the dependency that broke the library, leading users to install the old version of the library along with the breaking dependency (see the sketch after this list).
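Here is a toy illustration of steps 4 and 5 with the packaging library; the package names, version numbers, and specifiers are made up, and a real resolver is of course far more involved:

from packaging.specifiers import SpecifierSet
from packaging.version import Version

# Hypothetical metadata: only the newest "library" release caps "dep".
library_requires_dep = {
    "2.1.0": SpecifierSet(">=1.0,<3.0"),  # emergency upper bound (step 3)
    "2.0.0": SpecifierSet(">=1.0"),       # older release, no upper bound
}
available_dep = "3.0.0"  # the dep release that broke the library (step 2)

# Walk library versions newest-first, the way a backtracking resolver would.
for lib_version in sorted(library_requires_dep, key=Version, reverse=True):
    if available_dep in library_requires_dep[lib_version]:
        print(f"resolver settles on library {lib_version} with dep {available_dep}")
        break
# -> resolver settles on library 2.0.0 with dep 3.0.0, a combination known to be broken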

Having some way for library authors to mark this would certainly be useful, rather than, in some cases, being overwhelmed by confused users.

(As a side note, for my example, package installers could apply some kind of heuristic when resolving dependencies to make this scenario less likely. I will work on a PR for Pip at some point this year, which should help.)

1 Like

@ketozhang in our case the ‘bug’ is that Python 3.12.0 introduced changes that we then fixed with a new release, but our older releases are still incompatible with 3.12 and there is no way for us to advertise this.

We do not and should not mark ourselves as incompatible with versions of other things that have not been released, as @notatallshaw points out in item 1 above.

Python 3.12.1 introduced a bug which was then rolled back and fixed in 3.12.2. What we want is to be able to mark all our prior releases as incompatible with Python 3.12.1. We are doing a new release that states the incompatibility explicitly, but then pip will just try to install the older releases on Python 3.12.1, which does not help anyone.

@sinoroc a post-release does not really solve the core problem since the existing releases are still considered installable by pip, and it might install the older versions even though they are unusable.

1 Like

Good catch on post-releases (TIL), @sinoroc.

@sodul Practically, the only users affected by this issue are those creating a new environment tomorrow (i.e., after the post-release). Once they encounter this issue and try to search for it, the hope is that they also come across the version list (on PyPI or some release page) and see that there is a newer release of the same patch version (assuming a semver-like scheme).

There is no way to inform your users of the incompatibility without editing an existing release’s metadata, which I discourage. The next best thing is to alert users that there’s a post-release fix available, like npm does.

It’s not discouraged… it’s simply impossible on PyPI.

1 Like

This is a tough problem to solve comprehensively right now, at least without tradeoffs one way or another, or the keys to Guido’s time machine. However, post-releases are probably the best approach here overall, at least within the current constraints of the packaging system.

Fixing packaging metadata is a canonical use case for a post-release, AFAIK.

AFAIK, the only case where earlier post releases of the same patch release will be installed once further post releases are published is when users have the dependency pinned by strict equality (i.e. == 4.0.0, without .* or any other modifiers). However, if users have a genuine need to pin dependencies all the way down by strict equality, typically for security or testing-consistency purposes, suddenly installing a package with different metadata (which could, e.g., install a malicious dependency, or cause differences when testing) likely wouldn’t be desired anyway.

1 Like

I tried the OP’s case, using pip install --find-links and a dummy package built locally. Pip happily used release 1.1, which had no special Python version requirement, while 1.1.post1 required Python < 3.12 and 1.0 had no requirement.

To be fair, PEP 592 for yanked packages effectively modifies (adds to) the metadata an installer sees in the release.

Do we want something similar for this purpose?

(Warning - I haven’t thought this idea through).

Maybe we could require that if an installer finds a .post release, it should ignore any non-post release of that version? That makes .post releases effectively a replacement, rather than an addition.

2 Likes

Indeed, it has been pointed out many times in the past that the only solution currently is to publish a post-release and yank every existing version, which really isn’t ideal at all.

Edit: Which is what @pf_moore’s suggestion would “fix”, I should add.

2 Likes

What if an end-user wants that non-post release (for reproducibility purposes, for example)? Specifying pkg == 1.0.0 selects pkg == 1.0.0.post1 AFAIR. Would it be easy for the end-users to allow excluding 1.0.0.post1 and still pinning to 1.0.0?

If I recall correctly the following discussion was somewhat related (different use case but same area), maybe there is some knowledge in that thread that can be applied here: An official "unsupported-python" package

1 Like

Ah, I incorrectly assumed from previous discussion and testing I’d seen that pip would ignore earlier post releases when backtracking, but that’s evidently not the case, which @pf_moore’s idea (assuming I’m understanding correctly what specifically he’s proposing) would solve.

Yeah, or more precisely: for any given release version that has post releases, prune all but the latest post release from the set of matching versions to check/backtrack over for a version specification.

The express purpose of the post-release segment, per the spec and in practice, is to correct errors/issues in the packaging/release artifact (i.e. metadata, release notes, missing files, etc.), and there are multiple stern warnings that post-releases should not be used for changes to the code itself.

Therefore, skipping such earlier post-releases when backtracking skips over potentially erroneous metadata (if the post release corrects/updates packaging metadata), and also speeds up the solve (by skipping versions with either erroneous or unchanged metadata). The only theoretical scenario in which backtracking to an earlier post release might actually find a valid version is the unlikely event that a dependency/Python version/etc. compatibility was erroneously dropped in the latest post release, but that’s a rather pathological case, it is a packaging error that can be corrected by simply issuing another post-release, and at most it means pip might install a slightly older patch version rather than a potentially broken one.
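A rough sketch of that pruning rule, using the packaging library (prune_superseded is a hypothetical helper, not anything pip actually implements):

from collections import defaultdict
from packaging.version import Version

def prune_superseded(version_strings):
    # Within each release group (e.g. 1.0.0, 1.0.0.post1, 1.0.0.post2), keep only
    # the latest post-release if any post-release exists, on the theory that it
    # supersedes the packaging metadata of the original upload.
    groups = defaultdict(list)
    for v in map(Version, version_strings):
        groups[(v.epoch, v.release)].append(v)
    kept = []
    for candidates in groups.values():
        posts = [v for v in candidates if v.post is not None]
        kept.extend([max(posts)] if posts else candidates)
    return sorted(kept)

print(prune_superseded(["1.0.0", "1.0.0.post1", "1.0.0.post2", "1.1.0"]))
# -> [<Version('1.0.0.post2')>, <Version('1.1.0')>]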

You could also change the meaning of strict equality matching (==) as specified in the spec to include post-releases even without .*, but that would be a much bigger change (to the spec and in practice) that has non-trivial downsides to consider, unlike just tweaking the backtracking strategy.

That’s what I previously thought too, but as mentioned in my post above, and confirmed both from the spec and from testing it with pip, pkg == 1.0.0 will always select 1.0.0 exactly, not any post release.

Per the spec:

By default, the version matching operator is based on a strict equality comparison: the specified version must be exactly the same as the requested version. The only substitution performed is the zero padding of the release segment to ensure the release segments are compared with the same length.

For example, given the version 1.1.post1, the following clauses would match or not as shown:

== 1.1        # Not equal, so 1.1.post1 does not match clause
== 1.1.post1  # Equal, so 1.1.post1 matches clause
== 1.1.*      # Same prefix, so 1.1.post1 matches clause

And indeed, for a random example package I found, edlib (which has no deps and a release with not one but two post releases), tested on the latest pip 24.0:

$ pip install --dry-run edlib==1.3.8
Collecting edlib==1.3.8
  Downloading edlib-1.3.8.tar.gz (93 kB)
     ---------------------------------------- 93.5/93.5 kB 895.8 kB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
Would install edlib-1.3.8
$ pip install --dry-run edlib==1.3.8.*
Collecting edlib==1.3.8.*
  Downloading edlib-1.3.8.post2.tar.gz (93 kB)
     ---------------------------------------- 93.1/93.1 kB 663.1 kB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
Would install edlib-1.3.8.post2
$ pip --version
pip 24.0 from C:\Miniconda3\envs\py311-env\Lib\site-packages\pip (python 3.11)

1 Like

What if an end-user wants that non-post release (for
reproducibility purposes, for example)? Specifying pkg == 1.0.0
selects pkg == 1.0.0.post1 AFAIR. Would it be easy for the
end-users to allow excluding 1.0.0.post1 and still pinning to
1.0.0?

“Easy” is relative, but if they’re pinning the exact version of that
package then they’re presumably also pinning the versions of its
dependencies, or can at least choose to start doing so.

In projects I work on, with a transitive dependency set numbering
around a thousand packages, we deal with this all the time. Sure you
can limit upper bounds of some of your dependencies, but if you do
then you need to be prepared to cap their (direct or indirect)
dependencies as well.

What if an end-user wants that non-post release (for
reproducibility purposes, for example)? Specifying pkg == 1.0.0
selects pkg == 1.0.0.post1 AFAIR. Would it be easy for the
end-users to allow excluding 1.0.0.post1 and still pinning to
1.0.0?

Oh, and I just realized I probably misread your question, but I’m
fairly sure pkg===1.0.0 (triple-equals) gets you that behavior
already, doesn’t it?

As my previous replies have mentioned, so does normal double-equals—triple equals just triggers literal string comparison, which as the spec mentions is mostly only needed for legacy non-standard versions.
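For completeness, the difference between the two operators is easy to check with the packaging library (this just restates the spec behavior quoted above):

from packaging.specifiers import Specifier

print("1.0.0.post1" in Specifier("==1.0.0"))    # False: not exactly equal to 1.0.0
print("1.0.0.post1" in Specifier("==1.0.0.*"))  # True: prefix match pulls in post-releases
print("1.0.0" in Specifier("===1.0.0"))         # True: literal string comparison
print("1.0.0.post1" in Specifier("===1.0.0"))   # False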

1 Like

I meant ==. But I’m either misremembering, or the spec (or the resolvelib implementation?) changed at some point in the past and I was remembering some older version of it, like @CAM-Gerlach did.

My understanding is that === is a discouraged hack that exists only to support arbitrary version specifiers, which are around for historical reasons and don’t adhere to PEP 440.