OK. My instinct is that some of the options expect to be able to take an action before the full resolution is complete (“immediately error out”, for example, suggests this). But the devil is in the details, and I haven’t really thought through all the details of what any of this would mean in terms of pip’s resolution algorithm (because doing that would take quite some time…)
Editable metadata could be just fine with Option 1 too, as I pointed out. You could even require that a Python cap can only be made by editing the metadata. The issue I have with Option 2 is that it normalizes the behavior of putting version caps on Python by trying to support it. Many of the problems it causes (especially for locking package managers) are not solvable. There are half a million PyPI developers, and they are not going to read the sorts of discussions we are having here. They are just going to see that capping is now supported; they are going to think "hey, this means I don't have to support Python 3.10 right away", and they are going to cap. That's not what it means, even under Option 2 - you can't down-solve your Python version. I don't think forcing errors for upper bounds (which is what you are doing; no one is arguing for solving using the upper bounds) is important enough to add to the system, any more than erroring because Windows is detected is important enough to add. It's better to leave this up to the packages that really need it to implement themselves, via special dependencies, adding errors in setup.py, etc.
I'm trying to mostly stay neutral on the options, but I rather dislike Option 2, because it gives users a sharp knife they think they want, one that will be very dangerous for most of them. Even with perfect, back-fixed metadata, Option 2 doesn't really "add" anything at all to the solve, other than nicer error messages that can be obtained another way, and already must be obtained another way. Option 1 fixes the meaning of the field to match why it was added. Trove classifiers are there for "known to be supported", and that's fine.
This has been around for years, and for some reason just recently Numba and SciPy have suddenly decided to start using Requires-Python for upper caps, even though that's now causing worse failures and/or workarounds.
I think the key issue here is that this field is used by the solver. And the solver does not need to know about upper caps on Python, because it can't solve for Python. (And even if it could, I wouldn't want it to, because the Python version is important enough that I want exact control over it - that's why I almost always pin Python in environment.yml.) Lower bounds are useful, because the solver can back-solve with them. But you virtually never back-solve for an upper bound on Python.
None of those systems uses our metadata directly anyway - the dependency names may not match, they have their own systems for limits, etc. Conda, Homebrew, etc. all run Python migrations exactly the same way: they just start at the top of the dependency chain, try to build a Python 3.x version of each package, and keep going unless something visibly fails. Adding metadata-based failures here would slow the process down, not speed it up or help it in any way.
Also, every one of those does have some system to keep Python packages separate by version, which alone breaks the symmetry compared to other packages.
The conda-forge system is nothing like what is being proposed here. This proposal is to allow all maintainers to individually edit their own metadata for all time, whereas conda-forge packages are immutable. Conda-forge's system is a central pinning repository that collects known breakages; it's a single addition, and it's done by central maintainers, not individual package authors. There are none of the worries about security, about whether maintainers can avoid breaking historical versions of packages, etc. (I'm not completely sure this is accurate, since I haven't had to mess with modifying metadata much, but I do know the entire design of conda-forge is built around maintainers who know packaging working on it, rather than around package authors, who often hate packaging, the way PyPI is.)
I’m not saying it would not be useful, but it won’t solve the same problems in the same way.
Upper caps are bad for regular packages too, not just Python. For example, let's say IPython 7.19 is out, and then Jedi 0.20 is found to be incompatible with it. So IPython 7.20 adds a cap on Jedi <0.20. Fine - Pip will probably solve to IPython 7.20 and Jedi 0.19. Then let's say you add another package that requires jedi>=0.20. A "smart" solver will be too smart for its own good - it will back-solve to IPython 7.19 and Jedi 0.20. It thinks it avoided a dependency conflict, but there should have been one! This is what it's doing for Python, too - it's looking back and finding an uncapped / looser cap.
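To make that concrete, here is a toy, brute-force "resolver" over made-up metadata (these are illustrative version numbers and requirements, not IPython's or Jedi's real metadata); it simply tries the newest versions first and backtracks, which is enough to reproduce the behavior described above:

```python
from itertools import product

from packaging.specifiers import SpecifierSet
from packaging.version import Version

# Hypothetical index: package -> version -> {dependency: specifier}.
INDEX = {
    "ipython": {
        "7.19": {"jedi": SpecifierSet("")},       # released before the breakage was known
        "7.20": {"jedi": SpecifierSet("<0.20")},  # capped after jedi 0.20 broke it
    },
    "jedi": {"0.19": {}, "0.20": {}},
    "other": {"1.0": {"jedi": SpecifierSet(">=0.20")}},  # the newly added requirement
}

def satisfied(pick):
    """True if every dependency declared by the picked versions is met."""
    return all(
        Version(pick[dep]) in spec
        for pkg, ver in pick.items()
        for dep, spec in INDEX[pkg][ver].items()
    )

# Enumerate candidates newest-first, the way a backtracking resolver would.
candidates = (
    dict(zip(INDEX, combo))
    for combo in product(*(sorted(v, key=Version, reverse=True) for v in INDEX.values()))
)
print(next(pick for pick in candidates if satisfied(pick)))
# -> {'ipython': '7.19', 'jedi': '0.20', 'other': '1.0'}
```

The declared metadata is satisfied, but the result is exactly the old-IPython/new-Jedi pairing that is known to be broken in practice; nothing in the older release's metadata could rule it out, because that release was published before the breakage existed.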
This backtracking is technically correct, and is why PDM/Poetry do it, and conda does it, and now Pip is doing it too. But for a user, it ends up with worse practical solves. The workaround you propose for SciPy with a "breaking" uncapped SDist will end up messing with this, too.
(side note) The "solution" being pushed by some is to always cap: since this problem is caused by "tightening" a cap, monotonically increasing caps avoid it - but that's 100% dependent on SemVer being an accurate predictor of errors. If pyparsing 3.0.5 is broken, you are out of luck again, since you capped at 3.1 or 4 instead - caps must monotonically increase or you are back to square one. Plus, if you are wrong, you are now creating dependency solver errors where none existed. And you are forced to make frequent updates. And you have to maintain old major versions, because if you cap others, you should expect to be capped yourself. And… (see rest of post).
Python is special, especially for locking solvers; they are trying to conceptually solve this problem assuming perfect metadata and a target range of Pythons. Lower bounds on Python are pretty safe to consider accurate, but upper bounds are already not accurate; there are 346,823 projects with 3,108,221 releases that mostly have incorrect upper bounds (many of the true bounds are not even knowable yet - they might be 3.12 or 3.19). Option 1 just says: let's not use Requires-Python for this at all.
I believe we are up to at least three proposals now: editable metadata, a way to avoid SDists trying to install when a wheel is missing, and this one. And I might have missed one. I'd rather try to keep them separate, other than acknowledging that these might come along some day, so that one doesn't have to wait on another. I also don't plan to push the editable metadata proposal forward; someone else will have to pick it up if they want it - not that it isn't useful, but it's a huge undertaking.
The SDist one is tricky, but it looks like @rgommers might already have something for that, so that can be moved forward there.
Is there a PEP that explains what “requires-python” is supposed to mean? Because the poetry problem goes away if you can convince them to use it to mean “a restriction on the supported versions of Python” rather than “the range of versions that must be supported”.
Yes, that's mentioned at the top. PEP 345 and the PyPA core metadata specification state: "This field specifies the Python version(s) that the distribution is guaranteed to be compatible with." This would only be compatible with Option 3 - if this is a guarantee, then you can't guarantee the future, so you have to ignore it when solving.
And I don't think that affects them. There need to be two values here, or this needs to be disconnected from the solve. A library's metadata should not be dictated unconditionally by the current lock file.
I completely agree with you. I find it really confusing that poetry has the fields side-by-side:
[tool.poetry.dependencies]
python = '>=3.7, <3.11'
numpy = '>=1.20'
scipy = '>=1.5'
The first requirement is a "for all" requirement, and all of the others are "there exists".
All of the values in tool.poetry.dependencies should be "there exists", and there should be a separate field somewhere else that does the "for all", although I like your idea above of omitting it in most cases and letting Poetry figure it out as the intersection of the requirements specified by all of the dependencies.
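For what it's worth, here is a rough sketch of that inference idea, assuming the tool approximated the project's supported range as whatever Python versions every dependency's Requires-Python admits (the specifier strings are made up, and a real implementation would want true set intersection rather than sampling minor versions):

```python
from packaging.specifiers import SpecifierSet
from packaging.version import Version

# Hypothetical Requires-Python values read from each dependency's metadata.
dependency_requires_python = {
    "numpy": SpecifierSet(">=3.8"),
    "scipy": SpecifierSet(">=3.8"),
    "oldlib": SpecifierSet(">=3.6,<4"),
}

# Approximate the "for all" range by sampling candidate minor versions and
# keeping the ones every dependency claims to support.
candidates = [Version(f"3.{minor}") for minor in range(6, 13)]
supported = [
    v for v in candidates
    if all(v in spec for spec in dependency_requires_python.values())
]
print(supported)  # 3.8 through 3.12 with these made-up values
```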
I've seen a few people mentioning apt / Linux distros / etc. here. I'm nowhere near as much of a Python packaging expert as other contributors to this thread, but I do have lots of Debian packaging experience so I thought I'd offer some context on that. It's true that Requires-Python turns into "just another dependency" in apt terms (although it might have to be manually transcribed by the packager - I don't know of anything that would do it automatically), but as an isolated statement this is potentially misleading and needs some more specifics.
The consequences of upper bounds on Python with apt are indeed likely to be less bad than they are with pip fetching from PyPI: typically the result would be one of the following: (1) dpkg/apt would "deconfigure" the package with the upper bound in order to upgrade Python, and then unpack and configure a version with a weaker or absent upper bound; (2) apt would decide to automatically remove the package with the upper bound if it can't find a better solution; or (3) apt would bail out with an error and require the user to resolve things, perhaps because the consequences of removing the package with the upper bound are too bad in some way.
However, apt-based distributions (certainly Debian and Ubuntu) don't remotely try to provide an inventory of versions that you might choose to install that corresponds to the complete upstream history, for any package. The set of available versions is typically zero or one per line in /etc/apt/sources.list (there's no technical restriction on more being available, and it can be different in third-party archives, but Debian and Ubuntu's archive management tools generally arrange for there to be at most one version of a package per suite+architecture). So Depends: python3 (<< 3.10) in a package built for a Debian release that defaults to Python 3.10 doesn't in practice mean that apt will try to downgrade to Python 3.9 and sort everything out: firstly, Python 3.9 probably won't even be available to apt, and secondly, even if it were, we're operating in a system-wide flat system here, and the chances of finding a valid solution with an older version of Python than your distribution provides are negligible anyway.
Upper bounds also complicate apt's job in finding solutions, which is already extremely difficult given the large dependency graph for a complete OS, so generally speaking we only add them when we know that a given version of a dependency will definitely break the package with the dependency - for Python packages this is normally just used for binary packages with extensions built for, say, Python 3.9, which might get something like Depends: python3 (>= 3.9), python3 (<< 3.10). But this is really more like Python tags in wheels than it is like Requires-Python.
At best, tight upper bounds can serve as a release management hint (effectively “dear Debian release team, don’t release with Python 3.10 until you also have a numpy that works with it”), since we try to ensure that we have a suite of packages that remains dependency-consistent throughout. However, preemptive upper bounds are a rather big and inflexible hammer for that job. At Debian’s scale we normally prefer to reserve that hammer for cases where we know it’ll be needed, as otherwise we end up getting ourselves into giant interlocked tangles where we’re waiting for dozens of package maintainers to do things before we can make forward progress.
For the sort of case where a dependency turns out to break something that depends on it in a way that we couldn't 100% predict in advance, we'd be more likely to declare the problem the other way round: the new version of the dependency would declare Breaks: depending-package (<< first-fixed-version). That's useful for release management purposes, and it allows apt to refuse to upgrade the dependency unless it has a solution that deals with the broken package, without having to preemptively declare tight upper bounds that might be non-issues. I don't think there's any analogue for this in Python's own dependency system.
I know this isn’t all completely applicable to pip or other Python solvers, but I hope it’s somewhat useful anyway.
The next step for me is to make a PR to packaging to add set intersection - this is needed for any of the three solutions here, as well as helping nox, cibuildwheel, and probably other packages that want to query the requires-python setting with specific questions, like "is 3.9.* supported". It's not trivial to compute properly, so it will probably be a little while before I do that - afterwards, we can revisit this and see which solution is preferred.
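In the meantime, that kind of question can only be approximated with packaging's existing SpecifierSet API by testing concrete versions - there is no intersection/emptiness helper yet, which is the non-trivial part such a PR would add. A small sketch, using a made-up requires-python value:

```python
from packaging.specifiers import SpecifierSet
from packaging.version import Version

requires_python = SpecifierSet(">=3.7,<3.11")  # hypothetical value

# "Is 3.9.* supported?" -- approximated by testing a concrete 3.9 release,
# since SpecifierSet answers membership for specific versions, not ranges.
print(Version("3.9.0") in requires_python)   # True
print(Version("3.11.0") in requires_python)  # False
```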
There is something I hadn't thought about, but it was highlighted by numba/llvmlite: they have an RC release for Python 3.10. So the "correct" solution (before approximately Monday) for them might be for Pip on Python 3.10 to automatically get 0.55rc1 (and the matching rc for llvmlite), not to scroll back to some old version or even fail.
I'm really wondering if solvers should be directional - that is, never backtrack to older versions of packages just because they have "higher" (looser) upper bounds. For most cases that is probably much better, especially when metadata is immutable, so users cannot "fix" old releases. This is clearly the case for upper bounds on Python, but it's usually true for upper bounds on anything, unless there's an LTS release, which is rather uncommon.
I'm only just catching up with this thread - how much of the primary problem would be solved by just releasing a wheel tagged for the unsupported version that only contains a specific error message? e.g.:
Lib/site-packages/numba.py:
raise ImportError(
    "numba is currently unsupported on Python 3.11. "
    "Please use Python 3.10 or earlier, or specify --pre "
    "to install our current prerelease build. "
    "Visit <our URL here> for updates on new releases."
)
You'd have this in a totally separate repo and carefully release wheels only for the versions that are not supported, so that users can explicitly pass --no-binary to build from source (which I find myself doing from time to time, so I would like to keep it this simple rather than having to patch the sdist first), but most people are going to quickly get the wheel with the explanation.
It’s obviously not as ideal as failing at the resolution stage (and I think my selector packages idea would be a better approach for solving this kind of edge case there, as well as the others), but what I propose above would work today with no changes to any tooling.
(Oh, and I voted for "don't support upper caps in resolvers, and warn about it in build tools" earlier, but you probably could have guessed that from my proposal.)
I believe that’s (at least almost) in my original post:
This is based on a similar idea proposed for removing manylinux1 support. The only difference is that this one, by depending on a package with an error-raising SDist, gives you the error during install rather than later, when you are trying to use the package. To the best of my knowledge, this does a better job of solving most of the issues authors face with unsupported Python versions, and it's reactive instead of proactive - you can't "not support" Python 3.11 until 3.11 is far enough along for PyPI to support wheel uploads, so you can actually test and see whether you really don't support it before breaking it. It's also not overly easy; a simple limit would tempt users who don't write complicated packages to limit Python support.
I guess I'd forgotten the earliest suggestions, 60-odd posts later.
The problem is that an error-raising sdist actively prevents people from testing your package - for example, if you believe it's incompatible because of a core CPython bug (likely). Core devs are not going to jump through that many hoops to test your package, especially if we ever get around to doing the automated testing we keep thinking about. Making it so the sdist is unusable is unnecessarily restrictive.
That should be a fairly easy change to make to PyPI. I’m pretty sure they only block version tags to prevent abuse and/or user error. It’s certainly easier than changing the current definition (de facto or otherwise) of the Requires-Python field.
It’s not entirely clear, but I think you’re suggesting this is a good thing? I certainly think it’s a good thing. We don’t want to make it too easy for every random package to fail to install, but it should be possible for aware/active/thoughtful developers to help their users fail quickly with helpful guidance for known (and monitored) scenarios.
No, no, that’s suggestion 2. What I was suggesting is, to give a concrete example:
NumPy 1.23 releases before Python 3.11 tags are allowed on PyPI / before Python 3.11 is ABI stable. They release the normal set of wheels, and a normal SDist. They don't limit Requires-Python to <3.11, since we discourage that (Option 1).
Then Python 3.11 becomes ABI stable - the 3.11 tag is allowed on PyPI. NumPy tests the most recent release to see if 3.11 is supported. Sadly, they haven't been testing alphas and betas, and so there are several problems with NumPy 1.23 on Python 3.11rc1. So they upload a new set of wheels for 3.11, such as numpy-1.23.0-cp311-cp311-manylinux2014_x86_64.whl, which are empty and just contain a dependency on break-me-if-python-is-too-new or something like that - that package is where the "broken SDist" is. That way, "normal" users get a nicer error, which is what they want. Dependencies are specified per file, not per package.
If you build a non-released version, or just from source, you don’t get this error.
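For illustration, a minimal sketch of what the hypothetical break-me-if-python-is-too-new package's setup.py could look like (the package name comes from the example above; the version threshold and wording are assumptions):

```python
import sys

from setuptools import setup

# Refuse to build on the Pythons this marker package exists to block, so any
# resolve that pulls it in fails at install time with a readable message.
if sys.version_info >= (3, 11):
    raise SystemExit(
        "The package that depends on this marker is known to be broken on "
        f"Python {sys.version_info.major}.{sys.version_info.minor}; "
        "use an older Python or wait for a fixed release."
    )

setup(name="break-me-if-python-is-too-new", version="1.0")
```

Since only the deliberately empty wheels depend on it, installs on supported Pythons never see this package at all.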
> That should be a fairly easy change to make to PyPI.
Nothing needs to change for PyPI. Not allowing 3.11 wheels until 3.11 is ABI stable is perfect.
> I think you're suggesting this is a good thing?
Yes. That’s the problem with option 2 - it makes it really easy to just cap Python without knowing if it’s broken.
Looks like the consensus here is that we should change the recommendation and reword things to make it clearer what is expected here (i.e. suggestion 1).
Is there any interest in adding warnings to pip too, like Henry suggested?
Packages like scipy may need some kind of nudge for them to change (if they even do): https://github.com/scipy/scipy/issues/14738
Would it be something that shows when the user of a library/application installs it? Because that seems rather hostile toward the developers of such libraries/applications, since it is quite likely that those users will flood the issue trackers with requests to fix it when it is really just a recommendation.
Just to note, there is at least one other outstanding use case that Option 1 not only doesn't solve, but actively makes worse: backports of stdlib packages, which are not intended to be used on Python versions that have the stdlib package (as the latter will usually shadow the former… except when it doesn't, in pathological cases) and which just cause extra bandwidth, install time, and user confusion.
In this case, upper-capping as it is currently implemented works if every (non-yanked, non-pre-release, and otherwise compatible) release of the backport package has the upper cap. However, if implemented ex post facto, like for typing, all it does is make pip backtrack to the previous release that doesn't have it and install that instead - which is exactly what is happening in that case, as clients on later Python versions are just installing an earlier version of typing (witness the distribution of package versions vs. Python versions).
Right now, this means that for existing packages, to solve this we need to go with the less elegant approach of having a lower-capped, sdist-only release that errors on install, warns, or warns and installs an empty package - unless the package has been upper-capped its whole life (which still triggers an inefficient backsearch of every release version). If I'm following right, the former is pretty close to the recommended workaround here. However, that workaround will stop working if pip moves to --only-binary by default, which seems to be under serious consideration. Furthermore, with the changes in Option 1 here, even upper-capping all versions will not be officially supported and will produce a warning (and Option 3 would be similar, except it won't work at all). Under that scenario, I'm not sure how we're supposed to handle this situation.
By contrast, Option 2 would fix all of this for this use case and be a strict improvement here; typing would be fixed as-is, and the others could be fixed similarly, with just a post-release adding the upper cap to the metadata of the latest release. I'm not sure if that's strong enough grounds to implement it, but I'm curious what else we're expected to do in this use case, as with --only-binary plus a warning, or removal of the upper-capping behavior, we'd be basically out of options short of yanking every version of the package (which in pip < 22 would just produce a warning, but in pip >= 22 would cause huge unnecessary breakage).
It would be really good if the recommendations could be clarified here (as in, clearly documented in a brief form, without any confusing details that just make things fuzzy). Pointing others at this or similar threads is a giant headache for all involved. And frankly, it isn't even useful, because even if you spend hours reading this thread, you will still most likely be frustrated and confused, and come away with only a faint feeling of "fine, apparently even a correct pin isn't worth the trouble".
Poking this thread again, before I merge the PR filed based on this thread…
If anyone has concerns with the proposed language, please flag them here!
I had a broken pipeline caused by this exact practice of pinning the upper range of Python (in my case, to <3.13) in requires-python - the dependency was NumPy, and I was using PDM to install the project dependencies in a container.
What caused confusion was that this pinning appeared to be done on a release branch, while the main project TOML had no such pinning. I had to apply the same pin to my project's requires-python in order for the PDM install to work in the pipelines.
I can remove this pin, but I've left it in place out of safety; still, I'm surprised by how easily such failures can be triggered.