Requires-Python upper limits

I would implement it this way: The solve proceeds as today, except the upper cap is ignored (option 2 and option 3). Then (option 2 only), once the final package is selected, you’d check to see if the current Python version is valid. If not, throw an error. The upper cap does not affect the solve at all; it just causes an error at the end if the selected release’s cap excludes the running Python. You are just removing the possibility to back-solve to get an older package with a looser Python upper bound, because older packages are very, very unlikely to have better support for newer Python versions.
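Roughly, the post-solve check could look something like this (a minimal sketch using the packaging library; the function and messages are made up for illustration, not any installer’s actual code):

```python
# Option 2, sketched: resolve as usual while ignoring Requires-Python upper
# caps, then verify the selected release against the running interpreter and
# error out if the cap excludes it.
from packaging.specifiers import SpecifierSet
from packaging.version import Version

def check_requires_python(requires_python: str, running: str) -> None:
    """Raise if the running interpreter falls outside the selected release's
    Requires-Python range (checked once, after the solve has finished)."""
    if requires_python and Version(running) not in SpecifierSet(requires_python):
        raise RuntimeError(
            f"The selected release requires Python {requires_python}, "
            f"but the running interpreter is {running}"
        )

# A release capped at <3.10, installed on Python 3.10.1:
check_requires_python(">=3.6,<3.10", "3.10.1")  # raises RuntimeError
```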

Honestly, the same system could be applied to requirements. You could also specify that you should never back-solve for looser requirements. Just because the current IPython caps Jedi to 0.17 doesn’t mean you should look for an older IPython that doesn’t cap Jedi to 0.17; it’s not more likely to be compatible. But implementing it is a bit harder, since package versions are not fixed and limits can be valid (just back-solving for looser limits is problematic), while the Python version is fixed. And there may be more valid reasons for caps there.

There could be hundreds of old releases, requirements change over time, and every file is allowed to have different requirements - I’m not sure this would be “simple”. And you’d have to add it for all time or it’s useless; the solver would just go back to wherever you gave up editing requirements.

Someone other than me can pursue this:

Files are hashed; you can never upload two different wheels with the same name but different hashes. Lock files / requirements.txt list hashes for each file. You can’t change the files; that’s a core security requirement. So the only way to modify the metadata would be to add a new file that sits “alongside” the wheel or SDist file - probably alongside each one, since each file technically could have different metadata - and then modify setuptools, flit, and every other build system to produce them; pip and every other installer to include them; PyPI and every other wheelhouse to include them; pip-tools, Poetry, Pipenv, PDM, and every other locking system to handle them somehow; etc. And the incoming PEP 665. And any older version of anything (like pip) would not respect the new metadata.

And, you’d have to be very careful to only allow a very limited subset of changes - you wouldn’t want an update of a package to add a new malicious dependency via metadata!

Would you be able to get old versions of dependency override files too? Plus other questions would need to be answered. It could be done, but it would be a major undertaking, for a rather small benefit - ideally, everyone should try to avoid capping things, play nicely with deprecation periods, test alphas/betas/rcs, and just understand that in the real world, a library can’t perfectly specify its dependencies so that there will never be a breakage. That’s only possible when locking, for applications.

To be clear, I am not saying editable metadata would not be useful or nice to have. It would hopefully remove the desire to add preemptive caps! I’m just saying it’s a big job, it’s hard, and solving it doesn’t conflict that much with solving the issue here. Back-solving to an older release for a looser Python cap is very, very rarely useful, and 99% of the time harmful.

Correct. There’s a bigger chance that Python 4 means no GIL or some C-level change of that sort than another whole str/unicode-style transition.

BTW, no one has come to python-dev to ask for a change to this. But I will also say that 3/5 of the 2022 SC ran on a “backwards-compatibility FTW!” platform, so there may be a shift in how much breakage there is between versions.

Sure, it doesn’t hurt anything.

I have a plan …

Nor was it a request to change the cadence. I think that 1 year is better in many areas; this is one where it’s not the best. One of the problems I see is that the Python 3.N+2 master branch starts too early IMHO with respect to the 3.N+1 alphas/betas, but I digress.

Yes! And I hope for more tooling in the core to make the transition easier :slight_smile:

I don’t know if I should be scared or excited…


More frequent updates means smaller change sets. Packages should be more likely to work on the next version of Python. That’s also partially why pybind11 tests against the latest Python 3.11 alphas; it’s much easier to deal with 1-2 breaking changes at a time than to get a bunch of them all at once.

FYI, funcparserlib 0.3.6 was released in 2011. It worked on every version of Python up to 3.10; it finally broke when setuptools removed 2to3 support. The authors, ten years later, made an (alpha) release that doesn’t use 2to3. But that’s an example of a package that needed no changes for 10 years. There are others like that - well-written Python code is likely to keep working in the future.

I do rather think back-solving to look for higher upper limits is likely a problem for both Python and dependencies. What would the implications be of simply removing back-solving for higher upper limits in the solver for regular dependencies too? (I have not thought that one through.)

Here’s a poll for the above choices:

  • 0: Leave it as is.
  • 1: Disallow upper bounds by producing warnings in development backends when Requires-Python is upper bounded.
  • 1b: Disallow upper bounds, and also ignore them in solvers.
  • 2: Implement upper bounds as an instant fail if they end up in the final solve and are below the current Python.
  • 3: Ignore upper bounds in solvers. Leave the current wording of “guaranteed”, encouraging upper bounds.


What if your package is pure Python and it wants to provide a wheel? Based on what’s said in the PEP about compatibility tags, both py31-none-any and py3-none-any can be installed on Python 3.3 so it doesn’t seem like you can really upper-bound the wheel without Requires-Python.
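A quick way to see this is to ask the packaging library’s tag machinery which tags a 3.3 interpreter accepts (just an illustrative check):

```python
# Both the versioned py31-none-any tag and the generic py3-none-any tag are
# compatible with a Python 3.3 interpreter, so compatibility tags alone can't
# express "no newer than X" for a pure-Python wheel.
from packaging import tags

compatible = {str(t) for t in tags.compatible_tags(python_version=(3, 3))}
print("py31-none-any" in compatible)  # True
print("py3-none-any" in compatible)   # True
```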

Why would you want to upper-bound a pure Python wheel?

And the expectation is you would make a new release with a higher lower bound, which would cause a resolver to select the older version because newer versions of the package don’t support the older Python version.

I’m (well, it’s more of “we” because there are more volunteers and I’m not the only maintainer; I don’t want to speak for anyone but myself though) using PyPI to distribute an application, not a library, which is why I want to control what Python version the user is installing the package on. If they try installing it on Python 3.10 (which I don’t provide support for yet), I want the installation to fail so that they either realise their mistake or ask about it in the support channel. I provide full documentation on how to install the application, but since not all users read the instructions carefully (or more likely, just don’t read some parts at all), the upper bound guarantees that they get stopped if they accidentally use the wrong (as in, unsupported) Python version.

Of course you can upper bound a wheel without Requires-Python, the same way you should upper bound the wheel with Requires-Python today, because Requires-Python is broken. You add a dependency on a package like "unsupported_python; python_version >= '3.10'". Then unsupported_python would be an SDist package that breaks with an error message. This is also how you would force Windows or some OS you don’t support to error out - why isn’t anyone asking for that? I would hope not supporting an OS is much more common than limiting Python! Also special architectures, etc. But Requires-Python is enticing people because it looks like you should be able to set upper limits, but doing so causes problems.
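For illustration, the tombstone package’s setup.py could be as simple as the following (the package name and message are hypothetical, and the sdist itself would have to be crafted separately, since this setup.py never succeeds):

```python
# Hypothetical setup.py for an "unsupported-python" tombstone sdist: building
# it always fails with a clear message, so the environment-marker dependency
#   unsupported_python; python_version >= '3.10'
# turns an install on Python 3.10 into an early, readable error.
import sys

raise SystemExit(
    "mypackage does not support Python "
    f"{sys.version_info.major}.{sys.version_info.minor} yet; "
    "please use a supported Python version (see the project documentation)."
)
```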

For the two people who voted for 0 (no change): remember we currently state “Requires-Python specifies the Python version(s) that the distribution is guaranteed to be compatible with”, but if you add an upper limit, then Pip will solve for older versions if you are on a “blocked” version of Python, and Poetry/PDM will never get the latest version of your package if a user sets too high a Requires-Python themselves! That’s the current situation! Do you perhaps have a different solution?

As an idealist, I love Matthias’s suggestion of mutable metadata. Someone could then write a simple script that a project owner could run that would update PyPI’s metadata so that all old versions set an upper bound equal to the upper bound that’s set on the next version. This would essentially implement your option 2 as an opt-in, since the project owner has to run the script. It also allows future versions of the project to do whatever they want.

Another advantage of mutable metadata is that it would allow projects to deal with breaks caused by dependencies. For example, tensorflow-probability was broken by Jax 0.2.21. They didn’t release a fix for two months. In that time, a project of mine that depended on tensorflow-probability had to set “jax<0.2.21” until finally tensorflow-probability had a release, at which point I removed my upper bound. It would have been much nicer for the tensorflow-probability team to edit the metadata on their latest release to say “jax<0.2.21”.

Isn’t that a problem with Poetry? It seems like the “poetry update” command should just build a lock file corresponding to the virtual environment’s Python version (the one set with “poetry env use”). After all, just because you’re leaving the Python version (or any other requirement) unbounded, someone else bounding that requirement shouldn’t cause an update failure, should it? And if you’re locking all the solved versions of dependencies, why shouldn’t you also lock the version of Python that you selected?

Because, if you locked for the Python version, platform, and architecture without actually needing to, portability would suffer, as would the usefulness of the lock file outside of a corporate team setting. The problem with Poetry as I see it is that it requires that your “abstract” Python version is contained within the Python version range it’s locking for. If you depend on packages foo>=1 and bar, and bar depends on foo<2, Poetry doesn’t interrupt the solve; it just solves for foo>=1,<2. Maybe there’s something I’m missing, but why can’t it do the same for the Python version? Short of locking it to a specific version, of course.

Is this documented as best-practice somewhere?

Clicked too fast. I meant to add:
I think first there should be an agreed-upon best-practices solution for projects that, for whatever reason they decide, want to reject newer versions of Python. The desire is to eventually provide support, but on their timeline, and not on the timeline of CPython.

No, the whole point of a lock file is to allow an environment to be recreated exactly somewhere else. You don’t have control over your Python version. Maybe you made your lock file on Python 3.6.0; your CI system has 3.6.1. Requires-Python allows patch versions - >=3.6.1 is very common, in fact! You must be able to restore your environment on different versions. Poetry (and PDM, etc.) take a range (unfortunately, they force this range to be the one you put in Requires-Python if you distribute on PyPI!), and they solve for that range. So if you write >=3.6, every package it finds must also support 3.6 through infinity. If a package does not monotonically increase its Python upper bound across versions, this procedure will pick old versions. If it can’t find an unbounded result, it will force you to add a bound in order to solve. It’s not “wrong”, but it’s very unhelpful: it forces you to set your metadata based on a lock file (the lock file is unimportant for a library, and also for a PyPI-distributed application; it only matters for applications that are not distributed on PyPI, at least unless PEP 665 support were added to PyPI packaging), and it produces solves that users don’t expect if a bound is ever lowered.
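A toy model of why this back-solves (not Poetry’s actual algorithm - real solvers reason over ranges rather than sampled versions, and the releases below are invented):

```python
# The project declares python = ">=3.6", so a candidate release is acceptable
# only if its Requires-Python admits every interpreter in that range (sampled
# here by a few representative versions, including a future one). The newer
# release capped at <3.10 is rejected even on Python 3.9, so the solver falls
# back to the older, uncapped release.
from packaging.specifiers import SpecifierSet
from packaging.version import Version

candidates = {             # hypothetical releases of some dependency
    "2.0": ">=3.6,<3.10",  # newest, but capped
    "1.9": ">=3.5",        # older, uncapped
}
must_support = [Version(v) for v in ("3.6", "3.9", "3.11")]

acceptable = [
    release for release, spec in candidates.items()
    if all(py in SpecifierSet(spec) for py in must_support)
]
print(acceptable)  # ['1.9'] - only the old release covers the whole range
```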

Mutable metadata, besides being a major undertaking, requires every package author to maintain every version they’ve ever released (at least the metadata for it). It also will have issues with automated scripts - slapping an upper cap on all older copies of a package can have problems (what if that dependency was dropped or changed? What if it really was supported? What if the problem is Python-dependent? What about environment markers? Etc.) If a package doesn’t change the metadata, we are still in the same situation as today - so the options given would still be useful, IMO. In fact, for option 1, I’d add: “You are not allowed to cap the Python version in a wheel/SDist. However, it is allowed as a .patch file.” That would avoid normalizing version capping, and would clearly indicate that pre-guessing is not recommended practice.

Python never gets locked, so “1.2” is not a valid choice - it must work in the entire range. If you put python = ">=3.6,<3.9", then every dependency it discovers must be valid in that range. Or if you put "^3.8" or ">=3.6.1", it must find a solve where every dependency satisfies that. And it’s perfectly happy to go back as far as it needs to to satisfy that. If you want to “lock” the Python version, you have to specify python = "==3.9.9", and that’s what goes into your Requires-Python slot. It really should have two settings, one for Requires-Python and one for solving. In fact, I’d personally like to leave the solving one blank, then have it tell me what the final range is. This would also have the nice effect that it would never back-solve to get a “better” Python upper bound.

There is no agreed-upon best-practices solution for rejecting Windows. Why is rejecting Python versions special? Because the name of Requires-Python, the fact that it’s a free-form slot, and the fact that the wording in the PEP/standard is terrible make it seem like it’s already providing this (it was not supposed to - it was added to fix solves, not break them). People don’t control their Python version, and Python is supposed to be forward compatible, so providing a really easy way to force forward incompatibility seems destructive for 90% of packages.

And it’s sometimes OS dependent! NumPy released Linux 3.10 wheels well before anything else. The reason given for why SciPy didn’t support 3.8 right away was the Windows change. Etc.

In fact, what the large data science packages actually want is the ability to force wheels without user intervention. If a package could tell Pip not to fall back on SDists unless the user asked for an SDist, that would solve the problem for most of the packages wanting this feature. In fact, PyTorch deleted all their old SDists (yanking was not enough) just to ensure pip never tried to fall back to an SDist if it couldn’t install a binary.

Large packages have hundreds of thousands of users, many of whom will open issues in droves if anything fails mysteriously, which is why they want these sorts of fails.

If that’s true, then why not propose an addition to requirements metadata that says only binary distributions are valid? This is possible at the user level with pip (--only-binary numpy), so there’s some level of prior art here.

Of course, for such a proposal to be viable, you’d have to persuade the projects who currently feel that capping the python version solves their problem, that this would be a better solution for them. Otherwise we’ll just have two proposals, and still be arguing over whether capping is a good idea :slightly_frowning_face:


Okay, thanks for explaining this! It seems that the Python version restriction isn’t just saying “I support anything in this range” - it’s saying “I must support everything in this range”. This is fundamentally different from any of the other package requirements. This explains why Poetry cannot accept a dependency whose Python version restriction is tighter than your project’s version restriction.

Another option would be this:

Add another setting alongside the existing one, so that there are both a Python version constraint and a Python version support requirement:

The version constraint X says: This project cannot install on any version that doesn’t match the constraint.
The version support requirement Y ⊆ X says: The intersection of all of this project’s dependencies’ support requirements must be a superset of Y. In other words, it’s a promise that this project will work on all of Y.

We currently only have a setting for Y. Projects like scipy would like to set X, but can’t, so they set Y as a bad proxy.
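To make the X/Y distinction concrete, here is a small illustrative check using the packaging library (the dependency names and ranges are invented, and neither X nor Y is an existing metadata field):

```python
# X = hard install constraint ("refuse to install outside this range"),
# Y = promised support range. Y is valid only if every version in Y is
# accepted by every dependency's X, i.e. Y lies inside their intersection.
from packaging.specifiers import SpecifierSet
from packaging.version import Version

dependency_X = {
    "numpy": SpecifierSet(">=3.8"),        # hypothetical Python constraints
    "scipy": SpecifierSet(">=3.8,<3.12"),
}
project_Y = [Version(v) for v in ("3.8", "3.9", "3.10", "3.11")]

y_is_valid = all(v in x for x in dependency_X.values() for v in project_Y)
print(y_is_valid)  # True: 3.8-3.11 sits inside the intersection of both X's
```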

I’m not sure this would solve everything though since you still want mutable metadata to update Y in case something breaks in a new version or some dependency adds a restriction to their X.

Oh! Looks like you have the exact same idea, but your idea of leaving it blank is even better, because then the need for mutating metadata I mentioned above is greatly reduced. If some dependency adds a restriction on X, your Y would be automatically updated.

So, in short, if no one is specifying Y, then why can’t we do this:

  • Change the meaning of the Python version requirement to X,
  • Change package tools to treat it as such, which means not ensuring that all projects support everything in the Python version requirement, but rather calculating Y as the intersection of every dependency’s X, and succeeding if your virtual environment is in that range.
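A sketch of the installer-side half of that second step (again with invented names and ranges):

```python
# Combine every dependency's X (SpecifierSet's "&" simply concatenates the
# clauses) and succeed if the running interpreter satisfies the result.
import sys
from functools import reduce
from operator import and_
from packaging.specifiers import SpecifierSet
from packaging.version import Version

dependency_X = [SpecifierSet(">=3.8"), SpecifierSet(">=3.8,<3.12")]
combined = reduce(and_, dependency_X)
running = Version("{}.{}".format(*sys.version_info[:2]))

print(running in combined)  # True on Python 3.8-3.11, False otherwise
```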

I can imagine how that’s a major undertaking on the PyPI side, but with a few scripts to cover the most common cases, it shouldn’t be such a chore for package authors. The usual case is that a release of Python (say 3.10) breaks you, so you run some script that ensures all releases are capped at <3.10. Then your next release is uncapped.

Mutable metadata really is the ideal solution, notwithstanding the work it might take to do it.

Yes, I submitted an issue about this: Poetry refuses to upgrade projects that bound the Python version · Issue #4292 · python-poetry/poetry · GitHub

It looks to me like there’s both a pragmatic “what do we do given the current state of tools” argument (which is the one @henryiii is making), and a more principled “what’s the design we’d ideally like to have” argument. A few thoughts:

  • The current state of things clearly doesn’t work well. Editing the requires-python definition to forbid upper bounds is one way out. And that way is the least amount of work (this is important).
  • On the other hand, from a design perspective, removing accurate metadata in order to let an installer tool download an sdist which that metadata indicated was not going to work, and then have the build system error out (potentially after setting up an isolated build env which may download a lot more data first - for scientific projects this could be hundreds of MBs, and if you rely on the likes of TensorFlow or PyTorch potentially >1 GB), is a poor design choice long-term.
  • Python is just one of many dependencies. It’s treated differently on PyPI than other build-time or runtime requirements, however (a) there are tools that can install/control Python versions, and (b) sdists are not only for PyPI - they are also for conda-forge, Homebrew, Linux distros, etc. all of which treat Python the same as other dependencies.
  • Matthias’ proposal for editable metadata is probably the best long-term design (and goes together with @henryiii’s idea 2, “implement upper capping properly”). It’s way too much work to consider only for this python_requires issue, however it’s very valuable for adding caps to other dependencies long-term. For conda-forge this is possible, and many maintainers describe that capability as a life-saver.
  • The current description of requires-python is not “terribly worded”, it’s just the intuitive way of describing a dependency requirement, and it matches what I (and I assume many other maintainers) would assume if I wasn’t familiar with this discussion - supporting the PEP 508 specification language. We recently had a discussion on the Pip issue tracker about why build and runtime dependencies are treated so differently (the former cannot be overridden), and the conclusion there was also that there’s no good reason for that. The design reasoning is similar here; Python is not special enough that upper caps must be forbidden.

On the need for this:

  • On the list of real-world issues that we (scientific package maintainers) have with packaging, this doesn’t rank very high. If the outcome is that we go with erroring out in the build system, we can live with this for some years to come.
  • Our concerns are real though. We’ve always had this metadata info in all release notes (“this release supports Python 3.8-3.10”), and users(/packagers) do not read release notes. Improving metadata quality and not downloading a lot of data before erroring out does matter.
  • As already pointed out by a few others, the “you don’t know if it will or won’t work, hence you should not cap” argument is extremely misguided as the blanket response to caps. Package authors should default to no caps in the vast majority of cases, but there are valid reasons to add caps on any dependency (as also laid out in @henryiii’s excellent recent blog post). For packages like NumPy and SciPy we are sure things will break with future Python versions, so a cap is valid. Note that we do think about this carefully - for example for NumPy 1.21.2, released before Python 3.10rc1, we already set the cap to <3.11 because we planned to upload wheels later on, after Python became ABI-stable and we had our wheel build infra updated.
  • It’s also worth pointing out that this is not just about NumPy and SciPy. The way sdists are treated in general by install tools isn’t great, which is causing other projects to not upload sdists at all. For example, take what are probably the three largest and most actively developed Python projects (several dozens to several hundreds of full-time engineers): TensorFlow, PyTorch and RAPIDS. The latter has given up on PyPI completely, and the former two do not upload sdists, because they are too problematic (failed installs highly likely) - which is a shame, because sdists have significant value for archival and code-flow-to-packagers reasons. This requires-python issue is not a main driver for not having those sdists, but it does show how problematic it is to try installing sdists that aren’t going to work.

On locking install tools:

  • Poetry and PDM clearly have usability issues here.
  • The Poetry/PDM behavior, and the resulting flow of packages to PyPI with unnecessary caps seems to drive most of the opposition to adding any caps at all. This is understandable, but it’d be much better to push those tools to stop doing that rather than to continue pushing back on all caps.

This isn’t true. The transition mechanism I had in mind for SciPy is to upload a new sdist for the last release which was missing the upper cap in requires-python to error out in setup.py with a clear error message. That’d be equivalent to what @henryiii is advocating for (modulo it doesn’t solve the immediate issue with locking solvers), and that then becomes irrelevant once the final design is implemented in install tools.

@henryiii is correct that this is a much more important wish/problem. It’s a little orthogonal though, as I hope my first points on pragmatism vs. good long-term design made clear.

@pf_moore that’d be great, and is Speculative: --only-binary by default? · Issue #9140 · pypa/pip · GitHub (your original proposal). It’s a significant amount of work, and it’s still not clear to me that it has enough buy-in from install tool maintainers (?). I already replied on the issue after you asked about potential funding: “If it looks like there will be buy-in for this idea from the relevant maintainers/parties, I’d be happy to lead the obtain-funding part.” Still happy to do that, and confident I can actually obtain that funding in 2022. I’m not quite prepared to do the significant amount of work of getting all the buy-in we need before arranging funding though, or to arrange funding and then not get it done because of lack of consensus. So this is a bit of a chicken-and-egg problem. Someone within the PyPA who understands what’s needed and has connections to the relevant parties would be better placed to do this initial alignment (if a smaller amount of funded time would help there, please let me know - that’s easier to arrange).

Second thought on this: it must be the default. Any user-level opt-in switch (e.g. writing in your docs pip install --only-binary :all: scipy) is useless, because users don’t read docs - and when you have O(20 million) users, that’ll be a lot of bug reports and wasted time.

Third thought on this and on capping in general: about half of all Python users are scientific / data science users now. These users are not developers - they are scientists and engineers first, and programming is a tool to do their actual job. Expecting them to figure out how to fix up their install commands after a new release of some dependency has broken their pip install some_pkg is a poor idea. The prevalent attitude to caps around here is “don’t add them, when it breaks just fix it”. This just plain doesn’t work for these users. And unfortunately these users do sometimes(/regularly) work in places with outdated (or non-Linux) HPC systems, and may therefore need to build from an sdist. Building from sdists therefore needs to be reliable - I wish we could rely on “only build from source if you’re an expert”, but we can’t.


Thanks for the extremely well thought out response. I don’t have time to make a detailed response here (and honestly, I’m not really the person who should) but I’d like to make one point here from the perspective of a pip maintainer which might be getting overlooked.

Sigh. I just can’t write a “quick reply”, no matter how hard I try :slightly_smiling_face:

tl;dr: Any proposals here need to be very clear whether they are looking at the “theoretical” problems with caps, or at “how pip can hack around the issue” (with a side question of “what about other installers that may exist, now or in the future”).


Source distribution handling in pip is a huge mess of heuristics, backward compatibility hacks, and out-and-out guesses as to what the right behaviour should be.

Any comments on this thread about what “installers should do” fills me with dread, because implementing it in pip will no doubt trigger an extended and draining debate on edge cases and failure modes.

The issue here is fundamentally that dependency solvers are complex, and the key algorithms are based on principles that don’t apply in Python packaging (namely, that the problem can be statically defined in advance). In developing the new resolver, the pip developers have had to make compromises and design decisions to adapt existing approaches to the realities of Python packages. (And Poetry and conda have made their own, different, compromises, which is why we have to be careful here to not agree on a solution that only works for pip). Wheels are fairly simple to handle, because the only issue we have to address is the fact that metadata isn’t available “up front” but must be introduced on demand. Source distributions, however, are a nightmare, because builds might fail, builds have their own dependencies, building a sdist or even just getting the metadata can be hugely expensive, etc.

@henryiii is working from a rather theoretical model, where a “solver” finds a suitable set of things to install, based on the available constraints. That’s how the example of pip downloading and trying to build numba 0.51 comes about. Under that model, Python version caps are a problem because they aren’t applied correctly (all older versions of numba should cap the Python version, but they can’t because of immutable metadata).

The proposed solutions (apart from “do nothing”), however, are not looking at that theoretical model, they are rather adding heuristics outside of that model.

  • Error out if an upper bound is detected. Anywhere? Even on a wheel? On an sdist? What even does “detected” mean? What if pip scans numba 0.52 first for some unknown reason, finds an upper cap of Python<=3.9 and errors, even though numba 0.53 supports 3.10?
  • Ignore the back-search if an upper bound is detected. Again, what is “detected” here? What back-search? This assumes a certain implementation method. Pip does search backward through valid versions, because we prefer to install newer versions when possible, but that’s not guaranteed - all we actually promise is that we will find some valid solve. I have no idea if conda or poetry work like this (I believe conda uses a SAT solver, which may not do this at all).

Not all solvers use a backtracking model (although our research for pip suggests that there’s currently no usable solver that handles “on demand” metadata apart from backtracking ones), and not all backtracking models necessarily follow a strict “latest to earliest” scan of versions (although pip currently does in most cases - probably! Our prioritisation logic is distinctly non-trivial and I wouldn’t guarantee we never check numba 0.51 before numba 0.52[1]…)


  1. Actually, I can describe how that could happen, but I won’t unless asked, because this post is already too long… ↩︎


My understanding is the following:

  1. Run the package resolution as normal, except ignore any upper bound on the Python version during the resolving
  2. If the resolved package has a Requires-Python field in its metadata, and that field specifies an upper bound for the Python version, then perform an action

Some of the options in the poll above are choosing which action to take (e.g. raise an error if the version of Python in the target environment is above the upper bound).