How can build backends avoid breaking users when they make backwards incompatible changes?

I don’t think it’s inevitable, and the UX design is one of the things I’m concerned about. IMO, pip shouldn’t be taking on complexity here - I don’t see any evidence that this is a major issue. We’ve never had any problems like this with any other build backend. And while I don’t want to blame setuptools[1], I also don’t want pip’s feature set to be dictated by mishaps in setuptools’ release management.

I won’t block such a feature, but I don’t particularly support it.

Maybe. If you (or someone else) can articulate it without referring to setuptools or to this recent issue, that would be helpful.

There are also shallower issues. The deprecation warning setuptools issued was reportedly hard to find, and often “lost” in log spam. That’s a UI issue that has multiple aspects. Setuptools doesn’t have much control over what output it displays (including compiler output that it has essentially no control over beyond “display it or not”). Build frontends don’t have access to any structured backend output; all they get is stdout and stderr (again, “display or not”). Users complain when normal execution produces too much output. So important messages do get hidden or ignored. But equally, people never act on warnings until it’s forced on them - and then they complain that it was a surprise. It’s a people issue, not a technical one.
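The structured-output gap described above could in principle be bridged by a convention between frontends and backends. This is a purely hypothetical sketch - no such protocol exists today, and the `::priority::` prefix is invented for illustration - of how a frontend could surface must-see backend messages even when normal build output is suppressed:

```python
# Hypothetical sketch: backends only have plain stdout/stderr today, so
# suppose (purely for illustration) a convention where must-display lines
# carry a "::priority::" prefix. A frontend could then separate them from
# ordinary log spam and show them even in quiet mode.

PRIORITY_SENTINEL = "::priority::"

def split_backend_output(lines: list[str]) -> tuple[list[str], list[str]]:
    """Separate priority messages from ordinary build output."""
    priority, normal = [], []
    for line in lines:
        if line.startswith(PRIORITY_SENTINEL):
            priority.append(line[len(PRIORITY_SENTINEL):].strip())
        else:
            normal.append(line)
    return priority, normal

output = [
    "running build_ext",
    "gcc -O2 -c spam.c ...",
    "::priority:: DeprecationWarning: option X will be removed in v70",
]
priority, normal = split_backend_output(output)
print(priority)
```

This would not fix the people problem, but it would at least separate “please read this” from compiler noise.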

People also don’t pin stuff, or use tools like lockfiles. And they don’t test build infrastructure upgrades before letting them into production. There’s no “middle ground” between YOLO-ing the latest release into production, and managing all your build dependencies and environments manually.

And, of course, there are major mismatches in expectations - many of the people most affected by this sort of issue are businesses who push their priorities and deadlines back onto volunteer developers, but don’t understand the underlying dynamics of volunteer open source work.

There’s also groups like Linux distros, who are volunteers, and who do take a cautious approach in a lot of cases - but they work to very different principles[2] than the packaging ecosystem is designed around, so there’s still big mismatches in expectations.

There’s a lot of systemic “how people deal with open source” problems involved here, as well as the packaging ones, and we’re not going to solve those here.

Setuptools is a special case. It has way more history than any other build backend, it has a design that not only allows, but encourages ridiculous levels of interference with implementation details by user code, and it’s used by default in every ancient, unmaintained and frankly broken package on PyPI. Their problems are massive, and frankly I’m impressed that they are even willing to continue accepting that support burden.

But who’s “talking about build backends” anyway? If build backend maintainers are involved, the conversation will be productive. If not, why are we deciding what other people should do with the software they maintain anyway? “We” (by which I mean “people in the packaging community who aren’t setuptools maintainers”) can’t solve issues for setuptools[3], and I’m not sure we should even try to.


  1. From what I can see, this was removing a deprecated option that had been warned about for 3 years or more, in a major release ↩︎

  2. such as “build everything from source” ↩︎

  3. or any other build backend ↩︎

4 Likes

I outlined a scenario here, without mentioning setuptools, that pip is currently incapable of handling: How can build backends avoid breaking users when they make backwards incompatible changes? - #9 by notatallshaw

But it rests on the premise that build backends make incompatible changes at some point. If this premise is faulty, then this isn’t an issue.

I brought this up in my opening post and linked to: Warnings in backends as suppressed by frontends · Issue #558 · pypa/packaging-problems · GitHub

If someone is motivated about backends not being able to communicate with the user, they should follow up on your comment.

I’m not aware of any tool that provides lockfiles that include build backend locking, and the proposed lockfile PEP doesn’t support it either.

One build backend maintainer has replied so far, I’m hoping for more.

Changes in how frontend tools work can absolutely solve problems here.

What if, for example, by default pip chose the minimum version of the build backend with a wheel available whenever a lower bound was given? This would solve the scenario I outlined, at least until the user upgraded to a version of Python that old versions of foo didn’t support.
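The proposal could be sketched as follows. This is a toy model of the selection rule only - hypothetical pip behavior with a deliberately simplified version parser, not real resolver code, and the version numbers are illustrative:

```python
# Sketch of "choose the minimum matching backend version" (hypothetical
# frontend behavior). Given the backend versions that have a wheel
# available and the lower bound from build-system.requires, pick the
# oldest satisfying version instead of the newest.

def parse(v: str) -> tuple[int, ...]:
    """Parse a simple X.Y.Z version string into a comparable tuple."""
    return tuple(int(part) for part in v.split("."))

def choose_backend_version(available: list[str], lower_bound: str) -> str:
    """Return the minimum available version that is >= lower_bound."""
    candidates = [v for v in available if parse(v) >= parse(lower_bound)]
    if not candidates:
        raise LookupError(f"no version satisfies >={lower_bound}")
    return min(candidates, key=parse)

# e.g. build-system.requires = ["somebackend>=64.0.0"]
print(choose_backend_version(["61.0.0", "64.0.0", "68.2.0", "78.1.0"], "64.0.0"))
# -> 64.0.0
```

The trade-off is visible even in the toy model: the chosen version only moves forward when the project raises its own lower bound, not when the backend releases something new.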

1 Like

Windows is backed by a multinational, multi-trillion dollar company. Much OSS maintenance, yes even for packages with millions of installs a month, often has to fit in the spare time of a handful of folks. I’ve said it before: users needing to deal with this kind of churn is absolutely the price of free software.

8 Likes

There are also shallower issues. The deprecation warning setuptools issued was reportedly hard to find, and often “lost” in log spam. That’s a UI issue that has multiple aspects. Setuptools doesn’t have much control over what output it displays (including compiler output that it has essentially no control over beyond “display it or not”). Build frontends don’t have access to any structured backend output, all they get is stdout and stderr (again, “display or not”). Users complain when normal execution produces too much output. So important messages do get hidden or ignored. But equally, people never act on warnings until it’s forced on them - and then they complain that it was a surprise. It’s a people issue, not a technical one.

In the circles where I run, the biggest source of repeated impact is the surprising long-term stability and backwards-compatibility of CPython itself. There are rather a lot of (particularly pure Python) libraries out there on PyPI whose most recent releases long pre-date any such deprecations and come from a time before the prevalence of wheels (or at least when there were a lot more people who didn’t bother to make them for pure Python packages because they saw wheels as redundant in those cases). Many such libraries still work fine and appear in large transitive dependency sets for popular projects… until an unconstrained SetupTools invocation decides that it wants to reject some aspect of those packages.

Not that I hold any animosity over this; the SetupTools maintainers have volunteered for a Sisyphean task and are faced with being basically unable to remove old features or add safety measures without breaking hundreds or thousands of users (many of whom respond in unfortunately unkind ways).

But who’s “talking about build backends” anyway? If build backend maintainers are involved, the conversation will be productive. If not, why are we deciding what other people should do with the software they maintain anyway? “We” (by which I mean “people in the packaging community who aren’t setuptools maintainers”) can’t solve issues for setuptools[1], and I’m not sure we should even try to.

With my PBR maintainer hat on (it’s a pyproject build backend and SetupTools plugin), we try to maintain backwards compatibility as much as possible and transparently translate old options to new metadata fields wherever we can. At the moment, its latest releases are still declaring Requires-Python: >=2.6 though realistically we only test on 2.7 and 3.6+ these days (if users encounter regressions in scenarios we don’t test, we still consider them bugs and will do our best to implement fixes). As such, we don’t recommend users set an upper bound on pbr in their build-system.requires list, but because it also relies on an unbounded version of SetupTools we’re still unfortunately exposed to regressions on that front.

Please do note, this is merely the approach PBR’s maintainers have taken to backward compatibility, in part due to a user base with a disproportionately high amount of old LTS or even effectively EoL platforms, coupled with the historically impossible (or at least discouragingly complicated) situation around constraining build dependencies in the broader Python packaging ecosystem. I’m not personally advocating for this approach, it’s absolutely a lot of effort and we do regularly question the necessity or wisdom of continuing in that fashion indefinitely.


  1. or any other build backend ↩︎

5 Likes

I think there is no “perfect solution” - and in a way it should be left to the maintainers what is more important for them. But we also have to remember that a significant part of the problems yesterday were caused by sdist-only packages that are Python-only, and that can easily be addressed by a “whatever” approach in builds and publishing a .whl.

I think we will never be able to support building old packages - forever in all possible future circumstances - this is simply impossible, not only because of build requirements in pyproject.toml but also because - for example - someone will want to build a package on new Debian 266 :slight_smile: and the libraries you will need will relocate or disappear or will not be installable any more.

I think what we should really aim for is similar to the goals of “build reproducibility” when you release a package - not “reproducible forever”, but reproducible given a set of tools and an environment that will be around long enough that the version will be relatively old and replaced by a new one.

Software rot (Software rot - Wikipedia) is a thing and it will happen regardless of how much we want to protect against it. Old packages will fall out of maintenance and their users will have to switch to other packages, etc. I think we should rather encourage maintainers and the whole ecosystem to review, help and upgrade their dependencies, and accept the fact that sometimes things will fail, than aim for the impossible goal of “have a package buildable forever in all future circumstances”.

In this context - what I prefer in airflow (and yeah, I have a bit simpler problem because we are Python-only) is to pin all the build requirements with ==. That provides “medium-term” build reproducibility (important for security), but it also acknowledges that in the future, with new Python versions and new systems, this might not work. We accept the risk, as we know we will be releasing future versions and encourage users to upgrade. In a way, it’s even a good idea at some point (and I am only half-joking) to make an old version of your software not installable - as it will force people to upgrade to a later version or just drop your package - which is good for security and general supply chain healthiness.
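For illustration, a minimal pyproject.toml following this pinning approach might look like the fragment below. The backend and version are illustrative examples, not Airflow’s actual pins:

```toml
[build-system]
# Exact pins give medium-term build reproducibility; a bot such as
# dependabot can bump them on each release. Versions are illustrative.
requires = ["flit-core==3.9.0"]
build-backend = "flit_core.buildapi"
```

The trade-off, noted later in this thread, is that exact pins can conflict with whatever single backend version a distro ships.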

Quite recently, one of the Airflow committers even submitted a change to dependabot so that it can also bump build requirements in pyproject.toml when they are pinned. This means that when we release a later version of Airflow, the build requirements are pinned to the latest versions of the build dependencies available “now”. We don’t have to worry about it - dependabot will open PRs when needed and we merge them when the build passes (and our CI tests all the build scenarios for all 90+ distributions we distribute from our monorepo).

And it works well for us.

6 Likes

I just want to note that this may cause issues for downstreams because they usually have one version of setuptools in a distro that they build wheels with. And whatever you pin has a high chance of conflicting with that, forcing the distro maintainers to patch it out, which creates maintenance burden too.

In my upstreams, I always use PIP_CONSTRAINT which allows the CI to be stable. And the end-users can do the same if they wish, but I’m not limiting them with potential dependency conflicts.
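As a sketch of this workflow: PIP_CONSTRAINT is pip’s environment-variable form of --constraint, and unlike the CLI flag it is also picked up when pip sets up isolated build environments, which is what makes it useful for pinning build backends. The file name and versions below are illustrative:

```shell
# Create a constraints file that caps the build environment.
# Versions are illustrative examples, not recommendations.
cat > build-constraints.txt <<'EOF'
setuptools==75.8.0
wheel==0.45.1
EOF
cat build-constraints.txt

# Then, in CI (not run here), export the variable before installing,
# so even isolated builds of sdists use the pinned backend versions:
#   PIP_CONSTRAINT="$PWD/build-constraints.txt" pip install ./your-project
```

End users remain free to use newer versions, since constraints live in CI configuration rather than in the package metadata.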

Another solution might be setting upper bounds on release: ansible/packaging/release.py at e66aaa66a5bc231a8452b1927f1f1266332ba23e · ansible/ansible · GitHub. Although, I think this may have similar problems.

In general, I think that it would be useful to request that build front-ends support build env pinning. Though, that might have to wait until we have a lock file standard.

3 Likes

I agree that it is not easy for open-source maintainers to care as much about backwards-compatibility as Windows does.

On the other hand, the breaking change that inspired this thread is removing a single str.replace call. Keeping it as-is would not require any extra maintenance. So I think that build backends should take extra caution when making changes, and ideally not break backwards compatibility, especially over things as minor as - vs _.
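For a sense of scale, the kind of transformation being discussed is roughly the following - a generic sketch of dash/underscore normalisation, not setuptools’ actual code:

```python
# Generic illustration: normalising "-" vs "_" in an option name is a
# one-line str.replace transformation. Option names are made up.

def normalise(option: str) -> str:
    """Treat dashed and underscored spellings of an option as equivalent."""
    return option.replace("-", "_")

print(normalise("author-email"))  # -> author_email
```

Removing such a shim is trivial for the backend but invisible to every published sdist that still relies on the old spelling.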

So maybe we should demand that build backends be as compatible as Windows is, given their important role in the ecosystem and potential for breakage if they are not careful.

2 Likes

I’m not at all comfortable with the idea of framing this in terms of expectations on build backends, or what frontends can do to help “fix” problems caused by build backends making incompatible changes. The setuptools issue that triggered this discussion is very polarised, with people looking to assign blame (or at the very least, “responsibility”) and I don’t want to be part of that.

Where I think build frontends fit in, is that they are intermediaries, enabling one set of users (developers) to interact with another (build backends). In that context, setuptools is just as much a user of pip as the developer, and we should be looking at how pip can help setuptools manage their deprecation process better[1].

So why don’t we ask the setuptools maintainers what they would like from build frontends? Not right now - emotions are too high at the moment - but once things have cooled down, we should work with setuptools to understand how we can help things go more smoothly in future.

  • There was no way for setuptools to get the deprecation warning in front of their users? Then frontends could add a way for backends to issue “priority warnings”, which will be displayed even when normal output is suppressed.
  • The advice setuptools gave their users to use PIP_CONSTRAINT didn’t work? Find out why, fix any bugs, and add functionality if needed.
  • Legacy projects without a pyproject.toml were the issue? Maybe frontends should default to a specific version of setuptools, rather than just assuming the latest is still compatible with legacy projects?
  • Projects can’t easily manage a staged rollout of new build backends? Maybe we need a --build-constraint mechanism that can be used to put an upper limit on backend versions.

The point is, let’s not try to solve “the problem” in the abstract. Let’s treat it like we would any other case where one of our users (in this case setuptools) hits an issue - understand the problem and work on a solution with the user.


  1. As an aside, I’ll say that while I may not agree with the way setuptools managed the recent deprecation, I will defend their right to do what they feel is in the best interests of their project and their users ↩︎

12 Likes

I think we have a bit different situation here - we don’t use setuptools, we use flit-core for providers and hatchling (for main airflow) - and especially flit-core is nice as it’s the only dependency you need. Also, our package uses pyproject.toml and we expect build isolation to kick in. I rather doubt some upstream users would want to pin a specific version of flit-core - or hatchling (with those few dependencies it has).

And we have a little different situation - as we are not a library that might be used by a number of upstreams, so our worry is more about our users who install airflow as an app, and most of them will use .whl anyhow; very few might actually want to build airflow from .sdist or sources.

So yeah, our situation is not as bad as many other libraries’. But particularly for the Python-only libraries and tools, switching to - say - flit and pinning it in requirements does not seem like bad advice.

1 Like

I more or less agree, and my corollary from this is the same one I had in other packaging threads: a lot of this pain would be sidestepped if anything that requires a build step (i.e., sdists) were not part of what’s automatically searched/installed by install tools like pip. In other words, a clear separation between “you are the system integrator and accept that you need to build things” and “you are just someone who wants to install things”. Then a lot of things would fail early and loudly, and people would not get used to depending on sdists in ways that may break later.

1 Like

It’s a nice idea, but I do this at work with --only-binary :all:, and I set a handful of packages to --no-binary as there is no wheel version of them. The errors are not user friendly: you get a ResolutionImpossible and you need to decode it. Because the issue can be with transitive dependencies, it’s hard to determine if the problem is actually the --only-binary filter or an actual dependency conflict, so I don’t know how you would ever make it a good user experience.

However, I would be supportive of turning on --prefer-binary by default, which will prefer older wheels over newer sdists, there would need to be a new flag (or updated semantics) to turn that behavior off where needed though.

Couldn’t the user experience just be the same as if you tried to install a nonexistent package? Just “package blah not found”. It’s certainly true that it would be nicer if it said “only sdist found for package blah, you must build it separately”, and also true that it wouldn’t be a great transition experience when things start failing because they’re sdist-only (and needless to say that pain should be mitigated). But to me it’s just the same as if you tried to pip install something that only exists as a tarball on github. It’s just not there from the perspective of the installer.

1 Like

Couldn’t the user experience just be the same as if you tried to install a nonexistent package?

That would also be a bad user experience for transitive dependencies, as there could be some other path in the dependency graph that doesn’t use that particular dependency, so the whole graph has to be searched and the error is non-obvious. This doesn’t really come up for non-existent packages, because users don’t depend on them - they would get hard errors immediately - but lots of users do install from sdists.

Let’s discuss on that pip issue though, I’ll post my findings there of actually using this feature regularly.

Yeah, totally. Though, my point applies to any build backend in the context of downstreams specifically. For upstream users that work with upstream-provided stuff directly, pinning sounds like a good solution. Especially, for build backends that don’t have any transitive deps. Additionally, pinning would contribute to reproducibility, which is also nice.

To be honest, I believe that the following would be very very useful and would be a tremendous quality of life improvement:

  1. Not hiding warnings.
    This is a well-known and recurring topic of discussion. The snippet in Warnings in backends as suppressed by frontends · Issue #558 · pypa/packaging-problems · GitHub looks like an interesting starting point [1]. It would also be incredibly useful in other contexts (e.g. setuptools/setuptools/command/editable_wheel.py at v78.1.0 · pypa/setuptools · GitHub).

  2. Tell the users which packages have been installed via sdist.
    For example, imagine that after running install -r requirements.txt the user gets a message like:

    No wheels available for packages ab, bc, cd, de. Installed using sdists.

    Read about potential drawbacks and reproducibility issues in https://packaging.python.org/guides/sdist-drawbacks-and-reproducibility[2].

    That would be great, no?
    Now I don’t mean to put the spotlight on the colleagues working on frontends. I know that they are very complex to maintain and have problems of their own. I am just mentioning this as a brainstorm.

    Possibly a very similar approach could be used to increase visibility on issues that are also dear to frontends, for example, imagine the following:

    Packages ab, bc, cd, de do not contain pyproject.toml.
    Future installations may be impacted by implicit --use-pep517.

    Please read more information about … … …
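The second suggestion above could be sketched roughly like this - hypothetical frontend behavior, with illustrative package names, not anything pip actually does:

```python
# Sketch of a post-install summary (hypothetical frontend behavior):
# report which requirements were installed from sdists because no wheel
# was available. Package names are illustrative.

def sdist_summary(installed: dict[str, str]) -> str:
    """installed maps package name -> 'wheel' or 'sdist'."""
    from_sdist = sorted(name for name, kind in installed.items() if kind == "sdist")
    if not from_sdist:
        return ""
    return (
        f"No wheels available for packages {', '.join(from_sdist)}. "
        "Installed using sdists."
    )

print(sdist_summary({"ab": "sdist", "bc": "sdist", "de": "wheel"}))
```

A frontend already knows which distribution format it installed from, so a summary like this needs no new backend cooperation.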



  1. Alternatively or additionally, we can also discuss things like adding a build-api hook for the frontend to configure log verbosity, and/or a way for the backend to tell the frontend which logger name to subscribe to for messages that are essential to display to the user. ↩︎

  2. Hypothetical link, but it is in my mind to start working on something like https://packaging.python.org/guides/sdist-drawbacks-and-reproducibility, if anyone would like to beat me to it, please be my guest. ↩︎

9 Likes

Digging a little further, the next step on this appears to be this PR which I’ve just called attention to: Forward warnings from build-backend to build-frontend by pradyunsg · Pull Request #171 · pypa/pyproject-hooks · GitHub

IMO, this is a little too verbose, but I think it would make sense to go from:

Successfully installed: ab-1.0, bc-1.1, cd-0.9, de-1.2

To:

Successfully installed from source distribution: bc-1.1, cd-0.9
Successfully installed from wheel: ab-1.0, de-1.2

I have created a feature request for pip: Split "Successfully installed" message between sdists and wheels · Issue #13307 · pypa/pip · GitHub

This sounds like a good idea to me, but I’ve never looked at the PEP 517 / legacy install code paths, and I see a lot of open issues for pip; I’d need to carefully review whether this is already an open request.

1 Like

@abravalheri I made a branch of pip that vendors the pyproject-hooks PR to forward warnings from the build backend, you can see the details and an example install: Forward warnings from build-backend to build-frontend by pradyunsg · Pull Request #171 · pypa/pyproject-hooks · GitHub

A lot of these warnings have setuptools documentation links in them, I suspect many confused users will end up on the setuptools github issue tracker asking why they are getting these warnings. And they will probably mistake them for errors, especially if their command fails for separate reasons.

I would appreciate your feedback on that PR as to whether you would be happy, or not, for frontend install users to see these warnings.

1 Like

Thank you very much Damian,

Other maintainers may have a different opinion, but I think that it is better to have users contacting us before the removal than after; at least we can try to clarify and point out how to improve reproducibility of sdist dependencies. We can also tweak the links and the text to improve things.

That is a real risk, but the alternative is that users will only ever know about warnings when it is too late… Any other suggestion?

Is there any sensible/obvious way to show these warnings to the package developers but not the end user? Does that just have to be a user-specified front-end option? (Maybe it can be on by default for editable installs, but that’s far from universal.)

The best person to show the warnings to is the one who can fix them. And if they mistake it for an error, well, then they’ll fix it and no harm done :slight_smile:

1 Like