Adding a mechanism to forward user-facing warnings from build-backends to build-frontends

pradyunsg · June 19, 2023, 10:41am

Today, pyproject.toml-based build backends do not have any mechanism from the Python hooks to communicate to the caller, other than sys.stdout and sys.stderr. Currently, both of these are fairly noisy on multiple build-backends (especially when they spawn subprocesses) and suppressed by default with pip, and via an opt-in with build.

There is a noted desire for a less noisy side channel for additional communication. The use case that has come up so far is presenting warnings.

Concretely, this will help build-backends show end users warnings about usage patterns that they discourage. Using a new mechanism to communicate information that is intended to be user-facing, from the backend to the frontend, would avoid burying this information in a wall of output from the build-backends.

Since the underlying build-backend is expected to have a Python interface, I’m proposing to enable forwarding warnings.warn messages that derive from UserWarning (concretely, by overriding showwarning in the subprocess-that-wraps-the-hook). Notably, this mechanism will also work with the current somewhat-ad-hoc workaround for this implemented by setuptools, which uses a UserWarning-derived base class.
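A minimal sketch of the idea, assuming the wrapper script simply replaces `warnings.showwarning` while the hook runs. The names `forward_user_warnings`, `BackendDeprecation` and `fake_hook` are hypothetical and only illustrate the filtering on `UserWarning`; this is not the pyproject-hooks implementation.

```python
import warnings

captured = []  # (category name, message, filename, lineno) tuples


def forward_user_warnings(message, category, filename, lineno, file=None, line=None):
    # Treat only UserWarning-derived categories as user-facing; drop the rest.
    if issubclass(category, UserWarning):
        captured.append((category.__name__, str(message), filename, lineno))


class BackendDeprecation(UserWarning):
    """Hypothetical backend-specific warning class."""


def fake_hook():
    # Stand-in for a backend hook such as build_wheel.
    warnings.warn("setup.cfg key 'foo' is discouraged", BackendDeprecation)
    warnings.warn("internal detail", ResourceWarning)


# catch_warnings restores both the filters and showwarning on exit.
with warnings.catch_warnings():
    warnings.simplefilter("always")
    warnings.showwarning = forward_user_warnings
    fake_hook()
```

Here only the `BackendDeprecation` warning ends up in `captured`; the `ResourceWarning` is silently dropped by the override.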

This is an interoperability piece though, so I’m bringing it up here. I’m unsure if this needs a PEP (I’m hoping it doesn’t, and expecting it does) since it’s primarily using a piece of the standard library to forward information.

pradyunsg · June 19, 2023, 10:45am

Oh, I should add prompting questions.

Does anyone have concerns with this approach?
Does anyone have opinions on whether this needs to be an interoperability PEP, vs handling this as a PR on packaging.python.org, vs leaving this as an implementation-specific thing?

steve.dower · June 19, 2023, 12:27pm

Could it use the logging framework with a specific named logger?

Since we have a Python API for invoking the backend, it leaves the opportunity for the front-end to set up the logger with whatever handlers or formatters they like.

Ultimately, stdout and stderr are likely to get content regardless (not everything can be captured reliably), but I do like the idea of having a mechanism for showing messages even on overall success, and without the clutter of all the output.

pradyunsg · June 19, 2023, 12:51pm

Yes, although we’ll need to bikeshed the logger name.

They do already have the ability to format the warnings messages though, since they’re the ones calling warnings.warn.

pf_moore · June 19, 2023, 1:19pm

IMO, passing logging is richer (and hence more flexible) than just passing warnings, so seems useful (either as well or instead). But there’s a question of whether it’s needed. YAGNI might well apply here.

The PEP 517 API is conceptually an in-process Python API^[1], and as such, it’s reasonable to suggest that any “out of band” information (exceptions, warnings, logging) get passed through from backend to frontend. In practice, only exceptions^[2] have been surfaced like that in the past.

My view:

This could be left as a quality-of-implementation matter for the wrapper library. But if we want to make it a standard requirement that things get passed through the boundary, it’ll need a PEP.
If it’s a QOI matter, backends must not assume warnings or logging will be seen by the user, because they don’t know what library will be used to call them. But they can use pyproject-hooks-friendly warning subclasses or logger names “just in case”, if they want to.
In reality, calling the backend any way apart from in a subprocess is probably impractical, so deciding that in-process information like exceptions, warnings and logging is lost across a backend API call is also a reasonable position to take.
We can choose different answers for each of exceptions, warnings and logging.

If pyproject-hooks passes all warnings and logging from the backend to the frontend, I’m less worried about standardising right now. We can wait to see if there’s an emerging consensus from backends^[3]. But if pyproject-hooks plans on only exposing certain loggers, or certain warning subclasses, it’s imposing a de facto standard and I feel it would be more honest in that case to agree what gets passed on as a formal standard.

[1] There was a competing subprocess-based proposal that used a CLI interface to the backend, which was rejected in favour of PEP 517.
[2] And a limited set of those, I believe.
[3] Backends don’t have a strong track record for converging on common behaviour, though, TBH.

pf_moore · June 19, 2023, 2:06pm

… of course, it’s equally possible for backends to not just dump raw build tool output on stdout, but to agree on a standard output format that front ends could consume. I’m not criticising the fact that they haven’t, just noting that front ends probably shouldn’t expect any meaningful cross-backend structure to warnings or logging output, for essentially the same reason. So I think it probably makes more sense for the backends to drive this discussion, based on what capabilities they want/need, rather than the frontends, based on what they hope backends will provide…

pradyunsg · June 19, 2023, 3:56pm

puts on backend author hat

I agree. This is coming from my desire to have this ability in backends that I’ve authored & maintain, as well as a similar need/desire for setuptools that was the motivating case for the issue linked in OP.

puts on frontend author hat

Given that we’re going to need to do something to add support for whatever shape this mechanism takes, I think a new channel is nicer than special-casing a subset of the existing channel.

I also don’t know if it’s reasonable to expect all backends to agree on a single way of doing things on something that exists already – we’re not set up correctly for cooperation between backends on such stuff and we would still need frontends to special-case those anyway. Something like that definitely needs a PEP (and a lot of cat-herding).

takes off all hats

That’s what I think is the right thing for this, at the moment. I’m biased though, since I wanna drive this to completion sooner rather than later, and writing a PEP for this feels tedious honestly.

I’m wary of doing all logging, just because it would mean that the backend doesn’t get full control over the logs or that the frontend just gets formatted logs. And, if we do a logger with a specific name, same – I think that’ll need agreement on the name.

Somewhat similarly for warnings, I could be convinced that all logs should be forwarded – the tricky piece there is we’d need a proxy object for filtering stuff, and it won’t work cleanly with regular warning filter mechanisms.

steve.dower · June 19, 2023, 4:02pm

I propose the name build, and a recommendation that backends use a logger that propagates up to it (i.e. they do logger = logging.getLogger("build.pymsbuild") and use that).

Totally up to the front-end how they handle it. It’s easy enough to configure the build logger to write directly to stdout (respecting user preferences about verbosity) or pass them back to the host process.
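As a rough sketch of that suggestion, assuming the logger name `build` from the post above: the backend logs to a child logger, and the front-end attaches whatever handler, formatter and level it likes to the parent. The format string and the in-memory stream are illustrative choices, not part of any proposal.

```python
import io
import logging

# Backend side: a child of the proposed "build" logger.
backend_log = logging.getLogger("build.pymsbuild")


def backend_hook():
    # Stand-in for a backend hook; the messages are made up.
    backend_log.warning("setup.py usage is discouraged")
    backend_log.debug("compiler flags resolved")


# Front-end side: configure the parent logger however it likes.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s [%(name)s] %(message)s"))

build_log = logging.getLogger("build")
build_log.addHandler(handler)
build_log.setLevel(logging.WARNING)  # could be derived from the user's verbosity
build_log.propagate = False          # keep build messages away from the root logger

backend_hook()
output = stream.getvalue()
```

Because the child logger inherits its effective level from `build`, the debug message is dropped and only the warning reaches the front-end's handler.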

The main semantic difference I’d want from the existing stdout/stderr capture is that logged messages should be made available to the user (unless they explicitly suppress some level of messages) regardless of build success/failure.^[1] That is, the front-end should not wait for the overall result before deciding whether to show messages or not (it might wait for the build to finish before they get shown).

Given the back-compat expectations, I’m fine with front-ends only showing captured stdout/stderr on failure. And since some build tools/scripts don’t really have options for it, I’d rather not make backends capture and forward stdout/err themselves.

[1] Which I guess means the front-end can’t just write to stdout from the subprocess… oh well.

pf_moore · June 19, 2023, 4:34pm

It depends on the intent here, IMO.

If this is to allow backends to say “here, this is some data (warning, log message, whatever) that needs to be shown to the user”, then I suspect it needs a PEP, as it needs agreement between backend and frontend. Even if that agreement is simply to provide a “level” that the frontend can use along with a verbosity setting.

If it’s to allow frontends to see what’s going on in the backend, in the same way as it could if the backend was called in-process (with all the uncertainty that involves if a backend calls a 3rd party library that does logging), then I think it’s just an implementation detail of pyproject-hooks (because PEP 517 is silent on whether exceptions/logging are propagated).

I don’t want to insist on a PEP if it’s not necessary, but I don’t want to repeat the mistake we made with project name normalisation and wheel filenames. If tools need to agree on something for it to be useful, we need to get buy-in (even if only of the “silence is assent” form) from those tools.

This is a very reasonable suggestion, but is precisely the sort of constraint that needs buy-in. If a backend already has logging under its own namespace, and build scripts can interact with that somehow, changing to a new namespace might be unacceptable to them. A PEP would let them say so, whereas just making that decision in pyproject-hooks, or adding it in a spec update PR, wouldn’t.

Also, let’s be careful here - we’re being very casual about talking about forwarding stuff here, but it does involve inter-process marshalling of log and warning data. I trust that @pradyunsg would speak up if he thought this was unacceptably difficult, but as soon as it becomes an expectation, or a standard, every caller of a build backend has to implement this, not just pyproject-hooks.

To summarise, pyproject-hooks propagating warnings and/or logging wholesale seems like a thing they can do without too many problems - frontends can react if stuff comes through, backends can publish stuff knowing that it may or may not be seen, and everything’s fine (albeit a bit vague). But as soon as tools expect certain behaviours^[1], we need to standardise properly.

[1] Even if it’s just “pyproject-hooks will definitely pass through the build.* logger”.

abravalheri · June 19, 2023, 6:29pm

Regarding the YAGNI aspects of the proposal, I can mention the following examples of how I would use the proposed approaches if they were available today:

warnings: The backend needs to show deprecation warnings to package developers.
This kind of works nowadays if the devs use build because it does not hide the stdout, but it seems that some devs use pip wheel (or simply test with pip install . and ignore the further build output when it is done in the CI environment).
Allowing deprecation warnings to be shown with pip install . would be tremendously helpful.
logs: There are some bits of information that the backend wants to show to devs when they run pip install -e ., but in a way that avoids breakage on the CI if PYTHONWARNINGS=error (which is a common practice).
One example is related to the PEP 660 implementation via symlinks. In this case we want to advise the user to not remove the directory that contains the symlinks (and is referenced by the .pth file), but if we adopt a UserWarning this might break disposable CI environments that use strict warning filters.
Still related to PEP 660, the installer/frontend controls the UI and can decide to hide the stdout, but the backend is the one responsible for deciding the implementation details. This makes it difficult for the backend to convey information about the limitations of each method and provide advice (as a “post-install”/“during-install” message).
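The difference between the two channels under strict filters can be sketched as follows. `warnings.simplefilter("error")` is used here as an in-process approximation of `PYTHONWARNINGS=error`; the logger name `example.backend` and the message text are made up for illustration.

```python
import logging
import warnings

log = logging.getLogger("example.backend")  # hypothetical logger name
log.addHandler(logging.NullHandler())       # keep the example silent


def advise_with_warning():
    warnings.warn("do not delete the build directory", UserWarning)


def advise_with_log():
    log.warning("do not delete the build directory")


results = {}
with warnings.catch_warnings():
    warnings.simplefilter("error")  # every warning becomes an exception
    try:
        advise_with_warning()
        results["warning"] = "ok"
    except UserWarning:
        results["warning"] = "raised"
    advise_with_log()  # logging is unaffected by warning filters
    results["log"] = "ok"
```

The same advice aborts the call when sent as a `UserWarning`, but passes through harmlessly as a log record.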

pradyunsg · June 19, 2023, 8:29pm

FWIW, I’ve pushed the chunks of code I have for this idea to:

github.com/pypa/pip: Forward warnings during backend calls to the subprocess logger
pypa:main ← pradyunsg:build-backend-warnings, opened 08:03PM - 19 Jun 23 UTC by pradyunsg, +54 -15

This is the supporting change to https://github.com/pypa/pyproject-hooks/pull/17…1, with the logic changes in pip. As it is, this is a little messy but it works nicely. This is not API-compatible with the existing pyproject-hooks, since it imports `BuildBackendWarning` which doesn't exist in the currently vendored copy. (and, yes, I considered using `logging.captureWarnings` but discarded that idea since I wanted to filter for only `BuildBackendWarning` in that context)

A textual description of how this works:

On pyproject-hooks’ end, there’s two logical pieces:

In the hook calling script, all the warnings that the backend hook has spewed out are captured. Then, details of the spewed warnings (message, filename, lineno), which are subclasses of UserWarning, are returned to the calling process.
In the calling process (i.e. pip), it’ll spew out a pyproject_hooks.BuildBackendWarning, a subclass of UserWarning with the captured information.

On pip’s end, we’ll need to accommodate the fact that the build-backend hook calls can now spew out warnings; directing them through the subprocess logger (we could also attach a global warnings.showwarning, if we really want).
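The two pieces described above could look roughly like this. It is a sketch under assumptions, not the code from the linked PRs: `BuildBackendWarning` is redefined locally as a stand-in, the JSON layout (`return_val`, `warnings`) and the helper names are illustrative, and both sides run in one process here rather than across a subprocess boundary.

```python
import json
import warnings


class BuildBackendWarning(UserWarning):
    """Local stand-in for the class named in the PR description."""


def run_hook_and_capture(hook):
    """Child side: run the hook and return JSON with the captured warning details."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = hook()
    details = [
        {"message": str(w.message), "filename": w.filename, "lineno": w.lineno}
        for w in caught
        if issubclass(w.category, UserWarning)  # only user-facing warnings
    ]
    return json.dumps({"return_val": result, "warnings": details})


def reemit(payload):
    """Parent side: re-issue each captured warning as a BuildBackendWarning."""
    data = json.loads(payload)
    for w in data["warnings"]:
        warnings.warn_explicit(
            w["message"], BuildBackendWarning, w["filename"], w["lineno"]
        )
    return data["return_val"]


def fake_hook():
    # Stand-in for a backend hook; the wheel name is made up.
    warnings.warn("Hello from the backend")
    warnings.warn("not user-facing", DeprecationWarning)
    return "pkg-1.0-py3-none-any.whl"


payload = run_hook_and_capture(fake_hook)
with warnings.catch_warnings(record=True) as shown:
    warnings.simplefilter("always")
    wheel = reemit(payload)
```

Only plain strings and integers cross the boundary, so the parent never needs to import the backend's warning classes.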

How this looks in the final form:

That’s from a pip install -e . on pip itself, with an updated vendored copy of pyproject-hooks with pip’s setup.py modified as follows (setuptools invokes the setup.py script in-process within its backend):

❯ git diff setup.py     
diff --git a/setup.py b/setup.py
index d73c77b73..e39260731 100644
--- a/setup.py
+++ b/setup.py
@@ -23,6 +23,12 @@ def get_version(rel_path: str) -> str:
 
 long_description = read("README.rst")
 
+if "editable_wheel" in sys.argv:
+    import warnings
+
+    warnings.warn("Hello from `setup.py editable_wheel`")
+
+
 setup(
     name="pip",
     version=get_version("src/pip/__init__.py"),

Yup, this is basically intended to solve the pip install doesn’t show my deprecation warnings problem.

As am I. That’s why I’ve brought it up here before throwing code as PRs on pyproject-hooks and pip.

Yea, I wanna push-back on the scope-creep that’s been happening through the discussion. I was not thinking of forwarding all warnings or all log messages. In the implementation linked to above, I’m intentionally not marshalling “rich” objects (i.e. warning classes).

There’s two reasons for that… (1) it’s not particularly useful information (we could add that info to the message if we want, but I’m ambivalent) and (2) it’s tricky to pass around that information since the build-backend isn’t a module that can be imported on the caller subprocess’ end safely.

Note that I’m also not thinking of forwarding the warnings “live” – this is a subprocess call that the parent process ends up waiting on and there’s nothing on the parent process’ end to handle I/O^[1], and I think it’s OK to have the warnings show up after a successful run (we could move things around to spew out warnings even if the process fails – if you want that, say so on the pyproject-hooks PR as a review please!).

[1] We could add that, but I think that’s way more work than we want to be doing here, especially given the end goals of this.

pf_moore · June 19, 2023, 9:32pm

That makes sense, but again, it’s an implementation detail that people may end up relying on. As is the decision to only forward UserWarning. What about deprecation warnings? If backends start issuing user warnings for their deprecations because that’s what pyproject-hooks forwards, we’re getting into de facto standard territory.
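To make the concern concrete: the standard deprecation categories are siblings of `UserWarning`, not subclasses, so a filter on `UserWarning` would skip them unless a backend defines its own class. `BackendDeprecationWarning` below is a hypothetical example of such a class.

```python
class BackendDeprecationWarning(UserWarning):
    """Hypothetical: a deprecation class a backend might derive from UserWarning."""


# Which categories would pass a "derives from UserWarning" filter?
would_forward = {
    cls.__name__: issubclass(cls, UserWarning)
    for cls in (
        UserWarning,
        DeprecationWarning,
        PendingDeprecationWarning,
        FutureWarning,
        BackendDeprecationWarning,
    )
}
```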

Agreed, and your reasoning is perfectly sound. But again, it’s starting to feel like something that we can’t assume people will all have the same interpretation of what’s reasonable.

I’m sorry, but I’m starting to think this might need a PEP. Maybe that’s just because I don’t personally have any need for this feature (either from the backend or frontend POV) so I can’t see what’s “obvious”. But the fact that we’re not set up for encouraging backend co-operation (as you mentioned above) means the only mechanism we have for ensuring all backends make the same assumptions is standardisation.

pradyunsg · June 19, 2023, 10:20pm

Yea, I think that’s fair even though it’ll mean that this will go into my bucket of things to write PEPs for.
