PEP 775: Make zlib required to build CPython

Hello everyone,

I would like to propose PEP 775, making zlib required to build CPython.

The PEP itself is quite short, but here is a TLDR:

Building CPython without the zlib compression library will no longer be formally supported, with the exception of WASI. This is mainly because a zlib-less build is useless for most users. Building without zlib will raise a warning, and the change also lets us remove some code that deals with zlib-less builds.

Full PEP: PEP 775 – Make zlib required to build CPython | peps.python.org

I feel the WASI exception makes the PEP a lot less persuasive. zlib is now required, except it isn’t required because we grant an exception for this one case.

Should we instead invest in zlib on WASI so that we can guarantee the presence of zlib without any exceptions?

There was previously discussion about zlib on WASI in Make zlib required on all platforms (simplifies code) · Issue #91246 · python/cpython · GitHub and No zlib in WASI · Issue #93819 · python/cpython · GitHub, but I haven’t found anything more recent than 2023.

I agree. I feel like the exception would be less glaring if it could somehow be couched in functional terms like “zlib is required for all platforms except those that. . .”. That might make no practical difference for now, but at least would create a little penumbra of information about future situations in which other officially supported platforms might not require zlib, or what would have to happen in WASI for it to also require zlib.

If it’s just going to be a warning, will the PEP really achieve the goal stated in its title? If the goal is to “Make zlib required to build CPython”, then building CPython without zlib should fail, unless an explicit --I-want-a-broken-and-unsupported-Python flag is passed to the configure script. If it’s just a warning and a few failing tests, people will still build broken Pythons (because they failed to install zlib development headers) and be surprised when pip does not work.

Think of it as a deprecation – here we add a warning, but keep old code working (and a platform that passes tests in the old configuration helps with that). And we give people time to report any cases where ripping out zlib would hurt them.
The point of the PEP is documentation/messaging. Technically everything works roughly as before (except “nice” error reporting for unsupported cases).

I don’t think that would help users all that much.
The core idea is “Python without zlib is perceived as broken”; this doesn’t apply on a platform that doesn’t typically deal with archives, wheels, or transport compression.

The WASI exception is helpful to “continue testing a platform without zlib, so that we don’t unintentionally break unsupported builds yet.”
As with deprecations, the intention is having a few years to fish out use cases we don’t know about, so when we break things, we have a better idea of what we’re breaking.

That’s intended.
We explicitly say that if pip does not work, it’ll be because whoever built CPython didn’t follow the build instructions, and didn’t test; it’s not because pip has an undocumented dependency on an optional module.

And practically: if you build CPython and don’t run the tests, you’re fine for some limited/personal use. Or for bootstrapping a port to a new platform. But if you start distributing the build to others (who might use it for arbitrary stuff), I don’t think you can skip running the tests.
IMO, the PEP lets you see the failures when they matter.
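
For anyone who builds CPython and wants a quick check before handing the result to others, a minimal sketch of a post-build smoke check might look like the following. The module list and the `missing_modules` helper are illustrative assumptions, not part of the PEP or of CPython's build system, and this is no substitute for running the test suite.

```python
import importlib.util

# Illustrative list only; which optional modules matter depends on the use case.
OPTIONAL_MODULES = ["zlib", "ssl", "ctypes", "sqlite3"]


def missing_modules(names):
    """Return the module names that this interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(OPTIONAL_MODULES)
if missing:
    print("This build lacks optional modules:", ", ".join(missing))
else:
    print("All listed optional modules were found.")
```

Note that `find_spec` only reports whether a module can be located; a module that is found may still fail at import time, for example if a shared library it links against is absent.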

Those where it’s not a priority? Those where the users don’t expect it? :)
The PEP is concerned with what’s supported by the core team, so there’s a known set of platforms to which it can apply, in PEP 11. Out of those, only WASI makes sense as an exception.
If someone is supporting another platform, supporting zlib (and any patches needed to make it work & pass the tests) would officially be on them. They can ask the core team to help, of course.
I don’t think it would help to put in a definition to classify platforms we don’t know about.

When we add a new platform to PEP 11, that would be a good time to decide whether it joins WASI.

Ignorant question here, but what does WASI typically “deal with” if it doesn’t deal with compressed data? That doesn’t seem to leave a lot of possible I/O on the table. Is WASI just a toy? And if it’s a toy, why do we care about supporting it at all?

I would rather make it more reliably removable, and fix up tests that assume it’ll be there so they can handle its absence.

As the PEP lists, there are genuine cases where it may be omitted. And the same applies for a variety of other libraries (ctypes is the obvious one that we have a policy for already, but similarly for socket and ssl, a decent number of OS-specific modules, and no doubt more that I can’t think of off the top of my head).

Saying “it’s required” basically just gives the stdlib free rein to assume that import zlib will never fail. I don’t think we should do that, but rather, we should clearly document (perhaps via a PEP) that zlib may be absent, and so we should always assume that import zlib (in the stdlib) may fail. If that leads to other functionality being absent, that’s fine - users will get a suitable error - but the alternative to users getting a suitable error is that we give them a documented “go away”, which is worse.
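
As a rough illustration of "assume `import zlib` may fail, and give users a suitable error", a module could guard the import along these lines. This is only a sketch: `compress_payload` and `payload_size` are hypothetical names, not existing stdlib functions.

```python
# Hypothetical module, written for illustration only.
try:
    import zlib
except ImportError:  # zlib was not compiled into this build
    zlib = None


def compress_payload(data: bytes) -> bytes:
    """Compress data, or fail with a clear message if zlib is missing."""
    if zlib is None:
        raise RuntimeError(
            "compress_payload() needs the zlib module, which this "
            "Python build does not provide"
        )
    return zlib.compress(data)


def payload_size(data: bytes) -> int:
    """Unrelated functionality that keeps working without zlib."""
    return len(data)
```

Only the compression path fails on a zlib-less build; importing the module and calling `payload_size` still work.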

I believe it gets given data that’s already been decompressed by browser/host libraries. So zlib is probably present, it’s just not been compiled in WASM and so isn’t directly accessible from the CPython port. There are probably other ways to get to it.

IMO, we should only claim to support something if we test it.

How do you test that? Should we add zlib-less buildbots for each platform? Who’d be motivated to run them?

To me, it makes sense to only test on WASI. And the easiest way for a stdlib module to pass the tests on WASI is exactly what you suggest. (In practice: keeping import zlib in functions that need it, rather than having it at the top of the file.)
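
A minimal sketch of that "import inside the function" pattern, using a made-up `read_blob` function rather than any real stdlib code:

```python
def read_blob(raw: bytes, compressed: bool = False) -> bytes:
    """Return the payload, decompressing only when asked to."""
    if not compressed:
        return raw  # this path never touches zlib
    # Deferred import: an ImportError can only surface on this path.
    import zlib
    return zlib.decompress(raw)
```

On a build without zlib, importing the containing module and calling `read_blob(data)` would still succeed; only `read_blob(data, compressed=True)` would raise `ImportError`.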

That’s fine, but the instructions to maintainers need to be clear. In large part to resolve disputes when someone who does care brings an issue (“yes, we will allow changes to enable non-zlib functionality when zlib is absent, rather than making unrelated functionality fail because we were too lazy to protect the import”) as well as allowing ourselves the freedom to not block a PR/release due to an import zlib.

We already do this for ctypes. And we don’t have buildbots for those configurations (our test suite wouldn’t run without ctypes either).

IMO, that goes for any unsupported configuration – zlib-less, ctypes-less, but also Solaris (which we can’t test on) or Emscripten (which is on the way to being supported).
Basically: we can collaborate on the CPython main branch, but testing, patching and backports are left to someone who cares.

I think that should go in a PEP 11 update, not a zlib PEP. Should I open a discussion for it?

That might be the case, but I can also go to JupyterLite and type import zlib in a WebAssembly-powered Python prompt running in the browser. So my question still stands: why do we care about WASI support if there are more full-featured options out there?

I think because WASI is minimal enough that it can be usefully embedded into a site, whereas Pyodide adds a significant download stage (was 100x at one point, probably not anymore) and also doesn’t embed as well (needs to be run more self-contained, like CPython, rather than directly integrated into the host app).

Though again, I could be off here - I’m no WASM expert, I’ve just listened to a few of them. But regardless, I’m very supportive of any effort to make CPython runnable with fewer dependencies (and without the related functionality), so if WASI is the only way we’re going to “approve” this, then I support it.

Perhaps, but let’s spell it out here, since we’re now looking at functionality that doesn’t have a clear alternative (ctypes code becomes native code for us, so it’s no burden to make it optional) and isn’t a complete show-stopper (e.g. a broken unsupported platform prevents use of CPython entirely, while absent zlib doesn’t have to).

This is far more like mobile app granular permissions (or WASI, or Windows, or macOS, etc.), and so we should explicitly say that we will try to allow having it turned off, even though we don’t actively test or gate our releases on configurations where it is turned off.

We can if we want to. All it should take is time, since zlib would need to be statically built into CPython when compiled for WASI. I have it on my todo list to get builds of everything but OpenSSL from python/cpython-source-deps (the source for packages that the CPython build process depends on) working with WASI via a wasi.py externals command, but I just haven’t gotten around to it (PEP 751, a file format to record Python dependencies for installation reproducibility, is a bit of a time sink for me).

No one said that WASI can’t deal with compressed data, just that we don’t currently bother building in zlib for CI and such as it requires it be statically compiled into CPython instead of just finding a .so file. Other people who build CPython for WASI do compile in zlib.

Definitely not; I’m not maintaining support just for fun. There are plenty of companies relying on CPython support. I also don’t think the SC would have let me make it a tier 2 platform if it was a toy.

I’m afraid you’re conflating WebAssembly in general with the web. Think of WebAssembly as a CPU ISA that’s abstract and easily verified to be safe; the “web” part of the name is historical baggage from its initial motivation (or think of it as being as secure as something in a web browser if you want a different reason for “web” in the name). What you’re getting from JupyterLite is Pyodide, which is CPython compiled using Emscripten for the web via WebAssembly (which is only a tier 3 platform).

WASI is for WebAssembly on servers, so a different use-case from Pyodide and the web. Various places use WASI for things from IoT for security and portability to serverless functions where it competes with containers due to its fast start time and security (e.g. Hyperlight; a disclaimer is I know some of the people involved and it is developed at MS, but it’s just an example I had on hand; Fastly uses it and has an explainer about WASI).

So, to summarize:

  1. WASI is a tier 2 platform and not a toy, so it isn’t going anywhere (if you disagree then that is a request to the SC)
  2. We could get zlib into WASI as part of our CI if we want to not grant it an exception for zlib
  3. The work required to add zlib to any build we do for WASI is to statically compile it in, probably via a new Tools/wasm/wasi.py externals command so it can eventually support more libraries like SQLite via the same mechanism
  4. I can do the work, I just need the time which I won’t have until PEP 751 is done (I’m hoping by April to submit the PEP)

Looking at the unsupported platform issues, yes there’s a bunch of “can’t build at all” ones, but there’s also a lot about specific functionality – curses, os.cpu_count, ctypes utilities, platform.libc_ver, strftime.
For all of those we can take patches, but don’t test them, support them, backport them or block releases on them. I don’t understand how zlib should be different.

I guess the practical instruction for CPython maintainers would be “Use @requires_zlib rather than single out WASI specifically”. Is that what you want in the PEP?
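
To make the idea concrete, here is a self-contained sketch of gating a test on zlib being present instead of on the platform. CPython's own test suite has a helper of this name in `test.support`; the `requires_zlib` defined below is a local stand-in built on `unittest.skipUnless`, so the example does not depend on the `test` package being installed, and `RoundTripTests` is a made-up test case.

```python
import unittest

try:
    import zlib
except ImportError:
    zlib = None

# Local stand-in for illustration; skips the test when zlib is unavailable.
requires_zlib = unittest.skipUnless(zlib, "requires zlib")


class RoundTripTests(unittest.TestCase):
    @requires_zlib
    def test_round_trip(self):
        # Runs wherever zlib exists, whatever the platform is.
        data = b"spam" * 100
        self.assertEqual(zlib.decompress(zlib.compress(data)), data)

    def test_plain_bytes(self):
        # Does not need zlib, so it is never skipped.
        self.assertEqual(len(b"spam"), 4)
```

Run with `python -m unittest`, the first test is reported as skipped on a zlib-less build and executed everywhere else, with no mention of WASI in the test itself.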

FWIW, there are alternatives – if you’re serious about compression, you could use zlib-ng. (Of course, you need zlib to install that using Python tools, but with non-Python install tools going mainstream that is less of an issue.)

I think we have to differentiate a bit more:

  1. One aspect is building CPython from source without zlib

  2. Another aspect is running CPython and the stdlib in particular with a working zlib module

  3. Yet another is whether or not to accept patches for zlib-less installations of CPython

The PEP currently conflates the first two, e.g. pip doesn’t work without the zlib module (that’s the second aspect). Building CPython does work without zlib, but some tests fail (that’s the first aspect).

Steve brought up the third aspect. The PEP hints in this direction as well (“Code to generate more “friendly” error messages, or to pre-check whether zlib is available, will be removed.”).

IMO, we should treat zlib just like we do all other similar external libs, e.g. ctypes: CPython should continue to build without them, some tests may fail without them and some stdlib modules will not work without them (and should issue an appropriate error message).

The few places where the stdlib includes error messages related to this are really no maintenance burden, so I don’t see a need for dropping them. The same goes for patches providing work-arounds or additional errors/warnings for the situation where no zlib module is available.

As for not actively fully supporting a CPython without working zlib module, I think that’s something we can document. The PEP could then be turned into a document describing which aspects of a CPython will fail without a working zlib, so that people are aware of the implications.

Making an exception for WASI doesn’t strike me as necessary. It’s not a very common platform to run CPython on at the moment (this may change, of course, going forward). With the above change in emphasis, it would also not be needed. We could simply make WASI an example of where we do have a CPython build without zlib.

All that said, I’d also like to bring up another point:

Why don’t we simply vendor in zlib and use it on platforms where configure does not find a system zlib? We already do this on Windows.

The zlib code base is a mere 5MB and the license is compatible with the PSF License.

With zlib being widely used in the Python ecosystem, this would certainly be a viable choice.

Along with “and continue to use @requires_zlib even when WASI gets support”.

@malemburg laid it out better than I would, but fundamentally, what I want is the PEP to be retitled “make zlib optional to build CPython”, because I don’t think it should be required. It’s stdlib functionality, not core functionality. And since this is essentially the status quo, I don’t see why there’s a PEP (I assume the PEP intends to change the status quo).

Further, the PEP doesn’t explain why we should do this - it merely says we probably can, most of the time, and therefore we should. I don’t doubt there are needs that can only be satisfied by making the zlib module required, but they need to be in the PEP in order to justify the negative impact of such a change.