What's preventing zstandard compression of wheels?

Sorry if this is a stupid question, but I was looking through various standards, and I don’t actually see anything that requires a new wheel version for zstandard compression use.

The specification for binary distribution formats says nothing about compression, only that a wheel file is a zip file (and it does not specify the zip file format version either).

The latest specification for the zip file format (6.3.10 currently, though it has done so since 6.3.7) includes zstandard as a supported compression option.

Is this a documentation issue that the current specification doesn’t list a zip file format specification version when it should, a specification issue where only certain compression methods were intended but this isn’t in the specification, an ecosystem issue that various tools can’t handle what’s specified, or something else?

It’s a practicality issue. A wheel that can’t be unpacked by the stdlib zipfile module is basically useless, as pip won’t be able to consume it (and nor will any other tool that’s written in Python in the obvious way).
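To make the practical constraint concrete, here is a minimal sketch of how a tool could list the archive members whose compression method the running interpreter's `zipfile` is not expected to read. The helper name `unsupported_members` and the in-memory sample archive are illustrative assumptions, not part of any spec or of pip. The check is also approximate: a method constant can exist while its backing module (for example `zlib`) is missing from a particular build.

```python
import io
import zipfile

# Methods this interpreter's zipfile knows about. ZIP_ZSTANDARD only exists
# on Python 3.14+, so it is looked up defensively.
KNOWN_METHODS = {
    zipfile.ZIP_STORED,
    zipfile.ZIP_DEFLATED,
    zipfile.ZIP_BZIP2,
    zipfile.ZIP_LZMA,
}
_zstd = getattr(zipfile, "ZIP_ZSTANDARD", None)
if _zstd is not None:
    KNOWN_METHODS.add(_zstd)


def unsupported_members(archive):
    """Return (name, method id) for members using a method outside KNOWN_METHODS."""
    with zipfile.ZipFile(archive) as zf:
        return [
            (info.filename, info.compress_type)
            for info in zf.infolist()
            if info.compress_type not in KNOWN_METHODS
        ]


# Hypothetical wheel-like archive built in memory for the example.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("demo/__init__.py", "print('hello')\n")
sample.seek(0)
```

Calling `unsupported_members(sample)` on this Deflate archive should report nothing; a zstd-compressed archive inspected on an older Python would list every member.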

The wheel spec was written a long time ago, and isn’t as precise as we’d make it these days, but it’s not really worth anyone’s time to tighten it up, to be honest.

This thread is relevant: Making the wheel format more flexible (for better compression/speed)

Would it be worth at least tying the acceptable zipfile format features explicitly to “features supported by the standard library zipfile module” as specification language in the short term, since that’s the practical reason? I know that’s technically changing the specification, but in this case, I think that’s only changing it to match the current reality, right?

I’m curious how common it is for “zipfile-supporting thing” to actually support zstandard out of the box. It might be in recent specs, but that doesn’t mean it’s widespread.

What is the reason that you feel it’s important to put it in writing in the spec?

From a process point of view, anything that “changes interoperability” needs a new PEP. We can (with community consensus) waive the need for a PEP, but we’ve had a few cases where we’ve done that and had problems as a result, so it’s not something we’d do lightly.

Nothing that applies to me directly in this specific instance. I already knew the community assumed (whether or not the specification language supports it) that zstandard wheels need a new wheel version, though I didn’t know why until now, so thanks for the context. That said, the banners on various PEPs that redirect to “living specifications” in both typing and packaging have lately been getting more aggressive[1], while in both cases the specifications still don’t contain everything people need to know to implement something from scratch.


  1. The dismissal doesn’t persist, and the banners take up significant space, so they have to be dismissed on every page load. That has been annoying when cross-referencing PEPs with the living specifications, because of gaps in what was actually transferred over.

libarchive (libarchive/libarchive on GitHub, a multi-format archive and compression library) has supported it for a while, which means there’s out-of-the-box support for it on Windows, FreeBSD and Debian derivatives.

Other than Python[1], I can’t tell you what doesn’t support zstandard off the top of my head.


  1. Though that’s being worked on, IIRC.

PEP 784, which is now accepted and merged into 3.14b1, should go a long way in helping make Zstandard available to wheels. It includes integration for Zstandard compression in zipfile (and tarfile).
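As a rough sketch of what that integration allows, the snippet below feature-detects `zipfile.ZIP_ZSTANDARD` (present from Python 3.14) and falls back to Deflate elsewhere. The helper names `pick_compression` and `roundtrip` are made up for this example, and it assumes `zlib` is available for the fallback.

```python
import io
import zipfile


def pick_compression():
    """Prefer Zstandard when this interpreter supports it, else fall back to Deflate."""
    method = getattr(zipfile, "ZIP_ZSTANDARD", None)  # absent before Python 3.14
    if method is None:
        return zipfile.ZIP_DEFLATED
    try:
        # The backing module can be missing if Python was built without libzstd.
        import compression.zstd  # noqa: F401
    except ImportError:
        return zipfile.ZIP_DEFLATED
    return method


def roundtrip(payload):
    """Write payload into an in-memory archive and read it back."""
    method = pick_compression()
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=method) as zf:
        zf.writestr("data.txt", payload)
    with zipfile.ZipFile(buf) as zf:
        info = zf.getinfo("data.txt")
        return info.compress_type, zf.read("data.txt")
```

The archive written on 3.14 with Zstandard is exactly the kind of file an older `zipfile` cannot unpack, which is the adoption problem discussed in this thread.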

That being said, adopting Zstandard today is not feasible for the reasons Paul mentioned:

I think this is a good motivator for making wheel compatibility visible to the resolver. The status quo is that new compression cannot be adopted in an opt-in manner because any change to a wheel interacts with every installer that exists today. If we don’t want to have to wait for 3.14 to be the oldest supported Python in 5 years, then we need some way of saying “I know how to deal with feature X, so I will include wheels with feature X in what I look at”, and design the system such that not every resolver needs to support feature X, or even potentially features at all.
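A toy illustration of that opt-in idea, purely hypothetical: no such metadata exists today, and how features would be advertised to resolvers is exactly the open design question. The `Candidate` class, the `"zstd"` feature name and the `installable` helper are all invented for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical index entry: a label plus the features needed to install it."""
    label: str
    required_features: frozenset = frozenset()


def installable(candidates, supported_features):
    """Keep only candidates whose required features are all supported."""
    supported = frozenset(supported_features)
    return [c for c in candidates if c.required_features <= supported]


candidates = [
    Candidate("demo 1.0 (deflate)"),
    Candidate("demo 1.0 (zstd)", frozenset({"zstd"})),
]

# An installer that knows nothing about features only ever sees the first entry;
# one that declares "zstd" support sees both and can pick whichever it prefers.
old_installer_view = installable(candidates, set())
new_installer_view = installable(candidates, {"zstd"})
```

The point of the design is that an installer with no feature support keeps working, because feature-tagged files are simply filtered out of its view.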

We have that in terms of the wheel version. Which is what @mikeshardmind originally asked - why does new compression need a new wheel version?

There’s a separate problem because resolvers don’t consider wheel version, meaning that projects can’t usably publish two different wheel versions for the same project version, but that’s not the question here.

True for pure Python projects and abi3 wheels, but couldn’t build backends start producing cp314 wheels as soon as installers can handle them? I’m not sure why that would require feature checks or spec changes.

Wheel versions - as in, the current version 1 wheels, and a hypothetical new version 2, both with the same set of tags (i.e., the same filename…)

Would a new package manager, separate from core Python, allow adoption to be accelerated by vendoring zstd for older versions?

The limitation specific to pip is that because pip is the tool that core Python bundles to allow users who have just downloaded Python to get access to 3rd party packages on PyPI, we have to bundle all our dependencies into pip. And because of problems with the logistics of having platform-specific pip wheels for all possible platforms, pip cannot bundle libraries that use native code.

The net result is that for pip, all code needed must be either written in pure Python, or be available from the stdlib.

Other installers, such as uv, don’t have this constraint. Arguably, if someone wanted to take the position that pip compatibility was no longer important, we don’t have to adhere to the constraints pip imposes. I’m not going to try to make that argument, though (and that’s not just because I’m a pip maintainer :slightly_smiling_face:).

But wheel version is not opt-in. As you say, right now resolvers don’t see wheel version, so a 2.0 wheel interacts with every installer.

I think it’s definitely related if we’re asking “What’s preventing zstandard compression of wheels?” as the title states. And certainly as a solution for the issue brought up preventing adoption. But I don’t mean to derail the current conversation.


I guess theoretically this could work but I think you can use pip running on an older/different version of Python to act as the installer to a prefix, then run with a newer Python. I don’t really know why one would do that but I wouldn’t be surprised if that was done today. But perhaps that is niche enough that it is acceptable to break? Also these wheels would fail if installed by uv currently.

Technically, resolvers ignoring the wheel version is ignoring a critical part of why there’s a wheel version in the first place, but we’ve already established this to be a practicality problem. :neutral_face:

Agreed, but according to the spec, installers are required to respect the wheel version, not resolvers. The wheel version is designed so that installers don’t try to interpret something they aren’t prepared for, not so that multiple versions can be considered at once. That may have been a mistake in the spec (the points @emmatyping made are important here), but it’s how things are.
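For reference, the installer-side check amounts to something like the sketch below: read `Wheel-Version` from the `WHEEL` metadata file, warn if the minor version is newer than the installer understands, and abort if the major version is newer. The function name, the return values and the `SUPPORTED` constant are assumptions for illustration, not pip's actual code.

```python
from email.parser import Parser

SUPPORTED = (1, 0)  # highest Wheel-Version this hypothetical installer understands


def check_wheel_version(wheel_metadata_text, supported=SUPPORTED):
    """Return 'ok' or 'warn'; raise if the major version is newer than supported."""
    # The WHEEL file uses RFC 822 style "Key: value" headers.
    headers = Parser().parsestr(wheel_metadata_text)
    major, minor = (int(part) for part in headers["Wheel-Version"].split("."))
    if major > supported[0]:
        raise RuntimeError(
            f"Wheel-Version {major}.{minor} is not supported; upgrade the installer"
        )
    if (major, minor) > supported:
        return "warn"  # same major, newer minor: proceed but tell the user
    return "ok"
```

Note that this check only runs once a wheel has already been selected and downloaded, which is why it cannot help a resolver choose between two wheel versions.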

I’m still trying to focus on the original question “why does zstandard compression require a new wheel version”, rather than have this open up into the wider question of “how do we evolve the wheel format”. There’s been a huge amount of discussion about the latter, and right now there’s a group of people working on that topic elsewhere (presumably trying to get to the point of having something to bring to Discourse) so I don’t want to replicate those discussions yet again. But if that’s what people prefer to discuss, I’m fine with things going in that direction - I just don’t personally have the energy to get involved with that right now.

It’s a good question, and I think it’s adequately answered by Paul’s very first response - a wheel that can’t be unpacked by the stdlib zipfile module is basically useless.

In terms of how we move forward, the easy (but slow) answer to that is to wait until all supported Python versions (including non-CPython runtimes) can unpack zstd-packed files with zipfile. Once that’s the case, publishers can be confident that their users will be able to install their package.

The idea seems to be that using the wheel version could accelerate this timeline, but that doesn’t seem like it’s actually going to be able to help. At best, it changes the explanation when an install fails (from “your runtime doesn’t support the ZIP format” to “your installer doesn’t support the wheel format”), but it’s not really clear that would be an improvement either.

It really looks like this is just a case where patience has to be used. For controlled environments, you could use zstd today, but for uncontrolled environments you’ll just have to wait for the prevalence of support, not merely the existence of support.

I think it provides an improvement because the user could fix the problem without changing their environment: they need to update their installer, rather than update Python.

There’s nothing stopping an installer from catching the error and suggesting an installer update.
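A minimal sketch of that, assuming the installer unpacks with the stdlib: `zipfile` raises `NotImplementedError` when asked to read a member whose compression method it does not support, so the installer can catch that and re-raise with a hint. `UnsupportedWheelError`, `read_member` and the byte-patching fixture are invented for this example; the fixture fakes an unsupported method (98, PPMd) so the behaviour is the same on every Python version.

```python
import io
import struct
import zipfile


class UnsupportedWheelError(Exception):
    """Hypothetical installer error carrying an actionable hint."""


def read_member(archive, name):
    """Read one member, turning zipfile's low-level error into a friendlier one."""
    with zipfile.ZipFile(archive) as zf:
        try:
            return zf.read(name)
        except NotImplementedError as exc:
            raise UnsupportedWheelError(
                f"cannot unpack {name!r} ({exc}); "
                "try upgrading your installer or Python"
            ) from exc


def make_archive_with_method(method_id):
    """Build a one-file stored archive, then overwrite its method field (fixture only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr("pkg/__init__.py", "x = 1\n")
    raw = bytearray(buf.getvalue())
    # Compression method sits at offset 8 of the local file header...
    struct.pack_into("<H", raw, 8, method_id)
    # ...and at offset 10 of the central directory header.
    central = raw.find(b"PK\x01\x02")
    struct.pack_into("<H", raw, central + 10, method_id)
    return io.BytesIO(bytes(raw))
```

Whether the hint should say "upgrade the installer" or "upgrade Python" depends on where the decompressor comes from, which is the distinction being debated above.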
