Making the wheel format more flexible (for better compression/speed)

Did ^ ever happen? We have the exact same use case as Steve and would use it immediately:

I think one of the conclusions of this thread is that we might be interested in adding zstd and/or brotli to the standard library.

I started a new thread about that in python-ideas.

You’re welcome to join.

I ran a small experiment on the WinPython distribution, trying to “replace”:

  • the pre-installed, compressed wheels
  • with a subset of the original wheels (with PEP 751 verifiable hashes).

To end up with the same final download size, the existing wheels would need to be about 22% smaller, which zstd achieves at compression level 12 (the last level that is not too slow).
This uses zstandard-0.23, which includes zstd-1.5.6.
I chose level 12 because zstd-1.5.6 compressed about twice as slowly when going from level 12 to level 13.
The recent zstd-1.5.7 claims to be 3x faster than before at high compression ratios (using multi-threading automatically), but if you count total CPU work, as the planet does, level 12 may remain preferable.
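
The level 12 versus level 13 trade-off can be checked on your own data with a small timing loop. This is a minimal sketch, not the measurement used above: `time_levels` and `zstd_compress` are hypothetical helper names, and the zstd wrapper assumes the third-party `zstandard` package is installed.

```python
import time
from typing import Callable, Iterable


def time_levels(
    data: bytes,
    levels: Iterable[int],
    compress: Callable[[bytes, int], bytes],
) -> list[tuple[int, int, float]]:
    """Return (level, compressed_size, seconds) for each compression level."""
    results = []
    for level in levels:
        start = time.perf_counter()
        size = len(compress(data, level))
        results.append((level, size, time.perf_counter() - start))
    return results


def zstd_compress(data: bytes, level: int) -> bytes:
    """One-shot zstd compression (assumes `pip install zstandard`)."""
    import zstandard as zstd  # imported lazily: third-party dependency

    return zstd.ZstdCompressor(level=level).compress(data)
```

Calling `time_levels(payload, range(10, 15), zstd_compress)` on a representative payload should show where the time per level starts to jump on a given machine.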

experiment:

  • download wheels (accounting for 40% of WinPython size)
    llvmlite-0.44.0-cp313-cp313-win_amd64.whl
    opencv_python-4.11.0.86-cp37-abi3-win_amd64.whl
    panel-1.6.2-py3-none-any.whl
    polars-1.27.1-cp39-abi3-win_amd64.whl
    pyarrow-19.0.1-cp313-cp313-win_amd64.whl
    PyQt5_Qt5-5.15.2-py3-none-win_amd64.whl
    PyQtWebEngine_Qt5-5.15.2-py3-none-win_amd64.whl
    scipy-1.15.2-cp313-cp313-win_amd64.whl
  • unzip them into a directory “un_zipped_wheels”
  • re-compress the whole directory with “python compress_folder_with_zstd.py ./un_zipped_wheels ./un_zipped_wheels12.tar.zst --level 12”
echo %date% %time% &python compress_folder_with_zstd.py ./un_zipped_wheels ./un_zipped_wheels12.tar.zst --level 12
23/04/2025 14:53:32,33
📦 Creating TAR archive: .\un_zipped_wheels12.tar.tar
🗜️ Compressing to Zstandard: .\un_zipped_wheels12.tar.zst
✅ Done: un_zipped_wheels12.tar.zst (245180 KB)
echo %date% %time%
23/04/2025 14:54:22,15
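
The unzip step from the list above can be sketched with the standard library, since a wheel is a regular ZIP archive. `unzip_wheels` is a hypothetical helper, and extracting each wheel into its own subfolder is an assumption; the experiment only says the wheels were unzipped into “un_zipped_wheels”.

```python
import zipfile
from pathlib import Path


def unzip_wheels(wheel_dir: str, target_dir: str) -> list[Path]:
    """Extract every .whl found in wheel_dir into a subfolder of target_dir."""
    target = Path(target_dir)
    extracted = []
    for wheel in sorted(Path(wheel_dir).glob("*.whl")):
        # Assumption: one subfolder per wheel, named after the wheel file.
        dest = target / wheel.stem
        dest.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(wheel) as zf:
            zf.extractall(dest)
        extracted.append(dest)
    return extracted
```
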
# compress_folder_with_zstd.py
import os
import tarfile
import zstandard as zstd
from pathlib import Path

def compress_folder_to_zst(source_dir: str, output_file: str, compression_level: int = 19):
    source_path = Path(source_dir).resolve()
    output_path = Path(output_file).resolve()

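    # with_suffix() replaces only the last suffix, so "x.tar.zst"
    # gives the temporary name "x.tar.tar" (as seen in the log above).
    temp_tar_path = output_path.with_suffix(".tar")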

    # Step 1: Create a .tar archive of the folder
    print(f"📦 Creating TAR archive: {temp_tar_path}")
    with tarfile.open(temp_tar_path, "w") as tar:
        tar.add(source_path, arcname=".")

    # Step 2: Compress the .tar using zstandard
    print(f"🗜️ Compressing to Zstandard: {output_path}")
    cctx = zstd.ZstdCompressor(level=compression_level)
    with open(temp_tar_path, "rb") as f_in, open(output_path, "wb") as f_out:
        cctx.copy_stream(f_in, f_out)

    # Step 3: Clean up the .tar file
    os.remove(temp_tar_path)
    print(f"✅ Done: {output_path.name} ({output_path.stat().st_size // 1024} KB)")

# Example usage
if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser(description="Compress a folder into .tar.zst using Zstandard.")
    parser.add_argument("source_dir", help="Path to the folder to compress")
    parser.add_argument("output_file", help="Path to output .tar.zst file")
    parser.add_argument("--level", type=int, default=19, help="Zstandard compression level (default: 19)")

    args = parser.parse_args()
    compress_folder_to_zst(args.source_dir, args.output_file, args.level)
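
A possible variant of the script above avoids the temporary .tar file by writing the tar stream straight into the compressor. This is a sketch under assumptions, not the script used for the timings: `stream_folder_as_tar` and `compress_folder_streaming` are hypothetical names, and the zstd part assumes the `zstandard` package and its `ZstdCompressor.stream_writer()`.

```python
import tarfile
from pathlib import Path
from typing import BinaryIO


def stream_folder_as_tar(source_dir: str, sink: BinaryIO) -> None:
    """Write source_dir as an uncompressed tar stream into any writable sink."""
    # Mode "w|" produces a pure stream: tarfile only calls sink.write().
    with tarfile.open(fileobj=sink, mode="w|") as tar:
        tar.add(Path(source_dir).resolve(), arcname=".")


def compress_folder_streaming(source_dir: str, output_file: str, level: int = 12) -> None:
    """Tar and zstd-compress in one pass (assumes `pip install zstandard`)."""
    import zstandard as zstd  # imported lazily: third-party dependency

    cctx = zstd.ZstdCompressor(level=level)
    with open(output_file, "wb") as f_out, cctx.stream_writer(f_out) as writer:
        stream_folder_as_tar(source_dir, writer)
```

Keeping `stream_folder_as_tar` independent of the compressor means the same function can feed any file-like writer, which also makes it easy to compare zstd against another codec on the same folder.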