It is very nice that Python 3.14 added Zstandard to the standard library (see the compression.zstd package documentation). I can use it with this rudimentary syntax:
from compression import zstd
from pathlib import Path
import shutil
outDir = r"E:\Personal Projects\tmp"
outDir = Path(outDir)
inTar = outDir / "chunk_0.tar"
zstdDir = outDir / "chunk_0.tar.zst"
with open(inTar, 'rb') as f:
    with zstd.open(zstdDir, 'wb') as g:
        shutil.copyfileobj(f, g)
Could you explain how to use it to compress a file in streaming and multi-threaded mode?
That way, we can utilize modern multi-core hardware to compress a file that does not fit in memory. Unfortunately, the documentation does not contain any examples of my use case.
from compression import zstd
from pathlib import Path
from multiprocessing import cpu_count
import shutil
outDir = r"E:\Personal Projects\tmp"
outDir = Path(outDir)
inTar = outDir / "chunk_0.tar"
zstdDir = outDir / "chunk_0.tar.zst"
# here I am using the number of processors on your device, but you may
# want to limit this to 4 or 8 workers depending on your use case.
compressionOptions = {
    zstd.CompressionParameter.nb_workers: cpu_count()
}
with open(inTar, 'rb') as f:
    with zstd.open(zstdDir, 'wb', options=compressionOptions) as g:
        shutil.copyfileobj(f, g)
Note that Zstandard sometimes does not support multi-threaded compression, depending on how the library was compiled. If you expect this code to run in multiple environments, you may wish to add error handling for the resulting ZstdError.
I expect people will want multi-threaded compression frequently enough that we should add an example to the docs. Maybe we can just add the nb_workers flag to the last example.
Thank you very much for your answer, which works very well. Does zstd.open(zstdDir, 'wb', options=compressionOptions) automatically handle bigger-than-memory compression?
shutil.copyfileobj takes a length parameter, which you can use to set the size of the buffer used to copy data. The documentation also notes that “by default the data is read in chunks to avoid uncontrolled memory consumption”.
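To make the chunking concrete, here is a minimal sketch using in-memory file objects and an explicit buffer size (the 1 MiB chunk size is only an illustration):

```python
import shutil
from io import BytesIO

CHUNK = 1024 * 1024  # copy 1 MiB at a time

# Stand-in for a large input file; with real files opened in binary
# mode, only one CHUNK-sized buffer is held in memory at a time.
src = BytesIO(b"x" * (3 * CHUNK + 123))
dst = BytesIO()
shutil.copyfileobj(src, dst, length=CHUNK)
```

So the with ... copyfileobj pattern above already streams the data: neither the whole .tar nor the compressed output needs to fit in memory at once.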