Can we set the number of threads in tarfile.open(outputArchive, "w:zst")?

akira94 · December 27, 2025, 9:45am

I have a script that uses a new built-in package in Python 3.14:

from pathlib import Path
import tarfile

# List of NDJSON files you want to compress
fileToCompress = [r"E:\Personal Projects\tmp\tarFiles\chunk_0.ndjson",
                  r"E:\Personal Projects\tmp\tarFiles\chunk_0.ndjson",
                  r"E:\Personal Projects\tmp\tarFiles\chunk_0.ndjson"]

fileToCompress = [Path(p) for p in fileToCompress]

# Output archive name
outputArchive = r"E:\Personal Projects\tmp\tarFiles\dataset.tar.zst"
outputArchive = Path(outputArchive)

# Create a .tar.zst archive without including folder paths
with tarfile.open(outputArchive, "w:zst") as tar:
    for file in fileToCompress:
        # Use only the filename (no directories) inside the archive
        tar.add(file, arcname=file.name)

print(f"Created archive: {outputArchive}")

A big benefit of Zstandard is its built-in multithreaded compression. See here for how to set the number of threads when using with compression.zstd.open(zstdDir, 'wb') as g.

Can we also configure the number of threads when using with tarfile.open(outputArchive, "w:zst") as tar?

akira94 · January 12, 2026, 10:01pm

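The answer is yes. This is mentioned in the tarfile documentation. One way is to open the Zstandard stream yourself with zstd.open, passing nb_workers in options, and then write an uncompressed tar stream into it: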

from pathlib import Path
from compression import zstd
import tarfile

# List of NDJSON files you want to compress
fileToCompress = [r"E:\Personal Projects\tmp\chunk_0.ndjson",
                  r"E:\Personal Projects\tmp\chunk_0.ndjson",
                  r"E:\Personal Projects\tmp\chunk_0.ndjson"]
fileToCompress = [Path(p) for p in fileToCompress]

# Output archive name
outputArchive = r"E:\Personal Projects\tmp\dataset.tar.zst"
outputArchive = Path(outputArchive)

options = {
    zstd.CompressionParameter.nb_workers: 4,
}

# Create a .tar.zst archive without including folder paths.
# zstd.open does the compression, so tarfile writes a plain tar stream (mode "w:") into it.
with zstd.open(outputArchive, 'wb', options=options) as f, tarfile.open(fileobj=f, mode="w:") as g:
    for file in fileToCompress:
        # Use only the filename (no directories) inside the archive
        g.add(file, arcname=file.name)
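
To check that an archive written this way is an ordinary .tar.zst that tarfile can read back, here is a minimal round-trip sketch. The sample file, its contents, and the helper name roundtrip_member_names are made up for illustration, and the import guard is only there so the snippet does not crash on interpreters older than 3.14.

```python
import tarfile
import tempfile
from pathlib import Path

try:
    from compression import zstd  # standard library, Python 3.14+
except ImportError:
    zstd = None  # older interpreter: the round trip below is skipped


def roundtrip_member_names(workers=2):
    """Write a tiny .tar.zst via zstd.open + tarfile, then list its members.

    Returns None when compression.zstd is not available.
    """
    if zstd is None:
        return None

    options = {zstd.CompressionParameter.nb_workers: workers}

    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        sample = tmp / "chunk_0.ndjson"  # hypothetical stand-in for a real chunk
        sample.write_text('{"id": 1}\n', encoding="utf-8")
        archive = tmp / "dataset.tar.zst"

        # Same pattern as above: zstd compresses, tarfile writes plain tar.
        with zstd.open(archive, "wb", options=options) as f:
            with tarfile.open(fileobj=f, mode="w:") as tar:
                tar.add(sample, arcname=sample.name)

        # Read it back through tarfile's own Zstandard support.
        with tarfile.open(archive, mode="r:zst") as tar:
            return tar.getnames()


if __name__ == "__main__":
    print(roundtrip_member_names())
```

On Python 3.14 the returned list should contain only the file name, with no directory part, because arcname is set to sample.name.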

akira94 · January 16, 2026, 6:10pm

A cleaner solution is to pass options directly to tarfile.open, which accepts it for the "w:zst" mode:

from pathlib import Path
from compression import zstd
import tarfile

files_to_compress = [r"E:\Personal projects\tmp\chunk_0.ndjson",
                     r"E:\Personal projects\tmp\chunk_0.ndjson"]
files_to_compress = [Path(p) for p in files_to_compress]
output_archive = r"E:\Personal projects\tmp\test.tar.zst"
output_archive = Path(output_archive)

options = {
    zstd.CompressionParameter.nb_workers: 4,
}

with tarfile.open(output_archive, mode="w:zst", options=options) as tar:
    for file in files_to_compress:
        tar.add(file, arcname=file.name)
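
Instead of hard-coding 4, the worker count could be derived from the machine. The sketch below is only one possible approach: the helper names default_workers and clamp_workers are hypothetical, and leaving one core free is my own assumption, not a recommendation from the docs. It clamps the value to the range reported by CompressionParameter.nb_workers.bounds(), so a build of zstd without multithreading support would fall back to 0 (single-threaded).

```python
import os

try:
    from compression import zstd  # standard library, Python 3.14+
except ImportError:
    zstd = None  # older interpreter: only the pure helpers below are usable


def default_workers(reserve=1):
    """CPU count minus `reserve` cores, but never less than 1 (assumed policy)."""
    return max(1, (os.cpu_count() or 1) - reserve)


def clamp_workers(requested, lower, upper):
    """Force `requested` into the inclusive range [lower, upper]."""
    return max(lower, min(requested, upper))


if zstd is not None:
    # bounds() reports the (lower, upper) values this zstd build accepts.
    lower, upper = zstd.CompressionParameter.nb_workers.bounds()
    options = {
        zstd.CompressionParameter.nb_workers: clamp_workers(default_workers(), lower, upper),
    }
    print(options)
```

The resulting options dict can be passed to tarfile.open(output_archive, mode="w:zst", options=options) exactly as in the snippet above.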