Every algorithm proposed for inclusion in the compression namespace handles files, except zlib. If we exclude gzip (which does handle files), what’s the actual rule? Or is this just inconsistent? Should the authors have used the “zlib” name for gzip too?
This argument applies equally well to every module under discussion.
The problem here is that gzip should’ve been called gzfile to match tarfile and zipfile. But that’s historical now. (Personally I’d put them all under compression anyway, without removing any existing modules from the top level.)
I think it would be useful to look at other languages and how they refer to GZip. In the table below, links that put GZip under a zlib package are counted as “Compression”, since zlib is a compression library.
Language | Compression or Archive | Link |
---|---|---|
Golang | Compression | https://pkg.go.dev/compress/gzip |
Ruby | Compression | class Zlib::GzipFile - Documentation for Ruby 3.5 |
Rust | Compression | GitHub - rust-lang/flate2-rs: DEFLATE, gzip, and zlib bindings for Rust |
Haskell | Compression | zlib: Compression and decompression in the gzip and zlib formats |
C# | Compression | GZipStream Class (System.IO.Compression) \| Microsoft Learn |
Java | Archive | java.util.zip (Java Platform SE 8) |
NodeJS | Compression? | Zlib \| Node.js v23.11.0 Documentation |
Web APIs | Compression | Compression Streams API - Web APIs \| MDN |
PHP | Compression | PHP: gzcompress - Manual |
Perl | Compression | IO::Compress::Gzip - Write RFC 1952 files/buffers - Perldoc Browser |
Based on the above, I think it is very common to refer to GZip as a compression format, so I think it should go under compression.
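As a small illustration of why gzip is commonly grouped with compression formats, here is a minimal sketch using only the existing top-level gzip and zlib modules. It assumes nothing beyond the standard library; the payload is made up for the example. It shows that the bytes produced by gzip are a DEFLATE stream with a gzip wrapper, which zlib can decode directly when told to expect that wrapper.

```python
import gzip
import zlib

# Arbitrary sample data, chosen only for illustration.
payload = b"PEP 784 " * 64

# gzip.compress produces a gzip member: header, DEFLATE stream, trailer.
gz_bytes = gzip.compress(payload)

# Every gzip member starts with the two magic bytes 0x1f 0x8b.
assert gz_bytes[:2] == b"\x1f\x8b"

# zlib decodes the same bytes when wbits is 16 + MAX_WBITS,
# which tells it to expect a gzip header and trailer.
restored = zlib.decompress(gz_bytes, wbits=16 + zlib.MAX_WBITS)
assert restored == payload
```

The file-handling part of gzip (gzip.open, GzipFile) sits on top of this; the format itself is a compression format.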
Of the above, Perl and C# refer to ZIP as part of the compress namespace, but the rest do not include it. So I think I will maintain that zipfile and tarfile probably shouldn’t go under compression.
Agreed. FWIW, the current Wikipedia entry says:
gzip is a file format and a software application used for file compression and decompression.
Based on the above table, I made a PR to update the PEP to include gzip under compression and expand the rejected ideas section.
I hope to submit the PEP to the Steering Council tomorrow, barring any further concerns. This would allow a few weeks before the beta cutoff for 3.14 to merge the implementation. In the meantime, I will be polishing my branch so that I can break it up into mergeable MRs.
I would like to take this opportunity to mention that I have a few open issues on my CPython fork discussing API design choices for the new compression.zstd module. They mostly weigh keeping the module the same as the existing pyzstd API against making it match the existing compression modules in the standard library more closely.
Thank you again to everyone for the feedback and discussions!
Agreed, and perhaps we should have done that in Python 3.0?
I would hope that the nesting be kept to two levels, i.e. std.zstd, std.zlib, std.futures, etc. Namespaces should only be created to avoid conflicts, not for taxonomic purity.
Well, are we concretely planning to move zlib and lzma there? Otherwise, a partial taxonomy only makes things more confusing and awkward for the user.
Yep! The PEP enumerates the modules going under compression: PEP 784 – Adding Zstandard to the standard library | peps.python.org
New import names compression.lzma, compression.bz2, compression.gzip and compression.zlib will be introduced in Python 3.14 re-exporting the contents of the existing lzma, bz2, gzip and zlib modules respectively.
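To make the quoted passage concrete, here is a minimal sketch of how code might prefer the new import names while still running on interpreters that lack them. The helper name load_compression_module is hypothetical and not part of the PEP; the sketch only assumes that compression.<name> re-exports the same contents as the top-level module, as the quote states.

```python
import importlib


def load_compression_module(name):
    """Return compression.<name> if available, else the top-level module.

    Hypothetical helper for illustration only; it is not defined by PEP 784.
    """
    try:
        # New import name described in the PEP (Python 3.14 and later).
        return importlib.import_module(f"compression.{name}")
    except ImportError:
        # Older interpreters: fall back to the existing top-level module.
        return importlib.import_module(name)


# The four modules the PEP lists as being re-exported.
sample = b"round trip " * 32
for module_name in ("lzma", "bz2", "gzip", "zlib"):
    mod = load_compression_module(module_name)
    # All four expose one-shot compress() and decompress() functions.
    assert mod.decompress(mod.compress(sample)) == sample
```

Because the new names only re-export existing contents, the same round trip should behave identically whichever import path is taken.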
I think the question was whether they’d be removed from the top level. But this is also in the PEP, in the next section, with an overall process that takes ten years.
I believe this PEP should also address the potential future namespacing of the entire standard library, perhaps under std. In such a scenario, would modules like compression.zstd reside at std.compression.zstd or be moved to std.zstd?
Why should this PEP do that? There’s no point in putting nonbinding future plans in here, is there?
Exactly. We really have no idea what a stdlib reorg would look like.
I agree with Barry and James on this. Since we don’t know what a std namespace might look like, anything said in this PEP wouldn’t be much more than an educated guess based on a design sketch. I don’t think that’s a good basis to design a specification.
Timing-wise, I don’t think there’s a need to specify it anyway. A future PEP introducing std can always make whatever proposal it wishes. That seems like a much better time to look at the location of these modules, as presumably there will be a fully fleshed out design in hand.
The rejected ideas section says:
a future PEP introducing a std namespace could always define that the compression sub-modules be flattened into the std namespace.
I think this allows for a future PEP to argue for flattening the namespace into e.g. std.zstd while not making that a requirement if we end up liking the compression package.
I missed this, but I’m happy with this. Thank you.
I may as well add my thoughts on the color of this bikeshed:
- Python should immediately reserve std, if it hasn’t already.
- This PEP, which is titled “Adding Zstandard to the standard library”, should do what it says on the tin. It should not be concerned with introducing a compression module or making changes to any existing compression module that is unrelated to Zstandard. The module that implements Zstandard should be named std.zstd (or std.zst if people prefer that).
- Moving forward, all new Python stdlib modules should be named std.$whatever. Python should adopt a policy to never introduce any more stdlib modules outside of std.
  - I agree with the view from @pitrou that “the nesting be kept to two levels, i.e. std.zstd, std.zlib, std.futures, etc. Namespaces should only be created to avoid conflicts, not for taxonomic purity”.
- There should not be any rush to migrate existing stdlib modules into std. That should be thought about separately and proposed as a future PEP.
- There should certainly not be a rushed decision to migrate existing stdlib modules into compression. If this happened, and then later it was decided to migrate the stdlib to std.*, then Python will have had two large-scale name migrations in a short space of time, which isn’t useful for anyone.
There isn’t. This is part of a 10-year plan. If it doesn’t turn out to be successful, we can always take a different path.
It’s similar to my initial concern, but honestly, it’s not something we need to worry about right now—maybe in 10 years… or maybe not.