Hi all,
I saw that Python 3.8 adds zipfile.Path, which provides a partial pathlib-compatible interface to zip files. This implementation doesn’t actually use any pathlib machinery; it just mimics the interface. Previously I’ve noted packages like artifactory that do subclass from pathlib classes, but the implementation is necessarily messy because pathlib.Path is not intended to be subclassed in user code.
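To make the "mimics its interface" point concrete, here is a small sketch (Python 3.8+) that builds a throwaway archive in memory and walks it with zipfile.Path. The file names and contents are made up for illustration.

```python
import io
import pathlib
import zipfile

# Build a small archive in memory so the example needs no files on disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as writer:
    writer.writestr("docs/readme.txt", "hello from inside a zip")
    writer.writestr("docs/notes.txt", "more text")

archive = zipfile.ZipFile(buffer)
root = zipfile.Path(archive)

# The familiar pathlib spellings work: "/" joining, iterdir(), read_text().
docs = root / "docs"
names = sorted(child.name for child in docs.iterdir())
text = (docs / "readme.txt").read_text()

# The object looks like a path, but it is not part of the pathlib hierarchy.
is_pathlib = isinstance(root, pathlib.PurePath)
```

Here `is_pathlib` comes out false: the class reimplements the parts of the interface it needs rather than inheriting them.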
But what if we changed that?
The pathlib source code already contains notions of flavours (implementing posix or windows path semantics) and accessors, but pathlib contains only a single accessor implementation, and in the last few years pathlib has accumulated code that bypasses the accessor.
I propose that we change instances where pathlib.Path directly calls methods like os.close() and os.getcwd() to use the accessor instead, drop the underscores from _Accessor and _NormalAccessor, and document the accessor interface.
I’ve played around with these ideas and published a package called pathlab that shows what might be possible. It turns out implementing the equivalent of os.stat() and os.listdir() in an accessor gets you pretty far!
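As a rough illustration of why those two calls go a long way, here is a toy sketch. It does not use pathlib or pathlab at all: DictAccessor, SketchPath and FakeStat are hypothetical names invented for this example, and the "filesystem" is just a nested dict. The point is only that exists(), is_dir(), is_file() and iterdir() can all be derived from stat() and listdir().

```python
import stat
from collections import namedtuple

# Hypothetical stand-in for os.stat_result; only the fields used here.
FakeStat = namedtuple("FakeStat", "st_mode st_size")


class DictAccessor:
    """Hypothetical accessor over a nested dict: dicts are directories, bytes are files."""

    def __init__(self, tree):
        self.tree = tree

    def _lookup(self, parts):
        node = self.tree
        for part in parts:
            if not isinstance(node, dict) or part not in node:
                raise FileNotFoundError("/" + "/".join(parts))
            node = node[part]
        return node

    def stat(self, parts):
        node = self._lookup(parts)
        if isinstance(node, dict):
            return FakeStat(stat.S_IFDIR | 0o755, 0)
        return FakeStat(stat.S_IFREG | 0o644, len(node))

    def listdir(self, parts):
        node = self._lookup(parts)
        if not isinstance(node, dict):
            raise NotADirectoryError("/" + "/".join(parts))
        return list(node)


class SketchPath:
    """Toy path type whose filesystem queries all go through an accessor."""

    def __init__(self, accessor, *parts):
        self.accessor = accessor
        self.parts = parts

    def __truediv__(self, name):
        return SketchPath(self.accessor, *self.parts, name)

    @property
    def name(self):
        return self.parts[-1] if self.parts else ""

    def stat(self):
        return self.accessor.stat(self.parts)

    def exists(self):
        try:
            self.stat()
        except FileNotFoundError:
            return False
        return True

    def is_dir(self):
        return self.exists() and stat.S_ISDIR(self.stat().st_mode)

    def is_file(self):
        return self.exists() and stat.S_ISREG(self.stat().st_mode)

    def iterdir(self):
        for name in self.accessor.listdir(self.parts):
            yield self / name


tree = {"docs": {"readme.txt": b"hello"}, "setup.py": b"print('hi')"}
root = SketchPath(DictAccessor(tree))
```

Swapping DictAccessor for something backed by a zip file or an FTP connection would leave SketchPath untouched, which is the appeal of the idea.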
Some questions arise:
- Is this something we want users to do? For me it seems obvious that a common interface to the local filesystem, zip and tar files, FTP, WebDAV etc would be desirable, but I’m keen to hear feedback!
- Should we provide a pathlib.StatResult class for use in users’ Accessor.stat() implementations? os.stat_result is a little tricky to instantiate as-is.
- What does this mean for os.DirEntry, if anything? Its interface is a subset of pathlib.Path, with the exception of the addition of a path attribute.
- How do we suggest users bind an accessor instance (which, unlike a local filesystem accessor, requires some context like a URL or an open file pointer) to a path type? In my implementation I’m doing some things in Accessor.__new__() to create a new Path subclass with the accessor instance bound in. But perhaps there’s a better way…
- In Accessors, do we want to stick to the os API as much as possible, or perform some sort of simplification (e.g. unlink() and rmdir(), or just delete()?)
- How do we support transfers between different kinds of filesystems? with path1.open('rb'), path2.open('wb') ... would work but would not preserve permissions/metadata/etc, so would there be scope to add a Path.transfer() method?
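On the StatResult question in the list above, this is roughly what "tricky to instantiate" looks like today. os.stat_result can be built directly, but only from a positional sequence of at least ten items, so the call site gives no hint as to which number is which. The make_stat() helper below is purely hypothetical, just one possible shape for a friendlier constructor; it is not an existing or proposed pathlib API.

```python
import os
import stat

# os.stat_result accepts a sequence whose first ten items are, in order:
# st_mode, st_ino, st_dev, st_nlink, st_uid, st_gid, st_size,
# st_atime, st_mtime, st_ctime. Nothing in the call names the fields.
raw = os.stat_result((stat.S_IFREG | 0o644, 0, 0, 1, 0, 0, 1024, 0, 0, 0))


def make_stat(mode, size=0, mtime=0, nlink=1, uid=0, gid=0):
    """Hypothetical helper: build an os.stat_result from keyword arguments.

    Inode and device are zeroed, and the single mtime value is reused for
    atime and ctime, which is a simplification made for this sketch.
    """
    return os.stat_result((mode, 0, 0, nlink, uid, gid, size, mtime, mtime, mtime))


friendly = make_stat(stat.S_IFDIR | 0o755, mtime=1_600_000_000)
```

An accessor for a remote or archive filesystem usually knows only a type, a size and maybe a modification time, so something keyword-based with sensible defaults seems like a better fit than the ten-item tuple.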
Penny for your thoughts! Hope this isn’t too crazy.
The changes needed to pathlib.Path can be seen here, albeit implemented in a subclass for now.
Barney