Here’s a way to replace the double-duty ReadablePath.open() method without resorting to dunder methods: add a stream argument to ReadablePath.read_bytes() and WritablePath.write_bytes(). When stream=True, these methods would return a readable or writable file object in binary mode. In the ABCs they’d be abstract. There’s analogous behaviour in libraries like requests, and it might appeal to users who feel daunted by the values accepted by open(mode=...).
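To make the stream idea concrete, here is a minimal sketch of the reading side. The class `MemoryReadable` is a hypothetical stand-in invented for illustration, not the real ReadablePath ABC, and the exact signature of `stream` is an assumption; the proposal above only says that `stream=True` returns a binary file object.

```python
import io


class MemoryReadable:
    """Hypothetical in-memory stand-in for a ReadablePath subclass."""

    def __init__(self, data: bytes) -> None:
        self._data = data

    def read_bytes(self, stream: bool = False):
        # Assumed behaviour: stream=True hands back a readable binary
        # file object; stream=False returns the whole payload as bytes.
        if stream:
            return io.BytesIO(self._data)
        return self._data


path = MemoryReadable(b"hello world")
whole = path.read_bytes()                  # all bytes at once
with path.read_bytes(stream=True) as f:    # incremental access
    first = f.read(5)
```

A write_bytes(stream=True) counterpart would presumably mirror this and return a writable binary file object.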
What about, instead of adding the openable protocol or the stream mode, providing a decorator in pathlib that takes a function of type Callable[[JoinablePath], IO[bytes]] and returns a valid ReadablePath.open(...) or WritablePath.open(...) method?
That would give users implementing new pathlib subclasses basically the same support, without the new __open_rb__ and __open_wb__ dunders.
I initially thought it would be convenient to implement this as a single decorator with an argument to choose between read and write, but sketching it out, I realised that the user-facing part should look more like the property builtin.
In the end it would just be your openable protocol squeezed into a different usage pattern for no reason other than avoiding new dunders, which is probably not a good reason anyway.
My suggestion would have looked like something along these lines:
# open_signature_helper would return a callable with signature
#   def __call__(self, mode="r", buffering=-1, encoding=None, errors=None, newline=None): ...
# and internally would call __open_rb__ or __open_wb__ according to mode,
# wrap IO[bytes] correctly and implement correct behavior for the other options

class MyPathRO(ReadablePath):
    ...

    @open_signature_helper
    def open(self):
        # == __open_rb__
        return ...  # type: IO[bytes] readable


class MyPathRW(WritablePath):
    ...

    @open_signature_helper
    def open(self):
        # == __open_rb__
        return ...  # type: IO[bytes] readable

    @open.writer
    def _(self):
        # == __open_wb__
        return ...  # type: IO[bytes] writable
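For anyone curious how such a helper might be built, here is a rough, runnable sketch. It is only one possible reading of the idea: `open_signature_helper` is implemented as a descriptor, `MemPath` is a made-up demo class that does not inherit from the real ABCs, only the "r" and "w" modes are handled, and the `buffering` argument is accepted but ignored.

```python
import io


class open_signature_helper:
    """Hypothetical descriptor turning binary openers into open(mode=...)."""

    def __init__(self, reader):
        self._reader = reader   # callable(path) -> readable IO[bytes]
        self._writer = None     # callable(path) -> writable IO[bytes]

    def writer(self, func):
        # Registers the binary writer in place, so the decorated
        # function can be given a throwaway name such as `_`.
        self._writer = func
        return self

    def __get__(self, instance, owner=None):
        if instance is None:
            return self

        def opener(mode="r", buffering=-1, encoding=None, errors=None, newline=None):
            text = "b" not in mode
            kind = mode.replace("b", "").replace("t", "")
            if kind == "r":
                raw = self._reader(instance)
            elif kind == "w" and self._writer is not None:
                raw = self._writer(instance)
            else:
                raise ValueError(f"unsupported mode: {mode!r}")
            if text:
                # Text mode: wrap the binary stream with a decoder/encoder.
                return io.TextIOWrapper(raw, encoding=encoding, errors=errors, newline=newline)
            return raw

        return opener


class MemPath:
    """Made-up demo class backed by an in-memory bytes payload."""

    def __init__(self, data=b""):
        self.data = data

    @open_signature_helper
    def open(self):
        return io.BytesIO(self.data)

    @open.writer
    def _(self):
        return io.BytesIO()


p = MemPath(b"hi\n")
with p.open("r", encoding="utf-8") as f:
    text = f.read()
with p.open("rb") as f:
    raw = f.read()
```

Because `writer` mutates the helper rather than returning a new one, this behaves more like functools.singledispatch registration than like property.setter, which is what lets the second function be called `_`.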
It’s good to have someone else think through the problem all the same, thanks!
Having played with some options, I’m going to use the __open_rb__() / __open_wb__() approach, at least for now. It can be done without impacting pathlib.Path (which is nice) and it could always be revised later if someone hates it. Patch here:
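As a rough illustration of why the dunder approach can leave pathlib.Path untouched, the dispatch can live in a free function rather than in a method. Everything below is a sketch under assumptions: `open_path` and `Blob` are hypothetical names, and the real `__open_rb__()` / `__open_wb__()` hooks may take extra arguments not shown here.

```python
import io


def open_path(path, mode="r", encoding=None, errors=None, newline=None):
    """Hypothetical helper: pick __open_rb__ or __open_wb__ from the mode."""
    text = "b" not in mode
    kind = mode.replace("b", "").replace("t", "")
    hook_name = {"r": "__open_rb__", "w": "__open_wb__"}.get(kind)
    # Look the hook up on the type, as is usual for special methods.
    hook = getattr(type(path), hook_name, None) if hook_name else None
    if hook is None:
        raise TypeError(f"{type(path).__name__} can't be opened with mode {mode!r}")
    raw = hook(path)
    if text:
        return io.TextIOWrapper(raw, encoding=encoding, errors=errors, newline=newline)
    return raw


class Blob:
    """Made-up read-only object that only implements __open_rb__."""

    def __init__(self, data: bytes) -> None:
        self.data = data

    def __open_rb__(self):
        return io.BytesIO(self.data)


blob = Blob(b"abc")
with open_path(blob, "rb") as f:
    payload = f.read()
```

An object without `__open_wb__`, like `Blob` here, simply fails with a TypeError when asked for a write mode.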
The filled and dashed arrows currently represent the same thing: a standard super/subclass relationship. For performance reasons I’m hoping to ABCMeta.register() the pathlib classes as “virtual” subclasses of the pathlib._abc classes, and so the dashed lines represent these planned registrations. I expect it will be a few weeks until I can make that change.
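For readers unfamiliar with virtual subclasses, this is roughly what ABCMeta.register() does. The class names `ReadableLike` and `LocalFile` are invented for the example and are not the actual pathlib or pathlib._abc classes.

```python
from abc import ABC, abstractmethod


class ReadableLike(ABC):
    """Invented ABC standing in for one of the pathlib._abc classes."""

    @abstractmethod
    def read_bytes(self) -> bytes: ...


class LocalFile:
    """Invented concrete class that does NOT inherit from the ABC."""

    def __init__(self, data: bytes) -> None:
        self._data = data

    def read_bytes(self) -> bytes:
        return self._data


# Register LocalFile as a "virtual" subclass: isinstance/issubclass
# checks pass, but the ABC never appears in the MRO, so attribute
# lookup on LocalFile pays no cost for the extra base class.
ReadableLike.register(LocalFile)

is_sub = issubclass(LocalFile, ReadableLike)
is_inst = isinstance(LocalFile(b""), ReadableLike)
in_mro = ReadableLike in LocalFile.__mro__
```

One trade-off: a registered class inherits nothing from the ABC, so any default method implementations have to be provided separately.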
I wasn’t sure whether a GitHub issue or this thread would be the better place to ask. Please let me know if I should move the contents of that GitHub issue to this thread.
That’s the same distinction as between the existing PurePath and Path, see the docs.
If a function takes a JoinablePath, I know that the file named by that path doesn’t need to actually exist.
The Readable/Writable split goes further: if something takes a ReadablePath, I’ll expect it to work with an immutable backup directory or a shared container layer.
JoinablePath is an ABC that you can’t instantiate directly. To use it, you need to subclass it and provide at least a parser attribute and the with_segments() and __str__() methods (at time of writing).
JoinablePath provides only part of the PurePath API (e.g. it doesn’t include __fspath__(), __eq__() or as_posix()).
I figured it might be interesting to revisit the questions in my original post from ~5 years ago and show how we answered them:
In Python 3.11 we removed the “accessor” classes altogether, as they were a vestige of early pathlib development that had no present purpose. Instead we’ve made various methods of _ReadablePath and _WritablePath abstract, like iterdir() and mkdir().
In Python 3.12 we added pathlib.PurePath.with_segments(), which is called whenever a new path object is created from an existing one (e.g. path.parent, path.iterdir()). User subclasses of _JoinablePath should implement this abstract method, which allows them to pass instance data to the new path’s initialiser.
In Python 3.14 we added pathlib.types.PathInfo as a high-level protocol for path metadata. User subclasses of _ReadablePath should expose an info attribute that implements the protocol.
In paths returned from pathlib.Path.iterdir(), the info attribute wraps an os.DirEntry object that’s initialised with information about the path gleaned from scanning its parent. But we haven’t changed os.DirEntry at all, nor do we directly expose os.DirEntry objects in pathlib.
In Python 3.14 we added pathlib.Path.copy() to support copying between paths.
In the pathlib ABCs, this method looks like _ReadablePath.copy(self, target: _WritablePath). It supports copying between arbitrary readable and writable path objects, including preserving entire directory structures. Furthermore, each target path is given an opportunity to copy metadata from its source (specifically its info object). Therefore the copy() method can be used to upload and download, to archive and extract, etc., depending on its operands.
Still not conclusively answered! A PEP will sort this out.
I am currently implementing a library following the interface of pathlib. I noticed that cwd() is a classmethod. This works for local file system paths, where this information is a process global, but is not great for other path implementations which need access to instance information like a connection object.
If anyone is looking to contribute to the pathlib ABCs, there’s an open ticket here about adding complete type annotations in pathlib.types:
I’m no typing expert so I’d appreciate any help. It might also be a good opportunity to review the API and highlight rough edges - very happy to hear feedback! Thanks.
In November’s update I wrote about my plan to split up PathBase and prune its interface. That work is now pretty much complete! I’ve published pathlib-abc 0.4 with the revised interface. Docs here:
I’m now working on a PR for zipp that adds a dependency on pathlib-abc:
This would unify the globbing implementations in pathlib and zipp, solve all weirdness around trailing slash requirements, and open the door to enabling methods like zipp.Path.walk().
If I can convince @jaraco to merge the PR and backport to CPython, then the next task is to add public support for copying between pathlib.Path and zipfile.Path (both directions). I think we’d make zipfile.Path subclass WritablePath and enable its copy() and _copy_from() methods. We’d probably formalize the private PathInfo metadata methods so we can preserve metadata when copying, e.g. POSIX permissions, modification time.
Big thanks to everyone who has helped over the last few months, including Paul Moore, Petr Viktorin, Steve Dower, Alyssa Coghlan, Bénédikt Tran and Andreas Poehlmann.
If anyone has any feedback on the plan, or on the ABCs themselves, do feel free to share. Cheers!