Make pathlib extensible

:sparkles: May 2023 progress report :sparkles:

GH-100481 has landed! Path objects have a new with_segments() method that is called whenever a derived path is created, such as from path.parent or path.iterdir(). Thank you Alex Waygood and Éric for the reviews, and everyone who helped bikeshed the method.

What this means: from Python 3.12 you can subclass PurePath and Path, plus their Windows- and Posix-specific variants. You will not receive an AttributeError when you try to instantiate your subclass (issue); any custom initialiser you add will be called (issue, issue); and by overriding with_segments() you can pass information between path objects.
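To make the last point concrete, here is a minimal sketch of a subclass that carries extra state through derived paths. The class name SessionPath and the session_id attribute are made up for illustration; only with_segments() itself comes from the change described above. The usage lines are guarded because the keyword argument is rejected at construction time before 3.12.

```python
import sys
from pathlib import PurePosixPath


class SessionPath(PurePosixPath):
    """Hypothetical path type that remembers which session it belongs to."""

    def __init__(self, *pathsegments, session_id=None):
        super().__init__(*pathsegments)
        self.session_id = session_id

    def with_segments(self, *pathsegments):
        # From Python 3.12, pathlib calls this whenever it builds a derived
        # path, so the extra state is forwarded to the new object.
        return type(self)(*pathsegments, session_id=self.session_id)


if sys.version_info >= (3, 12):
    root = SessionPath("/srv/data/report.txt", session_id=42)
    derived = [root.parent, root / "extra", root.with_name("notes.txt")]
    # Every derived path should carry the same session_id as root.
    print([p.session_id for p in derived])
```

Without the with_segments() override, the derived paths would be created with session_id left at its default of None.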

I’ve begun work on tarfile.TarPath, and I’m hoping to have a PR up within a few weeks (lots of tests to write and get passing!). I’m confident I can get this in for Python 3.13. It will use a new pathlib._AbstractPath class under the hood.

However, in order to add a public AbstractPath class, I’m pretty sure we’ll need to move three methods from PurePath to Path, which effectively removes them from the AbstractPath interface. They are:

  • as_uri() – this returns a file:// URI, which is only applicable to local filesystem paths. It also uses os.fsencode() to encode the path, which can vary by system. This doesn’t make much sense for subclasses of AbstractPath in general, which may have a different URI representation or none at all. Library authors shouldn’t be expected to remember to delete or re-implement as_uri() when subclassing AbstractPath.
  • __bytes__() – as with the above, this uses os.fsencode() and is unlikely to be applicable to, say, a path stored in an ISO 9660 disc image with Joliet extensions, which uses UTF-16BE under the hood.
  • __fspath__() – because it would be catastrophically awful if open(TarPath('README.md', ...)) opened a local file in the current working directory called README.md.
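A small sketch of why the last two matter, using a stand-in class rather than the real TarPath (which doesn’t exist yet). ArchiveMemberPath is a hypothetical name; it only shows what any PurePath subclass inherits today.

```python
import os
from pathlib import PurePosixPath


class ArchiveMemberPath(PurePosixPath):
    """Hypothetical stand-in for a path that lives inside an archive."""


member = ArchiveMemberPath("README.md")

# Inherited from PurePath: the object claims to be a local filesystem path.
print(isinstance(member, os.PathLike))

# os.fspath() is what open() and friends use to resolve the argument, so
# open(member) would look for README.md in the current working directory.
print(os.fspath(member))

# __bytes__() goes through os.fsencode(), i.e. the local filesystem encoding.
print(bytes(member))
```

Nothing in the subclass opts in to this behaviour; it comes for free from PurePath, which is the problem for path types that aren’t backed by the local filesystem.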

These moves will require a deprecation period, and so I think we’re looking at Python 3.15 for the addition of pathlib.AbstractPath. There are ways to do it sooner (e.g. by having AbstractPath not subclass PurePath) but they have their own problems.

I suspect that I’ll be getting into the weeds of tarfile.TarPath development in future updates. Stay tuned, and thanks for following along!
