Make pathlib extensible

Support for subclassing pathlib.Path was added in 3.12, so this should just work:

class MyPath(pathlib.Path):
    pass

The cost of the check in Path.__new__() is a shame, but unavoidable without breaking APIs as far as I can tell.

1 Like

On POSIX compliant systems, POSIX is the name of the API that’s mostly used to implement the os module, and os.name is “posix”. On Windows systems, Win32 is the name of the API that’s mostly used to implement the os module, but os.name is the name of the base system/kernel, “nt”. On POSIX systems, sys.platform is the name of the base system/kernel, such as “linux” and “darwin”, while on Windows it’s the name of the API, “win32”.

1 Like

Ah I totally missed that Python 3.12 already made pathlib.Path subclassable when I tested my suggested code in Python 3.10.

I understand why for backwards compatibility you need to keep the relative class hierarchy as is and the if cls is Path: cls = WindowsPath if os.name == 'nt' else PosixPath check in Path.__new__, and I also like that the new implementation drops the whole _Flavour class by simply reusing os.path.

Thanks for all the good work. :slight_smile:

4 Likes

At many places, functions support filenames, path-like objects and file handles like this:

if hasattr(filename, "read") or hasattr(filename, "write"):
    fp = filename
    closefp = False
elif isinstance(filename, (str, bytes, os.PathLike)):
    fp = open(filename, mode)
    closefp = True

But this code rules real generalized path-objects out. So how should we improve this?

Is it ok, to expect, that any object with a open-method can be used

if hasattr(filename, "read") or hasattr(filename, "write"):
    fp = filename
    closefp = False
elif hasattr(filename, "open"):
    fp = filename.open(mode)
    closefp = True
elif isinstance(filename, (str, bytes, os.PathLike)):
    fp = open(filename, mode)
    closefp = True

or is it better to check for a instance of type PathBase?

if hasattr(filename, "read") or hasattr(filename, "write"):
    fp = filename
    closefp = False
elif isinstance(filename, pathlib._abc.PathBase):
    fp = filename.open(mode)
    closefp = True
elif isinstance(filename, (str, bytes, os.PathLike)):
    fp = open(filename, mode)
    closefp = True

It think, to allow a broader support for path implementations, there should be one way to handle this case.

Path objects work just fine with open(), they don’t need a separate branch.

But I don’t love this pattern in general–I would rather write one function that takes a file-like object, and another that takes paths and opens them for use by calling the first function.

:sparkles: Sparkly update! :sparkles:

I’ve resolved GH-73991 by adding new Path.copy(), copy_into(), move() and move_into() methods. Publicly these can only be used for local filesystem copies/moves, but secretly these methods are implemented in PathBase and allow any other instance of PathBase as the destination path. With appropriate implementations of PathBase, it should be possible to move a file from local storage to a .zip file, thence to a .tar file, and finally back to local storage, with three calls to move().

There are underscore-prefixed methods for preserving metadata when copying/moving between different types of PathBase, but they need much refinement. As mentioned in my last update, I also need to develop a FlatPathBase class to better support filesystems that aren’t directory-oriented. To make progress on both of these things, I’m planning to write my own private PathBase-derived version of zipfile.Path that passes all/almost all its tests. I’ll extract the generic bits (e.g. generation of implied directories) into FlatPathBase, and ensure we can round-trip metadata in copy() and move(). If I can get it working, it should allow me to finalize the pathlib._abc APIs.

I think that’s it for now. Cheers!

19 Likes

I haven’t closely followed the new developments but I noticed a really useful method is now deprecated: pathlib — Object-oriented filesystem paths — Python 3.14.0a0 documentation

What was the reason for this? I can’t find anything using GitHub’s search.

See Deprecation of pathlib.PurePath.is_reserved()

1 Like