As of Python 3.12.0 alpha 7, it’s possible to subclass from pathlib.PurePath and Path, plus their Posix- and Windows-specific stablemates.
A major use case for extending these classes is to implement embedded or remote filesystems - for example, a pathlib-like interface to .tar files or S3 buckets. This requires that underlying tarfile.TarFile or botocore.Resource objects to be shared between path objects, such as when such path objects are generated via path.parent or path.iterdir(). I’m looking to introduce an instance method that is called whenever such paths are created. The default implementation would look like this:
def newpath(self, *pathsegments):
"""Construct a new path object from any number of path-like objects.
Subclasses may override this method to customize how new path objects
are created from methods like `iterdir()`.
As mentioned in the docstring, subclasses can customize this method to pass information around, e.g.:
I hate to say it, but I don’t much like either newpath or makepath. In general I’m not a fan of the verbs do or make in method names; they are so bland and generic as to add no nutritious information to the name. And we don’t use new in Python, calling the class itself serves as a constructor.
Which leads me to my next point: this method is a constructor. The cited default implementation is literally equivalent to the constructor. The novel feature you propose is secretly propagating state information from one instance into the newly created instance. Explicit is better than implicit, so, how about we make it explicit? Add a context or state parameter to the base class constructors, then pull that context or state off the instance and pass it in explicitly. Or maybe, instead of passing in the state, you pass in the existing instance you want the new instance to be similar to, in which case the constructor argument would need a different name (sibling? prototype?).
And isn’t this new method equivalent to instance.joinpath('/', *segments), at least on non-Windows?
An opaque “context” or “state” object doesn’t seem any more explicit. I’d prefer that subclasses are free to accept and store relevant stuff in their __init__() methods, e.g. session_id or tarfile in the examples above.
This I quite like, I’ll have a play, thanks!
That would make relative paths absolute on non-Windows.
I’ve updated the PR to use Larry’s idea, and it’s much improved. In the PR I used “template” as the argument name, as it seems a tiny bit less jargon-y than “prototype”. OTOH it’s less precise and could perhaps be misunderstood as a format string, or something like that. I’m perfectly happy with either. Any thoughts/suggestions?
I later thought, maybe this could be a variant of copy or clone? You construct a clone, then overwrite the actual path but keep the rest. But I can’t think of any examples of specifically “clone and also modify as part of the same method call” in the standard library–all the clone and copy methods I know of don’t take arguments. So that would be novel. Anyway, if you’re happy with template (which wfm) I say go for it.
I like the initializer-argument approach as well. It seems very elegant. I worry, though, if it’s Pythonic - I struggle to think of a really clear precedent for doing things this way.
If this approach is chosen, I think prototype is a much more accurate parameter name than template. A “template” connotes something with semantically, deliberately blank or placeholder values that will be filled in; the template is not properly usable as-is. A “prototype”, on the other hand, connotes an example that will be copied or emulated with modifications, which is what we have here.
My only concern with “prototype” is that, in everyday usage, it tends to mean something like “an early sample or model built to test a concept or process”. E.g. “my prototype nuclear reactor is suspiciously warm”. I might be overthinking it!