As of Python 3.12.0 alpha 7, it’s possible to subclass from pathlib.PurePath and Path, plus their Posix- and Windows-specific stablemates.
A major use case for extending these classes is to implement embedded or remote filesystems - for example, a pathlib-like interface to .tar files or S3 buckets. This requires the underlying tarfile.TarFile or botocore.Resource objects to be shared between path objects, such as when new path objects are generated via path.parent or path.iterdir(). I’m looking to introduce an instance method that is called whenever such paths are created. The default implementation would look like this:
    def newpath(self, *pathsegments):
        """Construct a new path object from any number of path-like objects.

        Subclasses may override this method to customize how new path objects
        are created from methods like `iterdir()`.
        """
        return type(self)(*pathsegments)
As mentioned in the docstring, subclasses can customize this method to pass information around, e.g.:
It enables passing information from one path instance to another. In the SessionPath example above, that’s self.session_id. That would not be possible if it were a classmethod!
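A minimal sketch of the idea, using a toy stand-in class rather than pathlib itself, so that it runs on any Python version. MiniPath, its parts attribute and the sample paths are assumptions made up for illustration; only newpath() and the session_id attribute come from the proposal.

```python
class MiniPath:
    """Toy stand-in for a pathlib-style class (not pathlib itself)."""

    def __init__(self, *pathsegments):
        # Flatten segments such as "a/b" and "c" into ("a", "b", "c").
        self.parts = tuple(
            part
            for segment in pathsegments
            for part in str(segment).split("/")
            if part
        )

    def __str__(self):
        return "/".join(self.parts)

    def newpath(self, *pathsegments):
        # Default hook, same body as the proposed default implementation.
        return type(self)(*pathsegments)

    @property
    def parent(self):
        # Derived paths go through the hook, not through the class directly.
        return self.newpath(*self.parts[:-1])

    def joinpath(self, *pathsegments):
        return self.newpath(*self.parts, *pathsegments)


class SessionPath(MiniPath):
    def __init__(self, *pathsegments, session_id):
        super().__init__(*pathsegments)
        self.session_id = session_id

    def newpath(self, *pathsegments):
        # As an instance method, the hook can read self.session_id.
        return type(self)(*pathsegments, session_id=self.session_id)


logs = SessionPath("srv/app/logs", session_id=42)
child = logs.joinpath("today.txt")
print(child, child.session_id)              # srv/app/logs/today.txt 42
print(logs.parent, logs.parent.session_id)  # srv/app 42
```

With a classmethod hook there would be no self to read session_id from, which is the point made above.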
I hate to say it, but I don’t much like either newpath or makepath. In general I’m not a fan of the verbs do or make in method names; they are so bland and generic as to add no nutritious information to the name. And we don’t use new in Python; calling the class itself serves as a constructor.
Which leads me to my next point: this method is a constructor. The cited default implementation is literally equivalent to the constructor. The novel feature you propose is secretly propagating state information from one instance into the newly created instance. Explicit is better than implicit, so, how about we make it explicit? Add a context or state parameter to the base class constructors, then pull that context or state off the instance and pass it in explicitly. Or maybe, instead of passing in the state, you pass in the existing instance you want the new instance to be similar to, in which case the constructor argument would need a different name (sibling? prototype?).
[edit]
And isn’t this new method equivalent to instance.joinpath('/', *segments), at least on non-Windows?
An opaque “context” or “state” object doesn’t seem any more explicit. I’d prefer that subclasses are free to accept and store relevant stuff in their __init__() methods, e.g. session_id or tarfile in the examples above.
This I quite like, I’ll have a play, thanks!
That would make relative paths absolute on non-Windows.
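This is easy to check with PurePosixPath, which applies POSIX rules on any platform. The sample path is made up for illustration.

```python
from pathlib import PurePosixPath

relative = PurePosixPath("docs/guide")

# Joining with "/" first discards everything before it, so the result
# is anchored at the root even though the original path was relative.
rebuilt = relative.joinpath("/", "docs", "guide")

print(relative.is_absolute())  # False
print(rebuilt)                 # /docs/guide
print(rebuilt.is_absolute())   # True
```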
Spoiler alert: it’s way more intuitive. Always listen to Larry, folks! I’ll adjust the PR soon. We can then argue about “prototype” vs “sibling” vs “template” vs whatever.
I’ve updated the PR to use Larry’s idea, and it’s much improved. In the PR I used “template” as the argument name, as it seems a tiny bit less jargon-y than “prototype”. OTOH it’s less precise and could perhaps be misunderstood as a format string, or something like that. I’m perfectly happy with either. Any thoughts/suggestions?
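For readers who want to picture the constructor-argument approach, here is a rough sketch. It is not the code from the PR: MiniPath, TarPath and the archive attribute are hypothetical names on a toy class, and only the template keyword comes from the discussion.

```python
class MiniPath:
    """Toy stand-in for a pathlib-style class (not pathlib itself)."""

    def __init__(self, *pathsegments, template=None):
        self.parts = tuple(
            part
            for segment in pathsegments
            for part in str(segment).split("/")
            if part
        )
        # The base class has no extra state to copy from the template.

    def __str__(self):
        return "/".join(self.parts)

    @property
    def parent(self):
        # The source instance is handed over explicitly as the template.
        return type(self)(*self.parts[:-1], template=self)


class TarPath(MiniPath):
    def __init__(self, *pathsegments, archive=None, template=None):
        super().__init__(*pathsegments, template=template)
        # Reuse the template's archive unless one is given explicitly.
        if archive is None and template is not None:
            archive = template.archive
        self.archive = archive


archive = object()  # placeholder for a shared tarfile.TarFile
member = TarPath("pkg/src/main.py", archive=archive)
print(member.parent)                     # pkg/src
print(member.parent.archive is archive)  # True
```

The subclass decides what to copy from the template, so the base class never needs to know about archive.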
I later thought, maybe this could be a variant of copy or clone? You construct a clone, then overwrite the actual path but keep the rest. But I can’t think of any examples of specifically “clone and also modify as part of the same method call” in the standard library; all the clone and copy methods I know of don’t take arguments. So that would be novel. Anyway, if you’re happy with template (which wfm) I say go for it.
So, this creates a new instance of the same type, conceptually a modified copy of the original instance, where the (path) segments have changed, but everything else matches the original?
My standard rules directly tell me the name for such a thing: with_segments. (Similarly, a @classmethod creating the instance from scratch would be from_segments.)
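To make the naming rule concrete, here is a small sketch of the two spellings side by side. It uses a frozen dataclass rather than pathlib; the class, its fields and the default session_id of 0 are assumptions for illustration only.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SessionPath:
    segments: tuple
    session_id: int = 0

    @classmethod
    def from_segments(cls, *segments):
        # Built from scratch: there is no existing instance to copy from.
        return cls(segments=tuple(segments))

    def with_segments(self, *segments):
        # Modified copy: only the segments change, the rest is kept.
        return replace(self, segments=tuple(segments))


etc = SessionPath(("etc",), session_id=42)
hosts = etc.with_segments("etc", "hosts")
print(hosts.segments, hosts.session_id)             # ('etc', 'hosts') 42
print(SessionPath.from_segments("tmp").session_id)  # 0
```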
Maybe you could give us sample code, illustrating what each looks like? You said the template parameter to the constructor cleaned things up, so, let us see the before and after.
I like the initializer-argument approach as well. It seems very elegant. I do worry, though, about whether it’s Pythonic - I struggle to think of a really clear precedent for doing things this way.
If this approach is chosen, I think prototype is a much more accurate parameter name than template. A “template” connotes something with deliberately blank or placeholder values that will be filled in; the template is not properly usable as-is. A “prototype”, on the other hand, connotes an example that will be copied or emulated with modifications, which is what we have here.
My only concern with “prototype” is that, in everyday usage, it tends to mean something like “an early sample or model built to test a concept or process”. E.g. “my prototype nuclear reactor is suspiciously warm”. I might be overthinking it!
I was specifically thinking of the JavaScript concept of a “prototype”. But it isn’t directly analogous, and in any case Python isn’t JavaScript. The reason for the name may not be obvious to first-time readers of the API, and I don’t have a strong opinion about the right spelling of the parameter name.