March 2022 progress report
Slightly less to report this month, so I’m going into detail on my current task.
Brett Cannon made a helpful suggestion in PR 31085:
My current thinking (while I work through my PR review backlog to reach this PR), is this should be used for zipfile.Path. I also wouldn’t be apposed to that happening this PR if @barneygale wanted to give that a shot while he waits for a review from me.
This clarifies the order of things: I need to address the three issues I mentioned in the last post before we can add an AbstractPath class.
The first is to do with subclassing and flavours: users expect to be able to subclass pathlib.Path, instantiate their subclass, and have their new object use the local system’s path flavour (Windows or POSIX). This is what bpo-24132 is all about. Currently this falls over because many PurePath / Path methods expect to find a _flavour attribute, which is only set in PurePosixPath, PureWindowsPath and their subclasses.
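For context, here is a minimal sketch of the workaround people commonly reach for today: subclass whichever concrete class pathlib.Path() resolves to on the current system, so the subclass inherits a flavour. MyPath and its shout() method are hypothetical names made up for this illustration; they aren't part of pathlib or of my PR.

```python
import pathlib

# pathlib.Path() returns a PosixPath or a WindowsPath depending on the
# system, so type(...) gives us a class that already carries a flavour.
ConcretePath = type(pathlib.Path())


class MyPath(ConcretePath):
    """Hypothetical user subclass, for illustration only."""

    def shout(self):
        # Trivial extra method to show the subclass behaves like a path.
        return str(self).upper()


p = MyPath("docs") / "readme.txt"
print(type(p).__name__)  # the subclass survives the "/" operator
print(p.shout())
```

It works, but it's indirect: the user has to know about the concrete classes, which is exactly the wart that bpo-24132 describes.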
How can we solve this? An obvious solution is to set _flavour in PurePath (e.g. switching on os.name), but if we look a little deeper we can spot a nice simplification.
It’s worth considering the possible values _flavour can take: pathlib._posix_flavour or pathlib._windows_flavour, which are singleton instances of _PosixFlavour and _WindowsFlavour respectively, those types being subclasses of _Flavour. I usually refer to them as “flavour classes”. Here’s how things look today:
PurePath._flavour = xxx not set!! xxx
PurePosixPath._flavour = _PosixFlavour()
PureWindowsPath._flavour = _WindowsFlavour()
What are flavour classes? In my view, they’re essentially re-implementations of posixpath and ntpath, with a few changes and improvements. Here are the most striking correspondences (from the 3.7.0 source tree):
flavour         os.path
=======         =======
sep             sep
altsep          altsep
casefold()      normcase()
splitroot()     splitdrive()
gethomedir()    expanduser()
resolve()       realpath()
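To make the right-hand column concrete, here is a small sketch that pokes at a few of those os.path members in both posixpath and ntpath. Both modules are importable on any platform, which is part of what makes them usable as flavours. Only the os.path side is shown; the example paths are made up for illustration.

```python
import ntpath
import posixpath

# Separators: POSIX has no alternative separator, Windows accepts "/".
posix_seps = (posixpath.sep, posixpath.altsep)
nt_seps = (ntpath.sep, ntpath.altsep)

# normcase(): a no-op on POSIX; lowercases and swaps slashes on Windows.
posix_norm = posixpath.normcase("Docs/README")
nt_norm = ntpath.normcase("C:/Users/Barney")

# splitdrive(): the drive is always empty on POSIX.
posix_drive = posixpath.splitdrive("/usr/lib")
nt_drive = ntpath.splitdrive("C:\\Users\\barney")

print(posix_seps, nt_seps)
print(posix_norm, nt_norm)
print(posix_drive, nt_drive)
```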
In the case of splitroot(), gethomedir() and resolve(), the implementations clearly derive from the implementations in posixpath and ntpath. But the implementations were not kept in sync after pathlib landed in CPython, meaning every bug needed to be fixed in two places, and that didn’t always happen. In PRs over the last year or two I’ve removed the gethomedir() and resolve() implementations, which solved some pathlib bugs.
Here’s the rub: these classes don’t need to exist. We can make PurePosixPath reference posixpath directly, and do the same thing with PureWindowsPath and ntpath. Flavour classes don’t bring enough to the table to really justify their existence. I can see why they did when pathlib was a standalone package, mind!
And that brings me to the second observation: we already have a handy dandy attribute that points to either posixpath or ntpath depending on the system - it’s called os.path! So we can solve the problem elegantly by setting _flavour as follows:
PurePath._flavour = os.path
PurePosixPath._flavour = posixpath
PureWindowsPath._flavour = ntpath
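Here is a toy sketch of how that layout behaves. It is not the code in the PR: the Toy* classes and their methods are hypothetical stand-ins, written only to show that a plain subclass of the base class picks up the local flavour through os.path, while the two explicit classes pin theirs.

```python
import ntpath
import os
import posixpath


class ToyPurePath:
    """Toy stand-in for PurePath; the flavour is just a path module."""

    _flavour = os.path  # posixpath or ntpath, depending on the system

    def __init__(self, *parts):
        self._parts = parts

    def __str__(self):
        return self._flavour.join(*self._parts) if self._parts else "."

    def __eq__(self, other):
        if not isinstance(other, ToyPurePath):
            return NotImplemented
        # Compare case-normalised strings, using this path's flavour.
        return (self._flavour is other._flavour
                and self._flavour.normcase(str(self))
                == other._flavour.normcase(str(other)))


class ToyPurePosixPath(ToyPurePath):
    _flavour = posixpath


class ToyPureWindowsPath(ToyPurePath):
    _flavour = ntpath


class MyPath(ToyPurePath):
    """A user subclass: no flavour boilerplate required."""


print(str(ToyPurePosixPath("a", "b")))
print(str(ToyPureWindowsPath("a", "b")))
print(MyPath._flavour.__name__)  # the local system's path module
```

The nice part is that MyPath needs no knowledge of flavours at all; attribute lookup falls through to the base class and lands on os.path.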
And so that’s what I’ve attempted to implement in PR 31691. Brett has kindly assigned himself as a reviewer, and I expect he’s still working his way through his mighty review queue. That’s all the news! o/