Is there a `pathlib` equivalent of `os.scandir()`?

barneygale · October 28, 2024, 12:16am

@ncoghlan added some great feedback to the issue and PR about the potential dir_entry attribute, and how it could work for paths that aren’t generated by Path.iterdir(). I’ve been trying to get to grips a potential API, but a lot of it isn’t obvious (e.g. when caches are generated, updated, expired), which I think indicates that my whole approach is wrong.

What if we simply added Path.scandir() instead?

There’s a few things to weigh. The most obvious objection is that path.scandir() is only an alternate spelling of os.scandir(path). But some of pathlib’s utility comes down to trivially wrapping the most useful os functions in methods (for example, see Path.stat(), chmod(), rmdir()), despite (sometimes quite convincing) arguments that this is improper practice. I suggest that scandir() is among the most commonly-used os functions and might deserve a Path method on its own merits.

My own pet interest is in eventually exposing a PathBase class that users can subclass to implement virtual filesystems. To that end, having a scandir() method that yields caching os.DirEntry-like objects is a massive help for implementing various high-level PathBase methods, like glob(), walk() and copy().

There’s a bit of cross-over between these concerns. Python 3.14’s Path.copy() method is implemented in PathBase at the moment, and uses PathBase.iterdir() to walk directory trees. For performance reasons it ought to work with os.DirEntry when dealing with local paths, which means I can either 1) add a near-duplicate implementation in Path that uses os.scandir(), or 2) Add scandir() to the PathBase interface and call that from copy().

Thoughts?