Ergonomics of new pathlib.Path.scandir()

I’ve now reverted the addition of Path.scandir():

There’s some discussion here about how we might expose the cached file type in pathlib:

IIUC the current proposal is as follows: we add a Path.info property, which exposes an object with is_file(), is_dir() and is_symlink() methods. Accessing the property itself doesn’t perform any filesystem access. The object it returns is either an os.DirEntry (for paths generated by Path.iterdir()) or an instance of a new pathlib type (for all other paths). In both cases, stat() results are cached.
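To make the "other paths" case concrete, here’s a minimal sketch of an object with that shape. CachedInfo and demo() are made-up names for illustration, not the actual pathlib type, and the real implementation may well cache differently.

```python
import os
import stat
import tempfile


class CachedInfo:
    """Hypothetical stand-in for the proposed info object (not a pathlib API)."""

    def __init__(self, path):
        self._path = path   # creating the object does no filesystem access
        self._cache = {}    # follow_symlinks flag -> stat result, or None on error

    def _stat(self, follow_symlinks):
        # Call os.stat() at most once per flag and remember the outcome.
        if follow_symlinks not in self._cache:
            try:
                result = os.stat(self._path, follow_symlinks=follow_symlinks)
            except OSError:
                result = None
            self._cache[follow_symlinks] = result
        return self._cache[follow_symlinks]

    def is_dir(self):
        st = self._stat(True)
        return st is not None and stat.S_ISDIR(st.st_mode)

    def is_file(self):
        st = self._stat(True)
        return st is not None and stat.S_ISREG(st.st_mode)

    def is_symlink(self):
        st = self._stat(False)
        return st is not None and stat.S_ISLNK(st.st_mode)


def demo():
    tmp = tempfile.mkdtemp()
    info = CachedInfo(tmp)   # no filesystem access yet
    first = info.is_dir()    # one os.stat() call, result cached
    os.rmdir(tmp)
    second = info.is_dir()   # answered from the cache, so unchanged
    return first, second
```

Note that the second call still reports a directory after it has been removed, which is the trade-off of caching.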

I propose that re-creating the path object with p = Path(p) is sufficient for clearing the p.info cache, and so we don’t need to provide further methods, property deleters, etc.
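A rough illustration of why rebuilding the object is enough, assuming the cache lives on the path instance. SketchPath and SketchInfo are hypothetical toy classes, not pathlib internals; they only model the caching behaviour.

```python
import os
import tempfile


class SketchInfo:
    """Toy info object that remembers the first is_dir() answer."""

    def __init__(self, path):
        self._path = path
        self._is_dir = None  # None means "not queried yet"

    def is_dir(self):
        if self._is_dir is None:
            self._is_dir = os.path.isdir(self._path)
        return self._is_dir


class SketchPath:
    """Toy path object whose info is created lazily and kept per instance."""

    def __init__(self, path):
        self._raw = os.fspath(path)  # accepts strings and other path-like objects
        self._info = None

    def __fspath__(self):
        return self._raw

    @property
    def info(self):
        if self._info is None:
            self._info = SketchInfo(self._raw)
        return self._info


def demo():
    tmp = tempfile.mkdtemp()
    p = SketchPath(tmp)
    before = p.info.is_dir()  # queried and cached while the directory exists
    os.rmdir(tmp)
    stale = p.info.is_dir()   # served from the cache on the old instance
    p = SketchPath(p)         # new instance, so no cached state carries over
    fresh = p.info.is_dir()   # queried again against the filesystem
    return before, stale, fresh
```

Because the cached state hangs off the instance, a new instance starts clean without any dedicated invalidation method.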

If we add something like Path.info, then:

  • Users will be able to get at the os.DirEntry objects internally generated by Path.iterdir(), which are otherwise thrown away
  • Users writing performance-sensitive code can rely on path.info for cached file status whether or not the path object was generated by iterdir()
  • In the pathlib ABCs, we can replace PathBase.stat() with PathBase.info. The former is too low-level and awkward to implement for most virtual filesystems.
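For reference, this is the kind of cached file-type information os.DirEntry already carries when you use os.scandir() directly, and which Path.iterdir() currently discards. list_subdirs() is just an illustrative helper name.

```python
import os
import tempfile


def list_subdirs(directory):
    """Return the names of the immediate subdirectories of `directory`."""
    with os.scandir(directory) as entries:
        # DirEntry.is_dir() can usually answer from data gathered during the
        # directory scan, so in most cases no extra stat() call is needed.
        return sorted(entry.name for entry in entries if entry.is_dir())


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        os.mkdir(os.path.join(tmp, "sub"))
        with open(os.path.join(tmp, "file.txt"), "w") as f:
            f.write("x")
        return list_subdirs(tmp)
```

With something like Path.info, the same check could be written against the path objects yielded by iterdir() instead of dropping down to os.scandir().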

Here’s a rough example:

>>> from pathlib import Path
>>> p = Path.cwd()
>>> p.info
<pathlib._local._FileInfo object at 0x7f6f357fcf00>
>>> p.info.is_dir()
True
>>> q = next(p.iterdir())
>>> q
PosixPath('Lib')
>>> q.info
<DirEntry 'Lib'>

Thoughts?
