Make `os._walk_symlinks_as_files` public

os.walk() behaves differently from pathlib.Path.walk(): even when followlinks is False (the default), os.walk() puts symlinks to directories in dirs rather than nondirs (as does os.fwalk()), whereas pathlib.Path.walk() correctly treats symlinks to directories as nondirs. This unintuitive behavior was reported 14 years ago (GH-57179) and has been documented since 2022.
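To make the discrepancy concrete, here is a minimal sketch that builds a throwaway tree with one real directory and one symlink pointing at it, then looks at what os.walk() reports for the top level. The helper name `classify_symlinked_dir` and the entry names `real` and `link` are made up for illustration, and the sketch assumes a platform where the process is allowed to create symlinks.

```python
import os
import tempfile


def classify_symlinked_dir():
    """Return (dirnames, filenames) that os.walk() yields for the top level
    of a tree containing a real directory and a symlink to that directory."""
    with tempfile.TemporaryDirectory() as root:
        real = os.path.join(root, "real")
        link = os.path.join(root, "link")
        os.mkdir(real)
        # Assumes symlink creation is permitted (it may need privileges on Windows).
        os.symlink(real, link, target_is_directory=True)

        # followlinks is left at its default of False.
        _top, dirnames, filenames = next(os.walk(root))
        return sorted(dirnames), sorted(filenames)


dirnames, filenames = classify_symlinked_dir()
print("dirs:", dirnames)
print("nondirs:", filenames)
```

With the default followlinks=False, the symlink is listed among the directories even though os.walk() does not descend into it, which is the behavior described above.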

Python 3.12 introduced pathlib.Path.walk(), which corrected this unintuitive behavior with a new API, but os.walk() and os.fwalk() remained unchanged.

In 2024, to fix a recursion error in shutil.rmtree(), a new os._walk_symlinks_as_files sentinel for the followlinks parameter of os.walk() was added to Python 3.14 (GH-119634), and was backported to Python 3.13 and 3.12. This effectively provided a way to avoid the unintuitive behavior. Later, pathlib.Path.walk() was changed to use os.walk() with os._walk_symlinks_as_files instead of its own implementation (GH-119573). However, os.fwalk() remained unchanged.
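As a rough sketch of how the sentinel can be used today, the wrapper below passes os._walk_symlinks_as_files to os.walk() when the running interpreter has it, and otherwise reclassifies symlinked directories by hand. The names `walk_symlinks_as_files` and `demo_top_level` are hypothetical helpers for illustration, not part of the os module, and because the sentinel is a private attribute it is looked up with getattr() rather than assumed to exist.

```python
import os
import tempfile


def walk_symlinks_as_files(top):
    """Like os.walk(top), but report symlinks to directories as non-directories."""
    # Private attribute: may be missing on older or alternative interpreters.
    sentinel = getattr(os, "_walk_symlinks_as_files", None)
    if sentinel is not None:
        yield from os.walk(top, followlinks=sentinel)
        return

    # Fallback (an assumption of this sketch): move symlinks out of dirnames manually.
    for dirpath, dirnames, filenames in os.walk(top):
        links = [d for d in dirnames if os.path.islink(os.path.join(dirpath, d))]
        dirnames[:] = [d for d in dirnames if d not in links]
        filenames.extend(links)
        yield dirpath, dirnames, filenames


def demo_top_level():
    """Build a small tree and return the top-level (dirnames, filenames)."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "real"))
        os.symlink(os.path.join(root, "real"), os.path.join(root, "link"),
                   target_is_directory=True)
        with open(os.path.join(root, "file.txt"), "w"):
            pass
        _top, dirnames, filenames = next(walk_symlinks_as_files(root))
        return sorted(dirnames), sorted(filenames)
```

Either branch should list the symlink alongside regular files, which is the classification pathlib.Path.walk() uses.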

In my opinion, we can make this sentinel public, allowing users to write something like os.walk(some_dir, followlinks=os.walk_symlinks_as_files) to get the same behavior as pathlib’s. If so, there are two questions:

  1. Is there a better name for os._walk_symlinks_as_files?
  2. Should we make os.fwalk() support this sentinel as well, to ensure consistency between os.walk() and os.fwalk()?

CC @barneygale What do you think?

I have had at least one bug caused by these discrepancies – we had to switch from pathlib.Path.walk to os.fwalk in an application, and we mishandled symlinks – and I think a sentinel existing and being documented would have helped us because it calls attention to the behavior.

That said, I think it’s important to give followlinks some understandable semantics and a good API. os.walk() is used a lot by beginners, as they’re trying to figure out how to traverse a filesystem.

True | False | TheSentinel feels a bit too much like each value is a special case.
Is there a good phrasing of this with an enum? e.g.,

import enum

class SymlinksFlag(enum.Enum):
    FOLLOW = enum.auto()  # True translates to this
    TARGET_TYPE = enum.auto()  # and False to this
    AS_FILES = enum.auto()

and have callers spell out the enum name (rather than flattening the members into the module namespace, as re does with its flags):

os.walk(dirname, followlinks=os.SymlinksFlag.AS_FILES)

I am not at all attached to this naming. I’m advocating primarily that “the interface is important”.
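To illustrate how the existing boolean values could map onto such an enum, here is a small hypothetical sketch. Nothing in it exists in the os module: `SymlinksFlag` repeats the naming floated above, and `normalize_followlinks` is an invented helper showing one way a function could accept both the legacy booleans and the enum members.

```python
import enum


class SymlinksFlag(enum.Enum):
    # Hypothetical names taken from the suggestion above.
    FOLLOW = enum.auto()       # what followlinks=True means today
    TARGET_TYPE = enum.auto()  # what followlinks=False means today
    AS_FILES = enum.auto()     # what the private sentinel means today


def normalize_followlinks(value):
    """Map a legacy bool (or an existing flag) onto a SymlinksFlag member."""
    if isinstance(value, SymlinksFlag):
        return value
    return SymlinksFlag.FOLLOW if value else SymlinksFlag.TARGET_TYPE


print(normalize_followlinks(True))
print(normalize_followlinks(False))
print(normalize_followlinks(SymlinksFlag.AS_FILES))
```

A mapping like this would keep existing followlinks=True and followlinks=False calls working while giving each behavior an explicit name.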
