Currently, when you use os.listdir(), Path.iterdir(), or os.scandir() to list the items in a directory, the result also includes hidden items.
You can check whether their names start with ., but that doesn’t detect paths such as C:\ProgramData on Windows & os.path.expanduser("~/Library") on macOS.
My proposal is to add a function that handles these paths correctly:
import os
from os.path import basename
import stat

if os.name == "nt":  # ntpath.py
    if hasattr(os.stat_result, 'st_file_attributes'):  # Windows
        def ishidden(path):
            """Test whether a path is hidden."""
            try:
                st = os.stat(path)
            except (OSError, ValueError):
                return True
            return bool(st.st_file_attributes & stat.FILE_ATTRIBUTE_HIDDEN)
    else:
        def ishidden(path):
            """Test whether a path is hidden."""
            os.fspath(path)  # only validates the argument type
            return False
else:  # posixpath.py
    if hasattr(os.stat_result, 'st_flags'):  # macOS
        def ishidden(path):
            """Test whether a path is hidden."""
            try:
                st = os.stat(path)
            except (OSError, ValueError):
                return True
            return bool(st.st_flags & stat.UF_HIDDEN)
    else:  # Unix
        def ishidden(path):
            """Test whether a path is hidden."""
            path = os.fspath(path)
            if isinstance(path, bytes):
                prefix = b"."
            else:
                prefix = "."
            return basename(path).startswith(prefix)
Usage:
from os import listdir
from os.path import expanduser, join

directory = expanduser("~")
for item in listdir(directory):
    path = join(directory, item)
    if not ishidden(path):
        print(path)
Note: New files starting with . on macOS, as well as some system files on Windows aren’t hidden in the current implementation & the root on Windows is hidden.
It’s an interesting idea. I think it would make a bit more sense the other way around as ishidden() since “hidden” is a defined attribute. It’s telling that all three of your implementations return not
I think there has been a proposal for this before; digging up and addressing any concerns from prior discussions should be the first step. Off the top of my head, one edge case to consider is a non-hidden file in a hidden directory: is it hidden or not?
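To make that edge case concrete, here is a minimal sketch that assumes only the Unix dot-prefix convention. The helper name has_hidden_component() is hypothetical and not part of the proposal; it shows one possible answer, where a file counts as hidden as soon as any directory above it is hidden.

```python
from pathlib import PurePosixPath


def has_hidden_component(path):
    """Return True if any component of *path* starts with a dot.

    Hypothetical helper for illustration only. The special names
    "." and ".." are never treated as hidden.
    """
    parts = PurePosixPath(path).parts
    return any(
        part.startswith(".") and part not in (".", "..")
        for part in parts
    )


# A visible file inside a hidden directory, and a fully visible path.
print(has_hidden_component("/home/user/.config/app/settings.ini"))
print(has_hidden_component("/home/user/docs/report.txt"))
```

Whether a real ishidden() should walk the parents like this or look only at the last component is exactly the policy question raised above.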
I am more in favor of adding a ‘hidden’ parameter to the os.listdir function instead. Also, it’s not clear where the proposed function should be added. However, without taking into consideration (and addressing) previous discussions, we would be repeating ourselves.
On Unix systems, hidden is a convention: the filename starts with “.” and is not “.” or “..”. On Windows it is an attribute that requires an API call, which is not reasonable for os.listdir to make, since the extra call would slow down os.listdir.
Actually, a hidden=True parameter for os.listdir() and os.scandir() would be nearly free to implement on Windows, though that in itself doesn’t make it a good idea. Directory entries on Windows include basic metadata, including the long filename, short filename (if any), file attributes, reparse tag (if any), file size, and timestamps (birth, modify, access).
There are also backup files ending with ~, and files in a hidden directory, which are inherently hidden.
A hidden folder (sometimes hidden directory) or hidden file is a folder or file which filesystem utilities do not display by default when showing a directory listing[1].
Using hidden=True doesn’t incur any additional cost since it reflects the current behavior. Introducing hidden=False as an option for os.listdir would bring it in line with common practices found in command-line tools like dir /a and ls -a
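As a rough illustration of what such a parameter could look like, here is a sketch of a wrapper around os.listdir(). The function name list_directory() and its hidden keyword are hypothetical, and only the Unix dot-prefix convention is applied; a real implementation would need the platform-specific checks discussed in this thread.

```python
import os


def list_directory(path=".", hidden=True):
    """Hypothetical wrapper mimicking a `hidden` parameter on os.listdir().

    hidden=True keeps today's behaviour (everything is returned).
    hidden=False drops names starting with a dot. Assumes a str path,
    so the returned names are str as well.
    """
    names = os.listdir(path)
    if hidden:
        return names
    return [name for name in names if not name.startswith(".")]
```

Calling list_directory(some_dir, hidden=False) would then behave roughly like a plain ls, while the default matches ls -a (minus the “.” and “..” entries, which os.listdir() never returns).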
I think it would make a bit more sense the other way around as ishidden() since “hidden” is a defined attribute. It’s telling that all three of your implementations return not
True, I wanted to avoid needing to apply not to the result, but we can’t predict how the function will be used.
Off the top of my head, one edge case to consider is a non-hidden file in a hidden directory: is it hidden or not?
Windows treats the root directory as hidden, so we first need to decide whether we want to respect that fact.
I’m not sure this API is needed. I would guess that in most cases you can fetch your directory listing and use a list comprehension to filter out the visible or invisible files.
My counter proposal is about avoiding this block of code:
directory = "C:\\"
items = listdir(directory)
filtered_items = [item for item in items if not ishidden(join(directory, item))]
for item in filtered_items:
    print(item)
We can already use a platform-specific one-liner to check if a file is hidden or not, e.g., os.path.basename(os.fspath(path)).startswith(b'.' if isinstance(os.fspath(path), bytes) else '.'). Also, ‘hidden’ is just one of the file attributes in Windows.
For Windows, it’s best to use os.scandir() for this. It gets the basic stat result from the entries in the directory listing, which is read in a batch of entries per system call. For example:
import os
import stat

def hidden(e):
    # DirEntry.stat() reuses the metadata from the directory listing on Windows.
    attrs = e.stat(follow_symlinks=False).st_file_attributes
    return attrs & stat.FILE_ATTRIBUTE_HIDDEN

filtered_entries = [e for e in os.scandir('C:/Windows') if hidden(e)]
for e in filtered_entries:
    print(os.path.normpath(e.path))
I’m not. One can easily imagine wanting to know this for an arbitrary filesystem path, not just “what I get from listdir”.
I have other objections to adding it to listdir:
os.listdir, like most os.* things, is a pretty direct mapping to the OS API of that name, in this case the POSIX listdir; putting more knobs on it breaks that idea.
Once we start adding parameters to special things, where does one stop? Parameters for only-files, only-dirs, not-files, not-dirs, only-particular special file types, not-decodable-with-my-current-locale, etc etc. I can imagine uses for all of these in some circumstances, and as such I do not want them in listdir, but as some kind of filter applied later, e.g. a list comprehension.
And you’re off into policy land and the bike shed of many colours.
I’d be in favour of something in pathlib which could test “would my current platform consider this hidden?” And/or in os.path to match.
I gather Windows has an actual “hidden” file attribute? Some UNIXy systems have something similar.
The venerable UNIX “starts with a dot” convention should be honoured on a UNIX platform. What should one do on a UNIX platform which also has a “hidden” file attribute? I’d lean towards honouring that, too.
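A minimal sketch of that “honour both” policy might look like the following. The name is_hidden_unix() is hypothetical; it assumes that the hidden attribute shows up as the UF_HIDDEN bit in st_flags (as on macOS and some BSDs), and it silently skips the flag check where st_flags does not exist.

```python
import os
import stat


def is_hidden_unix(path):
    """Hypothetical check: dot-prefixed name OR the UF_HIDDEN flag."""
    path = os.fspath(path)
    base = os.path.basename(path)
    dot = b"." if isinstance(base, bytes) else "."
    # Convention first: it needs no system call at all.
    if base.startswith(dot) and base not in (dot, dot * 2):
        return True
    try:
        st = os.lstat(path)
    except (OSError, ValueError):
        # Assumption: a path that cannot be examined is not reported as hidden.
        return False
    # st_flags is missing on platforms without file flags, e.g. Linux.
    return bool(getattr(st, "st_flags", 0) & stat.UF_HIDDEN)
```

Checking the name before calling lstat() keeps the common dot-file case cheap; whether an unreadable path should count as hidden is a separate policy decision.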
Wouldn’t it be more consistent if we had os.path.ishidden() & os.DirEntry.is_hidden(),
like os.path.isdir() & os.DirEntry.is_dir(), etc.?
Then we have a function for arbitrary paths & an efficient one for directory entries.
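Since os.DirEntry.is_hidden() does not exist, here is a sketch of a free function that plays its role. The name entry_is_hidden() is hypothetical, and treating dot names as hidden only outside Windows is an assumption that mirrors the proposal at the top of the thread.

```python
import os
import stat


def entry_is_hidden(entry):
    """Hypothetical stand-in for the suggested os.DirEntry.is_hidden().

    Assumes *entry* comes from os.scandir() called with a str path.
    """
    if os.name != "nt" and entry.name.startswith("."):
        return True
    try:
        # On Windows this reuses the metadata cached by scandir().
        st = entry.stat(follow_symlinks=False)
    except OSError:
        return False
    # Each attribute only exists on some platforms, hence getattr().
    if getattr(st, "st_file_attributes", 0) & stat.FILE_ATTRIBUTE_HIDDEN:
        return True
    return bool(getattr(st, "st_flags", 0) & stat.UF_HIDDEN)


def visible_entries(directory):
    """Return the names in *directory* that are not hidden."""
    with os.scandir(directory) as entries:
        return [e.name for e in entries if not entry_is_hidden(e)]
```

The path-based os.path.ishidden() would then cover arbitrary paths, while a DirEntry method like this one could avoid extra system calls during a directory scan.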
The big real use case I can come up with is GUI file chooser dialog boxes. But those are implemented in high-level UI frameworks that have these things worked out already. Once again, I’m not sure these APIs make sense in the stdlib.