os.path.isempty() & pathlib.Path.is_empty()

Currently there’s no built-in way to tell whether deleting or overwriting a file or folder would lose data.

I propose a function that allows you to check for this:

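import os
from os import listdir
# Note: isjunction requires Python 3.12 or later.
from os.path import getsize, isdir, isfile, isjunction, islink

# Runtime stand-in for the typeshed alias of the same name.
StrOrBytesPath = str | bytes | os.PathLike

def isempty(path: StrOrBytesPath, *, follow_symlinks: bool = True) -> bool: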
    """Test whether a file or directory is empty."""
    if not follow_symlinks and (islink(path) or isjunction(path)):
        return True

    if isfile(path):
        return not getsize(path)

    if not isdir(path):
        return True

    return not listdir(path)

Additionally, a check_empty parameter could be added to functions that delete or overwrite files,
raising a FileNotEmptyError or returning False when the target is not empty.
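
A minimal sketch of how such a check_empty parameter might behave, written as a standalone wrapper rather than a change to any stdlib function. The names FileNotEmptyError and remove are hypothetical (taken from the wording of the proposal), and the emptiness rule is deliberately simplified to the zero-byte-file and empty-directory cases.

```python
import os


class FileNotEmptyError(OSError):
    """Hypothetical exception; it does not exist in the standard library."""


def _is_empty(path):
    # Simplified rule: a directory with no entries, or a zero-byte file.
    # Symlinks are not treated specially in this sketch.
    if os.path.isdir(path):
        with os.scandir(path) as entries:
            return next(entries, None) is None
    return os.path.getsize(path) == 0


def remove(path, *, check_empty=False):
    """Delete a file or directory, optionally refusing non-empty targets."""
    if check_empty and not _is_empty(path):
        raise FileNotEmptyError(f"refusing to delete non-empty path: {path!r}")
    if os.path.isdir(path):
        os.rmdir(path)  # os.rmdir itself still fails on a non-empty directory
    else:
        os.remove(path)
```

Note that checking first and deleting afterwards leaves a window in which another process can write to the path, so a check like this can only ever be advisory.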

Not empty:

  • A directory containing a file or subdirectory (excluding . & ..)
  • A file larger than zero bytes

Empty:

  • A directory containing no files or subdirectories (excluding . & ..)
  • A zero-byte file
  • A junction (if follow_symlinks is disabled)
  • A symlink (if follow_symlinks is disabled)
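
The classification above can be checked against a throwaway directory tree. This is only a sketch: the function body restates the proposed rules (it is not stdlib code), the isjunction fallback is an assumption for Python versions before 3.12, and the symlink case is skipped where the platform refuses to create symlinks.

```python
import os
import tempfile

# os.path.isjunction exists only on Python 3.12+; assume "never a junction" otherwise.
_isjunction = getattr(os.path, "isjunction", lambda path: False)


def isempty(path, *, follow_symlinks=True):
    """Restatement of the proposed rules, for illustration only."""
    if not follow_symlinks and (os.path.islink(path) or _isjunction(path)):
        return True
    if os.path.isfile(path):
        return os.path.getsize(path) == 0
    if not os.path.isdir(path):
        return True  # note: a nonexistent path also lands here
    with os.scandir(path) as entries:
        return next(entries, None) is None  # stop at the first entry found


with tempfile.TemporaryDirectory() as root:
    empty_dir = os.path.join(root, "empty_dir")
    full_dir = os.path.join(root, "full_dir")
    empty_file = os.path.join(full_dir, "empty.txt")
    data_file = os.path.join(full_dir, "data.txt")
    os.mkdir(empty_dir)
    os.mkdir(full_dir)
    open(empty_file, "w").close()
    with open(data_file, "w") as fh:
        fh.write("x")

    results = {
        "empty_dir": isempty(empty_dir),
        "full_dir": isempty(full_dir),
        "empty_file": isempty(empty_file),
        "data_file": isempty(data_file),
    }
    try:
        link = os.path.join(root, "link")
        os.symlink(data_file, link)
        results["link_followed"] = isempty(link)
        results["link_not_followed"] = isempty(link, follow_symlinks=False)
    except (OSError, NotImplementedError):
        pass  # symlinks unavailable, e.g. on unprivileged Windows

print(results)
```

One thing the run makes visible: a path that does not exist is also reported as empty, because it is neither a file nor a directory.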

Why don’t you put these helper functions into a PyPI package and see if other people find them useful?

Personally I have never needed to check like this before deleting stuff.

Similar to your last suggestion, this is a pretty application-specific function. In addition to the general objection you have already seen in the previous discussion, there are a few points in this suggestion that we can disagree about. Most notably, I wouldn’t consider a directory containing only (an?) empty file to be safe to delete. These are decisions every application has to make on its own, and I don’t think the Python stdlib should provide an opinion on this.

Why don’t you put these helper functions into a PyPI package and see if other people find them useful?

I would first like to finalise the details.

Personally I have never needed to check like this before deleting stuff.

In an interactive program, you can delete such files without asking for confirmation.

There are a few points that we can disagree about. Most notably, I wouldn’t consider a directory containing only (an?) empty file to be safe to delete. These are decisions every application has to make on its own, and I don’t think the Python stdlib should provide an opinion on this.

OK, updated the implementation.

I had assumed that this is something that you are using in an application and therefore know the details.

Are you guessing at a needed function rather than solving an actual use-case you have?
That is not a productive use of anyone’s time (as proved by academic research).

I was using a similar function in the old file manager I wrote, but I wasn’t happy about the interface library it’s built on, so now I’m redesigning everything from scratch. At the moment, nothing is final.

I was simply wondering if these functions could be useful for other people too.

That wasn’t the point. You are writing an application. You have to make these decisions. Think about them, potentially ask users, make a survey of what other programs do, etc. If there is overwhelming agreement between these factors, that is of course a clear decision to make. But if that isn’t the case (which I strongly suspect), then they don’t belong in the stdlib and instead each application should make its own decisions.

Great! Do that. This isn’t the correct place for such discussions. This is about adding stuff to the stdlib, not designing a third party library. Look at something like platformdirs. They also have to make these kinds of opinionated decisions. Once they have made decisions, one can discuss adding it to the stdlib (which was recently suggested, not clear if that is still in the works). But trying to discuss multiple layers at the same time (“how should this function behave”, “is this function useful”, “does it belong in the stdlib”, “where does it belong”) is not going to lead to a productive result. If you nail down at the very least the first point and are willing to defend your decisions (preferably with a well formulated pre-PEP), then it might make sense to discuss whether they belong in the stdlib.

That is a rather bizarre definition of “no data”. This does NOT belong in the stdlib; it is tied very closely to your specific needs and belongs in your application.

I modified this list 2 hours ago, because MegaIng didn’t agree with me:

  • An empty directory
  • An empty file
  • A hardlink (with the second implementation)
  • A junction
  • A symlink

I think you’re missing the point. There isn’t a “right answer” that will make people suddenly agree with this proposal. The correct behavior is application-specific and therefore the function is not appropriate for the standard library.

I believe it makes more sense the other way around: isempty.
I also removed the hardlinks implementation as it was subjective.
Is this new classification of empty files and directories more intuitive?

  • A directory containing no files or subdirectories (excluding . & ..)
  • A zero-byte file
  • A junction (if follow_symlinks is disabled)
  • A symlink (if follow_symlinks is disabled)

A meta note: please avoid repeatedly changing the topic title, as it makes it much harder to follow the conversation via email. As you’ve also found, constantly editing previous posts makes it hard to respond to those posts since they’re a moving target, and email participants won’t even notice the changes.


I would highly recommend following @barry-scott’s advice from above:

Not for me, no.

Maybe your program can do this. There will be others that have a meaningful reason to care about the presence or absence of a specific file in a specific location, even if it’s empty. (Perhaps you’ve seen some examples named __init__.py :wink: )

I was using a similar function in the old file manager I wrote, but I
wasn’t happy about the interface library it’s built on, so now I’m
redesigning everything from scratch. At the moment, nothing is final.

Very sound. But this isn’t for the stdlib. Have this discussion in the
“Help/users” category instead. Your criteria are inherently
application specific.

I was simply wondering if these functions could be useful for other
people too.

They may be (but I’d choose a less… vague name than hasdata,
myself). So write yourself a module where you keep this kind of thing
and publish it on PyPI. If others find it useful, then they can use
it!

Glaring example: I’ve got a cs.fileutils module on PyPI filled with all
sorts of things I’ve found (and find) useful. Anyone could use it if
they like. Also its lower level companion, cs.fs.

If you’re encountering lots of bikeshedding, give your function various
optional flags to set/unset particular behaviours.
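
One way to read that advice, as a rough sketch: keep a conservative default and let callers opt in to the contested behaviours. The function name and both keyword flags below are hypothetical, not an existing API, and raising ValueError for anything that is neither a regular file nor a directory is an arbitrary choice made for this example.

```python
import os


def _is_zero_byte_file(entry):
    # entry is an os.DirEntry; symlinks are deliberately not followed here.
    return (entry.is_file(follow_symlinks=False)
            and entry.stat(follow_symlinks=False).st_size == 0)


def is_empty(path, *, follow_symlinks=True, ignore_empty_files=False):
    """Emptiness test with opt-in policies (both flags are hypothetical)."""
    if not follow_symlinks and os.path.islink(path):
        return True
    if os.path.isfile(path):
        return os.path.getsize(path) == 0
    if not os.path.isdir(path):
        # Missing paths, sockets, devices etc.: left to the caller to decide.
        raise ValueError(f"not a regular file or directory: {path!r}")
    with os.scandir(path) as entries:
        for entry in entries:
            if ignore_empty_files and _is_zero_byte_file(entry):
                continue  # caller opted in to treating zero-byte files as nothing
            return False
    return True
```

With the defaults, a directory holding a single zero-byte file counts as non-empty; only a caller who passes ignore_empty_files=True gets the more aggressive answer.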

Personally, if I expect to remove a directory I expect it to already
be empty, and just use os.rmdir. Making it empty enough for that is my
app’s job.
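
That approach can be sketched as follows, under the assumption that a "directory not empty" failure should be reported as False rather than raised. The helper name remove_dir_if_empty is hypothetical; os.rmdir and the errno constants are standard.

```python
import errno
import os


def remove_dir_if_empty(path):
    """Try to remove a directory; return False if it still has entries."""
    try:
        os.rmdir(path)  # the OS refuses to remove a non-empty directory
    except OSError as exc:
        # Most platforms report ENOTEMPTY; some report EEXIST instead.
        if exc.errno in (errno.ENOTEMPTY, errno.EEXIST):
            return False
        raise  # missing path, permissions, etc. are real errors
    return True
```

Because the emptiness check and the removal happen in a single system call, there is no gap between "looked empty" and "was deleted" for another process to slip into.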

Still on the side note, it’s all grouped nicely in my mail reader because
the messages have correct In-Reply-To headers:

 06Apr2024 10:22 Karl Knechtel v -  ┌>[Py] [Ideas] Os.path.isempty() & pathlib.Path.is_empty()
 06Apr2024 09:18 Paul Moore via  -  │ ┌>[Py] [Ideas] Os.path.isempty() & pathlib.Path.is_empty()
 06Apr2024 02:34 Nice Zombies vi r  │┌>
 06Apr2024 02:20 Barry Scott via -  ├>
 06Apr2024 03:04 Cornelius Krupp -  ├>
 06Apr2024 02:17 Nice Zombies vi - ┌>
 06Apr2024 07:12 Zachary Ware vi r │┌>[Py] [Ideas] Os.path.isempty() & pathlib.Path.is_empty()
 06Apr2024 01:44 Barry Scott via - ├>[Py] [Ideas] Os.path.hasdata() & pathlib.Path.hasdata()
 06Apr2024 06:41 Nice Zombies vi - ├>[Py] [Ideas] Os.path.isempty() & pathlib.Path.is_empty()
 06Apr2024 05:10 James Webber vi - │ ┌>
 06Apr2024 04:35 Nice Zombies vi - │┌>
 06Apr2024 04:13 Chris Angelico  - ├>
 06Apr2024 01:45 Cornelius Krupp - ├>
 06Apr2024 01:38 Nice Zombies vi - [Py] [Ideas] Os.path.hasdata() & pathlib.Path.hasdata()

I can even see the subject line changes.

What’re you using to read the list? (Or do you delete messages once
read, which I don’t.)

OTOH, as an email user I find retrospective editing of posts very
annoying. Unless the author’s fixing up stupid typos (I do this myself,
I make them all the time), we never see the revised versions and even
people on the forum have to go back and try to figure out what’s been
changed. Posting a followup correction/response is much better.

An empty directory or empty file might have extended attributes. On Windows, either might also have alternate data streams.
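
A rough sketch of how the extended-attribute case could be detected. os.listxattr is only available on Linux, so the helper (a hypothetical name) returns None when it cannot tell. Windows alternate data streams are not covered here; as far as I know the standard library has no direct call for enumerating them.

```python
import os


def has_extended_attributes(path):
    """Return True/False if xattrs can be listed, or None if unknown."""
    listxattr = getattr(os, "listxattr", None)  # Linux-only in the stdlib
    if listxattr is None:
        return None
    try:
        return bool(listxattr(path, follow_symlinks=False))
    except OSError:
        # e.g. the filesystem does not support extended attributes
        return None
```

A zero-byte file for which this returns True would carry metadata that a size-only emptiness test never sees.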

I’m afraid you still missed my reply about changing the name of the function to isempty because I renamed the post title.

There will be others that have a meaningful reason to care about the presence or absence of a specific file in a specific location, even if it’s empty. (Perhaps you’ve seen some examples named __init__.py :wink: )

OK, maybe it’s better to always ask for confirmation.

That’s not his problem, that’s yours. Please don’t make substantive edits and assume that people will just notice them. Post responses in the thread so we can actually see them.
