Os.path.samefile: Which exception?


(Peter Suter) #1

os.path.samefile is documented as:

This is determined by the device number and i-node number and raises an exception if an os.stat() call on either pathname fails.

Which exception is raised? The other functions in os.path explicitly document this (“Raise ValueError if …”, “Raise OSError if …” etc.)

Trying it out I get: FileNotFoundError: [WinError 2] The system cannot find the file specified: '...'
But since this is not documented I can not rely on that? It could be a different error in some cases?

One might suggest that the answer is in the linked os.stat() documentation. It doesn’t say anything about exceptions there either, but at the top of the os documentation it says:

All functions in this module raise OSError in the case of invalid or inaccessible file names and paths, or other arguments that have the correct type, but are not accepted by the operating system.

So does this also apply to os.path.samefile? It doesn’t seem clear to me.

Thanks.


(Brett Cannon) #2

So the exception should be considered whatever os.stat() would raise, which as you point out isn’t much clearer either. The real trick is what situation is triggering the exception as failures can range. It will probably require reviewing the code to figure out what should be documented for what failure conditions.

As for OSError, that can probably be clarified to “package” (even though os is technically not a package), but once again it would require double-checking quickly if that does hold for os.path (which I would be surprised if it didn’t).


(Peter Suter) #3

The note on OSError does not seem to hold in general for most os.path functions. Only four are explicitly documented to “Raise OSError if the file does not exist or is inaccessible”:

  • os.path.getatime(path)
  • os.path.getmtime(path)
  • os.path.getctime(path)
  • os.path.getsize(path)

Of the others, I could only get two to raise an OSError:

  • os.path.samefile('', '')
  • os.path.ismount(':::')

Some raise ValueError for some (but not all) invalid paths:

  • os.path.abspath('\0')
  • os.path.exists('\0')
  • os.path.lexists('\0')
  • os.path.isfile('\0')
  • os.path.isdir('\0')
  • os.path.islink('\0')
  • os.path.realpath('\0')
  • os.path.relpath(':::')

Others do not seem to raise any exceptions for any of the invalid paths I tried:

  • os.path.basename(path)
  • os.path.commonprefix(list)
  • os.path.dirname(path)
  • os.path.expanduser(path)
  • os.path.expandvars(path)
  • os.path.isabs(path)
  • os.path.join(path, *paths)
  • os.path.normpath(path)
  • os.path.split(path)
  • os.path.splitdrive(path)
  • os.path.splitext(path)

(Serhiy Storchaka) #4

But not in 3.8.


(Peter Suter) #5

Thanks, I was not aware of that. I found https://bugs.python.org/issue33721 now.

Similar exceptions are raised perhaps in hundreds functions. It is impossible to document them all

This is surprising. Should one generally assume any Python standard library function can raise any exception? I.e. try: ... except: ... must be used? I would have assumed the goal is to document all exceptions.


(Inada Naoki) #6

Yes. Some typical exceptions are expected. But there are some exceptions; KeyyboardInterrupted, RuntimeError, MemoryError, maybe more.

Not must. For some unexpected exceptions, default error handler is enough for most cases.
For example, when you writing script, Python will exit with error message including stack trace.
When you writing Web applications, WAF (Web Application Framework) will catch any exception, log the error with stacktrace, and return “503 Internal Server Error” to the web application users.

So you must use try ... except ... in very rare cases.

I would have assumed the goal is to document all exceptions.

I don’t think it’s not pragmatic for Python. There are some Python implementations. There are some implementation specific exceptions.
Even when thinking only CPython, listing all possible exceptions is a very tough task, and it will block Python evolves.


(Peter Suter) #7

Let’s focus on the “expected exception” instead of the “exceptions to those expected exceptions”.

Can a user of the Python standard library expect assume “expected exceptions” are documented? What are the “expected exceptions” for os.path.samefile? I don’t want to use try: ... except: ..., but I don’t want a “503 Internal Server Error” either. I just want to compare two paths and see if they refer to the same existing file or not.

I suspect OSError should be considered an “expected exception” for os.path.samefile and should be documented. Then I could use try: ... except OSError: .... Am I wrong?

I guess I could use os.path.isfile(a) and os.path.isfile(b) and os.path.samefile(a, b) instead. Although it seems to me that os.path.samefile() should ideally do that for me. But it’s documented to raise instead. It’s just not documented what. One can guess it’s just os.stat()'s exception. But to me it seems somewhat troubling that one can not be sure.

Thanks.


(Inada Naoki) #8

I’m OK to add document about samefile raises OSError.

What can you do other than return 503 when you cannot know two files are same or not by some reasons, like PermissionError?
If you want to handle “file not found” case, you can catch the specific exception. For all other cases, only you can do is return 503. Am I wrong?

I’m not against adding document to samefile. But I just against about this one:


(Peter Suter) #9

I’m OK to add document about samefile raises OSError .

Great, thanks.

What can you do

Raise a more informative error like CPython’s shutil or pycp:

Or silently skip file copying like Django:

Or continue searching the list of potential files:

Or probably many other things.

The Django file above also handles some PermissionErrors.

Apparently it’s already common to assume os.path.samefile only (usually) raises OSError.


(Inada Naoki) #10

Is your topic is document exception, or changing behavior?

I don’t know translating all OSError into False is good idea. For example, when a and b is same file, but current user doesn’t have permission to access it, I expect PermissonError, not False.

– “Errors should never pass silently.”


(Peter Suter) #11

I don’t propose changing any behavior. I would welcome improved documentation. I suspected someone would tell me I misread or missed some section in the current documentation, which explains that OSError is raised.

(What do you make of https://docs.python.org/3.8/library/os.path.html#os.path.exists which says “On some platforms, this function may return False if permission is not granted to execute os.stat() on the requested file, even if the path physically exists.”)

But I agree: If possible, the standard library should raise PermissionError, not return False. Again, I propose no change of behavior.

But users of the standard library often can and want to handle such errors; as shown above, often the correct preferred way to handle it may look similar to “ignoring it silently”. Of course this should only be done with a good reason and not blindly.

Since PermissionError is a OSError, it would be enough to document only OSError.


(Inada Naoki) #12

Sorry, I don’t understand “What do you make of~”. I’m not good at English.

If you request me comment about changing it’s behavior; changing behavior is hard decision. Strong reason is required when breaking backward compatibility.
And I don’t have such strong reason. So -1 unless someone shows reason strong enough to break backward compatibility.

Yes.

I agree.


(Peter Suter) #13

I don’t propose changing behavior. I was just curious. Sorry for the confusion.