When using strings for paths, one can conveniently generate error or debugging messages using f"{pathname!r}" and get quoting and escaping for free. With a PosixPath, the same expression will (correctly) produce something like "PosixPath('/etc/passwd')".
Unfortunately, multiple format conversions (i.e. f"{path!s!r}" or f"{path!sr}"), which would be a possible shortcut, are not allowed.
This hints at the possibility of Path-specific format options for escaping, which could cover common use cases besides tidy user messages:
shlex.quote equivalent (shlex.quote("'") → '\'\'"\'"\'\''), convenient when building shell constructs
ls equivalent (' becomes '"\'"'), convenient when mentioning filenames in messages
repr(str(path)) equivalent that always adds the quotes, convenient when one always wants quotes around filenames in messages
So, assuming the three above examples would get formatting strings like :s, :q, :Q, you could do things like:
cmd = ["open", path]
subprocess.run(cmd)
log.info("Ran command: %s", ' '.join(f"{c:s}" for c in cmd))
print(f"Opening file {path:q}...")
raise RuntimeError(f"Input file {path:Q} has the wrong size")
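To make the idea concrete, here is a rough sketch of how such format codes could be wired up through __format__. Everything here is an assumption for illustration: QuotingPath is a hypothetical subclass, not anything pathlib provides, and only the :s (shlex.quote) and :Q (repr(str(path))) codes are covered. The ls-style :q is left out because its quoting rules are more involved.

```python
import shlex
from pathlib import PurePosixPath


class QuotingPath(PurePosixPath):
    """Hypothetical path type for illustration; not part of pathlib."""

    def __format__(self, spec: str) -> str:
        text = str(self)
        if spec == "s":
            # Shell-safe quoting, exactly what shlex.quote() produces.
            return shlex.quote(text)
        if spec == "Q":
            # Always quoted and escaped, like repr(str(path)).
            return repr(text)
        # Any other spec falls back to ordinary string formatting.
        return format(text, spec)


tricky = QuotingPath("/tmp/it's here.txt")
shell_form = f"{tricky:s}"
quoted_form = f"{tricky:Q}"
padded_form = f"{QuotingPath('a.txt'):>8}"
```

Note that this sketch gives :s a different meaning from the usual string presentation type, which is one of the design questions the proposal would have to settle.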
While there is nothing here we couldn’t do without, it seems like a way to encourage cleaner ways of dealing with the unexpected that can be found in filesystem paths.
>>> x = pathlib.Path('.bashrc')
>>> x
PosixPath('.bashrc')
>>> str(x)
'.bashrc'
>>> print(f'This is a path {str(x)} repr')
This is a path .bashrc repr
>>> print(f'This is a path {str(x)!r} repr')
This is a path '.bashrc' repr
I no longer rely on !r for putting quotes around things in messages. If you want quotes around your path in a string, put them there explicitly. The code remains just as simple, and the intent becomes obvious.
raise UsageError(f"The destination '{path}' must already exist.")
Neither of the other quoting options seems like it belongs in output strings. You’re typically displaying the string to a user, who will be confused by the extra escape characters (which are shell specific, not Python specific). They can always use the shell’s own quoting if they want to paste it: command 'path with spaces'. Python’s subprocess doesn’t need that quoting either: it already handles ["command", "path with spaces"] correctly.
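A small check of the subprocess point, as a sketch: the child process here is just the current Python interpreter echoing its first argument, chosen only so the example runs anywhere. The shlex.join call at the end is the standard-library way to render a command for pasting into a shell, if that is ever wanted.

```python
import shlex
import subprocess
import sys

target = "path with spaces"

# Each list element reaches the child process as one argument, unquoted.
cmd = [sys.executable, "-c", "import sys; print(sys.argv[1])", target]
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
received = result.stdout.strip()

# Quoting only matters when rendering a command line for a shell.
printable = shlex.join(["open", target])
```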
Somehow I’ve avoided learning about __format__() all these years, so take this reply with a large pinch of salt…
I’m open to the idea of a formatting shortcut for repr(str(path)), which is a common incantation for logging or raising exceptions. For simple filenames it produces a user-friendly result like 'foo.txt', but for filenames that involve e.g. newlines or quote characters it still produces something reasonable and unambiguous.
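For illustration, this is what the existing incantation gives for a simple name and for an awkward one. The filenames are made up, and PurePosixPath is used only so the snippet behaves the same on every platform.

```python
from pathlib import PurePosixPath

simple = PurePosixPath("foo.txt")
awkward = PurePosixPath("bad\nname's.txt")  # contains a newline and a quote

simple_msg = f"Cannot read {str(simple)!r}"
# The newline is shown as an escape sequence, so the message stays on one line.
awkward_msg = f"Cannot read {str(awkward)!r}"
```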
I’m less sold on a shortcut for shlex.quote() or the like. It’s too niche to belong in pathlib IMO.
Q: does this apply to pathlib or os.PathLike more generally? I wonder if we could add two presentation types for path-like objects, like this:
We’d need to decide if we want something like str and int formatting, which have a strict set of codes and nothing else, or something more like datetime’s formatting, with some replacement codes and everything else showing up in the output verbatim. Personally, I like datetime’s take on it, especially if the format string will be user provided (say, in a logging configuration file).
Just spitballing here, but say you want to have %s be the string, %S be the repr, %p be the parent, and %P be the repr of the parent:
import os
from pathlib import Path, PosixPath

# Obviously incomplete implementation.
def format(self, spec):
    return (
        spec.replace("%s", os.fspath(self))
        .replace("%S", repr(os.fspath(self)))
        .replace("%p", os.fspath(self.parent))
        .replace("%P", repr(os.fspath(self.parent)))
    )

# Pay no attention to the man behind the curtain.
PosixPath.__format__ = format

p = Path("/foo/bar")
print(f"{p:result %S directory %P}")
Would print “result '/foo/bar' directory '/foo'”.
Of course in f-strings this is pointless, it’s in str.format calls where it’s powerful.
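A sketch of the str.format case, where the template is data rather than source code. TemplatePath is a hypothetical subclass used here instead of patching PosixPath, and the %s/%S/%p/%P codes are the spitballed ones from this thread, not an existing API.

```python
import os
from pathlib import PurePosixPath


class TemplatePath(PurePosixPath):
    """Hypothetical subclass for illustration; not part of pathlib."""

    def __format__(self, spec: str) -> str:
        if not spec:
            return str(self)
        # Naive chained replacement, as in the incomplete version above.
        return (
            spec.replace("%s", os.fspath(self))
            .replace("%S", repr(os.fspath(self)))
            .replace("%p", os.fspath(self.parent))
            .replace("%P", repr(os.fspath(self.parent)))
        )


# The template could come from a configuration file.
template = "result {path:%S} directory {path:%P}"
message = template.format(path=TemplatePath("/foo/bar"))
```

One caveat with chained replace calls: a path that itself contains one of the codes would get rewritten by a later replacement, so a real implementation would want a single-pass substitution.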
Yes, that’s another option (for another thread) to deal with some of the issues. But adding Path.__format__ might still be useful. I can see wanting to call repr(os.fspath(obj)) instead of repr(str(obj)) to catch things like None, where the second would succeed and the first wouldn’t.
Or rather, format(obj, "%s") would fail on None but succeed on a Path, assuming "%s" was valid for a Path object. It’s the same reason to use f"{obj:2d}" instead of f"{obj!s:>2}" if you know obj will always be an int.
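The difference between the two spellings can be seen with today’s functions. The helper name quoted is made up for this example.

```python
import os
from pathlib import PurePosixPath


def quoted(obj) -> str:
    # os.fspath() only accepts str, bytes, or os.PathLike objects.
    return repr(os.fspath(obj))


ok = quoted(PurePosixPath("/foo/bar"))

# str() accepts anything, so a stray None silently becomes 'None'.
lenient = repr(str(None))

try:
    quoted(None)
    rejected = False
except TypeError:
    rejected = True
```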
The datetime format works that way to mirror C’s strftime. It’s useful for C where concatenating strings is a pain, but in Python you can easily do:
I just found out about this problem today. I am trying to do:
self._hash_dir = self._get_dir(kind='hash')
for source in Path(self._out_path).glob('*'):
    if source == self._cache_dir:
        continue
    log.debug(f'{source:<50}{"-->"}{self._hash_dir}')
and expecting the formatting to work with a path from pathlib, but it turns out that it does not. From the user’s point of view, we expect paths to work like normal strings for things like formatting, and I believe the corresponding functionality should be added.
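For reference, a minimal reproduction with a made-up path. Path objects inherit object.__format__, which rejects any non-empty format spec with a TypeError. Converting to a string first, either with !s or with str(), makes the alignment spec work.

```python
from pathlib import PurePosixPath

source = PurePosixPath("out/data.bin")

try:
    f"{source:<20}"
    spec_supported = True
except TypeError:
    # object.__format__ refuses non-empty format specs.
    spec_supported = False

# Both spellings apply the spec to the string form of the path.
via_conversion = f"{source!s:<20}|"
via_str = f"{str(source):<20}|"
```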
Cheers.
Edit: I just realized (because of the exception I just saw) that even this line:
if source == self._cache_dir:
does not work. That is, the comparison between a pathlib path and a string is not implemented or just does not work. I cannot think of any reason why the underlying string path would not be comparable to a string, besides the fact that it was just neglected. But maybe I am missing something?
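A quick check of what the comparison does, with made-up values: comparing a path object to a plain string does not raise, it simply evaluates to False, so the continue branch would never be taken. Converting one side makes the comparison behave as intended.

```python
from pathlib import PurePosixPath

source = PurePosixPath("out/cache")
cache_dir = "out/cache"  # a plain string, as in the snippet above

mixed = source == cache_dir  # path vs. str: False, no exception
as_paths = source == PurePosixPath(cache_dir)
as_strings = str(source) == cache_dir
```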
Having new and shiny libraries is amazing, but if they are incomplete and cause trouble for the user, will the user adopt them? Or the user might just build his/her own stuff that works for them.
I did not know about samefile. However, I do not think that is a good way of doing this. source is already a path, and defining the __eq__ dunder to mean the same as samefile does seem natural.
Here, for a path object, being equal to some string would mean:
That string is also a path
That path is the same as source
That is natural and intuitive. Otherwise you are forcing the user to go and read the documentation instead of just using their intuition.
I would always rather design tools that are intuitive and require as little documentation as possible.
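To show how the two notions currently differ, here is a small sketch using a temporary directory with made-up file names. == compares the path components without touching the filesystem, while samefile asks the operating system whether both paths resolve to the same file.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    direct = Path(tmp) / "data.txt"
    direct.write_text("x")
    (Path(tmp) / "sub").mkdir()

    # A different spelling of the same file; pathlib does not collapse "..".
    indirect = Path(tmp) / "sub" / ".." / "data.txt"

    lexically_equal = direct == indirect  # purely lexical comparison
    same_file = direct.samefile(indirect)  # consults the filesystem
```

One consequence of folding samefile into ==: every comparison would need filesystem access and could fail for paths that do not exist.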