When using strings for paths, one can conveniently generate error or debugging messages using f"{pathname!r}" and get quoting and escaping for free. With a PosixPath, the same expression will (correctly) produce something like "PosixPath('/etc/passwd')".
Unfortunately, multiple format conversions (i.e. f"{path!s!r}" or f"{path!sr}"), which would be a possible shortcut, are not allowed.
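For anyone who wants to check this, here is a small sketch. It compiles the two spellings from strings (so the script itself still parses) and then shows the usual workaround of converting explicitly before applying !r. The helper name `compiles` is mine, purely for illustration.

```python
from pathlib import PurePosixPath


def compiles(source):
    """Return True if `source` is a valid Python expression (hypothetical helper)."""
    try:
        compile(source, "<demo>", "eval")
    except SyntaxError:
        return False
    return True


# Chained or combined conversions are rejected by the parser.
print(compiles('f"{path!s!r}"'))
print(compiles('f"{path!sr}"'))

# Converting explicitly and then applying !r is accepted.
path = PurePosixPath("/etc/passwd")
message = f"{str(path)!r}"
print(message)
```

PurePosixPath is used here only so the snippet behaves the same on any platform.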
This hints at the possibility of Path-specific format options for escaping, which could cover common use cases besides tidy user messages:
shlex.quote equivalent (shlex.quote("'") → '\'\'"\'"\'\''), convenient when building shell constructs
ls equivalent (' becomes '"\'"'), convenient when mentioning filenames in messages
repr(str(path)) equivalent that always adds the quotes, convenient when one always wants quotes around filenames in messages
So, assuming the three examples above got format codes like :s, :q, :Q, you could do things like:
cmd = ["open", path]
subprocess.run(cmd)
log.info("Ran command: %s", ' '.join(f"{c:s}" for c in cmd))
print(f"Opening file {path:q}...")
raise RuntimeError(f"Input file {path:Q} has the wrong size")
While there is nothing here we couldn’t do without, it seems like a way to encourage cleaner ways of dealing with the unexpected that can be found in filesystem paths.
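To make the idea concrete, here is a rough sketch of how two of the proposed codes could be prototyped today with a subclass. Everything about it is an assumption: `QuotingPath` is a made-up name, and `:s` and `:Q` are the codes proposed in this thread, not anything pathlib supports. I left out `:q` because the quoting rules used by ls are more involved than a few lines can capture.

```python
import shlex
from pathlib import PurePosixPath


class QuotingPath(PurePosixPath):
    """Hypothetical subclass; 's' and 'Q' are proposed codes, not pathlib features."""

    def __format__(self, spec):
        text = str(self)
        if spec == "s":
            # Shell-safe form: shlex.quote() adds quotes only when needed.
            return shlex.quote(text)
        if spec == "Q":
            # Always quoted and escaped, same as repr(str(path)).
            return repr(text)
        # Anything else falls through to ordinary str formatting.
        return format(text, spec)


path = QuotingPath("/tmp/my file.txt")
print(f"Shell form: {path:s}")
print(f"Input file {path:Q} has the wrong size")
```

An empty format spec still gives the plain string, so f"{path}" behaves as before.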
>>> import pathlib
>>> x = pathlib.Path('.bashrc')
>>> x
PosixPath('.bashrc')
>>> str(x)
'.bashrc'
>>> print(f'This is a path {str(x)} repr')
This is a path .bashrc repr
>>> print(f'This is a path {str(x)!r} repr')
This is a path '.bashrc' repr
I no longer rely on !r for putting quotes around things in messages. If you want quotes around your path in a string, put them there explicitly. The code remains just as simple, and the intent becomes obvious.
raise UsageError(f"The destination '{path}' must already exist.")
Neither of the other quoting options seems like it should be in output strings. You’re typically displaying the string to a user, who will be confused by the extra escape characters (which are shell specific, not Python specific). They can always use the shell’s own quoting if they want to paste it: command 'path with spaces'. Python’s subprocess doesn’t need that quoting either; it already handles ["command", "path with spaces"] correctly.
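A quick sketch of that last point. The child process is just the current Python interpreter echoing its first argument (my choice, so the example runs anywhere), and shlex.join() is shown as one existing way to get a copy-pasteable line when you do want shell quoting in a log.

```python
import shlex
import subprocess
import sys

# The child process simply echoes its first argument back.
cmd = [sys.executable, "-c", "import sys; print(sys.argv[1])", "path with spaces"]

# No quoting needed: each list item reaches the child as one argument.
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
print(result.stdout.strip())

# For a log line that can be pasted into a POSIX shell, shlex.join() quotes as needed.
print(shlex.join(["open", "path with spaces"]))
```

Note that shlex.join() expects strings, so Path objects would need str() applied first.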
Somehow I’ve avoided learning about __format__() all these years, so take this reply with a large pinch of salt…
I’m open to the idea of a formatting shortcut for repr(str(path)), which is a common incantation for logging or raising exceptions. For simple filenames it produces a user-friendly result like 'foo.txt', but for filenames that involve e.g. newlines or quote characters it still produces something reasonable and unambiguous.
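For illustration, this is what the incantation gives for one plain name and two awkward ones. The filenames are made up.

```python
from pathlib import PurePosixPath

# One ordinary filename, one with a newline, one with a quote character.
names = ["foo.txt", "two\nlines.txt", "it's.txt"]

shown = [repr(str(PurePosixPath(name))) for name in names]
for text in shown:
    print(text)
```

The newline comes out as an escaped \n on a single line, and the name containing a single quote is wrapped in double quotes instead.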
I’m less sold on a shortcut for shlex.quote() or the like. It’s too niche to belong in pathlib IMO.
Q: does this apply to pathlib or os.PathLike more generally? I wonder if we could add two presentation types for path-like objects, like this:
We’d need to decide if we want something like str and int formatting, which have a strict set of codes and nothing else, or something more like datetime’s formatting, with some replacement codes, and everything else shows up in the output verbatim. Personally, I like datetime’s take on it, especially if the format string will be user provided (say, in a logging configuration file).
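The difference between the two styles is easy to see with existing types. This is only a sketch of current int and datetime behaviour, with an arbitrary date picked for the example.

```python
from datetime import datetime

when = datetime(2024, 1, 2, 3, 4)

# datetime style: known codes are replaced, everything else is kept verbatim.
loose = format(when, "on %Y-%m-%d at %H:%M")
print(loose)

# int style: anything outside the fixed mini-language is an error.
strict_error = None
try:
    format(42, "on %d")
except ValueError as exc:
    strict_error = exc
    print("rejected:", exc)
```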
Just spitballing here, but say you want to have %s be the string, %S be the repr, %p be the parent, and %P be the repr of the parent:
import os
from pathlib import Path, PosixPath

# Obviously incomplete implementation.
def format(self, spec):
    return (
        spec.replace("%s", os.fspath(self))
        .replace("%S", repr(os.fspath(self)))
        .replace("%p", os.fspath(self.parent))
        .replace("%P", repr(os.fspath(self.parent)))
    )

# Pay no attention to the man behind the curtain.
PosixPath.__format__ = format

p = Path("/foo/bar")
print(f"{p:result %S directory %P}")
Would print “result '/foo/bar' directory '/foo'”.
Of course in f-strings this is pointless, it’s in str.format calls where it’s powerful.
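Here is a sketch of the str.format case, where the template could come from a config file. It is a variation on the idea above, not a proposal for the real implementation: `TemplatedPath` is a made-up subclass (to avoid monkey-patching), and it substitutes in a single regex pass so that a path which itself contains something like "%P" is not expanded a second time.

```python
import os
import re
from pathlib import PurePosixPath


class TemplatedPath(PurePosixPath):
    """Hypothetical subclass using the %s/%S/%p/%P codes suggested above."""

    def __format__(self, spec):
        if not spec:
            return str(self)
        values = {
            "%s": os.fspath(self),
            "%S": repr(os.fspath(self)),
            "%p": os.fspath(self.parent),
            "%P": repr(os.fspath(self.parent)),
        }
        # Single pass: text inserted for one code is never rescanned.
        return re.sub(r"%[sSpP]", lambda m: values[m.group(0)], spec)


# Imagine this template was read from a logging configuration file.
template = "{path:result %S directory %P}"
line = template.format(path=TemplatedPath("/foo/bar"))
print(line)
```

Unknown text in the spec passes through verbatim, which is the datetime-like behaviour discussed earlier.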
Yes, that’s another option (for another thread) to deal with some of the issues. But adding Path.__format__ might still be useful. I can see wanting to call repr(os.fspath(obj)) instead of repr(str(obj)) to catch things like None, where the second would succeed and the first wouldn’t.
Or rather, format(obj, "%s") would fail on None but succeed on a Path, assuming "%s" was valid for a Path object. It’s the same reason to use f"{obj:2d}" instead of f"{obj!s:>2}" if you know obj will always be an int.
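A small sketch of the difference, using only existing behaviour. The helper name `quoted` is hypothetical.

```python
import os
from pathlib import PurePosixPath


def quoted(obj):
    """Hypothetical helper: quote a path-like object, rejecting anything else."""
    return repr(os.fspath(obj))


print(quoted(PurePosixPath("/foo/bar")))  # a real path is fine
print(repr(str(None)))                    # str() happily accepts None

try:
    quoted(None)
except TypeError as exc:
    print("rejected:", exc)

# The int analogy: the 'd' code insists on a number, while !s accepts anything.
print(f"{7:2d}", f"{'x'!s:>2}")
try:
    format("x", "2d")
except ValueError as exc:
    print("rejected:", exc)
```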
The datetime format works that way to mirror C’s strftime. It’s useful for C where concatenating strings is a pain, but in Python you can easily do: