Please don't break invalid escape sequences

Usually I’m one of the strongest advocates of “don’t break things, don’t break things”, but this is a case where any code that’s broken by this change was already broken. The “and/or” example shouldn’t ever be an issue anyway; the correct usage there is a forward slash, as you’ve used here, and writing “and\or” is simply incorrect grammar.

Windows paths are the most common case, though, and the issue there is that you get data-dependent bugs: whether the code works depends on which letters happen to follow each backslash. Consider how current versions of Python behave:

>>> print("C:\Documents\Text File.txt")
<unknown>:1: SyntaxWarning: invalid escape sequence '\D'
C:\Documents\Text File.txt
>>> print("c:\documents\text file.txt")
<unknown>:1: SyntaxWarning: invalid escape sequence '\d'
c:\documents	ext file.txt
>>> print("C:\Users\DefaultUserName")
  File "<python-input-5>", line 1
    print("C:\Users\DefaultUserName")
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
>>> print("C:\Documents\" + dirname + "\" + basename + ".txt")
<unknown>:1: SyntaxWarning: invalid escape sequence '\D'
  File "<python-input-7>", line 1
    print("C:\Documents\" + dirname + "\" + basename + ".txt")
                                        ^
SyntaxError: unexpected character after line continuation character

All of these different results come from the same bug: unescaped backslashes in a string literal.
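If you want to poke at this without tripping over your own literals, here is a rough sketch that compiles each literal from a string and reports what the compiler says about it. The helper name `escape_diagnostics` is my own invention, not anything from the standard library, and the exact warning category and message wording vary between Python versions, so it deliberately doesn't depend on them.

```python
import warnings


def escape_diagnostics(source: str) -> list[str]:
    """Compile `source` (a Python expression) and collect what the compiler reports.

    Hypothetical helper for illustration only. Returns one entry per warning
    raised during compilation, or a single "SyntaxError: ..." entry if the
    source does not compile at all.
    """
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones that were already shown once.
        warnings.simplefilter("always")
        try:
            compile(source, "<example>", "eval")
        except SyntaxError as exc:
            return [f"SyntaxError: {exc.msg}"]
    return [f"{w.category.__name__}: {w.message}" for w in caught]


# Raw strings are used here so the backslashes reach compile() untouched.
samples = [
    r'"C:\Documents\Text File.txt"',  # \D and \T are not valid escapes
    r'"c:\text file.txt"',            # \t is valid, so nothing is reported
    r'"C:\Users\DefaultUserName"',    # \U starts a Unicode escape
]

for sample in samples:
    print(sample, "->", escape_diagnostics(sample) or "no diagnostics")
```

The middle case is the nasty one: it compiles cleanly and silently puts a tab in the path.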

Further: If Python doesn’t make this an error, code may break inexplicably in a future update. Imagine if a future Python version were to add "\e" as an escape sequence equivalent to "\x1b" (very useful when writing ANSI colour codes, and found in many other languages). Now any program that uses "c:\everything" will be broken.

Academics have to learn the bare minimum of Python syntax. You already need to understand that "c:\text file" won’t work. Why is it a problem that "C:\Text File" also won’t work?

The solutions are extremely easy, too. My preferred recommendation is simply to use forward slashes! Everyone understands those, and they have no problems on different OSes. Raw string literals work too, but they introduce the edge case that r"C:\Path\to\directory\" + filename won’t work, because a raw string literal can’t end with a single backslash, and IMO that’s annoying enough to not want to have to explain it to people. Or just double all the backslashes, since string literals are not raw text in a program.
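For anyone who wants to see the three options side by side, here is a small sketch. The path and variable names are just for illustration, and I'm using `PureWindowsPath` from `pathlib` so that it behaves the same on any OS.

```python
from pathlib import PureWindowsPath

# Three spellings of the same Windows path.
forward = "C:/Documents/Text File.txt"        # forward slashes
raw = r"C:\Documents\Text File.txt"           # raw string literal
doubled = "C:\\Documents\\Text File.txt"      # doubled backslashes

# Raw and doubled literals produce identical strings.
assert raw == doubled

# The forward-slash spelling is a different string, but names the same path.
assert PureWindowsPath(forward) == PureWindowsPath(raw)

# The raw-string edge case: a raw literal cannot end in a single backslash.
# The source text is compiled from a string so this file itself stays valid.
try:
    compile('r"C:\\Documents\\"', "<example>", "eval")
    trailing_backslash_ok = True
except SyntaxError:
    trailing_backslash_ok = False

# Letting pathlib insert the separator avoids the trailing backslash entirely.
directory = r"C:\Documents"
filename = "Text File.txt"
joined = str(PureWindowsPath(directory) / filename)
print(joined)
```

Joining with `/` on a path object sidesteps the question of how to write the separator at all, which is one more reason I wouldn't bother teaching the trailing-backslash workaround.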

I’d like to suggest that we stop trying to support broken code that works only by accident, since it is even more painful to fix when it suddenly stops working.
