Getting syntax errors in docstrings, how to fix?

Windows 10 with Python 3.12 in a virtual environment.

I have a docstring at the beginning of my program, before any imports or code. The test program below reproduces the error:

'''This is a test program. Put file in c:\users\me\stuff. 
Regex is stuff/morestuff/print\d+
'''

I get a syntax error:

File "C:\Users\me\OneDrive - coname\Documents\PythonProjects\blah\Buscard\testdoc.py", line 1
    '''
    ^^^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 15-16: truncated \uXXXX escape


I assumed docstrings were treated like comments, but I was wrong. How do I get rid of this error? I have Windows paths and regexes containing backslashes documented in all kinds of programs.

I was using docstrings as comments. Does Python have true multi-line comments where I won’t get a syntax error when I type in a Windows directory with backslashes, or when I type in regex metacharacters?

EDIT: I actually need multi-line comments, because the comments at the beginning of the program document many things all in one place, and they tend to change early in the programming process. Getting the end user to decide on specs and stick with them is not possible.

In Perl I enclosed this documentation in POD documentation blocks.

Got it! I just turn the docstring into a raw string.

r'''
This is a multiline comment. 
Use file in c:\users\me\python\stuff\
Regex is ahh\d+
'''

Sorry to be a bother, but this may help someone else. So many other articles I just read say the docstring can be used for multi-line comments, but they don't mention that backslash escapes ARE interpreted by Python and may generate a syntax error. This may not cause a problem with Unix paths, but it does cause problems with Windows paths and regex metacharacters like \d.

It also helps to write Windows paths with forward slashes instead of backslashes. Python seems fine with that: c:/user/me/python/data.data.
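A small sketch of why forward slashes are safe, assuming the standard-library pathlib module is acceptable here (the variable names are only for illustration). PureWindowsPath applies Windows path rules on any operating system, so both spellings describe the same path:

```python
from pathlib import PureWindowsPath

# Forward slashes need no escaping in an ordinary string literal.
forward = PureWindowsPath("c:/users/me/stuff")

# The backslash spelling needs a raw string to avoid the \u escape.
backward = PureWindowsPath(r"c:\users\me\stuff")

same = forward == backward
print(same)          # True
print(str(forward))  # c:\users\me\stuff
```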


They’re string literals, not comments. If you want comments, use comments, and they won’t be parsed. Alternatively, just write ALL your paths using forward slashes, since that will work just fine: c:/users/me/stuff. No errors here.

If they were treated like comments, then there would be no reason for them to exist separately from comments. The entire point of a comment is that it cannot affect the meaning of the code at runtime; but docstrings do have a runtime effect (they add the __doc__ attribute to functions, classes and modules, which can be found using the built-in help system).
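To make the runtime difference concrete, here is a minimal sketch with two hypothetical functions, one documented with a docstring and one with a comment:

```python
def greet():
    """Return a friendly greeting."""
    return "hello"


def silent():
    # This is a comment; it leaves no trace at runtime.
    return "hello"


print(greet.__doc__)   # Return a friendly greeting.
print(silent.__doc__)  # None
```

The docstring is stored on the function object, while the comment is discarded before the code runs.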

As the name suggests, docstrings are treated like strings (that are also automatically turned into those attributes). That means they have the same syntax requirements, and same flexibility, as any other string. The triple-quote style is not specific to docstrings (but it does allow for multi-line strings, which is why they are normally used for docstrings). The source code between the opening and closing quote works the same way as in any other string literal: backslashes have the same meaning and semantics.

And, just like any other string, you can use the r prefix for raw strings if you prefer the raw-string handling for backslashes. (You can even use the f prefix - but please don’t; you should know exactly what the docstring should say, without needing to run any code to fill in the blanks.)
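As a sketch of the r prefix applied to a docstring (the function name load_data is made up for illustration), the backslashes below are kept literally instead of being read as escapes:

```python
def load_data():
    r'''Read input from c:\users\me\stuff.

    File names match the regex print\d+
    '''
    return None


# The docstring keeps every backslash exactly as typed.
print(load_data.__doc__)
```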

Python does not have any kind of multi-line comment. Aside from that, “docstrings” are only the strings that appear as the first statement in the function, class or module. Anything else is just a string literal that you create and then immediately throw away. It’s not a “comment” any more than 1 on a line by itself is a comment.
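A short illustration of that rule, using a hypothetical function: only the first string becomes the docstring, and the later one is evaluated and thrown away.

```python
def documented():
    """This first statement becomes the docstring."""
    x = 1
    """This later string is just an expression that is discarded."""
    return x


print(documented.__doc__)  # This first statement becomes the docstring.
print(documented())        # 1
```

The second string is still parsed as a string literal, so a stray \u inside it would fail in exactly the same way.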

“Regex meta characters” have nothing to do with this. Python’s regex support is at the library level, and the language has no built-in recognition of regex syntax. You get errors only because of the string literal syntax. It doesn’t matter if you were trying to write a regex, or trying to write a Windows file path, or anything else. It matters that (in this case) the code contains a backslash, followed by a lowercase u, followed by something that doesn’t properly complete that escape sequence (4 hex digits giving a Unicode code point below 65536). Or, in other cases, something else that’s invalid.
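One way to see this without crashing a script is to hand the offending source text to the built-in compile() and catch the error. This is only a sketch; the helper name compiles and the sample text are assumptions for illustration.

```python
# The sample source is kept in a raw string so this example itself parses.
bad_source = r"""'''Put file in c:\users\me\stuff.'''"""

# Adding the r prefix to the same literal turns it into a raw string.
good_source = "r" + bad_source


def compiles(source):
    """Return (True, "") if source compiles, else (False, error message)."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return False, str(exc)
    return True, ""


bad_ok, bad_message = compiles(bad_source)
good_ok, _ = compiles(good_source)

print(bad_ok, bad_message)
print(good_ok)
```

Only the \u in \users triggers the failure. Unrecognized escapes such as \d do not raise an error; in Python 3.12 they produce a SyntaxWarning.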

If you want to write multiple lines of comments, and have them treated semantically as comments, you have exactly one valid option: prefix each line with #.
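Applied to the header from the question, that could look like the sketch below. The regex check at the end is an added illustration, not part of the original program.

```python
# This is a test program. Put file in c:\users\me\stuff.
# Regex is stuff/morestuff/print\d+
#
# Nothing on these lines is parsed, so the backslashes are harmless.

import re

# In real code the pattern still needs a raw string (or doubled backslashes).
pattern = re.compile(r"print\d+")
match = pattern.search("stuff/morestuff/print42")
print(match.group())  # print42
```

Most editors can comment or uncomment a selected block in one keystroke, which helps when the header changes often.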

Whoever wrote that docstrings can be used for multi-line comments was irresponsible. A docstring exists to provide documentation. Other string literals exist to describe strings.


Thank you all. I’m still learning, and memorizing, the finer points of Python. I’m still a newbie though.
