Comment string literals (c-strings)

There are a lot of people that are looking for multiline comments:

Most commonly multi-line strings are used for this:

print('Hello, world!')
"""The previous statement
   prints 'Hello, world!'."""

But pylint (rightfully) complains about this:
test.py:2:0: W0105: String statement has no effect (pointless-string-statement)
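
For anyone curious why a linter can flag this at all: the bare string is an ordinary expression statement, not a comment. Below is a minimal sketch of such a check using the standard `ast` module. The helper name `pointless_strings` is made up for illustration, and this is only an approximation of what pylint does (it looks at `body` lists only, not at `else` or `finally` blocks).

```python
import ast

DOCSTRING_OWNERS = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def pointless_strings(source):
    """Return line numbers of bare string statements that are not docstrings.

    Hypothetical helper: a rough approximation of a linter check.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        body = getattr(node, "body", None)
        if not isinstance(body, list):
            continue  # e.g. a lambda, whose body is a single expression
        for index, stmt in enumerate(body):
            is_string = (
                isinstance(stmt, ast.Expr)
                and isinstance(stmt.value, ast.Constant)
                and isinstance(stmt.value.value, str)
            )
            # The first statement of a module, class or function is a docstring.
            is_docstring = isinstance(node, DOCSTRING_OWNERS) and index == 0
            if is_string and not is_docstring:
                hits.append(stmt.lineno)
    return sorted(hits)


example = '''print('Hello, world!')
"""The previous statement
   prints 'Hello, world!'."""
'''
print(pointless_strings(example))
```

For the example above this reports line 2, the same line pylint points at.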

Additionally, they can’t be used midline or where docstrings are expected:

"""Hello, I'm
   a docstring."""
a = '1' + '2' + '3' + """This will be included in the string.""" \
    '4' + '5'
for i in range("""stop/"""5):  # SyntaxError: invalid syntax. Perhaps you forgot a comma?
    print(i)
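
The three cases above can be checked without a linter. This is a small sketch using only `ast`, `exec` and `compile` from the standard library; the variable names are just for illustration.

```python
import ast

# The first two statements from the example above, kept in a string so
# they can be parsed and executed separately.
source = '''"""Hello, I'm
   a docstring."""
a = '1' + '2' + '3' + """This will be included in the string.""" \\
    '4' + '5'
'''

# A string in first position is picked up as the module docstring.
docstring = ast.get_docstring(ast.parse(source))

# The "midline comment" is concatenated with the adjacent literal '4'.
namespace = {}
exec(source, namespace)
a = namespace["a"]

# A string placed next to a number is rejected by the parser.
try:
    compile('for i in range("""stop/"""5):\n    print(i)\n', "<example>", "exec")
    loop_compiles = True
except SyntaxError:
    loop_compiles = False

print(repr(docstring), repr(a), loop_compiles)
```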

What if we had raw string literals that the compiler ignores when they are prefixed with 'c'?

c"""Hello, I am a 
    multiline comment."""
a = '1' + '2' + '3' + c"This will not be included in the string." \
    '4' + '5'
for i in range(c"stop/"5):
    print(i)

Parsing these shouldn’t be too difficult, and users can easily convert their multi-line strings to c-strings by prefixing them with a 'c' (or by replacing the 'r' prefix).

I already saw that tweet, but thanks for linking it for people that haven’t.

I have 2 problems with using multi-line strings for comments:

  1. They are not rendered as comments; this would be easy to achieve for c-strings.
  2. They can’t be used in all contexts; c-strings don’t have that restriction.

There are no multiline comments in Python (no matter what anyone says). If you want a comment that spans multiple lines, use one # on each line. If that’s annoying to do, then upgrade your editor.

This has been brought up many times by the way, so it’s probably worth searching old posts.

Incidentally, you should probably use parentheses instead of line continuations in your example, but this is a matter of style.

Thus, your example in conventional Python is:

# Hello, I am a 
# multiline comment.
a = ('1' + '2' + '3' # This will not be included in the string.
     + '4' + '5')
for i in range(5):  # 5 is the stop parameter.
    print(i)

Weird, I’m only finding a single topic:

Exactly! You should have considered the date of the tweet and realized that this topic has been discussed many times in the past.

I’m not saying you shouldn’t propose the idea, but you should document previous discussions.

What would these do?

a = 1 + c"Comment" * 3
print(c"Comment", 1)
sum([], start=c"Comment")

Either they’re strings or they’re comments. If you want them to behave like /* ... */ comments in C-like languages, don’t make them look like strings.

I think 2 is a lot! And I think there are even more if you search the old python-ideas mailing list archives.

Also this proposal is practically the same as the /* */ proposal, but with c" or c""" and " or """ as delimiters.

With the c-strings ignored, that is equivalent to this:

a = 1 +* 3 # SyntaxError: invalid syntax
print(, 1) # SyntaxError: invalid syntax
sum([], start=) # SyntaxError: expected argument value expression

So they have nothing whatsoever to do with string literals. Why abuse string literal notation for them? Why not just propose the introduction of /* */ comments? At least people (and tools) will know what to expect.

Correction, this is not related to multi-line comments.

Fair enough, but I would say the midline comment is clearer.

In theory it would be easier to parse, as you can just reuse the string parsing logic.
You can also easily convert strings used as comments to actual comments (no escaping is required).

Explaining this to Python users would be simple: they’re parsed the same way as r-strings, but ignored by the compiler. Just add a 'c' prefix to the strings you used as block comments.

It’s harder to correctly parse string literals than to parse comments. There are three popular types of comment syntax, all very easy to parse:

  1. A token meaning “comment till end of line”. Python’s hash comments are of this form, as are SQL’s double-hyphens. Once a comment has started, the only thing you need to look for is a line ending. (There may be an exception for an escaped end-of-line but that’s all.)
  2. Non-nestable comments. Once you find the start-comment token (eg /*), the only thing you need to search for is the end-comment token (eg */). Extremely simple to parse. Does not care about line endings or anything. Has the consequence that a comment inside a block of code to be commented-out will break it.
  3. Nestable comments. From the start-comment token, you need to search for two things: another start-comment sequence, or the end-comment sequence. (Commonly /* and */ again.) This has the consequence that a start-comment sequence inside a string literal will break the comment (thus leading to stylistic choices like concatenating two string literals just to avoid having that happen).
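
To give a feel for how little logic types 2 and 3 need, here is a rough sketch of a scanner for `/* */` comments. The function `strip_block_comments` and its `nestable` flag are invented for this illustration. It deliberately ignores string literals, so it has exactly the weaknesses described above.

```python
def strip_block_comments(text, nestable=False):
    """Remove /* ... */ comments from text (hypothetical helper).

    With nestable=False the first */ ends the comment; with
    nestable=True every /* inside a comment must be closed as well.
    """
    out = []
    depth = 0
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair == "/*" and (depth == 0 or nestable):
            depth += 1
            i += 2
        elif pair == "*/" and depth > 0:
            depth -= 1
            i += 2
        else:
            if depth == 0:
                out.append(text[i])  # outside any comment: keep the character
            i += 1
    if depth:
        raise ValueError("unterminated comment")
    return "".join(out)


code = "a /* x /* y */ z */ b"
print(repr(strip_block_comments(code)))
print(repr(strip_block_comments(code, nestable=True)))
```

In the non-nestable case the inner `*/` ends the comment early and the trailing ` z */` leaks back into the code, which is the "commenting out a block that already contains a comment breaks it" problem.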

All three are far simpler than the complexities of string literals, which recognize a variety of escaping rules. For example, this is a valid string literal:

x = "start \\\
and end"
print(x)

With just two backslashes, it wouldn’t be valid.
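
If you want to try this without fighting your editor's escaping, the sketch below builds the source text programmatically and compiles it. `parse_assignment` and `BS` are names made up for this example.

```python
BS = chr(92)  # a single backslash, spelled this way to avoid escaping confusion


def parse_assignment(backslashes):
    """Compile 'x = "start <backslashes><newline>and end"' and return x."""
    source = 'x = "start ' + BS * backslashes + '\nand end"\n'
    namespace = {}
    exec(compile(source, "<example>", "exec"), namespace)
    return namespace["x"]


# Three backslashes: an escaped backslash followed by a backslash-newline,
# which continues the literal onto the next line.
three = parse_assignment(3)

# Two backslashes: just an escaped backslash, so the newline ends the line
# while the string is still open.
try:
    parse_assignment(2)
    two_is_valid = True
except SyntaxError:
    two_is_valid = False

print(repr(three), two_is_valid)
```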

But reusing string literal parsing would lead to unnecessary confusion. People do not expect string literal behaviour from comments or vice versa. You’re gaining nothing but confusion by this.

I honestly don’t think there’ll be a lot of support for a /* */ comment proposal, but it has a better chance than something based on string literals IMO.

Let me try to summarise the 2 threads on the mailing list:

  • Comment syntax similar to /* */ in C? (python-ideas mailing list)
    /*Hello, I am a 
      multiline comment*/
    
    • Not everyone likes midline comments (but some people do)
    • Unused string literals suffice:
      print('Hello, world!')
      """The previous statement
         prints 'Hello, world!'."""
      
  • Multi-line comment blocks. (python-ideas mailing list)
    #:
        Hello, I am a 
        multiline comment
    
    • Not backwards compatible (doesn’t apply here)
    • There are enough alternatives
    • Regular comments can be used to comment out code as often as you want:
      # def fun(a, b, c):
      #     """Docstrings are fine when commented"""
      #     pass
      #     # This is a nested comment.
      # And no need for an end-delimiter either.
      
    • Midline comments are more flexible, but finding use cases is hard
    • Some tools don’t handle unused string literals properly (because it’s hard)

I won’t deny that, but I was talking about the Python compiler and not about other tools.
I’ll note that highlighting unused string literals as comments is even harder than detecting c-strings, because they don’t have a special prefix.
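
As a rough illustration of that point, here is a sketch using the standard `tokenize` module. At the token level a string used as a comment looks exactly like any other string, while a `#` comment gets its own token type.

```python
import io
import tokenize

source = '''x = "a real string"
"""A string used as a comment."""
# A real comment.
'''

# Collect the token type names for strings and comments only.
kinds = [
    tokenize.tok_name[tok.type]
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type in (tokenize.STRING, tokenize.COMMENT)
]
print(kinds)
```

Both string literals come out as `STRING`, so a highlighter has to look at where the token sits in the syntax tree to decide whether it is being used as a comment.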

There are 2 solutions if this is a problem:

  1. Don’t allow single line c-strings
  2. Don’t allow line continuation for c-strings, but at that point you could just support /**/.

Unused strings are not comments.

What suffices for multi-line comments is:

# This
# and that.

I think the reason these proposals never take off is that no one can show a compelling need for another comment syntax.

That is incredibly hard to read. People currently expect comments at the right, and not mixed into code.

…sounds more like we should not allow these types of comments.

It has Guido’s approval; see his quote above.

Forget that last example then, there are better ways to comment out code.

You’re still looking at this the wrong way. Answer me this:

WHY should this proposal be about string literals?

It’s just an idea I had while lying in bed. I thought I had finally come up with a great proposal; I guess not.
What a waste of time.

Waste of time indeed, but perhaps not for the reasons you think.

For anyone else out there who has what they believe is a brilliant idea, here’s how to NOT waste everyone’s time:

  1. How is this idea similar to other things (usually other languages)? WHY is this correct for Python?
  2. How is this idea different from other things? WHY is this correct for Python?

There. Now you have the basis for your post.
