/* C-like multiline comments */

Zezombye · October 9, 2024, 11:54pm

Using multiline strings as multiline comments, as is commonly done, has several disadvantages:

Multiline strings cannot be used as comments in several places.

Middle of array (or function call, or tuple…):

some_array = [
  1,  
  2,
  """3,
  4,"""
  5,
  6
]

-> SyntaxError (no comma after string)

some_array = [1,2,3,4,5,6] #it is impossible to comment out 3,4 without unnecessarily splitting the array into new lines

Between if/elif/else (elif raises SyntaxError when preceded by a string statement · Issue #108272 · python/cpython · GitHub)

if 1:
	foo = 2
"""
This is very important.
Keep it.
"""
else:
	foo = 3

-> SyntaxError (the string reset the indentation)
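Both failures can be checked without running the snippets, by handing them to the built-in compile(). This is only a sketch: compiles() is a helper name made up for the illustration, and the sources are shortened versions of the two examples above, next to the equivalent # versions that do parse.

```python
def compiles(source: str) -> bool:
    """Return True if `source` parses as a Python module."""
    try:
        compile(source, "<demo>", "exec")
    except SyntaxError:
        return False
    return True


# Case 1: a string literal inside a list is an element, not a comment.
list_with_string = 'some_array = [\n  1,\n  2,\n  """3,\n  4,"""\n  5,\n  6\n]\n'
list_with_hashes = "some_array = [\n  1,\n  2,\n  # 3,\n  # 4,\n  5,\n  6\n]\n"

# Case 2: a string statement at column 0 ends the `if` block before `else`.
if_with_string = 'if 1:\n    foo = 2\n"""\nKeep it.\n"""\nelse:\n    foo = 3\n'
if_with_hashes = "if 1:\n    foo = 2\n# Keep it.\nelse:\n    foo = 3\n"

results = {
    "list_with_string": compiles(list_with_string),
    "list_with_hashes": compiles(list_with_hashes),
    "if_with_string": compiles(if_with_string),
    "if_with_hashes": compiles(if_with_hashes),
}
print(results)
```

The two string variants are rejected and the two # variants are accepted, because hash comments are dropped before the parser looks at commas or indentation.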

Raw strings must be used when the comment contains an escape.

See: Python fails to parse triple quoted (commented out) code · Issue #75436 · python/cpython · GitHub, Unicode errors occur inside of multi-line comments · Issue #73471 · python/cpython · GitHub

'''
log = list(r'..\Unknown\*.txt')
'''

-> SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes: truncated \UXXXXXXXX escape

It sometimes also emits a SyntaxWarning for unrecognized escapes.
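A small sketch of the difference the r prefix makes, again using compile() so nothing is executed. syntax_error_of() is a hypothetical helper written for this post, and the commented-out line is the one from the example above.

```python
def syntax_error_of(source: str):
    """Return the SyntaxError message for `source`, or None if it compiles."""
    try:
        compile(source, "<demo>", "exec")
    except SyntaxError as exc:
        return exc.msg
    return None


# The line being "commented out"; it contains \U, which starts a unicode escape.
commented_out = "log = list(r'..\\Unknown\\*.txt')"

plain_block = "'''\n" + commented_out + "\n'''\n"   # ordinary triple-quoted string
raw_block = "r'''\n" + commented_out + "\n'''\n"    # same text in a raw string

print(syntax_error_of(plain_block))  # an error message about the \U escape
print(syntax_error_of(raw_block))    # None: the raw string is accepted
```

So the "comment" only works if you remember to switch to a raw string whenever the commented-out code happens to contain a backslash.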

Multiline strings are not rendered as comments by syntax highlighters, which makes it more difficult to parse the code and skim over the parts that are comments.

For all intents and purposes, Python already has multiline comments (they are even officially parsed as docstrings in PEP 8 – Style Guide for Python Code | peps.python.org). It’s just that currently, those multiline comments sometimes fail in various ways and sometimes can’t even be used.

Having specific syntax for multiline comments would allow for them to be used anywhere and not be a hack.

Additional requests:

Have /** denote the start of a docstring, which is used the same way as current docstrings and is stored in the __doc__ attribute.
Have multiline comments be able to be nested. This is unlike other C-like languages, but it makes it easier to comment out code that already has multiline comments in it.

bschubert · October 9, 2024, 11:59pm

Previous proposal for /*-style comments:

Some other somewhat related comment proposals:

Rosuav · October 10, 2024, 12:01am

So, what you’re saying is: 1. Triple quoted strings are not comments. 2. Triple quoted strings are not comments. 3. Triple quoted strings are not shown as comments.

This should not be a surprise. They are indeed NOT comments. Docstrings are not comments. Triple-quoted strings in other contexts are not comments. It so happens that a string literal on its own does nothing, but that doesn’t make it a comment.

Multiline comments have been proposed many times over the years. What is different about this proposal? Have you answered any of the objections previously raised? Has anything changed since those proposals were made?

Zezombye · October 10, 2024, 12:16am

To my knowledge the only such proposal was the one that Brian linked above (I’ve searched the forum, the github issues and the PEPs and did not find anything else). That thread brought up issues with indentation (which could indeed maybe be a problem in some circumstances), discussed other syntax (irrelevant to my proposal), and a few objections:

“It is too easy to read the unadorned lines of code as actual code” - not an issue with syntax highlighting which is everywhere nowadays.
“The lexer will have increased complexity” - I don’t know anything about that but if the complexity is that big then that should be given as the official reason to not have multiline comments.
“It overloads already used operator symbols in completely new and incompatible ways” - I don’t think this is a good argument because that is already the case for existing symbols such as ** dict destructuring which has nothing to do with the power operator, or the | operator for match/case which has nothing to do with binary or.
“It can’t be applied to existing code if that code contains */” - Not an issue to me because this combination of characters very rarely occurs anyway. That same argument can be applied to multiline strings as well.
“Comments in the middle of a line or logical statement are terrible things to do” - I don’t see why.
“It would be a massive amount of churn for all parsers and highlighters” - I can’t contradict that, but I don’t see why adding two tokens to keep track of is a massive amount of churn.

Are there any other topics that already discussed the subject? I was indeed surprised to not see any.

Rosuav · October 10, 2024, 12:23am

Yes, because… they’re not comments. This consideration does not apply to hash comments, which work just fine on lines containing hash characters.

The question of whether /* /* */ this part */ is a comment is one that has been debated at length, and there’s no single right answer. Both answers are somewhat wrong. And yes, I’ve used languages that have made both choices, and have been frustrated by running into their respective issues.

But the biggest question is: Why do it? “Because triple quoted strings make bad comments” isn’t a good enough answer. Why do we need slash-star comments?

MegaIng · October 10, 2024, 12:25am

These are just two I instantly found via quick search in the mailing list, I am sure there are more if you try different terms. The fact that the discourse proposal is called “Yet Another Multiline Comment Proposal” should have been an indicator for you that you are missing something.

storchaka · October 10, 2024, 9:11am

So you have syntax highlighting everywhere that supports hypothetical multiline comments, but no editor that supports commenting/uncommenting multiple lines at once?

jcampbell05 · October 10, 2024, 1:27pm

So one of the main issues people are hoping to solve is having multiline comments that don’t need to care about indentation as much as docstrings do, since docstrings are part of the code: they are still strings from Python’s point of view.

The simplest implementation I’ve seen that doesn’t involve implementing C-style multiline comments is in Lua.

In Lua a single-line comment is declared using --. If this comment starts with [[ it tells the parser to stop parsing the remaining tokens until it sees another comment beginning with ]].

From the parser’s point of view it’s a simpler thing to implement: increment a counter for each comment beginning with “[[” and decrement it for each comment beginning with “]]”; while the counter is greater than 0, ignore the following lines of code, otherwise process them.


Lua

--[[
Comment
--]]

Future Python Multiline comments

# [[
Commented out code
# ]]
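To make the counter idea concrete, here is a rough sketch written as a source preprocessor rather than a parser change. Everything in it is an assumption for illustration: "# [[" and "# ]]" are not Python syntax, strip_block_comments() is a made-up name, and a real implementation would live in the tokenizer.

```python
def strip_block_comments(source: str) -> str:
    """Blank out every line between '# [[' and '# ]]' markers (hypothetical syntax)."""
    depth = 0
    kept = []
    for line in source.splitlines():
        marker = line.strip()
        if marker.startswith("# [["):
            depth += 1
            kept.append("")          # keep line numbers stable
        elif marker.startswith("# ]]") and depth > 0:
            depth -= 1
            kept.append("")
        elif depth > 0:
            kept.append("")          # inside a block comment: ignore the line
        else:
            kept.append(line)        # normal code (or an ordinary # comment)
    return "\n".join(kept)


example = "a = 1\n# [[\nb = 2\n# [[\nc = 3\n# ]]\nd = 4\n# ]]\ne = 5"

namespace = {}
exec(strip_block_comments(example), namespace)
print(sorted(name for name in namespace if len(name) == 1))
```

Only a and e end up defined; b, c and d sit inside the (nested) block and are skipped. The sketch is purely line-based and knows nothing about string literals.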

Nineteendo · October 10, 2024, 4:57pm

That wouldn’t be backwards compatible and you can just let your IDE add a # to the start of each line.

alexrov · October 10, 2024, 9:11pm

This should not be a surprise. They are indeed NOT comments.

It sure does quack like a duck, though. It has always struck me as odd that Python does not have multi-line comments. Users end up utilizing multi-line strings as though they were comments, because there is nothing else to satisfy that use case. That reveals the deficiencies in using them as such. It’s clear that OP understands they are not comments, but that doesn’t change the fact that it’s still how they are used.

I think it’s fair to keep pushing for a feature that would be widely used, even with the failure of previous proposals. Although, I agree that the proposal should be substantively different or add new information or data in support of the request.

My main issue with C-style multi-line comments in particular is the first one addressed above: lack of clarity regarding whether a particular line is code or comment. Some people will optionally format their comments like this

/*
* First line
* Second line, etc.
*/

That immediately indicates that each line is part of a block comment. The problem is that now we’ve essentially reinvented single-line commenting, which Python already has with # comments. I don’t think Python should assume that someone is using an editor with syntax highlighting. That’s the only insurmountable issue I’m seeing. The other objections could be addressed by using a different symbol or combination of symbols to represent a block comment, or else those issues don’t appear to be compelling concerns as far as I understand.

But the biggest question is: Why do it? “Because triple quoted strings make bad comments” isn’t a good enough answer.

Well, how do people use comments?

To document their code
To “deactivate” old code without deleting it — which may result in loss of information or context.

The first case is sometimes more conveniently handled by multi-line comments when large blocks of text are required. This is especially the case when the documentation contains flow diagrams or ascii representations of non-word entities (perhaps diagrams or images). You wouldn’t want a character on each line disrupting the visibility of whatever you’re trying to represent. You can learn to live with it, as we all have, but it’s not ideal.

The second case is well-served by multi-line comments in any situation in which one needs to comment out large blocks of code. As @storchaka said, this is trivial on a lot of modern editors, but I really don’t think the language should assume that such an editor is being used. I might find myself having to write a script to comment out parts of my code if faced with having to insert a # on 3,293 individual lines. Not great. The same could be accomplished with 2-4 characters in other languages.
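For what it’s worth, such a script can be short. A minimal sketch, where comment_out() is a hypothetical helper and the line range is 1-indexed and inclusive:

```python
def comment_out(source: str, first: int, last: int) -> str:
    """Prefix lines first..last (1-indexed, inclusive) of `source` with '# '."""
    lines = source.splitlines()
    for index in range(first - 1, last):
        lines[index] = "# " + lines[index]
    return "\n".join(lines)


code = "a = 1\nb = 2\nc = 3"
print(comment_out(code, 2, 3))
```

It works, but it is still an extra tool to reach for, which is the point being made above.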

Anyway, I’ve written way more than I initially intended, but I think the concept at least merits consideration, even if the implementation is not appropriate. I don’t have the right answer.

MegaIng · October 10, 2024, 10:01pm

And most people just use multiline strings for documentation - in fact, Python explicitly supports using them for this purpose and various tools extend their usage for this even more (by allowing them on attributes as well).

And this is IMO the strongest argument against multiline comments. Because no, they don’t. You think they do, but you are wrong. They only work if your code doesn’t already contain a multiline comment. Oh, and it can’t contain */ in any string or single line comment.

A partial solution to this is to allow these comments to be nested so that /* /* */ */ is actually a complete single comment instead of ending at the first */. But that now requires these specific syntax constructs to always be balanced even if they only appear in strings (just in case you ever want to comment out that code), or you lose one of the two benefits you just described.

Of course multiline strings have this exact same problem, but single line comments don’t, so just use them.
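A toy scanner shows the balancing problem. This is a sketch of hypothetical nested /* */ handling, not anything Python does: strip_nested() is an invented name, and like any comment scanner it cannot know about string literals once it is inside a comment.

```python
def strip_nested(source: str):
    """Remove nested /* */ spans; return (remaining text, depth left unclosed)."""
    out = []
    depth = 0
    i = 0
    while i < len(source):
        pair = source[i:i + 2]
        if pair == "/*":
            depth += 1
            i += 2
        elif pair == "*/" and depth > 0:
            depth -= 1
            i += 2
        else:
            if depth == 0:
                out.append(source[i])   # only text outside comments survives
            i += 1
    return "".join(out), depth


balanced = "/* x = 1 /* inner */ y = 2 */z = 3"
unbalanced = '/* print("/* opened inside a string") */z = 3'

print(strip_nested(balanced))
print(strip_nested(unbalanced))
```

The first input leaves "z = 3" with depth 0. In the second, the /* inside the string literal opens a second level, the single */ only closes that one, and the rest of the file is swallowed with the comment still open.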

jcampbell05 · October 10, 2024, 10:07pm

Just out of interest, does Python optimise away the docstrings?
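One way to check, as a sketch against CPython: the built-in compile() takes an optimize argument, and level 2 corresponds to running with -OO. load_f() and the sample function are made up for the test.

```python
source = (
    "def f():\n"
    '    """Docstring for f."""\n'
    "    x = 1\n"
    '    """A bare string used as a comment."""\n'
    "    return x\n"
)


def load_f(optimize: int):
    """Compile `source` at the given optimization level and return f."""
    namespace = {}
    exec(compile(source, "<demo>", "exec", optimize=optimize), namespace)
    return namespace["f"]


normal = load_f(0)
stripped = load_f(2)

print(normal.__doc__)    # the docstring is kept at level 0
print(stripped.__doc__)  # None at level 2
print("A bare string used as a comment." in normal.__code__.co_consts)
```

So, at least in CPython, docstrings are only dropped at optimization level 2, while a bare string that is not in docstring position does not appear in the function’s constants even without optimization.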

Zezombye · October 10, 2024, 11:40pm

Thanks, I indeed didn’t search the mailing list. I searched for “multiline comments” and “multi-line comments” and didn’t find any other thread than those two.

My rebuttals to objections:

Take advantage of Python’s rules for handling open parentheses:

foobar(a, b, # comment on b
c, d, e, f)
foobar(a, b, # comment on b
# more and longer comment on b
 c, d, e, f)

This introduces a new line and makes the code less readable (and makes it more annoying to keep the indentation right when adding or deleting the comment). Most importantly, it does not work without parentheses, such as in if statements.

if a == 3 /*and b == 2*/:
    c = 4

If you want to comment out b == 2 to debug, then the only way is to add parentheses around the if statement:

if a == 3 \#and b == 2
:
	c = 4

-> SyntaxError: unexpected character after line continuation character

if (a == 3 #and b == 2
):
    c = 4

Having to add parentheses around an if condition every time you want to comment stuff out is quite annoying. Plus, it makes the code harder to read.

For commenting out part of a line I think best practice is duplicating the
entire line as a comment and editing it directly. That handles scenarios
that inline comments don’t and more importantly ensures reverting is error
free.

# def foo(a, b=None):
def foo(a, b=[]):

In the case of the if statement above this would result in misunderstandings. Consider the following: I want to comment out b == 2 and later on I add c == 5.

#if a == 3 and b == 2:
if a == 3 and c == 5:
	c = 4

Here it looks like I am experimenting with replacing b == 2 with c == 5. Meanwhile:

if a == 3 /*and b == 2*/ and c == 5:
    c = 4

Here I am clearly experimenting with having b == 2 alongside c == 5, which is a different thing.

So now you’re changing the semantics from multiline to embedded comments.
Being able to embed a comment within an expression is a very different thing
from just having comments extend across multiple lines.

I guess so. The biggest problem with lack of multiline comments is indeed the fact that you cannot do embedded comments with the current syntax, even with the multiline string hack.

So you have syntax highlighting everywhere that supports hypothetical multiline comments, but no editor that supports commenting/uncommenting multiple lines at once?

VS Code and Notepad++ indeed have a shortcut to comment/uncomment, however Vim doesn’t for example. That is also the case for some primitive web text editors. As @alexrov said, Python shouldn’t assume the user is using an advanced text editor.

Also, I’ve never needed a shortcut to comment (why would I use a shortcut to type 4 characters?), I shouldn’t have to change my workflow for a language.

The argument that multiline comments can be hard to see without syntax highlighting could be valid (although not having syntax highlighting is extremely rare these days, compared with not having a language-specific shortcut to comment/uncomment); however, note that the same applies to multiline strings. People are currently using """ and """ to do multiline comments (as encouraged by Guido himself); this proposal would simply replace that with /* and */. If having a # on each line is required, it should be part of the code style, just like tabs vs spaces for example.

alexrov · October 12, 2024, 12:12am

The examples in the OP illustrate why that doesn’t work for all cases. The latter two out of the three code snippets result in a syntax error based on the content and placement of a multi-line string that one would want to utilize as a comment.

Well, you beat me to the punch in rebutting the example you gave. So it seems we agree that multi-line strings don’t suit every need for large comment blocks. That forces me to re-state my previous concern that adding n>50 single-line comments is not always practical in every editor, and a language shouldn’t care about what editor you’re using. If the parser just maintains a stack of open/close tokens like @jcampbell05 suggested with the Lua analogy, I don’t see why /* /* */ */ or any tiered hierarchy like that would cause problems.

The only issue would arise when someone tries to multi-line comment out some code but starts or stops the block in the middle of another multi-line comment. We have to draw the line somewhere, and that’s where I would draw it. You can’t expect the parser to handle that favorably.

even if they only appear in strings

I’m not married to /* */ specifically. If this turns out to be too ubiquitous a combination of symbols, we can easily use something else that would never unintentionally make it into a string. Take,

#multi_line_comment_start!
foo = bar
#multi_line_comment_end!

as one silly but illustrative example. This would still be a lot less tedious than applying # to a thousand lines without writing code to mutate your own code or using a supporting editor.

Rosuav · October 12, 2024, 12:42am

print("/* You can't comment this code out without thinking about it")

alexrov · October 12, 2024, 1:40am

I addressed that in the same post, though. See:

Rosuav · October 12, 2024, 1:42am

No matter what you select, it will always make it into a string, because people will generate code in code. The /* pairing wouldn’t be particularly common except that it IS the commenting character.

You can’t handwave this away.

Zezombye · October 12, 2024, 2:10am

In the very rare case that /* or */ occurs in a string, one of the workarounds with # mentioned above can be used.

I’ll argue for /* */ instead of other characters, not only for consistency with other languages, but also because these characters have been “battle-tested” so to speak. I can’t remember any time I saw either of those two combinations in a string. Also, numpad can easily be used to write those, while other characters would perhaps be more difficult to write depending on the keyboard layout.

Nineteendo · October 12, 2024, 7:56am

Doesn’t work, comments are not illegal in strings:

#multi_line_comment_start!

print("""
#multi_line_comment_end!
""")

#multi_line_comment_end!
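The standard tokenize module makes the mismatch visible. In this sketch the source is the example above without the blank lines, and a simple line scan stands in for a marker-based implementation; the variable names are mine.

```python
import io
import tokenize

source = (
    "#multi_line_comment_start!\n"
    'print("""\n'
    "#multi_line_comment_end!\n"
    '""")\n'
    "#multi_line_comment_end!\n"
)

# Lines that Python itself treats as comments.
comment_lines = [
    tok.start[0]
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type == tokenize.COMMENT
]

# Lines that a line-based scan would take for the end marker.
marker_lines = [
    number
    for number, line in enumerate(source.splitlines(), 1)
    if line.startswith("#multi_line_comment_end!")
]

print(comment_lines)
print(marker_lines)
```

Python sees comments on lines 1 and 5 only, because line 3 is part of the string. A scan for the marker stops at line 3, in the middle of the print call.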

jcampbell05 · October 12, 2024, 2:21pm

Seems like we are debating two cases here:

The ability to write multiline comments without needing to manually prepend # everywhere or use docstrings, which are still technically code and so have to follow the rules of the language

In this case I think a small tweak to the comment syntax to indicate it’s not just a single line is all we need. This is why I brought up Lua, as it handles this in an elegant way which would work for Python

Self-contained comments which can be placed in the middle of an expression

This case is a bit more tricky as we now need to handle things like what happens if it’s in the middle of a string.

This might be something worth ignoring for now and picking up later. Let’s just focus on the first case