In the column indicating the Trailing Marker, is \n""" a shorthand for the regex \s*\n""" ? Or do those languages really require the closing """ to be at the start of a line?
Bash heredocs require the end marker to be at the beginning of the line. Other languages with heredocs originally had the same requirement, but when they were extended to support indentation, they allowed the end marker to be indented as well.
However, I do not like the idea of including newline characters in markers. Is the content starting from the next character after the opening marker, or from the next line? Does the content include characters up to just before the end marker, or up to the previous line? I think this way of thinking is more natural. And when considering up to the line before the end marker as content, the question is whether or not to remove the last newline.
All heredoc forms consider up to the line before the end marker as content. Only PHP removes the last newline; other languages do not remove the last newline.
Among the languages that use """, C# and Swift do not allow writing the end marker on a content line, so they are similar to heredocs. Both remove the last newline.
Julia and Java allow writing the end marker on a content line. Neither removes the last newline. This is the idea this PEP currently adopts.
Regarding the opening marker, it can be said that all languages treat the content as starting from the next line. Julia allows writing content immediately after the opening marker, but when starting with """\n, content can be written from the next line too. Ignoring the dedent feature, this PEP's d"""\n corresponds to Julia's """\n, and Python's existing """ is the same as Julia's """content....
I understand that it's implemented differently[1]; that's why I called it an implementation detail. I didn't respond on this anymore because it's pretty off-topic, but I'm coming back to it because it leads to an idea that might be interesting.
Just to wrap up the Scala side of things: in terms of actual use, you can consider """...""".stripMargin as a verbose postfix string modifier. That, together with the discussion about opening & closing markers, gave me an idea that I'm not super-duper in love with, but which may be worth exploring to see whether it can achieve a better trade-off globally...
Which is that, if we want to introduce new behaviour around new-line stripping, we might consider imbuing the closing quotes with a modifier.
In other words, we could do:
# modifier for opening quote --> keep existing behaviour of not stripping \n
s = d"""
____foo
______bar
____baz
____"""
assert s == "\nfoo\n  bar\nbaz\n"
# modifier for closing quote --> can have new behaviour, e.g. stripping \n
s = """
____foo
______bar
____baz
____"""d
assert s == "foo\n  bar\nbaz"
Obviously, it'd also be possible to introduce only the latter. I'll be the first to admit that behavioural differences between d""" vs. """d would have to be taught (rather than be self-explanatory), but at least the syntactically new position would allow us to introduce new behaviour in a consistent manner.
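For comparison, the two results in the example above can be approximated in today's Python with textwrap.dedent. This is only a rough sketch: the d""" and """d spellings are hypothetical, and dropping exactly one leading and one trailing newline is an assumption about what the closing-quote modifier would do.

```python
import textwrap

# Plain triple-quoted literal, indented as it would be inside real code.
raw = """
    foo
      bar
    baz
    """

# Opening-modifier behaviour from the example: dedent, keep both newlines.
kept = textwrap.dedent(raw)
assert kept == "\nfoo\n  bar\nbaz\n"

# Closing-modifier behaviour (assumed): also drop one leading and one
# trailing newline.
stripped = kept.removeprefix("\n").removesuffix("\n")
assert stripped == "foo\n  bar\nbaz"
```

Note that textwrap.dedent turns the whitespace-only last line into an empty one, which is why the first result ends in a single "\n".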
Has anyone got an actual use case for keeping the leading \n? I can understand the contention about the trailing newline but the conversation keeps circling back to the leading one, seemingly solely on consistency for the sake of consistency grounds.
This is quite an odd take on the term "implementation detail".
I would call it a "fundamental design decision", not only of Python, but of most (modern) programming languages[1]: If one writes "...".dedent(), the member function acts on the string object created by the string literal, not on the string literal itself. So, it cannot depend on syntactic aspects of the literal; it must behave as a="..."; a.dedent(). Again, this is not because of an "implementation detail", it is how member functions generally work.
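A small sketch of that point in current Python, using textwrap.dedent as a stand-in for the hypothetical str.dedent() method (the variable names are made up for illustration). Two literals that are spelled differently but evaluate to the same string are indistinguishable to anything that runs after the literal has been evaluated:

```python
import textwrap

# Same string value, written in two syntactically different ways.
written_with_triple_quotes = """
    SELECT 1
    """
written_with_escapes = "\n    SELECT 1\n    "

# The resulting objects are equal; nothing records how they were spelled.
assert written_with_triple_quotes == written_with_escapes

# So a function (or method) applied afterwards must treat them identically.
assert textwrap.dedent(written_with_triple_quotes) == "\nSELECT 1\n"
assert textwrap.dedent(written_with_escapes) == "\nSELECT 1\n"
```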
People (in Scala as in Python) want to keep their code indented without polluting their multiline strings with superfluous spaces. In Scala, the idiomatic way to do that is with a method, but, as I keep explaining, it's beside the point that it's a method. Once again, it's about USAGE being equivalent to d-strings[1], not implementation.
I don't think there is a use case for it. I think it is a question of consistency vs utility.
It's fine for the PEP to decide that since nobody actually has a use for the leading newline, it's not preserved.
The conversation goes in circles when the justification is that this way is "intuitive" or (worse) "correct" and therefore doesn't deserve to be documented. I just want to make sure the PEP acknowledges that there is a nontrivial decision to be made here: it's not obvious even if the majority of folks have the same preference.[1]
I have a feeling this discussion is going around in circles without reaching a conclusion or agreement. Maybe everyone would do well to curb their instinct to immediately reply when they see something that they don't agree with?
I don't think we are looping about whether to remove the newline after the opening quote.
Vetinari was the only one who proposed the idea of not removing the first newline, but he agreed that there was no merit to that idea. His real opinion was to allow writing content after the opening quote. (PEP 822: Dedented Multiline String (d-string) - #96 by h-vetinari)
The discussion about the opening quote is settled for now and not being repeated.
Only updating the PEP remains as my task.
Vetinari does not agree that Scala's stripMargin is a method of strings and unrelated to the behavior of quotes, but that is not a topic that affects this PEP, so no further discussion is necessary.
I don't think there are any other topics to discuss or content to add to the PEP.
The "use case" framing is not very helpful when talking about consistency. Is it a "use case" to not have to audit your outputs for correct vertical spacing when switching from f""" to df"""? Is it a "use case" for users to have fewer special cases to learn and keep in mind?
Please don't misrepresent my statements. Of course I understand it's a method.
Again, there's no need to misrepresent my position.
That aside, @gcewing and @Jost had voiced similar concerns.
In any case, I tire of the thinly-veiled hostility in this discussion. Good luck!
I apologize for my poor English. When I said "does not agree," I was of course referring to the "unrelated to the behavior of quotes" part, not suggesting that you denied that stripMargin is a method.
Hmm. Did I misunderstand you?
I was just trying to summarize opinions as the PEP Author, not to misrepresent your opinion.
FYI, I have made a pull request to add "Allow content after opening quotes" to the rejected ideas.
I mentioned the "it looks symmetrical" argument in that section.
There's a contingent of at least a few thread participants who seem to think this is at least an idea worth considering. I advocated for this in the Ideas thread. I have not pushed for it again because I consider that feedback to have been taken and rejected.
I would really like to see it listed in Rejected Ideas, with at least one sentence to explain the benefits which justify giving up consistency with other multiline strings. The current draft still says nothing on this topic, as far as I can see.
EDIT: On rereading it again, I think the PR above addresses this? It's phrased very, very differently from how I think about the issue, but it touches on the same topic.
I wrote this in response to "the conversation keeps circling back to the leading one, seemingly solely on consistency for the sake of consistency grounds."
So I didn't mean that no one has supported this idea in the past, but that only Vetinari proposed the idea in this thread, and since he himself actually wants to allow not just newlines but other characters as well, there is no longer anyone repeating this idea.
Upon rereading your comments from the previous thread, I did not think you wanted to keep the newline immediately after the opening quote. You were opposed to the introduction of d-strings itself.
That is related in the sense of not adding more behaviors to quotes. However, it is not the same idea, so the discussion did not go in a circle there.
In the previous thread, you opposed increasing the complexity when concatenating some string literals. You thought str.dedent() was better than d-strings.
So this is not just a matter of the newline immediately after the opening quote, but the whole issue of d-strings that dedent in literals in the first place.
So please read the "Motivation", "Rationale", and "Rejected Ideas / str.dedent() method" sections too. The reasons why changes to literals are necessary are written there.
The biggest problem is that str.dedent() does not work for t-strings. Even if we added a Template.dedent() method, f-string + str.dedent() and t-string + Template.dedent() would not behave the same way.
t-strings are intended to be used for writing SQL, HTML, etc., and those use cases often overlap with use cases that require dedent.
So I believe that providing consistent dedent functionality for t-strings and f-strings is worth the cost of complicating the string literal concatenation rules.
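One general limitation of dedenting after the fact, which may be part of what is meant here, is that a method or function only sees the string after interpolation. A minimal sketch with an f-string and textwrap.dedent (standing in for the hypothetical str.dedent(); the names body and text are made up for illustration):

```python
import textwrap

# Interpolated value that itself spans two lines.
body = "line1\nline2"

# dedent() runs on the already-interpolated string, so the unindented
# "line2" reduces the common margin to zero and nothing is removed.
text = textwrap.dedent(f"""\
    header
    {body}
    footer
    """)

assert text == "    header\n    line1\nline2\n    footer\n"
```

A dedent performed on the literal itself would not be affected by what the interpolated value contains, which is the kind of consistency a literal-level feature can offer for both f-strings and t-strings.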
I thought the issues with string literal concatenation were naturally included in the complexity of adding string prefixes, so I didn't think they needed to be mentioned specifically.
However, if you think that point should absolutely be included in the PEP, could you propose a sentence explaining the magnitude of the issue?
Thanks for going back to the old posts! I did indeed prefer str.dedent() (and still do, even with its disadvantages). There were also a few posts starting here in which I advocated for keeping the leading newline character for consistency with existing multiline string syntax.
The old thread was quite long, so I'm not surprised that this point may have gotten lost.
I think you've misread which point I want covered in Rejected Ideas. Let me just propose some text and you can take it or not (or modify), as you prefer:
Rejected Ideas
Preserving the Initial Newline
Existing multiline string literals do not require that the first character be \n. If there is an initial newline, it is preserved, and users rely on """\ to strip the leading newline as is often desired.
D-strings could implement the same behavior, keeping them consistent with other string literals.
However, consistent behavior with other multiline literal types would allow for string content to start immediately after d""", with no initial newline.
An explicit goal of this PEP is to reserve the initial line of the string for future use, as a potential space for markers for the language of the dedented text.
Furthermore, based on discussion, there are no significant use cases which actively benefit from having an initial newline in the resulting string content.
I have written that based on my best understanding of the argument against consistency with other multiline string literals. It doesn't convince me, personally, but I think it's a reasonable viewpoint.
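For reference, the existing behaviour that the proposed text describes can be checked directly in current Python. This is just an illustration of today's triple-quoted strings, not of d-strings; the variable names are arbitrary:

```python
# An initial newline right after the opening quotes is preserved.
with_newline = """
foo
"""
assert with_newline == "\nfoo\n"

# A backslash right after the opening quotes removes that newline.
without_newline = """\
foo
"""
assert without_newline == "foo\n"

# Content may also start immediately after the opening quotes.
inline_start = """foo
"""
assert inline_start == "foo\n"
```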