I already wrote “This could help simplify Python’s grammar in the future.”
Unless this transition happens, teachers will need to teach students all of the behaviors rather than a single consistent one.
I didn’t think that being able to achieve unified behavior only within a single file was a noteworthy advantage for PEP.
I see the user experience of knowing what the contents of a string will be after parsing as different from the complexity of the grammar, even if they both flow from the same specification.
I also think that PEPs should take a long view. The advantage of a transition is that it will eventually allow for teaching a single, consistent behavior. It is a given that teaching multiple behaviors will be necessary during a transition, but it seems strange to claim that eventually consolidating isn’t an advantage. Again, I am not asking you to adopt the __future__ approach, and it’s fine to reject it. I just think the advantages are being minimized.
That’s subjective, hence why it should be documented.
It’s difficult to write text for you, given that the rationale you gave so far is not exactly self-evident either. But as a point for you to reject, here’s an attempt. Actually two, because there are two separate mechanisms: the one where nothing is stripped (for consistency), and the Julia approach, where empty initial lines are stripped, but strings do not have to start with a newline.
Keeping the initial newline
An argument brought up during discussions on DPO was that all string modifiers (b, f, r, t and now d) should keep consistent behaviour around initial (& trailing) newlines, not least since it can be surprising that dedenting (a horizontal operation) affects newlines (a “vertical” property), aside from introducing an unnecessary inconsistency in the semantics of multi-line strings that users will have to internalise. The proposed semantics would be that the opening line (directly following the triple quotes) would be excepted from dedenting. Prior art for this is Scala, for example, as well as Julia (partially, see below). This is rejected because […][1]
Allow string content after opening triple-quotes
Julia chooses an approach where a bare newline after the opening quotes is stripped (as in this PEP), but any non-empty content after the opening quotes is kept (and does not participate in dedenting). This is rejected as […too complex, etc…]
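To make the rule described above concrete, here is a rough Python sketch. The helper name `julia_like` is hypothetical, it operates on the raw text between the quotes, and it uses `textwrap.dedent` as a stand-in for the dedent step, so it only approximates the description in this paragraph and is not an exact model of Julia.

```python
import textwrap


def julia_like(raw: str) -> str:
    """Hypothetical helper approximating the rule described above.

    `raw` is the text between the opening and closing triple quotes.
    """
    first, sep, rest = raw.partition("\n")
    if not sep:
        # Single-line content: nothing to dedent.
        return raw
    if first == "":
        # Bare newline after the opening quotes: strip it, dedent the rest.
        return textwrap.dedent(rest)
    # Content on the opening line is kept as is and excluded from dedenting.
    return first + "\n" + textwrap.dedent(rest)


print(repr(julia_like("\n    a\n    b\n")))
print(repr(julia_like("x\n    a\n    b\n")))
```

The second branch is the extra corner case that the simpler "opening quotes must be followed by a newline" rule avoids.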
an example in your favour is Java; you’re already referencing the JEP for that. ↩︎
Actually, the prior art in other languages in the PEP are not very consistent at all w.r.t. handling opening and trailing newlines. The only language (of the ones surveyed) with matching behaviour is Java.
I understand that you feel it is more consistent to leave the first line break, but I cannot agree with you.
The first newline is immediately after the opening quote, on the same line. It is not present on the lines within the string’s content.
On the other hand, the last newline character is neither immediately before the closing quote nor on the same line. That newline character is removed from the content line of the string.
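For reference, this is where the two newlines sit in an existing triple-quoted string. It is a minimal sketch using only current Python; the variable name is chosen for illustration.

```python
# A plain triple-quoted string laid out the way dedent-style code usually is.
text = """
    hello
    world
    """

# The first character is the newline that shares a line with the opening quotes.
# The last newline belongs to the final content line ("    world"), and the
# closing quotes are preceded only by indentation.
print(repr(text))
```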
Comparisons with other languages are also not a strong enough reason to outweigh the actual benefits when deciding the details.
Other languages do not distinguish between dedented multiline strings and non-dedented multiline strings, or even include things that are not syntax at all. There is little value in maintaining consistency with those languages.
What’s important for d-string is to provide an efficient and easy-to-use syntax for use cases that already use textwrap.dedent("""...""") and for use cases where you want to remove indentation from multiline strings but cannot use textwrap.dedent("""...""") for some reason.
If you can explain the real-world benefits in such use cases, I will add a section for the idea.
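As a point of reference for the `textwrap.dedent("""...""")` use case, a typical existing pattern looks roughly like this. The function name and the help text are made up for illustration.

```python
import textwrap


def usage() -> str:
    # The backslash after the opening quotes suppresses the first newline,
    # a common workaround in existing code.
    return textwrap.dedent("""\
        usage: tool [options]
          -h  show help
        """)


print(usage())
```

Here the common eight-space margin is removed, the relative indentation of the second line is kept, and the newline of the last content line survives.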
Regarding the idea of removing the final newline, I explained that it was rejected after comparing its merits and demerits, stating the merit that it would eliminate the need for a workaround in use cases where some indentation is desired but the final newline is not.
I wrote “simplify Python’s grammar in the future.” in the PEP.
When I wrote this sentence, I didn’t feel the need to explicitly state that simple grammar has advantages for learning and teaching, but was it unclear?
I really liked the idea of importing __future__.
However, I understand the disadvantages of having two types of Python coexist for a long time, and I also understand how much people fear that, so I prioritized a practical solution over my own preference.
If the merits are not sufficiently explained compared to the demerits, it is probably because the demerits are the reason why I had to make a choice I did not like.
Then why do you refer to any prior art at all, if their reasoning and conclusions apparently have no weight? These are not details, but the core mechanics of the feature.
I did. The benefit is staying consistent with existing Python string handling. You don’t have to agree with my conclusion to see that this would be an obviously beneficial thing (taken by itself). I even provided a formulation for you to reject with a proper argument – “I cannot agree with you” is not an argument.
This could be the seed for populating the “This is rejected because […]” placeholder I left for you.
Looking at this sheet, it is clear that there is no language that prohibits writing characters after an opening quote and yet preserves the newline after the opening quote.
This PEP is consistent with the other languages that do not allow characters to be written after the opening quote.
When characters cannot be written after an opening quote, keeping the line break there is a nonsensical design with no merit.
P.S. Scala’s stripMargin is a string method, not syntax. It is totally different. String methods don’t know where the string came from (e.g. a multiline string literal, a regular string literal, other expressions, a file read…).
I agree, but you haven’t answered why it must be forbidden to write characters there.
To be clear, you’re proposing to change a fundamental and long-standing aspect of python’s string handling. This part:
The onus is on you to make the case why that’s beneficial or necessary. Just saying “it’s obviously better” is not good enough, you should explain why it would be worse to stay consistent with existing behaviour.
That’s what I’m asking you to document (and you seem to have the arguments ready to go). I don’t understand why that’d be controversial, much less worth resisting.
Oh! Were you asking me to write up an idea that you agree is totally meaningless?
You should have asked for the idea of allowing characters after the quotes to be added to the rejected ideas, rather than the idea of not removing the first line break.
Several people, including me, have already explained it several times.
Scala doesn’t have dedent syntax. """ is just a multiline string.
Julia’s """ has a dedent feature, but it is also a general-purpose multiline string.
On the other hand, d-string is only for dedent use case. This is why d-string is different from """ in Scala and Julia.
I understand that. What I’m saying is that the arguments should be reflected in the PEP.
Both questions are deeply intertwined – once you forbid characters after the quote, the newline question disappears (but then the prohibition needs a rationale); if you allow characters there, then you need to discuss newline handling.
Incidentally, Julia’s approach would be more compatible with existing Python than what you’re proposing. You can reject the Julia approach as too complex of course, but that too should ideally be documented.
I don’t agree that it’s less clear. It’s an additional corner case for an unusual usage pattern, for the benefit of being more consistent overall. This is also how Julia handles leading spaces on the opening line.
I know what you mean, but that it’s an implicit method is more of an implementation detail. For the purpose of this discussion (and actual usage in the language), """...""".stripMargin in Scala works exactly like a d-string[1]. Such strings are not passed around, waiting to have the method applied; it’s basically always done at the definition site or not at all.
modulo the way Scala uses | to indicate indentation explicitly, but that’s unrelated to newline-handling. ↩︎
A string method doesn’t know where the string came from: a multiline literal, an expression, a read from a file or socket, etc. Even though it is often used with string literals, it is designed to be usable with other strings as well.
So stripMargin is like textwrap.dedent() rather than d-string.
But you repeatedly insisted that you would not remove line breaks without the idea of allowing other characters…
Now that you already agree that it’s an obsession with foolish consistency that brings no benefit, I’ll end the framing war over it.
I will add to the “Rejected Idea” the reason why I did not design the d-string to be exactly the same as in Julia.
Have you written a bit of Scala? I can assure you it’s used equivalently to d-strings. And it’s not equivalent to textwrap.dedent because the postfix method doesn’t need parentheses around the whole string.
That’s not what I wrote, but OK, simple misunderstanding. I’ve been saying – with examples:
– that it’s not self-evident why content after the opening quotes must be forbidden (for the feature to function technically or semantically).
You’re knocking down a strawman. I did say I understand why people want to strip the initial newline (and that this should IMO be done for all strings, not just d-strings). There’s nothing foolish about having multiline strings behave regularly.
There was no “framing war” from my POV though: I’m only asking you to document in the PEP why you’re breaking a long-standing invariant. It’s strange to me how you take that as aggression (“war”), but let me apologise for any offence given unintentionally.
Parentheses around the whole string are not important here.
Unlike string methods, syntax can utilize information such as the type and position of quotes and escape sequences before they are processed.
String methods do not even know whether the string was a literal or loaded from a file. There is no way to know whether the first newline comes immediately after the quote or from a text file that starts with a blank line.
Since I have little experience with Scala, if stripMargin is a special method that obtains information not only from the string but also from the compiler, then I am mistaken.
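The point about string functions not knowing the origin of their input can be illustrated in Python with `textwrap.dedent`. This is only a sketch of the Python side; `io.StringIO` stands in for a real file, and nothing here says anything about how Scala implements stripMargin.

```python
import io
import textwrap

# A string written as a literal in source code.
from_literal = """
    alpha
    beta
"""

# The same characters arriving from a file-like object.
from_file = io.StringIO("\n    alpha\n    beta\n").read()

# textwrap.dedent receives only the characters, so it cannot tell whether the
# leading newline followed an opening quote or was a blank first line of a file.
print(from_literal == from_file)
print(repr(textwrap.dedent(from_literal)))
print(repr(textwrap.dedent(from_file)))
```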
What strikes me about this table is that there is a strong correlation between the two “opening” columns and the two “trailing” columns:
Languages that don’t allow content after opening """ always strip the opening \n (i.e., they effectively use """\n as the opening marker); whereas languages that allow content after opening """ don’t tend to[1] strip the opening \n.
Languages that don’t allow attaching trailing """ to content always strip the trailing \n (i.e., they effectively use \n""" as the trailing marker); whereas languages that allow attaching trailing """ to content don’t strip the trailing \n.
The vast majority of languages (Swift, C#, PHP, Scala and Python) are consistent in having symmetric opening/trailing markers for multiline strings; they just disagree on whether or not the markers include a \n. Java is the only language with asymmetric markers, while Julia is the only language where markers depend on the string content.
My impression is that we largely agree that Julia’s handling is too complex. Thus, the remaining questions become:
Should we use symmetric or asymmetric opening/trailing markers?
When framed this way, I have a fairly strong preference for symmetric markers, both axiomatically[4] and for consistency with the vast majority of languages (including Python itself).
Should the symmetric markers include a \n or not?
Here, I have a slight preference for not, mainly for consistency with existing Python multiline strings. (This also matches the existing docstring style in PEP-8, as mentioned previously in this thread.)
A meta note: I’m not sure whether this is the right way of framing the issue or whether it is better to frame it in terms of common use cases for dedenting (as was done previously in this discussion). But while the previous framing left me confused and uncertain, this new framing gives me what looks like an obvious and easy-to-explain answer and there might be some value to that.
with Julia in some cases being the only exception ↩︎
Using asymmetric markers feels very wrong to me, almost as bad as mixing different types of brackets. There’s a good reason function calls don’t look like print("Hello world"} or list literals aren’t written as l = [1, 2). ↩︎
It feels unfair to compile statistics on symmetry by mixing multiline string syntax that can also be used to write single-line strings without dedent with syntax optimized for writing multiline strings with dedent.
In Scala and Python, strings that start with """ are also used to write single line strings that include quote characters, such as """_'"_""".
In Julia’s case it is indeed complex, but when the goal is dedent one would mostly use """\n. Allowing content immediately after """ is for the same reason as in Scala and Python.
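The single-line use of triple quotes mentioned above works in current Python as follows; the variable name is illustrative.

```python
# Triple quotes used for a single-line string that contains both quote characters.
s = """_'"_"""

# The same value written with ordinary quotes needs an escape.
same = '_\'"_'

print(s == same)
```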
The d-string is specialized for multiline strings. Let’s rewrite the previous table, selecting only syntax optimized for writing multiline strings.
Strips last newline: C#, Swift, PHP
Keeps last newline: Java, Julia, Ruby, Perl, bash
Looking at it this way, we can see that all languages ignore the newline immediately after the opening marker when writing multiline dedent strings, but are roughly split in half on whether or not to remove the newline of the last content line.
And what is important for Python d-string is that it is easy to rewrite existing textwrap.dedent("""...""").
As can be seen from this table, ignoring the newline immediately after the opening marker and starting content lines from the next line is very important for ease of writing dedented multiline strings, so it is worth changing the behavior from the existing """.
In contrast, whether or not it is better to remove the newline included in the last content line is debatable. There is no strong reason to change it from the existing """ behavior.