I was wondering why it cannot be combined with “b”. I think dedentation and the bytes data type are orthogonal to each other (from a design perspective, maybe not from the implementation perspective).
where text becomes 'Hello\nWorld!\n', with a trailing newline, should paragraph preserve the trailing newline to become '<p>\n__Hello\n__World!\n\n</p>', or should it automatically remove the trailing newline to avoid a blank line in the output, so it can become a prettier '<p>\n__Hello\n__World!\n</p>'?
If the implicit behavior of automatically removing a trailing newline from a {...} evaluation spooks people, I suggested using a backslash to explicitly avoid an extra newline:
paragraph = df'''
<p>
__{text}\
</p>
'''
But then it will likely make most df-strings ridden with ugly backslashes.
I’m personally more in favor of an implicit behavior of automatic removal of trailing newlines from {...} evaluations to keep the usage clean, but would not mind an explicit solution.
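To make the two candidate outputs concrete, here is a rough sketch using today's f-strings. It assumes the underscores in the examples above stand for spaces, and `paragraph` with its `strip_trailing_newline` flag is a hypothetical helper for illustration only, not proposed syntax or the PEP's behavior.

```python
import textwrap


def paragraph(text: str, strip_trailing_newline: bool) -> str:
    """Emulate the df-string template above with a plain f-string (hypothetical helper)."""
    if strip_trailing_newline:
        # Drop at most one trailing newline from the interpolated value.
        text = text.removesuffix("\n")
    # Indent every line of the value by two spaces, as the template does.
    body = textwrap.indent(text, "  ")
    return f"<p>\n{body}\n</p>"


text = "Hello\nWorld!\n"
preserved = paragraph(text, strip_trailing_newline=False)
stripped = paragraph(text, strip_trailing_newline=True)
print(repr(preserved))
print(repr(stripped))
```

The first call keeps the blank line before `</p>`; the second gives the prettier form without it.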
Hmm, I did not support byte strings, in the same way that t-strings and f-strings do not support byte strings.
However, I can’t think of a clear reason why d-strings couldn’t support byte strings. From an implementation perspective, it is easier for d-strings to support byte strings than it is for t-strings and f-strings.
When writing C or HTML snippets, Unicode strings are usually used, but byte strings are also used in some cases.
I will take some time to consider whether to add support for byte strings.
I like the way multi-line strings are dedented in C# (Raw string literals - """ - C# reference | Microsoft Learn). This proposal is similar; the most significant difference (imo) is that there is no way to omit the trailing newline (C# doesn’t include the newline before the closing quotes). I wish this were possible with Python’s dedented multi-line strings, though it seems to be at odds with the assertion comparing dedented and non-dedented multi-line strings that you presented above. I tend to agree that this assertion is important to hold true, but it does seem like an annoying quirk that the string will always end with \n, requiring me to e.g. use .rstrip() on it. Sadly, I can’t think of anything that would let the assertion hold true and also allow the string to not end with a trailing newline.
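As a rough illustration of the quirk with today's tools, the sketch below uses `textwrap.dedent` as a stand-in for a d-string (an assumption for illustration; it is not the proposed syntax and may differ from it in details). It also shows why `.removesuffix("\n")` may be a safer choice than `.rstrip()` when only the final newline should go.

```python
import textwrap

# Stand-in for a d-string: dedent a regular triple-quoted string.
s = textwrap.dedent("""\
    Hello
    World!
    """)

# The dedented text always ends with a newline in this layout.
assert s == "Hello\nWorld!\n"

# Removing exactly one trailing newline (Python 3.9+).
assert s.removesuffix("\n") == "Hello\nWorld!"

# rstrip() is broader: it removes all trailing whitespace,
# including blank lines that may have been intentional.
with_blank_line = "Hello\nWorld!\n\n"
assert with_blank_line.rstrip() == "Hello\nWorld!"
assert with_blank_line.removesuffix("\n") == "Hello\nWorld!\n"
```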
I acknowledge that the concatenation approach is not ideal for copy-and-pasting.
However, regarding the comparison to heredoc, my primary concern is the visual inconsistency it introduces. Subjectively speaking—if you will pardon the phrasing—I find that syntax somewhat unaesthetic.
Furthermore, I believe d-strings struggle to handle scenarios involving deep indentation. In such cases, block strings often push the content too far to the right, causing line length issues that I prefer to avoid.
In this context, the necessity of explicit newlines in concatenated strings becomes an advantage. It allows me to manually wrap long content to strictly adhere to line-length limits.
if config.is_valid:
    if user.is_authenticated:
        # In deep indentation, implicit concatenation gives me
        # precise control over line breaks and length.
        query = (
            "SELECT id, username, email, created_at "
            "FROM users "
            "WHERE status = 'active' "
            "AND role = 'admin'"
        )
The whole point of a d-string (as the title indicates) is a multiline string. If your use case is a long single-line string then yes by all means multiple single-line string literals is a great tool for the job, though with line continuation a d-string doesn’t look too bad to me either:
if config.is_valid:
if user.is_authenticated:
query = d"""
SELECT id, username, email, created_at \
FROM users \
WHERE status = 'active' \
AND role = 'admin'
"""
Fortunately, you always have the option of not using the new string syntax.
I hope your post isn’t meant as an argument against moving forward with a feature that thousands of people have missed for decades, just because it is another option that doesn’t fit your particular taste.
Inada already explained why there should be a trailing newline.
And since the indentation of the closing triple quotes is arguably the most visually intuitive way of specifying the level of dedentation, the above d-string should dedent by only 4 spaces instead of 8.
I do agree that formatting the content of the string with more indentation than the enclosing quotes follows the current styling recommendations better, though it’s probably a necessary tradeoff if we want the closing triple quotes to control the level of dedentation.
Love the PEP, I think it’s very well argued and thought-through!
I can understand how you arrive at this from the POV of the algorithm that determines the indentation of the trailing triple-quote and deducts that from all the lines, but this restriction seems artificial to me.
Taking the two relevant examples of yours, I don’t see how
s = d"""Hello
__World!
"""
print(repr(s)) # 'Hello\n__World!\n'
s = d"""\
__Hello
__World!
__"""
print(repr(s)) # 'Hello\nWorld!\n'
would have any ambiguity in their mechanics. Likewise for the following third example that ties both cases together
s = d"""Hello\
__World!\
__"""
print(repr(s)) # 'HelloWorld!'
The way I read this is that the line containing the opening triple quotes simply does not participate in the stripping of indentation. This is IMO still a rule that’s very intuitively explainable[1]. As with any feature, it has some potential for suboptimal use, e.g.
s = d"""__Hello
______World!
____"""
print(repr(s)) # '__Hello\n__World!\n'
but I think that’s where we should rely on “consenting adults” being able to make the choices they prefer (and of course popular linters will come up with best practice rules anyway).
To summarise: I think it’s imperative to avoid ambiguity, but I also think it’d be better to avoid restrictions that aren’t strictly necessary for that.
[1] Not least because the position of the opening quotes will generally be different from the rest of the multi-line string anyway.
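The reading described in this post can be sketched in plain Python. This is only one interpretation, not the PEP's reference algorithm: `dedent_reading` is a hypothetical name, it takes the raw source text between the triple quotes (before backslash-newline processing), it assumes underscores in the examples stand for spaces, and its continuation handling is deliberately naive.

```python
def dedent_reading(raw: str) -> str:
    """Sketch of one reading of the rule; not the PEP's reference algorithm.

    `raw` is the source text between the triple quotes, before any
    backslash-newline continuation has been processed.
    """
    first, *rest = raw.split("\n")
    if not rest:
        return first  # single-line literal: nothing to dedent
    # Whitespace in front of the closing quotes sets the amount to strip.
    indent = rest[-1]
    if indent.strip():
        # Assumption: the closing quotes must sit on their own line.
        raise ValueError("closing quotes must be on their own line")
    # The opening line is kept as-is; later lines lose the indent prefix.
    # Under-indented lines are left untouched here for simplicity.
    lines = [first] + [line.removeprefix(indent) for line in rest]
    # Join continued lines only after dedenting
    # (naive: an escaped backslash before a newline is not handled).
    return "\n".join(lines).replace("\\\n", "")


# The four examples from this post, with "__" written as two spaces.
assert dedent_reading("Hello\n  World!\n") == "Hello\n  World!\n"
assert dedent_reading("\\\n  Hello\n  World!\n  ") == "Hello\nWorld!\n"
assert dedent_reading("Hello\\\n  World!\\\n  ") == "HelloWorld!"
assert dedent_reading("  Hello\n      World!\n    ") == "  Hello\n  World!\n"
```

Under these assumptions, all four examples come out as the `repr` comments in the post show, which suggests the rule has no ambiguity in its mechanics.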
This could lead to surprising outcomes if text that appears to be part of a string is not actually part of it. In addition to being confusing, it could be abused to be misleading, possibly even with security risk if the situation was just right.
Can it be a SyntaxError instead?
Perhaps with an exception so that empty lines are still allowed to be empty lines. Maybe also allow lines containing only indentation characters (spaces/tabs).
def examples():
____d"""
____this should
be an error
____"""
____d"""
____this would be ok:
____this too probably:
__
____"""
I intended to propose the same thing you are suggesting, but my English was inaccurate.
The sentence you pointed out was meant to say that when the indentation to be removed is 4 spaces, a line containing only 2 spaces would not be a syntax error but would become a blank line. However, it can be read as if any under-indented line is allowed.
I will revise the expression in that paragraph to avoid misunderstanding.
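For reference, the intended rule as clarified above might look like the sketch below. `strip_indent_strict` is a hypothetical helper, and `ValueError` stands in for the `SyntaxError` a real implementation would raise at compile time; none of this is taken from the PEP's reference implementation.

```python
def strip_indent_strict(lines: list[str], indent: str) -> list[str]:
    """Remove `indent` from each line, rejecting under-indented content."""
    result = []
    for number, line in enumerate(lines, start=1):
        if line.startswith(indent):
            result.append(line[len(indent):])
        elif line.strip(" \t") == "":
            # Empty or whitespace-only lines shorter than the indent
            # become blank lines rather than errors.
            result.append("")
        else:
            raise ValueError(
                f"line {number} is indented less than the closing quotes"
            )
    return result


ok = strip_indent_strict(
    ["    this would be ok:", "", "    this too probably:", "  "],
    "    ",
)
print(ok)

try:
    strip_indent_strict(["    this should", "be an error"], "    ")
except ValueError as exc:
    print("rejected:", exc)
```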