IIUC the proposed d-string uses the prefix before the closing """.
In the example you cited, there was no prefix before the closing """.
So the embedded indentation was preserved.
(I had to look twice at the example as well.)
End-user perspective:
While I have multiline string literals and occasionally have to look up dedent to get the code to both look nice and conform to black&friends, I still find that this usage is just too rare.
Instead, what if we had tagged template literals, like ES6?
Dedent could be a first-party use case, and libraries would bring in many others.
Personally, I don’t tend to use dedent a lot to tidy up multi-line strings. I typically just use a global variable, which I can write without indentation, so there’s no need for a dedent. Having said that, it is clumsy and being able to define the string at point of use would be convenient.
So I’d find something more convenient beneficial, but not essential. But as noted, I don’t find the problem sufficiently important to have strong feelings on the matter.
str.dedent method. That seems like the simplest to add, and a good “quick win”.
The ValueError cases seem like they could be the cause of confusion. People will forget that the quotes have to be on lines of their own, and ValueError at runtime is not a great result. A syntax error would be better. Yes, linters could also catch this, but not everyone uses linters (nor should they have to).
Also, will the resulting string have a terminating newline? I’ve had cases where I’ve needed both options, and the d-string approach will have to pick one, and not support the other. The str.dedent approach of using the initial line’s indent doesn’t have this problem (needing an initial newline is much rarer, in my experience).
Overall, I’m +1 on a str.dedent method, and maybe +0 on d-strings with the existing dedent semantics, but -0 on d-strings with the proposed “last line indent” semantics.
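For reference, a minimal sketch of what the existing textwrap.dedent does with a triple-quoted literal, and how the leading and terminating newlines can both be controlled today. The variable names are only illustrative.

```python
import textwrap

# A literal written at the point of use, indented to match surrounding code.
raw = """
    hello
      world
    """

# textwrap.dedent removes the common leading whitespace of the non-blank
# lines; the whitespace-only last line is normalized to an empty line.
dedented = textwrap.dedent(raw)

# Dropping only the leading newline keeps the terminating one.
with_newline = dedented.lstrip("\n")

# Stripping newlines on both sides drops the terminating one as well.
without_newline = dedented.strip("\n")

print(repr(dedented))         # '\nhello\n  world\n'
print(repr(with_newline))     # 'hello\n  world\n'
print(repr(without_newline))  # 'hello\n  world'
```

Both newline variants are one extra call away here, which is the flexibility a literal prefix would have to give up by picking one.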
I was talking about a str method that implements exactly the same algorithm as the d-string.
t1 = d"""
    hello
    """
t2 = """
    hello
    """
assert t1 == t2.dedent_as_d_string()
Do you mean SyntaxError instead of ValueError from this str method?
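A rough sketch of what such a method could look like as a plain function, to make the ValueError cases concrete. Everything here is an assumption for illustration: the name dedent_as_d_string is hypothetical, and the rules (content starts on a new line, closing quotes on their own line, the closing line's whitespace is the prefix to remove, the terminating newline is kept) are one reading of the proposal, not a settled specification.

```python
def dedent_as_d_string(text: str) -> str:
    """Hypothetical helper: dedent using the indent of the closing-quote line."""
    lines = text.split("\n")
    # Assumption: the content must start on the line after the opening quotes.
    if len(lines) < 2 or lines[0] != "":
        raise ValueError("content must start on a new line")
    prefix = lines[-1]
    # Assumption: the closing quotes must sit on a line of their own, so the
    # last line may contain nothing but whitespace.
    if prefix.strip() != "":
        raise ValueError("closing quotes must be on their own line")
    body = []
    for line in lines[1:-1]:
        if line.startswith(prefix):
            body.append(line[len(prefix):])
        elif line.strip() == "":
            body.append("")  # blank lines may be shorter than the prefix
        else:
            raise ValueError(f"line is not indented by the prefix: {line!r}")
    # Assumption: the terminating newline is kept.
    return "".join(line + "\n" for line in body)


print(repr(dedent_as_d_string("\n      hello\n    ")))  # '  hello\n'
print(repr(dedent_as_d_string("\n    hello\n")))        # '    hello\n'
```

The second call shows the case discussed earlier in the thread: with no prefix before the closing quotes, the embedded indentation is preserved. As a runtime function these checks can only raise ValueError; a literal prefix could report the same problems at compile time.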
Java and Julia use the longest common indent of the non-blank lines and the last line, the one holding the terminating quote. They allow putting the terminating quote on the last content line to omit the final newline.
var s = """
     Hello
    """; // s == " Hello\n" -- the indent is calculated from the quote line. This differs from textwrap.dedent().
var s = """
    Hello"""; // s == "Hello" -- the indent is calculated from this single line. No indent is kept.
var s = """
     Hello\
    """; // s == " Hello" -- a backslash removes the last newline while keeping some indent. It's rare.
On the other hand, Swift trims the terminating newline. So you need to add a blank line to get a terminating newline.
var s = """
     Hello
    """ // s == " Hello"
var s = """
     Hello

    """ // s == " Hello\n"
I prefer having a terminating newline by default, so I prefer Java and Julia, although my first post borrowed from Swift.
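The Java/Julia-style rule described above can be approximated in a few lines of Python. This is only a sketch under stated assumptions: the function name dedent_java_like is hypothetical, only spaces count as indentation, escape processing such as a trailing backslash is ignored, and whitespace-only lines are emptied.

```python
def dedent_java_like(text: str) -> str:
    """Rough sketch of the Java/Julia-style rule; not a full implementation."""
    # Assumption: the opening quotes are followed directly by a newline,
    # so the first (empty) line is dropped.
    lines = text.split("\n")[1:]
    if not lines:
        return text
    last = len(lines) - 1
    # Non-blank lines count toward the indent, and so does the last line
    # (the one holding the closing quotes), even if it is whitespace only.
    significant = [ln for i, ln in enumerate(lines) if ln.strip() or i == last]
    indent = min(len(ln) - len(ln.lstrip(" ")) for ln in significant)
    # Whitespace-only lines become empty; other lines lose the common indent.
    return "\n".join(ln[indent:] if ln.strip() else "" for ln in lines)


# Closing quotes on their own line, indented less than the content.
print(repr(dedent_java_like("\n     Hello\n    ")))  # ' Hello\n'
# Closing quotes on the last content line: no final newline, no indent kept.
print(repr(dedent_java_like("\n    Hello")))         # 'Hello'
```

Under the Swift-style rule the first result would instead be " Hello", since the newline before the closing quotes is trimmed.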
I don’t know tagged strings, so I don’t have an opinion about them for now. I will learn about them later.
As for details, I prefer the Julia rules. Differences between Julia and Java:
"""This is a docstring.
"""
"""
This is a docstring.
"""
"""\
This is a docstring.
"""
There’s precedent for the intended use case in PEP 257.
TatSu uses a variation of that algorithm to be able to specify templates for code rendering, have them look nice in the Python module, but get rid of the leading spaces so a formatted template can be indented and embedded into the result of another template.
Typically templates start with """\ so leading spaces are recovered from the first significant line.
def trim(text, tabwidth=4):
    """
    Trim text of common, leading whitespace.

    Based on the trim algorithm of PEP 257:
        http://www.python.org/dev/peps/pep-0257/
    """
    if not text:
        return ''
    lines = text.expandtabs(tabwidth).splitlines()
    maxindent = len(text)
    indent = maxindent
    # The first line is excluded when computing the common indent.
    for line in lines[1:]:
        stripped = line.lstrip()
        if stripped:
            indent = min(indent, len(line) - len(stripped))
    trimmed = (
        [lines[0].strip()] +
        [line[indent:].rstrip() for line in lines[1:]]
    )
    # Drop leading empty lines.
    i = 0
    while i < len(trimmed) and not trimmed[i]:
        i += 1
    return '\n'.join(trimmed[i:])
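The standard library already ships a close relative of the PEP 257 algorithm as inspect.cleandoc. A small sketch of the template use case with it follows; the template text and the names in it are made up for illustration, and cleandoc differs from trim above in that it also removes trailing empty lines.

```python
import inspect

# A code template written at the indentation of the surrounding module code.
template = """
    def {name}():
        return {value}
    """

# inspect.cleandoc removes the indentation common to the second and later
# lines, then drops empty lines at the start and end.
cleaned = inspect.cleandoc(template)

# The cleaned template can be formatted and embedded at any indent level.
rendered = cleaned.format(name="answer", value=42)

print(repr(cleaned))  # 'def {name}():\n    return {value}'
print(rendered)
```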
Sorry, no. I thought the comment was in relation to the d-string form. I don’t think d-strings should raise runtime errors; they should compile directly to the dedented string. Sorry for the confusion.
We already have textwrap.dedent. I think having other approaches (whether d-strings or a string method) that implement different rules would be needlessly confusing. So unless you’re proposing to change textwrap.dedent (which has its own backward compatibility issues) I would prefer the Python rules over the Julia or Java ones.
If we go with from __future__ import dedent_multiline_string, I agree with you, because users would need to check all existing triple-quoted strings.
On the other hand, if we go with the d""" prefix, we don’t need to care about backward compatibility.
I prefer not allowing d"""\ or d"""first line, because that is much simpler and cleaner.
Julia uses “triple-quoted strings are also dedented to the level of the least-indented line” as its basic rule. It is the same as textwrap.dedent().
There are some differences between them, but I see the Julia rule as a natural extension of the Python rule to literals.
So whether the Python rule and the Julia rule are two different rules is subjective.
OK. So call it the textwrap.dedent rule rather than the Julia rule. I’d expect more Python users to understand the former than the latter.
And to repeat, if “there are some differences between them”, either make dedent match the Julia rules, or use the dedent rules unchanged. Having the two forms use different rules would be a source of confusion and errors.
What if we had something like triple backtick literals
x = func```
my string
```
def func(str_literal):
return ...
borrowing from JS. Instead of introducing a ton of new letters, we could borrow from JavaScript and allow any templating function, i.e.
x = dedent```
hey there
hope all is well
```
I guess this doesn’t solve composition
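Python has no tagged-literal syntax today; the closest equivalent is an ordinary function call on a triple-quoted literal. A minimal sketch, where the dedent helper is a hypothetical name built on textwrap.dedent:

```python
import textwrap


def dedent(literal: str) -> str:
    """Hypothetical tag-like helper: dedent and drop the outer newlines."""
    return textwrap.dedent(literal).strip("\n")


# The helper plays the role of the tag; the call happens at runtime.
x = dedent("""
    hey there
    hope all is well
    """)

print(repr(x))  # 'hey there\nhope all is well'
```

Any other "templating function" could be swapped in the same way, but the work is done each time the expression is evaluated rather than when the module is compiled.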
Triple backticks have been proposed and rejected in the past for various uses. One reason is that they’re visually tricky to differentiate, but I think the bigger one is that they are quite difficult to type on some (non-English) keyboards.
I wish we could just get rid of the old repr ` operator.
The benefit d-strings have is that they work at parse time. So a dedent prefix is not good enough.
On the other hand, if we introduce ```, we can dedent it by default.