In the previous thread, Serhiy proposed a __future__ import, and I am +1 on it.
But Guido was -1 on __future__.
I think tools like pyupgrade or 2to3 will reduce the maintenance cost of existing code.
The proposal looks nice. If a new syntax is introduced (like d'''...'''), d-strings should not be just a shorthand for textwrap.dedent; they should be done the right way.
The most important extensions of d-strings compared to textwrap.dedent should be compatibility with f-strings and t-strings. Moreover, they should support line continuation, so
foo = d'''
    abc\
    def
    '''
should remove not only the newline, but also all spaces before def. JEP 378 would be a good starting point, but I am not particularly happy that JEP 378 allows the closing quotes to be indented more than the text (with the superfluous spaces being ignored).
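To see why line continuation needs special handling, here is a minimal sketch of what happens today with textwrap.dedent, the only real API used. The d-string result mentioned in the final comment is the behaviour wished for above, not something Python implements.

```python
import textwrap

# The backslash-newline is removed when the literal is compiled, so the
# four spaces in front of "def" end up in the middle of the joined line,
# where dedent() can no longer treat them as indentation.
foo = textwrap.dedent('''
    abc\
    def
    ''')

print(repr(foo))  # '\nabc    def\n'
# The wish above is that a d-string would give "abcdef" on that line.
```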
Here is a tiny suggestion for an optional dedent adjustment. The number specifies the indentation of the result in characters. By default it is 0 (max. dedent possible).
There is an update about str.dedent being more suitable for this.
html = d:4"""\
<div> <!-- leading whitespace trimmed to 4 whitespace chars -->
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
</div>
"""
Indentation could even be inserted (using space characters):
DEFAULT_HTML = d:4"""\
<div> <!-- four leading spaces inserted-->
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
</div>
"""
I don’t think that a proliferation of string modifiers is a good idea.
Can’t we just have some operator added to f-strings, which then takes care of the dedent?
E.g.
f"""{#!dedent}
Some text
More text
"""
The compiler could take care of the conversion at compile time, if possible (some of the formatting in the f-string may prevent determining the right dedent to use), or defer this to runtime by using textwrap.dedent().
I think that’d be a pretty major change to f-strings (a magic keyword that modifies the rest of the string?). I’d rather have a separate modifier that I can ignore if I don’t want to use it.
Think of it as a shebang: the whole formatted string gets passed to a dedent function, very much like a Python script that is read from disk and then run using Python.
I mean I get it. I just think it’s a really big change to f-strings. My mental model of an f-string is “expressions inside of {}, and a literal string outside of {}”. This format would change that.
It’s not insurmountable, but I don’t think such a change is simpler (in terms of teaching it, reading it, etc.) than introducing a different prefix.
I mentioned this idea because it would allow adding more such “interpret this string in some special way” kinds of semantics without having to change Python’s syntax every time (there have been numerous such suggestions in the past, and the d-string one is another new variant).
But yeah, I don’t want to hijack the discussion with a new proposal. I am just suggesting that we may be better off considering these things in a broader sense.
I would still advocate for the simpler alternative of str.dedent(), where we can do the dedenting at compile time.
It doesn’t cover the more esoteric cases mentioned in this thread, but it also doesn’t have the drawbacks of more syntax or a combinatorial explosion of string prefixes.
In terms of user convenience, the following order would be better
But we need to balance the costs and benefits for Python’s language specification and implementation complexity. My concern is that adding str.dedent() would undermine the case for adding d-strings later, and users would ultimately end up with something less convenient than if d-strings had been introduced.
Therefore, I am prioritizing the d-string discussion ahead of the str.dedent() discussion, and since there is still a year until Python 3.15 is in beta, there is no need to rush the str.dedent() discussion.
@methane please could you edit the message to be specific about what the __future__ import would change? I scrolled back several messages but I don’t think I’ve seen a concrete explanation.
Two more poll suggestions: make it multiple choice, and add a ‘do nothing’ option (and maybe also an ‘add str.dedent()’ option). Currently I can’t vote in the poll: I would vote ‘no’ on the d-prefix, and since I don’t know what the future import entails I would also vote ‘no’ on that, but there’s no option for this!
Thanks for the rationale, this is a fair point.
I cannot edit the vote because of the time limit, so I am reposting it.
I want to poll about d-string vs. __future__ import.
The __future__ import changes the syntax of triple-quoted string literals in the same way as the proposed d-string.
d/D/df/dF/DF/Df/fd/fD/FD/Fd/dt/dT/DT/Dt/td/tD/TD/Td/ud/...
The __future__ import doesn’t need a new prefix.
str.dedent() is the simplest option.
__future__ import if you want both a dedenting literal and str.dedent()
__future__ import changes triple-quoted literals to auto dedent
str.dedent(), and never add d-string nor __future__ import.
I think the choice between a d-string prefix and a member function depends very much on the mental model and thus on the precise semantics.
I voted in favor of the d-string prefix since I would prefer semantics that go beyond what a function could do, similar to JEP 378. A prefix can only be applied to string literals (including f- and t-strings).
But if dedentation similar to textwrap.dedent were introduced to Python, i.e., something that can be applied after the fact to any string object, I would actually prefer str.dedent(). If the core developers decide that applying str.dedent() to a string literal would be computed at compile time, I would see this as a kind of optimization, like constant folding.
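For readers unfamiliar with constant folding, here is a small sketch of what CPython already does for constant expressions. This is a CPython implementation detail rather than a language guarantee, and as far as I know it does not currently extend to method calls on literals; a compile-time str.dedent() would be a new optimization in the same spirit.

```python
# CPython folds constant expressions while compiling, so the code object
# stores the already-joined string rather than the two operands.
code = compile("x = 'ab' + 'cd'", "<demo>", "exec")

print(code.co_consts)
folded = "abcd" in code.co_consts  # True on CPython
```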
I will vote for str.dedent, because it is a function (method) and as such it can take arguments.
Inspired by the baseline idea, I made a proposal about optional adjustments to dedent. I posted examples for d-strings, but I prefer this one (just a short example without going into details):
if cond:
    html = """\
        <div>
        Lorem ipsum dolor sit amet, consectetur adipiscing elit.
        </div>
        """.dedent(4)  # leave 4 leading whitespace chars when dedenting
Regardless of my proposal, other ideas for fine control of .dedent may appear, and the d-string syntax is not suitable for that.
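The effect of the proposed .dedent(4) can be approximated today with the textwrap module. This is only a sketch: dedent_keep is a hypothetical helper name, and it assumes that "leave 4 characters" means "dedent fully, then re-indent with 4 spaces".

```python
import textwrap

def dedent_keep(text: str, keep: int = 0) -> str:
    """Hypothetical stand-in for the proposed ``str.dedent(keep)``."""
    # Remove the common margin, then put back ``keep`` spaces on every
    # line that is not blank.
    return textwrap.indent(textwrap.dedent(text), " " * keep)

html = dedent_keep("""\
        <div>
            Lorem ipsum
        </div>
        """, 4)

print(html)
```

Because the same helper also works when the source text has less indentation than requested, it covers the "indent, not only dedent" case as well.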
Spelling out the literal number of spaces of indentation looks rather inelegant compared to the visual cue provided by the indentation of the closing quote as I suggested, which turns out to be also the solution adopted by JEP 378:
if cond:
    html = d"""\
            <div>
            Lorem ipsum dolor sit amet, consectetur adipiscing elit.
            </div>
        """  # leave 4 leading whitespace chars when dedenting
Also, str.dedent won’t help preserve line continuation markers and make them meaningful to dedentation in a string literal, and won’t help indent interpolated values in an f/t-string, which are features that can only be helped with a new type of string literal.
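The closing-quote baseline can be sketched as an ordinary function for comparison. The helper name dedent_by_last_line is hypothetical, and the rule it implements (strip exactly the whitespace found on the final line, reject lines indented less than that) is one possible reading of the idea, not the JEP 378 algorithm.

```python
def dedent_by_last_line(text: str) -> str:
    """Hypothetical helper: use the last line's whitespace as the margin."""
    *body, last = text.split("\n")
    if last.strip():
        raise ValueError("closing quotes must be on their own line")
    out = []
    for line in body:
        if line.strip() and not line.startswith(last):
            raise ValueError("line is indented less than the closing quotes")
        # Blank lines are normalized to empty; others lose the margin.
        out.append(line[len(last):] if line.strip() else "")
    return "".join(line + "\n" for line in out)

html = dedent_by_last_line("""\
        <div>
            Lorem ipsum
        </div>
    """)

print(html)
```

Here the closing quotes sit four columns to the left of the content, so four spaces of indentation remain in the result.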
Thank you for your reply. I’m afraid I got somewhat lost in the discussion regarding which ideas and sub-ideas are currently in favour and which are not.
You find the explicitly set indentation inelegant. Yes, it might look so, but it is more capable than the visual cue. It could also indent, not only dedent (let’s ignore naming for now). And it could replace tabs with spaces or vice versa, if there is any demand for such a feature, of course.
HTML1 = """\
<div>
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
</div>
""".dedent(4)
I voted for dedent, but I disagree that it forever prevents the introduction of d-strings. Yes, it makes it harder to justify adding them, but not impossible.
I don’t mean that str.dedent() makes d-strings impossible forever.
As I explained in this comment, introducing str.dedent() will make d-strings harder.
Since this thread is about improving multiline string literals, we need to focus on that.
So please vote for str.dedent() if (and only if) you are against improving the literal because you think str.dedent() is enough for Python users.
Whether to add str.dedent() will be discussed after we decide whether to improve the literal.
If you are +/- 0 on literal improvement, please skip this vote and discuss str.dedent() later.
OK. The only use cases that I’ve seen which can’t be handled by str.dedent() are f-strings where interpolated values contain newlines. These aren’t literals, so I’ll stick with my vote. I’m happy to discuss the f-string case separately.
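A minimal sketch of that remaining case, using only textwrap.dedent (the variable names are illustrative): the f-string is formatted first, so a multi-line value lands unindented in the middle of the text and leaves no common margin to strip.

```python
import textwrap

body = "line one\nline two"  # interpolated value containing a newline

# Formatting happens before dedent() runs, so "line two" starts at
# column 0 and the common leading whitespace of the whole text is empty.
text = textwrap.dedent(f"""\
    <div>
        {body}
    </div>
    """)

print(text)
```

The printed result still carries the source indentation on every line except the second line of the interpolated value.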