Pre-PEP: d-string / Dedented Multiline Strings with Optional Language Hinting

methane · May 19, 2025, 12:42am

In the previous thread, Serhiy proposed a __future__ import, and I am +1 on it.

But Guido was -1 on __future__:

I think tools like pyupgrade or 2to3 will reduce the maintenance cost of existing code.

tstefan · May 19, 2025, 3:33pm

The proposal looks nice. If a new syntax is introduced (like d'''...'''), d-strings should not be just a shorthand for textwrap.dedent; it should be done the right way.

The most important extensions of d-strings compared to textwrap.dedent should be compatibility with f-strings and t-strings. Moreover, they should support line continuation, so

foo = d'''
       abc\
       def
       '''

should remove not only the newline, but also all spaces before d. JEP 378 would be a good starting point, but I am not particularly happy that JEP 378 allows the closing quotes to be indented more than the text (with the superfluous spaces being ignored).
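
To make the line-continuation point concrete, here is a minimal sketch of what happens today with textwrap.dedent, next to a rough emulation of the behaviour asked for above. The helper name dedent_then_join is hypothetical, and the emulation assumes the text is written as a raw string so the backslash survives until after dedenting; an actual d-string would presumably handle this in the compiler.

```python
import textwrap

# Today: the backslash-newline is consumed when the literal is parsed,
# before textwrap.dedent ever sees the text, so the indentation of the
# continued line ends up inside the string.
current = textwrap.dedent('''
    abc\
    def
    ''')
print(repr(current))  # '\nabc    def\n'


def dedent_then_join(raw_source: str) -> str:
    """Hypothetical emulation: dedent first, then join continued lines."""
    return textwrap.dedent(raw_source).replace("\\\n", "")


# A raw string keeps the backslash, so dedenting can happen first.
wanted = dedent_then_join(r'''
    abc\
    def
    ''')
print(repr(wanted))  # '\nabcdef\n'
```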

xitop · May 19, 2025, 6:07pm

Here is a tiny suggestion for an optional dedent adjustment. The number specifies indentation of the result in characters. By default it is 0 (max. dedent possible).

There is an update about str.dedent being more suitable for this.

        html = d:4"""\
            <div>   <!-- leading whitespace trimmed to 4 whitespace chars -->
                Lorem ipsum dolor sit amet, consectetur adipiscing elit.
            </div>
            """

An indentation could be even inserted (with space characters):

DEFAULT_HTML = d:4"""\
<div>   <!-- four leading spaces inserted-->
    Lorem ipsum dolor sit amet, consectetur adipiscing elit.
</div>
"""
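
For comparison, the same "dedent fully, then leave N characters" adjustment can be approximated today with the standard textwrap module. This is only a sketch: the function name redent is made up for illustration, and it assumes the indentation is always inserted as spaces.

```python
import textwrap


def redent(text: str, width: int = 0) -> str:
    """Dedent fully, then indent every non-blank line by `width` spaces."""
    return textwrap.indent(textwrap.dedent(text), " " * width)


# Over-indented source text is trimmed down to 4 leading spaces.
html = redent("""\
        <div>
            Lorem ipsum dolor sit amet.
        </div>
        """, 4)
print(html)

# Text that starts at column 0 gets 4 spaces inserted instead.
inserted = redent("<div>\n</div>\n", 4)
print(inserted)
```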

malemburg · May 19, 2025, 8:17pm

I don’t think that proliferation of string modifiers is a good idea.

Can’t we just have some operator added to f-strings, which then takes care of the dedent ?
E.g.

f"""{#!dedent}
   Some text
   More text
"""

The compiler could take care of the conversion at compile time, if possible (some of the formatting in the f-string may prevent determining the right dedent to use), or defer this to runtime by using textwrap.dedent().

jamestwebber · May 19, 2025, 8:28pm

I think that’d be a pretty major change to f-strings (a magic keyword that modifies the rest of the string?). I’d rather have a separate modifier that I can ignore if I don’t want to use it.

malemburg · May 19, 2025, 8:56pm

Think of it as a shebang: the whole formatted string gets passed to a dedent function, very much like a Python script that is read from disk and then run using Python.

jamestwebber · May 19, 2025, 9:10pm

I mean I get it. I just think it’s a really big change to f-strings. My mental model of an f-string is “expressions inside of {}, and a literal string outside of {}”. This format would change that.

It’s not insurmountable, but I don’t think such a change is simpler (in terms of teaching it, reading it, etc.) than introducing a different prefix.

malemburg · May 19, 2025, 9:46pm

I mentioned this idea because it would allow adding more such “interpret this string in some special way” kinds of semantics without having to change Python’s syntax every time (there have been numerous such suggestions in the past and the d-string one is another new variant).

But yeah, I don’t want to hijack the discussion with a new proposal. Just suggesting that we may be better off considering these things in a broader sense.

AA-Turner · May 20, 2025, 2:58am

I would still advocate for the simpler alternative of str.dedent(), where we can do the dedenting at compile time.

It doesn’t cover the more esoteric cases mentioned in this thread, but it also doesn’t have the drawbacks of more syntax or combinatorial explosion of string prefixes.

A

methane · May 20, 2025, 4:10am

In terms of user convenience, the following order would be better:

1. both d-string and str.dedent()
2. d-string only
3. str.dedent() only
4. status quo

But we need to balance the costs and benefits of Python’s language specification and implementation complexity. My concern is that adding str.dedent() will negate the introduction of an additional d-string and ultimately make it less convenient for the user than if d-string were introduced.

Therefore, I am prioritizing the d-string discussion ahead of the str.dedent() discussion, and since there is still a year until Python 3.15 is in beta, there is no need to rush the str.dedent() discussion.

AA-Turner · May 20, 2025, 4:50am

@methane please could you edit the message to be specific about what the __future__ import would change? I scrolled back several messages but I don’t think I’ve seen a concrete explanation.

Two more poll suggestions: make it multiple choice, and add a ‘do nothing’ option (and maybe also an ‘add str.dedent()’ option). Currently I can’t vote in the poll: I would vote ‘no’ on d-prefix and I don’t know what the future import entails so I would also vote ‘no’, but there’s no option for this!

A

AA-Turner · May 20, 2025, 4:53am

Thanks for the rationale, this is a fair point.

methane · May 20, 2025, 5:31am

I cannot edit the vote because of the time limit, so I am reposting it.

I want to poll about d-string vs __future__ import.
The __future__ import would change the syntax of triple-quoted string literals to behave like the proposed d-string.

d-string makes string prefix more complex.
- E.g. d/D/df/dF/DF/Df/fd/fD/FD/Fd/dt/dT/DT/Dt/td/tD/TD/Td/ud/...
- But it doesn’t break backward compatibility in the future.
__future__ import doesn’t need new prefix.
- It changes all triple quote literals.
- It will break backward compatibility in the future. But it will make language simple.
- Tools like pyupgrade would help the transition.
str.dedent() is the simplest option.
- But it cannot work nicely with t/f-strings.
  - People want to use t-strings for HTML or SQL, and they want to dedent them too.
- It cannot work nicely with line continuation either.
- Vote for d-string or __future__ import if you want both a dedenting literal and str.dedent()
  - Because we are discussing whether literal improvement is necessary or not.
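
For a sense of scale on the prefix list above, the sketch below counts the spellings of a combined prefix. It assumes d would combine with other prefix letters in any order and any case, the way existing prefixes such as rb and fr do; that combination rule is an assumption here, not something the proposal has fixed. The function name spellings is made up for illustration.

```python
from itertools import permutations, product


def spellings(letters: str) -> set[str]:
    """All order and case variants of a set of prefix letters."""
    variants = set()
    for order in permutations(letters):
        case_choices = [(ch.lower(), ch.upper()) for ch in order]
        for cased in product(*case_choices):
            variants.add("".join(cased))
    return variants


print(sorted(spellings("d")))   # ['D', 'd']
print(len(spellings("df")))     # 8
print(len(spellings("dfr")))    # 48
```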

Add d-prefix to string literal
__future__ import changes triple quote literal to auto dedent
Add str.dedent() and never add d-string nor __future__ import.
Status quo; never add any.


tstefan · May 20, 2025, 6:35am

I think the choice d-string prefix versus member function depends very much on the mental model and thus on the precise semantics.

I voted in favor of the d-string prefix since I would prefer semantics that go beyond what a function could do, similar to JEP 378. A prefix can only be applied to string literals (including f- and t-strings).

But if dedenting similar to textwrap.dedent were introduced to Python, i.e., something that can be applied after the fact to any string object, I would actually prefer str.dedent(). If the core developers decide that applying str.dedent() to a string literal would be computed at compile time, I would see this as a kind of optimization, like constant folding.

xitop · May 20, 2025, 7:10am

I will vote for the str.dedent, because it is a function (method) and as such it can take arguments.

Inspired by the baseline idea, I made a proposal about optional adjustments to dedent. I posted examples for d-strings, but I will prefer this one (just a short example without going into details):

    if cond:
        html = """\
            <div>
                Lorem ipsum dolor sit amet, consectetur adipiscing elit.
            </div>
            """.dedent(4)   # leave 4 whitespace leading chars when dedenting

Regardless of my proposal, other ideas for .dedent fine control may appear and the d-string syntax is not suitable for that.

blhsing · May 20, 2025, 7:29am

Xitop:

    if cond:
        html = """\
            <div>
                Lorem ipsum dolor sit amet, consectetur adipiscing elit.
            </div>
            """.dedent(4)   # leave 4 whitespace leading chars when dedenting

Spelling out the literal number of spaces of indentation looks rather inelegant compared to the visual cue provided by the indentation of the closing quote as I suggested, which turns out to be also the solution adopted by JEP 378:

    if cond:
        html = d"""\
            <div>
                Lorem ipsum dolor sit amet, consectetur adipiscing elit.
            </div>
        """  # leave 4 whitespace leading chars when dedenting

Also, str.dedent won’t help preserve line continuation markers and make them meaningful to dedentation in a string literal, and won’t help indent interpolated values in an f/t-string, which are features that can only be helped with a new type of string literal.
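
The closing-quote rule can be emulated at runtime to see what it produces. The following is a simplified sketch, not the JEP 378 algorithm: it assumes space-only indentation, and the function name dedent_to_closing_quote is made up for illustration. The one difference from textwrap.dedent is that the whitespace on the last line (the one holding the closing quotes) counts towards the margin.

```python
def dedent_to_closing_quote(text: str) -> str:
    """Strip the smallest indent found on content lines or the closing line."""

    def indent_of(line: str) -> int:
        return len(line) - len(line.lstrip(" "))

    lines = text.split("\n")
    # Non-blank lines always count towards the margin.
    counted = [ln for ln in lines if ln.strip()]
    # The final line counts too when it holds only the closing-quote indent.
    if not lines[-1].strip():
        counted.append(lines[-1])
    margin = min((indent_of(ln) for ln in counted), default=0)
    # Blank lines, including the closing line, collapse to empty strings.
    return "\n".join(ln[margin:] if ln.strip() else "" for ln in lines)


def build() -> str:
    return dedent_to_closing_quote("""\
        <div>
            Lorem ipsum
        </div>
    """)


print(build())
```

With the closing quotes indented four spaces less than the `<div>` line, the result keeps four leading spaces on that line, which is the visual cue described above.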

xitop · May 20, 2025, 8:18am

Thank you for your reply. I’m afraid I got somehow lost in the discussion regarding which ideas and sub-ideas are currently in favour and which were not.

You find the explicitly set indentation inelegant. Yes, it might look so, but it is more capable than the visual cue. It could also indent, not only dedent (let’s ignore naming for now). And it could replace tabs with spaces or vice versa, if there is any demand for such a feature, of course.

HTML1 = """\
<div>
    Lorem ipsum dolor sit amet, consectetur adipiscing elit.
</div>
""".dedent(4)

pf_moore · May 20, 2025, 9:08am

I voted for dedent, but I disagree that it forever prevents the introduction of d-strings. Yes, it makes it harder to justify adding them, but not impossible.

methane · May 20, 2025, 9:36am

I don’t mean str.dedent() makes d-string impossible forever.
As I explained in this comment, introducing str.dedent() will make d-string harder.

Since this thread is about improving multiline string literal, we need to focus on it.
So please vote for str.dedent() if (and only if) you are against improving the literal because you think str.dedent() is enough for Python users.

Whether to add str.dedent() will be discussed after we decide whether to improve the literal.
If you are +/- 0 on literal improvement, please skip this vote and discuss str.dedent() later.

pf_moore · May 20, 2025, 10:41am

OK. The only use cases that I’ve seen which can’t be handled by str.dedent() are f-strings where interpolated values contain newlines. These aren’t literals, so I’ll stick with my vote. I’m happy to discuss the f-string case separately.
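
The f-string case mentioned here can be reproduced with today's textwrap.dedent, used as a stand-in for a hypothetical str.dedent(). This is only a sketch with made-up variable names: once an interpolated value contains a newline, its second line starts at column 0 and the common margin becomes zero. Dedenting a plain template before formatting avoids that, although the continuation line of the value is still not indented.

```python
import textwrap

items = "first\nsecond"

# Dedent after interpolation: "second" sits at column 0, so nothing is removed
# (only the whitespace-only last line is normalized).
after = textwrap.dedent(f"""\
    <ul>
        {items}
    </ul>
    """)
print(repr(after))  # '    <ul>\n        first\nsecond\n    </ul>\n'

# Dedent a plain template first, then interpolate with str.format.
template = """\
    <ul>
        {items}
    </ul>
    """
before = textwrap.dedent(template).format(items=items)
print(repr(before))  # '<ul>\n    first\nsecond\n</ul>\n'
```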