PEP 822: Dedented Multiline String (d-string)

Do you think that sentence is sufficient to explain the need to describe the idea of keeping only the newline without content, while already describing the idea of allowing content after the opening quote?

Allowing content after opening quote is more consistent.
And the reason that idea is rejected would just repeat most of the reasons for rejecting the idea of allowing content immediately after the opening quote.

After reading the open PR, I think it is the same topic. I’d like it to be clearer that this is a question of being consistent with other multiline strings or not.

I’m not totally satisfied with the emphasis in the new text, but the topic is covered, which is the main thing I care about.

I do see use cases for writing on the first line of the d-string. It’s more compact, more similar to the way we currently write docstrings, easier to switch…

But I’m convinced linters will fix those problems before they ever get to me. “Add a new line to the start of a d-string if it doesn’t start with a new line” doesn’t seem like a hard rule to add.

OTOH, I think keeping the door open for annotated d-strings, something in the direction of

d"""#SQL
select statement from table;
"""

is valuable.

(Maybe there are other, better things to use that first line for. I think we’ll have a better view on that in the future regardless.)


I’ve been trying to exhaustively envision the use cases of d-strings… and in the end I think there are two ways they can be used cleanly: a ‘vertical flavor’ and a ‘horizontal flavor’…

The vertical flavor wants to copy-paste snippets of text into indented code and add the quotes above and below; in the end it wants to concatenate vertically (so the trailing newline is actually convenient):

... :
     snippet = d"""
       -  Lorem ipsum
           Bla bla bla
     """

text += snippet

The horizontal flavor wants to define blocks and incorporate them into a template afterwards. It probably wants to keep the number of lines minimal, and it does not need the indentation level to be defined by the closing quotes (because the block will be re-indented in the template):

... :
    statement = d"""#some comment
    do_something()"""

... :
    instructions = df"""
    for i in range(n):
        {statement}
    """

(Note: the remaining problem here is the multiline re-indentation of {statement} in the template.)
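The re-indentation problem can be reproduced today without any new syntax. The sketch below uses an ordinary f-string plus textwrap.dedent as a rough stand-in for a d-string (this is an assumption for illustration, not the semantics the PEP specifies); the variable names mirror the example above.

```python
from textwrap import dedent

# A two-line block, like the "horizontal flavor" statement above.
statement = "# some comment\ndo_something()"

# The f-string is interpolated first, so only the first line of
# `statement` lands at the template's indentation level; the second
# line starts at column 0.
instructions = dedent(f"""\
    for i in range(n):
        {statement}
    """)

print(instructions)
```

Because the second line of `statement` has no indentation, the common margin is empty and dedent removes nothing: the loop body ends up split across two indentation levels.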

I don’t think there are really transverse ways (besides the vertical and horizontal flavors) for using d-strings properly.
→ If I’m right, keeping the last newline should actually be more convenient… and ‘transverse flavors’ that would require removing it would be anecdotal and clumsy, and thus intrinsically discouraged by the syntax, so Python’s neatness would be preserved.


I feel torn on this PEP for a few reasons:

  1. I very much prefer the leading \n removal, but I’m unsure of the leading-but-not-trailing \n removal
  2. I sometimes wish I could embed a multi-line string within another multi-line string and have both automatically dedented (I made a t-string-powered library to demonstrate this wish) and I hoped this PEP might accomplish this but it does not (deliberately it seems, as there’s a complexity trade off)
  3. I teach Python and this seems like a syntax I would want my students to know about but the mental model for this doesn’t seem nearly as intuitive to teach as traditional multi-line strings (both due to point 1 and 2 and because dedenting is difficult to reason about in general)

All that said, I really like the idea of improving the dedenting mechanisms in Python, whether through a string method, a new syntax, or just an enhancement to textwrap.dedent

On leading & trailing newline removal

I find myself copy-pasting these two functions around between various projects:

This dedent version that removes the leading newline only:

from textwrap import dedent

def undent(text):
    return dedent(text).removeprefix("\n")

And this version which strips a trailing newline as well (like inspect.cleandoc):

def undent(text):
    return dedent(text).removeprefix("\n").removesuffix("\n")

It seems that I use the first version most of the time but I use the second version (to remove both prefix and suffix \n) about one-third of the time.

It would be nice to avoid using .rstrip() with d-strings, but that would make my primary use case more awkward.
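To make the difference between the two helpers concrete, here is a small sketch that applies both to the same string. The names undent_leading and undent_both are hypothetical, chosen only so the two variants can coexist in one file; str.removeprefix and str.removesuffix require Python 3.9 or later.

```python
from textwrap import dedent

def undent_leading(text):
    """Dedent and drop only the leading newline."""
    return dedent(text).removeprefix("\n")

def undent_both(text):
    """Dedent and drop one leading and one trailing newline."""
    return dedent(text).removeprefix("\n").removesuffix("\n")

sample = """
    first line
      second line
"""

print(repr(undent_leading(sample)))  # trailing newline kept
print(repr(undent_both(sample)))     # trailing newline removed
```

The first variant suits text that will be concatenated with more lines; the second suits values embedded inside another string.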

On dedenting and re-indenting replacement fields

Given this string:

code = r"""
def strip_each(lines):
    new_lines = []
    for line in lines:
        new_lines.append(line.rstrip("\n"))
    return new_lines
""".strip("\n")

I would love it if this:

example = d"""
    Example function:

        {code}

    That function was indented properly!
"""

Resulted in this:

>>> print(example)
Example function:

    def strip_each(lines):
        new_lines = []
        for line in lines:
            new_lines.append(line.rstrip("\n"))
        return new_lines

That function was indented properly!

But I understand that this could complicate the mental model even further, especially raising questions of “what about replacement fields that are mid-line”.
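For comparison, the desired result can be approximated today with a small helper built on textwrap.dedent and textwrap.indent. This is only a sketch under stated assumptions: fill_blocks is a hypothetical name, it is not how the PEP or any library behaves, and it sidesteps the mid-line question by handling only fields that sit alone on their line.

```python
from textwrap import dedent, indent

def fill_blocks(template, **blocks):
    """Hypothetical helper: dedent `template`, then replace each
    {name} field that is alone on its line with the matching block,
    re-indented to the field's own indentation."""
    out = []
    for line in dedent(template).removeprefix("\n").splitlines():
        stripped = line.strip()
        name = stripped[1:-1]
        if stripped.startswith("{") and stripped.endswith("}") and name in blocks:
            prefix = line[: len(line) - len(line.lstrip())]
            out.append(indent(blocks[name], prefix))
        else:
            # Mid-line fields and ordinary text are left untouched.
            out.append(line)
    return "\n".join(out)

code = r"""
def strip_each(lines):
    new_lines = []
    for line in lines:
        new_lines.append(line.rstrip("\n"))
    return new_lines
""".strip("\n")

example = fill_blocks("""
    Example function:

        {code}

    That function was indented properly!
""", code=code)

print(example)
```

Restricting the helper to whole-line fields keeps the rule easy to state, at the cost of leaving mid-line fields for a separate decision.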

On teaching this to beginners

The “How to teach this” section doesn’t currently address teaching this to new Python programmers.

I suspect that I would teach this most often to folks who have never heard of textwrap.dedent and who may never hear of it (if d-strings become successful enough to supersede its use).

My main concern with teaching this to beginners is explaining how it works. The prefix/suffix newline removal feels a bit magical and I’m suggesting even more (likely unfeasible) magic above, but dedenting is also a bit magical, especially if a {...} replacement field includes a string that contains a newline.

I hope some of the above concerns may be useful to consider. I know I’ve repeated/overlapped at least a bit with previously expressed concerns.

Thanks for pushing this idea forward in various forms over the years @methane! I’ve found what I’ve read from the previous threads inspiring.
