D-string vs str.dedent()

IIUC the proposed d-string dedents by the whitespace prefix before the closing """.
In the example you cited, there was no prefix before the closing """.
So the embedded indentation was preserved.

(I had to look twice at the example as well.)

End-user perspective:

While I do have multiline string literals and occasionally have to look up dedent to get the code to both look nice and conform to black & friends, I still find that this usage is just too rare.

Instead, what if we had tagged template literals, like ES6?

Dedent could be a first-party use case, and libraries would bring in many others.

Personally, I don’t tend to use dedent a lot to tidy up multi-line strings. I typically just use a global variable, which I can write without indentation, so there’s no need for a dedent. Having said that, it is clumsy and being able to define the string at point of use would be convenient.
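
To make the two styles concrete, here is a small sketch that compares a module-level constant written flush left with the same text defined at the point of use via textwrap.dedent. The names USAGE and usage_inline are made up for illustration.

```python
import textwrap

# Style 1: module-level constant, written flush left so no dedent is needed.
USAGE = """\
usage: tool [options]
  -h  show help
"""


def usage_inline() -> str:
    # Style 2: defined at the point of use. textwrap.dedent removes the
    # leading whitespace that is common to all non-blank lines.
    return textwrap.dedent("""\
        usage: tool [options]
          -h  show help
        """)


print(usage_inline() == USAGE)  # True
```

Both spellings yield the same string; the difference is only where the text lives in the source file.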

So I’d find something more convenient beneficial, but not essential. But as noted, I don’t find the problem sufficiently important to have strong feelings on the matter.

  • I would use a str.dedent method. That seems like the simplest to add, and a good “quick win”.
  • I would probably use a d-string, but I do have some concerns about the proliferation of string prefixes, and I have sympathy with the idea that a generalised tagged string literal syntax would be a better long-term solution.

The ValueError cases seem like they could be the cause of confusion. People will forget that the quotes have to be on lines of their own, and ValueError at runtime is not a great result. A syntax error would be better. Yes, linters could also catch this, but not everyone uses linters (nor should they have to).

Also, will the resulting string have a terminating newline? I’ve had cases where I’ve needed both options, and the d-string approach will have to pick one, and not support the other. The str.dedent approach of using the initial line’s indent doesn’t have this problem (needing an initial newline is much rarer, in my experience).
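
For reference, this is how the existing textwrap.dedent lets the author pick either option through the placement of the closing quotes. Whether a str.dedent method would behave the same way is an assumption here, not something the proposal states.

```python
import textwrap

# Closing quotes on their own line: the result ends with a newline.
with_newline = textwrap.dedent("""\
    hello
    world
    """)

# Closing quotes on the last content line: no terminating newline.
without_newline = textwrap.dedent("""\
    hello
    world""")

print(repr(with_newline))     # 'hello\nworld\n'
print(repr(without_newline))  # 'hello\nworld'
```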

Overall, I’m +1 on a str.dedent method, and maybe +0 on d-strings with the existing dedent semantics, but -0 on d-strings with the proposed “last line indent” semantics.

I was talking about a str method that implements exactly the same algorithm as the d-string.

t1 = d"""
      hello
    """
t2 = """
      hello
    """
assert t1 == t2.dedent_as_d_string()

Do you mean SyntaxError instead of ValueError from this str method?

Java and Julia use the longest common indent of the non-blank lines and the last line, the one containing the terminating quote. They allow putting the terminating quote on the last content line to omit the final newline.
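
A rough Python sketch of that rule, for illustration only. The function name dedent_closing_aware is hypothetical, the input is assumed to be the text between the line break after the opening quotes and the closing quotes, and the sketch ignores Java’s trailing-whitespace stripping, escape handling, and mixed tabs and spaces.

```python
import textwrap


def dedent_closing_aware(body: str) -> str:
    """Hypothetical helper: dedent by the smallest indent found on the
    non-blank lines and on the closing-quote line (always the last line)."""
    lines = body.split("\n")

    def indent_of(line: str) -> int:
        return len(line) - len(line.lstrip(" \t"))

    # Blank content lines are ignored; the last line always counts, even
    # when it holds nothing but the indent before the closing quotes.
    significant = [ln for ln in lines[:-1] if ln.strip()] + [lines[-1]]
    margin = min(indent_of(ln) for ln in significant)

    # Whitespace-only lines collapse to empty strings.
    return "\n".join(ln[margin:] if ln.strip() else "" for ln in lines)


print(repr(dedent_closing_aware("    Hello\n  ")))  # '  Hello\n'
print(repr(dedent_closing_aware("    Hello")))      # 'Hello'
# textwrap.dedent ignores the whitespace-only last line when computing the margin.
print(repr(textwrap.dedent("    Hello\n  ")))       # 'Hello\n'
```

The only difference from textwrap.dedent in this sketch is that the closing-quote line takes part in the margin calculation.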

var s = """
    Hello
  """;  // s == "  Hello\n"  -- the indent is calculated from the closing-quote line. This differs from textwrap.dedent().

var s = """
    Hello""";  // s == "Hello"  -- the indent is calculated from this single line. No indentation is kept.

var s = """
    Hello\
  """;  // s == "  Hello" -- a backslash removes the last newline while keeping some indent. This is rare.

On the other hand, Swift trims the terminating newline, so you need to add a blank line to get a terminating newline.
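
A rough Python sketch of a Swift-like rule, where the indent comes from the closing-quote line alone. The function name dedent_by_closing_line is hypothetical, exempting blank lines from the indent check is an assumption, and ValueError is only a stand-in for whatever error a real implementation would report.

```python
def dedent_by_closing_line(body: str) -> str:
    """Hypothetical helper: strip the closing-quote line's indent from every
    line and drop the newline that precedes the closing quotes."""
    *content, closing = body.split("\n")
    if closing.strip():
        raise ValueError("closing quotes must be on a line of their own")

    result = []
    for line in content:
        if not line.strip():
            result.append("")  # assumption: blank lines are exempt
        elif line.startswith(closing):
            result.append(line[len(closing):])
        else:
            raise ValueError(f"line is indented less than the closing quotes: {line!r}")
    return "\n".join(result)


print(repr(dedent_by_closing_line("    Hello\n  ")))    # '  Hello'
print(repr(dedent_by_closing_line("    Hello\n\n  ")))  # '  Hello\n'
```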

var s = """
    Hello
  """  // s == "  Hello"

var s = """
    Hello

  """ // s == "  Hello\n"

I prefer having a terminating newline by default, so I prefer the Java and Julia behaviour, although my first post borrowed from Swift.

I don’t know tagged strings, so I don’t have an opinion about them for now. I will learn about them later.

As for details, I prefer the Julia rules. Differences between Julia and Java:

  1. Java trims trailing whitespace. This is an interesting idea, but I think that it should be considered separately. There are reasons to trim trailing whitespace in ordinary triple-quoted strings too.
  2. Java does not allow any non-whitespace characters on the line with the opening quote. Python, however, has several styles of writing triple-quoted strings:
    • """This is a docstring.
      """
      
    • """
      This is a docstring.
      """
      
    • """\
      This is a docstring.
      """
      
    I think that it is better to allow non-whitespace characters in the line with the opening quote.

There’s precedent for the intended use case in PEP 257.

TatSu uses a variation of that algorithm to be able to specify templates for code rendering, have them look nice in the Python module, but get rid of the leading spaces so a formatted template can be indented and embedded into the result of another template.

Typically templates start with """\ so leading spaces are recovered from the first significant line.

def trim(text, tabwidth=4):
    """
    Trim text of common, leading whitespace.

    Based on the trim algorithm of PEP 257:
        http://www.python.org/dev/peps/pep-0257/
    """
    if not text:
        return ''
    lines = text.expandtabs(tabwidth).splitlines()
    maxindent = len(text)
    indent = maxindent
    for line in lines[1:]:
        stripped = line.lstrip()
        if stripped:
            indent = min(indent, len(line) - len(stripped))
    trimmed = (
        [lines[0].strip()] +
        [line[indent:].rstrip() for line in lines[1:]]
    )
    i = 0
    while i < len(trimmed) and not trimmed[i]:
        i += 1
    return '\n'.join(trimmed[i:])
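
For comparison, the standard library already ships a PEP 257 style cleaner as inspect.cleandoc. The sketch below shows how it differs from textwrap.dedent when text starts on the opening-quote line. The sample text is made up for illustration.

```python
import inspect
import textwrap

text = """First line.
    Second line.
      Indented more.
    """

# cleandoc strips the first line separately, removes the common indent of
# the remaining lines, and drops leading and trailing blank lines.
cleaned = inspect.cleandoc(text)
print(repr(cleaned))   # 'First line.\nSecond line.\n  Indented more.'

# dedent looks at all lines together; the unindented first line makes the
# common margin empty, so the content lines keep their indentation.
dedented = textwrap.dedent(text)
print(repr(dedented))  # 'First line.\n    Second line.\n      Indented more.\n'
```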

Sorry, no. I thought the comment was in relation to the d-string form. I don’t think d-strings should raise runtime errors; they should compile directly to the dedented string. Sorry for the confusion.

We already have textwrap.dedent. I think having other approaches (whether d-strings or a string method) that implement different rules would be needlessly confusing. So unless you’re proposing to change textwrap.dedent (which has its own backward compatibility issues) I would prefer the Python rules (:wink:) over the Julia or Java ones.

If we go with from __future__ import dedent_multiline_string, I agree with you, because users would need to check all existing triple-quoted strings.

On the other hand, if we go with a d""" prefix, we don’t need to care about backward compatibility.
I prefer not allowing d"""\ or d"""first line, because that is much simpler and cleaner.

Julia’s basic rule is “triple-quoted strings are also dedented to the level of the least-indented line”. That is the same as textwrap.dedent().
There are some differences between them, but I see the Julia rule as a natural extension of the Python rule to literals.

So whether the Python rule and the Julia rule count as two different rules is subjective.

OK. So call it the textwrap.dedent rule rather than the Julia rule. I’d expect more Python users to understand the former than the latter.

And to repeat, if “there are some differences between them”, either make dedent match the Julia rules, or use the dedent rules unchanged. Having the two forms use different rules would be a source of confusion and errors.