PEP750: Template Strings (new updates)

effigies · November 24, 2024, 5:02pm

I’m not really clear on why you would use a t-string instead of an f-string if you just want to interpolate format(value, format_spec), but even if it is the “normal case”, I would rather it be a function (or method) than replace .value.

The main reason is that the format spec is an input to the template processor, not value.__format__. Types that do not support __format__ or don’t support the format_spec mini-language expected by a template processor would produce an error. Either it is done eagerly, making it impossible to construct templates that don’t use __format__, or it is done as a property and computed on demand. IMO, properties should act like attributes as much as possible, and performing a potentially exception-raising computation for valid inputs would seem to violate that.

As a separate point, I don’t really like that the type of .value would change based on the presence of format_spec. (I could see an argument for applying conv eagerly, although I honestly don’t see the use case for conv in a template string, either.)

finswimmer · November 25, 2024, 2:44pm

Hey,

what I’m missing in this PEP is a section that describes how this is different / an improvement over string.Template.

fin swimmer

Nineteendo · November 25, 2024, 5:01pm

See PEP 750 – Template Strings | peps.python.org

Templates provide developers with access to the string and its interpolated values before they are combined. This brings native flexible string processing to the Python language and enables safety checks, web templating, domain-specific languages, and more.

string.Template doesn’t do this, but it might be confusing that it has the same name.
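For contrast, here is a minimal sketch of what string.Template does today, using only the standard library (the variable names are illustrative). It goes straight from a pattern plus a mapping to a finished str, so a processor never sees the literal text and the values separately:

```python
from string import Template

# string.Template combines the pattern and the values in a single step.
greeting = Template("Hello, $name!")
result = greeting.substitute(name="<b>World</b>")

# Only the combined string comes back; nothing marks which part was data,
# so a later escaping step cannot tell literal text from substituted values.
print(result)
```

A PEP 750 template, by comparison, is meant to hand the processor the strings and the interpolated values before they are joined.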

jimbaker · November 25, 2024, 8:11pm

Something similar is seen in PySpark SQL’s support, and in DuckDB’s implicit and explicit support, for working with Pandas DataFrame objects; note that DuckDB extends this to other objects, including functions.

TabAtkins · November 25, 2024, 11:13pm

I’m not really clear on why you would use a t-string instead of an f-string if you just want to interpolate format(value, format_spec) ,

Multiple examples in the PEP use the format options, so I’m not sure what’s unclear about the usefulness of conv/fmt_spec. You could be string-formatting only some of the interpolations, or formatting an interpolation that will still be specially processed by the template processor (for example, HTML-escaping), or just doing normal f-string stuff but to a value that’s gonna later be passed to a formatter that expects a Template.

The main reason is that the format spec is an input to the template processor, not value.__format__ .

It’s not immediately clear to me that this is the case. All of the PEP’s examples using the format spec are using it exactly in the “pass to str.format()” way. The PEP itself links to “format specification”, and says “when present they should be respected, and to the extent possible match the behavior of f-strings”. However, since their handling is completely up to the template function right now, this will likely not be true in practice; many/most template functions will be naively written to not process their format args.

My expected behavior would be:

If a conv is passed, the value is auto-converted appropriately.
If a format_spec is passed:
- If the value has __format__, that’s auto-called with the format spec.
- If it doesn’t, you get the same TypeError that you do today, when you pass format arguments in an f-string with a value that doesn’t have __format__.

I can see an argument for diverging from f-string behavior when the value does have __format__ but the interpolation doesn’t have a format_spec. Today, f-strings still call __format__ with no args, because they’ll need a string anyway, but t-strings could reasonably want to leave the value alone by default.
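A small sketch of today's behaviour, for reference. Strictly speaking every object inherits object.__format__, so the TypeError appears when a non-empty spec reaches that default implementation. The Plain class is made up for illustration:

```python
class Plain:
    """Illustrative type that defines no __format__ of its own."""

    def __str__(self) -> str:
        return "plain"


value = Plain()

# With an empty spec, the inherited object.__format__ falls back to str(value).
empty_spec_result = format(value, "")

# With a non-empty spec, the inherited object.__format__ raises TypeError.
try:
    format(value, ">10")
except TypeError as exc:
    error = exc
else:
    error = None
```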

The big idea, I think, is just that today, in every place that uses format_spec, it’s the value’s responsibility to handle that; the context of the value has no input into the process. Inverting that for t-strings, in a completely opaque, case-by-case basis, doesn’t seem like a great idea to me.

Maybe today’s behavior is just because every place doing formatting is expecting to get a str out of it anyway, so the “context” is all identical and there’s no need for a context-specific control in the first place. But if we do want to diverge here, I think some real thought should be given to how that’s done, and how the ecosystem can be expected to actually handle this. The PEP should also showcase such a context-dependent format spec in an example.

Either it is done eagerly, making it impossible to construct templates that don’t use __format__ , or it is done as a property and computed on demand.

I expect it would be done eagerly, as I outlined above. Then, getting .value on the Interpolation would indeed be a normal property access.

As a separate point, I don’t really like that the type of .value would change based on the presence of format_spec .

If you’re doing what the PEP does in several examples, this already essentially happens - the value is used as-is if there’s no conv/fmt, or turned into a string if there is. Under my assumption (possibly incorrect?) that conv/fmt stuff is intended to always act like an f-string, then it’s not even “changing types” - it’s being formatted as requested, and str is the correct type of the resulting value that the template should see.

jimbaker · November 25, 2024, 11:56pm

Supporting such exact roundtripping is rejected in a much earlier version of the PEP, with a longer exposition: peps/peps/pep-0750.rst at main · jimbaker/peps · GitHub

This rejection got shortened in subsequent editing. I think our rationale was that we wanted to see what it would take to implement, but fundamentally we didn’t want to lose the ergonomics of template strings.

For your yaml example function, we expect that simply using the Interpolation.expr will be a common pattern for key/value type settings. See for example in the PEP the implementation of ValuesFormatter for structured logging which uses this fragment:

    def values(self, template: Template) -> Mapping[str, Any]:
        return {
            arg.expr: arg.value
            for arg in template.args
            if isinstance(arg, Interpolation)
        }

Another example would be a boolean attribute like checked in applicable HTML elements in an extension of the example html function presented in the PEP:

html(t'<input type="checkbox" {checked} />')

should render to

<input type="checkbox" checked />

if checked, otherwise

<input type="checkbox" />

This is possible because Interpolation.expr == 'checked'.

Hopefully this helps with how you might want to implement your yaml function with template strings. I expect your example would then be:

blah:
    {foo}

which renders to the following:

blah:
    foo:
        key: "value"
        other: 123

ruro · November 26, 2024, 9:07am

I am not 100% sure if it makes sense to lump these two issues together. format_spec round-tripping and equals are basically orthogonal, so the rejection of one of those features doesn’t necessarily have to invalidate the other.

Unfortunately, this doesn’t quite work. For example, consider an extended version of your checkbox example:

def twoCheckboxes(checked1, checked2):
    return [
        html(t'<input type="checkbox" {checked1} />'),
        html(t'<input type="checkbox" {checked2} />'),
    ]

Under your proposed semantics, this would produce <input type="checkbox" checked1 /> which is not valid.
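To make the failure concrete, here is a rough sketch of an expr-keyed rule. The Interpolation dataclass is a stand-in with only the two fields used here, and checkbox() is a hypothetical helper, not the PEP's html():

```python
from dataclasses import dataclass


@dataclass
class Interpolation:
    """Stand-in for the PEP's Interpolation; only the fields used here."""
    value: object
    expr: str


def checkbox(interp: Interpolation) -> str:
    # Expr-keyed rule: emit the expression text itself when the value is truthy.
    attr = interp.expr if interp.value else ""
    if attr:
        return f'<input type="checkbox" {attr} />'
    return '<input type="checkbox" />'


# Works only when the Python name happens to match the HTML attribute name.
matching = checkbox(Interpolation(True, "checked"))

# Any other variable name leaks into the markup.
mismatched = checkbox(Interpolation(True, "checked1"))
```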

Same thing happens with yaml. It makes sense to allow both

blah:
    {foo=}
    # equivalent to foo: {foo}

and

blah:
    foo: {arbitrary_variable_name}
    # or even foo: {arbitrary[python_expression(None) ** 42]}

Always tying the yaml key to the expr doesn’t work. Debug {x=} formatting is a convenience feature only for the case where the name of the variable in python and in the target language happens to match.

funkyfuture · November 26, 2024, 2:29pm

iirc this was pointed out in the previous related thread, but i want to reiterate that critique after i took the time to read the PEP: the html function example that is used throughout the document is just not convincing, simply because the implementation would still have to validate the literal parts of the templates. that would be much easier (for the processing lib’s author as well as for lib users) with the approach that many libraries already use with functions (e.g. html(body(h1("The Life Of Brian", _class="underlined")))). such constructs are also easier to perceive when they are highly nested with Python’s syntax.

to be clear: this is an editorial issue.

but i’d also be interested in @hynek’s take on the logging example.

erlendaasland · November 26, 2024, 9:06pm

I finally got around to reading the updated PEP, and I like it very much. Good luck with it!

hynek · November 27, 2024, 5:50am

As far as I can see, it’s only using t strings for formatting fields/interpolation, so I don’t think there is much to be said in the context of logging, specifically. It seems to be just an application that needs that kind of generic functionality.

dkp · November 27, 2024, 6:05pm

This is exciting to see – thank you!

Lysandros’ tstrings branch is far enough along that you could start using it today; the devcontainer.json and Dockerfile in the examples repo might be useful.

dkp · November 27, 2024, 6:35pm

Thanks, Tab!

The intent of the current PEP is to provide template processing code with the ability to fully manage both conv and format_spec. Such code is welcome to handle them in the manner of f-strings, to handle format_spec without format(), to ignore them entirely, etc. Re-reading the PEP today, I think we need to call this out more clearly.

My experience working with the early implementation of the PEP is that this flexibility is a net positive. For instance, an html() method can offer custom format specs that look a lot like Django or Jinja’s template filters. Eager invocation of format() seems unwelcome here.
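As a descriptive sketch of what such a processor might look like: the format spec is read as a "|"-separated list of filter names instead of being handed to format(). The Interpolation stand-in, the FILTERS registry, and render() are all hypothetical names, not part of the PEP:

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Interpolation:
    """Stand-in for the PEP's Interpolation; only the fields used here."""
    value: object
    format_spec: str = ""


# Hypothetical filter registry, loosely modelled on Jinja/Django filter names.
FILTERS = {
    "upper": str.upper,
    "title": str.title,
    "trim": str.strip,
}


def render(parts: list) -> str:
    out = []
    for part in parts:
        if isinstance(part, str):
            out.append(part)  # literal template text is passed through
            continue
        text = str(part.value)
        # Interpret the spec as filter names; format() is never called.
        for name in [n for n in part.format_spec.split("|") if n]:
            text = FILTERS[name](text)
        out.append(escape(text))  # escape only the interpolated data
    return "".join(out)


page = render(["<h1>", Interpolation("  life of <brian> ", "trim|title"), "</h1>"])
```

Eagerly calling format(value, "trim|title") would raise for a str value, which is why a processor like this needs the raw value and spec.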

In addition, the PEP currently leaves the door open for tb literals in the future – for example, struct.pack(tb"{value:>H}") – but __format__() dunders return str, not bytes.

I’ve generally found this flexibility to be “just fine” from the perspective of writing template processing code. format() is a builtin and is easy to invoke if desired. That said, there are probably some small things we could consider to improve the DX of writing template processing code:

We could ship a convert(value: object, conv: Literal["a", "r", "s"]) -> str function too, probably in templatelib.
We could add an Interpolation.formatted_value() -> object method too, for those that want it. It would invoke convert() and then format() if one or both is needed; otherwise, it would just return the unformatted .value directly.
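A minimal sketch of how these two helpers could behave, written as free functions so it runs without the PEP's implementation. Taking conv and format_spec as parameters of formatted_value() is an assumption made for this sketch; in the proposal it would be a method reading them from the Interpolation:

```python
from typing import Literal, Optional


def convert(value: object, conv: Literal["a", "r", "s"]) -> str:
    """Apply an f-string style conversion (!a, !r, !s)."""
    if conv == "a":
        return ascii(value)
    if conv == "r":
        return repr(value)
    if conv == "s":
        return str(value)
    raise ValueError(f"unknown conversion: {conv!r}")


def formatted_value(
    value: object,
    conv: Optional[Literal["a", "r", "s"]] = None,
    format_spec: str = "",
) -> object:
    """Convert and/or format only when requested; otherwise return value as-is."""
    if conv is not None:
        value = convert(value, conv)
    if format_spec:
        value = format(value, format_spec)
    return value
```

With neither a conversion nor a spec, the original object comes back untouched, so processors that care about the value's type still see it.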

As for the f() method in the PEP: I took the editorial bent that the main thing we wanted to convey in our examples was how t-strings generalize f-strings, and in which cases t-strings might be preferable. So it seemed to make sense to offer f() as an example. Its use in other examples is mostly incidental, except perhaps for the logging example. I could see an argument for shipping templatelib.f() but I haven’t suggested it before because I’m not convinced it will be commonly used in practice – if a dev just wants f(), perhaps f-strings are just fine?

dkp · November 27, 2024, 6:43pm

My understanding is that the lack of an Interpolation.debug comes down to implementation details; I’ll leave it to Lysandros to comment.

In practice, I personally wouldn’t use the debug feature or Interpolation.expr for either yaml or html syntax. In the checkbox boolean case, you’d need to do:

checked = True
rendered = html(t'<input type="checkbox" checked={checked} />')
assert rendered == '<input type="checkbox" checked />'

(The implementation of html() in our PEP 750 examples repo supports this already; if checked is False, no attribute will be added.)
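One way a processor could handle this, sketched under assumptions: the template is modelled as a plain list of strings and stand-in Interpolation objects, and render_html() is a hypothetical function, not the implementation in the examples repo. It looks at the literal text just before a boolean value to find the attribute name:

```python
import re
from dataclasses import dataclass
from html import escape


@dataclass
class Interpolation:
    """Stand-in for the PEP's Interpolation; only the field used here."""
    value: object


# Matches a trailing ` name=` at the very end of a literal chunk.
_ATTR_PREFIX = re.compile(r"\s+([\w-]+)=$")


def render_html(parts: list) -> str:
    out: list[str] = []
    for part in parts:
        if isinstance(part, str):
            out.append(part)
            continue
        match = _ATTR_PREFIX.search(out[-1]) if out else None
        if match and isinstance(part.value, bool):
            # Boolean attribute: drop the ` name=` text, then re-add the
            # bare attribute name only when the value is True.
            out[-1] = out[-1][: match.start()]
            if part.value:
                out.append(f" {match.group(1)}")
        else:
            out.append(escape(str(part.value)))
    return "".join(out)


on = render_html(['<input type="checkbox" checked=', Interpolation(True), " />"])
off = render_html(['<input type="checkbox" checked=', Interpolation(False), " />"])
```

The attribute name comes from the literal text, so the Python variable can be called anything.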

and for YAML, I’d want to think a bit about the syntax, but perhaps something like:

foo = {"key": "value", "other": 123}
assert yaml(t"""
blah:
    foo: {foo}
""") == """
blah:
    foo:
        key: "value"
        other: 123
"""

The yaml() processor would have to determine that interpolation.value is a dict and that, grammatically, a dictionary is allowed in this position.
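A very rough sketch of that dict-in-value-position case, assuming the template is modelled as a list of strings and stand-in Interpolation objects. render_yaml() is a hypothetical name; it handles only flat dicts, does no quoting of special characters, and skips the grammar check entirely:

```python
from dataclasses import dataclass


@dataclass
class Interpolation:
    """Stand-in for the PEP's Interpolation; only the field used here."""
    value: object


def scalar(value: object) -> str:
    # Naive scalar rendering: quote strings, leave everything else to str().
    return f'"{value}"' if isinstance(value, str) else str(value)


def render_yaml(parts: list) -> str:
    out = ""
    for part in parts:
        if isinstance(part, str):
            out += part
        elif isinstance(part.value, dict):
            # Nest the dict one level (4 spaces) deeper than the current line.
            current_line = out.rsplit("\n", 1)[-1]
            indent = len(current_line) - len(current_line.lstrip()) + 4
            out = out.rstrip(" ")  # drop the space after "foo:"
            for key, value in part.value.items():
                out += "\n" + " " * indent + f"{key}: {scalar(value)}"
        else:
            out += scalar(part.value)
    return out


foo = {"key": "value", "other": 123}
rendered = render_yaml(["\nblah:\n    foo: ", Interpolation(foo), "\n"])
```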

I haven’t thought much about yaml() specifically, but my hunch is that a fairly elegant t-string exposure is possible (and without any need for the debug specifier)…

dkp · November 27, 2024, 7:05pm

Thanks Frank, this is helpful feedback!

Stepping back, the editorial bent I took with the PEP is that we wanted to explain how t-strings generalize f-strings, and when they might be preferable. I hope that the motivation section and the sum total of the examples is compelling, even if one given example isn’t. Any suggestions for improvement are most welcome.

As for html() specifically: I see it as a choice between using a “pythonic DSL” (in which case, nested functions like you suggest) or a “string DSL” (in which case, t-strings would be the way to go).

There are certainly pros and cons to each approach and I personally don’t feel comfortable saying that one is clearly superior to the other.

That said, I’m happy to offer my personal experience: I’ve spent the last year using htpy and domonic, two well-developed Python HTML builder libraries, to build mid-sized websites. There are cases where I find builders straightforward, particularly when I’m building small interactive bits of the DOM. And there are cases where I find them extremely frustrating relative to (say) Jinja templates, particularly for larger portions of pages that mix complex bits of DOM with lots of text. Management of significant whitespace becomes a real chore, amongst other things.

Looking at the Python ecosystem: judging from the lack of stars and small number of PyPI downloads, HTML builders aren’t heavily adopted at the moment. On the other hand, my instinct from the large number of forum posts is that f-strings are regularly used to build HTML (with all the XSS and other issues this introduces).

More broadly: the JavaScript ecosystem had a large number of “HTML builder” packages before tagged template strings became a core part of the language. Particularly once tools (formatters, colorizers, linters, etc.) caught up, the community seems to have moved away from using builders and towards template literals. My instinct is that the ability to have properly formatted HTML (and CSS) in code context proved pretty compelling in practice.

dimaqq · November 28, 2024, 8:36am

I’ve given the PEP a very quick read:

Python f-strings are easy to use and very popular. Over time, however, developers have encountered limitations that make them unsuitable for certain use cases.

the linked page doesn’t mention f-strings, it’s not clear what limitations were encountered… or maybe I didn’t read carefully enough?

Arbitrary String Literal Prefixes

This approach was rejected for several reasons:

It was deemed too complex to build in full generality. JavaScript allows for arbitrary expressions to precede a…

I’m a bit confused by this language, does this refer to e.g. the “t” function in translated = t‵age: ${42}‵ or something like (some[expr.here])‵${a} ${b}‵ ?

As a Python user, I think I’d be a little disappointed if I don’t get neat (and nested) syntax for inline template literals that I can [ab]use for my DSL. “Too complex” seems very vague… and I’m not even sure exactly what I’m getting at current PEP stage. Maybe that can be clarified.

ajoino · November 28, 2024, 9:29am

Regarding the “too complex” part: one of the earlier drafts, possibly even from before it was a PEP, included arbitrary string prefixes and lazy evaluation, which as I understood it (but might have been wrong) could have been abused to create the often-wanted lazily evaluated function. Some people would definitely have abused this for general lazy evaluation, and that in my mind would have made it too complex in the sense that it allows too much. But this is all based on my understanding of a very early version of this work and might be completely wrong.

There’s also the issue of when a function becomes a prefix. I think the original proposal was to just make html"..." an alias for html(...), which would make things more complex in the compiler and also create opportunities for hard-to-find bugs, as forgetting a space would now cause a function call. Once again, please correct me if I’m mistaken about the earlier drafts of the proposal.

Another potential issue with arbitrary string prefixes is that it would take away prefixes for the language to use, unless some rules are applied like at least 3 characters, which introduces some complexity both from an implementation and language evolution perspective.

I am happy that they removed arbitrary string prefixes and prefer the current version.

TabAtkins · November 28, 2024, 7:00pm

My experience working with the early implementation of the PEP is that this flexibility is a net positive. For instance, an html() method can offer custom format specs that look a lot like Django or Jinja’s template filters. Eager invocation of format() seems unwelcome here.

All right, this is reasonable! The PEP could really use an example of this, then. Even as just a descriptive sketch rather than worked-out example code, knowing that this is an intended option is helpful for guiding how we think about it.

[convert(), and formatted_value()]

Yes, these would be welcome. As the PEP examples show, this is going to be a common operation already (and the PEP even states that it’s expected that the standard format commands will work, absent a good reason to do something different), so there’s no particular reason to force people to write boilerplate when we can just make it easy for them to do the right thing.

dimaqq · November 29, 2024, 6:32am

I would propose this:

Update the first page of the PEP with useful examples with a bit more detail (when are local vars or expressions captured?)

Summarise the “too difficult” bits and put a paragraph in the PEP.

P.S. I wish we’d repurpose the backticks. I was also kind of hoping to see same API as ES6, same argument names, etc.

larry · November 30, 2024, 7:21pm

I’ve been thinking a lot recently on the topic of templating. I think it’s a misstep for PEP 750 to propose a syntax here in 2024 that doesn’t support the concept of a “filter”.

If I understand correctly, Jinja2, Mako, and Django templates are the three most popular templating libraries for Python and it’s not close. All three support filters. And it turns out–while their syntaxes vary wildly in most respects, they all agree on the syntax to specify a filter. At the end of an expression in a substitution, you add:

    | <filter>

This means “evaluate the expression, turn it into a string, then apply the filter <filter> to that string” in all three libraries. Sure, they differ in how you supply arguments to a filter, and how you specify multiple filters, but all three use literally identical syntax for “apply one filter with no arguments”. I therefore assert that this is a popular feature and there’s an obvious spelling for it. And I think PEP 750 template strings should add it.

Yes, this would add a small incompatibility with f-strings, as | isn’t a special character in the f-string format spec. If a user used | in a format spec today, it’d be passed on to the object in the (non-self) argument to __format__. As a practical matter, none of the built-in objects support | in their format spec. And I admit I don’t know, but: I don’t believe there’s widespread use of custom __format__ methods, and I suspect that the ones that support custom methods don’t do anything interesting with |. Also, template strings are new and their own thing, and while I think “support everything that f-strings support” is an excellent starting point, in this case I urge you to go one step farther.
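For reference, a small sketch of what happens to | in a format spec today. The Recorder class is made up for illustration. One caveat worth knowing: in the standard format spec mini-language, | can already act as a fill character when it is followed by an alignment character:

```python
class Recorder:
    """Illustrative type that echoes whatever format spec it receives."""

    def __format__(self, spec: str) -> str:
        return f"<spec={spec!r}>"


# Everything after the colon, including "|", reaches __format__ untouched.
echoed = f"{Recorder():|upper}"

# Built-in types reject a spec they do not recognise...
try:
    format(42, "|upper")
except ValueError:
    rejected = True
else:
    rejected = False

# ...but "|" followed by an alignment character is a valid fill today.
filled = format(42, "|>5")
```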

As far as what syntax to use for supplying arguments, or calling multiple filters, I think only Jinja2 got it right. To call multiple filters, you’d apply this same syntax multiple times:

    | filter1 | filter2

To supply arguments, you allow the filter spec to be an expression that produces the filter you want to apply. I propose we follow in the footsteps of decorators here: limit this expression to either a single possibly-dotted symbol name, or a function call. (We could relax this restriction later if it seemed advisable.) In case I’m stating that badly, I propose we allow the filter name to optionally be followed by a single set of parentheses, which turns the filter spec into a function call:

    | filter_a(...) | filter_b | filter_c(...)

[Edit: added possibly-dotted above.]

ajoino · November 30, 2024, 8:34pm

I’ve only used Jinja sparingly and never this filter feature. I understand what the filtering does, but not exactly how it’s useful. Could you provide some examples of where these filters might be used?