Introduction of convenience accessors on Template including strings, interpolations, and values
Discussion of how t-strings and old-style format strings do and don’t relate, along with new example code to take an old-style format string and convert it into a Template
Many smaller bug fixes and improvements (see the PR for details)
I think .args as an interleaved list is an implementation detail that should not be baked into the PEP.
For example, a template processor could cache the static parts of the template and only reprocess the dynamic parts when the template is evaluated with different values.
This seems like it is now redundant with .strings, which would allow you to keep it as an implementation detail and not encourage users to depend on it or require other implementations to maintain compatibility.
The rejection of tb-strings is done in-passing, without justification:
Like f-strings, t-strings may not be combined with the b or u prefixes.
For the same reason that we don’t support bytes.format(), you may not combine 'f' with 'b' string literals. The primary problem is that an object’s __format__() method may return Unicode data that is not compatible with a bytes string.
This restriction does not apply to a template, since it is the function the template is passed to that needs to decide what to do with it. I’m assuming that you don’t want to complicate this PEP by actually defining a Template[bytes], but it would be nice to see this explicitly declared out-of-scope or in Rejected Ideas (and my preference would be out-of-scope, since it seems like a natural extension in the future).
For the “structured logging” example, it appears to me that in order to comply with a predefined structure, users have to give their local variables exactly the names the structure expects. Could there be a bit more flexibility (e.g. a way to specify a different name for a slot)?
In addition, a slightly off-topic question: if a logger expects a specific type for a specific key, is there a way to support static type checking inside templates? E.g. if a structured logger expects type(timestamp) == float, is there a way to warn the user when they pass in a timestamp variable of type str?
That’s an interesting point about slots. Short answer: no. t-strings are a step up from f-strings and as such, just use normal scope rules.
The encapsulation you’re looking for is functions that act like components, mediating the input/output for a t-string. Here’s an example from a predecessor project.
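Roughly, the component pattern looks like this (hypothetical names throughout; an f-string stands in for the t-string since the new syntax needs a new interpreter). The keyword-only parameters fix the slot names, so the caller’s local variables can be named anything:

```python
# Hypothetical "component" function: inside it, the local names match
# the structure's slots, so a t-string like t"{method} {path}" would
# line up; callers map their own locals onto the slots explicitly.
def request_log(*, method, path, status):
    return {
        "method": method,
        "path": path,
        "status": status,
        "message": f"{method} {path} -> {status}",
    }

# Caller's local names need not match the structure's slot names.
verb = "GET"
url = "/index.html"
code = 200
entry = request_log(method=verb, path=url, status=code)
```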
Structured logging allows developers to log data in both a human-readable format and a structured format (like JSON) using only a single logging call.
I don’t like this paragraph. It is talking about producing both a plaintext log AND a structured log.
Structured logging is not about human readability. For example, an OTLP Event/LogRecord is serialized as protobuf, which is not human-readable.
Some other examples:
Structured logging is the process of producing, transmitting, and storing log messages in a format that’s easily machine-readable, such as JSON. The main advantage here is that by ensuring logs are structured consistently, you’ll get faster and more accurate automated processing and analysis.
Structured logging involves recording log events in a well-defined, machine-readable format such as JSON. Instead of writing log messages as plain text, log data is organized into key-value pairs, making it easier to search, filter, and analyze.
This is why I dislike “Structured logging allows developers to log data in both a human-readable format and …”
Additionally, mixing a plaintext log and a JSON log on one line doesn’t really count as structured logging:
JSON/XML/protobuf/etc. parsers don’t understand the >>> separator.
You can write a plaintext log to the console and send a structured log to a log transfer agent in a single logging call, but mixing the two styles into one string is not good practice, and not good as an example.
We chose this approach because the venerable Python logging cookbook — part of the official Python docs — already takes it: its default “structured logging” example emits a single string with a human readable portion, a separator (>>>>), and a JSON-structured section.
I think the second “approach” to structured logging in PEP is nicer in that it cleanly separates human-readable from structured output; devs can ignore the human-readable stuff entirely if they want.
“Structured logging allows developers to log data in both a human-readable format and…”
I could see altering this sentence to “Structured logging allows developers to log data in a machine-readable format”, but the reason we opted for its current wording was mostly to keep in line with the existing cookbook example.
The example code contains more than just a definition of what structured logs are.
We should not change the definition of “structured logging” just to stay in line with the example; that confuses readers. I don’t want to teach Python users to use a technical term the wrong way.
By the way, the cookbook seems a bit old. Observability has spread at a very rapid pace over the past five years, and along with it, best practices for structured logging have become established.
Common practice for structured logging puts the human-readable message in the structured log’s fields, like {"message": "message 1", "snowman": "\u2603", "set_value": [1, 2, 3]}.
Structured logging allows developers to log data in machine-readable
formats like JSON. With t-strings, developers can easily log structured data
alongside human-readable messages using just a single log statement.
Thanks for considering the tb-string case, and I’m glad you also feel that it’s a plausible extension! One point that I brought up in passing in the (closed) GitHub thread that I want to re-raise:
I think the implementation of a tb-string could very easily be close enough to what you have proposed here that, instead of being two separate types Template and BytesTemplate, it would make sense to use Template[str] and Template[bytes]. In that case, it might save one future headache by renaming .strings to .literals, which will be less jarring to have type tuple[bytes, ...].
def from_format(fmt: str, /, *args: object, **kwargs: object) -> Template:
"""Parse `fmt` and return a `Template` instance."""
...
I wondered how such an implementation could be realised, but it seems like string.Formatter().parse output will give the raw material needed to instantiate a Template in such a way that the examples here (“Interleaving of Template.args”) make sense…?
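For what it’s worth, here is a sketch along those lines. It uses `string.Formatter().parse` to recover literals and fields; the `Interp` namedtuple is a stand-in for the PEP’s `Interpolation` (which may not be importable at runtime), and only simple positional and keyword field names are handled:

```python
from collections import namedtuple
from string import Formatter

# Stand-in for the PEP's Interpolation type (field names assumed).
Interp = namedtuple("Interp", ["value", "expr", "conv", "format_spec"])

def from_format(fmt, /, *args, **kwargs):
    """Parse `fmt` as str.format would and return the interleaved
    (string, interpolation, string, ...) list described under
    "Interleaving of Template.args"."""
    parts = []
    auto = 0          # next auto-numbered positional field
    pending = ""      # literal text accumulated so far
    for literal, field, spec, conv in Formatter().parse(fmt):
        pending += literal
        if field is None:
            continue
        if field == "":                 # "{}" -> auto-numbered
            value = args[auto]
            auto += 1
        elif field.isdigit():           # "{0}" -> explicit position
            value = args[int(field)]
        else:                           # "{name}" -> keyword
            value = kwargs[field]
        parts.append(pending)
        parts.append(Interp(value, field, conv, spec or ""))
        pending = ""
    parts.append(pending)               # trailing literal (maybe "")
    return parts
```

The result always starts and ends with a string and alternates strings with interpolations, so it has odd length, matching the interleaving the examples describe.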
I’d like to express my support, as I would like to use template strings to build SQL queries and have recently written SQL-tString in anticipation. This library currently supports:
from sql_tstring import sql
a = 2
query, values = sql("SELECT a, b, c FROM tbl WHERE a = {a}", locals())
assert query == "SELECT a, b, c FROM tbl WHERE a = ?"
assert values == [2]
With this PEP it can become:
query, values = sql(t"SELECT a, b, c FROM tbl WHERE a = {a}")
Which will be much easier to use and explain, as well as actually being a supported syntax rather than a “hack” of sorts as it is now.
What makes SQL-tString useful to me is that it also accepts values that rewrite the query. For example, the special value Absent results in the expression (and, if the clause is then empty, the whole clause) being absent (removed) from the resultant query:
from sql_tstring import Absent, sql
a = Absent
query, values = sql("SELECT a, b, c FROM tbl WHERE a = {a} AND b = 2", locals())
assert query == "SELECT a, b, c FROM tbl WHERE b = 2"
assert values == []
I find this technique much better than any of the existing SQL-building tools I’ve found, as it requires writing SQL rather than a pseudo-SQL language.
I definitely appreciate the new convenience accessors, but I found it puzzling that one was still lacking: a way to get the value of an interpolation after conv and fmt_spec have been applied. Multiple examples in the spec end up having to reuse the f() convenience function to handle the formatting; this strongly suggests that authors will have to write this convenience function and use it most of the time as well.
It is the normal case that you’ll want to apply those in every string interpolation; fancier interpolations that do something unusual will (based on JS experience) likely be the exception. When those fancier interpolations do occur, they’ll be part of a more complex interpolation anyway, and those authors can more easily eat the complexity of understanding which value to use. Requiring authors of simpler interpolations that just produce strings to remember to apply conv and fmt_spec every time is, I think, asking for those specifiers to just not work by default.
My suggestion would be to move the current .value property to a less-conveniently-named property, like .unformatted_value, and then let .value be the automatically-formatted version. In the absence of conv or fmt_spec, this would continue to be the object value that the expression evaluated to, but when either of those are specified, it would become a str, formatted appropriately.
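For context, the f() helper the spec examples reuse amounts to roughly the following (sketched with a stand-in Interpolation type; attribute names assumed). This is also the step the proposed formatted .value property would perform automatically:

```python
from collections import namedtuple

# Stand-in for the PEP's Interpolation (attribute names assumed).
Interp = namedtuple("Interp", ["value", "conv", "format_spec"])

def f(interp):
    """Apply the !conversion and :format_spec to an interpolation's
    value, the way an f-string would."""
    value = interp.value
    if interp.conv == "r":
        value = repr(value)
    elif interp.conv == "s":
        value = str(value)
    elif interp.conv == "a":
        value = ascii(value)
    return format(value, interp.format_spec)
```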
Can you include more rationale / reasoning for making the debug specifier ({foo=}) bake into the string? It seems a bit counterintuitive given that !r/s/a don’t get “baked into” the value.
Also, it’s not 100% clear how the debug specifier interacts with Interpolation.expr (is the equals sign included? is the whitespace before the equals sign included?).
Why not include the equals sign (plus trailing whitespace) as Interpolation.debug or something, and make it the responsibility of the formatting code to deal with it?
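To make the suggestion concrete, here is a sketch of how formatting code could consume such a hypothetical Interpolation.debug attribute (none of these names are in the PEP; the namedtuple is a stand-in):

```python
from collections import namedtuple

# Stand-in Interpolation with a hypothetical `debug` attribute holding
# the "expr=" text, equals sign and whitespace included, or None.
Interp = namedtuple("Interp", ["value", "expr", "debug"])

def render(parts):
    """Render an interleaved (string, interpolation, ...) list,
    reproducing f-string {x=} behavior when debug text is present."""
    out = []
    for p in parts:
        if isinstance(p, str):
            out.append(p)
        else:
            out.append(p.debug or "")
            # {x=} defaults to repr, like f-strings do.
            out.append(repr(p.value) if p.debug else str(p.value))
    return "".join(out)
```

Formatting code that doesn’t care about debug text can simply ignore the attribute, which is the flexibility being asked for.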
In general, it looks like a very good addition to the Python language. Thank you for this PEP!
I agree. The interleaving, the odd length of args, and the concatenation of the last string of template1 with the first string of template2 all sound like implementation details. Maybe they can be removed from the PEP text?
Moreover, I’m not sure that direct indexing is the best interface for end users of templates.