Major thanks to @dkp who was the primary re-writer for the PEP update and joined @jimbaker at the core dev sprint to hash things out.
Thanks for the update! I'm writing comments as I read, so if I haven't changed this text it's because I forgot (or I hit "Reply" early) and there'll be a summary at the end. Here's my summary: I'm happy with basically everything, have some comments about where types should go, and am pretty sure there's a better way to handle the interleaved args, but I'm not totally convinced which way to go with it.
- The Template Type
I expect this type will be internal to the interpreter (in CPython's case, a native type), and so putting an isinstance-able version in `types` is fine. Don't define it in terms of `@dataclass` though, even with the caveat. We can't implement it in terms of that, so better to specify it directly.
Of course, if `Template` is going to be directly instantiable, we're (a) going to have to be okay with the added overhead and (b) it probably doesn't live in `types` anymore, just because we shouldn't have to import `types` (even implicitly) in order to use a t-string. But if `type(t"") is not Template` is acceptable (assuming `isinstance(t"", Template)` is still `True`), then I guess the `types` definition can be a duck-type equivalent.
I much prefer always using `Interpolation` and not alternating with `str`. Interested what others think about that, but as a consumer I'd prefer not to have to type check: either I can assume that `.value` is meaningful, or `str()` is some reasonable default behaviour.
Alternatively, they could always come in `(literal, interpolation)` pairs, where the literal may be an empty string and the interpolation may be None/falsey. Then I can write `for s, v in t.args:`, rather than type checking or coming up with some kind of alternating iteration.
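As a rough sketch of that pairs-based alternative (a hypothetical layout, not what the PEP proposes; the `Interpolation` class here is my own minimal stand-in):

```python
# Hypothetical pairs-based layout: args would be a sequence of
# (literal, interpolation) 2-tuples, where the literal may be ""
# and the final interpolation may be None.
class Interpolation:  # minimal stand-in for illustration
    def __init__(self, value):
        self.value = value

def render(pairs):
    parts = []
    for s, v in pairs:  # no type checking needed in the loop body
        parts.append(s)
        if v is not None:
            parts.append(str(v.value))
    return "".join(parts)

pairs = [("Hello ", Interpolation("World")), ("!", None)]
print(render(pairs))  # Hello World!
```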
(Side thought: if `Template.__str__` applies normal formatting rules to each interpolation, then in many cases t-strings and f-strings could be interchangeable…)
- Concatenation
Why not? It's just concatenating `.args`, isn't it?
- The debug specifier (`=`)
I'm not convinced about this. Perhaps we should just forbid it here? Or maybe it needs to move into the grammar in a way that lets us pass it through as a conversion flag or its own flag?
- Interleaving …
Ah, structural pattern matching gets a mention. If this was earlier, I'd have been less concerned about the interleaved approach. It feels a bit clunky? I wonder if there's a design more geared towards match that would feel smoother?
In short, my feeling here is that designing specifically for "you should use match" is fine, and so is "you should not use match", and we're in a weird kind of middle ground right now where neither approach feels great. (But maybe others will come in and say the match approach does feel great, and it's just me, which is totally likely! In this case, please put it as the first example rather than the last one.)
I'm not sure I see the issue here. The `types` module could expose the type, and the type could set its `__module__` to `types`, even if "really" it is an interpreter-internal type. That's similar to how `typing.TypeAliasType` now works.
This looks really great! Only thoughts:
- The interleaving feels unsatisfying. If you want to provide easy access to the static portions of `args`, I would do it via a property. Right now you can use an alternating sequence, but you could also use index sequences:

```python
class Template:
    args: list[str | Interpolation]
    _static_indices: tuple[int]
    _dynamic_indices: tuple[int]

    @property
    def static(self) -> tuple[str]:
        return tuple(self.args[i] for i in self._static_indices)

    @property
    def dynamic(self) -> tuple[Interpolation]:
        return tuple(self.args[i] for i in self._dynamic_indices)
```

The point isn't to bikeshed this API or implementation detail, but that it feels like interleaving ought to be left as an implementation detail. By baking it into the PEP, you'll make it very difficult ever to make a different choice. If this use case is important enough to structure the type around, then it seems worth making an API and not a trick of ordering (regardless of whether that trick is used internally).
- Concatenation doesn't seem so difficult that it's better to make this type of string unlike all the other types. It might take a bit of care, but it seems worth it to make template strings behave as you'd expect. I'd expect them to be viral, so a template string added to any other string ends up as a template string.
- I haven't thought through a full use case, but I don't see why `bytes` strings couldn't be constructed in this way, e.g., `FileWriter(tb"{magic}{header:\x00<{padding}}{blob}")`. Because you're not relying on `obj.__format__()`, the PEP 498 reasoning no longer applies. A `Template[bytes]` could have `args: Sequence[bytes | Interpolation]`. That said, I understand that you've got a thing that you're trying to do, and that may be a step too far. IDK if it's worth mentioning it as out-of-scope.
As my own feedback:
- I really like this iteration of the proposal (and I expect we'll be withdrawing PEP 501 in favour of this, since the remaining differences have solid reasons behind them that weigh in PEP 750's favour)
- I'd like to see a discussion in the Rejected Ideas section about eager evaluation of conversion specifiers (the topic was explicitly considered, and I think the conclusion to keep the lazy evaluation is reasonable, it just didn't get added to the PEP itself)
- I'd like to see `Interpolation` offer a couple of formatting helper methods to improve the ergonomics of allowing lazy conversion when most template processing won't need to customise it:
  - `f.convert_value()`: apply the conversion specifier (if any) to the field value
  - `f.format_value()`: equivalent to `format(f.convert_value(), f.format_spec)`
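A minimal sketch of what those helpers could look like (my own illustration, not part of the PEP; the field names `value`, `conversion`, and `format_spec` follow the draft, but this class is just a mock):

```python
class Interpolation:  # mock illustrating the suggested helper methods
    def __init__(self, value, conversion=None, format_spec=""):
        self.value = value
        self.conversion = conversion  # None, "s", "r", or "a"
        self.format_spec = format_spec

    def convert_value(self):
        # Apply the !s / !r / !a conversion specifier, if any
        if self.conversion == "s":
            return str(self.value)
        if self.conversion == "r":
            return repr(self.value)
        if self.conversion == "a":
            return ascii(self.value)
        return self.value

    def format_value(self):
        # Equivalent to format(f.convert_value(), f.format_spec)
        return format(self.convert_value(), self.format_spec)

print(Interpolation(3.14159, format_spec=".2f").format_value())  # 3.14
```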
Considering other feedback:
We should be able to have a common implementation level "unsafe" template constructor API that relies on the caller to ensure that the input sequence is correctly normalised, which both the eval loop and the `types.Template.__new__` Python API would call.
The eval loop would assume that the compiler hasn't messed up the arg sequence, while `types.Template.__new__` would actually do the required normalisation pass to merge adjacent string segments and insert additional empty strings as required.
That was my initial reaction too (either having a `prefix` string field on interpolations or having 2-tuples), but I found the `cache_key = template.args[::2]` example genuinely compelling, as none of the other options offer that same ability to easily say "give me just the string parts", and the 2-tuple variant also doesn't even allow you to easily say "give me just the interpolation fields".
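For example, with minimal stand-in types (my own mocks, assuming the interleaved `args` layout the PEP describes):

```python
class Interpolation:  # minimal stand-in for illustration
    def __init__(self, value):
        self.value = value

class Template:  # minimal stand-in for illustration
    def __init__(self, *args):
        self.args = list(args)

template = Template("SELECT * FROM t WHERE x = ", Interpolation(42), "")

# Even-index slice: just the string parts
cache_key = tuple(template.args[::2])   # -> ('SELECT * FROM t WHERE x = ', '')
# Odd-index slice: just the interpolation fields
values = [f.value for f in template.args[1::2]]  # -> [42]
```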
In addition to the memoization example in the PEP, the interleaving approach makes things like switching to a different placeholder relatively straightforward:
```python
def prepare_query(template, *, placeholder="?"):
    query_text = template.args[::2]
    if any(placeholder in text for text in query_text):
        msg = f"Cannot use {placeholder!r} in query template text"
        raise ValueError(msg)
    prepared_query = placeholder.join(query_text)
    template_values = [f.value for f in template.args[1::2]]
    return prepared_query, template_values
```
And if we do want the pairwise variation, `itertools.zip_longest` can provide it:

```python
from itertools import zip_longest

segments = iter(template.args)
for prefix, field in zip_longest(segments, segments):
    ...  # Do something with the text prefix
    if field is not None:
        ...  # Do something with the interpolation field
```
So yeah, I found the interleaving idea to be superficially off-putting, but it ended up feeling genuinely elegant once I started playing with the possibilities it offers.
I love the new PEP, but feel like the lack of concatenation is a mistake, especially the explicit concatenation.
It means that you can't effectively break a single t-string into multiple lines.
You could always move it to a multi-line string, but that's not the same, both because it includes newline characters and because it breaks how people are used to operating on strings.
What was the thinking behind templates not supporting concatenation?
I'd also appreciate it if t-strings supported concatenation. It's quite common to split a simple string over multiple lines without using triple quotes. E.g.
```python
s = (
    "This is some long "
    "comment"
)
```
The PEP explicitly mentions that "[...] empty strings are added to the sequence when the template begins or ends with an interpolation, [...]". With that in mind, wouldn't the implicit concatenation of t-strings just concatenate normal strings?

```python
template = t"Hello " "World"
assert template.args == ["Hello World"]
# --
name = "World"
template2 = t"Hello {name}!" " Some more text"
assert template2.args == ["Hello ", Interpolation(value="World"), "! Some more text"]
```
Looks good! Like others, I'm not entirely happy with the prohibition on concatenation. I agree with the logic for prohibiting `+` (the values are `Template` objects, so there should be no expectation that they support addition). However, I would like `t"..." "..."` to be supported, as a special case. This would be particularly useful in its multi-line form
```python
some_var = t"..." \
    "..."
```

or

```python
some_var = (t"..."
    "..."
)
```
While it's true that triple-quoted strings are available as an alternative, the fact that they can't be easily indented to line up with the surrounding code (and the common solution of using `dedent` doesn't work for t-strings) means that implicit concatenation does have its place.
I'd limit it explicitly to only allowing `"..."` (with no prefix) to be concatenated with a t-string. Yes, that's a special-case rule, but so is the "no concatenation" rule.
One thing to note about concatenation: even if it is initially left out, adding it later is now straightforward.
That wasnât the case with the previous version of PEP 750.
Separately from that discussion, @nhumrich and I have also agreed that, given the updates to PEP 750, there are no longer any differences we feel strongly enough about to champion an alternative, so we'll be withdrawing PEP 501 in favour of PEP 750.
Thank you! Will revisit your broader comments soon but, to address one specific point: we removed the use of `@dataclass` (and the corresponding caveat) from the PEP.
I don't see what the difficulty is in concatenating templates like:
`t"A{B}C" + t"D{E}F"` → `t"A{B}CD{E}F"`
The PEP says that the template always starts and ends with a string part. Why can't the string parts just be concatenated?
Is that restriction a hangover from previous versions of the PEP, where the template was not returned directly because the tag function would process it first?
(P.S. Hi @dkp and many thanks for your excellent Go website that I have used many times over the years!)
One additional interesting use case would be output switching for CLI apps, like libxo provides (libxo: The Easy Way to Generate text, XML, JSON, and HTML output):
xo_emit("Connecting to {:host}.{:domain}...\n", host, domain);
Depending on an output switch, that gets rendered as
TEXT:
Connecting to my-box.example.com...
XML:
<host>my-box</host>
<domain>example.com</domain>
JSON:
"host": "my-box",
"domain": "example.com"
One counterpoint: I always make my linter forbid this kind of implicit concatenation and only allow `("..." + "...")` (across two lines), because I was once bitten by accidentally adding a comma in a `("..." , "...")`. Explicit is better than implicit, at least for me, so I would love to see `+` being allowed.
How are Template types handled in terms of hashability and mutability?
- Immutable and always hashable (like a string)
- Immutable and sometimes hashable (like a tuple)
- Immutable but not hashable
- Mutable but hashable (like a regular class)
- Mutable and not hashable (like a list)
As far as concatenation goes, the concern (at least from my PoV) isn't with concatenating template instances with each other, or with concatenating regular strings, since those are both well-defined in a substitution-safe way:
- `t"some template" + t"other template"` → `Template(*lhs.args, *rhs.args)`
- `"some string" + t"some template"` → `Template(lhs, *rhs.args)`
- `t"some template" + "some string"` → `Template(*lhs.args, rhs)`
The problem I see with allowing the latter two cases is when f-strings (and other forms of string formatting) get involved, since they look like the latter two arguably safe cases at runtime, but they're actually bypassing the substitution safety features that the use of templates is supposed to be providing.
That concern only applies to string concatenation, though. If template concatenation were allowed (and I'm struggling to see any cases where it would be dangerous, since everything remains correctly escaped), then the two otherwise risky cases could be safely written as:
`t"{"some string"}" + t"some template"`
`t"some template" + t"{"some string"}"`
There are certainly cases where sequence concatenation will produce nonsense (such as combining multiple HTML body sections), but there are also plenty of cases where it will be valid (such as combining HTML paragraph sections, list sections, table sections).
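The well-defined cases above can be sketched with minimal stand-in types (my own mocks; I'm assuming the real constructor would merge the boundary string segments to preserve the interleaving invariant):

```python
class Interpolation:  # minimal stand-in for illustration
    def __init__(self, value):
        self.value = value

def _merge(lhs, rhs):
    # Templates always start and end with a string segment, so merge the
    # two boundary segments to keep the str/Interpolation alternation intact.
    return [*lhs[:-1], lhs[-1] + rhs[0], *rhs[1:]]

class Template:  # minimal stand-in for illustration
    def __init__(self, *args):
        self.args = list(args)

    def __add__(self, other):
        if isinstance(other, Template):
            return Template(*_merge(self.args, other.args))
        if isinstance(other, str):
            return Template(*_merge(self.args, [other]))
        return NotImplemented

    def __radd__(self, other):
        if isinstance(other, str):
            return Template(*_merge([other], self.args))
        return NotImplemented

# t"A{B}C" + t"D{E}F" gives args ["A", B, "CD", E, "F"]
b, e = Interpolation("B"), Interpolation("E")
combined = Template("A", b, "C") + Template("D", e, "F")
```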
PEP 501 has been officially withdrawn, referring readers to PEP 750 instead: PEP 501 â General purpose template literal strings | peps.python.org
(The PEP 750 update that produced the preview link is still going through its final prepublication review pass, so don't be concerned that the live PEP index still has the previous iteration of the proposal up.)
Perhaps even add a `__str__` method for that?
I kinda worry about what this means for future compatibility. Any future addition of a new string prefix to the language risks backwards incompatibility.
It's not like string prefixes are added every day, but it would be a shame if it discouraged the addition of a helpful builtin prefix to the language.
Is there a place in the PEP that addresses this?
The PEP has been updated and only proposes a t-prefix now (no arbitrary prefixes), but this would definitely be worth considering when a future PEP proposes this addition.
Does that mean `Template.__init__()` "standardizes" its `*args` before assigning it to `self.args`, i.e. enforces the interleaving and odd length?
I didn't see that logic in the examples' `__init__.py` file and couldn't find `class Template` in the reference implementation's `types.py` module. But I think it would make a lot of sense.