Please don’t significantly edit messages after people have responded to them. I am not going to engage further, I don’t think there is any value in that.
Sorry; I wanted to edit for clarity.
I apologize if editing previous posts is unwelcome here. I would generally only do it when clarification seems necessary but I’ll refrain in the future.
I think your point still stands? (To be clear: I originally mentioned the explicit lack of support for laziness in the PEP in my third paragraph, not my second. The original second paragraph didn’t seem to add anything, so I removed it. I also modified the first sentence to be less glib; it’s not a workaround depending on what you’re looking for.)
I am excited about this PEP.
Yet, a few things seem worth fixing or clarifying:
- (1) What about hashability of `Template` and `Interpolation` instances? (see the post by @Ilotoki0804) Considering the equality rules defined in the PEP, I suppose that `Template` hashability should depend on the hashability of the component `Interpolation` instances, and their hashability should depend on the hashability of all four of their attributes. Another option is to resign from hashability. Yet another option is to resign from the current definition of equality, replacing it with equality (and hashing) based on object identity (like the default behavior for user-defined classes).
- (2) Regarding the debug specifier (`=`): the PEP says that `t'{expr=}'` is treated as `t'expr={expr}'`, but to be consistent with f-strings, it should be treated as `t'expr={expr!r}'`. EDIT: in fact, it is more subtle…
- (3) What about `Template.__str__()`? I suppose it behaves like `__repr__()`, i.e. – in particular – does not provide any “default” way to render the template in an f-string-like manner (that would pose a risk that an unwary programmer could effectively obtain the f-string-like behavior where some t-string escaping is required for security; and they’d have a false sense of security – just “because I used t-strings, so I am safe, am I not?”).
I’m finding terms like “proper laziness” and “true laziness” a bit unhelpful. It’s clear that different people mean different things by “lazy”, but it’s not clear that one is any more “true” than another.
If I understand correctly, the distinction is about lexical vs. dynamic scoping. In the PEP, the template arguments are evaluated eagerly and lexically scoped, whereas some people want them to be not only evaluated lazily but also dynamically scoped.
Dynamically scoped lazy evaluation is a concept that doesn’t really
exist in Python right now. It feels like something that should be doable
without requiring new syntax, but there isn’t quite enough introspection
capability for it. We have globals() and locals(), but nothing that
captures the entire lexical environment including intermediate scopes.
Suppose we had a function, let’s call it environment(n), that returns a
mapping you can look up to find the value of a name as though it were
written into the code n frames up from the point where environment() was
called. Then the functionality of dynamically-scoped templates could be
implemented using code that parses an ordinary format string and looks
up the names appropriately.
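As a rough illustration of the idea (not part of any PEP), here is an approximation built on CPython’s `sys._getframe()`. Note that, per the limitation above, it only sees the target frame’s locals and globals, not intermediate enclosing scopes:

import builtins
import sys
from collections import ChainMap

def environment(n: int = 0):
    # Look n frames up from the caller of environment();
    # the +1 skips this function's own frame.
    frame = sys._getframe(n + 1)
    # Name lookup order roughly mirrors ordinary resolution:
    # locals, then globals, then builtins.
    return ChainMap(frame.f_locals, frame.f_globals, vars(builtins))

def render(template_str: str) -> str:
    # Treat an ordinary format string as a dynamically scoped template.
    return template_str.format_map(environment(1))

def caller():
    name = "world"
    return render("hello {name}")  # resolves 'name' in caller's scope

assert caller() == "hello world"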
I would be far more supportive of a PEP for such an environment function
than any kind of template string syntax. It’s a smaller unit of
functionality that has many more potential uses, and doesn’t require any
new syntax.
With a couple of tweaks, the PEP 501 template rendering algorithm works for already composed PEP 750 templates:
from templatelib import Interpolation, Template

def format_template(t: Template) -> str:
    rendered_segments = []
    for segment in t.args:
        match segment:
            case str() as s:
                rendered_segments.append(s)
            case Interpolation() as field:
                rendered_segments.append(field.format_value())
    return "".join(rendered_segments)
(assuming the field formatting helper I suggested above is included - you can write your own either way, it’s just annoying)
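For reference, a standalone sketch of roughly what that helper does (the suggested version was an `Interpolation` method; attribute names follow this thread’s draft API):

def format_value(field):
    # Apply the conversion (if any), then the format spec,
    # mirroring what str.format() does for each field.
    value = field.value
    if field.conv == "r":
        value = repr(value)
    elif field.conv == "s":
        value = str(value)
    elif field.conv == "a":
        value = ascii(value)
    return format(value, field.format_spec)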
Omitting an obviously accessible implementation of this is intentional, since it’s a potential security trap for some expected template use cases (writing this renderer is a good learning activity, but using it bypasses the intended security benefits of using structured templates to separate trusted and untrusted data segments).
The harder part of your question is specifying that the interpolation values should be filled in later. This is the problem I was writing about in PEP 750: Tag Strings For Writing Domain-Specific Languages - #224 by ncoghlan
(I don’t think the exact solution to this problem needs to be in PEP 750 itself. I do think the PEP should point out that it does provide the pieces needed to solve the problem later)
Edit: some thoughts on what solving the problem later might look like:
- a fast, non-customisable alternative to `string.Formatter` in `templatelib` that produces a template with strings as the values and quoted strings as the expression fields (as if you had written a template with a quoted string in every field); see the sketch after this list
- something like the `replace_values` method in my linked message
In that case there’s no point in using lambdas in the template string then? It seems like the lambda workaround doesn’t solve anything.
A useful approach could be to make it possible to have “unbound” replacement fields – perhaps by specifying them just as escaped, i.e., as parts of the literal string segments (using `{{` and `}}` escaped delimiters):
title = "Professor"
# Below `{title}` is a normally bound replacement field,
# and `{{name}}` is "unbound" (i.e., just escaped):
raw = t"select title, name from lecturers where title={title} and name={{name}}"
# Just now the rendering result would be (obviously) *wrong* (but see below...):
assert sql(raw) == "select title, name from lecturers where title='Professor' and name={{name}}"
assert len(raw.args) == 3
assert isinstance(raw.args[0], str)
assert isinstance(raw.args[1], Interpolation)
assert isinstance(raw.args[2], str)
assert template.args[1].value == "Professor"
# And later...
raw_complete = raw.bind(name="John Doe")
# Now the result is correct:
assert sql(raw_complete) == "select title, name from lecturers where title='Professor' and name='John Done'"
assert len(raw.args) == 5
assert isinstance(raw.args[0], str)
assert isinstance(raw.args[1], Interpolation)
assert isinstance(raw.args[2], str)
assert isinstance(raw.args[3], Interpolation)
assert isinstance(raw.args[4], str)
assert template.args[1].value == "Professor"
assert template.args[3].value == "John Done"
…or perhaps by specifying them using some dedicated form of an unbound replacement field (but this would require extending the PEP, of course), e.g.:

# Below `{title}` is a normally bound replacement field,
# and `{:name}` is an unbound replacement field (!):
raw = t"select title, name from lecturers where title={title} and name={:name}"
# This will raise an error (not all fields have been bound yet!):
sql(raw)
assert len(raw.args) == 5
assert isinstance(raw.args[0], str)
assert isinstance(raw.args[1], Interpolation)
assert isinstance(raw.args[2], str)
assert isinstance(raw.args[3], UnboundInterpolation)  # new type...
assert isinstance(raw.args[4], str)
assert raw.args[1].value == "Professor"
assert raw.args[3].field_name == "name"  # ...with attributes specific to it

# And later...
raw_complete = raw.bind(name="John Doe")
assert sql(raw_complete) == "select title, name from lecturers where title='Professor' and name='John Doe'"
assert len(raw_complete.args) == 5
assert isinstance(raw_complete.args[0], str)
assert isinstance(raw_complete.args[1], Interpolation)
assert isinstance(raw_complete.args[2], str)
assert isinstance(raw_complete.args[3], Interpolation)
assert isinstance(raw_complete.args[4], str)
assert raw_complete.args[1].value == "Professor"
assert raw_complete.args[3].value == "John Doe"
EDIT: improved and more “cross-sectional” proposals are in my later post.
A useful approach could be to make it possible to have “unbound” replacement fields – perhaps by specifying them just as escaped
On reading your post, it occurred to me that we wouldn’t need new syntax for that; it could just be a convention on the template processor side that took advantage of two features of the existing syntax:
- `...` is a valid Python expression
- `some arbitrary text:<the actual format string>` is a valid format string definition
This means

with_placeholders = t"select title, name from lecturers where title={...:title} and name={...:name}"

would put `Ellipsis` in the interpolation field values, `"..."` in the expression fields, and `"title"` and `"name"` respectively in the `format_spec` field.
Given that convention, you could write a post-processor that moved the implicitly quoted string portion into the `value` field when the field value was a literal ellipsis:
from templatelib import Interpolation, Template

def parse_placeholders(t: Template) -> Template:
    segments = []
    for segment in t.args:
        match segment:
            case str() as s:
                segments.append(s)
            # A bare `Ellipsis` in a pattern would be a capture, not a
            # value check, so guard on the singleton explicitly:
            case Interpolation(_, "...", conv, format_spec) if segment.value is Ellipsis:
                value, _, format_spec = format_spec.partition(":")
                expr = f"...:{value}"
                field = Interpolation(value, expr, conv, format_spec)
                segments.append(field)
            case Interpolation() as field:
                segments.append(field)
    return Template(*segments)
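For instance, with the `with_placeholders` template from above (and the draft attribute names used in this thread):

parsed = parse_placeholders(with_placeholders)
# The implicitly quoted names are now ordinary string values:
assert parsed.args[1].value == "title"
assert parsed.args[3].value == "name"
assert parsed.args[1].format_spec == ""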
A useful approach could be to make it possible to have “unbound” replacement fields – perhaps by specifying them just as escaped, i.e., as parts of the literal string segments (using `{{` and `}}` escaped delimiters):
Can’t this be simply implemented using string methods? (We can still bikeshed over the name.) Corresponding to `str.format()` and `str.format_map()`:
assert "Hello {name}!".template(name="World") == t"Hello {"World"}!"
assert "Hello {name}!".template_map({"name": "World"}) == t"Hello {"World"}!"
it could just be a convention on the template processor side that took advantage of two features of the existing syntax:
- `...` is a valid Python expression
- `some arbitrary text:<the actual format string>` is a valid format string definition
Interesting idea!
Though, IMHO a more useful tool than the `parse_placeholders()` one you proposed would be something along the lines of the following:
from typing import Any

from templatelib import Interpolation, Template

def bind_template_fields(t: Template, **fields_to_bind: Any) -> Template:
    segments = []
    for segment in t.args:
        match segment:
            case str() as s:
                segments.append(s)
            # Again, guard on the Ellipsis singleton explicitly (a bare
            # `Ellipsis` in a pattern would be a capture pattern):
            case Interpolation(_, "...", None, expr) as unbound_field if segment.value is Ellipsis:
                field_spec, _, format_spec = expr.partition(":")
                field_name, _, conv = field_spec.partition("!")
                if field_name not in fields_to_bind:
                    segments.append(unbound_field)
                    continue
                if not conv:
                    conv = None
                elif conv not in ('a', 'r', 's'):
                    raise ValueError("invalid conversion character: "
                                     "expected 's', 'r', or 'a'")
                value = fields_to_bind[field_name]
                field = Interpolation(value, expr, conv, format_spec)
                segments.append(field)
            case Interpolation() as field:
                segments.append(field)
    return Template(*segments)
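For example (hypothetical usage, assuming the placeholder convention above):

raw = t"select title, name from lecturers where title={...:title} and name={...:name}"
partial = bind_template_fields(raw, title="Professor")     # `name` is still unbound
complete = bind_template_fields(partial, name="John Doe")  # all fields bound now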
That is, it would produce a new `Template` object with the given field values “bound” to it – as if they had been there from the beginning.
And it could be even more useful if it were available as a `Template` method – perhaps named `bind`?
EDIT: improved and more “cross-sectional” proposals are in my later post.
Can’t this be simply implemented using string methods? (We can still bikeshed over the name.) Corresponding to `str.format()` and `str.format_map()`.

assert "Hello {name}!".template(name="World") == t"Hello {"World"}!"
assert "Hello {name}!".template_map({"name": "World"}) == t"Hello {"World"}!"
Another possibility would be to make the `Template` constructor behave like that:
assert t"Hello {"World"}!" == Template("Hello {name}!", name="World")
assert t"Hello {"World"}!" == Template("Hello {name}!", {"name": "World"})
Then the current constructor would become a classmethod `from_segments()`:
assert t"Hello {"World"}!" == Template.from_segments(
"Hello",
Interpolation("World", '"World"', None, ""),
"!",
)
(Then, probably, it would also be worth renaming the attribute `args` to `segments`…)
EDIT: for a more comprehensive proposal, see my later post.
PS Note that the ideas from the above two posts could co-exist.
What about hashability of `Template` and `Interpolation` instances?
Agree, we still need to add this to the spec. (github issue) I think it will fall out in the straightforward way (`Template` is hashable if and only if all interpolation values are also hashable) but we’re waiting to make sure nothing comes up in the prototype cpython branch.
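In other words, something like this sketch (hypothetical, pending the actual spec wording):

def template_hash(t):
    # Hash the static strings together with each interpolation's four
    # components; only an unhashable value can make this raise TypeError,
    # since the other three components are strings (or None).
    return hash(tuple(
        seg if isinstance(seg, str)
        else (seg.value, seg.expr, seg.conv, seg.format_spec)
        for seg in t.args
    ))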
to be consistent with f-strings, it should be treated as `t'expr={expr!r}'`
Ah, that’s a good catch. Thanks! (github issue).
What about `Template.__str__()`?
Agree, we need to explicitly mention this in the spec. (github issue) And yes, it’s just `__repr__()` for the exact reasons you suggest.
- `...` is a valid Python expression
- `some arbitrary text:<the actual format string>` is a valid format string definition
Hah, using ellipsis in this way is a fun and very clever hack!
Though, IMHO a more useful tool than the `parse_placeholders()` one you proposed would be something along the lines of the following:
Likewise for `bind_template_fields()`!
I just added a small new example along these lines to the pep750-examples repo.
In particular, it defines a `Binder` class that takes a `Template` in the constructor and provides a `bind(self, **kwargs) -> Template` method similar to `bind_template_fields()`. Rather than using `Ellipsis`, I use Cornelius’ suggestion elsewhere of quoting interpolations.
This test passes:
def test_binder():
    template: Template = t"The {'cheese'} costs ${'amount':,.2f}"
    binder = Binder(template)
    bound = binder.bind(cheese="Roquefort", amount=15.7)

    cheese = "Roquefort"
    amount = 15.7
    assert bound == t"The {cheese} costs ${amount:,.2f}"
There’s also a related `Formatter` class that provides a `format()` method; sort of an imperfect answer to `str.format()`.
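For readers who don’t want to click through, the core idea is something like this rough sketch (not the repo’s exact code; attribute names follow this thread’s draft API):

from templatelib import Interpolation, Template

class Binder:
    def __init__(self, template: Template) -> None:
        self._template = template

    def bind(self, **kwargs) -> Template:
        segments = []
        for segment in self._template.args:
            match segment:
                case str() as s:
                    segments.append(s)
                case Interpolation() as field:
                    # Quoted interpolations carry the parameter name as
                    # their (string) value; rewrite the expression to the
                    # bare name so the result compares equal to an
                    # equivalent t-string literal.
                    name = field.value
                    segments.append(Interpolation(
                        kwargs[name], name, field.conv, field.format_spec))
        return Template(*segments)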
What the PEP doesn’t do is address how we should think of `str.format` when looking at templates as analogous to partially-evaluated f-strings.
Just a follow-up: I added a github issue to track this.
Because t-strings, like f-strings, eagerly evaluate their interpolations, I tend to think of them less as “partially evaluated f-strings” and more as “evaluated f-strings before rendering to string”. But that’s subtle. I suppose it does place them in a somewhat new corner of the (growing) Venn Diagram of approaches to string formatting in Python.
in that case there’s no point in using lambdas in the template string then?
Yes. Sorry; I should have mentioned that.
Stepping back: if what we want is to re-use templates multiple times with different interpolated values, the cleanest approach is just to wrap our t-string in a callable. No need for lambdas in the t-string itself:
from templatelib import Template

def cheese(name: str, category: str) -> Template:
    return t"{name} is {category}"

roquefort: Template = cheese("Roquefort", "blue")
limburger: Template = cheese("Limburger", "stinky")

# This assert passes
name = "Roquefort"
category = "blue"
assert roquefort == t"{name} is {category}"
That’s maybe not so interesting and is basically no different from how f-strings get “reused” in Python today.
I think lambdas (or callables in general) in interpolations have a different set of uses. For instance, imagine you want to define a template but only later want to decide which parts of it to render to string. We could implement a `format_some()` function that takes as input a “selector” and a t-string:
template: Template = t"{(lambda: 'roquefort'):blue} {(lambda: 'limburger'):stinky}"
assert format_some("blue", template) == "roquefort ***" # the second lambda isn't called
assert format_some("stinky", template) == "*** limburger" # the first isn't called
This might be useful in a logging pipeline, for instance.
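A minimal sketch of what `format_some()` might look like (assuming the draft `Interpolation` attributes used in this thread):

from templatelib import Interpolation, Template

def format_some(selector: str, template: Template) -> str:
    parts = []
    for segment in template.args:
        match segment:
            case str() as s:
                parts.append(s)
            case Interpolation() as field:
                if field.format_spec == selector:
                    # Only matching fields have their callable invoked.
                    parts.append(str(field.value()))
                else:
                    parts.append("***")
    return "".join(parts)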
I find lambdas in interpolations to be awkward syntax, but referring to callables directly seems like it might be common. Of course, t-strings are only as good as the code that does something with them (like converting them into strings, or parsing them into ASTs); the code that processes the t-string needs to expect a callable in `Interpolation.value` and do something useful with it.
If we’re looking for something closer to `str.format()`, t-strings (like f-strings) offer no direct analogue. t-strings, like f-strings, eagerly evaluate their interpolations and have lexical scope. Because strings sent to `str.format()` are just strings, they can refer to any name, including names not in scope. (I do like the fun hacks others have cooked up elsewhere in the thread.)
Finally, an earlier version of this PEP introduced the idea of “implicit lambda wrapping”: that is, wrapping all interpolations in lambda functions without requiring the syntax. We ultimately rejected this approach as too problematic, although it did enable a number of potential use cases that the current PEP does not.
I don’t know why we want lazy evaluation for t-strings, but if you don’t mind, there is a hack that even works for f-strings, and it even works on PyPy:
class LazyFormatter:
    def __init__(self, func):
        # Keep only the code object; evaluating it later with fresh
        # globals effectively re-runs the f-string with new names.
        self.code = func.__code__

    def format(self, **kwargs):
        return eval(self.code, kwargs, {})

    def format_map(self, mapping):
        return eval(self.code, mapping, {})
>>> lazy_formatter = LazyFormatter(lambda: f"hello {name:{spec}}")
>>> lazy_formatter.format(name="world", spec="s")
'hello world'
>>> lazy_formatter.format(name=42, spec=".2f")
'hello 42.00'
That’s maybe not so interesting and is basically no different than how f-strings get “reused” in python today.
Uninteresting is fine with me – being able to solve a problem with boring code reduces the risk of bugs.
I like this, not just because it solves the “how do I create a template with parameters” problem, but also because it made me stop and re-think my understanding of what a template is and how it’s created. I think I was focused too closely on the idea that t-strings were the only way of creating templates, when in fact templates are a perfectly normal Python type, and t-strings are simply the literal form of a template. No-one asks “how can I make `[a, b, c]` lazily evaluate the variables a, b and c?” – instead, you just write a function that returns the list you want. Template literals (t-strings) are the same as list comprehensions in that sense.
+1 on having this example somewhere in the “how to teach this” part of the PEP.
Another option to handle delayed substitutions is to capture evaluation errors in the `Interpolation` object and reraise them if the `.value` attribute is accessed. That way the errors still bubble out in almost the same way, but if the template wants to handle them (or ignore the captured value entirely), it can, and use the `.expr` attribute instead:
(Made up code, untested)
>>> fmt = t"{a} + {b} = {a+b}"
>>> fmt.args[1].value
NameError: a
>>> fmt.args[1].expr
'a'
>>> def apply(tmpl, **args):
...     for a in tmpl.args[1::2]:
...         a.value = eval(a.expr, args)
>>> apply(fmt, a=1, b=2)
>>> fmt.args[1].value
1
>>> fmt.args[3].value
2
>>> fmt.args[5].value
3
I’m not suggesting that the `apply` function I just invented should be part of the standard, only that the fact that an error occurs while evaluating an expression need not prevent the `Template` from being created.
Contrast with the current behaviour (by my reading):
>>> fmt = t"{a} + {b} = {a+b}"
NameError: a
>>> fmt
NameError: fmt
Rather than using `Ellipsis`, I use Cornelius’ suggestion elsewhere of quoting interpolations.
Note that I don’t think this is a good solution. It fails to have the benefit that proper delayed substitution would have, i.e. you can’t do more complex calculations on these values. This ability to do more complex evaluations and actually use Python syntax was, in my understanding, the primary driving factor of this PEP; otherwise normal string syntax with simple custom parsers would be enough.
All of these “solutions” being suggested here are just bandaids for a missing fundamental feature.