PEP 750: Tag Strings For Writing Domain-Specific Languages

Yeah, definitely understood.

I’d been sitting on a variant of those Binder/Formatter examples but hadn’t planned to include them in the repo until this conversation. I hope they do a good compact job both of demonstrating some of t-strings’ flexibility and of shining harsh light on a key limitation.

I think the earlier PEP contained multitudes. :slight_smile:

In the end, I feel we found a useful digestible increment. But the cuts were hard.

I can imagine a future PEP that introduces a new l prefix such that lt"{x+y}" is equivalent to t"{(lambda *, x, y: x + y)}". Then again, perhaps the environment() suggestion above offers a cleaner vector to a similar target?
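For illustration, the lambda spelling can already be tried out by hand. This is only a minimal sketch under stated assumptions: `Interpolation` and `Template` below are simplified stand-ins rather than the real PEP 750 types, and `render_lazy()` is a hypothetical helper that calls any callable value with the names its signature asks for. Conversions are ignored to keep it short.

```python
import inspect
from dataclasses import dataclass
from typing import Any, Optional, Tuple, Union


@dataclass(frozen=True)
class Interpolation:
    """Simplified stand-in for the PEP 750 type (assumption)."""
    value: Any
    expr: str
    conv: Optional[str] = None
    format_spec: str = ""


@dataclass(frozen=True)
class Template:
    """Simplified stand-in: a tuple of strings and interpolations."""
    args: Tuple[Union[str, Interpolation], ...]


def render_lazy(template: Template, **env: Any) -> str:
    """Render to str, evaluating callable values at render time."""
    parts = []
    for seg in template.args:
        if isinstance(seg, str):
            parts.append(seg)
            continue
        value = seg.value
        if callable(value):
            # Look up each parameter name of the lambda in the environment.
            names = inspect.signature(value).parameters
            value = value(**{name: env[name] for name in names})
        parts.append(format(value, seg.format_spec))
    return "".join(parts)


# What lt"sum: {x+y}" might desugar to, written out by hand:
lazy = Template((
    "sum: ",
    Interpolation(lambda *, x, y: x + y, "x+y"),
))
```

The same `lazy` object can then be rendered several times with different values of `x` and `y`, which is the point of deferring evaluation.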

(Another suggestion in this thread was to introduce a !() conv. It occurs to me that the current PEP might actually preclude doing so in the future since Interpolation.value is pre-conversion, not post. Hrm.)

1 Like

I agree. PEP 735 is another example of a proposal that started relatively large and got quite ruthlessly cut to produce a much more well-focused core proposal. It even included an explicit “Deferred Ideas” section listing potential future extensions. I think that lazy evaluation would make a very good addition to a “deferred ideas” section in this PEP (as opposed to classifying it as a “Rejected” idea).

If there’s a way to modify the PEP so that !() remains a potential future extension, that would be useful[1]. One advantage of having lazy evaluation as a “deferred idea” is that it clearly marks certain areas of the design space where we want to leave things flexible, without committing to solving all of the design problems right now.


  1. Not least because it’s the option I think I like the most :slightly_smiling_face: ↩︎

9 Likes

“Deferred Ideas About Deferral” :laughing:

But in seriousness, that’s a good suggestion for the next round of edits. Thanks.

4 Likes

Making !() work was the reason the last pre-withdrawal version of PEP 501 had lazy conversion, so I think you’re safe on that front.

Keeping that future syntax option available is likely worth mentioning as part of the rationale for lazy value conversion, though.

2 Likes

After some thought, I believe that the main problem with the proposal – and the cause of some reservations expressed in this thread (especially by @pf_moore and @MegaIng, if I properly understood the crux of your messages) – is that the term template promises some kind of skeleton or form, or a structure – which can be filled later with actual content.

In other words, I bet that many people, on hearing about template strings, will expect that once a template string has been evaluated into a template object, that object can then be filled with some content, e.g.:

template = t"Monty {animal} and the Holy {cup}"
filled = template.fill({"animal": "Python", "cup": "Grail"})
# and maybe only then:
html(filled)  # etc.

Whereas the proposed syntax, with the associated types, offers just a way to immediately define structure and content, which then (typically) is only meant to be completed with certain presentation qualities (such as DSL-specific escaping etc.).


What I am trying to express is that IMHO the feature is interesting and probably very useful, yet using the term template to describe it would lead to misunderstanding and disappointments – at least unless it includes some actual templating possibilities, i.e., some natural way to decouple defining structure from defining content.


PS [EDIT] Let me clarify: the above code snippet is not something I propose, but something I am afraid people will expect when they hear about template strings.


PPS [late EDIT] Having said all that, I’d like to emphasize that the proposal described in point (3) of my later post (and in the posts following that one) could greatly reduce the above issue.

8 Likes

I agree, the term template is misleading. While following this thread, I have to constantly flip a mental switch to understand the behavior. Template is used for jinja-like tools, see https://wiki.python.org/moin/Templating

1 Like

To be more clear about what my suggestion is when it comes to my previous post…

I believe there are, generally, three options:

  • to dismiss my reservations expressed in that post as inessential – i.e., to keep the feature and terminology as they are;

  • to keep the feature as it is currently described in the PEP, but to change the terminology – including changing the class name Template (and probably also the t prefix) to something which would not suggest the templating possibilities in the meaning pointed to in my previous post;

  • to enrich the feature described in the PEP, by adding to it the templating possibilities as meant in the previous post (i.e., the possibility to decouple defining structure [what fields in which places] from defining content [field values]) as a first-class citizen.

When it comes to the latter option, I would propose something that could be used like so:

animal = "bird"
t1 = t"A {:animal_weight} {animal} could not carry a {:coconut_weight} coconut."
assert t1 == t"A {:animal_weight} {'bird'} could not carry a {:coconut_weight} coconut."
html(t1)  # raises error ("unfilled fields: 'animal_weight', 'coconut_weight'")

t2 = t1.fill(animal_weight="<5 ounce")   # keyword arg.
assert t2 == t"A {'<5 ounce'} {'bird'} could not carry a {:coconut_weight} coconut."
html(t2)  # raises error ("unfilled fields: 'coconut_weight'")

t3 = t2.fill({"coconut_weight": ">1 pound"})   # mapping as pos. arg.
assert t3 == t"A {'<5 ounce'} {'bird'} could not carry a {'>1 pound'} coconut."
assert html(t3) == "A &lt;5 ounce bird could not carry a &gt;1 pound coconut."

Presumably, a new type would be introduced, UnfilledField – similar to Interpolation but without value, and with name instead.

The fill() method would accept a mapping as a single positional argument or (for convenience) any number of keyword arguments. Not all unfilled fields would need to be filled in one call to this method, which means that the desired content could be added incrementally.

There could also exist a similar method: fillall() – which would require that unfilled fields must be specified all at once (i.e., that the resultant Template object must not include any UnfilledFields), or an error would be raised (UnfilledTemplateError, which could be a subclass of ValueError).

Rendering functions would be free to either accept UnfilledFields (and react appropriately to their specifics, e.g., providing some default values) or raise an error (as the example function html() does in the above code snippet; presumably UnfilledTemplateError would be suggested in such cases).
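To make the behaviour described above concrete, here is a minimal sketch of a rendering function that rejects unfilled fields. Everything in it is an assumption for illustration: `Interpolation` and `UnfilledField` are simplified stand-ins, a template is modelled as a plain tuple of segments, and `render_html()` and the keyword-only `fill()` are hypothetical helpers, not part of the PEP.

```python
import html as _html
from dataclasses import dataclass
from typing import Any, Optional, Tuple, Union


@dataclass(frozen=True)
class Interpolation:
    """Simplified stand-in for the PEP 750 type (assumption)."""
    value: Any
    expr: str
    conv: Optional[str] = None
    format_spec: str = ""


@dataclass(frozen=True)
class UnfilledField:
    """Hypothetical: like Interpolation, but with a name and no value."""
    name: str


class UnfilledTemplateError(ValueError):
    """Hypothetical error for templates that still have unfilled fields."""


Segment = Union[str, Interpolation, UnfilledField]


def render_html(segments: Tuple[Segment, ...]) -> str:
    """Escape interpolated values; refuse templates with unfilled fields."""
    missing = [s.name for s in segments if isinstance(s, UnfilledField)]
    if missing:
        raise UnfilledTemplateError(
            "unfilled fields: " + ", ".join(repr(n) for n in missing))
    out = []
    for seg in segments:
        if isinstance(seg, str):
            out.append(seg)  # literal text written by the programmer
        else:
            out.append(_html.escape(format(seg.value, seg.format_spec)))
    return "".join(out)


def fill(segments: Tuple[Segment, ...], **values: Any) -> Tuple[Segment, ...]:
    """Return new segments with the named unfilled fields filled in."""
    return tuple(
        Interpolation(values[s.name], s.name)
        if isinstance(s, UnfilledField) and s.name in values else s
        for s in segments)
```

A renderer that prefers default values over errors would replace the `raise` with a lookup in some table of defaults; the sketch only shows the stricter of the two options.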

Note that with those {:xyz} unfilled fields we would not introduce any lazy-evaluation mechanism, but just a possibility to explicitly provide (some of) the content later (not necessarily at the moment when the template string is evaluated and the template structure is determined), possibly incrementally, and using a first-class-citizen functionality – without resorting to unstandardized hacks (like escaping field delimiters by typing {{ and }} or setting values to ...).


EDIT: slightly different, yet generally improved and more “cross-sectional” proposals are in my later post.

3 Likes

Maybe we could switch back to InterpolatedString or InterpolationTemplate, which was used in an earlier version of PEP 501:

If we added “real” template strings, we could maybe do something like this:

string_template = t"Hello {name}!"
interpolated_string = string_template.fill(name="World")
assert interpolated_string == i"Hello {"World"}!"
rendered_string = render(interpolated_string)
assert rendered_string == "Hello World!"

But I would defer that to a follow-up PEP, as it depends on this one.

1 Like

And you would also be able to fill in names in the format specifier:

assert t"Price: {price:{fmt}}".format(price=49, fmt=".2f") == i"Price: {49:.2f}"

I would not provide such a method, which IMHO could appear to be an attractive nuisance.

Templates are not strings. Let formatting their content be the job of rendering functions.

Oops, I meant this:

assert t"Price: {price:{fmt}}".fill(price=49, fmt=".2f") == i"Price: {49:.2f}"

I was looking at the behaviour of str.format() and forgot to change the name of the function:

>>> "Price: {price:{fmt}}".format(price=49, fmt=".2f")
'Price: 49.00'

The string would still need to be rendered by a function (and there’s no need to escape format specifiers).

1 Like

PS When it comes to the second of the options mentioned in my post, i.e.:

…I agree with @Nineteendo that interpolated <something> is a good direction when it comes to finding a better term for the feature and the main type related to it.

Namely, I believe that i-strings (an i"{something}..." syntax, as initially proposed in PEP 501) and the InterpolatedContent type name (instead of t-strings and Template, respectively) would be a good choice. This way we would keep the door open for introducing “actual” templates in the future – without the need to decide now whether we really need them (and what exactly their syntax and semantics should be if we decided to introduce them…).

By the way, to keep the terminology consistent with the existing conventions (especially, those used in the f-string syntax’s docs and the relevant portions of the string module’s docs), I propose to rename the type Interpolation to ReplacementField – because replacement field is an established term to describe all those {-and-}-delimited portions of f-strings and str.format()-processed patterns. Note that also in numerous code examples already posted in this thread (e.g., by @ncoghlan) the name field was chosen quite often as a variable name referring to an instance of that type.

2 Likes

Since the switch from “interpolation templates” to “template strings” happened in PEP 501 rather than being new in PEP 750, I can give some background on it.

The most basic reason (and the one that PEP 501 cites) is that we’re intentionally using the same name as the comparable JS feature:

The other reasons I accepted that change to PEP 501 when @nhumrich proposed it were:

  1. it’s a simpler name and the shorthand form is easier to pronounce
  2. The syntax still supports true templating (by putting an ellipsis literal in every field), it just also makes it easy to provide contents for the fields directly when it is useful to do so (and it is frequently useful to do so)

I admit I do sometimes think of them as “populated template strings” (vs the “bare template strings” that the string module offers), but they’re still template strings.

1 Like

While workarounds might be possible, I don’t think it’s a good idea to use a more verbose notation than str.format[_map]() which also doesn’t allow fields in format specifiers. We are better off adding a separate prefix to fulfil this need or new string method(s).

This PEP doesn’t need to solve everything, and it should just admit what’s impossible.

Everything described in PEP 750: Tag Strings For Writing Domain-Specific Languages - #224 by ncoghlan is possible with the PEP as written.

There are certainly rough edges in the dynamic templating use cases, but those use cases also already have good solutions (like jinja2 and string.Formatter).

So from a dynamic templating point of view, my recommended way to think of template literals is as a feature that may help make writing dynamic templating interfaces easier, and likely higher performance, and potentially offer some API design improvement opportunities, but doesn’t offer anything fundamentally new.

Instead, the most interesting use cases are those where the template is prepopulated with specific values that need to be treated with suspicion, such as subprocess command execution and SQL database queries, but the overall structure of the template itself is less likely to be data driven (unlike HTML).
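The SQL case can be sketched with the standard sqlite3 module. This is only an illustration under assumptions: `Interpolation` is a simplified stand-in, the template is modelled as a hand-built tuple of segments, and `to_sql()` is a hypothetical template processor, not something the PEP defines.

```python
import sqlite3
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass(frozen=True)
class Interpolation:
    """Simplified stand-in for the PEP 750 type (assumption)."""
    value: Any
    expr: str


def to_sql(segments) -> Tuple[str, List[Any]]:
    """Turn segments into a parameterised query plus its parameters."""
    sql, params = [], []
    for seg in segments:
        if isinstance(seg, str):
            sql.append(seg)   # query structure written by the programmer
        else:
            sql.append("?")   # suspicious value travels as a parameter
            params.append(seg.value)
    return "".join(sql), params


name = "Robert'); DROP TABLE students;--"
# Hand-built equivalent of:
#   t"SELECT count(*) FROM students WHERE name = {name}"
query = (
    "SELECT count(*) FROM students WHERE name = ",
    Interpolation(name, "name"),
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT)")
conn.execute("INSERT INTO students VALUES ('Alice')")
sql, params = to_sql(query)
count = conn.execute(sql, params).fetchone()[0]
```

The structure of the query is fixed by the literal segments, while the interpolated value never becomes part of the SQL text.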

This is why PEP 501 focused on the shlex.sh template processor as its primary motivation (and assuming PEP 750 is eventually accepted, @nhumrich and I intend to write a new PEP just for adding shlex.sh as a standard template processor).

2 Likes

Putting aside the terminology/naming issue for a moment, I’ll now attempt to put together the building blocks from several previous posts by various authors (including me) plus some fresh ideas – to explore, and make it easy to compare, various possibilities to add “true templating” (i.e., optionally decoupling the step of defining the template structure from the step of specifying the content, i.e., actual interpolated values) on top of the current design of the template strings feature (possibly suggesting some non-fundamental changes to that design).

(Please, correct me if I omit any important elements or misrepresent somebody’s ideas.)

Variant #1: Ellipsis-based

Assuming that the use of ... (the Ellipsis literal) as a special value denoting an unfilled value (as suggested by @ncoghlan) is a satisfying syntax, we might want to use it in the following way:

animal = "bird"  # (<- only this field value will be specified immediately)
t1 = t"A {...:animal_weight} ounce {animal} could {...:neg!r} carry a {...:coconut_weight:.2f} pound coconut."
assert t1 == t"A {...:animal_weight} ounce {'bird'} could {...:neg!r} carry a {...:coconut_weight:.2f} pound coconut."
html(t1)  # Could raise: ValueError("unfilled fields: 'animal_weight', 'neg', 'coconut_weight'")

t2 = fill(t1, animal_weight="<5", neg="not")
assert t2 == t"A {'<5'} ounce {'bird'} could {'not'!r} carry a {...:coconut_weight:.2f} pound coconut."
html(t2)  # Could raise: ValueError("unfilled fields: 'coconut_weight'")

t3 = fill(t2, {"coconut_weight": 1})
assert t3 == t"A {'<5'} ounce {'bird'} could {'not'!r} carry a {1:.2f} pound coconut."
assert html(t3) == "A &lt;5 ounce bird could &#39;not&#39; carry a 1.00 pound coconut."

(Please note how conversions and format specs are supported…)

The fill() [in some of my previous posts I proposed a less suitable name, bind] function would need to be implemented somehow along the lines of the following:

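import builtins
import re
import typing
from collections.abc import Iterator, Mapping
from typing import Any

# (Template and Interpolation are the types proposed in PEP 750.)

def fill(
    template: Template,
    mapping: Mapping[str, Any] | None = None,
    /,
    **kwargs: Any,
) -> Template: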

    if mapping is None:
        mapping = kwargs
    elif kwargs:
        raise TypeError(
            "fill() takes either a mapping as the "
            "sole positional argument or any number "
            "of keyword arguments, but not both")

    match_spec_regex = _FILL_SPEC_REGEX.fullmatch  # (see below)

    def _gen_segments() -> Iterator[str | Interpolation]:
        for segment in template.args:
            match segment:
                case Interpolation(builtins.Ellipsis, "...", noconv, spec):
                    match = match_spec_regex(spec)
                    if match is None:
                        raise ValueError(
                          f"when `...` is used as a placeholder for "
                          f"a value to be filled-in, what needs to "
                          f"be placed to the right of the `...:` "
                          f"marker is the obligatory name part with "
                          f"an optional conversion and/or format "
                          f"specification (got: {spec!r})")
                    if noconv is not None:
                        raise ValueError(
                          "when `...` is used as a placeholder for "
                          "a value to be filled-in, the conversion "
                          "part, if any, needs to be placed to the "
                          "right of the name part (which follows "
                          "the `...:` marker)")
                    field_name = match["field_name"]
                    if not field_name.isidentifier():
                        # (keep it simple + avoid parsing ambiguities)
                        raise ValueError(
                          f"when `...` is used as a placeholder for "
                          f"a value to be filled-in, the name part "
                          f"needs to be a valid Python variable "
                          f"name (got: {field_name!r})")
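                    if field_name not in mapping:
                        # Not supplied in this call: keep the field
                        # unfilled, so it can be filled incrementally.
                        yield segment
                        continue
                    value = mapping[field_name]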
                    expr = repr(value)  # Or match["field_name"] ?!?
                    conv = match["conv"]
                    format_spec = match["format_spec"] or ""
                    yield Interpolation(value, expr, conv, format_spec)
                case Interpolation() | str():
                    yield segment
                case _:
                    typing.assert_never(segment)

    return Template(*_gen_segments())

_FILL_SPEC_REGEX = re.compile(
    r"(?P<field_name>[^!:]+)"
    r"(?:"
        r"!"                 # (no need to be strict here,
        r"(?P<conv>[^:]+)"   # as Interpolation() constructor
    r")?"                    # will validate it anyway...)
    r"(?:"
        r":"
        r"(?P<format_spec>.*)"
    r")?"
)

Perhaps that function could become a Template’s instance method, as the blessed way to process ...:-engaging templates? (not necessarily immediately, maybe in another major release of Python…)

Anyway, with this approach “true templating” can be easily added on top of the current implementation of template strings.

Variant #2: tt-strings syntax

Another approach would require some syntax extension…

One possible way to denote an unfilled (aka unbound) replacement field could be the {:name}-like syntax (proposed in one of my latest posts). However, now I think it is hardly better than the {...:name}-based approach described above (as being not much prettier than the latter one, which at least makes occurrences of unfilled fields more visible/distinguishable…).

So here I’d like to explore yet another variant (important features of which were suggested by @Nineteendo) which, as it seems, could be introduced later, in a separate PEP. To do that it would be required to add:

  • tt-strings – with the semantics drafted below (obviously some other prefix than tt might be used [e.g., t if it was decided that the base prefix is i…]),

  • a new exception type: UnfilledTemplateError (being a subclass of ValueError),

  • a new type (somewhat similar to Interpolation):

    class UnfilledField:
        name: str
        conv: Literal["a", "r", "s"] | None
        format_spec: str
    
        __match_args__ = ("name", "conv", "format_spec")
    
        def __init__(
            self,
            name: str,
            conv: Literal["a", "r", "s"] | None = None,
            format_spec: str = "",
        ):
            ...
    

E.g., we might want to use that stuff in the following way:

# Just for more consistent naming in the following examples:
ReplacementField = Interpolation  

# First, a well known PEP-750 template string:
a = "<5"
b = "not"
c = 1
immediate = t"A {a} ounce bird could {b!r} carry a {c:.2f} pound coconut."
assert isinstance(immediate, Template) and immediate == Template(
    "A ",
    ReplacementField("<5", "a", None, ""),
    " ounce bird could ",
    ReplacementField("not", "b", "r", ""),
    " carry a ",
    ReplacementField(1, "c", None, ".2f"),
    " pound coconut.",
)  # Typically also equivalent (though not equal) to:
   # t"A {'<5'} ounce {'bird'} could {'not'!r} carry a {1:.2f} pound coconut."

# Now, the new stuff...
t1 = tt"A {bird_weight} ounce bird could {neg!r} carry a {coconut_weight:.2f} pound coconut."
assert isinstance(t1, Template) and t1 == Template(
    "A ",
    UnfilledField("bird_weight", None, ""),
    " ounce bird could ",
    UnfilledField("neg", "r", ""),
    " carry a ",
    UnfilledField("coconut_weight", None, ".2f"),
    " pound coconut.",
) != immediate
html(t1)  # Could raise: UnfilledTemplateError("unfilled fields: 'bird_weight', 'neg', 'coconut_weight'")

t2 = t1.fill(bird_weight="<5", neg="not")
assert isinstance(t2, Template) and t1 != t2 == Template(
    "A ",
    ReplacementField("<5", "bird_weight", None, ""),
    " ounce bird could ",
    ReplacementField("not", "neg", "r", ""),
    " carry a ",
    UnfilledField("coconut_weight", None, ".2f"),  # <- Note: still unfilled!
    " pound coconut.",
) != immediate
html(t2)  # Could raise: UnfilledTemplateError("unfilled fields: 'coconut_weight'")

t3 = t2.fill({"coconut_weight": 1})
assert html(t3) == html(immediate) == (
    "A &lt;5 ounce bird could &#39;not&#39; carry a 1.00 pound coconut."
)
# Note: typically, `t3` is also equivalent to `immediate`
# (though not equal to it -- because of `expr`s).

Template.fill() (somewhat similar to the fill() function from Variant #1) would accept one positional argument (a mapping) or any number of keyword arguments, and would produce a new instance of Template – with selected UnfilledField instances replaced with appropriately constructed Interpolation ones (where appropriately constructed means: Interpolation(mapping[unfilled.name], unfilled.name, unfilled.conv, unfilled.format_spec)).

Template would also provide the following methods:

  • Template.fill_all() – similar to fill(), but raising UnfilledTemplateError if any UnfilledField is remaining;
  • Template.all_fields_filled() – returning True unless args contains any UnfilledField;
  • classmethod Template.make_from() – accepting a template as an ordinary str, plus optionally such argument(s) as accepted by Template.fill(); returning a Template whose args contains the corresponding string/Interpolation/UnfilledField segments.
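A rough sketch of how these methods could fit together, using string.Formatter().parse() to split the ordinary str. It rests on several assumptions: all three classes are simplified stand-ins, fill() accepts keyword arguments only (the positional mapping form is left out for brevity), and fields nested inside format specifiers are not handled.

```python
import string
from dataclasses import dataclass
from typing import Any, Optional, Tuple, Union


@dataclass(frozen=True)
class Interpolation:
    """Simplified stand-in for the PEP 750 type (assumption)."""
    value: Any
    expr: str
    conv: Optional[str] = None
    format_spec: str = ""


@dataclass(frozen=True)
class UnfilledField:
    name: str
    conv: Optional[str] = None
    format_spec: str = ""


class UnfilledTemplateError(ValueError):
    pass


@dataclass(frozen=True)
class Template:
    args: Tuple[Union[str, Interpolation, UnfilledField], ...]

    @classmethod
    def make_from(cls, pattern: str, **values: Any) -> "Template":
        """Parse a str.format()-style pattern into segments."""
        segments = []
        for text, name, spec, conv in string.Formatter().parse(pattern):
            if text:
                segments.append(text)
            if name is not None:
                segments.append(UnfilledField(name, conv, spec or ""))
        return cls(tuple(segments)).fill(**values)

    def fill(self, **values: Any) -> "Template":
        """Fill the named fields; leave the others unfilled."""
        return Template(tuple(
            Interpolation(values[s.name], s.name, s.conv, s.format_spec)
            if isinstance(s, UnfilledField) and s.name in values else s
            for s in self.args))

    def all_fields_filled(self) -> bool:
        return not any(isinstance(s, UnfilledField) for s in self.args)

    def fill_all(self, **values: Any) -> "Template":
        """Like fill(), but fail if anything remains unfilled."""
        result = self.fill(**values)
        if not result.all_fields_filled():
            raise UnfilledTemplateError("some fields are still unfilled")
        return result
```

With this shape, fill_all() is just fill() plus a check, which is one argument for keeping only one of the two as a method.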

Variant #3: tt-strings syntax with Template tweaks

Like Variant #2 – but with modified Template, so that:

  • the constructor Template() behaves like Template.make_from() from Variant #2;
  • the type has classmethod Template.from_segments() – behaving like the current Template() constructor;
  • the type does not have Template.make_from(), as it would be redundant (see above).

Variant #4: with a factory maker

In this variant we would have neither any additional syntax nor Variant #1-like extra conventions and tools.

Instead, the Template class would provide only one additional classmethod: Template.get_factory(); it would accept a template as an ordinary str (like the first argument to make_from() in Variant #2) as well as, optionally, the default values for any subset of the template’s fields (in such a form as for Template.fill() in Variant #2); it would return a callable which:

  • would have to be called with argument(s) (in such a form as for Template.fill() in Variant #2) specifying values for at least those fields which were not included when get_factory() was called to obtain the callable;
  • would return a Template instance whose args contains the corresponding string/Interpolation segments (given the arguments).

Example use:

make_grail_template = Template.get_factory(
    "A {animal_weight} ounce {animal} could {neg!r} "
    "carry a {coconut_weight:.2f} pound coconut.",
    animal="bird",
    neg="never",
)
assert isinstance(make_grail_template, Callable)
grail_template = make_grail_template(
    animal_weight="<5",
    neg="not",  # (<- overrides the default from call to `get_factory()`)
    coconut_weight=1,
)
assert isinstance(grail_template, Template) and grail_template == Template(
    "A ",
    Interpolation("<5", "animal_weight", None, ""),
    " ounce ",
    Interpolation("bird", "animal", None, ""),
    " could ",
    Interpolation("not", "neg", "r", ""),
    " carry a ",
    Interpolation(1, "coconut_weight", None, ".2f"),
    " pound coconut.",
)  # Typically also equivalent (though not equal) to:
   # t"A {'<5'} ounce {'bird'} could {'not'!r} carry a {1:.2f} pound coconut."

Note: this variant, by itself, would not provide the possibility to fill field values incrementally (though that could be emulated with functools.partial).
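Here is a small sketch of that emulation. It is not the proposed API: `get_factory()` is written as a hypothetical module-level function, and each field is modelled as a plain `(value, name, conv, format_spec)` tuple instead of an Interpolation object, so that the block stays self-contained.

```python
import functools
import string
from typing import Any, Callable, Tuple

# Segments are plain strings or (value, name, conv, format_spec) tuples,
# a simplified stand-in for Interpolation objects (assumption).
Segments = Tuple[Any, ...]


def get_factory(pattern: str, **defaults: Any) -> Callable[..., Segments]:
    """Parse the pattern once; return a callable that builds segments."""
    parsed = list(string.Formatter().parse(pattern))

    def factory(**values: Any) -> Segments:
        merged = {**defaults, **values}  # call-time values override defaults
        segments = []
        for text, name, spec, conv in parsed:
            if text:
                segments.append(text)
            if name is not None:
                # A field with neither a default nor a value: KeyError.
                segments.append((merged[name], name, conv, spec or ""))
        return tuple(segments)

    return factory


make_grail = get_factory(
    "A {animal_weight} ounce {animal} could {neg!r} carry a coconut.",
    animal="bird",
    neg="never",
)

# Incremental filling, emulated with functools.partial:
step1 = functools.partial(make_grail, animal_weight="<5")
grail = step1(neg="not")
```

Each partial() call adds values without producing an intermediate template object, which is the main difference from the fill()-based variants.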

Combined Variants: #1-with-#4, #2-with-#4, #3-with-#4

Combining any of the variants #1/#2/#3 with Variant #4: the callable produced by the classmethod Template.get_factory() accepts values for any subset of the template’s fields (i.e., not necessarily for all fields omitted when get_factory() was called), and returns a new Template with replacement fields filled and/or unfilled, as appropriate.

Conclusion

It seems that each of the “true templating” variants described above could be added on top of the current design of template/interpolation strings (some of them with certain non-essential modifications to that design).

Any thoughts? :slight_smile:

3 Likes

Your conclusion is the same as mine: there are a variety of ways we could improve the dynamic templating support, but it’s far from clear which of them would be the best path to take. (One simple variant that occurred to me was to have {...} and {} literally mean the same thing in template strings: if you omit the expression entirely, it is implicitly Ellipsis)

As long as PEP 750 doesn’t prevent pursuing those ideas later, then the situation feels similar to the one PEP 501 was in back when f-strings were first proposed: experience gained with the narrower proposal will help improve the expanded proposal, so waiting is a good option.

In that vein, one possibility we may want to revisit is to reintroduce a structural typing protocol for the Template interface. The protocols were taken out because we didn’t see a strong use case for them (and I still don’t see a strong use case for an Interpolation protocol), but I’m now wondering if it might be worth our while to start with this initial structure:

  • templatelib.Template (protocol)
  • templatelib.StaticTemplate (concrete PEP 750 template)
  • templatelib.Interpolation (concrete PEP 750 field)

Which would then make experimentation with dynamic templates outside the standard library easier (no need to inherit from the concrete type, just implement the structural protocol), and leave the door open to the future addition of:

  • templatelib.DynamicTemplate (concrete template optimised for dynamic value insertion)

Dynamic templates would still use the same interpolation field implementation as static templates, the meaning of their value fields would just be slightly different (either all Ellipsis to indicate their use as placeholders, or else holding default values to use when fields aren’t supplied explicitly)
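As a rough illustration of the structural approach, the sketch below uses typing.Protocol. All names are assumptions: `TemplateLike` stands in for the proposed protocol, the two classes stand in for the concrete types, and fields are modelled as plain tuples.

```python
from typing import Any, Protocol, Tuple, runtime_checkable


@runtime_checkable
class TemplateLike(Protocol):
    """Hypothetical structural protocol: anything exposing `args`."""

    @property
    def args(self) -> Tuple[Any, ...]: ...


class StaticTemplate:
    """Stand-in for the concrete PEP 750 template (assumption)."""

    def __init__(self, *args: Any) -> None:
        self._args = args

    @property
    def args(self) -> Tuple[Any, ...]:
        return self._args


class DynamicTemplate:
    """Third-party experiment: no inheritance from StaticTemplate."""

    def __init__(self, pattern_args: Tuple[Any, ...]) -> None:
        self.args = pattern_args


def count_fields(template: TemplateLike) -> int:
    """A generic algorithm that works with any conforming template."""
    return sum(1 for seg in template.args if not isinstance(seg, str))
```

Note that a runtime isinstance() check against such a protocol only verifies that the attribute exists, not its type; static checkers do the stricter comparison.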

Edit: Some other potential bikeshed colours for the PEP 750 concrete implementation type would be BoundTemplate, FilledTemplate, PopulatedTemplate. Essentially acknowledging that there’s a separation between the general concept of interpolation templates (the protocol), and the specific implementation backing t-strings (which produces already populated templates with default values for every replacement field)

3 Likes

Would the Template protocol cover the constructor? (That could preclude some future ideas…)

Another thing that seems to me worth deliberating: adding new methods to protocols is painful when it comes to backward compatibility (often the only option is to add a protocol’s subtype)…

Protocols can’t currently constrain the default constructor (see Enforce compatible __init__ in structural subtypes · Issue #3823 · python/mypy · GitHub).

However, as per Protocols — typing documentation, they can require that certain class methods be available:

Static methods, class methods, and properties are equally allowed in protocols.

This means a Template protocol could require that template implementations offer two standard alternate constructors (with useful default implementations):

@classmethod
def from_segments(cls, *args: LiteralStr|Interpolation) -> Self:
    return cls(*args)

@classmethod
def from_template(cls, t:Template) -> Self:
    return cls.from_segments(*t.args)

This is an area where adding templatelib as a dedicated support module is useful: general purpose algorithms that are valid for any Template implementation can live there, with only the parts that may be tightly coupled to a specific template implementation needing to be methods on the type.

1 Like

This is not entirely equivalent to str.format_map():

>>> "Price: {price:{fmt}}".format_map({"price": 49, "fmt": ".2f"})
'Price: 49.00'
>>> template_from_format_map("Price: {price:{fmt}}", {"price": 49, "fmt": ".2f"})
['Price: ', (49, 'price', None, '{fmt}')]

But, you’re right that this could be implemented by a third party library.
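One way such a library might close the gap is to resolve fields nested in the format specifier before building each segment. The sketch below is a hypothetical variant of the function shown above, not the original implementation; it relies only on string.Formatter.parse() and string.Formatter.vformat() from the standard library and returns plain tuples instead of Interpolation objects.

```python
import string
from typing import Any, List, Mapping, Optional, Tuple, Union

# (value, expr, conv, format_spec), mirroring the output shown above.
Field = Tuple[Any, str, Optional[str], str]


def template_from_format_map(
    pattern: str, mapping: Mapping[str, Any]
) -> List[Union[str, Field]]:
    """Split a str.format()-style pattern, resolving nested spec fields."""
    formatter = string.Formatter()
    segments: List[Union[str, Field]] = []
    for text, name, spec, conv in formatter.parse(pattern):
        if text:
            segments.append(text)
        if name is not None:
            # parse() returns the spec unexpanded (e.g. "{fmt}"), so
            # expand any fields nested inside it against the mapping.
            resolved_spec = formatter.vformat(spec or "", (), mapping)
            segments.append((mapping[name], name, conv, resolved_spec))
    return segments
```

Called with the same pattern and mapping as above, this variant yields '.2f' rather than '{fmt}' as the format specifier, which matches what str.format_map() would apply.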


Here’s my attempt at a summary of the different proposals (assuming fields in the format specifier can be replaced):

         "Price: {          :{     }}" .tformat(      49,     ".2f")
Template("Price: {          :{     }}" ,              49,     ".2f")
       tt"Price: {          :{     }}" .fill   (      49,     ".2f")
         "Price: {    price :{ fmt }}" .tformat(price=49, fmt=".2f")
Template("Price: {    price :{ fmt }}" ,        price=49, fmt=".2f")
       tt"Price: {    price :{ fmt }}" .fill   (price=49, fmt=".2f")
 Binder(t"Price: {   'price':{{fmt}}}").bind   (price=49, fmt=".2f")
        t"Price: {...:price :{{fmt}}}" .fill   (price=49, fmt=".2f")