PEP 750: Tag Strings For Writing Domain-Specific Languages

Putting aside the terminology/naming issue for a moment, I’ll now try to combine the building blocks from several previous posts by various authors (including me) with some fresh ideas, in order to explore various ways of adding “true templating” on top of the current design of the template strings feature, and to make those ways easy to compare. By “true templating” I mean optionally decoupling the step of defining the template structure from the step of specifying the content (i.e., the actual interpolated values). Some of the variants suggest non-fundamental changes to the current design.

(Please correct me if I omit any important elements or misrepresent somebody’s ideas.)

Variant #1: Ellipsis-based

Assuming that using ... (the Ellipsis literal) as a special value denoting an unfilled field (as suggested by @ncoghlan) is acceptable syntax, we might want to use it in the following way:

animal = "bird"  # (<- only this field value will be specified immediately)
t1 = t"A {...:animal_weight} ounce {animal} could {...:neg!r} carry a {...:coconut_weight:.2f} pound coconut."
assert t1 == t"A {...:animal_weight} ounce {'bird'} could {...:neg!r} carry a {...:coconut_weight:.2f} pound coconut."
html(t1)  # Could raise: ValueError("unfilled fields: 'animal_weight', 'neg', 'coconut_weight'")

t2 = fill(t1, animal_weight="<5", neg="not")
assert t2 == t"A {'<5'} ounce {'bird'} could {'not'!r} carry a {...:coconut_weight:.2f} pound coconut."
html(t2)  # Could raise: ValueError("unfilled fields: 'coconut_weight'")

t3 = fill(t2, {"coconut_weight": 1})
assert t3 == t"A {'<5'} ounce {'bird'} could {'not'!r} carry a {1:.2f} pound coconut."
assert html(t3) == "A &lt;5 ounce bird could &#39;not&#39; carry a 1.00 pound coconut."

(Please note how conversions and format specs are supported…)

The fill() function [in some of my previous posts I proposed a less suitable name, bind] could be implemented along the following lines:

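import builtins
import re
import typing
from collections.abc import Iterator, Mapping
from typing import Any

# (Template and Interpolation are the PEP 750 types; where they are
# imported from depends on the final shape of the PEP.)

def fill(
    template: Template,
    mapping: Mapping[str, Any] | None = None,
    /,
    **kwargs: Any,
) -> Template: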

    if mapping is None:
        mapping = kwargs
    elif kwargs:
        raise TypeError(
            "fill() takes either a mapping as the "
            "sole positional argument or any number "
            "of keyword arguments, but not both")

    match_spec_regex = _FILL_SPEC_REGEX.fullmatch  # (see below)

    def _gen_segments() -> Iterator[str | Interpolation]:
        for segment in template.args:
            match segment:
                case Interpolation(builtins.Ellipsis, "...", noconv, spec):
                    match = match_spec_regex(spec)
                    if match is None:
                        raise ValueError(
                          f"when `...` is used as a placeholder for "
                          f"a value to be filled-in, what needs to "
                          f"be placed to the right of the `...:` "
                          f"marker is the obligatory name part with "
                          f"an optional conversion and/or format "
                          f"specification (got: {spec!r})")
                    if noconv is not None:
                        raise ValueError(
                          "when `...` is used as a placeholder for "
                          "a value to be filled-in, the conversion "
                          "part, if any, needs to be placed to the "
                          "right of the name part (which follows "
                          "the `...:` marker)")
                    field_name = match["field_name"]
                    if not field_name.isidentifier():
                        # (keep it simple + avoid parsing ambiguities)
                        raise ValueError(
                          f"when `...` is used as a placeholder for "
                          f"a value to be filled-in, the name part "
                          f"needs to be a valid Python variable "
                          f"name (got: {field_name!r})")
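                    if field_name not in mapping:
                        # No value supplied in this call: keep the
                        # field unfilled (allows incremental filling).
                        yield segment
                        continue
                    value = mapping[field_name]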
                    expr = repr(value)  # Or match["field_name"] ?!?
                    conv = match["conv"]
                    format_spec = match["format_spec"] or ""
                    yield Interpolation(value, expr, conv, format_spec)
                case Interpolation() | str():
                    yield segment
                case _:
                    typing.assert_never(segment)

    return Template(*_gen_segments())

_FILL_SPEC_REGEX = re.compile(
    r"(?P<field_name>[^!:]+)"
    r"(?:"
        r"!"                 # (no need to be strict here,
        r"(?P<conv>[^:]+)"   # as Interpolation() constructor
    r")?"                    # will validate it anyway...)
    r"(?:"
        r":"
        r"(?P<format_spec>.*)"
    r")?"
)

Perhaps that function could become a method of Template, as the blessed way to process templates that use ...: placeholders? (Not necessarily immediately; maybe in another major release of Python…)

Anyway, with this approach “true templating” can be easily added on top of the current implementation of template strings.

Variant #2: tt-strings syntax

Another approach would require some syntax extension…

One possible way to denote an unfilled (aka unbound) replacement field could be the {:name}-like syntax (proposed in one of my latest posts). However, I now think it is hardly better than the {...:name}-based approach described above: it is not much prettier, and the latter at least makes occurrences of unfilled fields more visible/distinguishable.

So here I’d like to explore yet another variant (important features of which were suggested by @Nineteendo) which, as it seems, could be introduced later, in a separate PEP. To do that it would be required to add:

  • tt-strings – with the semantics drafted below (obviously some other prefix than tt might be used [e.g., t if it was decided that the base prefix is i…]),

  • a new exception type: UnfilledTemplateError (being a subclass of ValueError),

  • a new type (somewhat similar to Interpolation):

    class UnfilledField:
        name: str
        conv: Literal["a", "r", "s"] | None
        format_spec: str
    
        __match_args__ = ("name", "conv", "format_spec")
    
        def __init__(
            self,
            name: str,
            conv: Literal["a", "r", "s"] | None = None,
            format_spec: str = "",
        ):
            ...
    

E.g., we might want to use that stuff in the following way:

# Just for more consistent naming in the following examples:
ReplacementField = Interpolation  

# First, a well known PEP-750 template string:
a = "<5"
b = "not"
c = 1
immediate = t"A {a} ounce bird could {b!r} carry a {c:.2f} pound coconut."
assert isinstance(immediate, Template) and immediate == Template(
    "A ",
    ReplacementField("<5", "a", None, ""),
    " ounce bird could ",
    ReplacementField("not", "b", "r", ""),
    " carry a ",
    ReplacementField(1, "c", None, ".2f"),
    " pound coconut.",
)  # Typically also equivalent (though not equal) to:
   # t"A {'<5'} ounce bird could {'not'!r} carry a {1:.2f} pound coconut."

# Now, the new stuff...
t1 = tt"A {bird_weight} ounce bird could {neg!r} carry a {coconut_weight:.2f} pound coconut."
assert isinstance(t1, Template) and t1 == Template(
    "A ",
    UnfilledField("bird_weight", None, ""),
    " ounce bird could ",
    UnfilledField("neg", "r", ""),
    " carry a ",
    UnfilledField("coconut_weight", None, ".2f"),
    " pound coconut.",
) != immediate
html(t1)  # Could raise: UnfilledTemplateError("unfilled fields: 'bird_weight', 'neg', 'coconut_weight'")

t2 = t1.fill(bird_weight="<5", neg="not")
assert isinstance(t2, Template) and t1 != t2 == Template(
    "A ",
    ReplacementField("<5", "bird_weight", None, ""),
    " ounce bird could ",
    ReplacementField("not", "neg", "r", ""),
    " carry a ",
    UnfilledField("coconut_weight", None, ".2f"),  # <- Note: still unfilled!
    " pound coconut.",
) != immediate
html(t2)  # Could raise: UnfilledTemplateError("unfilled fields: 'coconut_weight'")

t3 = t2.fill({"coconut_weight": 1})
assert html(t3) == html(immediate) == (
    "A &lt;5 ounce bird could &#39;not&#39; carry a 1.00 pound coconut."
)
# Note: typically, `t3` is also equivalent to `immediate`
# (though not equal to it -- because of `expr`s).

Template.fill() (somewhat similar to the fill() function from Variant #1) would accept one positional argument (a mapping) or any number of keyword arguments, and would produce a new instance of Template – with selected UnfilledField instances replaced with appropriately constructed Interpolation ones (where appropriately constructed means: Interpolation(mapping[unfilled.name], unfilled.name, unfilled.conv, unfilled.format_spec)).

Template would also provide the following methods:

  • Template.fill_all() – similar to fill(), but raising UnfilledTemplateError if any UnfilledField remains;
  • Template.all_fields_filled() – returning True unless args contains any UnfilledField;
  • classmethod Template.make_from() – accepting a template as an ordinary str, plus optionally the same argument(s) as accepted by Template.fill(); returning a Template whose args contains the corresponding string/Interpolation/UnfilledField segments.

Variant #3: tt-strings syntax with Template tweaks

Like Variant #2 – but with modified Template, so that:

  • the constructor Template() behaves like Template.make_from() from Variant #2;
  • the type has classmethod Template.from_segments() – behaving like the current Template() constructor;
  • the type does not have Template.make_from(), as it would be redundant (see above).

Variant #4: with a factory maker

In this variant we would have neither any additional syntax nor Variant #1-like extra conventions and tools.

Instead, the Template class would provide only one additional classmethod: Template.get_factory(); it would accept a template as an ordinary str (like the first argument to make_from() in Variant #2) as well as, optionally, the default values for any subset of the template’s fields (in such a form as for Template.fill() in Variant #2); it would return a callable which:

  • would have to be called with argument(s) (in such a form as for Template.fill() in Variant #2) specifying values for at least those fields which were not included when get_factory() was called to obtain the callable;
  • would return a Template instance whose args contains the string/Interpolation segments determined by the given arguments.

Example use:

make_grail_template = Template.get_factory(
    "A {animal_weight} ounce {animal} could {neg!r} "
    "carry a {coconut_weight:.2f} pound coconut.",
    animal="bird",
    neg="never",
)
assert isinstance(make_grail_template, Callable)
grail_template = make_grail_template(
    animal_weight="<5",
    neg="not",  # (<- overrides the default from call to `get_factory()`)
    coconut_weight=1,
)
assert isinstance(grail_template, Template) and grail_template == Template(
    "A ",
    Interpolation("<5", "animal_weight", None, ""),
    " ounce ",
    Interpolation("bird", "animal", None, ""),
    " could ",
    Interpolation("not", "neg", "r", ""),
    " carry a ",
    Interpolation(1, "coconut_weight", None, ".2f"),
    " pound coconut.",
)  # Typically also equivalent (though not equal) to:
   # t"A {'<5'} ounce {'bird'} could {'not'!r} carry a {1:.2f} pound coconut."

Note: this variant, by itself, would not provide the possibility to fill field values incrementally (though that could be emulated with functools.partial).

Combined Variants: #1-with-#4, #2-with-#4, #3-with-#4

When any of Variants #1/#2/#3 is combined with Variant #4, the callable produced by the classmethod Template.get_factory() accepts values for any subset of the template’s fields (i.e., not necessarily for all fields omitted when get_factory() was called), and returns a new Template with replacement fields filled and/or unfilled, as appropriate.

Conclusion

It seems that each of the “true templating” variants described above could be added on top of the current design of template/interpolation strings (some of them with certain non-essential modifications to that design).

Any thoughts? 🙂
