PEP 750: Tag Strings For Writing Domain-Specific Languages

When writing "Feature Proposal: Multi-String Replacement Using a Dictionary in the .replace() Method - #10 by blhsing", it occurred to me that it would be interesting if either string.Formatter or the literal Template type offered a way to dynamically convert runtime format strings to interpolation template instances.

The code to implement a basic version of that with only name lookups (and without dynamic field format definitions) isn’t overly complicated:

def template_from_format_map(
    fmt: LiteralString, values: Mapping[str, Any]
) -> Template:
    parser = string.Formatter()
    segments = []
    for prefix, name, field_fmt, field_conv in parser.parse(fmt):
        segments.append(prefix)
        if name is not None:
            field = Interpolation(values[name], name, field_conv, field_fmt)
            segments.append(field)
    return Template(*segments)

(The format string would be typed as LiteralString instead of str as a reminder that any static template content should always come from a trusted source, whether that’s an actual literal string or something that has been explicitly cast to one.)
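For anyone who hasn't used it directly, here is a quick look at what string.Formatter.parse() yields, since the loop above relies on the tuple order. The format string is just an illustrative example:

```python
import string

parser = string.Formatter()

# Each tuple is (literal_text, field_name, format_spec, conversion).
# The trailing literal text comes back with all three field entries set to None.
fields = list(parser.parse("Hello {name!r:>10}, bye"))
print(fields)
```

Note that the format spec comes before the conversion in each tuple, which is the reverse of the order they appear in the format string.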

Allowing the same level of dynamic reference flexibility as str.format and str.format_map is substantially more complicated.
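As a rough illustration of where the extra complexity comes from, the snippet below pokes at two of the features a name-only lookup skips: attribute/index references and nested replacement fields. It only uses existing string.Formatter methods; the sample data is made up:

```python
import string

formatter = string.Formatter()

# Attribute and index references are resolved by get_field(), which returns
# the resolved object along with the first key it looked up.
value, first_key = formatter.get_field("user[name]", (), {"user": {"name": "Ada"}})

# A nested replacement field is left unexpanded in the format spec that
# parse() reports, so it needs a further expansion pass before formatting.
nested = list(formatter.parse("{value:{width}}"))

print(value, first_key, nested)
```

Automatic field numbering (bare `{}` fields) is a third feature that would need handling, and it isn't shown here.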

Adding an actual string.Formatter.as_template() method would likely be confusing (given the ambiguity between the new Template literal type and string.Template), but I think we could unambiguously offer a string.Formatter.get_segments method that worked for both the default formatting and any custom subclasses (since it would only be calling subclass APIs that string.Formatter.vformat already calls - presumably _vformat would be redesigned to call self.get_segments instead of calling self.parse directly):

def get_segments(self,
    fmt: LiteralString, args: Sequence[Any], values: Mapping[str, Any]
) -> Sequence[LiteralString|Interpolation]:
    segments = []
    for prefix, value_ref, field_fmt, field_conv in self.parse(fmt):
        segments.append(prefix)
        if value_ref is not None:
            # This branch would do everything `_vformat` does, but
            # writing that out would make this example far too long
            value, _lookup_key = self.get_field(value_ref, args, values)
            field = Interpolation(value, value_ref, field_conv, field_fmt)
            segments.append(field)
    return segments

That way, any str.format_map call could be turned into an interpolation template instance by replacing:

formatted = pattern.format_map(values)

with:

segments = string.Formatter().get_segments(pattern, (), values)
template = Template(*segments)
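To sanity check the idea without the new types, here is a minimal runnable sketch. The Interpolation NamedTuple is a stand-in for the proposed type, and SegmentFormatter and render are hypothetical names made up for this example (get_segments is not an existing Formatter API). Nested format specs and auto-numbered fields are deliberately left out:

```python
import string
from typing import Any, List, Mapping, NamedTuple, Optional, Sequence, Union


class Interpolation(NamedTuple):
    """Stand-in for the proposed Interpolation type (an assumption)."""
    value: Any
    expr: str
    conv: Optional[str]
    format_spec: str


class SegmentFormatter(string.Formatter):
    """Hypothetical subclass; get_segments is the proposed method."""

    def get_segments(
        self, fmt: str, args: Sequence[Any], values: Mapping[str, Any]
    ) -> List[Union[str, Interpolation]]:
        segments: List[Union[str, Interpolation]] = []
        for prefix, value_ref, field_fmt, field_conv in self.parse(fmt):
            segments.append(prefix)
            if value_ref is not None:
                value, _lookup_key = self.get_field(value_ref, args, values)
                segments.append(Interpolation(value, value_ref, field_conv, field_fmt))
        return segments

    def render(self, segments: Sequence[Union[str, Interpolation]]) -> str:
        # Apply the conversion and format spec the same way vformat would.
        parts = []
        for segment in segments:
            if isinstance(segment, Interpolation):
                converted = self.convert_field(segment.value, segment.conv)
                parts.append(self.format_field(converted, segment.format_spec))
            else:
                parts.append(segment)
        return "".join(parts)


pattern = "{user[name]} has {count:>3} items"
values = {"user": {"name": "Ada"}, "count": 3}
formatter = SegmentFormatter()
segments = formatter.get_segments(pattern, (), values)
rendered = formatter.render(segments)
print(rendered)
```

Rendering the segments gives the same text as pattern.format_map(values), which suggests the segments carry everything needed to reproduce the eager formatting result.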

Along similar lines, while it definitely doesn’t need to be in the initial design, we may also eventually want to add a replace_values method to template instances to make it easier to use statically defined templates for dynamic formatting tasks:

def replace_values(self, values: Sequence[Any]) -> Self:
    value_iter = iter(values)
    def update_segments():
        for segment in self.args:
            match segment:
                case str() as s:
                    updated = s
                case Interpolation(_, _, conv, fmt_spec):
                    value = next(value_iter)
                    updated = Interpolation(value, repr(value), conv, fmt_spec)
            yield updated
    return type(self)(*update_segments())
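A standalone sketch of the same idea, written as a function over a plain list of segments so it runs without the new types. The Interpolation NamedTuple is again a stand-in, and the up-front length check is my own assumption about what the desired behaviour might be (in the generator version above, running out of values would surface as a RuntimeError, since StopIteration can't escape a generator, and any extra values would be silently ignored):

```python
from typing import Any, List, NamedTuple, Optional, Sequence, Union


class Interpolation(NamedTuple):
    """Stand-in for the proposed Interpolation type (an assumption)."""
    value: Any
    expr: str
    conv: Optional[str]
    format_spec: str


Segment = Union[str, Interpolation]


def replace_values(segments: Sequence[Segment], values: Sequence[Any]) -> List[Segment]:
    """Return new segments with each interpolation's value swapped out."""
    field_count = sum(isinstance(s, Interpolation) for s in segments)
    if field_count != len(values):
        raise ValueError(f"expected {field_count} values, got {len(values)}")
    value_iter = iter(values)
    updated: List[Segment] = []
    for segment in segments:
        if isinstance(segment, Interpolation):
            value = next(value_iter)
            # As in the method version, repr(value) replaces the expression text.
            segment = Interpolation(value, repr(value), segment.conv, segment.format_spec)
        updated.append(segment)
    return updated


static = ["Hello ", Interpolation("world", "name", None, ">8"), "!"]
replaced = replace_values(static, ["Ada"])
print(replaced)
```

The original segments are left untouched, so one statically defined template can be reused with as many value sets as needed.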

Tangent: string.Formatter.convert_field() may be worth mentioning in the PEP as the current dynamic stdlib implementation of the standard conversion specifiers (it’s an instance method rather than a static method, as it’s designed to allow Formatter subclasses to override it)
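For reference, a small demonstration of what convert_field() does today and how a subclass can extend it. UpperFormatter and its "u" conversion are hypothetical, invented purely for this example:

```python
import string


class UpperFormatter(string.Formatter):
    """Hypothetical subclass adding a custom "u" conversion."""

    def convert_field(self, value, conversion):
        if conversion == "u":
            return str(value).upper()
        return super().convert_field(value, conversion)


default = string.Formatter()
# None leaves the value alone; "s", "r" and "a" apply str(), repr() and ascii().
converted = [default.convert_field("caf\u00e9", conv) for conv in (None, "s", "r", "a")]
shouted = UpperFormatter().format("{0!u} and {0!r}", "abc")
print(converted, shouted)
```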
