I'd like to understand the PEP-501 array-of-2-tuples representation

I’ve been reading PEP 501 (General purpose string interpolation, on peps.python.org) and this part confuses me:

The parsed template consists of a tuple of 2-tuples, with each 2-tuple containing the following fields:

  • leading_text: a leading string literal. This will be the empty string if the current field is at the start of the string, or immediately follows the preceding field.
  • field_expr: the text of the expression element in the substitution field. This will be None for a final trailing text segment.
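As a concrete illustration of the two fields quoted above, here is a hand-written sketch of what such a parsed template might look like for one string. The template text and the field names (`name`, `suffix`, `count`) are made up for this example, and the tuple is typed out by hand rather than produced by any real PEP 501 implementation.

```python
# Hand-written illustration; not output from a real PEP 501 parser.
template = "Hello {name}{suffix}, you have {count} messages."

parsed = (
    ("Hello ", "name"),
    ("", "suffix"),            # empty leading_text: immediately follows the preceding field
    (", you have ", "count"),
    (" messages.", None),      # final trailing text segment: field_expr is None
)

# Each 2-tuple unpacks directly into the two documented fields.
for leading_text, field_expr in parsed:
    print(repr(leading_text), repr(field_expr))
```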

I see that it’s doing that, but it doesn’t seem to explain why that particular structure was chosen.

Because it’s a way to describe the alternating pattern in an i-string: alternating literal text and substitution expressions.

It’s the same reason string.Formatter.parse() returns tuples of (literal_text, field_name, format_spec, conversion).
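For comparison, this is roughly what the standard library call looks like in practice. The template string is only an example; `string.Formatter().parse()` itself is the real stdlib method.

```python
import string

fields = list(
    string.Formatter().parse("Hello {name!r}, you have {count:d} messages.")
)

# One tuple per literal/field pair; the trailing literal has no field,
# so its last three entries are None.
for literal_text, field_name, format_spec, conversion in fields:
    print(repr(literal_text), field_name, repr(format_spec), conversion)
```

The same "leading literal plus optional field" shape appears here, just with two extra slots for the format spec and conversion.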

Is it some kind of optimization? Again, I see the what; what I don’t understand is the why.

I’m sorry, but I don’t understand your confusion. How else could it be specified? I guess it could be a tuple where you’d have to know the odd numbered elements meant something different from the even numbered ones, but that would be a hassle to operate on. What would you expect it to be?
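To make the "hassle" concrete, here is a minimal sketch of consuming both layouts. The helper names `render_pairs` and `render_flat` are hypothetical, and it assumes each field is a plain name looked up in a dict rather than an arbitrary expression, which is a simplification of what the PEP describes.

```python
def render_pairs(parsed, values):
    """Render from a tuple of (leading_text, field_expr) 2-tuples."""
    parts = []
    for leading_text, field_expr in parsed:
        parts.append(leading_text)
        if field_expr is not None:
            parts.append(str(values[field_expr]))
    return "".join(parts)


def render_flat(flat, values):
    """Render from a flat tuple where even indexes are text, odd are fields."""
    parts = []
    for index, item in enumerate(flat):
        if index % 2 == 0:       # the reader has to know the parity convention
            parts.append(item)
        else:
            parts.append(str(values[item]))
    return "".join(parts)


values = {"name": "Ada", "count": 3}
pairs = (("Hello ", "name"), (", you have ", "count"), (" messages.", None))
flat = ("Hello ", "name", ", you have ", "count", " messages.")

print(render_pairs(pairs, values))
print(render_flat(flat, values))
```

Both produce the same string; the difference is that the paired form names its two roles at the point of unpacking instead of encoding them in index parity.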

I guess … why that format instead of, say, another one? I’d like to understand what makes it a better format. I don’t have expectations of what the representation should be, but positional representations like that always “feel” off to me.

I can see how it’d work, but how one arrived at that representation is unclear.

I can’t explain the thought process behind it. But it seems obvious to me that a good way to represent the structure in ABABABA is ((A, B), (A, B), (A, B), (A, None)).
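A small sketch of that pairing, using placeholder strings for the A and B elements (the names `A1`, `B1`, and so on are just stand-ins):

```python
from itertools import zip_longest

# Alternating sequence A B A B A B A, ending on an A with no B after it.
flat = ("A1", "B1", "A2", "B2", "A3", "B3", "A4")

# Even positions are the A items, odd positions are the B items;
# zip_longest pads the missing final B with None.
pairs = tuple(zip_longest(flat[0::2], flat[1::2]))

print(pairs)
```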
