Concerned about Virality of T-Strings (PEP 750)

guido · May 20, 2025, 4:11pm

More OT: Even implicit concatenation is a footgun. IIRC while at Google we had code containing long lists of comma-separated string literals (e.g. a list of countries) and devs would occasionally update the list but forget to add the comma. For this reason Blaze’s Starlark doesn’t support implicit concatenation at all.
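For readers who have not hit this, here is a minimal illustration of the missing-comma problem described above. The list contents are invented for the example.

```python
# A comma is missing after "Canada", so the two adjacent string literals
# are silently joined into a single element at compile time.
countries = [
    "Brazil",
    "Canada"
    "Denmark",
    "Egypt",
]

print(len(countries))  # 3, not 4
print(countries[1])    # CanadaDenmark
```

No error or warning is raised; the list is simply one element short.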

Liz · May 20, 2025, 4:45pm

I don’t see how that’s any different here, if a template can be concatenated at compile time with an f-string that might contain a user-provided value. But I also think it was inherently unsafe for webapps to pass unrestricted user input anywhere prior to basic safety checks, unless the function was documented to be safe for arbitrary inputs.

dkp · May 23, 2025, 3:38pm

Revisiting older conversations about Template concatenation, my read is that there are roughly three different perspectives. From most to least permissive:

1. The developer experience view. Reflected in the current PEP and implementation. A Template literal looks like any other (f-)string. Python developers expect string-like values to combine with other string-like values. Prohibiting explicit or implicit concatenation of Template and str would lead to needless surprise/frustration. (A variant holds that implicit concatenation can often be a footgun, so we’ll avoid extending it further.)
2. The security view. Templates are a language-level feature to help prevent injection vulnerabilities. Template.strings is synonymous with “trusted”; Template.interpolations with “untrusted”. Concatenating a Template with an arbitrary str is unsafe because we can’t know the str’s trust level, so the operation should be disallowed. Developers can mark a string as trusted with Template(my_str) or untrusted with Template(Interpolation(my_str)). The current PEP treats all str as trusted by default; from the security view, a clear footgun.
3. The DSL view. Templates are building blocks for domain-specific languages. Interesting template processing code parses template content against some backing grammar. Some grammars allow concatenating conforming strings; others do not. As a result, even allowing Template + Template is probably a mistake. Code that processes a Template can instead return a domain type with __add__/__radd__ when concatenation is appropriate, or rely on other composition mechanisms.
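To make the DSL view a little more concrete, here is a rough sketch of a domain type that decides for itself what concatenation means. Everything in it is hypothetical: `SqlFragment`, its fields, and the choice to join fragments with a space are assumptions for illustration, not part of PEP 750 or of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SqlFragment:
    """Hypothetical result type that a template-processing function might return."""

    text: str            # SQL text with "?" placeholders
    params: tuple = ()   # values bound to those placeholders, in order

    def __add__(self, other):
        # Only fragment + fragment is meaningful in this toy grammar.
        # Returning NotImplemented for anything else (including plain str)
        # makes Python raise TypeError.
        if not isinstance(other, SqlFragment):
            return NotImplemented
        return SqlFragment(self.text + " " + other.text, self.params + other.params)


select = SqlFragment("SELECT * FROM users")
where = SqlFragment("WHERE name = ?", ("alice",))
query = select + where

print(query.text)    # SELECT * FROM users WHERE name = ?
print(query.params)  # ('alice',)
```

Under this approach `query + "LIMIT 1"` raises a TypeError, and a grammar where concatenation makes no sense would simply leave out `__add__`.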

I suppose these views aren’t so far apart: if you take a more restrictive view than what ships in 3.14, you can configure a lint rule. Likewise, if you take a more permissive view, you can write a tiny helper function. For what it’s worth, I personally align with bucket (2): I think it’s likely to strike the right balance for the 80% case and lead to the least overall dev confusion and the fewest grudgingly configured extra lint rules.

davidism · May 23, 2025, 4:27pm

As the maintainer of MarkupSafe, I support your option 2, the security view.

MarkupSafe’s model is very similar. escape(str) always produces a processed string (t"str"), and Markup(str) produces a trusted string (Template(str)). Then all string operations on Markup escape their arguments.

Unsafe: str + str
Safe: escape(str)
Safe: escape(str) + str or str + escape(str); the argument to __add__/__radd__ is automatically escaped

The downside is MarkupSafe needs to implement most dunder and str methods in order to catch all the ways strings can be manipulated, and there are non-trivial aspects to that. That’s sidestepped in t-strings because Template is not a string class.
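The model described above can be sketched in a few lines of plain Python. This is a toy stand-in written for illustration, not MarkupSafe's actual implementation: the names `Markup` and `escape` mirror MarkupSafe's, but only `+` is handled here, which is exactly the gap the real library has to close by overriding many more methods.

```python
from html import escape as html_escape


class Markup(str):
    """Toy 'trusted string': anything added to it is escaped first."""

    def __add__(self, other):
        # str(self) yields a plain str, so this does not recurse.
        return Markup(str(self) + str(escape(other)))

    def __radd__(self, other):
        return Markup(str(escape(other)) + str(self))


def escape(value):
    """Return value as Markup, escaping it unless it is already trusted."""
    if isinstance(value, Markup):
        return value
    return Markup(html_escape(str(value)))


user_input = "<script>alert(1)</script>"  # invented untrusted value

wrapped = Markup("<b>") + user_input + Markup("</b>")
print(wrapped)  # <b>&lt;script&gt;alert(1)&lt;/script&gt;</b>

prefixed = "Hello <you> " + escape(user_input)
print(type(prefixed).__name__)  # Markup
```

Note that in the last line the plain str on the left is escaped as well, because `__radd__` treats it as untrusted.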

Again, I agree with option 2. I think you should modify the implementation to disallow Template + string etc, but keep Template + Template because it doesn’t result in a loss of safe/unsafe information.

nhumrich · June 10, 2025, 9:36pm

I am curious where this is going. It seems many people feel that implicit concatenation between Template and string is a footgun, and should be avoided. Is there a reason this isn’t being considered? How can I help progress this idea?

At least according to the PEP, security (preventing injection) seems to be the main motivator for PEP 750. So why is the security view #2 not the primary view?

The linting argument is somewhat valid, but I would argue that you could already get all the guarantees that PEP 750 provides by using types (with Literal) and linting. If PEP 750 only exists to allow linters to catch possible injection, I feel it has missed the mark.

MegaIng · June 10, 2025, 9:49pm

See the GitHub issue.

You are not seeing any progress just because the Steering Council hasn’t made a ruling - they were occupied with PyCon and probably have more time now to make a decision in the next week(s).

nhumrich · June 11, 2025, 12:29am

Thank you!

dkp · June 21, 2025, 12:14am

The SC has approved changes to:

Allow Template/Template concatenation (implicit and explicit)
Disallow Template/str concatenation (implicit and explicit)

This corresponds with the “security view” above. We’ll update the PEP and CPython implementation ASAP.
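As a rough sketch of what the approved rules look like in practice, assuming the behaviour that shipped in the final Python 3.14 `string.templatelib` module: the import is guarded so the snippet still loads on older interpreters, and `try_concat`, the query text, and `user_input` are made up for the example.

```python
try:
    from string.templatelib import Interpolation, Template
except ImportError:  # Python older than 3.14
    Interpolation = Template = None


def try_concat(left, right):
    """Hypothetical helper: return left + right, or the TypeError text if rejected."""
    try:
        return left + right
    except TypeError as exc:
        return f"TypeError: {exc}"


if Template is not None:
    query = Template("SELECT * FROM users WHERE name = ")
    user_input = "alice"  # invented value of unknown trust level

    # Template + Template is allowed. The caller states the trust level
    # explicitly by choosing how to wrap the string.
    as_untrusted = query + Template(Interpolation(user_input, "user_input"))
    as_trusted = query + Template(" LIMIT 1")

    # Template + str is rejected in either order.
    rejected = try_concat(query, user_input)
    rejected_reversed = try_concat(user_input, query)
    print(rejected)
```

The two wrapped forms correspond to the "untrusted" and "trusted" markings from the security view earlier in the thread.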

Cavalierex · June 22, 2025, 3:09pm

Would it be possible to force t"str" + "str" or t"str" + f"str" to effectively become t"str" + t"str"? An error saying that t"" + t"" is required would be fine too, but casting a regular string or f-string to a template string might be a smooth way to go, if it is possible to capture the f-string’s declaration before the interpreter handles its evaluation.

ericvsmith · June 22, 2025, 5:15pm

In the f-string case, that would be difficult to do, plus I think it would be surprising behavior with possibly different semantics from f-strings. Best to just disallow it: the fix is obvious enough.

Cavalierex · June 23, 2025, 12:02am

Thanks for the reply! I agree that an error message is the best way to go.

But in saying that a t"" + f"" might be cast to t"" + t"" and result in a t-string, I was considering how a float + int results in a float rather than producing an error message. And this functionality can be intuitive in Python, if not in other languages.

But yes, the clearest solution is to produce an error message.