More OT: Even implicit concatenation is a footgun. IIRC while at Google we had code containing long lists of comma-separated string literals (e.g. a list of countries) and devs would occasionally update the list but forget to add the comma. For this reason Blaze’s Starlark doesn’t support implicit concatenation at all.
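For illustration, a minimal sketch of the missing-comma failure mode described above. The list contents are made up; the point is only that adjacent string literals are silently joined.

```python
# Hypothetical list: the comma after "Canada" was dropped during an edit.
COUNTRIES = [
    "Brazil",
    "Canada"      # <- missing comma, so this literal merges with the next one
    "Denmark",
    "Egypt",
]

print(len(COUNTRIES))   # 3, not the intended 4
print(COUNTRIES[1])     # CanadaDenmark
```

No error or warning is raised; the list is simply one element short.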
I don’t see how that’s any different here if a template can be concatenated at compile time with an f-string that might contain a user-provided value. That said, I also think it has always been inherently unsafe for web apps to pass unrestricted user input anywhere before basic safety checks, unless the function is documented as safe for arbitrary inputs.
Revisiting older conversations about Template concatenation, my read is that there are roughly three different perspectives. From most to least permissive:
1. The developer experience view. Reflected in the current PEP and implementation. A Template literal looks like any other (f-)string. Python developers expect string-like values to combine with other string-like values. Prohibiting explicit or implicit concatenation of Template and str would lead to needless surprise/frustration. (A variant holds that implicit concatenation can often be a footgun, so we’ll avoid extending it further.)
2. The security view. Templates are a language level feature to help prevent injection vulnerabilities. Template.strings is synonymous with “trusted”; Template.interpolations with “untrusted”. Concatenating a Template with an arbitrary str is unsafe because we can’t know the str’s trust level, so the operation should be disallowed. Developers can mark a string as trusted with Template(my_str) or untrusted with Template(Interpolation(my_str)). The current PEP treats all str as trusted by default; from the security view, a clear footgun.
3. The DSL view. Templates are building blocks for domain-specific languages. Interesting template processing code parses template content against some backing grammar. Some grammars allow concatenating conforming strings; others do not. As a result, even allowing Template + Template is probably a mistake. Code that processes a Template can instead return a domain type with __add__/__radd__ when concatenation is appropriate, or rely on other composition mechanisms.
I suppose these views aren’t so far apart: if you take a more restrictive view than what ships in 3.14, you can configure a lint rule. Likewise, if you take a more permissive view, you can write a tiny helper function. For what it’s worth, I personally align with bucket (2): I think it’s likely to strike the right balance for the 80% case and lead to the least overall developer confusion and the fewest grudgingly configured extra lint rules.
As the maintainer of MarkupSafe, I support your option 2, the security view.
MarkupSafe’s model is very similar. escape(str) always produces a processed string (t"str"), and Markup(str) produces a trusted string (Template(str)). Then all string operations on Markup escape their arguments.
- Unsafe: str + str
- Safe: escape(str)
- Safe: escape(str) + str or str + escape(str); the argument to __add__/__radd__ is automatically escaped
The downside is MarkupSafe needs to implement most dunder and str methods in order to catch all the ways strings can be manipulated, and there are non-trivial aspects to that. That’s sidestepped in t-strings because Template is not a string class.
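To make the model above concrete, here is a rough sketch of the escape-on-concatenation idea. It is not MarkupSafe's implementation: SafeText and to_safe are hypothetical names, and the standard library's html.escape stands in for the real escaping logic.

```python
from html import escape as html_escape


class SafeText(str):
    """Hypothetical stand-in for a trusted string; not MarkupSafe's Markup."""

    def __add__(self, other):
        if isinstance(other, str):
            # Escape the right-hand operand unless it is already SafeText.
            return SafeText(str.__add__(self, to_safe(other)))
        return NotImplemented

    def __radd__(self, other):
        if isinstance(other, str):
            # Escape the left-hand operand unless it is already SafeText.
            return SafeText(str.__add__(to_safe(other), self))
        return NotImplemented


def to_safe(value):
    """Return value unchanged if already trusted, else an escaped SafeText."""
    if isinstance(value, SafeText):
        return value
    return SafeText(html_escape(value))


user_input = "<script>"
page = SafeText("<p>") + user_input + SafeText("</p>")
print(page)  # <p>&lt;script&gt;</p>
```

This sketch only covers + in both directions. Inherited methods such as upper() or replace() would still return a plain str here, which is the "implement most dunder and str methods" burden mentioned above.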
Again, I agree with option 2. I think you should modify the implementation to disallow Template + string etc, but keep Template + Template because it doesn’t result in a loss of safe/unsafe information.
I am curious where this is going. It seems many people feel that implicit concatenation between Template and string is a footgun, and should be avoided. Is there a reason this isn’t being considered? How can I help progress this idea?
At least according to the PEP, security (preventing injection) seems to be the main motivator for PEP 750. So why is the security view #2 not the primary view?
The linting argument is somewhat valid, but I would argue that you could already get all the guarantees that PEP 750 provides by using types (with Literal) and linting. If PEP 750 only exists to allow linters to catch possible injection, I feel it has missed the mark.
See the GitHub issue.
You are not seeing any progress just because the Steering Council hasn’t made a ruling - they were occupied with PyCon and probably have more time now to make a decision in the next week(s).
Thank you!
The SC has approved changes to:
- Allow Template/Template concatenation (implicit and explicit)
- Disallow Template/str concatenation (implicit and explicit)
This corresponds to the “security view” above. We’ll update the PEP and CPython implementation ASAP.
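As a rough model of the approved rule, the sketch below uses a hypothetical MiniTemplate class. It is not the real Template from string.templatelib (which needs Python 3.14), and its parts layout is an assumption made purely for illustration.

```python
class MiniTemplate:
    """Hypothetical stand-in; NOT string.templatelib.Template."""

    def __init__(self, *parts):
        # Each part is ("static", text) for trusted literal text
        # or ("interp", value) for an untrusted interpolated value.
        self.parts = tuple(parts)

    def __add__(self, other):
        if isinstance(other, MiniTemplate):
            # Template + Template keeps every part's trusted/untrusted tag.
            return MiniTemplate(*self.parts, *other.parts)
        # Anything else, including str, is refused; Python raises TypeError.
        return NotImplemented


greeting = MiniTemplate(("static", "Hello, "))
name = MiniTemplate(("interp", "<user input>"))

combined = greeting + name  # allowed: both operands carry trust information

rejected = False
try:
    greeting + "plain str"  # disallowed: the str's trust level is unknown
except TypeError:
    rejected = True
```

Because no __radd__ is defined, "plain str" + greeting fails with TypeError as well, so the restriction holds in both directions.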
Would it be possible to force t"str" + "str" or t"str" + f"str" to become t"str" + t"str" effectively? An error that t"" + t"" is required would be fine too, but casting a regular string or f-string to template string might be a smooth way to go, if it is possible to capture the f-string’s declaration before the interpreter handles its evaluation.
In the f-string case, that would be difficult to do, plus I think it would be surprising behavior with possibly different semantics from f-strings. Best to just disallow it: the fix is obvious enough.
Thanks for the reply! I agree that an error message is the best way to go.
But in suggesting that t"" + f"" might be cast to t"" + t"" and result in a t-string, I was thinking of how float + int results in a float rather than producing an error. That kind of behavior can feel intuitive in Python, if not in other languages.
But yes, the clearest solution is to produce an error message.