More OT: Even implicit concatenation is a footgun. IIRC while at Google we had code containing long lists of comma-separated string literals (e.g. a list of countries) and devs would occasionally update the list but forget to add the comma. For this reason Blaze’s Starlark doesn’t support implicit concatenation at all.
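For anyone who hasn't been bitten by this, here is a minimal sketch of the missing-comma problem described above. The list contents are made up for illustration.

```python
# A list of string literals where one trailing comma was forgotten.
countries = [
    "Canada",
    "Mexico"  # <- missing comma: implicitly concatenated with the next literal
    "United States",
]

print(len(countries))  # 2, not 3
print(countries[1])    # MexicoUnited States
```

No error is raised at any point, which is what makes the mistake easy to miss in review.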
I don’t see how that’s any different here if it can be concatenated at compile time with an f-string that might contain a user-provided value. That said, I also think it has always been inherently unsafe for webapps to pass unrestricted user input anywhere before basic safety checks, unless the function is documented to be safe for arbitrary inputs.
Revisiting older conversations about `Template` concatenation, my read is that there are roughly three different perspectives. From most to least permissive:
- The developer experience view. Reflected in the current PEP and implementation. A `Template` literal looks like any other (f-)string. Python developers expect string-like values to combine with other string-like values. Prohibiting explicit or implicit concatenation of `Template` and `str` would lead to needless surprise/frustration. (A variant holds that implicit concatenation can often be a footgun, so we’ll avoid extending it further.)
- The security view. Templates are a language level feature to help prevent injection vulnerabilities. `Template.strings` is synonymous with “trusted”; `Template.interpolations` with “untrusted”. Concatenating a `Template` with an arbitrary `str` is unsafe because we can’t know the `str`’s trust level, so the operation should be disallowed. Developers can mark a string as trusted with `Template(my_str)` or untrusted with `Template(Interpolation(my_str))`. The current PEP treats all `str` as trusted by default; from the security view, a clear footgun.
- The DSL view. Templates are building blocks for domain-specific languages. Interesting template processing code parses template content against some backing grammar. Some grammars allow concatenating conforming strings; others do not. As a result, even allowing `Template + Template` is probably a mistake. Code that processes a `Template` can instead return a domain type with `__add__`/`__radd__` when concatenation is appropriate, or rely on other composition mechanisms.
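To make the security view concrete, here is a rough sketch of what "disallow `Template + str`, require explicit marking" could look like. It deliberately uses hypothetical stand-in classes (`MiniTemplate`, `MiniInterpolation`) and helpers (`trusted`, `untrusted`, `render_html`) rather than the real `string.templatelib` types, so it runs on older Pythons; none of these names come from the PEP.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class MiniInterpolation:
    """Hypothetical stand-in for an untrusted, interpolated value."""
    value: object


@dataclass(frozen=True)
class MiniTemplate:
    """Hypothetical stand-in: an ordered mix of trusted str parts and
    untrusted MiniInterpolation parts."""
    parts: tuple = ()

    def __add__(self, other):
        # Security view: only template + template is allowed, because both
        # sides already record which parts are trusted and which are not.
        if isinstance(other, MiniTemplate):
            return MiniTemplate(self.parts + other.parts)
        return NotImplemented  # template + str ends up as a TypeError


def trusted(text: str) -> MiniTemplate:
    """Explicitly mark a plain string as trusted (like Template(my_str))."""
    return MiniTemplate((text,))


def untrusted(text: str) -> MiniTemplate:
    """Explicitly mark a plain string as untrusted
    (like Template(Interpolation(my_str)))."""
    return MiniTemplate((MiniInterpolation(text),))


def render_html(template: MiniTemplate) -> str:
    """Escape only the untrusted parts; pass trusted parts through."""
    return "".join(
        escape(str(part.value)) if isinstance(part, MiniInterpolation) else part
        for part in template.parts
    )


user_input = "<script>alert(1)</script>"
page = trusted("<p>") + untrusted(user_input) + trusted("</p>")
print(render_html(page))  # <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>

try:
    trusted("<p>") + user_input  # ambiguous trust level
except TypeError:
    print("template + str is rejected")
```

Under this model the permissive behaviour is still one small helper away (`t + trusted(s)`), but the developer has to say which trust level they mean.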
I suppose these views aren’t so far apart: if you take a more restrictive view than what ships in 3.14, you can configure a lint rule. Likewise, if you take a more permissive view, you can write a tiny helper function. For what it’s worth, I personally align with bucket (2): I think it’s likely to strike the right balance for the 80% case and lead to the least overall dev confusion and the fewest grudgingly configured extra lint rules.
As the maintainer of MarkupSafe, I support your option 2, the security view.
MarkupSafe’s model is very similar. `escape(str)` always produces a processed string (`t"str"`), and `Markup(str)` produces a trusted string (`Template(str)`). Then all string operations on `Markup` escape their arguments.
- Unsafe: `str + str`
- Safe: `escape(str)`
- Safe: `escape(str) + str` or `str + escape(str)`, argument to `__add__`/`__radd__` is automatically escaped

The downside is MarkupSafe needs to implement most dunder and str methods in order to catch all the ways strings can be manipulated, and there are non-trivial aspects to that. That’s sidestepped in t-strings because `Template` is not a string class.
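A toy version of that model, to show the shape of it. This is not MarkupSafe's actual implementation: `SafeText` is a hypothetical stand-in for `Markup`, only `__add__`/`__radd__` are covered, and escaping is delegated to the standard library's `html.escape`.

```python
from html import escape as html_escape


class SafeText(str):
    """Toy stand-in for a trusted-markup string class (hypothetical name)."""

    def __add__(self, other):
        if isinstance(other, str):
            # The right-hand argument is escaped unless it is already SafeText.
            return SafeText(str.__add__(self, escape(other)))
        return NotImplemented

    def __radd__(self, other):
        if isinstance(other, str):
            # The left-hand argument is escaped the same way.
            return SafeText(str.__add__(escape(other), self))
        return NotImplemented


def escape(value) -> SafeText:
    """Return value as SafeText, escaping it unless it is already SafeText."""
    if isinstance(value, SafeText):
        return value
    return SafeText(html_escape(str(value)))


name = "<em>World</em>"

print(SafeText("<b>Hello</b> ") + name)  # <b>Hello</b> &lt;em&gt;World&lt;/em&gt;
print("<i>" + escape("a & b"))           # &lt;i&gt;a &amp; b
```

Every other way of combining strings (`join`, `%`, `format`, slicing, and so on) would need the same treatment in a `str` subclass, which is the maintenance cost mentioned above.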
Again, I agree with option 2. I think you should modify the implementation to disallow `Template + string` etc., but keep `Template + Template` because it doesn’t result in a loss of safe/unsafe information.
I am curious where this is going. It seems many people feel that implicit concatenation between `Template` and `string` is a footgun, and should be avoided. Is there a reason this isn’t being considered? How can I help progress this idea?
At least according to the PEP, security (preventing injection) seems to be the main motivator for PEP 750. So why is the security view #2 not the primary view?
The linting argument is somewhat valid, but I would argue that you could already get all the guarantees that PEP 750 provides by using types (with `Literal`) and linting. If PEP 750 only exists to allow linters to catch possible injection, I feel it has missed the mark.
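For reference, a sketch of the types-plus-linting approach, assuming "types (with `Literal`)" means something like `typing.LiteralString` from PEP 675. The `run_query` function is hypothetical, and the check happens only in a static type checker, not at runtime.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # LiteralString (PEP 675) lives in typing from Python 3.11 onwards;
    # importing it only for type checking keeps this runnable on older versions.
    from typing import LiteralString


def run_query(sql: "LiteralString", *params: object) -> tuple:
    """Hypothetical query helper: returns what would be sent to the database."""
    return (sql, params)


user_id = "42; DROP TABLE users"

# Accepted by a type checker: the SQL text is a literal, the value is a parameter.
run_query("SELECT * FROM users WHERE id = ?", user_id)

# A type checker that supports LiteralString should flag the call below,
# because the f-string mixes in a non-literal str. Python itself would run it.
# run_query(f"SELECT * FROM users WHERE id = {user_id}")
```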
See the github issue.
You are not seeing any progress just because the Steering Council hasn’t made a ruling - they were occupied with PyCon and probably have more time now to make a decision in the next week(s).
Thank you!