Hello, it’s still Daniele here, the guy who has been making Python talk to Postgres since 2010.
The discussion "PEP 750: Please add Template.join()" was based on the idea that if Template + str and str + Template are deemed safe operations, then an eventual t"".join() would be safe to implement on top of them.
It turns out that Template + str is an insecure operation, because it lets any unsafe input be promoted to trusted template text:
>>> evil = "<evil>"
>>> t"<good>" + evil
Template(strings=('<good><evil>',), interpolations=())
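To show why this matters to a consumer of templates, here is a minimal sketch. It does not use the real Template type: MiniTemplate and render_html are hypothetical names, and the __add__ below only imitates the behaviour shown in the session above. The renderer follows the usual convention of trusting static strings and escaping interpolated values.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class MiniTemplate:
    """Hypothetical stand-in for a t-string: static strings plus interpolated values."""
    strings: tuple  # always one more item than values
    values: tuple = ()

    def __add__(self, other):
        if isinstance(other, str):
            # Imitates the behaviour shown above: the str is merged into the
            # last static string, that is, into the trusted part.
            merged = self.strings[:-1] + (self.strings[-1] + other,)
            return MiniTemplate(merged, self.values)
        return NotImplemented


def render_html(template):
    """Trust the static strings, escape every interpolated value."""
    parts = [template.strings[0]]
    for value, static in zip(template.values, template.strings[1:]):
        parts.append(escape(str(value)))
        parts.append(static)
    return "".join(parts)


evil = "<evil>"

# Roughly t"<good>{evil}": the value is an interpolation, so it gets escaped.
interpolated = MiniTemplate(("<good>", ""), (evil,))
# Roughly t"<good>" + evil: the value lands in the static strings.
concatenated = MiniTemplate(("<good>",)) + evil

print(render_html(interpolated))  # <good>&lt;evil&gt;
print(render_html(concatenated))  # <good><evil>
```

After the concatenation the renderer has no way to tell that "<evil>" did not come from a literal, so it cannot escape it.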
What was the rationale for allowing Template.__add__() to accept a string? Is there any reference?
Can this footgun be defused before releasing the feature in Python 3.14?
Contrary to what was proposed in the other thread, it doesn’t seem to be related to the implicit concatenation of string and t-string literals: dis shows that this is handled by the parser and doesn’t seem to need __add__:
>>> import dis
>>> dis.dis("""
... template = t"foo {name} bar" "baz"
... """)
  0           RESUME                   0
  2           LOAD_CONST               3 (('foo ', ' barbaz'))
              LOAD_NAME                0 (name)
              LOAD_CONST               1 ('name')
              BUILD_INTERPOLATION      2
              BUILD_TUPLE              1
              BUILD_TEMPLATE
              STORE_NAME               1 (template)
              LOAD_CONST               2 (None)
              RETURN_VALUE
I am just trying to figure out the safety implications of t-strings before building a database driver on top of them. If a user takes a field name from unsafe input and wants to build a query:
field = input()
currently they have to wrap it in a sql.Identifier object before being able to merge it into a query:
query = SQL("SELECT {field} FROM table WHERE id = %s").format(field=Identifier(field))
cur.execute(query, [id,])
If anyone forgot the Identifier() wrapper and passed field=field, the field name would be inserted as a harmless, well-escaped string literal: SELECT 'field_name' FROM table...
With t-strings the same operation could be, for example:
cur.execute(t"SELECT {field:i} FROM table WHERE id = {id}")
This is safe for the same reason as above: if anyone forgot the :i “identifier” format, the field name would be passed as a literal.
But someone can all too easily write:
cur.execute(t"SELECT " + field + t" FROM table WHERE id = {id}")
which is an open door to injection. This is no worse than just using strings to compose queries, which we have actively discouraged for years, but it is definitely no better, and it is definitely worse than the safety we currently have in place thanks to the psycopg.sql objects.
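To make the three cases concrete, here is a rough sketch of how a driver might turn a template into a query. This is not how psycopg does or will implement it: Interp, quote_identifier and to_query are hypothetical names, a plain list stands in for the t-string, and real identifier escaping would go through the database client library rather than a one-line helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interp:
    """Hypothetical stand-in for one t-string interpolation."""
    value: object
    format_spec: str = ""


def quote_identifier(name):
    # Simplified SQL identifier quoting: wrap in double quotes and
    # double any embedded double quote.
    return '"' + name.replace('"', '""') + '"'


def to_query(parts):
    """Turn a sequence of str (trusted text) and Interp items into (sql, params)."""
    sql, params = [], []
    for part in parts:
        if isinstance(part, str):
            sql.append(part)  # static text: trusted verbatim
        elif part.format_spec == "i":
            sql.append(quote_identifier(part.value))
        else:
            sql.append("%s")  # sent separately, never merged into the SQL
            params.append(part.value)
    return "".join(sql), params


field = "1; DROP TABLE table --"

# Roughly t"SELECT {field:i} FROM table WHERE id = {id}"
with_spec = to_query(["SELECT ", Interp(field, "i"), " FROM table WHERE id = ", Interp(42)])
# Same, but the :i was forgotten: the field travels as a plain parameter.
no_spec = to_query(["SELECT ", Interp(field), " FROM table WHERE id = ", Interp(42)])
# Roughly t"SELECT " + field + t" FROM table WHERE id = {id}"
concatenated = to_query(["SELECT " + field + " FROM table WHERE id = ", Interp(42)])

print(with_spec[0])     # SELECT "1; DROP TABLE table --" FROM table WHERE id = %s
print(no_spec[0])       # SELECT %s FROM table WHERE id = %s
print(concatenated[0])  # SELECT 1; DROP TABLE table -- FROM table WHERE id = %s
```

In the first two cases the driver still sees the field as an interpolation and can quote it or send it as a parameter. In the third it arrives as static text, indistinguishable from what the programmer typed.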
I am afraid this veers towards being a deal breaker.