PEP 750: Please add Template.join()

Hello, I am the psycopg maintainer. :waving_hand:

Template strings look the most exciting thing happening since DBAPI. I have seen they have been just merged in Python 3.14, thank you very much!

I have just started to add support in psycopg 3, and it looks pretty good! Please see here the current unit tests for an idea of how it looks:

    vstr = "hello"
    cur = await aconn.execute(t"select {vstr}")
    assert await cur.fetchone() == ("hello",)

So far so good. Often times people have a list of fields and values to work on, and it’s easy to compose them into a query. Before that, we would have used the psycopg.sql objects. Now it.s easy… almost?

    fields = t""
    for i, name in enumerate(("foo", "bar", "baz")):
        if i:
            fields += ", "
        fields += t"{i} as {name:i}"

    cur = await aconn.execute(t"select {fields}")
    assert await cur.fetchone() == (0, 1, 2)
    assert cur.description[0].name == "foo"

This is not a new pattern: it is exactly str.join(), but taking a sequence of Templates or strings and returning a Template. It comes over and over: join a list of fields by commas, join a list of conditions by AND, join a list of snippets by empty string…

I have big expectations that the end user will be able to express the above with a much more succint:

    fields = t','.join(
        t"{i} as {name:i}" for i, name in enumerate(("foo", "bar", "baz"))
    )
    cur = await aconn.execute(t"select {fields}")

However Template.join() is left out of the interface. I find this bizarre: Template already supports the + operator and joining a sequence is pretty much a generalization of the concatenation.

The above pattern is so useful that, if not available, I will stretch the existing sql.SQL object to allow to take templates as join() input:

    fields = sql.SQL(',').join(
        t"{i} as {name:i}" for i, name in enumerate(("foo", "bar", "baz"))
    )

This is a worst option for the end users because:

  • with string templates, the need of using psycopg.sql object pretty much disappears
  • a Template as a value in psycopg.SQL object is only handled as special case in .execute(Template), using it as execute(Composable) would try to adapt template to Postgres syntax and fail.

Therefore, I don’t ask it for myself, because the problem is solved on my part, but I ask it for the end users of whoever will use template strings: please considering adding Template.join() before feature freeze.

Thank you very much for your amazing work :smiling_face:

10 Likes

Please don’t just repost discussions where you didn’t like the result without even a link to the previous iteration:

You have gotten different suggestion as to what to support instead; Why are they not an option?

1 Like
  • Because I have been told that PEP was not the right audience
  • Because as of yesterday I have actually used the feature and I am reporting results
  • Because your answer was wrong, I am not sure you read the problem (a separator is needed)
  • Because using the format as separator is wrong: we need the format for the format (binary, text, identifier) not to specify the joiner
  • Because x for f in fields_list for x in (*f, ', ')][:-1] is not simple to understand by any definition.

In practice, I am reposing the question to an audience who might see the problem.

7 Likes

That’s rather unfriendly. The result of the prior discussion was a request to post to Ideas:

(Process note aside: the PEPs category is for posting PEPs, not discussing them after they’re complete. Use Core Development, Help or Ideas once a PEP has been accepted.)

6 Likes

The thought process that led to the lack of both join() and split()->list[str] isn’t very obvious to me, either. Especially in the light of the PEP 787 idea to make subprocess interpret t-strings … wouldn’t join provide a Template.split() method provide the equivalent functionality?

2 Likes

Yes; All this means is that the post should have been moved, there wasn’t a need to post a new discussion.

I have read the problem. You appear to not be open to other, even simpler (for users), solutions.

And then someone else posted an extension of my idea to which you don’t appear to have even considered.

Noone, at any point, has suggested this as an alternative. Edit: It was suggested, I just overlooked it. I agree that this is not a sensible alternative for end users.

No, this is an excellent example: Template.split is incapable of doing to the correct thing for shell-like processing and would just be a dangerous distraction that will lead to surprises. It shouldn’t be provided so that authors aren’t tempted by it.

I agree that the argument against Template.join is weaker - but I still think most of the time it’s more useful for the library to support sequences of templates directly and that actual usecases for Template.join (where it’s clearly the best solution) are very rare. (I am still convinced what is present in OP isn’t one of those rare cases - supporting t"select {fields}" seems way nicer from a user perspective.)

It was suggested here:

1 Like

I am slightly confused why the need for a separator prevents processing lists.

PS: I am using psycopg, and I thank you for your work :slight_smile:

Is the problem that psycopg can’t know whether commas or "AND" is needed as a separator?

In that case, could you process templates styled as t"select {[t'{foo:i}', t'{bar:i}', t'{baz:i}]:AND}" ?

2 Likes

I do want to add, supporting .join() would look like a good feature to me.

If ",".join([t"{x}", t"{y}"]) doesn’t work, that’s definitely going to trip me up at some point. And if that does work but t",{z},".join([t"{x}", t"{y}"]) doesn’t, that’s going to cause someone a lot of stupid trouble.

I don’t see how it would have disadvantages either.

OTOH, supporting the .split() method would be a nasty footgun IMO. I have several intuitions about how that would work when the token you split on exists inside a t-string, and I’m pretty sure some of them are wrong. What would happen to t"{x=}".split("=") for example?


If you support .join() but not .split(), it might be slightly unintuitive that for a list including a t-string splitter.join(my_list).split(splitter) == my_list throws an error instead of returning False. But it already raises an error if an int is in the list, so it’s not a big deal to me.

2 Likes

split() should probably be a separate Idea forum post, I think. (which is why I deleted my prior post).

Yes, that’s correct. Joiner strings are likely different in different contexts:

fields = t", ".join(list_of_fields)
conditions = t" AND ".join(list_of_conditions)
cur.execute(t"select {fields} from mytable where {contditions}")

This was proposed, yes, but it seems an abuse of Interpolation.format_spec. One need we have is to specify the parameter format - akin to using %s, %b, %t in psycopg. In the first experiment with template strings I am using:

  • :s or no format: bind the parameter server-side use the best format for the type, text or binary (similar to %s).
  • :b: bind the parameter server-side, send it in binary format
  • :t: bind the parameter server-side, send it in text format
  • :i: bind the parameter client-side with identifier escaping rules (similar to sql.Identifier, e.g. "foo-bar")
  • :l: bind the parameter client-side using SQL literal rules (similar to sql.Literal, or similar to using psycopg2, where all binding was client-side. e.g. 10, or 'foo''bar' or '2025-01-01'::date)

It would seems strange to me, in the same “laguage” (template strings to generate SQL), to have in certain places symbolic format specs and in other literal string to operate on. What tells the two apart? The fact that the value is a list? Psycopg can adapt lists too as Postgres arrays, so we should single out lists of templates and special-case their processing. And what if a list is empty? Is it an empty list of conditions to render client-side as an empty string or an empty array value to pass to the database as '{}'? It’s pretty brittle.

In this proposal, AND or , are the parameter of an operation (template joining), and the thing that I find unsatisfactory is that the operation is implicit. This burns all the bridges to be able to perform any other operations. If we want to be explicit, we should put the operation too in the format spec, so it might be:

t"{filters:join(' AND ')}"

but now I’m starting to need to write a Python parser to parse that, or to come out with an entirely new language… Yes, I can parse:

t"{filters:join~ AND )}"

or whatever else but that doesn’t make it a good idea :slight_smile:

Not to mention that, unlike in the value, I can’t reuse the same quotes of the t-string…

>>> t"{[]:join(" AND ")}"
  File "<python-input-1>", line 1
    t"{[]:join(" AND ")}"
          ^^^^^
SyntaxError: t-string: expecting '}', or format specs.

Therefore my preference is to describe the operation “join these snippets using this separator” as an external operation rather than cramming more than symbolic info in the format_spec.

4 Likes

Even if Template.join is available, I still think you should support passing in a list of templates in these places - because that is the more user friendly option. It strictly reduces the burden on users to be careful around the potentially dangerous topic of string interpolation which is exactly the goal of PEP 750.

I personally would not suggest having just the literal in the specifier - use some more syntax to communicate the operation, but don’t use the word join or try to parse string syntax.

Here are a few ideas you could use:

  • t"{filters:!AND}", t"{filters:! AND }" (! means join, the parameter format would be added between : and !.
  • same idea with a variety of different symbols like ?, /, +, -, &[1], %, $
  • `t"{filters:[AND]}". Similar, but with explicit parentheses action as a reminder that a list is expected and as delimiter on both sides in case you want to expand this language even more.

Are you not doing any syntax analysis? Because if yes, surely you could figure out which options are valid? Or alternatively, “Refuse the temptation to guess” and raise an error - I find it unlikely that this is going to come up often in user code.


  1. I think this is my personal favorite ↩︎

I am afraid this would serve no practical purpose. If SQL supported a dangling comma then [t"{field:i}," for field in fields] might have a marginal utility, but here we are dealing with a 1970 language (:man_dancing:). Joining a sequence of templates by a blank joiner is a more marginal use case than the one I’ve regularly encountered composing dynamic SQL. So much that we can refuse the temptation to guess and reject it as an error (the library will go “the user wants to adapt a list of object. These objects are templates. I don’t know how to adapt a template as a database value. Throw an exception”).

It still feels like an abuse, but if I had to paint my own bikeshed I’d go for + because that’s what used in Python for concatenation of strings and templates.

Of course not. First: here we are still trying to construct a statement, so whatever we have is not valid yet, until after having merged all the parts. But, fundamentally, syntactic analysis is the role of the server accepting or discarding the request. The server adds new statements at every new version; Postgres-compatible databases have their own syntax. The parser would need to be kept in lockstep with the server parser and precisely identify vendor and version, or it would easily reject valid statements.

2 Likes

Well, I guess we just disagree here.

  • I still think that Template.join is a footgun that will either lead to surprise errors or confusing situations where it doesn’t work or does the wrong thing. (It would have to reject all non-template strings as argument to be anywhere near safe - which will surely lead to complains about it not working correctly. Or even worse, people blindly casting stuff to Template with map(Template, iterable_of_string) which completely defeats the purpose.)
  • I still think that you, as the consumer of the templates, should do the heavy lifting
  • I don’t think using a format specifier string to … specify the formatting of the input is abuse, but you haven’t really argued why you think this and I don’t think we are going to agree on this anyway.

I am ok with polite disagreement :slight_smile:

Concatenation is an operation based on a more fundamental operation, which is well-defined: Template.__add__ or __radd__. Whatever is not accepted of the RHS of t"" + should throw an exception.

Speaking of which, this is not currently doing the right job:

>>> acc = t""
>>> acc += 1
>>> acc
NotImplemented

but maybe I’m using an old version of the code. I am testing with the koxudaxi/python:3.14-rc-bullseye image, I’ll wait for 3.14a8/b1 to test more.

I expressed the fact that we need to specify an operation + arguments, so that’s two objects to parse, and I will add that expressing a string into a string requires an extra layer of escaping.

Again, not a problem about it, but I’d like to hear about other opinions. My instinct says that, as soon as people start using template strings, if Template.join() is not available in 3.14 it will be vocally requested in 3.15.

2 Likes

But these strings being complex is pretty normal; Look at the “default” formatting spec for ints and floats: string — Common string operations — Python 3.13.3 documentation. The most unusual requirement you have is that it needs to support an arbitrary string instead of just an arbitrary single char. But this is a solvable problem. (I also don’t think you need to support escaping, but I might be wrong. AFAICT most joiners are going to be very simple strings)

I don’t think minilang-ifying the format specifier is the answer here, and a bare “AND” specifier doesn’t look quite right either. I think both of these violate the principle of least surprise - they feel dirty. To me, a simple utility function seems infinitely better than any variation of the format specifier that’s been proposed so far:

cur.execute(t"select {psycopg.join(fields)} from mytable where {psycopg.join(conditions, "AND")}")

I believe this is how Prisma does it as well in JS via GitHub - blakeembrey/sql-template-tag: ES2015 tagged template string for preparing SQL statements, works with `pg`, `mysql`, `sqlite` and `oracledb`.

2 Likes

Could you explain why

x = t", {y}, 3,".join(["4", t"{z}", t"{w}", "5"])

would be unsafe?

Or for that matter why allowing

x = ", ".join(["4", t"{z}", t"{w}", "5"])
x == t"4, {z}, {w}, 5"

would be unsafe?

1 Like

How would you allow this without also allowing

x  = t", ".join(["4", z, w, "5"])

?

And if you don’t see how that is unsafe, you missed the fundamental ideas behind PEP 750. (assume that z and w are untrusted user input strings)

Pardon me, but what could be “simpler” than an interface which’s been in use for decades? (str.join) ?

3 Likes