How to do multi-line f strings?

Rosuav · February 26, 2024, 11:14pm

Oh yeah, definitely. I just don’t want to promise something that mightn’t work, given that I’m not sure which backend it is.

Case in point: psycopg2.paramstyle is pyformat, but it also happily supports format (using %s). It does NOT support any of the others. I’m not sure how, other than trial and error, we’re supposed to know which other formats are supported.

c-rob · February 28, 2024, 10:21am

I’m glad we’re having a discussion about this so I can learn more. While I’m learning the technical aspect of Python I have no idea of what the best practices are and this thread has taught me the best practices for writing and using SQL.

Thank you for helping me learn!

vernondcole · February 29, 2024, 11:13pm

Unfortunately, all of the suggestions for using parameterized queries (except SQLalchemy) suffer from the problem that PEP 249 permits four different mutually incompatible ways of formatting such a query. In order to write a generalized example, all four methods have to be taken into account.

In my humble opinion, there should be a v3.0 update to PEP 249 which permits the programmer to specify which of the four formats she is using, rather than the database adapter telling her which one it demands. It’s really not that hard to code such an adapter. The version in adodbapi, which has been in commercial operation for ten years now, has needed only one patch to date.

blhsing · March 1, 2024, 2:14am

Chuck R.:

The teacher has suggested this method for parameterizing stuff for long SQL statements.
# Use ? as placeholder for parameters 
sql = 'UPDATE products SET name=?, price=?, description=? WHERE id=?' 
# Pass the parameters as a tuple 
params = (name, price, description, product_id) 
cursor.execute(sql, params)
I think I mentioned a single SQL statement that I use in production can be 1000+ characters, or even 80-100 lines long with 2-3 fields per line. So I needed a way to make the SQL readable in the code for debugging purposes.

It sounds like you have a database backend that does not support named parameters, in which case you can create a helper function that converts an execute method that accepts only positional parameters into a wrapper function that accepts named parameters. Use a regex pattern that grabs the name after a colon but ignores quoted strings and comments:

import re
import sys

def to_positional(func, _pattern=re.compile(r"'(?:''|[^'])*'|--.*|:(\w+)")):
    def execute(operation, namespace=sys._getframe(1).f_locals):
        def replacer(match):
            if name := match[1]:
                params.append(namespace[name])
                return '?'
            return match[0]
        params = []
        return func(_pattern.sub(replacer, operation), params)
    return execute

so that you can call the wrapped execute method with a query with named parameters. The wrapper uses names and values from the current local namespace by default so it behaves more like an f-string:

sql = """\
UPDATE products
SET
    name=:name,
    price=:price,
    description=':description ''' || :description || ''''
WHERE id=:id; -- :comment
"""
id, name, price, description = 123, 'foo', 99.99, 'foo description'
to_positional(cursor.execute)(sql)

But you can also call the wrapper with your own namespace:

to_positional(cursor.execute)(sql,
    {'id': 123, 'name': 'foo', 'price': 99.99, 'description': 'foo description'})

To more easily test the transfromed query without involving the database, you can make it wrap print instead of cursor.execute:

to_positional(print)(sql)

so it would output:

UPDATE products
SET
    name=?,
    price=?,
    description=':description ''' || ? || ''''
WHERE id=?; -- :comment
 ['foo', 99.99, 'foo description', 123]

capymind · March 1, 2024, 3:48am

Hi, Although I am still python beginner…

How about this?

def update_product():
  """Updates product."""

  # not from the user input
  # from the trusted sources(developer?)
  trusted_condition = "AND specific_column < 10"

  sql = f"""
      UPDATE products
          SET   name          = :name
              , price         = :price
              , description   = :description
      WHERE 1=1
          AND id = :production_id
          {trusted_condition}
  """
  
  cursor.execute(
      sql,
      dict(
          name=name,
          price=price,
          description=description,
          production_id=production_id,
      ),
  )
  
  cursor.commit()

The point is that just using multi-line string(""") with f-format feels no problem for me.

CAM-Gerlach · March 1, 2024, 7:35am

Sorry, but I strongly suggest reading what others already wrote on this thread, most of which describing the very severe security problems with this very approach (which, in any case, was already proposed in the very first post). I understand you’re a beginner, but that means you should pay particularly close attention to what those with expertise take their volunteer time to try to explain to others here; and be particularly careful to do your research before giving advice to others, especially so when many other more experienced users already have.

cameron · March 1, 2024, 9:18am

Actually, @capymind’s latest post looks pretty good to me:

:name parameters for the supplied values, with a very nice accompanying dict
the only {} format insertion is {trusted_condition}, which is clearly
flat SQL with no weird embedded values

I think the OP’s taken the suggestions on board, and constructed a nice
parameterised SQL string which is very readable as a multiline format
string. The leading 1=1 condition is typical of constructed SQL
conditions where one might want to incremental add AND foo additional
constraints to a base string.

blhsing · March 1, 2024, 9:59am

I don’t think the OP has clarified whether their database backend supports named parameters at all. From the OP’s teacher’s suggested use of qmarks, I’m assuming not, hence the question to begin with.

capymind · March 1, 2024, 10:25am

Oh, sorry for careless wording about using f-format with raw sql string.

I read the sqlite3 doc saying about sql injection and requiring(at least recommending) placeholder approach which many others have said. I totally agree with them.

I have experiences in some cases that I need to adjust(add or remove where conditions etc.) raw sql statements based on other inputs. If other inputs are not related to user input or be restricted to very specific values (such as it should be a member of my Enums or in predefined set), I have used f-format. Other cases such as user input binding, I always have used placeholder approach. I am not sure I have to avoid any cases using f-format…

Anyway, what I said

The point is that just using multi-line string(“”") with f-format feels no problem for me.

is careless and misleading…

Thank you for sharing thoughts and enlightening me.

MFH · March 1, 2024, 12:05pm

You forgot the ‘f’ in front of the second string, so the second one is just a normal string and {variables} don’t get expanded/interpolated.
As others showed:

it does work with + if you prefix the 2nd string also with an ‘f’
it works without + : two (or more) strings, one “next” to the other, are concatenated
it works with a multi-line string (all lines inside one string started and ended with a
triple “”" or ‘’', prefixed by an ‘f’).

EDIT: sorry, I think I replied in the wrong place, I thought the question was about OP’s initial error. Might delete my reply as a whole, but maybe an additional explanation won’t harm…

CAM-Gerlach · March 2, 2024, 3:59am

Hey, no, I’m sorry, it turns out I was the one who didn’t read the code in your reply carefully—my bad! My mistake too for replying at 1:30 am when I was dead tired…big oops. Please accept my sincerest apologies!

And it sounds like you’ve been quite careful thinking through your code to try to ensure it is safe, to a level that is quite impressive indeed for a self-described beginner—I do worry, though, about the many users who are not as careful as you, and who (like myself) might skim your post and just see the immediate standout item—using multiline f-strings—and not pay attention to the key details you’ve included in the code comments about how to handle safe versus unsafe values, or might think a value is safe when it isn’t. Particularly in such scenarios, or to avoid going to such lengths, using parameters consistently (or an ORM) rather than undertaking the cognitive load of carefully determining which parameters are safe to inject directly and which aren’t. Though, if you are, LiteralStr with a type checker can help considerably in tracking that and avoiding mistakes.

capymind · March 2, 2024, 12:14pm

No problem!

Actually, I should have emphasize the danger of careless usage of f-strings with user inputs and say that if you want to use it, use it in very restricted ways with knowing what your code actually do at your own risk. I missed that point making readers who skim my snippet and words take wrong impressions that multiline(""") with f-strings is just okay (No!) and easy to use (Maybe). At least the code snippet and one line sentence explanation doesn’t match quite well. I think it is always better to explain in concise and completed ways, considering various situations and levels of readers. I’ll try to do that at next time!

Oh… I don’t know what LiteralStr is. If it is LiteralString (PEP 675 – Arbitrary Literal String Type | peps.python.org), it’s also new to me. Thank you I have learned something new. PEP 675 Motivation part actually talk about this situation…awesome (although I don’t use static type checkers because I have a few personal opinions mentioned in Reasons to avoid static type checking — typing documentation which can be changed at anytime and eventually use it…)

Your comments totally make sense.
Thanks a lot. I have learned some lessons from what you have said.

c-rob · March 6, 2024, 10:46am

I’m still pretty new with Python. Can someone explain what a backend database is that some people have mentioned in this thread? Is it components or modules on the Python side?

Because I’ll be likely connecting to an old Postgresql database (it’s a black box and we cannot upgrade it), and an MS SQL database (also a black box which cannot be upgraded).

These databases are associated with vendor software and the software is only tested with one version of the database. Upgrading the database would cause numerous problems, which is why the vendor just doesn’t update the db for any reason, an upgrade requires extensive testing as the software is really large with many components. In fact I think there are at least 80 tables in the Postgresql database alone.

kknechtel · March 6, 2024, 11:21am

“backend database” means a database that you use on the back end of your program - i.e., the part that stores the information behind the scenes, so you can present it to the user on the “front end” after doing whatever computation.

“database backend” means the program that you use to read this database - the one that you have to interface to from the Python code.

Rosuav · March 6, 2024, 11:36am

Both PostgreSQL and MS SQL are backend databases. They’re entirely separate from your Python app, and you communicate with them using the database connector modules (psycopg2 for Postgres, not sure what for MS SQL but there’ll be one).

Depending on HOW old that database engine is, that might not be a big deal. Currently PostgreSQL 16 is the latest, but you could be running something as old as PG 12 and it’ll generally be fine (albeit with fewer features - I’ve pushed for PG 16 because I specifically want one of its features). And then when you DO upgrade, you can jump a lot of versions all at once.

In any case, everything we’ve said in this thread applies to virtually every database engine and every version of them.

blhsing · March 7, 2024, 1:49am

Even the latest version of Postgresql still has no support for named parameters in its prepared SQL statement so you would be well served by my wrapper function. Postgresql does provide named parameters for a user-defined function since verison 9.2 so people have been using it as a workaround:

MS SQL on the other hand has supported named parameters for a long time and I don’t know how old your MS SQL database has to be to not have such a support, but since it sounds like your project needs to support Postgresql at the same time, it’s probably easier to use my wrapper function anyway for a consistent code base.

Rosuav · March 7, 2024, 3:58am

Yes, but that’s the job of drivers like psycopg2. It’s really not that difficult to switch from named to positional. For the person writing the application, this should be invisible - you can use named parameters just fine:

https://www.psycopg.org/docs/usage.html#query-parameters

And when I say it’s not difficult, I mean it. I wrote a PostgreSQL wire protocol connection library for another tool (not in Python) and the transformation from named parameters to positional is six lines of code, including all error handling. (It transforms the parameters into $1 $2 $3 etc which is what the PG “Bind” command will fill in.)

Perhaps you’re conflating different concepts here? You don’t have to use prepared statements to take advantage of the parse/bind sequence. Simply using psycopg2 in the most obvious and normal way WILL work, and will be safe.

blhsing · March 7, 2024, 9:18am

I think you wrote your reply without reading the wrapper function I was talking about, which does exactly what your reply says.

But thank you. I was indeed not aware of psycopg2 already capable of performing the same transformation.

c-rob · March 7, 2024, 10:37am

That’s what I thought but since I’m new to Python, I didn’t want to assume.

Rosuav · March 7, 2024, 12:02pm

Definitely possible, but IMO that’s the job of the driver, not a wrapper. I’m not familiar with any Python database drivers that don’t support named parameters in some form; certainly all the best and most popular ones do.