Sqlite3: consider deprecating combining named placeholders with sequence of params

erlendaasland · January 4, 2023, 11:40pm

Currently, sqlite3 disallows using nameless (aka qmark) placeholders with params supplied as dicts:

>>> import sqlite3
>>> cx = sqlite3.connect(":memory:")
>>> cx.execute("select ?", {"a": 42})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
sqlite3.ProgrammingError: Binding 1 has no name, but you supplied a dictionary (which has only names).

OTOH, named placeholders combined with sequences are, unfortunately, silently allowed:

>>> cx.execute("select :name", [42]).fetchall()
[(42,)]

This is possible, because when you supply a sequence, sqlite3 simply maps the items in the sequence onto the placeholders by index.

Now, let’s introduce a common source of confusion: numeric placeholders. PEP-249 uses the following format for numeric placeholders: :<number>. SQLite, OTOH, uses the following format: ?<number>. What happens if you pass PEP-249 numeric placeholders to the SQLite library? The placeholders are interpreted by SQLite as named placeholders:

>>> cx.execute("select :1", {"1": 42}).fetchall()
[(42,)]

Unfortunately, if one misses the fact that these are named, and not numeric, placeholders, bugs may sneak in:

>>> cx.execute("select :1", ("first",)).fetchall()
[('first',)]
>>> cx.execute("select :1, :2", ("first", "second")).fetchall()
[('first', 'second')]
>>> cx.execute("select :2, :1", ("first", "second")).fetchall()  # Unexpected result follows
[('first', 'second')]

IMO, it would better if sqlite3 disallowed combining named placeholders with sequences of params, similar to how it disallows nameless placeholders and dicts. Of course, we’d have to introduce a deprecation warning, wait a couple of releases, and only then start raising an exception.

Now, this might break some code out there (Hyrum’s Law), but there is value in preventing new bugs to sneak into people’s code bases.

Thoughts?

gpshead · January 5, 2023, 9:04pm

Overall I agree, alleviating this API confusion and conflation of concepts in both directions would be ideal and could be accomplished through a deprecation period.

Q: How do other database frontends in Python fare?

PEP 249 – Python Database API Specification v2.0 | peps.python.org doesn’t really specify what constitutes a :name which I guess means that implementation(s) probably treat name as any sequence of non-space non-colon characters? yuck.

Alternatively to accomplish a lesser disambiguation if your original larger idea doesn’t pan out: Issue a DeprecationWarning when a numeric name is detected for a couple of releases before making a numeric :name an error. I’m assuming it will be rare for people to intentionally use numbers as names and that forcing the issue may expose problematic code.

erlendaasland · January 6, 2023, 10:35am

I haven’t investigated that yet, but I’ll have a look at the most popular ones. I’ll also hear with the SQLAlchemy folks; I assume they have a pretty good picture of this.

For sqlite3, we don’t parse the placeholder; the entire query string is just passed onto SQLite, so a name is what the underlying SQLite library says is a name.

I don’t plan to detect PEP 249 style numeric placeholders specifically; I want to (in the end) raise an error if any named placeholder is used together with parameters supplied as a sequence.

+1

steve.dower · January 6, 2023, 12:03pm

Does this require us to start parsing the SQL statement in order to provide the validation? If we aren’t already (reliably) parsing the statement, I’d rather not start doing it.

We’d have to be very cautious to not reject valid statements (including ones that haven’t been invented yet… not that SQL is known for moving quickly), or to false-positive on comments or string literals. Such a parser could become complex very quickly.

Perhaps we should just document the inconsistency between SQLite and PEP 249?

Erlend E. Aasland:

Currently, sqlite3 disallows using nameless (aka qmark) placeholders with params supplied as dicts:
>>> import sqlite3
>>> cx = sqlite3.connect(":memory:")
>>> cx.execute("select ?", {a: 42})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'a' is not defined

This is a NameError because you didn’t quote 'a' in the dict definition. The result actually reads:

>>> import sqlite3
>>> cx = sqlite3.connect(":memory:")
>>> cx.execute("select ?", {"a": 42})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
sqlite3.ProgrammingError: Binding 1 has no name, but you supplied a dictionary (which has only names).

This error is clearly in response to sqlite3_bind_parameter_name failing after sqlite3_bind_parameter_count succeeded, which means SQLite has determined the parameters. This is fine.

Unfortunately, I don’t see any bind_* functions from SQLite that may help determine if parameters were named before you set them by index. The best option seems to be to request the index of parameter '1' and if it’s found, warn if passed a sequence rather than a dict.

I don’t see any reason to ever promote this to an error. If linters reckon they can detect it reliably enough, I’d leave it to them, but I suspect most aren’t going to flow around enough context to handle anything but trivial cases.

malemburg · January 6, 2023, 12:38pm

There are plans to simplify the binding parameter styles for DB-API 3 and it’s very likely that we’ll only continue to support named (dictionary parameters) and qmark (sequence parameters) going forward, perhaps even requiring support for both (but this still has to be discussed).

A consequence of this would be to have database modules raise an error in case they get passed in a sequence for named style or a dictionary for qmark style.

Note that parsing for these two styles is not hard. You don’t need a full SQL parser, just have to pay attention to standard SQL string literal quoting.

erlendaasland · January 6, 2023, 1:26pm

No, it does not; we can use the SQLite C API to easily detect if a given placeholder is nameless or named. I’ll go to great lengths to avoid having to parse SQL in the sqlite3 module

Thanks! I’ve updated the OP.

Yes, we can use sqlite3_bind_parameter_name for this. It would work roughly like this: if a sequence of params was passed and sqlite3_bind_parameter_name returns non-NULL for any of the params, we issue a warning.

I would be fine with sqlite3 simply issuing a warning.

storchaka · January 6, 2023, 7:19pm

It clearly looks like a bug to me. I think that a deprecation warning is not necessary, we can just raise an exception (although it may be not worth to backport this change). If you want to be a super-nice and emit a deprecation warning, it is good too.

erlendaasland · January 6, 2023, 7:25pm

I want to be super-nice.

erlendaasland · February 9, 2023, 7:53am

I created an issue and a PR.