I commented earlier about a relatively specific issue with this proposal, but in truth I had a bad gut feeling about the proposal in general and couldn’t put my finger on why.
However, while I was replying to a different t-string-related thread, I figured it out.
I originally posted in the PEP 750 thread:
The syntax should be foo(t'bar'), not foo'bar'.
and this led on to a discussion about using “the function that a t-string is passed to” as a language tag, which editors with appropriate support could use to determine the language that the contents of the t-string are supposed to be parsed as; here, foo would be the language and bar the source in that language.
Therefore
subprocess.run(t"some_command {message_from_user}")
would actually be far better written as something like
subprocess.run(shlex.Command(t"some_command {message_from_user}"))
i.e.
- subprocess.run would accept either a shlex.Command object, or a str or sequence-of-str for backward compatibility, but would not accept a t-string itself
  - (bikeshed the spelling of shlex.Command if you like, I’m not precious about that)
- the constructor of shlex.Command would accept a t-string
- the constructor of shlex.Command would be typed/annotated to declare that its argument will be parsed like a POSIX shell command line
- editors with appropriate support would then be (theoretically) able to syntax-highlight the contents of the t-string in the appropriate way.
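To make the split in the list above concrete, here is a rough sketch. Everything in it is hypothetical: Command stands in for the proposed shlex.Command, check_run_arg stands in for the type check that subprocess.run might apply, and Interp is a minimal stand-in for a t-string interpolation so that the sketch also runs on Pythons without t-string literals. With real t-strings the list of parts would be replaced by the template itself. Quoting interpolated values with shlex.quote is my assumption about what such a constructor would do, not something the proposal specifies.

```python
import shlex
from dataclasses import dataclass
from typing import Iterable, Union


@dataclass(frozen=True)
class Interp:
    """Stand-in for a t-string interpolation: carries only the value."""
    value: object


class Command:
    """Hypothetical shlex.Command: parses template parts into an argv list."""

    def __init__(self, parts: Iterable[Union[str, Interp]]):
        # Static text is treated as shell source; each interpolated value
        # is quoted so that it stays a single word after splitting.
        source = "".join(
            p if isinstance(p, str) else shlex.quote(str(p.value))
            for p in parts
        )
        self.argv = shlex.split(source)  # parsing only, nothing is executed


def check_run_arg(cmd):
    """Type gate that a hypothetical subprocess.run might apply."""
    if isinstance(cmd, Command):
        return cmd.argv
    is_str_seq = isinstance(cmd, (list, tuple)) and all(
        isinstance(a, str) for a in cmd
    )
    if isinstance(cmd, str) or is_str_seq:
        return cmd  # legacy forms pass through unchanged
    # Anything else, including a bare template, is rejected.
    raise TypeError(f"unsupported command type: {type(cmd).__name__}")


message_from_user = "hello; rm -rf /"
cmd = Command(["some_command ", Interp(message_from_user)])
```

The point of the gate is that a bare template never reaches the function that actually runs something; it has to go through the constructor that names the language first.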
Now I do understand that the whole idea of using language-tagged t-strings in this way to perform syntax highlighting in editors was set aside to be considered separately in future (and if it has been discussed further then I’ve totally missed it), but I do think we should be careful about what precedent we set for exactly what should take a t-string.
I would much rather see the following pattern come about
cmd = shlex.Command(t"some_command {message_from_user}")
html = some_html_lib.HTML(t'<a href="{link_target}">{link_text}</a>')
sql = some_sql_lib.SQL(t'select foo from bar where id={bar_id}')
p = subprocess.run(cmd)
foo, = db_connection.execute(sql).fetch_row()
return render(html)
than this:
cmd = t"some_command {message_from_user}"
html = t'<a href="{link_target}">{link_text}</a>'
sql = t'select foo from bar where id={bar_id}'
p = subprocess.run(cmd)
foo, = db_connection.execute(sql).fetch_row()
return render(html)
I also do realise that just as one could annotate the constructor of shlex.Command so that editors could discover the language of the content of the t-string, one could also annotate subprocess.run in the same way, so the case where the t-string was being immediately passed to subprocess.run would work just as well.
But the pattern I’m suggesting here is for the t-string to first be passed to some constructor (or regular function) that would merely parse the t-string without any side-effects (other than raising an appropriate exception if it’s invalid), such that the result could be stored in a variable and then used later (possibly multiple times, possibly not at all depending on the result of a conditional, possibly only after running some other code that needs to know that the t-string was valid, etc.).
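As an illustration of "parse first, use later", here is a minimal sketch using the SQL example from earlier. The SQL class is a hypothetical stand-in for some_sql_lib.SQL, Interp again stands in for a t-string interpolation, and the specific validation rules (rejecting a literal "?" and an empty query) are assumptions chosen only to show the constructor raising without touching the database.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Interp:
    """Stand-in for a t-string interpolation: carries only the value."""
    value: object


class SQL:
    """Hypothetical some_sql_lib.SQL: parts become a query plus parameters."""

    def __init__(self, parts):
        text, params = [], []
        for p in parts:
            if isinstance(p, str):
                if "?" in p:
                    # Would clash with the placeholders generated below.
                    raise ValueError("literal '?' is not allowed in the query")
                text.append(p)
            else:
                text.append("?")        # value travels as a bound parameter
                params.append(p.value)
        self.query = "".join(text)
        self.params = tuple(params)
        if not self.query.strip():
            raise ValueError("empty query")


bar_id = 2
# Construction only parses and validates; no connection exists yet.
sql = SQL(["select foo from bar where id=", Interp(bar_id)])

conn = sqlite3.connect(":memory:")
conn.execute("create table bar (id integer, foo text)")
conn.executemany("insert into bar values (?, ?)", [(1, "a"), (2, "b")])

# The stored object can be used later, and more than once.
rows = [conn.execute(sql.query, sql.params).fetchone() for _ in range(2)]
```

Because construction has no side effects, code between building sql and executing it can rely on the query having already been accepted, or can skip execution entirely.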
I feel there’s enough benefit to this pattern that it could be worthwhile to require it, which would mean not allowing “functions that do things”, like subprocess.run, to accept t-strings directly. If the first significant use of t-strings in Python is to write subprocess.run(t'prog arg1 arg2') then I fear the idea of “t-strings get passed immediately to something which validates them, which also acts as a language tag for static analysers” might vanish into thin air without us realising it.