Allow for arbitrary string prefix of strings

Yes, I have a draft PR version: "Initial specification PEP" (Pull Request #17 in jimbaker/tagstr on GitHub). Besides needing some more filling in, it is way too long, bogged down by discussion of quoting and injection attacks. Perhaps the motivation section could simply be replaced by a link to the Bobby Tables injection attack in xkcd's "Exploits of a Mom"? :wink:

@guido linked the sql tag example, but I will add that this example does exactly what you suggest, including support for DB-API 2 and SQLAlchemy, with binding parameters in the SQL object it constructs. The example is of course not fully developed, but I think it illustrates some interesting points about how tag functions would be implemented.

So a tag function is simply a callable with this type spec:

    from typing import Any, Protocol

    class Tag(Protocol):
        # Thunk is quoted because it is defined just below.
        def __call__(self, *args: "str | Thunk") -> Any:
            ...

where a Thunk has this definition (I’m leaning toward a named tuple approach, especially since this is typical for other Python internals):

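    from typing import Any, Callable, NamedTuple

    class Thunk(NamedTuple):
        getvalue: Callable[[], Any]  # call to evaluate the interpolated expression
        raw: str                     # source text of the expression
        conv: str | None             # conversion such as 'r', 's' or 'a', if any
        formatspec: str | None       # format spec following ':', if any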

OK, with those formalities in place, all a tag function does is the following:

  1. Parse the template strings. This could be super minimal as with the fl tag I showed earlier; or actually more involved with HTML. It’s your callable with a simple API :slight_smile: Note that such parses should be very memoizable and could also do fun stuff like codegen.

  2. Do something with the interpolations, such as evaluating them, wrapping them in bind params, applying formatting, etc.

  3. Return some object. Note that best practice is that this object doesn’t have side effects; instead it is a filled-in template, like the InterpolationTemplate in PEP 501 (General purpose string interpolation). See https://github.com/jimbaker/tagstr/blob/main/examples/interpolation_template.py, which shows how to implement PEP 501’s i prefix/tag with this approach.

With the sql tag example, we see how this works, especially in its analyze_sql function at https://github.com/jimbaker/tagstr/blob/main/examples/sql.py#L66, which matches on SQL text, Identifier, nested SQL objects, and expressions, which are then set up as binding parameters. The returned SQL object can then be executed with respect to a specific database library.

It’s not a great example, but the fact that this recursive common table expression, composed of nested SQL fragments and interpolations, works interchangeably with DB-API 2 or SQLAlchemy is pretty nice (I should refactor the code to make it more obvious that only the execution API changes, not the SQL and any interpolations):

        num = 50
        num_results = 9  # actually using num_results + 1, or 10
        
        # NOTE: separating out these queries like this probably doesn't
        # make it easier to read, but at least we can show the subquery
        # aspects work as expected, including placeholder usage.
        base_case = sql'select 1, 0, 1'
        inductive_case = sql"""
            select n + 1, next_fib_n, fib_n + next_fib_n
                from fibonacci where n < {num}
            """

        results = cur.execute(*sql"""
            with recursive fibonacci (n, fib_n, next_fib_n) AS
                (
                    {base_case}
                    union all
                    {inductive_case}
                )
                select n, fib_n from fibonacci
                order by n
                limit {num_results + 1}
            """)

So hopefully I addressed that!

So standard Python name semantics are used for binding a tag name to its function: tag functions can be imported, defined, patched, manipulated by looking up the namespace with globals(), etc., as usual. No specific registration is required.
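As a small illustration of that point, the sketch below treats a hypothetical `shout` tag function like any other name. It uses ordinary calls rather than the proposed tag syntax, so it only shows the name binding side, under the assumption that a tag name would be resolved the same way as any other name.

```python
def shout(*args):
    """Hypothetical tag function: join the parts and upper-case them."""
    return "".join(str(a) for a in args).upper()


# A tag name is just a name, so it can be aliased...
loud = shout

# ...placed into the module namespace dynamically...
globals()["yell"] = shout

# ...or patched by rebinding the name to a wrapper.
_original = shout


def shout(*args):
    return _original(*args) + "!"


results = (loud("hi"), yell("hi"), shout("hi"))
```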
