Proposal: Implement builtin way to stringify expression nicely (#2)

Meta

This is a continuation post of Introduce 'expr' keyword - #28 by JoniKauf .

It should have been clear to me from the start that introducing a new keyword for the proposed idea is not a viable solution.
I’m opening this new topic because the original expr keyword idea was eliminated almost instantly. Since that keyword was the main point of the old topic, I think starting over in a new topic about an actually feasible implementation is more organized.
The second reason is that this is now a far more fully baked idea ( Meta: Lifetime of an idea. 1, 2 & 3 ): the list of possible implementations has been narrowed down considerably, until I could finally settle on one proposal.
I hope these are valid enough reasons for opening a second topic.

Finally:
Thank you all so much for all the feedback and suggestions in the other topic. There were a lot of ideas and many of those were very nice solutions!

On-topic

I looked at all suggestions and have two favourites I would like to focus this topic on. The first one is my actual proposal; the second one is an honourable mention that doesn’t quite cut it. If something is not discussed here, check out the previous topic!

The proposal: An expr function

First off, I feel a bit lousy for seeing all these good suggestions and picking mine as the ‘best’. But I think in terms of what the goal is, this is the only solution that sets out to fix all issues mentioned below.

The problem

Sometimes we need to turn an expression (most often an identifier) into a string that matches how it is written in code, which from now on will be called its expressive form. The current solutions and their issues are:

  • Copy the expression into a string manually
    Issue: Code duplication, which is tedious and dangerous, because the copy is easily forgotten when the code is refactored
  • Use the f-string debug syntax f'{expr=}'.rpartition('=')[0]
    Issue: Cannot be fully generalized by a function and worst of all: evaluates the given expression, no matter how expensive
  • Use my 3.14 way mentioned here by using template strings.
    Issue: Still evaluates the expression.
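
To make the evaluation issue concrete, here is a minimal sketch. The names expensive and calls are made up for illustration; the only point is that the f-string debug syntax runs the expression even when just its text is wanted.

```python
# Hypothetical helper: records every call so the evaluation is visible.
calls = []

def expensive():
    calls.append("called")
    return 42

# Only the text of the expression is wanted here, but the f-string still
# evaluates expensive() in order to build the 'expr=value' string.
text = f"{expensive()=}".rpartition("=")[0]

print(text)        # the source text of the expression
print(len(calls))  # how many times expensive() actually ran
```

So the text is recovered, but only after paying for the call.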

Solution requirements

  • Lazily turn an expression into its expressive form
  • Reduce code duplication
  • Do so in a concise way
  • IDE tools, like renaming, should work nicely with it
  • Other obvious goals, like backwards compatibility, non disruptiveness… (they will not be further mentioned, as they should be obvious)

The proposal:

Add the expr function into builtins. The function definition would look something like this:

from typing import Any, Literal, LiteralString, overload

_EvalResult = Any  # Placeholder, should be the result of eval inferred by the type checker

@overload
def expr[T: LiteralString](e: T, /) -> T: ...

@overload
def expr[T: LiteralString](e: T, evaluate: Literal[False], /) -> T: ...

@overload
def expr[T: LiteralString](e: T, evaluate: Literal[True], /) -> tuple[T, _EvalResult]: ...

def expr[T: LiteralString](e: T, evaluate: bool = False, /) -> T | tuple[T, _EvalResult]:
    if evaluate:
        return (e, eval(e))
    return e

This function would need to be special-cased by linters and type checkers, and its implementation would need to be more sophisticated than my three lines of code:

  • Linters would need to understand that e is an expression and should be treated as if it were outside of a string, so that renaming and other features work as in other parts of the code. You can think of it as similar to how type hints wrapped in strings are often treated specially (for forward refs).

  • Type checkers would need to understand that the returned tuple’s second value, when evaluate is true, is the result of evaluating the expression in the string, as if it were outside of a string. Optionally, the string passed

  • When evaluate is true, we want to evaluate the expression. The problem is that in many scenarios we would need globals() and locals() to be passed, which would be rather clunky. I do not know for sure, but I assume it is possible to implement this in C as a special function that picks these up automatically.
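
As a small illustration of the last point, the sketch below shows why a plain eval(e) inside the helper is not enough. The names naive_expr, caller and caller_explicit are invented for this example and are not part of the proposal.

```python
def naive_expr(e, evaluate=False):
    # Same shape as the three-line version: eval runs in *this* function's
    # namespace, not in the namespace of whoever called it.
    if evaluate:
        return (e, eval(e))
    return e

def caller():
    local_items = [10, 20, 30]
    try:
        return naive_expr("local_items[2]", True)
    except NameError as exc:
        # local_items is a local of caller(), so naive_expr cannot see it.
        return type(exc).__name__

def caller_explicit():
    local_items = [10, 20, 30]
    # Works, but forces every call site to pass both namespaces by hand.
    return eval("local_items[2]", globals(), locals())

print(caller())
print(caller_explicit())
```

This is the clunkiness mentioned above: either the call site passes the namespaces explicitly, or the function has to obtain them in some special way.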

Goal completion

Looking at all proposed solutions, this is the only one that covers all issues.
It is:

  • short
  • has no duplication
  • is lazy
  • can relatively easily be supported by the type checkers, linters and IDEs

The only real issue I could see is the support from outside tools needed. Similar constructs, like types wrapped in strings, are already supported though, therefore I think this should not be an issue.
Escaping inner strings of the expression should not be an issue, because other wrapper quotation marks could just be chosen and it is highly unlikely that all four possible string definition formats (", ', """, ''') are used in a single expression. If absolutely required, escape sequences of strings could just be detected by the tools as well.

Why expr as a name

I follow the naming convention of the exec and eval function:

  • exec: Execute statements
  • eval: Evaluate an expression
  • expr: Turn an expression into its expressive form and optionally eval it.

They all have in common that they can run a string as Python code (though exec and eval can also run code objects), so I think this would be a nice fit for the name.

This is my proposal. Following up is an honourable mention and is not part of it technically, but could have potential to still be fixed somehow (awaiting ideas).

An almost perfect solution: !e conversion

My favourite idea by somebody else was proposed by @dg-pb:
The !e conversion syntax that could be introduced to f-strings (post: Introduce 'expr' keyword - #19 by dg-pb ).

Example:

>>> f"{int('10')!e}"
"int('10')"

Compared to other solutions it would be a minor language change, it has no backwards incompatibility (to my knowledge), and using a conversion simply feels like a perfect fit for the use case. Obviously, the other conversions (a for ascii, s for str and r for repr) turn the value of the expression into the specified format, not the expression itself, but I still think it fits nicely and feels like a natural solution.

But sadly there is a major problem with this solution: the expression cannot be evaluated at the same time. If both the value and the string are needed, one has to repeat the expression, which defeats the entire purpose of this proposal.


A possible issue with this is that converting the input syntax to a string is done at the parser level.

import ast
print(ast.dump(ast.parse('f"{1=}"'), indent=4))

With a builtin, there is no way for the parser to know whether the expr function has been overridden.
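
To illustrate the parser-level point, the following sketch inspects the tree instead of dumping it. It only assumes the standard ast module; the variable names are arbitrary.

```python
import ast

tree = ast.parse('f"{1=}"', mode="eval")
joined = tree.body                     # the f-string as an ast.JoinedStr
text_part, value_part = joined.values

# The debug text was already turned into a plain string constant by the
# parser, before anything is evaluated.
print(type(text_part).__name__, repr(text_part.value))

# The expression itself is kept separately as a FormattedValue node.
print(type(value_part).__name__, ast.dump(value_part.value))
```

The source text of the expression is fixed during parsing, which is exactly the stage at which a call to a function named expr cannot yet be told apart from any other call.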


I think some methodology to tell whether some function call refers to the builtin would be useful for optimizations, but I don’t think there currently is one.

And even if there was, I don’t feel this is well applicable to this problem.

Such would be more suitable for optimization hacks - something that can be added or removed without breaking stuff.

While the solution path for this already exists in f-strings.

Nice writeup!

I think a more exhaustive explanation of why f'{expr=}'.rpartition('=')[0] is not suitable would help to provide a more convincing argument to those who don’t understand the motivation for this idea.


One reason would be in case the expression itself contains an equals sign:

>>> f"{1==2=}"
'1==2=False'
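
A short sketch of how the equals sign gets in the way, depending on which side the split starts from. The variable names are only for illustration.

```python
# An '=' inside the expression breaks splitting from the left.
debug_text = f"{1==2=}"
from_left = debug_text.partition("=")[0]
from_right = debug_text.rpartition("=")[0]
print(repr(from_left), repr(from_right))

# An '=' inside the value's repr breaks splitting from the right.
s = "k=v"
debug_text2 = f"{s=}"
from_left2 = debug_text2.partition("=")[0]
from_right2 = debug_text2.rpartition("=")[0]
print(repr(from_left2), repr(from_right2))
```

Neither split direction is safe in general, because both the expression and the repr of its value may contain an equals sign.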

Good point! Though I already mentioned it in the previous post, it won’t hurt re-stating it in short.

So why is f'{expr=}'.rpartition('=')[0] ‘bad’?

  1. It is extremely long and tedious to write out for its intended purpose
  2. The most compact way to write it is by defining a function that does the partitioning, the rest must be written out, otherwise we would lose the ‘expressive’ form. So at best, it would look like this: express(f'{expr=}'). A bit shorter, but could be better.
  3. If the value is needed as well, you only get it in its string form, so at worst the expression must be repeated entirely, which in turn…
  4. …would mean that there is no single point of truth anymore. So in that case there is barely an advantage over just entering it in a string manually. The only thing better is that the IDE/linter/type-checker validate that it is a valid expression.

I don’t quite understand what you are getting at.
Are you saying that making the function understand that locals and globals should be taken automatically is impossible? If so, then please let me know.

Because all the function does is to either:

  1. take in a string and give it back out or
  2. take in a string and evaluate it with all globals and locals, so not directly with the builtin eval() but in a special way.

Are you saying that this part is where it can’t work?

Ok, so I am not an expert here. I just have a rough intuition. So if someone can correct me, prove me wrong, please do. But to my best understanding:

  1. Parser parses the syntax to AST
print(ast.dump(ast.parse('expr(a[1])'), indent=4))
Module(
    body=[
        Expr(
            value=Call(
                func=Name(id='expr', ctx=Load()),
                args=[
                    Subscript(
                        value=Name(id='a', ctx=Load()),
                        slice=Constant(value=1),
                        ctx=Load())]))])

When parsing, the parser has no idea what expr is. It could be a builtin, a global, a local, or might not exist at all. And it does not matter at this stage.

  2. The AST is evaluated in a certain context, where all variables are picked up from the appropriate frames as evaluation happens.

This is what happens for anything that follows function call syntax.

So in short, the desired expression needs to be saved at parsing stage (1), while the nature of expr can only be determined at stage (2).
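
A minimal sketch of the two stages, assuming nothing beyond the standard ast module. The two lambdas stand in for "the builtin" and "some override" and are purely hypothetical.

```python
import ast

source = "expr(a[1])"
tree = ast.parse(source, mode="eval")

# Stage 1: at parse time the callee is only a name; nothing says which
# object it will refer to.
callee = tree.body.func
print(type(callee).__name__, callee.id)

# Stage 2: the same compiled code resolves expr differently depending on
# the namespace it is evaluated in.
code = compile(tree, "<demo>", "eval")
data = [10, 20]
first = eval(code, {"expr": lambda value: ("first", value), "a": data})
second = eval(code, {"expr": lambda value: ("second", value), "a": data})
print(first, second)
```

By the time expr is known to be one function or the other, the argument has already been evaluated and its source text is gone.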

I am not saying that it is impossible to make it work. But the cost of such an endeavour is unlikely to be justified by its benefits.


Now, I see that I am not 100% sure what your proposal is (at least one example of usage in proposal would be good). My bad for not double checking before commenting, but it is a bit late now:

expr(a[0])    # 1. This? Then what I wrote above.
expr('a[0]')  # 2. Or This?

The second one has its own issues: manually walking back through frames to find variables is not designed to be performant or robust. Also, an IDE cannot easily know whether expr is builtins.expr or some newly defined function. An advanced IDE can of course try to infer it from imports, etc. But import-dependent syntax highlighting is quite a strain on an IDE and also very unreliable. And simpler editors would not be able to do it at all.


A soft keyword would also suffer from similar issues. E.g. expr (a[0]). Is it a call to a locally defined expr function, expr( a[0] ), or the expr soft keyword with expression = (a[0])?


So the way I see it, both of the above are high effort - questionable quality outcomes.

The straightforward way to do this is to have a dedicated syntax, which can signal the parser and the IDE in an unambiguous manner. This is what f-strings did with f'{expr=}', where the “expr” string can be unambiguously reconstructed from the AST. At least, my best guess is that this is how it works.

Can’t give a big answer right now, just want to mention that the second way, expr('my_expr'), is my proposal, so the argument is wrapped as a string (and then evaluated if True is also passed).


One more thing. f'{expr=}' checks for validity of syntax at parser level.

ast.parse('f"text {a[0=} text"')
# SyntaxError:...

Which is a good minimal validity check given this is a feature which is to some degree syntax-integral.

Otherwise, with expr("a[0="), since the origin of expr cannot be determined, the syntax could not be checked when evaluation is not requested. And if not even the validity of the syntax is ensured, then I would say code-strings are a feature that would do the same thing but cover much more ground.
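
A small sketch of the difference, using only ast.parse. The helper is_valid_syntax is a made-up name for this example.

```python
import ast

def is_valid_syntax(source, mode="exec"):
    """Return True if the parser accepts source, False on SyntaxError."""
    try:
        ast.parse(source, mode=mode)
    except SyntaxError:
        return False
    return True

# The broken expression inside an f-string is rejected by the parser.
fstring_ok = is_valid_syntax('f"text {a[0=} text"')

# The same broken expression inside a plain string argument is just data.
call_ok = is_valid_syntax('expr("a[0=")')

# It is only rejected once something parses the string contents themselves.
inner_ok = is_valid_syntax("a[0=", mode="eval")

print(fstring_ok, call_ok, inner_ok)
```

So a string-based expr would need either tool support or an explicit parse step of its own to catch such mistakes before evaluation.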


Your criticism of f'{expr!e}' is the need to type things twice. But as I showed earlier, this is often not an issue, when a value is pre-stored:

d = something[0].attr    # type(dict)
print(f'{something[0].attr["key"]!e} has value {d["key"]!r}')

Or the possible solution to this:

f'{(a := 1 + 1)=}'
# '(a := 1 + 1)=2'

While !e could be made to drop the walrus when one is present:


f'{(a := 1 + 1)!e} has value {a}'
# "1 + 1 has value 2"

@JoniKauf Can you give some concrete examples of where you’d use this in real(ish) code, please?

I’m pondering something to do with (what I think is) your idea and t-strings, but I want to know I’m understanding your idea properly before I go any further with my thought…

You still have to access something["key"] twice. Even if not, you need to type it twice, which makes it easier to forget to update one of the two.

I will restate my feature suggestion in a simpler way, as it seems not to be 100% understandable yet:

Implement a new builtin function called expr with the signature at the bottom of this post, so it takes in a literal string (that represents a valid expression) and optionally a boolean.

  1. If the function is called with the boolean being false or not given, then the function simply returns the string back (we use a TypeVar to know it is the same literal string).
  2. If the function is called with the boolean being True, it returns a tuple with the string (like in step 1) and, as the second tuple element, the result of evaluating the string. The second element is not exactly just an eval(…) call, because the evaluation should automatically know about the globals and locals of the place where expr is called.
    (Now that I think about it, wouldn’t this also be possible in pure python by jumping out the current frame and getting the globals and locals of the outer function call? Then this would be rather easy to implement without special treatment)
    That’s everything that the function is about.

To make it work nicely we get to step 2, which is support from type checkers, linters, IDEs… What they would have to do is 2 things:

  1. The first argument, the literal string, is supposed to be a valid expression. Therefore it should be highlighted and supported as if it was an expression, just wrapped by a string.
  2. If the second argument is True, meaning we return the string together with the result of evaluating it, then the tools should understand that the returned tuple’s second element is not just Any, but whatever the expression in the first argument would evaluate to.

Your proposal is clear and understandable. That’s not the issue. The problem is that people don’t think it’s worth doing.


A couple of points here.

There’s no “code duplication”. The difference between expr(a[1]) and "a[1]" is not that the expression doesn’t need to be written again, it’s that you expect tools to recognise the first as an expression (and so a is recognised as an instance of the variable a) whereas they won’t for the second. Yes, value, text = expr(a[1], True) avoids repetition, but it’s pretty clumsy, and you can get the same result already using text = "a[1]"; value = eval(text).

Given that this is therefore all about tools recognising the expression, you still have an issue:

# This code calculates something from a[1]
value, text = expr(a[1], True)
result = value + 7
print(f"From {text}, we calculate the result {result}")

If you rename the variable a, your refactoring tool won’t recognise the comment, so that would still need to be changed manually.

I can see the value of making it easier for refactoring tools to spot all uses of a variable name, but your proposal doesn’t do that. It helps with one, relatively rare, case (strings) and does nothing for the much more common case (comments).

A tool specific convention[1] would be a much more effective solution to this problem. And it doesn’t need a language change.


  1. Which could be standardised between tools, if it became popular ↩︎

Comments have nothing to do with my proposal and just because comments don’t work, even if they are used more, doesn’t mean my proposal is less useful. Like you said, such a proposal could be done, does not have anything to do with mine though.

Also, as I realized in my last post, there would not need to be any builtin magic. So this function could be put into any module, probably inspect or typing, which would not make this as hard to add.

Here is the new code:



from typing import Any, Literal, LiteralString, overload
import inspect

_EvalResult = Any  # Placeholder, should be the result of eval inferred by the type checker

@overload
def expr[T: LiteralString](e: T, /) -> T: ...

@overload
def expr[T: LiteralString](e: T, evaluate: Literal[False], /) -> T: ...

@overload
def expr[T: LiteralString](e: T, evaluate: Literal[True], /) -> tuple[T, _EvalResult]: ...

def expr[T: LiteralString](e: T, evaluate: bool = False, /) -> T | tuple[T, _EvalResult]:
    if not evaluate:
        return e
    
    curr_frame = inspect.currentframe()
    
    if not curr_frame or not (caller_frame := curr_frame.f_back):
        raise LookupError("Could not find outer frame")
    
    return (e, eval(e, caller_frame.f_globals, caller_frame.f_locals))



if __name__ == '__main__':
    my_list: list[int] = [10, 20, 30]

    ex1 = expr('my_list')
    assert ex1 == 'my_list'

    ex2 = expr('my_list[2]', True)
    assert ex2 == ('my_list[2]', 30)

The only thing missing is just tool support for highlighting the string.

OK, cool. I suggest that you publish this implementation as a 3rd party library on PyPI. If it’s sufficiently beneficial, people will use it (you can advertise it on social media and similar channels to get the word out if you’re concerned that a 3rd party library is less discoverable than a builtin). If your library is useful enough, linters, IDEs and other tools will be willing to recognise it (and if needed, special case it). (Edit: The experience with attrs and the dataclasses stdlib module demonstrates that “being in the stdlib” isn’t needed for tool support - tools do support popular 3rd party libraries if the case is good enough).

Once that has happened, there is a case for moving the implementation into the standard library, or maybe even into the builtins. But I can’t see that happening without prior evidence that the function is useful, gathered from experience with it as a 3rd party implementation.

Personally, I don’t think a 3rd party implementation will succeed to a point where it’s suitable for stdlib inclusion. But I don’t want to discourage you from following up with your idea if you want to (after all, I’m simply stating my unsubstantiated opinion, just like you are). So by all means follow that route if you want to.


Consider this example and test your function:

def foo():
    a = 1
    def bar():
        print(expr('a', True))
    bar()

foo()

There are ways to make it work better.
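
For readers who want to run the test, here is a self-contained sketch. It uses a stripped-down copy of the frame-based expr without type hints, and foo_capturing is a hypothetical variant added only to show one way the result changes.

```python
import inspect

def expr(e, evaluate=False):
    # Simplified copy of the frame-based version, type hints omitted.
    if not evaluate:
        return e
    frame = inspect.currentframe()
    caller = frame.f_back if frame else None
    if caller is None:
        raise LookupError("Could not find outer frame")
    return (e, eval(e, caller.f_globals, caller.f_locals))

def foo():
    a = 1
    def bar():
        # 'a' only appears inside a string, so bar() does not capture it.
        return expr('a', True)
    return bar()

def foo_capturing():
    a = 1
    def bar():
        a  # a real reference makes 'a' a closure variable of bar()
        return expr('a', True)
    return bar()

try:
    outcome = foo()
except NameError as exc:
    outcome = f"NameError: {exc}"

print(outcome)
print(foo_capturing())
```

The lookup only succeeds when the compiler has independently seen the name in the inner function, which is the kind of fragility being pointed out here.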

But this is what is called “black magic”. Even if such was made to work, it will not be performant/robust. It is just not well integrated into the way robust Python code is designed to work.

In short, no solution that depends on such machinery will end up in standard library.

Especially builtins.


And just to repeat myself. Even if the above was ok, IDE will not be able to know whether expr is a builtin function or something else in a robust manner, which kind of kills half of the benefits.


I published my first package on PyPI about this topic.

It has one function, the expr func, which works as described above. The first argument it takes is Template[Callable[[], T]], i.e. a template with an interpolation that holds a no-arg lambda, to allow laziness. It has all the features I would want from such a function, except that it is a bit verbose. That is due to the lambda being needed, because interpolations are not lazy.
