Introduce 'expr' keyword

Proposal

I propose adding a new keyword expr that returns the actual string representation of an expression and optionally also evaluates it. This would work similarly to how the f-string expression debugging syntax (e.g. f"{x[0]=}") works.

Specification

  • The expr keyword expects one expression after it, of which it is supposed to return the exact string as written out in the code, excluding surrounding whitespace.
  • No part of the expression is actually executed, but the identifiers it uses must reference existing names.
  • If an equals sign (=) is placed after the entire expression, similar to f-string’s debugging syntax, a tuple of two elements is returned, where the first element is the expression and the second element is the same expression evaluated.
  • Inside the expr body, it is possible to place an exclamation mark (!) instead of a dot (.) between each part of a fully qualified name. This has the effect that only the part after the exclamation mark of that fully qualified name is included.

Example code:

>>> stars = [1, 2, 3, 4, 5]
>>> expr stars[0]  # Get string representation
'stars[0]'
>>> 
>>> expr stars[0]=  # Get string and evaluated value
('stars[0]', 1)
>>> 
>>> import math
>>> 
>>> expr math.pi.as_integer_ratio
'math.pi.as_integer_ratio'
>>> 
>>> expr math!pi.as_integer_ratio
'pi.as_integer_ratio'
>>> 
>>> expr math.pi!as_integer_ratio
'as_integer_ratio'
>>> 

Current Alternatives

Technically it is already possible to get the string representation of an expression with some f-string (ab)use, like this:

>>> expr, _, v = f"{stars[0]=}".partition("=")
>>> expr
'stars[0]'

But this has two major drawbacks:

  1. The expression is always executed, whether desired or not. In many scenarios this is unacceptable, as the expression could run expensive code.
    Using the produced value is also unpleasant or impossible, as it has already been converted to its repr() string. Depending on the context, the value could be extracted with some workarounds, but that is far from an ideal solution for the given problem.

  2. The expression itself is cumbersome to write out every time it is needed. It also cannot be fully refactored into a function, as the actual expression cannot be passed. The best we can do is the following:

>>> def get_expr(debug_f_string: str) -> str:
...     expr, success, _ = debug_f_string.partition("=")
...
...     if not success:
...         raise ValueError("No '=' found")
...
...     return expr.strip()
...
>>> get_expr(f"{stars[0]=}")
'stars[0]'
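To make drawback 1 concrete, here is a small runnable sketch (using only the stars list from above): the expression is always evaluated, and the value survives only as its repr() text after the "=", not as the original object.

```python
stars = [1, 2, 3, 4, 5]

# The f-string debug syntax evaluates stars[0] immediately; the result
# is baked into the string as repr() text after the "=".
text, _, value_repr = f"{stars[0]=}".partition("=")

print(text)        # stars[0]
print(value_repr)  # 1 -- but as the string "1", not the int 1
```

Getting the real value back would require eval() or similar workarounds.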

Why not just write out the expression as a string manually?

In this relatively popular StackOverflow question (“Getting the name of a variable as a string”, with 474 upvotes as of writing this), the top answer proposes a solution similar to mine, stated above.
The top comment to that answer asks “How is this useful in anyway ? One can only get the result “foo” if one manually write “foo”. It doen’t solve OP problem.” and has almost as many upvotes as the actual answer.
So this seems to be the common consensus.

There are multiple advantages over just manually typing out a string:

  1. If at any point the expression is changed, there is no immediate error, possibly leading to bigger issues in the future. Because the new expr body must reference valid identifiers, a SyntaxError is raised if that is not the case.
  2. If an IDE is used, renaming is usually done with refactoring tools from that IDE. This approach automatically renames the string representation of the given variables for us. If the variable does not exist anymore, because it got renamed or deleted, an error will be shown.

Use cases

With Python being a highly dynamic language, as soon as some dynamic features come into play it often becomes necessary to represent an expression as a string.

Example 1: __all__

It always bothered me that __all__ references strings of identifiers, because it just looked so error prone. With the new expr keyword, we could reduce its “error proneness”.

class _Base: ...

class A(_Base): ...

class B(_Base): ...


__all__ = [
    expr A,
    expr B
]

Example 2: Metaclasses

class PersonMeta(type):
    category: str

    @classmethod
    def __prepare__(cls, name: str, bases: tuple[type, ...], category_name: str) -> Mapping[str, Any]:
        return {expr cls!category: category_name}
        # same as {"category": category_name}

class Baker(metaclass=PersonMeta, category_name="food"):
    ...
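The expr syntax above is of course not runnable today. For comparison, here is a self-contained sketch of the same metaclass in current Python, with the hard-coded string literal that the proposal wants to avoid (the names PersonMeta, Baker, and category_name mirror the example above):

```python
from collections.abc import Mapping
from typing import Any

class PersonMeta(type):
    category: str

    @classmethod
    def __prepare__(mcls, name: str, bases: tuple[type, ...], **kwds: Any) -> Mapping[str, Any]:
        # Today the attribute name must be hard-coded as a string literal --
        # exactly the duplication the expr keyword would remove.
        return {"category": kwds["category_name"]}

    def __new__(mcls, name, bases, ns, **kwds):
        # Swallow the extra class keyword so type.__new__ doesn't see it.
        return super().__new__(mcls, name, bases, ns)

class Baker(metaclass=PersonMeta, category_name="food"):
    ...

print(Baker.category)  # food
```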

Example 3: Free to use debug strings

exp, val = expr 2**32=
print(f"The calculation '{exp}' resulted in the value {val}")

What do you think?

4 Likes

Is that useful? ANY syntactically-valid name is a valid variable, even if it isn’t being used. Until you try to evaluate it, it won’t be proven to be valid. Nothing checks that the attribute names are correct (eg math.pi.as_integer_ratio) until you evaluate it, either.

It is useful for not accidentally using a wrong identifier.

This seems like an extremely niche feature. “Just quote the expression” seems perfectly sufficient to me. The advantages you claim for the expr keyword don’t convince me at all, I’m afraid.

Not even remotely enough benefits to justify adding a new language keyword, IMO. Especially not a name as common as expr (this change would break the currently valid code prev_expr = expr - 1).

4 Likes

At what point would a wrong identifier be detected? Does it, or does it not, evaluate the expression?

Wouldn’t this make more sense as an IDE feature? If I worry about e.g. __all__ elements being misspelled, that’s something I’d want my IDE to tell me when I’m actually doing the typing, not the compiler when someone is later executing the code. Same with renaming things, while making it an expression makes it so that current IDE features will recognise it as something they should include in linting, that’s not really a requirement. An IDE could instead just recognize that certain strings are meant to map to bound variable names or similar and lint them accordingly. AFAIK vscode even already does that for __all__.

2 Likes

I know that Pyright/Pylance does that for __all__ in some way but I wasn’t sure about other type checkers, so I didn’t want to mention it. But that was just an example where it could be used of course.
I myself stumbled onto many places where I had to use the string representation of an identifier in one place and the identifier itself in another (metaclasses, for example). For these scenarios such a feature would be very useful and remove the possibility of errors, without adding any runtime overhead or extra complexity.
I also just think it is kind of silly that Python with its dynamic nature has no way of converting expressions to strings (for most cases). Sure, functions, classes and modules have __name__ and __qualname__, but that is not enough in many cases. Even C# has such a feature (though more limited), which is the nameof operator.

It only evaluates the expression if “=” is present at the end.
But you might be right by saying it is not able to check which identifiers are valid, now that I think about it more. Maybe we could leave that part of linters/type checkers to warn/give an error when invalid identifiers seem to be used. These obviously won’t be perfect, but they aren’t perfect usually either way, like when you dynamically change stuff in globals and locals.

To answer your second paragraph: We could call the keyword express instead, or something less used. I assume express is far less common as an identifier than expr and expression, and it fits nicely. I just went with expr for now because I felt it fit nicely with eval and exec, being the first 4 letters of the actual word they describe.

It’s still a breaking change, though. Breaking changes need to offer significant benefits, and this proposal simply doesn’t meet that bar.

2 Likes

Exactly - if it’s not evaluating it, it can’t know which identifiers are valid. This is particularly true of attribute lookups, which can be completely dynamic. So if it’s just going to be up to linters, how is this different from a string literal?

1 Like

This feels more like the idea of wanting format tags for strings, similar to the original proposal of template strings. Then you can mark any arbitrary string as an ‘expression string’ and your IDE would attempt to warn about syntax errors, name resolution errors, etc., just like it currently can for __all__. The key thing here, though, is that it is still up to your IDE/linter to perform the analysis, with no change at runtime, due to the good points that @Rosuav and @pf_moore have made.

Perhaps starting with some sort of comment based syntax and experimenting with a simple tool to identify name resolution errors would be a good start? Then if there is enough interest it could be included in popular tools and eventually move to its own syntax not based on comments (a similar transition to what type comments made to annotations).

foo(time=7, expr="Bar['baz']", value=Bar["baz"])  # str: [expr,str]
1 Like

There is prior art for this kind of thing in form of the C# nameof() operator: The nameof expression - evaluate the text name of a symbol - C# reference | Microsoft Learn

Rust has a related feature as stringify!(): stringify in std - Rust – can be used to write assertion macros without that unholy reflection that Pytest has to pull off in Python.

I regularly miss a nameof-style feature in Python when writing docstrings or error messages. The __all__ example is also very good. It is tedious to manually search and replace all occurrences of a name, so it would be convenient if there’s a language-level mechanism to automatically keep literal strings in sync with a variable. (More precisely, a mechanism that allows third party type checkers to understand that a literal string is no longer in sync with a name that is in scope).

I don’t like the syntax of this expr operator proposal, but the general idea is sound.

Lamentations on keeping docstrings in sync with variable names:

A workaround that I have used to keep docstrings in sync is to process them via a documentation generator that resolves links. For example, we might say in Sphinx:

def some_function(x):
    """Process data from :py:func:`another_function`."""

Similarly, we can use Markdown link anchors in mkdocstrings. We can then check the docs for broken links.

But this only helps with things that have a global name, not for things in local scopes (e.g. function parameters).

Linters like Pydocstyle / Ruff can detect some mismatches between the docstrings and parameter list (rule D417), but this only helps in certain positions and only when the docstring follows a specific format.

def foo(y, z):
    """Does something.

    Args:
        x: this typo CAN be detected!
        y: bla bla

    Returns:
        sum of x and y - this typo CANNOT be detected.
    """
    return y + z

String escape codes might be more realistic than new operators.

However, I don’t see any good solution that’s reasonably Pythonic. An expr operator is difficult to introduce in a compatible way, especially if it would also require inserting a ! in various points of the expression. That would break a lot of downstream tooling! Function-looking compile-time operators as in C#'s nameof() would be completely novel in Python, so those are probably not a great choice either.

Given that the output of an expr-stringification should be a literal string, it might be more appropriate to put this into string syntax. For example, we have the ability to insert named Unicode characters in a string literal: "\N{SNOWMAN}" is "☃". By analogy, we might introduce \E{…} as way to interpolate (but not evaluate!) a stringified Python expression. This would be mostly backwards compatible, with some caveats regarding nested strings. Here’s an example how that might look:

__all__ = ["\E{my_function}"]

def my_function(x, y):
    """Add \E{x} and \E{y}."""
    if x < 0:
        raise ValueError("\E{x} must not be negative")
    return x + y

Now that I look at it, this looks a lot like TeX :confused:

Unfortunately, I’m not able to come up with reasonable semantics for this.

  • It would make sense if string literals with \E{…} elements would still be literal strings, with the escape replaced by its contents.
  • Static analysis tools like typecheckers (or anything else that does its own parsing) would be able to verify that all names resolve.
  • I’m not sure if Python should check at runtime that the referenced names exist – would be tricky in expressions with control flow like "\E{a or b}". It would also break the forward reference in the __all__ example above.
  • If the escapes are replaced in the value of the string, then relevant information would be inaccessible to tools that are based on reflection, e.g. documentation generators that import the module. (This could perhaps be fixed if string literals with stringified expressions get a dunder-property that describes the spans/offsets of these interpolations, but that’s a lot of complexity with unclear value.)

At this point, the feature is nearly useless. It doesn’t strike me as quite compelling enough for Python as a whole. But personally, I’d still love it, also because it completes the f-string and t-string universe with the ability to only print an expression without evaluating it.

4 Likes

Thank you all for the kind and useful feedback on my suggestion!

I definitely agree now that introducing a new keyword for this use case would be too breaking for such a relatively niche feature; it is not viable and should not happen!

I like the approach from @CarrotManMatt and it is an interesting idea, similar to how type annotations were first introduced to the language (via comments). This comment syntax could be highlighted by IDEs and supported by renaming and similar refactoring features.

I also like the ideas and brainstorming @latk brought to the table, but like they said themselves, I also don’t quite like the \E way of escaping expressions.

New ideas

So I went back to the drawing board, thought about it some more, and came up with three ideas:

The best solution for Python 3.14

I found a very nice way of using Python 3.14’s templates for our purpose.
These have many useful features, but the one that interests us is the expression field of interpolations. We can use the upcoming generics support for templates and interpolations, combined with the previously mentioned expression field, to create this:

Code sample in pyright playground

from string.templatelib import Template
from typing import Literal, overload, reveal_type

@overload
def expr[T](template: Template[T], eval: Literal[True], /) -> tuple[str, T]:
    ...

@overload
def expr[T](template: Template[T], eval: Literal[False], /) -> str:
    ...

@overload
def expr[T](template: Template[T], /) -> str:
    ...

def expr[T](template: Template[T], eval: bool = False, /) -> str | tuple[str, T]:
    if len(template.interpolations) != 1:
        raise ValueError("expected exactly 1 interpolation")
    
    if template.strings[0] or template.strings[1]:
        raise ValueError("strings must be empty")
    
    interp = template.interpolations[0]

    if interp.format_spec or interp.conversion:
        raise ValueError("no format spec allowed")

    if eval:
        return (interp.expression, interp.value)
    
    return interp.expression

# Python 3.14rc1 does not yet have generics for Template and Interpolation,
# which is why this doesn't type-check YET
reveal_type(expr(t'{10}'))
reveal_type(expr(t'{10}', False))
reveal_type(expr(t'{10}', True))

So yes, we can use very concise syntax to create dynamic expressions. There are two caveats though:

  1. Because templates are evaluated eagerly, the expression will always be evaluated and can therefore cause major runtime costs if only the string is needed.
  2. Minor, but the syntax could still be shorter if possible.

If templates were actually lazy, then I would say this implementation is definitely sufficient and no new feature is needed (except maybe making it a builtin function). Lazy evaluation was even considered and rejected in PEP 750, but another PEP in the future might bring something similar to the table.
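Until something like lazy templates exists, laziness can only be approximated by pairing the expression's source text with a thunk, at the cost of writing the expression twice. A rough sketch:

```python
stars = [1, 2, 3, 4, 5]

# Pair the expression's text with a lambda that defers evaluation.
# The duplication of "stars[0]" is exactly what the proposal wants to eliminate.
exp, thunk = "stars[0]", (lambda: stars[0])

print(exp)      # stars[0] -- no evaluation has happened yet
print(thunk())  # 1        -- evaluated only on demand
```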

So, we either hope that happens, otherwise I have another idea.

e-strings

e-strings are very similar to all the concepts we previously discussed. They would look something like this:

expr: str = e'stars[0]' 
result: tuple[str, T] = e'stars[0]='   # T is the type of the evaluated expression, here int

But honestly, this is not a good solution for multiple reasons. I just wanted to mention it, because I liked the idea at the start.

A typing approach

I suggest we could add a new function called expr to the typing module, the inspect module, or builtins. It would have the following signature:

def expr(e: str, eval: bool = False, /, globals: dict[str, Any] | None = None, locals: dict[str, Any] | None = None) -> str | tuple[str, Any]: ...
  1. It has similar overloads as mentioned in my templates approach above, so if eval is True, a tuple with the string and value is returned, otherwise just the string (but lazy!).
  2. This function is special-cased by type checkers, so instead of an arbitrary str, the type checker and IDE understand that the passed e string is supposed to be a valid expression and can highlight it accordingly. If the expression seems invalid, this should also be highlighted of course. Renaming and other refactoring features should ideally also work on this string.
  3. The code of this function is simply this (note that the eval parameter shadows the builtin, so the builtin has to be reached through the builtins module):
import builtins

def expr(e: str, eval: bool = False, /, globals: dict[str, Any] | None = None, locals: dict[str, Any] | None = None) -> str | tuple[str, Any]:
    if eval:
        return (e, builtins.eval(e, globals, locals))

    return e

I would totally love this approach if globals and locals didn't have to be passed explicitly when needed, but oh well.
I also just need to mention how smoothly this would fit next to the functions eval and exec in terms of naming and functionality.

So let me know what you think of my 3 ideas!

1 Like

Re: 3.14 Templates

The templates idea is a nearly workable solution in the short-term as you mentioned, however it feels very much like a hack to me; only a few, very specific, template states are valid for this function. Having so many checks for invalid templates very much suggests to me that they are the wrong tool for this and moulding them to be closer to this functionality would be the wrong path to take IMO.

Re: e-Strings

It will be very difficult to convince others that a whole new string type is worth the overhead of new syntax & updates to tooling for this feature only.

Re: expr() Function

I agree that this fits very nicely with the existing builtins eval() & exec()! Getting a proposal for a language change of the scale of a builtin will be a big ask, but due to being a function there is more flexibility/freedom in an initial implementation that is less impactful (yet will still prove its usefulness).

As you showed, the runtime function implementation is not that tricky, but getting tooling support may be. I would suggest a good first step may be to develop and share a mypy plugin that can correctly understand the typing implications of this function.

Getting a typing PEP written and approved would be a lot easier with some sort of implemented solution for a static analysis tool like mypy.

Evaluation Scope

Function Signature Bikeshedding

Due to your runtime implementation of expr() calling eval() inside its internal function scope, I would suggest that it is rather unlikely you would ever want to call expr() with eval=True but globals=None or locals=None, as you would have none of the expected names available for evaluation. I would suggest that the signature of expr() could combine globals and locals into a single argument of type tuple[dict[str, Any], dict[str, Any]] | None, and determine whether to perform evaluation based on whether this argument is None. That feels like a cleaner signature to me.

Of course this is just opinionated bikeshedding, and not a reflection on the quality of the idea itself.

Improvements After a Transition to builtins

The major benefit I see of eventually moving the expr() function to builtins would be the ability to remove the globals and locals arguments and truly evaluate the expression in the caller's scope. I hope this would be possible by implementing the function in C once it is moved to builtins, but I may be misunderstanding how the compiler and runtime interact; perhaps this improvement is impossible without a new keyword rather than just a function, which we have already ruled out.

pylint is very helpful for addressing this issue.

Can you share some examples?

Ideally, this would be an IDE feature.

The issue is that an IDE by default can only know the obvious places to do this, such as __all__.

Thus the need for a standard way to hint the IDE.

Mechanics:

  1. It wouldn’t evaluate the expression.
  2. It would parse the expression for valid syntax

So there would be a minimal correctness check by the parser.
But yes, it would not do the same amount of validation as if the expression were evaluated.

Possible solutions:

name = <keyword> expr
name = f'{=expr}'        # This was my proposal some time ago
name = nameof(var)       # This is C# solution (which is only for variable name)
name = e'expr'           # another string type...
name = f'{expr!e}'

I like last one most:

  1. Doesn’t introduce new keyword
  2. Works with arbitrary expression (as opposed to nameof)
  3. Does not introduce new string type (too many of those already, reserving whole string type for a feature like this is too much IMO)
  4. Localises it nicely to be part of f-strings as opposed to something completely new (I don’t think it deserves its own thing - string type, keyword, …)
  5. Can be used as part of f-strings in combination to all of the other nice features of it

What is the goal in this thread? I’ve read it all, some of it twice, and I see a lot of solutions being batted around, but I still feel very unclear about what the problem is.

Classically, this might be called an XY Problem, but I actually saw several problems stated which seem somewhat disparate.

One thing which has been mentioned a few times is deferred evaluation. Is this the primary goal here? If so, it would help to state it, since it’s a complex topic in its own right.

As a reader, I’m still unclear what problem is being addressed.

2 Likes

The OP does list 3 use cases: names in __all__, special namespace of metaclasses, and names in friendlier debugging messages.

Names in __all__ (and __slots__) are a legitimate use case, but they can be, and are, special-cased by all linters, so they aren’t a real problem.

Special namespace of metaclasses is needed rather rarely so it isn’t a problem big enough to justify this new feature either.

Friendlier debugging messages can be produced with a function over an f-string or a t-string so it isn’t a problem either.
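For example, a small helper over the f-string debug syntax already covers the third use case (the helper name debug_msg is mine, for illustration):

```python
def debug_msg(pair: str) -> str:
    # pair is the output of an f-string like f"{2**5=}", i.e. "2**5=32":
    # split it back into the expression text and the repr of its value.
    exp, _, val = pair.partition("=")
    return f"The calculation '{exp}' resulted in the value {val}"

print(debug_msg(f"{2**5=}"))
# The calculation '2**5' resulted in the value 32
```

The expression is still evaluated eagerly, but for debugging messages that is usually exactly what is wanted.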