Add optional parameter max_depth to ast.literal_eval

I propose adding an optional max_depth parameter to the ast.literal_eval function so that it can no longer be used to crash the Python interpreter.

The ast.literal_eval function covers a nice use case: safely parsing a literal expression in Python without having to worry about the security risks of eval or exec. The function itself is safe in most regards, but it can still be crashed by a large and complex enough expression (see the ast.literal_eval documentation).

This makes it practically unusable in most cases where anything larger than very small literals is expected (for very small literals, one could simply limit the length of the string passed in).

This is why I propose adding a max_depth parameter to ast.literal_eval that limits the nesting depth of the literal expression. If the limit is exceeded, the function raises a RecursionError instead of continuing to parse and causing the interpreter to crash.

The function signature would then look like this:

def literal_eval(node_or_string: ast.AST | str, max_depth: int | None = None) -> Any:
    ...

If given, max_depth must be an integer greater than 0; otherwise a ValueError is raised.
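To make the intended behaviour concrete, here is a rough sketch of how it could be approximated today in pure Python. The names literal_eval_limited and ast_depth_exceeds are made up for this example, and "depth" is assumed to mean the number of nested AST nodes, counting the Expression wrapper. Note that the check only runs after ast.parse has returned, so it does nothing about overflows inside the parser itself.

```python
import ast
from typing import Any


def ast_depth_exceeds(node: ast.AST, max_depth: int) -> bool:
    """Return True if the tree rooted at node is nested deeper than max_depth."""
    # Walk the tree with an explicit stack so the check itself never recurses.
    stack = [(node, 1)]
    while stack:
        current, depth = stack.pop()
        if depth > max_depth:
            return True
        for child in ast.iter_child_nodes(current):
            stack.append((child, depth + 1))
    return False


def literal_eval_limited(node_or_string, max_depth: int) -> Any:
    """Hypothetical wrapper: literal_eval with a limit on AST nesting depth."""
    if not isinstance(max_depth, int) or max_depth <= 0:
        raise ValueError("max_depth must be a positive integer")
    if isinstance(node_or_string, str):
        # ast.parse is still called on the full input; this sketch cannot
        # protect against a stack overflow that happens during parsing.
        node = ast.parse(node_or_string.lstrip(" \t"), mode="eval")
    else:
        node = node_or_string
    if ast_depth_exceeds(node, max_depth):
        raise RecursionError(f"literal is nested deeper than {max_depth} levels")
    return ast.literal_eval(node)
```

With this, literal_eval_limited("[1, [2, 3]]", 8) evaluates normally, while fifty nested lists with the same limit raise RecursionError before any evaluation happens.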

Alternatively, make parsing an expression in general raise a RecursionError instead of the interpreter crashing.

The part of ast.literal_eval() which is potentially unsafe is ast.parse(). Since some parts of the parser and of the code which converts the AST from its internal representation to Python objects are recursive, they consume the C stack and can potentially overflow it. There are several guards for this, and you get a RecursionError for too deep recursion:

>>> ast.parse('+0'*1000000)
Traceback (most recent call last):
  File "<python-input-6>", line 1, in <module>
    ast.parse('+0'*1000000)
    ~~~~~~~~~^^^^^^^^^^^^^^
  File "/home/serhiy/py/cpython/Lib/ast.py", line 46, in parse
    return compile(source, filename, mode, flags,
                   _feature_version=feature_version, optimize=optimize)
RecursionError: Stack overflow (used 8148 kB) during compilation

There is no way to guarantee that there will be no stack overflow in C; this guard relies on platform-specific tricks.

Specifying the maximal depth manually would not guarantee this either. It depends on how much of the C stack was already consumed and how much is consumed by every level of recursion. These details are far beyond the scope of interest and competency of the average Python user.
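When the guards do trigger, the failure is an ordinary exception that the caller can handle. A minimal sketch of that, assuming a made-up helper name try_literal_eval and an arbitrary length cap of 10,000 characters (neither comes from the ast module); it reports failures but, as said above, cannot guarantee that the C stack never overflows:

```python
import ast
from typing import Any


def try_literal_eval(text: str, max_length: int = 10_000) -> Any:
    """Hypothetical helper: evaluate a literal, turning failures into ValueError."""
    # Cheap first line of defence: refuse overly long input before parsing.
    if len(text) > max_length:
        raise ValueError(f"input longer than {max_length} characters")
    try:
        return ast.literal_eval(text)
    except (ValueError, TypeError, SyntaxError, MemoryError, RecursionError) as exc:
        # These are the exception types the ast.literal_eval documentation
        # lists for malformed input.
        raise ValueError(f"could not evaluate literal: {exc!r}") from exc
```

For example, try_literal_eval("{'a': [1, 2]}") returns the dict, while try_literal_eval("__import__('os')") raises ValueError because the input is not a literal.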

Obviously, setting the limit to 1 billion is not going to be safe, but being able to set some sensible limit (like 128) should be safe and, in my opinion, useful. The same is true of Python’s own recursion limit: set it to 1 billion and it becomes pretty useless too, as the interpreter will probably crash long before reaching it.

This would basically act as a short circuit as soon as some part of the parser reaches the specified depth.

I don’t believe this is the right tool here, and it should be more prominently noted that nothing in the ast module should be assumed safe on untrusted arbitrary inputs, even if you expect it to be a literal. The ast module is for parsing source code, not for handling untrusted input, and it isn’t designed with guarantees you would want for a parser of untrusted inputs.