Proposed change to AST node constructors

Currently, the constructors for AST nodes accept arbitrary keyword arguments and don’t enforce any value:

>>> node=ast.FunctionDef(what="is this")
>>> node.what
'is this'
>>> node.name
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'FunctionDef' object has no attribute 'name'

Problems with the current situation:

  • To make a correct AST node, you have to pass every attribute, including ones that could sensibly default to an empty list or None. This is very tedious for AST nodes with a lot of optional fields, such as ast.arguments.
  • You cannot rely on attributes like name being present
  • It’s possible to create useless, broken AST nodes like the above. Such nodes are likely to cause errors when passed to e.g. ast.unparse or compile.
  • Introspection tools can’t easily tell what the signature of the constructor is supposed to be
  • Adding new fields to an AST node causes various compatibility issues (issue relating to PEP 695)

I implemented the following proposed solution (PR, issue):

  • Default all fields to some sensible value if they are not provided to the constructor: an empty list for list fields, and None for other fields.
  • Emit a DeprecationWarning if a field without a sensible default (e.g. FunctionDef.name) is not provided. In 3.15, this will raise an error.
  • Also emit a DeprecationWarning if the constructor is passed arguments that it doesn’t recognize (e.g., what above). In 3.15, this will raise an error.
  • Add a __text_signature__ to the AST classes indicating the expected signature.

I feel it’s clear that the proposed change improves usability, but there is a chance that people are relying on the existing behavior. If you maintain a tool that creates AST nodes, would this change be helpful to you?

11 Likes

While these are potentially just next steps and don’t have to happen now, wouldn’t it make sense to also block the “abstract” types from being instantiated (e.g. expr, lowercase) and to verify the types of passed in argument values? This would prevent invalid ASTs from being constructed, further reducing the potential errors.

I noticed the former edge cases because from what I can tell the _construct_ast_class relies on it.

1 Like

I’m in general agreement with Jelle’s improvements, except maybe “extra” attributes should be supported – in part because historically they have been, in part because I feel it’s a useful feature to be able to add metadata to an AST without having to create a parallel class hierarchy. (Wise users would use a naming scheme here that’s unlikely to clash with future evolution of the AST.)

Maybe it would make sense to have a dedicated metadata attribute on all nodes that can be anything, but still emit a warning if anything else is used. That would guarantee that there are no future conflicts.

It’s still possible on my branch to set new attributes after constructing the object:

>>> node=ast.FunctionDef("x", args=ast.arguments())
>>> node.metadata = 42
>>> node.metadata
42

Maybe that’s enough? We could also allow setting such attributes through the constructor, but then you lose protection against typos:

>>> node=ast.FunctionDef("x", args=ast.arguments(), typeparams=[ast.TypeVar("T")])
<stdin>-7:1: DeprecationWarning: FunctionDef.__init__ got an unexpected keyword argument 'typeparams'. Support for arbitrary keyword arguments is deprecated and will be removed in Python 3.15.
  node=ast.FunctionDef("x", args=ast.arguments(), typeparams=[ast.TypeVar("T")])
>>> node.type_params
[]
>>> node.typeparams
[<ast.TypeVar object at 0x101858b00>]
4 Likes

I think prohibiting instantiation of the “abstract” types makes sense, but I don’t think I’d do type checking of all arguments passed to the constructor. That’s generally not the approach Python takes for this kind of thing, and feels like it would add some needless complexity.

1 Like

Maybe I am wrong there, but isn’t this a slightly special case because duck typing isn’t allowed here? I would compare this to something like FunctionType, which does do verification of the passed in arguments. Checking the arguments when calling the constructor shifts the verification that has to happen anyway when calling compile to an earlier point in the process, closer to where the actual mistake happened.

Yeah, that sounds like a good compromise. Constraining the constructor arguments makes sense.

1 Like

A compiler I maintain produces Python ASTs (and deconstructs them with match), and these changes would indeed help simplify our code and catch some easy-to-make mistakes. Thanks Jelle!

6 Likes

Given the positive feedback here, I just merged the PR implementing this change. (And in the process I broke all the buildbots, but I’m hoping to be able to address that soon.)

I filed followup issues for three related possible improvements:

Let me know if you have feedback on any of these! I haven’t started working on any of these, but if feedback is positive i’ll try to get all three into Python 3.13.

6 Likes