Dear Python community,
I’ve run into limitations related AST processing decorators,
compile(), and nested functions-- I would really appreciate the opinion of experts here.
Context: I created a decorator that rewrites the AST of functions to facilitate certain operations in a tracing just-in-time compiler (it adds syntax sugar). The specifics of those transformations aren’t particularly relevant for the issue. The main point is that
inspect.getsource() to fetch the AST of the function
f, parse it, make a few modifcations, and then run
compile() to create a code object that is then attached to a new function object. It works beautifully.
This approach unfortunately runs into limitations when the decorators is selectively applied to a local function.
x = ...
While all steps of the AST transformation and re-compilation work fine, the inner function unfortunately loses its coupling with the parent function. For example, when
x, Python would normally create a closure cell to make this variable accessible in the sub-function. But this of course does not happen when
g is separately compiled.
(Naturally, I am aware that AST-transforming both
g at the same time would resolve the issue)
The issue as described above seems hopeless, but perhaps someone here has run into this and has an idea?
Yeah, I think that’s your only option.
I assume you have a pipeline something like:
new_code = compile(ast.dump(do_modifications(ast.parse(f))))
return types.FunctionType(new_code, globals())
In this case, you should be able to just provide the
closure argument to the function constructor:
| function(code, globals, name=None, argdefs=None, closure=None)
| Create a function object.
| a code object
| the globals dictionary
| a string that overrides the name from the code object
| a tuple that specifies the default argument values
| a tuple that supplies the bindings for free variables
Thus, fully fleshed out:
types.FunctionType(new_code, globals(), f.__name__, f.__defaults__, f.__closure__)
(Of course, this could fail in any number of ways, if the original closure is no longer suitable for the new code. But the closure object in question is just a tuple of
cell objects, which can be trivially constructed to wrap what they need to; and the
cell type is available as
types.CellType. So if you really need to do this kind of surgery, it isn’t easy, but it’s straightforward.)
Otherwise, how exactly are you creating the new function object and “attaching” the code object to it?
@kknechtel I saw that I can attach a closure cell when constructing the
FunctionType object. But what I don’t understand is: how can I get
compile()-generated code to access this provided closure object?
Ah, I see the issue. Right,
compile doesn’t know you’re doing to apply the closure, and will emit bytecodes that try to use locals instead. Theoretically, you can retrieve the actual raw bytecode data from the code object, parse it, replace the necessary opcodes (you know which accesses to replace because you can cross-reference
f.__code__.co_cellvars and reconstruct everything.
It won’t be easy, and it will potentially need to be redone for every bytecode version (which potentially changes every minor Python version). But it’s possible, and probably still easier than making your own bytecode compiler from scratch
It is possible to do this: don’t get just the source of the function, get the source of the entire file (or at least the top most compound statement), traverse it to find the correct node, modify it. Then compile the entire AST, and traverse the code objects in the co_const tables till you get the correct one and replace it inside of the function object you got instead of creating a new function object. Do a bit of sanity checking to make sure your AST edits didn’t introduce anything new, but it should work.