This is an evolution of my idea “Pass module to exec()”. In summary: I think it would be better if Python did not use a dict object as the global namespace for exec/eval. Rather, it should use a module instance as the namespace (also known as the environment in other language implementations).
The change would affect a number of different things and so I’m struggling to flesh out the complete design. I should write a PEP eventually. I think the backwards compatibility impact can be kept quite low. Performance should not really be affected. I think this change makes things easier for alternative Python implementations. Having dict as the “uber” namespace object in Python has some nice properties but I think it overly constrains implementation choices. We can mostly have our cake and eat it too. E.g. we can still allow dict to be used as namespaces (pass to exec()) but internally not require that namespaces are actually dicts.
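For concreteness, here is how today's exec() API compares with what the proposal would allow. This is only a sketch: the module-backed form is hypothetical, so it is approximated here by passing a module's `__dict__` (the module name `"scratch"` is an arbitrary example):

```python
import types

# Today: exec() takes a plain dict as the global namespace.
g = {}
exec("x = 1", g)
assert g["x"] == 1

# The proposal keeps dicts working, but internally the namespace would
# be a module rather than a bare dict. We can approximate that today by
# executing into a module's __dict__:
m = types.ModuleType("scratch")
exec("y = 2", m.__dict__)
assert m.y == 2
```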
Here is a description of my prototype implementation:
Replace `func_globals` on function objects with `func_namespace`. Rather than pointing to the globals dict, we point to the module. I don't call it `func_module` because we already use `__module__` as a string containing the module name. So, I use `__namespace__` as a property that returns the actual module object. Having this property avoids crawling into `sys.modules` to look up the module by name. There are a number of places within Python where we do that, and they could be eliminated.
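To illustrate why the property helps: today, recovering a function's defining module means going through `sys.modules` with the `__module__` string. A small sketch of the current approach (the helper name is mine; `__namespace__` itself does not exist in current CPython):

```python
import sys

def get_defining_module(func):
    # Current approach: crawl sys.modules by name. This can return None
    # if the module was removed from sys.modules, or return the wrong
    # module if the name was rebound.
    return sys.modules.get(func.__module__)

def example():
    pass

mod = get_defining_module(example)
assert mod is sys.modules[example.__module__]

# Under the proposal, the whole lookup would collapse to:
#   mod = example.__namespace__
```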
For backwards compatibility, functions still have a `__globals__` property. It returns the dict from the module. On frame objects, `f_globals` is replaced with `f_namespace`, which is a reference to the module object. This change requires some surgery to ceval but the performance impact seems negligible.
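The backwards-compatibility story for frames can be pictured in Python terms. This is a sketch of the intent only, not the C implementation: the frame stores the module, and the legacy dict view is derived from it.

```python
import types

class FrameShim:
    # Hypothetical model of the proposed frame layout: the frame holds
    # f_namespace (a module object); f_globals becomes a derived
    # property that still hands legacy code the familiar dict.
    def __init__(self, namespace: types.ModuleType):
        self.f_namespace = namespace

    @property
    def f_globals(self):
        return self.f_namespace.__dict__

demo = types.ModuleType("demo")
demo.answer = 42
frame = FrameShim(demo)
assert frame.f_globals["answer"] == 42
```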
This does mean that if we want to exec() a code object, we need to have a module for it, not a dict as globals. I address this in two ways. First, the globals of existing modules contain a `__module_weakref__` entry that points back to the module. So, if we are given the dict, we can get back to the proper module. That entry is an internal implementation detail, like `__builtins__`. Other Python implementations might not have it. Second, for dicts that don't have `__module_weakref__`, I create an anonymous module to wrap them. I think the performance impact of this allocation should be okay since using exec() that way is not fast anyhow.
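The two-way dispatch above can be sketched in Python. The `__module_weakref__` name is from the proposal; the helper itself is hypothetical, and a real implementation would wrap the dict rather than copy it:

```python
import types
import weakref

def namespace_for_exec(mapping):
    # Hypothetical helper sketching the proposed dispatch. A dict that
    # came from a real module carries __module_weakref__ pointing back
    # at it; any other dict gets wrapped in an anonymous module.
    ref = mapping.get('__module_weakref__')
    if ref is not None:
        module = ref()
        if module is not None:
            return module
    module = types.ModuleType('<anonymous>')
    module.__dict__.update(mapping)  # real change would wrap, not copy
    return module

# A module whose globals carry the back-reference:
demo = types.ModuleType('demo')
demo.__dict__['__module_weakref__'] = weakref.ref(demo)
assert namespace_for_exec(demo.__dict__) is demo

# A plain dict gets an anonymous wrapper:
anon = namespace_for_exec({'x': 1})
assert anon.x == 1
```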
`f_builtins` is gone, and the behavior of builtins is slightly changed. When you create a module, the value of builtins is captured at that point. You can't replace `globals()['__builtins__']` and have ceval pick up that change. Personally, I think this is a cleaner design, and the amount of code affected should be very small.
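The capture-at-creation behavior can be sketched as follows. This is only an illustration of the intended semantics, expressed as a hypothetical module subclass; the real change would live in module creation and ceval:

```python
import builtins
import types

class CapturedBuiltinsModule(types.ModuleType):
    # Hypothetical sketch: under the proposal, builtins are captured
    # when the module is created rather than re-read from
    # globals()['__builtins__'] by ceval. Rebinding that entry later
    # would not be seen by running code.
    def __init__(self, name):
        super().__init__(name)
        self.__dict__['__builtins__'] = builtins.__dict__  # captured once

demo = CapturedBuiltinsModule('demo')
exec("result = len('abc')", demo.__dict__)
assert demo.result == 3
```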
There are a number of cleanups to `importlib` and related logic that could be done, but I haven't made those. Passing around modules rather than the dict for the module would be cleaner.
The code implementing this prototype is in my GitHub repo. I have all tests passing except `test_gdb`; that should be fixable.
I think this change could also lead to a cleaner implementation of PEP 573 – Module State Access from C Extension Methods. If all function objects have a `__namespace__` attribute that is the module object, it becomes easy to have a METH_X flag that makes that object be passed to the function implementation. I have a rough implementation of that idea, but it is not in good enough shape to share. I think the idea works.
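A pure-Python analogue may make the METH_X idea concrete. The decorator below is hypothetical and emulates `__namespace__` with a `sys.modules` lookup; under the proposal the runtime itself would pass the defining module:

```python
import functools
import sys

def pass_namespace(func):
    # Python-level analogue of the hypothetical METH_X flag: call the
    # wrapped function with its defining module as the first argument.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        namespace = sys.modules[func.__module__]  # stand-in for __namespace__
        return func(namespace, *args, **kwargs)
    return wrapper

@pass_namespace
def bump(mod, amount):
    # Mutate per-module state without a 'global' statement.
    mod._count = getattr(mod, '_count', 0) + amount
    return mod._count

assert bump(2) == 2
assert bump(3) == 5
```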
To me, having all functions refer to their containing modules (via `__namespace__`) is a nice design. Whether the function is implemented in Python or in C, that reference could exist. The proposal in PEP 573 takes what I consider a more roundabout solution (types referring to their containing module, methods referring to their types). The PEP 573 design looks like it works, but I think it makes the Python runtime model more complicated.