When dropping support for older Python versions, I want to move to the 3.7+ module __getattr__ feature (PEP 562) instead of the existing hack: a module which replaces itself with a class instance in sys.modules. However, I ran into a problem, minimally reproducible like this:
# in foo/__init__.py
def __getattr__(name):
    print("__getattr__", name)
    if name == "bar":
        return 1
    raise AttributeError(name)
The __getattr__ hook is invoked twice:
>>> from foo import bar
__getattr__ bar
__getattr__ bar
In my app, that is problematic because the import statement may actually need to make a network request.
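A partial mitigation (a sketch based on the repro above; the globals() line is the only addition) would be to cache the computed value in the module’s namespace, so the expensive work runs at most once per name even if the hook fires again:

# in foo/__init__.py — caching variant (sketch)
def __getattr__(name):
    print("__getattr__", name)
    if name == "bar":
        value = 1  # stand-in for the expensive network request
        globals()[name] = value  # later lookups find it in the module dict
        return value
    raise AttributeError(name)

Since the second lookup then finds the attribute in the module dict, it should no longer reach __getattr__ at all. Still, it would be nicer to understand and avoid the double call in the first place.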
I don’t totally understand the reason for this double invocation, although I think it may be in importlib’s _handle_fromlist, which wants to check whether bar is perhaps a submodule (i.e. see whether foo/bar.py exists). The same double invocation doesn’t happen for a __getattr__ defined in a plain module like foo.py, only in a package like foo/__init__.py, and it doesn’t happen for direct attribute access, only for “from” imports.
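For comparison, direct attribute access on the package triggers the hook only once:

>>> import foo
>>> foo.bar
__getattr__ bar
1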
I’m currently using this workaround, adding an attribute deletion at the end of the __init__.py:

import sys  # if not already imported

del sys.modules[__name__].__path__
At this point, all relative imports have already been resolved, so it doesn’t seem like a __path__ attribute is needed anymore. Deleting it makes _handle_fromlist skip its submodule probing and avoids the double invocation. But it seems fragile, and I’m not sure it is safe to “pretend you’re not a package” in this way.
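With the deletion in place, the same from-import should invoke the hook only once:

>>> from foo import bar
__getattr__ bar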
Can anyone suggest a better approach? Did I miss something? Is the workaround of removing __path__ problematic; might it have undesirable side effects? Perhaps I should just go back to the old way, using a class with __getattr__ (sketched below), which is more complicated but at least does not get called twice.
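For reference, this is roughly what the old hack looks like (a from-memory sketch, not my exact code; the class name is a placeholder):

# in foo/__init__.py — pre-PEP 562 approach (sketch)
import sys

class _FooModule:
    def __init__(self, real_module):
        # keep the real module alive so its globals are not cleared
        self._real_module = real_module

    def __getattr__(self, name):
        print("__getattr__", name)
        if name == "bar":
            return 1
        raise AttributeError(name)

sys.modules[__name__] = _FooModule(sys.modules[__name__])

Because the replacement instance has no __path__, the from-import machinery never probes it for submodules, which is presumably why this version only gets called once.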
Finally, would this be considered a bug in importlib (it seems to me like it could be refactored to avoid the double call)?