Module __getattr__ invoked twice - importlib bug?

While dropping support for older Python versions, I want to move to the module-level __getattr__ feature available in Python 3.7+ (PEP 562) instead of the existing hack: a module which replaces itself with a class instance in sys.modules. However, I ran into a problem, minimally reproducible like this:

# in foo/__init__.py 
def __getattr__(name):
    print("__getattr__", name)
    if name == "bar":
        return 1
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")

The __getattr__ hook is invoked twice:

>>> from foo import bar
__getattr__ bar
__getattr__ bar

In my app that is problematic because the import statement may actually need to make a network request.

I don’t totally understand the reason for this double invocation - although I think it may be in importlib’s _handle_fromlist where it wants to check whether bar is perhaps a submodule (i.e. see if foo/bar.py exists). The same double invocation doesn’t happen in a __getattr__ defined in a module like foo.py, only in a package like foo/__init__.py, and it doesn’t happen for direct attribute access - only for “from” imports.
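To make the suspected mechanism easier to check, here is a minimal sketch that builds a throwaway package on disk and counts how often its __getattr__ is asked for "bar" during a "from" import. The package name getattr_demo_pkg, the calls list and the helper count_bar_lookups are hypothetical names invented for this illustration, not part of the original setup. If the _handle_fromlist explanation is right, the count should come out as 2 for a package.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Source of a throwaway package whose __getattr__ records every lookup.
PACKAGE_SOURCE = textwrap.dedent('''
    calls = []

    def __getattr__(name):
        calls.append(name)
        if name == "bar":
            return 1
        raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
''')


def count_bar_lookups(package_name="getattr_demo_pkg"):
    """Run `from <package> import bar` and return (value, number of lookups)."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / package_name
        pkg_dir.mkdir()
        (pkg_dir / "__init__.py").write_text(PACKAGE_SOURCE)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            namespace = {}
            # Execute a real import statement so the fromlist handling runs.
            exec(f"from {package_name} import bar", namespace)
            module = sys.modules[package_name]
            return namespace["bar"], module.calls.count("bar")
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(tmp)
            sys.modules.pop(package_name, None)


value, lookups = count_bar_lookups()
print(f"bar = {value}, __getattr__('bar') was called {lookups} time(s)")
```

Counting only the lookups for "bar" matters here, because the import machinery may also probe other names (such as __path__) on modules that lack them.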

I’m currently using this workaround, deleting the __path__ attribute at the end of the __init__.py:

import sys
del sys.modules[__name__].__path__

At this point, all relative imports have already been resolved, so it doesn’t seem like a __path__ attribute is needed anymore. This hack prevents _handle_fromlist being called and avoids the double invocation. But it seems fragile, and I’m not sure it is safe to “pretend you’re not a package” in this way.

Can anyone suggest a better approach? Did I miss something? Is the workaround of removing __path__ problematic, and might it have undesirable side effects? Perhaps I should just go back to the old way (using a class with __getattr__), which is more complicated but at least does not get called twice.

Finally, would this be considered a bug in importlib (it seems to me like it could be refactored to avoid the double call)?
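One alternative that avoids touching __path__, offered only as a sketch and not as a confirmed fix for this app, is to let __getattr__ store the computed value in the module namespace. Any later lookup then finds a real attribute and never reaches the hook, so the expensive work runs once even if the import machinery asks twice. The names _expensive_fetch, fetch_count and build_demo_module are hypothetical, and the fetch is a stand-in for the network request. This only helps when the fetched value can safely be reused.

```python
import types

# Module body using the caching idea; executed into a fresh module below.
MODULE_SOURCE = '''
fetch_count = 0

def _expensive_fetch(name):
    # Hypothetical stand-in for the network request.
    global fetch_count
    fetch_count += 1
    return 1

def __getattr__(name):
    if name == "bar":
        value = _expensive_fetch(name)
        # Store the result as a real module attribute so later lookups
        # find it directly and never reach __getattr__ again.
        globals()[name] = value
        return value
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
'''


def build_demo_module(name="cached_getattr_demo"):
    """Create an in-memory module whose __getattr__ caches its results."""
    module = types.ModuleType(name)
    exec(MODULE_SOURCE, module.__dict__)
    return module


demo = build_demo_module()
first = demo.bar   # reaches __getattr__ and triggers the fetch
second = demo.bar  # served from the module namespace
print(first, second, demo.fetch_count)
```

The two attribute accesses stand in for the two lookups performed during a "from" import. The trade-off is that the value is computed once per process, so this does not suit attributes that must be refreshed on every access.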

Why is it a bug that __getattr__ is called more than once?
Surely it is called every time a variable is accessed on the module?

Because it does not need to be called twice here; once would be sufficient (with some refactoring). The double call also means a package __getattr__ does not behave the same way as a module or class __getattr__.