Determining the chain of imports from a top-level module to a given module

When I do import mypackage.mymodule, something in the gargantuan import tree imports a certain third-party package that I do not want to be imported unless I explicitly call a function that depends on it. How can I find out what module is importing the third-party package and what module is importing that module all the way up to mypackage.mymodule?

  • Most of the modules involved are third-party dependencies outside of my control, so I can’t edit them.

  • The only thing close to a solution I’ve found is to run python -v -c 'import mypackage.mymodule', but the output from the -v switch is borderline garbage.

What you see in the -v output is each import attempt and the files that get loaded.
To solve your problem, look at the module that was imported just before the module you are concerned about.

That is likely to be where the import is coming from.

An alternative is to search the source code of all the modules for the import that concerns you.

Do you mean that you don’t have any access to the code on your system, or just that those modules aren’t part of your project? Because a simple, hacky way to do this is:

  • find the source of the 3rd-party package in your site-packages directory
  • modify its __init__ to raise an exception
  • import your module and read the stack trace

This won’t work if you can’t do the first step, e.g. if you don’t have access to the package’s code, but for most packages the actual Python source will be present in site-packages.
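For the first step, here is a minimal sketch of how one might locate the package's __init__ file without importing the package. It relies on importlib.util.find_spec, which for a top-level name only searches and does not execute the module. The helper name package_init_path is made up for illustration.

```python
import importlib.util


def package_init_path(name):
    """Return the file a top-level package would be loaded from, or None.

    `package_init_path` is a hypothetical helper, not part of any library.
    Built-in and frozen modules report a non-path origin such as "built-in".
    """
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        # Not installed, or a namespace package with no __init__ file.
        return None
    return spec.origin


# "json" is used only because it ships with Python; substitute the
# third-party package you are trying to track down.
print(package_init_path("json"))
```

The printed path is the file you would edit to add the raise.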

Alternatively, just uninstall it, and you’ll get an ImportError when someone tries to import it.


That might work if the import “depth” was indicated in the output, but it’s not. For example, the first occurrence of import 'pandas.... in the output comes right after import 're._compiler'. I’m assuming that re._compiler is at the end of an “import branch” and that pandas is actually being imported by something higher up, but I can’t tell what.

Another way is to set sys.modules["name.of.package"] to None in your initial startup script, before importing any other code. That will cause imports of that package to fail immediately, giving you a traceback.
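A small sketch of that trick, with "unwanted_pkg" as a placeholder name rather than a real package:

```python
import sys

# "unwanted_pkg" is a placeholder name; substitute the package you are hunting.
# This line has to run before anything else gets a chance to import it.
sys.modules["unwanted_pkg"] = None

try:
    import unwanted_pkg  # stands in for `import mypackage.mymodule`
except ImportError as exc:
    # Caught here only to keep the demo self-contained; in a real run you
    # would let it propagate so the traceback shows the importing modules.
    blocked_error = exc
    print(type(exc).__name__, exc)
```

Left uncaught, the traceback lists every module on the path from your top-level import down to the one that asked for the blocked package.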


I ended up doing sys.modules["package"] = None for most of the problem imports, but I had to resort to editing numpy/__init__.py to add a raise RuntimeError in order to catch one library that was doing try: import numpy; except ImportError: ....

Thanks for the help!

Getting much deeper into the import system, we can install a hook into sys.meta_path (which controls the ways that Python searches for modules and loads them, rather than simply the places it looks for absolute imports), like so:

import sys
# This is a *real* hack; you're meant to define a class following
# the API laid out in the `importlib.machinery` documentation.
# (It's not really clear to me whether you're meant to use the class
# itself with a `@classmethod` or `@staticmethod`, or instantiate
# it and put an instance in `sys.meta_path`...)
from types import SimpleNamespace

def reject_numpy(fullname, path, target=None):
    if fullname == 'numpy':
        # This way, we can customize the exception
        # so that it can be something that isn't caught by the library.
        raise RuntimeError('Importing NumPy is disallowed.')
    # by default, this falls through and returns `None`, which in turn
    # lets the other module-loaders on `sys.meta_path` have a go at it.

# Adapt it to the meta path finder protocol and put it before the defaults.
# (Only `SimpleNamespace` was imported above, so use it directly.)
sys.meta_path.insert(0, SimpleNamespace(find_spec=reject_numpy))

(This still needs to happen before the attempt to do the normal imports.)
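If you would rather not abort the import at all, a variation on the same idea is a finder that only records the call stack the first time a watched module is requested and then steps aside. This is a rough sketch, not a polished tool; ImportTracer and report are names invented here, and it subclasses importlib.abc.MetaPathFinder instead of using the SimpleNamespace shortcut.

```python
import importlib.abc
import sys
import traceback


class ImportTracer(importlib.abc.MetaPathFinder):
    """Hypothetical helper: remember who first imported each watched module."""

    def __init__(self, watched):
        self.watched = set(watched)
        self.stacks = {}

    def find_spec(self, fullname, path, target=None):
        if fullname in self.watched and fullname not in self.stacks:
            # Drop this frame and the importlib internals so that only
            # frames from real source files remain.
            frames = [
                frame
                for frame in traceback.extract_stack()[:-1]
                if not frame.filename.startswith("<frozen")
            ]
            self.stacks[fullname] = frames
        # Returning None lets the normal finders perform the actual import.
        return None


def report(tracer):
    """Print the recorded import chain for every watched module that was hit."""
    for name, frames in tracer.stacks.items():
        print(f"{name} was first imported via:")
        print("".join(traceback.format_list(frames)))
```

To use it, create the tracer with the names you care about, do sys.meta_path.insert(0, tracer) before importing mypackage.mymodule, and call report(tracer) afterwards. Like the hook above, it sees nothing for modules that are already in sys.modules.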

I usually look at the output of python -Ximporttime -c "import module" and then the indentation indicates which module is importing which other module (at least for the first time it is imported).
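Reading that indentation by eye gets tedious on a large tree, so here is a sketch that walks it for you. It assumes the layout I see from CPython: lines of the form "import time: self | cumulative | name", two spaces of indentation per nesting level, and each importing module printed after the modules it imported. The function names are made up, and the parsing may need adjusting if your Python version formats the output differently.

```python
import subprocess
import sys


def run_importtime(module):
    """Run `python -X importtime -c "import <module>"` and return its stderr."""
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
    )
    return result.stderr


def parse_importtime(stderr_text):
    """Return (depth, module_name) pairs in the order they were printed."""
    entries = []
    for line in stderr_text.splitlines():
        if not line.startswith("import time:"):
            continue
        name_field = line.split("|")[-1]
        name = name_field.strip()
        if name == "imported package":
            continue  # header row
        # One separator space follows the "|", then two spaces per level.
        indent = len(name_field) - len(name_field.lstrip(" ")) - 1
        entries.append((indent // 2, name))
    return entries


def import_chain(entries, target):
    """Return [target, its importer, that module's importer, ...]."""
    for index, (depth, name) in enumerate(entries):
        if name == target:
            break
    else:
        return []
    chain = [name]
    # Importers are printed later and with less indentation than what they import.
    for next_depth, next_name in entries[index + 1:]:
        if next_depth < depth:
            chain.append(next_name)
            depth = next_depth
            if depth == 0:
                break
    return chain
```

Something like import_chain(parse_importtime(run_importtime("mypackage.mymodule")), "pandas") would then give the chain for the first time pandas is imported.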
