Import Hook (Custom Loader) for __main__

I’m trying to implement an import hook / custom loading machinery that applies to as many modules as possible, including whichever is loaded as __main__. Specifically, I need to be able to control exec_module when loading __main__, but I haven’t been able to get anything working. Here are the scenarios I’m interested in:

  1. When invoking a script directly: python foo.py
  2. When invoking a script with runpy: python -m foo

Is the import hook machinery designed to be able to handle this case? I haven’t been able to find any documentation or discussions about it, and I’d really appreciate any advice or background knowledge. Thanks in advance!


Example

Python 3.13.4

directory structure:

.
├── bar.py
├── foo.py
└── sitecustomize.py

sitecustomize.py:

import sys
import importlib.abc
import importlib.machinery


class MyLoader(importlib.abc.Loader):
    def __init__(self, loader):
        self.loader = loader

    def __getattr__(self, name):
        return getattr(self.loader, name)

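    def create_module(self, spec):
        # Delegate so loaders that need custom module creation
        # (e.g. built-in and extension modules) keep working.
        return self.loader.create_module(spec)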

    def exec_module(self, module):
        print(f"MyLoader: loading {module.__name__}")
        return self.loader.exec_module(module)


class MyFinder(importlib.abc.MetaPathFinder):
    def find_spec(self, fullname, path, target=None):
        print(f"MyFinder: finding {fullname}")
        for finder in sys.meta_path:
            if finder is self:
                continue
            spec = finder.find_spec(fullname, path, target)
            if spec and spec.loader:
                spec.loader = MyLoader(spec.loader)
                return spec
        return None


sys.meta_path.insert(0, MyFinder())

foo.py:

print("in foo.py")
import bar

bar.py:

print("in bar.py")

Procedure

Run PYTHONPATH=. python foo.py. Observed output:

in foo.py
MyFinder: finding bar
MyLoader: loading bar
in bar.py

In this case, bar loading works as expected, but neither MyFinder nor MyLoader is invoked for foo.

Now run PYTHONPATH=. python -m foo. Observed output:

MyFinder: finding runpy
MyLoader: loading runpy
MyFinder: finding foo
in foo.py
MyFinder: finding bar
MyLoader: loading bar
in bar.py

This is slightly more interesting; MyFinder is invoked for foo, but not MyLoader.


I can answer at least part of your question: python foo.py does not go through the import machinery at all. foo.py is resolved as a file path and executed as __main__ without consulting sys.path or any import hooks. You won’t be able to wrap the file like that.
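
One way to observe the difference is to look at __main__.__spec__, which is None when a file is run directly and a real module spec when it is run with -m. The sketch below is only a probe under assumptions: the file name spec_probe.py and the helper main_has_no_spec are made up for illustration, and it sets PYTHONPATH for the child process so that -m can find the throwaway module.

```python
import os
import pathlib
import subprocess
import sys
import tempfile

# Body of the throwaway script: report whether __main__ has a spec.
PROBE = "print(__spec__ is None)\n"


def main_has_no_spec(as_module):
    """Run the probe as a script or with -m; True if __spec__ was None."""
    with tempfile.TemporaryDirectory() as tmp:
        pathlib.Path(tmp, "spec_probe.py").write_text(PROBE)
        if as_module:
            cmd = [sys.executable, "-m", "spec_probe"]
        else:
            cmd = [sys.executable, "spec_probe.py"]
        # PYTHONPATH makes the temp directory importable for the -m case.
        env = dict(os.environ, PYTHONPATH=tmp)
        done = subprocess.run(
            cmd, cwd=tmp, env=env, capture_output=True, text=True, check=True
        )
    # Use the last token in case startup hooks print anything first.
    return done.stdout.split()[-1] == "True"
```

Calling main_has_no_spec(False) runs the probe like python spec_probe.py, and main_has_no_spec(True) runs it like python -m spec_probe.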

Why MyLoader isn’t invoked in the latter case I am less sure about. I suspect runpy (the other module you can see being imported) special-cases this so that it can set the module name correctly (and maybe control some other aspects of execution). You should look at the source code of that module.


Thanks for the quick response, and that’s great to know for the python foo.py case. I’m taking a look at runpy, and it looks like this module intentionally circumvents most of the usual import machinery, including loader.exec_module. I’m still trying to determine why that would be desirable or necessary, but it gives me something to work with.
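
As far as I can tell from the CPython source, runpy finds the spec through the normal finders, then asks the spec's loader for a code object via get_code() and executes that code itself. If that reading is right, a wrapping loader can at least intercept get_code(). The following is a minimal sketch under that assumption; CodeHookLoader, CodeHookFinder, intercepted, run_demo and the module name hook_demo_mod are hypothetical names, and the finder is restricted to a set of names only to keep the demo from touching other imports.

```python
import importlib.abc
import pathlib
import runpy
import sys
import tempfile

intercepted = []  # module names whose code object was requested


class CodeHookLoader(importlib.abc.Loader):
    """Hypothetical wrapper that hooks get_code() as well as exec_module()."""

    def __init__(self, loader):
        self.loader = loader

    def __getattr__(self, name):
        return getattr(self.loader, name)

    def create_module(self, spec):
        return self.loader.create_module(spec)

    def exec_module(self, module):
        # Reached for ordinary imports.
        return self.loader.exec_module(module)

    def get_code(self, fullname):
        # Reached when runpy prepares a module for execution.
        intercepted.append(fullname)
        return self.loader.get_code(fullname)


class CodeHookFinder(importlib.abc.MetaPathFinder):
    def __init__(self, names):
        self.names = set(names)

    def find_spec(self, fullname, path, target=None):
        if fullname not in self.names:
            return None
        for finder in sys.meta_path:
            if finder is self:
                continue
            spec = finder.find_spec(fullname, path, target)
            if spec is not None and spec.loader is not None:
                spec.loader = CodeHookLoader(spec.loader)
                return spec
        return None


def run_demo():
    """Run a throwaway module through runpy, roughly as python -m would."""
    finder = CodeHookFinder({"hook_demo_mod"})
    with tempfile.TemporaryDirectory() as tmp:
        source = "seen_name = __name__\n"
        pathlib.Path(tmp, "hook_demo_mod.py").write_text(source)
        sys.path.insert(0, tmp)
        sys.meta_path.insert(0, finder)
        try:
            namespace = runpy.run_module("hook_demo_mod", run_name="__main__")
        finally:
            sys.meta_path.remove(finder)
            sys.path.remove(tmp)
    # The value of __name__ that the module body saw while running.
    return namespace["seen_name"]
```

After run_demo(), the intercepted list should contain "hook_demo_mod", showing that get_code() on the wrapper was reached even though exec_module() was not. This only gives control over the code object, not over how it is executed.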

In my Ideas library, I use

import argparse
from importlib import import_module
...
args = parser.parse_args()
module = import_module(args.source)

And then, instead of

python foo.py

I have

python -m ideas foo

Perhaps you could do something similar.

Note that this is very different from normal execution: most notably, foo is not executed with __name__ == '__main__', so this approach is incompatible with most scripts.

Actually, there is a way to make it compatible. In my ideas library, which uses import hooks to modify Python’s syntax, when a transformation is specified explicitly on the command line, the source module foo is executed with __name__ set to '__main__'.

Code of foo.py:

print(f"{__name__=}")

Execution with a sample transformation (one that allows writing function as an equivalent of def):

(venv) C:\Users\Andre\tmp
> py -m ideas -a function_keyword foo
__name__='__main__'

As you can see, foo’s __name__ is what one would want for the main Python source.

However, I just noticed that I forgot to do the same when no transformation is specified on the command line.

(venv) C:\Users\Andre\tmp
> py -m ideas foo
__name__='foo'

I should fix this …

Yes, but at that point you are just reimplementing runpy, which is fine, but not particularly interesting for this discussion.

I might be able to come up with a workaround using some of the strategies from ideas - thanks for introducing me to this library.
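
One possible shape for such a workaround is a small launcher that builds a spec named __main__ for the script and runs it through a wrapping loader's exec_module(). This is only a sketch, not how ideas or runpy do it; ExecHookLoader, run_path_as_main, on_exec and demo are hypothetical names, and it assumes the script is a plain source file.

```python
import importlib.abc
import importlib.machinery
import importlib.util
import pathlib
import sys
import tempfile


class ExecHookLoader(importlib.abc.Loader):
    """Hypothetical wrapper; on_exec is called before the module body runs."""

    def __init__(self, loader, on_exec):
        self.loader = loader
        self.on_exec = on_exec

    def __getattr__(self, name):
        return getattr(self.loader, name)

    def create_module(self, spec):
        return self.loader.create_module(spec)

    def exec_module(self, module):
        self.on_exec(module)
        return self.loader.exec_module(module)


def run_path_as_main(path, on_exec):
    """Execute the source file at path as __main__ through exec_module()."""
    # Name the inner loader "__main__" so its name check accepts the module.
    inner = importlib.machinery.SourceFileLoader("__main__", path)
    loader = ExecHookLoader(inner, on_exec)
    spec = importlib.util.spec_from_file_location("__main__", path, loader=loader)
    module = importlib.util.module_from_spec(spec)
    previous = sys.modules.get("__main__")
    sys.modules["__main__"] = module
    try:
        loader.exec_module(module)
    finally:
        # Restored only so the sketch is safe to call from other code;
        # a real launcher would probably leave the new __main__ in place.
        if previous is not None:
            sys.modules["__main__"] = previous
    return module


def demo():
    """Run a throwaway script and report what the hook and the script saw."""
    hooked = []
    with tempfile.TemporaryDirectory() as tmp:
        script = pathlib.Path(tmp, "script.py")
        script.write_text("seen_name = __name__\n")
        module = run_path_as_main(str(script), hooked.append)
    return module.seen_name, [m.__name__ for m in hooked]
```

Here the script sees __name__ == '__main__' and the wrapper's exec_module() runs before the script body. It does not reproduce everything runpy or the interpreter set up for __main__ (sys.argv handling, sys.path[0], packages), so treat it as a starting point only.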

It still seems like a somewhat curious design decision for __main__ not to use as much of the import machinery as possible, but I probably don’t have enough context to fully understand the benefits and drawbacks of such an approach.