`multiprocessing` 'spawn' treatment of `sys.meta_path`

With the 'fork' multiprocessing start method, finders that have been added to sys.meta_path are carried over to the child process, but with the 'spawn' start method they are not.

For instance, the following code:

import multiprocessing
import sys
from concurrent.futures import ProcessPoolExecutor
from importlib.abc import MetaPathFinder


class MyFinder(MetaPathFinder):
    def find_spec(self, fullname, path, target=None):
        return None


def worker_function(message):
    print(message, sys.meta_path[0])
    print(message, sys.path[0])


if __name__ == "__main__":
    sys.meta_path.insert(0, MyFinder())
    sys.path.insert(0, "example-sys-path")
    print("parent", sys.meta_path[0])
    print("parent", sys.path[0])

    for method in ["fork", "spawn"]:
        ctx = multiprocessing.get_context(method)
        with ProcessPoolExecutor(max_workers=1, mp_context=ctx) as executor:
            executor.map(worker_function, [method])

prints the following when run with Python 3.13.1:

parent <__main__.MyFinder object at 0x102e62e40>
parent example-sys-path
fork <__main__.MyFinder object at 0x102e62e40>
fork example-sys-path
spawn <class '_frozen_importlib.BuiltinImporter'>
spawn example-sys-path

We see that the sys.path entry is carried over in both cases, but the sys.meta_path entry is only carried over in the 'fork' case.

The impact is that imports which rely on a custom finder are not handled properly in child processes started with 'spawn'.

Some relevant sys.path logic in multiprocessing in the 'spawn' case is present here.

I believe the multiprocessing 'spawn' context should also at least attempt to include sys.meta_path finders.

This is because you set up the meta path entry in an if __name__ == "__main__" block. The multiprocessing documentation explains the limitations of the spawn start method, and one of them is that the program’s main module is imported before the target function is run. When you import a module, the if __name__ == "__main__" block isn’t run.

So this is the expected behaviour.
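To make the re-import visible, here is a minimal sketch, assuming it is saved to a file and run as a script. The names `MODULE_NAME_AT_IMPORT` and `report_module_name` are made up for illustration. Each process records the name its copy of the main module was imported under; on CPython the spawned child should report `__mp_main__`, which is why the `if __name__ == "__main__"` block is skipped there.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor

# Captured at import time, so each process records the name under which
# its own copy of this module was executed.
MODULE_NAME_AT_IMPORT = __name__


def report_module_name(_):
    # Runs in the worker and returns the value captured in that process.
    return MODULE_NAME_AT_IMPORT


if __name__ == "__main__":
    # Only the parent gets here; the spawned child imports this file
    # under a different module name, so the guard is false for it.
    ctx = multiprocessing.get_context("spawn")
    with ProcessPoolExecutor(max_workers=1, mp_context=ctx) as executor:
        child_name = list(executor.map(report_module_name, [None]))[0]
    print("parent:", MODULE_NAME_AT_IMPORT)
    print("child: ", child_name)
```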

You can fix this yourself by moving the sys.meta_path logic outside of the if block.
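A sketch of that change, reusing `MyFinder` from the original example. The `install_finder` helper and its duplicate check are my own additions, not something multiprocessing requires. Because the call sits at module level, it also runs when the spawned child re-imports the main module.

```python
import multiprocessing
import sys
from concurrent.futures import ProcessPoolExecutor
from importlib.abc import MetaPathFinder


class MyFinder(MetaPathFinder):
    def find_spec(self, fullname, path, target=None):
        return None


def install_finder():
    # Hypothetical helper: skip the insert if a MyFinder is already present,
    # so importing this module more than once does not stack finders.
    if not any(isinstance(finder, MyFinder) for finder in sys.meta_path):
        sys.meta_path.insert(0, MyFinder())


# Module level, outside the __main__ guard: runs in the parent and again
# when a spawned child imports this file.
install_finder()


def worker_function(message):
    return f"{message} {type(sys.meta_path[0]).__name__}"


if __name__ == "__main__":
    ctx = multiprocessing.get_context("spawn")
    with ProcessPoolExecutor(max_workers=1, mp_context=ctx) as executor:
        for line in executor.map(worker_function, ["spawn"]):
            print(line)
```

With this layout the 'spawn' worker should see a `MyFinder` at the front of sys.meta_path, the same as the 'fork' worker does.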


> You can fix this yourself by moving the sys.meta_path logic outside of the if block.

I agree that this works as a workaround, but it feels strange to me that multiprocessing has special logic for handling sys.path even when the change is guarded by the if __name__ == "__main__" block, yet has no special logic for handling sys.meta_path in the same circumstance.

Additionally, there are other use cases where it is not convenient to handle sys.meta_path in a special way in multiprocessing 'spawn' contexts, for example in IPython notebooks or in interactive mode.
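One option that does not depend on where the setup sits in the main module is the `initializer` argument of `ProcessPoolExecutor`, which runs a callable once in each worker before any tasks. The sketch below is written as a script and `install_finder` is a hypothetical helper. Under 'spawn' the initializer and the worker function are pickled by reference, so in a notebook or interactive session they would presumably have to live in an importable module rather than in a cell.

```python
import multiprocessing
import sys
from concurrent.futures import ProcessPoolExecutor
from importlib.abc import MetaPathFinder


class MyFinder(MetaPathFinder):
    def find_spec(self, fullname, path, target=None):
        return None


def install_finder():
    # Hypothetical helper, run once per worker process by the executor.
    sys.meta_path.insert(0, MyFinder())


def worker_function(message):
    return f"{message} {type(sys.meta_path[0]).__name__}"


if __name__ == "__main__":
    ctx = multiprocessing.get_context("spawn")
    with ProcessPoolExecutor(
        max_workers=1,
        mp_context=ctx,
        initializer=install_finder,  # executed in the child before tasks
    ) as executor:
        for line in executor.map(worker_function, ["spawn"]):
            print(line)
```

This keeps the parent's sys.meta_path untouched and makes the per-worker setup explicit, at the cost of having to pass the initializer to every pool you create.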