Why does __name__ == '__main__' in my subprocess?

Hello all, just a quick one.

My understanding is that a) when the Python interpreter imports a module, the variable __name__ within the new module's namespace is set to the name of the imported module (minus the .py extension), and b) when a subprocess is created, a new interpreter instance is created, which then runs the original module.

I believe I’m right in thinking this is what the if __name__ == '__main__': conditional is for; in this case so the subprocess does not run the original module as a script when importing it, as it could cause recursive behaviour.

But I’ve obviously missed something, because when I start a subprocess using the default start method, fork, the value of __name__ is still '__main__', although it does not seem to execute the code protected by the if __name__... idiom. Example:

from multiprocessing import Process

def f():
    print(f'{__name__=}')
    
if __name__ == '__main__':
    print(f'{__name__}')
    Process(target=f).start()

I’m very confused and would appreciate any help given :slightly_smiling_face:

When I run that code, I get this output:

True
__main__
__name__='__mp_main__'

This might be because of a difference in OS. POSIX systems other than macOS default to the fork start method, whereas Windows and macOS default to spawn.

If I explicitly set the start method to either 'spawn' or 'forkserver' using set_start_method() then I get the same output as you do.
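For anyone who wants to compare the start methods side by side, here is a minimal sketch. The helper names `_send_name` and `names_in_parent_and_child` are made up for this example, and it uses `get_context()` rather than `set_start_method()` so that several methods can be tried in one run. It assumes the script is run from a file, since 'spawn' and 'forkserver' need to re-import the main module.

```python
import multiprocessing as mp


def _send_name(conn):
    # Runs in the child: report the module-level __name__ as the child sees it.
    conn.send(__name__)
    conn.close()


def names_in_parent_and_child(method):
    """Return (__name__ in this process, __name__ in a child started with `method`)."""
    ctx = mp.get_context(method)
    recv_end, send_end = ctx.Pipe(duplex=False)
    child = ctx.Process(target=_send_name, args=(send_end,))
    child.start()
    # Wait a bounded time so a child that fails to start cannot hang the parent.
    child_name = recv_end.recv() if recv_end.poll(30) else None
    child.join()
    return __name__, child_name


if __name__ == "__main__":
    # Only the start methods available on this platform are tried.
    for method in mp.get_all_start_methods():
        parent_name, child_name = names_in_parent_and_child(method)
        print(f"{method}: parent={parent_name!r} child={child_name!r}")
```

Going by the outputs posted in this thread, I would expect the child to report '__main__' under 'fork' and '__mp_main__' under 'spawn' and 'forkserver', but run it on your own platform to check.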

Looks like one of the many pitfalls of fork, then :slight_smile:

haha, fork is doing his best. What I find confusing is that, based on my current understanding, when I run the code in my example the body of the if statement should also run in the subprocess, as it literally says if __name__ == '__main__':, and according to the output produced in the subprocess, that condition is true there :thinking:

But I must have got it all wrong. I’ve read somewhere online that the newly created interpreter imports the module from which it was created and then runs the callable.

Okay, so mid-post I did some more digging. For those interested, forget everything above and read on:

An adjustment to my train of thought: I’ve found an old Stack Overflow post that discusses what I thought was standard behaviour for all start methods. It’s almost 11 years old and relates to Python 2, but it describes how, because os.fork() is not available on Windows, a parent process can’t be forked on that platform, so a fresh interpreter instance is created instead. Consequently it has to load __main__ again, which makes it necessary for the original module to include the if __name__ == '__main__': clause to prevent an infinite-recursion type of situation.

I don’t know if os.fork() is available on Windows these days, but the current multiprocessing documentation says 'spawn' is still the default (and only) start method available on Windows. It also warns to protect the entry point of the main module using if __name__ == '__main__': when using 'spawn' or 'forkserver' as a start method.

So now I’m guessing that when using 'fork', the namespace for the module is the same in the subprocess interpreter as it is in the main one? And I’m also guessing that with 'fork' the subprocess interpreter doesn’t parse the entire, original module again? Either way it seems with fork, __name__ can still be '__main__' without causing any undesirable behaviour.

That doesn’t happen with fork(). Instead, you get a complete copy at the moment when the fork happens, without rerunning anything. This is normally a good thing (it’s very efficient!), but it can be confusing.

Yes. It’s a complete copy.
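To make the "complete copy, nothing rerun" point concrete, here is a small sketch. The names `state`, `_send_state` and `value_seen_by_child` are invented for illustration. The parent changes a module-level value inside the guard, after import time, and then asks a child what it sees. It assumes the script is run from a file so that the non-fork start methods can re-import it.

```python
import multiprocessing as mp

# Module-level state, assigned when the module body is executed.
state = {"value": "set at import time"}


def _send_state(conn):
    # Runs in the child: report the value this process currently holds.
    conn.send(state["value"])
    conn.close()


def value_seen_by_child(method):
    """Start a child with the given start method and return the value it sees."""
    ctx = mp.get_context(method)
    recv_end, send_end = ctx.Pipe(duplex=False)
    child = ctx.Process(target=_send_state, args=(send_end,))
    child.start()
    # Bounded wait so a child that fails to start cannot hang the parent.
    seen = recv_end.recv() if recv_end.poll(30) else None
    child.join()
    return seen


if __name__ == "__main__":
    # This assignment happens only in the process that runs the guarded block.
    state["value"] = "changed inside the __main__ guard"
    for method in mp.get_all_start_methods():
        print(f"{method}: child sees {value_seen_by_child(method)!r}")
```

With 'fork' the child should see the changed value, because it is a copy of the parent taken at the moment of the fork. With 'spawn' or 'forkserver' I would expect it to see the import-time value, since the module is loaded again and the guarded block is skipped.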

Ah I see, thanks. multiprocessing seems like a bit of a baffling module. Looks like there is a lot of nuance buried in the docs. I’ll just read the entire thing and identify the bits that are relevant to me.