I have an issue with multiprocessing and I would appreciate some clarification.
Consider the following code:
import time
import concurrent.futures


def say_hello(proc_num):
    time.sleep(5)
    print("Hello World")
    time.sleep(5)


def main():
    with concurrent.futures.ProcessPoolExecutor() as executor:
        results = executor.map(say_hello, [1, 2, 3, 4, 5, 6, 7, 8])
        for result in results:
            pass


if __name__ == "__main__":
    main()
The above code creates 8 processes; each one waits 5 seconds, prints a "Hello World" message, then waits another 5 seconds. This is just a toy program, of course (the real project is far too large to include here), but it reflects the idea.
This program works perfectly well and runs as expected on an ordinary PC, a laptop with:
OS version: Windows 11
Memory: 16 GB RAM
Python version: 3.7 (64-bit)
import os; os.cpu_count() -------> 8
When I run the above program and open the Task Manager (CTRL+ALT+DELETE) right after, I can see the 8 created processes in the process list during the execution.
Now if I run the very same program on our datalab environment which is a far more powerful shared environment with the following characteristics:
OS: Windows Server
Memory: 320 GB RAM
Python version: 3.7.9
import os; os.cpu_count() -------> 64
Then I see a very strange behaviour: instead of 8 processes, dozens of processes are created, none of them does anything, and I get the following error message:
Exception in thread QueueManagerThread:
Traceback (most recent call last):
  File "C:\Program Files\Python37\lib\threading.py", line 926, in _bootstrap_inner
    self.run()
  File "C:\Program Files\Python37\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Program Files\Python37\lib\concurrent\futures\process.py", line 361, in _queue_management_worker
    ready = wait(readers + worker_sentinels)
  File "C:\Program Files\Python37\lib\multiprocessing\connection.py", line 869, in wait
    ready_handles = _exhaustive_wait(waithandle_to_obj.keys(), timeout)
  File "C:\Program Files\Python37\lib\multiprocessing\connection.py", line 801, in _exhaustive_wait
    res = _winapi.WaitForMultipleObjects(L, False, timeout)
ValueError: need at most 63 handles, got a sequence of length 63
After Googling this error message, it seems there is a known Python issue on Windows Server machines whose processors have many (logical) cores: the Windows API call WaitForMultipleObjects can wait on at most 64 handles at once. There is already an issue for it, #71090 on GitHub.
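If that diagnosis is right, one workaround sketch (my assumption, not something taken from the issue thread, although Python 3.8 later added a similar internal cap of 61 workers on Windows) is to cap max_workers explicitly below the 64-handle limit:

```python
import concurrent.futures
import os
import time


def say_hello(proc_num):
    # Same toy task as above, with the sleeps shortened for illustration.
    time.sleep(0.1)
    return f"Hello World from task {proc_num}"


def main():
    # Cap the pool below Windows' 64-handle wait limit; the executor itself
    # uses a few extra handles for bookkeeping, so 60 leaves headroom.
    # On a 64-core machine the default (os.cpu_count()) would exceed the limit.
    n_workers = min(60, os.cpu_count() or 1)
    with concurrent.futures.ProcessPoolExecutor(max_workers=n_workers) as executor:
        return list(executor.map(say_hello, [1, 2, 3, 4, 5, 6, 7, 8]))


if __name__ == "__main__":
    for line in main():
        print(line)
```

On the laptop this changes nothing (8 cores is well under the cap); on the 64-core server it should keep the handle count below the WaitForMultipleObjects limit.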
But I'm not sure whether I understood it properly. On that page, someone suggested using multiprocessing.Pool instead of concurrent.futures.ProcessPoolExecutor. I gave it a try, but the result was not good: the multiprocessing.Pool version was extremely slow compared to concurrent.futures.ProcessPoolExecutor, and among the created processes almost only one was working all the time while the others remained largely idle.
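For comparison, here is roughly what I would expect the multiprocessing.Pool version to look like (an assumption on my part, since the exact code tried isn't shown). When a Pool seems to keep one worker busy while the rest sit idle, an oversized chunksize in pool.map is a common cause, so pinning the worker count and passing chunksize=1 is worth trying:

```python
import multiprocessing
import time


def say_hello(proc_num):
    # Same toy task, with the sleeps shortened for illustration.
    time.sleep(0.1)
    return f"Hello World from task {proc_num}"


def main():
    # Pin the worker count explicitly instead of letting Pool default to
    # os.cpu_count() (64 on the server), and hand tasks out one at a time
    # so slow tasks are spread across workers rather than batched onto a few.
    with multiprocessing.Pool(processes=8) as pool:
        return pool.map(say_hello, [1, 2, 3, 4, 5, 6, 7, 8], chunksize=1)


if __name__ == "__main__":
    for line in main():
        print(line)
```

With 8 one-element chunks and 8 workers, every worker gets exactly one task, which is the even distribution the executor version showed.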
Therefore, I preferred to ask the question here, to see whether others have already encountered the same problem and what solutions they may have found to tackle it.
Thanks in advance