ProcessPoolExecutor() with *max_workers* set to default crashes on Rocks 7.0 (Manzanita)

I’m using Python 3.9.19 on Rocks 7.0 (Manzanita), and when I do not manually set max_workers to a value less than 16, it crashes. I have 64 CPU cores. Could the issue be that it is simply using too much memory and then crashing?

While debugging I set the max_workers parameter to 2 and it worked fine, just too slowly, so using only 2 workers is not practical for me.

I am currently running it with 8 workers and it is working fine for now.
I tried using 16 workers, and that leads to the same issue.
The issue does not seem to be with CPython itself, because when I run it in the base conda environment it runs fine, at least until my breakpoint in the code. If I hit any other crashes on further testing, I will report them here.

“Default value of max_workers is changed to min(32, os.cpu_count() + 4).”

(Quoted from the official documentation)

In my case, a value of 16 or more for max_workers leads to the error message below.

Traceback (most recent call last):
  File "/data1/xyz/Displace2024_baseline/speaker_diarization/SHARC_check/wespeaker/diar/spectral_clusterer.py", line 264, in main
    for (subsegs, labels) in zip(subsegs_list,
  File "/data1/xyz/.conda/envs/wespeaker/lib/python3.9/concurrent/futures/process.py", line 562, in _chain_from_iterable_of_lists
    for element in iterable:
  File "/data1/xyz/.conda/envs/wespeaker/lib/python3.9/concurrent/futures/_base.py", line 609, in result_iterator
    yield fs.pop().result()
  File "/data1/xyz/.conda/envs/wespeaker/lib/python3.9/concurrent/futures/_base.py", line 446, in result
    return self.__get_result()
  File "/data1/xyz/.conda/envs/wespeaker/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.

Above is the error message I receive when I leave max_workers at its default, or set it to 16 or more, while using my conda environment. If needed, I can list the libraries installed in the environment.
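For context, the call pattern is roughly this. It is a simplified sketch, not the exact code from spectral_clusterer.py; cluster_one here is just a placeholder for the real per-recording work:

```python
from concurrent.futures import ProcessPoolExecutor

def cluster_one(subsegs):
    # placeholder for the real per-recording clustering work
    return ["spk1"] * len(subsegs)

def main(subsegs_list):
    # Crashes when max_workers is left at its default (or set to 16+),
    # works when it is set to a small value such as 2 or 8.
    with ProcessPoolExecutor(max_workers=8) as executor:
        for (subsegs, labels) in zip(subsegs_list,
                                     executor.map(cluster_one, subsegs_list)):
            print(len(subsegs), labels)

if __name__ == "__main__":
    main([[1, 2, 3], [4, 5]])
```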

When I logged my CPU usage, I got the following output:

Edit: Even in the base environment it leads to the same issue. The CPU usage in my log does not even cross 55%, so I really do not think it's due to CPU overloading.

2024-06-10 10:18:34,426 ERROR:Error occurred: A process in the process pool was terminated abruptly while the future was running or pending.
Traceback (most recent call last):
  File "/data1/xyz/Displace2024_baseline/speaker_diarization/SHARC_check/wespeaker/diar/spectral_clusterer.py", line 287, in main
    for (subsegs, labels) in zip(subsegs_list,
  File "/data1/xyz/.conda/envs/wespeaker/lib/python3.9/concurrent/futures/process.py", line 562, in _chain_from_iterable_of_lists
    for element in iterable:
  File "/data1/xyz/.conda/envs/wespeaker/lib/python3.9/concurrent/futures/_base.py", line 609, in result_iterator
    yield fs.pop().result()
  File "/data1/xyz/.conda/envs/wespeaker/lib/python3.9/concurrent/futures/_base.py", line 446, in result
    return self.__get_result()
  File "/data1/xyz/.conda/envs/wespeaker/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
2024-06-10 10:18:35,938 INFO:CPU usage: 50.2%

ProcessPoolExecutor might be using the spawn start method; that can require a lot of memory (roughly the amount used by one process, M, times the number of workers, N).
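If you want to rule that out, you can check which start method is in use and, on Linux, explicitly request fork, which shares memory copy-on-write instead of re-importing everything in each worker. A minimal sketch (some_function is just a stand-in for your real work):

```python
import multiprocessing as mp
from concurrent.futures import ProcessPoolExecutor

def some_function(x):
    return x * 2  # stand-in for the real per-item work

if __name__ == "__main__":
    # "fork" is the default on Linux; "spawn" starts each worker from a
    # fresh interpreter, so every worker re-imports your modules and data.
    print("start method:", mp.get_start_method())

    # The start method for the pool can be chosen explicitly via mp_context.
    ctx = mp.get_context("fork")
    with ProcessPoolExecutor(max_workers=16, mp_context=ctx) as executor:
        print(list(executor.map(some_function, range(8))))
```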

Do you also hit issues using the executor with a simple hello world, or only when using a big array?
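Something like this, with nothing heavy imported, would show whether the pool itself is broken on that machine (just a throwaway test script):

```python
from concurrent.futures import ProcessPoolExecutor

def hello(i):
    return f"hello from task {i}"

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=16) as executor:
        for line in executor.map(hello, range(64)):
            print(line)
```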

From your image, it looks like gigabytes are needed. Are you running out of RAM?
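You could log memory alongside the CPU numbers, for example with psutil (assuming it is available in your environment):

```python
import psutil

mem = psutil.virtual_memory()
print(f"RAM used: {mem.percent}% "
      f"({(mem.total - mem.available) / 1e9:.1f} of {mem.total / 1e9:.1f} GB)")
```

If a worker is being killed by the kernel's OOM killer, dmesg usually shows a line about it as well.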

I’ll try both of those things and let you know. I have not tried using the executor for a simple hello world yet, but I’ll try it out.

Have you tested this with any model other than multiprocessing? numpy is multithreading-friendly.

I do not understand exactly what you mean by “Have you tested this with any model other than multiprocessing?” Could you clarify?

The multiprocessing module is only one way to spread work over multiple CPU cores, and it’s the most isolated - which also means the most memory-hungry. Alternatives include the threading module (or ThreadPoolExecutor), and simply letting numpy itself handle the threads (write your own code as single-threaded). It’s entirely possible that one or both of these will do what you want; it’s also entirely possible that they won’t, but IMO it’s still worth testing.
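As a rough sketch of the first alternative, ThreadPoolExecutor has the same interface, so the swap is mechanical; cluster_one and subsegs_list below are placeholders for your own function and data, and this only helps if the heavy work releases the GIL (as most numpy operations do):

```python
from concurrent.futures import ThreadPoolExecutor

def cluster_one(subsegs):
    return subsegs  # placeholder for the real clustering work

subsegs_list = [[1, 2], [3, 4], [5, 6]]  # placeholder data

# Threads share memory, so there is no per-worker copy of the data
# and no re-import of libraries in child processes.
with ThreadPoolExecutor(max_workers=16) as executor:
    results = list(executor.map(cluster_one, subsegs_list))
    print(results)
```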


I am currently using ThreadPoolExecutor; I was thinking of trying out the multiprocessing module at a later time, because of the issue mentioned in my post, which I cannot seem to solve.

I am not running out of RAM. I have logged my RAM usage and it stays around 2.5 GB, and the system I am working on has hundreds of GB. I have also noticed the issue is intermittent: the program sometimes crashes with max_workers set to 8, but does not during another run with exactly the same parameters.