Running Bat files in parallel

Rasmus · October 12, 2022, 12:28pm

Hey All

I am trying to run a large series of .bat files (1000+) in parallel, but I am unsure how to do this. My guess is to use Joblib's Parallel together with subprocess.Popen(). The problem is: how do I detect when a .bat file has finished executing, so that I can begin a new round of .bat files?

I am running my Python script on a Windows system. I hope you have some advice on how to solve this; please write if you need more information.

Best Rasmus

tjreedy · October 12, 2022, 4:05pm

Have you considered using multiprocessing and its pool and queue mechanisms?

barry-scott · October 12, 2022, 4:34pm

If you want to do this yourself then I would be using subprocess.run() not Popen.
I used concurrent.futures.ThreadPoolExecutor() for my tool that runs commands in parallel.

You usually know that the .bat file has completed because its process exits, at which point subprocess.run() returns.
It can get more complex if you are starting GUI programs on windows.
You can test this from the REPL with this:

import subprocess

subprocess.run(['your.bat'])
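
For reference, a minimal sketch of the ThreadPoolExecutor plus subprocess.run() combination described above. The helper names run_one and run_all, the worker count, and the stand-in commands are assumptions made for illustration; on Windows each command would be a list such as ["path\\to\\job.bat"] (a hypothetical path). Because subprocess.run() blocks until the child exits, each worker thread picks up the next command as soon as its current one finishes, so no manual batching should be needed.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_one(command):
    """Run one command, block until it exits, and return its exit code."""
    completed = subprocess.run(command)
    return completed.returncode


def run_all(commands, max_workers=6):
    """Run all commands with at most max_workers running at any time.

    Returns a dict mapping each command's index to its exit code.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything up front; the pool hands a queued command to a
        # worker thread as soon as that worker becomes free.
        futures = {pool.submit(run_one, cmd): i for i, cmd in enumerate(commands)}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results


if __name__ == "__main__":
    # Stand-in commands so the sketch runs on any platform (assumption:
    # real use would list the .bat files instead).
    demo = [[sys.executable, "-c", f"import sys; sys.exit({n % 2})"] for n in range(10)]
    print(run_all(demo, max_workers=3))
```

Threads are enough here because the Python side only waits on child processes; the actual work happens in the children.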

Rasmus · October 14, 2022, 8:36am

Hey, thanks for the tips. I will look into concurrent.futures.ThreadPoolExecutor() and subprocess.run(). I have been quite successful with the setup below, but I am still struggling with keeping the processors busy: right now it activates N=num_cpu processes and shuts them down when the process is finished. What I really want is to keep them running and start a new part of the sublist.

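    import os
    from joblib import Parallel, delayed

    def subP(batpath):
        # Run the command through the shell and wait for it to return
        os.system(batpath)

    list_example = [["a bat file.bat","a bat file.bat","a bat file.bat"],["a bat file.bat","a bat file.bat","a bat file.bat"]]

    num_cpu = 6

    # Each job receives one sublist; note that os.system() itself expects a single command string
    run = Parallel(n_jobs=num_cpu)(delayed(subP)(sublist) for sublist in list_example)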

Best Rasmus

barry-scott · October 14, 2022, 10:01pm

Using os.system() opens you up to security issues.
Recommend you use subprocess.run() with command as a list.
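
To illustrate the recommendation, here is a small sketch of passing the command as a list so that no shell parses the arguments. The helper name run_checked and the stand-in command are assumptions for illustration only. One caveat: on Windows a .bat file is still interpreted by cmd.exe, so the list form should not be treated as a complete defense when arguments come from untrusted input.

```python
import subprocess
import sys


def run_checked(command):
    """Run command (a list of arguments) with no intermediate shell.

    Returns the exit code and the captured standard output.
    """
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode, completed.stdout


if __name__ == "__main__":
    # The last argument contains shell metacharacters. Since no shell sees
    # it, it is delivered to the child process as one literal argument.
    code, out = run_checked(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", "a & b; echo oops"]
    )
    print(code, repr(out))
```

With os.system() the same text would be handed to the shell, which would treat `&` and `;` as command separators.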

I do not know what you mean by keep the processor alive.

Can you explain your goal in more detail?

Rasmus · October 17, 2022, 8:11am

Hey Barry,
The main goal is to run a series of bat files, each of which starts another program and runs some code. To achieve this, I am trying to use a parallel for loop.

My problem arises when I pass in a list with all my bat files:

    import subprocess
    from joblib import Parallel, delayed

    def subP(batpath):
        # Launch the bat file via cmd.exe's start command
        subprocess.run(["start", batpath], shell=True)

        # or
        # os.system(batpath)

    run = Parallel(n_jobs=num_cpu)(delayed(subP)(path) for path in simFiles)

The program then starts all the bat files at once, because the Python script does not recognize that a bat file is already running on a processor. To counter this, I split the list of bat files into smaller lists whose length equals the number of processors. This way, I could ensure that only six bat files start on the six processors. However, the problem is that the parallel function ends after running only six bat files, so I need to restart the parallel loop from a normal for loop or something similar.

And again thanks, hope this makes more sense

barry-scott · October 18, 2022, 5:45pm

start is a cmd.exe command to run a process in the background.
That does not sound like what you want to do.

See start /? for all its many options - but I do not think you need to use it at all.

I would have expected you only need:

subprocess.run([batpath])

Are you using concurrent.futures.ThreadPoolExecutor() as I suggested?
How is your Parallel implemented?

eryksun · October 20, 2022, 8:09pm

Generally you should use a command-line string with shell=True. On POSIX using an argument list with shell=True is generally wrong because the arguments are passed as parameters to the shell. On Windows it’s generally wrong because subprocess.list2cmdline() only supports argument quoting and escaping that matches WinAPI CommandLineToArgvW(), but the CMD shell uses different rules, and in general multiple rule sets may have to be supported (e.g. a complex pipeline).
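
A short sketch of the difference described above. The helper name shell_output is an assumption for illustration. With shell=True, a single string is handed to the shell as one command line; on POSIX, a list makes only the first item the command and passes the rest as parameters to the shell itself.

```python
import os
import subprocess


def shell_output(command):
    """Run command with shell=True and return its stripped stdout."""
    completed = subprocess.run(command, shell=True, capture_output=True, text=True)
    return completed.stdout.strip()


if __name__ == "__main__":
    # A single string is parsed by the shell as one command line.
    print(repr(shell_output("echo hello")))

    if os.name == "posix":
        # With a list, only "echo" is the command; "hello" becomes the
        # shell's $0, so echo has nothing to print.
        print(repr(shell_output(["echo", "hello"])))
```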

By default, CMD’s internal start command executes the command line in a new console session and a new console process group that ignores Ctrl+C. If the /B option is used, then it creates a new process group in the current session. The start command doesn’t wait unless /W is used, e.g. start "" /B /W "command".

In Python, the effect of the /B option is equivalent to calling subprocess.Popen with creationflags=subprocess.CREATE_NEW_PROCESS_GROUP. If you also want a new console session, add the flag subprocess.CREATE_NEW_CONSOLE, but generally I don’t recommend that.

If a process group is in the current console session, and only if it’s in the current console session, you can kill it via p.send_signal(signal.CTRL_BREAK_EVENT). Never call the latter if the process is not the lead process of a process group that’s attached to the current console session; the results can be buggy in ways that break the entire console session due to bad design in the session host (i.e. conhost.exe or openconsole.exe).
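
A hedged sketch of the Popen approach described above. The names NEW_GROUP, start_in_new_group and stop_group are assumptions for illustration. The Windows-only constants are looked up defensively so the sketch also runs elsewhere, where it falls back to no creation flag and to terminate(). The stop helper is meant only for processes started by this same script with the new-process-group flag in the current console session, as the warning above requires.

```python
import signal
import subprocess
import sys

# CREATE_NEW_PROCESS_GROUP exists only on Windows; 0 means "no flag" and
# is the only value Popen accepts on other platforms.
NEW_GROUP = getattr(subprocess, "CREATE_NEW_PROCESS_GROUP", 0)


def start_in_new_group(command):
    """Start command without waiting and return the Popen handle."""
    return subprocess.Popen(command, creationflags=NEW_GROUP)


def stop_group(proc):
    """Ask a process started by start_in_new_group() to stop."""
    if sys.platform == "win32":
        # Only valid because proc leads its own process group in the
        # current console session.
        proc.send_signal(signal.CTRL_BREAK_EVENT)
    else:
        # Non-Windows fallback so the sketch stays runnable everywhere.
        proc.terminate()


if __name__ == "__main__":
    child = start_in_new_group([sys.executable, "-c", "import time; time.sleep(30)"])
    print("running:", child.poll() is None)
    stop_group(child)
    print("exit code:", child.wait(timeout=10))
```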

FYI, subprocess uses WinAPI CreateProcessW() on Windows. If a filename has the extension “.BAT” or “.CMD”, CreateProcessW() executes it using the shell that’s set in the “ComSpec” environment variable, e.g. "%ComSpec%" /c "path\to\batch\file". If “ComSpec” isn’t set, the default value is “%SystemRoot%\System32\cmd.exe”.
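
To make that implicit step visible, here is a small sketch that builds the equivalent argument list by hand. The function name batch_command, the optional environ parameter, and the C:\Windows fallback for SystemRoot are assumptions for illustration; the sketch only constructs the list and does not reproduce the exact quoting that CreateProcessW applies.

```python
import os


def batch_command(batch_path, environ=None):
    """Return an argument list that runs batch_path through the ComSpec shell."""
    env = os.environ if environ is None else environ
    # Assumed fallback for SystemRoot, used only when the variable is missing.
    system_root = env.get("SystemRoot", r"C:\Windows")
    comspec = env.get("ComSpec", rf"{system_root}\System32\cmd.exe")
    return [comspec, "/c", batch_path]


if __name__ == "__main__":
    # Hypothetical batch file path, for demonstration only.
    print(batch_command(r"C:\jobs\simulation_001.bat"))
```

On Windows the resulting list could be passed to subprocess.run(), although passing the batch file path directly has the same effect according to the description above.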

HarryMansfield · June 7, 2023, 9:51am

Hey @Rasmus, did you end up with finalised Python code that worked for running a series of bat files simultaneously? I’m very much interested in the same thing. Thanks, Harry