Now normally the execution would print a number every second, but here all the output is dumped to the terminal at once instead of line by line.
If I remove the stdout part, the normal execution pattern works, but then I don't have the output.
So if I pass stdout, I lose the execution pattern, and if I don't, I get the execution pattern I want but no output.
Is there a way I can have both? I have tried all sorts of GPT/Copilot suggestions and read the docs, but nothing helped.
Or maybe there is some other way instead of subprocess.Popen?
Your example code has some errors and isn’t runnable as-is.
The process’s communicate method waits for the subprocess to finish before returning the output.
To read the output while the subprocess is running, read from the process’s stdout:
import subprocess

process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True, text=True)

while True:
    line = process.stdout.readline()
    if not line:
        break
    print(line.rstrip(), flush=True)
Also, the output to stdout and stderr is usually buffered, so periodically flushing the output might be in order. Here I'm flushing each line as it's printed:
import time

for i in range(5):
    print(i, flush=True)
    time.sleep(1)
Thanks!
So I added an output list that captures each line just before the print statement, which works great.
But how do I catch errors? Say my process had errors; I would like the same error text we see in the terminal to end up in an error variable.
If I add err = process.stderr.readline(), it gives the same issue as before: all the output dumped at once.
Also, the provided code doesn't work when the child prints with print(i, flush=True, end=""): the whole line is printed at once instead of each number appearing one by one on the same line. Is there any way around this?
You have a choice. You can't read more than one stream without some sort of concurrency. You can either merge stderr into stdout and read only stdout, or read stdout and stderr concurrently.
To merge stderr into stdout, pass stderr=subprocess.STDOUT to your subprocess.Popen() call. You can then simply read process.stdout, print it, and store it:
stdout_buffer = []
for line in process.stdout:
    line = line.rstrip()
    print(line)
    stdout_buffer.append(line)
process.wait()
Important: you MUST wait() the process if you don’t use communicate()
Your other choice is to use concurrency. That's not trivial, so if you can avoid it by merging stderr into stdout, do that.
Your best bet, if you do want concurrency, is to look at the Python source code for the subprocess module. Look at the process.communicate and process._communicate methods. I took a look and I can tell you it uses the threading module and creates one thread each for stdout and stderr; there is a rough sketch of that approach below.
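For illustration only, here is a minimal sketch of that two-thread approach under my own assumptions: the cmd string, the drain helper, and the buffer names are placeholders, not code from this thread. Each thread drains one pipe line by line into its own list while the main thread waits.

import subprocess
import threading

cmd = "python -u some_script.py"   # hypothetical command, replace with your own

def drain(stream, buffer):
    # Read lines until the stream hits EOF, echoing and storing each one.
    for line in stream:
        line = line.rstrip()
        print(line, flush=True)
        buffer.append(line)

process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                           shell=True, text=True)

stdout_lines = []
stderr_lines = []
t_out = threading.Thread(target=drain, args=(process.stdout, stdout_lines))
t_err = threading.Thread(target=drain, args=(process.stderr, stderr_lines))
t_out.start()
t_err.start()
t_out.join()
t_err.join()
process.wait()   # still wait() since communicate() isn't used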
Yeah, I'll use concurrency. Somehow I managed to make it work on Mac, but when I tested the same code on Windows, bam, error.
I changed it to make it work on Windows and, bam, output dump on Mac. But at least I know concurrency is the way to do it.
That's because @MRAB's code used readline(), which reads an entire line before returning. You could use stdout.read(n) (for some n > 0) to collect data as it arrives. I'm unsure how well that works if the stdout stream is presented as text, because that involves a decode step in the I/O layer; I'd be more confident reading bytes and doing the decode myself.
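As a sketch of that suggestion (my own assumptions: the cmd string and the chunk size are arbitrary), this reads raw bytes in small chunks and decodes them incrementally. I'm using read1() here because, on the buffered byte stream, it returns as soon as any data is available instead of waiting to fill the full count.

import codecs
import subprocess

cmd = "python -u some_script.py"   # hypothetical command, replace with your own

# No text=True, so process.stdout yields raw bytes; stderr is merged into stdout.
process = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                           stderr=subprocess.STDOUT, shell=True)

decoder = codecs.getincrementaldecoder("utf-8")()
while True:
    chunk = process.stdout.read1(1024)   # returns as soon as some bytes arrive
    if not chunk:                        # empty bytes object means EOF
        break
    print(decoder.decode(chunk), end="", flush=True)
print(decoder.decode(b"", final=True), end="", flush=True)
process.wait()

This also covers the print(i, flush=True, end="") case, because nothing here waits for a newline before passing data on.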