Why subprocess child process data is not being captured or shown

I have a Python child process which creates a dictionary and outputs it as JSON. process.stdout appears to be empty, so the data from the child process is not being captured.

json_str = json.dumps(data)  # serialize the dictionary to JSON
sys.stdout.write(json_str)
sys.stdout.write('\n')

The Python child process is invoked by a Python parent process; see the code below. I went old school and put in display statements to track the logic execution.
import json
import subprocess

file_path = "/content/Data_Structure.ipynb"

def main():
    command = ['python', file_path]

    # Start the child process
    process = subprocess.Popen(command, env=env,  # env is defined elsewhere in the script
                               stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                               encoding='utf-8', text=True)

    # Read and parse JSON data from the child process
    for line in process.stdout:
        try:
            print('Print made it to A')
            json_data = json.loads(line)
            print('Received data:', json_data)
        except json.JSONDecodeError:
            print('Invalid JSON: ', line)
            print('Made it to B')

    process.wait()  # Wait for the child process to finish

if __name__ == "__main__":
    main()

print('Made it to C')

Output:

Made it to C

Are your pipes mixed up? I use:



def _output_from_cmd(cmd: str) -> tuple[str, subprocess.CompletedProcess]:
    result = subprocess.run(
        cmd,
        stderr=subprocess.STDOUT,
        stdout=subprocess.PIPE,
    )
    output = result.stdout.decode(encoding="utf8")
    return output, result

Sometimes I need to set shell=True. Try replacing stdout.write with a simple print statement too. And if need be, look into buffer flushing.
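For example, a minimal sketch of what I mean by flushing, with a placeholder dictionary standing in for yours:

import json
import sys

data = {"example": "value"}  # placeholder for your dictionary

# print appends the newline and can flush immediately
print(json.dumps(data), flush=True)

# or, if you keep sys.stdout.write, flush explicitly afterwards
sys.stdout.write(json.dumps(data) + '\n')
sys.stdout.flush()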

I appreciate your reply. I tried your suggestion; however, there is still no output displayed.

If you're just passing python to it, I think shell=True is needed, as it must look for a python on the $PATH. Alternatively, just pass in sys.executable to use the same Python runtime for the subprocess. I've only been able to use shell=False (the default) on Windows, too.

I did a quick test. I created a subprocess, labeled #1, that simply lists the contents of a directory; it worked as expected. I rewrote the original subprocess and labeled it #2. It is still not working, as the variable "line" is empty.

import os
file_path = "/content/Data_Structure.ipynb"
print(os.getcwd())
print(os.chdir('/content'))
print('Basename: ', os.path.basename(file_path))
print('Directory: ', os.path.dirname(file_path))
print('Absolute Path:', os.path.isabs(file_path))
import subprocess

file_path = "/content/Data_Structure.ipynb"

#1
result = subprocess.run(["ls", "-l"], capture_output=True, text=True)
print('Output:', result.stdout)
print('Error:', result.stderr)
print('Return code:', result.returncode)

#1 worked as expected. The directory information is displayed and the return code is 0

#2

command = ['python', file_path]

# Start the child process
process = subprocess.Popen(command, env=env, stdout=subprocess.PIPE,
                           stderr=subprocess.STDOUT, shell=True, text=True)

# Read data from the child process
while True:
    line = process.stdout.readline()
    print('Line: ', line)
    if not line:
        break
    print('Output: ', line)  # Read data from the child process

#2 is not working. Line is empty.

The output should look like this:

{'table': 'client', 'weight': 'poids'…]]}

No, this is not the case.

[…]

I have a Python child process which creates a dictionary and outputs it as JSON. process.stdout appears to be empty, so the data from the child process is not being captured.

Since you get to “made it to C” we know:

  • your program runs to completion
  • there’s no invalid JSON in the standard output

But you do not look at stderr at all. I’d leave it alone (remove the
stderr= parameter so that it displays on your terminal).
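That is, something like this sketch of your Popen call with the stderr= parameter removed:

import subprocess

command = ['python', '/content/Data_Structure.ipynb']  # your original command
process = subprocess.Popen(command, stdout=subprocess.PIPE, text=True)
# no stderr= argument: whatever the child writes to stderr
# shows up directly on your terminal instead of being captured
for line in process.stdout:
    print('Received line:', line, end='')
process.wait()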

I do not believe you can run .ipynb files with the python executable -
you need Jupyter or the like for those.

1: Run python /content/Data_Structure.ipynb directly at the command line, look for errors.
2: Put your python from the .ipynb file into a .py file and run that instead.
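
For option 2, a minimal sketch of the parent, assuming the notebook code has been saved as /content/Data_Structure.py (a hypothetical path) and using sys.executable so the child runs under the same interpreter:

import json
import subprocess
import sys

child = '/content/Data_Structure.py'  # hypothetical .py copy of the notebook code

process = subprocess.Popen([sys.executable, child],
                           stdout=subprocess.PIPE, text=True)
for line in process.stdout:
    try:
        print('Received data:', json.loads(line))
    except json.JSONDecodeError:
        print('Invalid JSON:', line)
process.wait()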

Are your pipes mixed up? I use:


def _output_from_cmd(cmd: str) -> tuple[str, subprocess.CompletedProcess]:
   result = subprocess.run(
       cmd,
       stderr=subprocess.STDOUT,
       stdout=subprocess.PIPE,

This is almost always a bad idea - you do not want your error messages
mixed in with your standard output. Particularly in an example like
Vincent’s, where the output is expected to be clean JSON. Chucking error
messages and other junk into the output is called a “plug” in the
taxonomy of bugs.

Sometimes I need to set shell=True.

There are concrete reasons for doing this, and you should never just
toss it in “sometimes”; we discourage shell=True because of its
propensity for quoting issues. You shouldn’t do it without a reason.

No, this is not the case.

OK. What, exactly, is not the case? I'd love some help getting my tests working without shell=True.

This is almost always a bad idea

Sure. In my case I’m running tests of an external program via a CLI. I’m not running “assertRaises” style tests, so if there’s an error in stderr, it will indeed cause the combined stdout/stderr to be garbled. For my purposes, such garbling triggers the desired test failure.

Isn't a combined stdout/stderr exactly what the user sees from the terminal? What they see in the terminal is exactly what I want to test for.

It's a quick and dirty approach, and I could make the tests more precise, admittedly. But once I've got parametric or property-based tests up and running, and hitting the code under test, I don't strive for perfection in the test code (I do too much of that already). Quality is not necessarily better than quantity, in this case.

Regarding shell=True

you should never just toss it in “sometimes”

Absolutely. That's why I removed it from the code block you quoted: effectively getting the OP to try it first without it, and otherwise hoping to force them to look it up in the docs for themselves, figure out what I was talking about, and then either heed or ignore the security warnings, as they're free to choose.

You shouldn’t do it without a reason.

I’d very much like to avoid it. But I want to launch exactly the same test command as a developer would run, or exactly the same CLI command as a user would run.


No, this is not the case.

OK. What, exactly, is not the case? I'd love some help getting my tests working without shell=True.

You wrote “If you’re just passing python to it, I think shell=True
is needed, as it must look for a python on the $PATH.". The
subprocess stuff calls the *p versions of spawn or exec, which consult
the $PATH. The OP even showed an example of this invoking ls.

Here’s an example invoking pwd:

 >>> from subprocess import run
 >>> run(['pwd'])
 /Users/cameron
 CompletedProcess(args=['pwd'], returncode=0)

Thanks. It wasn’t a single word command with no whitespace, but I’ll look into why it didn’t work without shell=True for me.

Scratch that. They worked with 10 examples, but not with 250 (with shell=False on Mac and Linux; Windows is fine, using cmd, not pwsh).

I'd gotten FileNotFoundError: [Errno 2] No such file or directory: 'node arg1 arg2', but then, as well as setting shell=True, I'd stupidly made a second change in a later commit and forgotten about it (*), and mistakenly believed the former was the reason the tests started working.

Thanks though, Cameron.

(*) quoting the args to the CLI (some contained whitespace). Ironically, calling subprocess.run with a list instead of a string would’ve handled this automatically.
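
i.e., something like this sketch ('node', the script name, and the args are stand-ins for my real CLI):

import subprocess

# each list element is passed to the child as one argument,
# so whitespace inside an argument needs no quoting at all
result = subprocess.run(
    ['node', 'script.js', 'an arg with   spaces'],  # stand-in command and args
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    text=True,
)
print(result.stdout)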

A “command with whitespace” sounds to me like a string containing a
command. Parsing that string is what a shell does.

But you can just:

 run(["echo","this","and","that","...     and the other  !"])

without shell=True.

This:

 run("echo this and that '...    and the other   !'", shell=True)

is the same as:

 run(['/bin/sh', '-c', "echo this and that '...    and the other   !'"])

and /bin/sh parses the shell command you supply. (On UNIX. The route
on Windows is a bit different.)

But the first argument in the [executable,....] stuff does not need to
use an absolute path - it will consult $PATH from the environment.

I know - thanks. subprocess.run(…) does a whole bunch of other parsing that I wish to avoid. I want to use a string that I can copy and paste into a shell, and reproduce the results.

MRE on a fresh Ubuntu 24.04 VPS (a third-party rebuild of Python 3.12.3):

sudo apt update
sudo apt install python3.12-venv
python3 -m venv venv
. venv/bin/activate
python
Python 3.12.3 (main, Sep 11 2024, 14:17:37) [GCC 13.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import subprocess
>>> import sys
>>> res = subprocess.run(f"{sys.executable} -X utf8 -c 'import sys; print(sys.argv[1:])' hello world",shell=True)
['hello', 'world']
>>> res = subprocess.run(f"{sys.executable} -X utf8 -c 'import sys; print(sys.argv[1:])' hello world")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.12/subprocess.py", line 548, in run
    with Popen(*popenargs, **kwargs) as process:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/subprocess.py", line 1026, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib/python3.12/subprocess.py", line 1955, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: "/root/venv/bin/python -X utf8 -c 'import sys; print(sys.argv[1:])' hello world"
>>> exit()
(venv) root@ubuntu-4gb-fsn1-1:~# /root/venv/bin/python -X utf8 -c 'import sys; print(sys.argv[1:])' hello world
['hello', 'world']

You wrote:

 >>> res = subprocess.run(f"{sys.executable} -X utf8 -c 'import sys; print(sys.argv[1:])' hello world",shell=True)
 ['hello', 'world']

This passes that string to /bin/sh for interpretation.

 >>> res = subprocess.run(f"{sys.executable} -X utf8 -c 'import sys; print(sys.argv[1:])' hello world")
 Traceback (most recent call last):
 [...]
 FileNotFoundError: [Errno 2] No such file or directory: "/root/venv/bin/python -X utf8 -c 'import sys; print(sys.argv[1:])' hello world"

Here, shell=True is not specified, so the leading args parameter is an executable name. From the docs:

 On POSIX, if args is a string, the string is interpreted as the name 
 or path of the program to execute. However, this can only be done if 
 not passing arguments to the program.

So it’s trying to find an executable with the name "/root/venv/bin/python -X utf8 -c 'import sys; print(sys.argv[1:])' hello world".

Neither of these uses the shell as an intermediary.
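
To illustrate that docs passage, a quick sketch at the REPL:

 >>> from subprocess import run
 >>> run("pwd")        # a string with no arguments: fine without shell=True
 /Users/cameron
 CompletedProcess(args='pwd', returncode=0)
 >>> run("pwd -P")     # a string with arguments: treated as one file name
 Traceback (most recent call last):
 [...]
 FileNotFoundError: [Errno 2] No such file or directory: 'pwd -P'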

Oh cool, thank you - that's really helpful. I have read the docs… …some time ago. Until now I was a little mystified by the FileNotFoundError. You've explained it perfectly - nice one :slight_smile:

TBF, I did not know this “single string with shell=False” mode existed. I discovered it from you.
