Inspired by a recent SO question, I’d like to ask for a more detailed explanation of a possible deadlock in asyncio.subprocess.Process.wait(). The function is documented in the usual way: “Wait for the child process to terminate.”
Documentation quote:
Note
This method can deadlock when using stdout=PIPE or stderr=PIPE and the child process generates so much output that it blocks waiting for the OS pipe buffer to accept more data. Use the communicate() method when using pipes to avoid this condition.
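For reference, here is a minimal sketch of what the note recommends. The child command is a hypothetical stand-in (a Python one-liner that writes about 1 MB to stdout, more than a typical OS pipe buffer holds); only create_subprocess_exec() and communicate() come from the documentation.

```python
import asyncio
import sys

# Hypothetical child for illustration: writes ~1 MB to stdout and exits.
CHILD_CODE = "import sys; sys.stdout.write('x' * 1_000_000)"


async def run_with_communicate():
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", CHILD_CODE,
        stdout=asyncio.subprocess.PIPE,
    )
    # communicate() keeps reading stdout until EOF and then waits for the
    # exit status, so the child never stays blocked on a full pipe.
    out, _ = await proc.communicate()
    return proc.returncode, len(out)


if __name__ == "__main__":
    print(asyncio.run(run_with_communicate()))
```

Replacing the communicate() call with a bare proc.wait() is the pattern the note warns about, because nothing would consume the child's output.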
My understanding was: when the executed process is blocked on a full pipe, it cannot finish its work and exit. In other words, the fact that the process does not terminate is what makes process.wait() block. wait() does the right thing; not reading the pipe (i.e. not communicating properly) is the primary cause.
However, process.wait() blocks even when a process in that state receives a signal and terminates. A full buffer in the process.stdout stream blocks process.wait(); draining the buffer unblocks it. I think that in this case wait() does not behave correctly: it should return once the process has exited, regardless of the buffer-full condition.
My untested assumption is that if you kill the process, wait() will return, since wait() is not interested in the stdin/stdout pipes.
As you describe, the issue is that the process writes to stdout and exits only after that write completes. But if the pipe is not read, the write never completes in the process and you wait forever.
The subprocess.Popen.communicate function exists in the sync world to fix this issue, and I assume it is used by subprocess.run.
I would assume that you can set up the async process so that its stdout/stderr are read while you also wait for the process to exit. That would fix the problem as well.
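A minimal sketch of that idea, assuming a hypothetical child (a Python one-liner that writes about 1 MB to stdout): the pipe is read and the exit is awaited concurrently with asyncio.gather(), so wait() never runs against an unread pipe.

```python
import asyncio
import sys

# Hypothetical child for illustration: writes ~1 MB to stdout and exits.
CHILD_CODE = "import sys; sys.stdout.write('x' * 1_000_000)"


async def read_while_waiting():
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", CHILD_CODE,
        stdout=asyncio.subprocess.PIPE,
    )
    # read() with no argument reads until EOF; it runs concurrently with
    # wait(), so the pipe is drained while we wait for the exit status.
    out, returncode = await asyncio.gather(proc.stdout.read(), proc.wait())
    return returncode, len(out)


if __name__ == "__main__":
    print(asyncio.run(read_while_waiting()))
```

This is roughly what communicate() does for you; doing it by hand mainly makes sense when the output has to be processed incrementally.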
My assumption was the same, but in reality it behaves the opposite way: wait() does not return when the process is killed (and the buffer is still full), which means it cares about the buffer being full even after the process is gone. This was the reason I posted here.
The output, with blank lines where there were delays, plus my comments:
-----
process started
wait() start
proc.returncode=None # process 'cat' exists
signal sent
proc.returncode=-15 # process 'cat' terminated by signal 15
# (and Python is aware of that)
reading pipe buffer # <-- without this the 'wait' blocks
wait() stop
-----
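A hedged sketch of the workaround the log points at: after signalling the process, drain the pipe while waiting for the exit status. This is not the program that produced the output above; the child here is a hypothetical Python one-liner that writes to stdout forever, and the 0.5 s sleep is an arbitrary delay to let the pipe fill up.

```python
import asyncio
import sys

# Hypothetical child for illustration: writes to stdout forever, so it soon
# blocks on a full pipe when nobody consumes the output.
NOISY_CHILD = "import sys\nwhile True:\n    sys.stdout.write('x' * 65536)"


async def terminate_and_drain():
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", NOISY_CHILD,
        stdout=asyncio.subprocess.PIPE,
    )
    await asyncio.sleep(0.5)  # arbitrary delay: let the child fill the pipe
    proc.terminate()          # sends SIGTERM on POSIX
    # Drain whatever is still buffered while waiting for the exit status,
    # so that wait() is never left pending on an unread, full buffer.
    leftover, returncode = await asyncio.gather(proc.stdout.read(), proc.wait())
    return returncode, len(leftover)


if __name__ == "__main__":
    print(asyncio.run(terminate_and_drain()))
```

On POSIX the return code should be the negative signal number, matching the proc.returncode=-15 line in the log.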