The documentation says: “If the process object is garbage collected while the process is still running, the child process will be killed.”
However, I cannot get this to work. For example, with the following sub.py script
from time import sleep

for i in range(0, 20):
    print(f"Step {i}")
    sleep(1)
and the following main.py
import asyncio
import gc
import sys

async def main():
    process = await asyncio.create_subprocess_exec(sys.executable, "sub.py")
    await asyncio.sleep(3)
    del process
    gc.collect()

if __name__ == "__main__":
    asyncio.run(main())
the “Step” output keeps going after main.py has finished. Calling process.kill() manually works, of course.
This is fine, I can call kill or terminate manually – I would prefer to, actually. However, I would like to understand the circumstances under which I need to expect the subprocess to be killed by garbage collection, so it does not come back to bite me.
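For reference, the manual cleanup I have in mind looks roughly like this (a sketch, reusing the same sub.py, with terminate() and wait() added at the end):

import asyncio
import sys

async def main():
    process = await asyncio.create_subprocess_exec(sys.executable, "sub.py")
    await asyncio.sleep(3)
    process.terminate()   # ask the child to exit; process.kill() would be the harsher variant
    await process.wait()  # reap the child so it does not linger as a zombie

if __name__ == "__main__":
    asyncio.run(main())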
In main, the subprocess coroutine or task is only being killed after it’s already been awaited (i.e. finished), not while it’s already running. And moreover, only after a further three seconds.
Perhaps swap asyncio.run for asyncio.gather, and call that on both main and process (create process outside main, e.g. in the if __name__ == "__main__": guard clause).
Sorry, I don’t quite understand what you’re getting at. In the code I posted, the sub.py script is started right away and prints “Step i” every second. After three seconds, main in main.py deletes and garbage collects the Process object process, and then terminates – I get back control of my shell. (So even if there were a dangling reference to process in my main.py script, it should be garbage collected on termination, no?)
If I understand the documentation correctly, the garbage collection of process should kill the sub.py subprocess, but it does not (it keeps on printing “Step i” for 17 more seconds until its for loop finishes).
That’s because it doesn’t get garbage collected until after the whole process has been awaited. (And you’re only del-ing the returned value.)
Await waits: it suspends the coroutine it’s in, letting something else run in the meantime (network I/O, a timer, or another coroutine).
Your code puts everything in the same coroutine, so if network or IO waits aren’t a factor, it all runs sequentially in a single thread, pretty much the same as non-async, blocking code (albeit less of a resource hog for other processes outside of Python).
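To make that concrete with a toy sketch (nothing to do with subprocesses; asyncio.TaskGroup needs Python 3.11+): awaiting things one after another in a single coroutine runs them sequentially, whereas wrapping them in tasks lets them overlap.

import asyncio

async def work(name, seconds):
    await asyncio.sleep(seconds)
    print(name, "done")

async def sequential():
    # one coroutine awaiting one thing after another: roughly 2 seconds total
    await work("a", 1)
    await work("b", 1)

async def concurrent():
    # separate tasks can overlap while each is suspended in await: roughly 1 second total
    async with asyncio.TaskGroup() as tg:
        tg.create_task(work("a", 1))
        tg.create_task(work("b", 1))

asyncio.run(sequential())
asyncio.run(concurrent())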
I now notice that I did not really ask the question that I want to have answered: I would like for the subprocess to keep running even after main.py finishes. Currently, this happens: I still get sub.py’s “Step” output even after main.py is done.
However, the line from the documentation about garbage collection of the Process object makes me worry that this might not be reliable and that there might be circumstances under which my subprocess gets killed instead. What I do not understand is under which circumstances this garbage collection would happen, given that even main.py finishing does not trigger it.
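In case it helps, here is a variation of my main.py (a sketch using weakref, purely to observe whether the Process object itself actually gets collected when I del it):

import asyncio
import gc
import sys
import weakref

async def main():
    process = await asyncio.create_subprocess_exec(sys.executable, "sub.py")
    ref = weakref.ref(process)   # observes the object without keeping it alive
    await asyncio.sleep(3)
    del process
    gc.collect()
    print("Process object collected:", ref() is None)

if __name__ == "__main__":
    asyncio.run(main())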
I get your meaning. I wasn’t trying to answer that question, FWIW. I was trying to critique the code example, as an attempt to reproduce the issue described in the docs. For this, what I had in mind was:
del_async_process.py

import asyncio
import gc
import sys

async def schedule_subprocess_then_delete_it(tg: asyncio.TaskGroup):
    # not awaited here: process is just the coroutine object
    process = asyncio.create_subprocess_exec(sys.executable, "sub.py")
    # the TaskGroup schedules it and holds the lasting reference
    tg.create_task(process)
    await asyncio.sleep(3)
    del process
    gc.collect()

async def main():
    async with asyncio.TaskGroup() as tg:
        task1 = tg.create_task(schedule_subprocess_then_delete_it(tg))

if __name__ == "__main__":
    asyncio.run(main())
I haven’t thought of a simple way with asyncio.gather. Using the TaskGroup approach instead, passing tg in to the other task, allowed creating and then deleting the subprocess task without creating new references to it.
However, to answer your question, I think all you have to do to avoid this issue is not delete process: just keep a reference to it in an active scope somewhere.
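A sketch of what I mean, reusing your sub.py; the module-level list is just one arbitrary place to park the reference, any scope that outlives main() would do:

import asyncio
import sys

running = []  # module level, so it outlives main()'s local variables

async def main():
    process = await asyncio.create_subprocess_exec(sys.executable, "sub.py")
    running.append(process)  # strong reference: the Process object cannot be garbage collected
    await asyncio.sleep(3)
    # no del, no kill: the child is left alone

if __name__ == "__main__":
    asyncio.run(main())

Of course this only guarantees the reference stays alive while the interpreter is running; whether the child keeps going after main.py exits is the behaviour you have already observed.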