Improving support for non-daemon background threads

ncoghlan · October 30, 2024, 5:51am

Continuing on from this post in the core dev thread about whether or not we could reasonably drop support for daemon threads. That clearly isn’t feasible, but two of the recurring themes that came up were:

there’s no convenient atexit hook to invoke to request shutdown of non-daemon background threads (as registered atexit handlers are called after the main thread waits for all non-daemon threads to terminate)
there’s no nice way to “throw” an exception into a running thread the way we can with generators and coroutines (since there are no reliably defined suspension points in threads the way there are for code blocks that use await and/or yield)

I think the first item can be improved relatively easily: add a new atexit.register_early callback list that gets executed before the sys.is_finalizing() flag is set and the main thread joins all the non-daemon threads. Callbacks registered this way would be called “early exit callbacks” (executed before finalization starts), while callbacks registered the traditional way would be “late exit callbacks” (executed just before the interpreter is marked as no longer initialized). Unlike late callbacks, early callbacks would be able to tell non-daemon threads to shut down, and they would be allowed to register new late exit callbacks to run. The interpreter would still be fully configured, so the only thing disallowed would be calling atexit.register_early itself. Doing just this bit would technically provide all the essential pieces needed for developers to write their own gracefully-shut-down non-daemon background threads. (Edit: feature proposal filed: `atexit.register_early` pre-finalization callback API · Issue #126168 · python/cpython · GitHub)

We know this approach is useful because threading._register_atexit already works that way (that internal API exists at least as far back as Python 3.9).
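As a minimal sketch of what an early exit callback enables, the following uses the private threading._register_atexit as a stand-in for the proposed atexit.register_early. The names stop_requested, worker, and request_stop are purely illustrative, and since the threading API is internal it may change without notice, so the lookup is guarded.

```python
import threading

stop_requested = threading.Event()  # illustrative application-level stop flag
processed = []

def worker():
    # Non-daemon worker: loop until asked to stop, then clean up.
    while not stop_requested.wait(timeout=0.05):
        processed.append("tick")
    processed.append("cleanup done")

def request_stop():
    # Plays the role of an "early exit callback": it runs before the
    # main thread joins the non-daemon threads.
    stop_requested.set()

# Private, undocumented API (assumed present on Python 3.9+); skipped if absent.
register_early = getattr(threading, "_register_atexit", None)
if register_early is not None:
    register_early(request_stop)

background = threading.Thread(target=worker)
background.start()
# When the main thread finishes, threading shutdown calls request_stop()
# and only then joins `background`, so the process can exit cleanly.
```

A regular atexit.register callback would not work here, since it would only run after the worker had already been joined.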

The second item is more complex, but also potentially more interesting (and would require a PEP to make a genuine attempt at resolving). As a hypothetical idea (that gets increasingly radical as it goes), consider the following:

First, we define a new ThreadExit exception, and a new threading.exit function that throws it. ThreadExit would be defined as a subclass of SystemExit so threading.exit has the same effect in the main thread as sys.exit does, and so anything which already handles SystemExit (like threading.excepthook) automatically handles ThreadExit as well.

So far, not so interesting (as it’s just a respelling of what calling sys.exit in a thread already means).
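A rough sketch of that first step, where ThreadExit and thread_exit are hypothetical stand-ins for the proposed exception and the proposed threading.exit (neither exists in the standard library):

```python
import threading

class ThreadExit(SystemExit):
    """Hypothetical exception from this proposal; not part of the stdlib."""

def thread_exit(code=None):
    # Hypothetical stand-in for the proposed threading.exit().
    raise ThreadExit(code)

events = []

def worker():
    try:
        events.append("started")
        thread_exit()
        events.append("unreachable")  # never runs
    finally:
        # try/finally blocks and context managers still unwind normally.
        events.append("cleaned up")

t = threading.Thread(target=worker)
t.start()
t.join()
# threading.excepthook silently ignores SystemExit, and therefore
# ThreadExit as well, so the thread ends without printing a traceback.
```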

What gets more interesting is if we define new threading.get_exit_monitor() and asyncio.enable_exit_monitor() APIs. (Edit: register → enable in suggested async function name)

I’m not sure what the exact APIs of a synchronous and asynchronous exit monitor would look like (that’s why the idea would need a PEP to work out those details), but the gist would be:

the synchronous exit monitor would at least expose a threading.Event-like API (to get an event notification when the system is exiting), but also expose a wrapper around os.pipe to get a file descriptor that can be used with the select module, as well as a simple way to say “throw ThreadExit if thread shutdown has been requested”.
the asynchronous exit monitor wouldn’t need to be handled directly. Instead, when the exit monitoring is enabled in a thread, the event loop in that thread would add a suitable atexit.register_early callback that threw ThreadExit into any still running coroutines, waited for them to terminate, and then terminated the event loop (or something along those lines)
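To make the synchronous half a bit more concrete, here is one possible shape for it. Everything in this sketch (the ExitMonitor class, its method names, and the ThreadExit class) is an assumption for illustration only; the real API would be up to the PEP. Note that select only accepts pipe file descriptors on Unix-like platforms, so this particular version would not work as written on Windows.

```python
import os
import select
import threading

class ThreadExit(SystemExit):
    """Hypothetical exception from this proposal; not part of the stdlib."""

class ExitMonitor:
    """Hypothetical synchronous exit monitor; all names are illustrative."""

    def __init__(self):
        self._event = threading.Event()
        self._read_fd, self._write_fd = os.pipe()

    def request_exit(self):
        # Would be called from an early exit callback.
        if not self._event.is_set():
            self._event.set()
            # The byte is never read back, so the read end stays readable.
            os.write(self._write_fd, b"\0")

    def is_set(self):
        return self._event.is_set()

    def wait(self, timeout=None):
        # Event-like blocking wait for the exit notification.
        return self._event.wait(timeout)

    def fileno(self):
        # Lets the monitor itself be passed to select.select().
        return self._read_fd

    def check(self):
        # "Throw ThreadExit if thread shutdown has been requested."
        if self._event.is_set():
            raise ThreadExit

    def close(self):
        os.close(self._read_fd)
        os.close(self._write_fd)

monitor = ExitMonitor()
monitor.request_exit()
# A thread blocked in select() on sockets plus the monitor would wake up here.
readable, _, _ = select.select([monitor], [], [], 0)
```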

For the synchronous case, something would also need to be done with queue.Queue to come up with an interruptible version that can trigger exceptions in waiting threads (and concurrent.futures switched over to using those instead of regular uninterruptible queues).

Edit: the synchronisation primitives in threading would also need variants that supported being interrupted. Actually making this work is complicated though - it can’t be a simple flag in the interpreter state, as that means any attempt to access such resources while finalizing a non-daemon thread would just throw ThreadExit again, so it needs to be limited to cases where it can be worked out that all the required exceptions have been raised in all the relevant threads.

Still, even if interruptible versions aren’t practical, a combination of the existing timeout support in various APIs and an Event-like API would allow synchronous threads to periodically check if they should exit (without needing to rely on application-specific exit flags).
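For example, a worker can already combine queue.Queue.get(timeout=...) with an event check. In this sketch, exit_requested is a plain threading.Event standing in for the hypothetical Event-like exit monitor, and the other names are illustrative:

```python
import queue
import threading

exit_requested = threading.Event()  # stand-in for the exit monitor's event
work = queue.Queue()
results = []

def worker():
    while not exit_requested.is_set():
        try:
            # Bounded wait, so the exit flag gets re-checked regularly.
            item = work.get(timeout=0.05)
        except queue.Empty:
            continue
        results.append(item * 2)
        work.task_done()

t = threading.Thread(target=worker)
t.start()
for n in (1, 2, 3):
    work.put(n)
work.join()           # wait until every queued item has been processed
exit_requested.set()  # what an early exit callback would do
t.join()
```

The cost is a shutdown latency of up to one timeout interval, which is the trade-off relative to truly interruptible primitives.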

ncoghlan · January 19, 2025, 4:44am

I ended up going in a different direction for solving this problem in my own use cases. See Daemon threads and background task termination for details on that idea.