I’ve been writing an application that makes extensive use of threads (I’d love to use virtual threads someday), and I’ve been really wishing that I had a reliable way to gracefully cancel threads. In prior discussions, it has been rightly noted that, ultimately, threads cannot be forcibly aborted while keeping the process stable.
I want the process to be stable, but I don’t strictly need my threads to be forcibly aborted. I can live with forcible aborts only being possible with isolated processes. What I’m really going for is allowing the code that is executing in the thread to avoid deep knowledge of the environment that is running it, and allowing the environment to remain unaware of the specifics of the thread’s code. Cancellation exceptions as an API seem to fit very nicely to me, with minimal cognitive overhead and leakage.
As far as I followed those discussions, the most significant conceptual hurdles to cancelling threads were on two fronts:
- C calls cannot be interrupted without cooperation from the C library.
- Cancellation exceptions might raise in resource cleanup handlers, and break cleanup.
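To make the second hurdle concrete, here is a minimal, deterministic sketch. `Cancelled` is a hypothetical exception name, and the explicit `raise` inside the `finally` block is only a stand-in for a cancellation that happens to arrive at that moment; nothing here is an existing Python API for cancelling threads.

```python
class Cancelled(BaseException):
    """Hypothetical cancellation exception (not part of Python)."""


def worker(log, cancel_during_cleanup):
    log.append("acquire A")
    log.append("acquire B")
    try:
        log.append("work")
    finally:
        log.append("release B")
        if cancel_during_cleanup:
            # Stand-in for an asynchronous cancellation landing mid-cleanup.
            raise Cancelled()
        log.append("release A")


def run(cancel_during_cleanup):
    log = []
    try:
        worker(log, cancel_during_cleanup)
    except Cancelled:
        log.append("cancelled")
    return log
```

When the cancellation lands between the two release steps, resource A is never released, which is exactly the broken cleanup described above.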
For the first challenge of interrupting C libraries, I think that limitation is at least as acceptable for cancelling threads as it is for raising KeyboardInterrupt. We might want to expose it differently to C extensions, but the general principle of the limitation seems entirely reasonable to me.
The second challenge of cancellation exceptions potentially raising during resource cleanup, however, seemed more troubling, and I didn’t see any suggestions that seemed to adequately solve that challenge. But I’ve been thinking it over today, and reflecting on the main motivation I have for cancellation – allowing the thread creator to handle the thread lifecycle without explicit cooperation with the thread’s code – and I think there may be a reasonable approach.
As commenters have noted, existing solutions that try to solve the problem are inadequate. As a particular example, one suggestion was to wrap uninterruptible code in a region that disables cancellation, but no matter how you slice the Python code, there’s no way to avoid unintentionally having cancellation disabled either too broadly or insufficiently. It seems to me that getting this right would require cooperation from the interpreter itself.
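The "too broadly or insufficiently" problem can be sketched as follows. The `shield` context manager, `cancellation_allowed`, and the `Resource` class are all hypothetical names invented for this illustration; they only record whether cancellation would be allowed at each point and do not cancel anything.

```python
import threading

_local = threading.local()


def cancellation_allowed():
    # True when the current thread is not inside any shield() region.
    return getattr(_local, "shield_depth", 0) == 0


class shield:
    """Hypothetical pure-Python 'cancellation disabled' region."""

    def __enter__(self):
        _local.shield_depth = getattr(_local, "shield_depth", 0) + 1
        return self

    def __exit__(self, exc_type, exc, tb):
        _local.shield_depth -= 1
        return False


class Resource:
    """Records whether cancellation was allowed during each call."""

    def __init__(self):
        self.seen = {}

    def work(self):
        self.seen["work"] = cancellation_allowed()

    def close(self):
        self.seen["close"] = cancellation_allowed()


def too_narrow(resource):
    try:
        resource.work()
    finally:
        # Gap: the finally block has started, but shield.__enter__ has not
        # run yet, so a cancellation could still be raised right here.
        with shield():
            resource.close()


def too_broad(resource):
    # No gap before cleanup, but now the work itself cannot be cancelled.
    with shield():
        try:
            resource.work()
        finally:
            resource.close()
```

Moving the shield outward closes the gap only by also covering the work, which is why placement in Python code alone does not seem to be enough.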
This got me thinking about where this type of unsafe-to-interrupt code generally lives, and what patterns might be available to cooperate with the interpreter to ensure that cleanup code won’t be abandoned. I see three ways that resources are cleaned up in Python code, and that we might wish to protect from interruption:
- Explicitly called cleanup functions (e.g. File.close(), Lock.release())
- finally blocks (and perhaps except blocks as well)
- __exit__ methods of context managers (with blocks)
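The three cleanup styles listed above look like this with a `threading.Lock`. The helper function names and the `body` callback are made up for the example; only the `Lock` calls are real standard-library API.

```python
import threading


def explicit_release(lock, body):
    # Style 1: explicit cleanup call. If body() raises, release() is skipped.
    lock.acquire()
    body()
    lock.release()


def release_in_finally(lock, body):
    # Style 2: cleanup in a finally block runs whether or not body() raises.
    lock.acquire()
    try:
        body()
    finally:
        lock.release()


def release_in_with(lock, body):
    # Style 3: the lock's __exit__ method releases it when the block ends.
    with lock:
        body()


def failing_body():
    raise ValueError("simulated failure inside the critical section")
```

With `failing_body`, the first style leaves the lock held, while the other two release it.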
For my own code, I have moved to avoiding explicitly calling cleanup functions except in finally blocks, and I get the sense that this may be common practice in the Python community as well. It seems reasonable to think that we should encourage using with and finally more. So the two ways that I would bless for this type of resource cleanup already cooperate with the interpreter through the language syntax of with blocks and finally blocks.
What if the interpreter tracked when it was entering cleanup code from a with or finally block, and waited to raise any cancellation until that cleanup completed? Because the interpreter would do the tracking, a cancellation could not be raised, for example, immediately inside the finally block, where it might interrupt necessary cleanup code.
To my eyes, the C-code limitation and this proposed limitation to avoid interrupting cleanup code would both be reasonable for my uses.
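A rough model of the intended semantics, written in ordinary Python purely for illustration. Everything here is hypothetical: `Cancelled`, `CancelState`, and the explicit `checkpoint()` calls stand in for checks the interpreter would make on its own, and `cleanup_depth` stands in for the interpreter tracking entry to and exit from cleanup code. The real proposal would need interpreter support, since (as noted earlier) it cannot be done reliably in Python code.

```python
class Cancelled(BaseException):
    """Hypothetical cancellation exception (not part of Python)."""


class CancelState:
    def __init__(self):
        self.pending = False
        self.cleanup_depth = 0  # the interpreter would maintain this itself

    def request(self):
        self.pending = True

    def checkpoint(self):
        # Deliver a pending cancellation only outside cleanup code.
        if self.pending and self.cleanup_depth == 0:
            self.pending = False
            raise Cancelled()


def worker(state, log, cancel_at):
    def step(name):
        if name == cancel_at:
            state.request()  # simulate a cancel request arriving at this step
        state.checkpoint()
        log.append(name)

    try:
        step("work 1")
        step("work 2")
    finally:
        state.cleanup_depth += 1  # entering cleanup: defer cancellation
        try:
            step("release B")
            step("release A")
        finally:
            state.cleanup_depth -= 1
    state.checkpoint()  # first opportunity after cleanup has finished
    step("after")


def run(cancel_at):
    state, log = CancelState(), []
    try:
        worker(state, log, cancel_at)
    except Cancelled:
        return log, True
    return log, False
```

In this model, a request that arrives during cleanup is held until both release steps have run, and a request that arrives during normal work is raised immediately and still lets the cleanup run to completion.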
Prior and related discussions: