I would be against that, for the reason above. Instead, there would be a call of some sort made by the thread code itself.
One of the simplest ways to hand users a worker they can interact with is subclassing Thread and adding a method to that subclass that submits a job to the thread’s work queue.
It’s extremely easy for someone other than the thread creator to be holding a thread object, because the stdlib’s design and documentation both encourage using thread subclasses to represent workers.
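For concreteness, the worker-subclass pattern just described might look like this minimal sketch (`WorkerThread`, `submit`, and `stop` are illustrative names, not an existing API):

```python
import queue
import threading

class WorkerThread(threading.Thread):
    """Illustrative worker: a Thread subclass with a job-submission method."""

    _STOP = object()  # sentinel: the blessed way to end the worker

    def __init__(self):
        super().__init__()
        self._jobs = queue.Queue()

    def submit(self, fn, *args):
        """The extra method users call; it feeds the thread's work queue."""
        self._jobs.put((fn, args))

    def stop(self):
        self._jobs.put((self._STOP, ()))

    def run(self):
        while True:
            fn, args = self._jobs.get()
            if fn is self._STOP:
                break
            fn(*args)

results = []
w = WorkerThread()
w.submit(results.append, 42)
w.stop()
w.start()
w.join()
print(results)  # [42]
```

The point being: callers hold the `WorkerThread` object itself, so nothing stops them from also calling any other method the `Thread` type grows.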
An example here would be great, I’m having trouble understanding what you’re thinking this could look like.
OK, I’ll try my hand at an example.
def create_a_worker_thread() -> WorkerThread:
    """This creates a worker thread with blessed methods of cancellation."""
I think you’re saying in this scenario that there may be some or no blessed methods of cancelling the thread, but interruption is definitely not one of them. Yet, this is a common enough pattern that the thread is likely to be passed back to the caller so the caller can check on the status of the thread, rather than wrapping it in some other interface.
I’m personally of the feeling that this is not a common way to do things, and that “you didn’t call interrupt before, please don’t call it now” is a pretty reasonable way to handle it. Still, it’s at least a theoretical possibility, and judgments will vary on whether this scenario warrants the overhead of an additional type.
#------- Thread code --------
def thread_main():
    # can put setup code here that must run before enabling
    # interrupts if desired
    threading.enable_interrupts()
    # from here on we can be interrupted
    ...

#------- Spawning code --------
t = threading.Thread(target=thread_main)
t.start()
...
t.interrupt()
I can get on board with that approach, I think. It means that there’s a global concept of the current thread interruptibility state, but the interpreter is going to have to have some concept of that anyway. If I want to have a special “interruptible” thread that’s always ready for interrupting, I can do that pretty easily by making a subclass of Thread that calls that first.
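That always-interruptible subclass could be as small as the sketch below. Note that `threading.enable_interrupts()` is the proposed API and does not exist today; the stand-in here just records the opt-in on the current thread.

```python
import threading

# Stand-in for the proposed threading.enable_interrupts(): record the
# opt-in on the current thread via thread-local state.
_state = threading.local()

def enable_interrupts():
    _state.interruptible = True

class AlwaysInterruptibleThread(threading.Thread):
    """A Thread subclass that is always ready for interruption: it opts
    in before running the real target."""

    def run(self):
        enable_interrupts()  # opt in first...
        super().run()        # ...then run the user's target as usual

seen = []
t = AlwaysInterruptibleThread(
    target=lambda: seen.append(getattr(_state, "interruptible", False)))
t.start()
t.join()
print(seen)  # [True]
```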
That all said, I’m feeling really drawn to the conceptual simplicity of the “interruptibility” of a thread being managed implicitly by language constructs; with (and finally) make really good candidates for that. If I were designing it without compatibility constraints (a useful exercise, but not necessarily where it should land when considering all factors), I think I’d make all threads interruptible by default, with the with-block special protections applied to KeyboardInterrupt as well.
I don’t really have a strong opinion one way or another but the following comes to mind:
- you could use a consenting adults approach, where you make it clear to your users the thread is not interruptible. For example, thread instances have methods like .start() and .run() which I’m pretty sure your users know not to call.
- you could give them an instance of a Thread subclass, where .interrupt() has been overridden to be a no-op
- you could give them a different object that has the thread reference in a private variable; this is usually good enough for debugging
So to me it doesn’t seem the end of the world if all threads are interruptible. I don’t mind the approach where we introduce a thread subclass (InterruptibleThread?) and stick the .interrupt() method there and only there, though.
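The third option from the list above (a separate handle object) could be as small as this sketch; `WorkerHandle` is a made-up name for illustration:

```python
import threading

class WorkerHandle:
    """Exposes only status checks; the Thread object itself stays private,
    so callers never see .interrupt() (or .run(), or .start())."""

    def __init__(self, target):
        self._thread = threading.Thread(target=target)  # private reference
        self._thread.start()

    def is_running(self):
        return self._thread.is_alive()

    def join(self, timeout=None):
        self._thread.join(timeout)

done = threading.Event()
handle = WorkerHandle(done.set)
handle.join(5)
print(done.is_set(), handle.is_running())  # True False
```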
This approach with special casing with and finally seems very awkward to me. Not all with and finally clauses are used for critical code, and not all critical code is in with and finally blocks. A better approach would be an explicit context manager, something like:
with threading.shield_interrupts():
    <critical section>
Out of curiosity, what do you propose should happen if a thread is in a “critical” section and an interrupt is requested for it? Should the interrupt be swallowed permanently or delivered to the thread as soon as it exits the critical section?
I don’t think this will work generally, for the same reasons that merely calling a function to start and stop the critical section wouldn’t work. Let’s consider the example from the signal handler docs, and modify it to use this approach:
class SpamContext:
    def __init__(self):
        self.lock = threading.Lock()

    def __enter__(self):
        # If KeyboardInterrupt occurs here, everything is fine
        with threading.shield_interrupts():
            self.lock.acquire()
        # If KeyboardInterrupt occurs here, __exit__ will not be called
        ...
        # KeyboardInterrupt could occur just before the function returns

    def __exit__(self, exc_type, exc_val, exc_tb):
        ...
        # A KeyboardInterrupt could occur here and escape without __exit__
        with threading.shield_interrupts():
            self.lock.release()
As far as I can see, this approach has all the same problems we currently have, unfortunately. The only alternative I’ve seen is to have special versions of with and finally with this handling built in, because the shielding needs to happen atomically with the language-level events that start or end context managers, or that start finally blocks, which may be initiated by exceptions raised elsewhere.
It should be delivered as soon as the critical section ends. These critical sections should be as short as possible, but we’re also at a good place for “consenting adults” with this, because there’s no way to force, for example, that an entire program’s normal operation isn’t happening in a finally block. That this feature would be a forcing function to avoid non-critical code in these paths seems like a feature to me, rather than a bug.
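That deliver-on-exit behavior can be modeled cooperatively in today’s Python. `CancelScope` below is a made-up illustration of the semantics, not a proposed API:

```python
import threading

class CancelScope:
    """Cooperative model of 'shielded' sections: a cancel requested while
    shielded is deferred and delivered when the critical section ends."""

    def __init__(self):
        self._lock = threading.Lock()
        self._depth = 0
        self._pending = False
        self._cancelled = threading.Event()

    def cancel(self):
        with self._lock:
            if self._depth:
                self._pending = True       # deferred, not swallowed forever
            else:
                self._cancelled.set()

    def __enter__(self):                   # entering a critical section
        with self._lock:
            self._depth += 1
        return self

    def __exit__(self, *exc):
        with self._lock:
            self._depth -= 1
            if self._depth == 0 and self._pending:
                self._cancelled.set()      # deliver the deferred cancel

    def cancelled(self):
        return self._cancelled.is_set()

scope = CancelScope()
with scope:
    scope.cancel()                 # requested mid-critical-section
    inside = scope.cancelled()     # still shielded here
outside = scope.cancelled()        # delivered as the section ended
print(inside, outside)  # False True
```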
If it’s written in C, KeyboardInterrupt cannot get raised at the points you note[1], so everything is safe.
Because even if the signal occurs, the exception is raised between opcodes in the interpreter core loop ↩︎
Hm, interesting. asyncio programs only get cancellation errors at suspension points so there it’s easier to reason about things like this.
It’s very common to do I/O in __enter__ and __exit__ (like opening a connection in a connection pool). Suppressing thread interruption indiscriminately in __exit__ will go against the goals of this change, I think. I will think on this more.
Maybe one of the directions could be to only raise interruptions while a thread is performing a syscall from a list of approved syscalls (like doing network or file I/O, sleeping)?
Right now I think that’s the exact reason for this change, and not against the goals at all. If __enter__ is able to run and do IO, it’s critical that __exit__ run to clean up, and if you allow interruptions in __enter__, you won’t ever have the __exit__ run, because __exit__ is generally only safe to run if __enter__ has completed successfully.
I think when you get to the C level, the interrupt, like KeyboardInterrupt, will only happen between bytecodes in the interpreter, or with the cooperation of the C-level code which by default will not handle those interrupts.
This should mirror existing behavior of KeyboardInterrupt, which can, for example, cancel a sleep, but IIUC cannot always interrupt an arbitrary call to a C library.
The C library needs to acquire the GIL (if it has been released) and call PyErr_CheckSignals.
I/O can hang indefinitely, and it’s not at all unusual (in relative terms) for it to do so. If we design a thread interrupt system that can’t handle a bad network call that happens to run without a timeout, I don’t think we’ll have accomplished a lot.
Generally speaking, adding methods to existing types that are intended to be subclassed is breaking. What if someone already has an interrupt method?
On top of it, there’s long-existing code that may be in a finished & stable state. The only reasonable guidance here would be “Unless a library has documented that its threads are interruptible, you shouldn’t call .interrupt()”, but at that point, we may as well convey that with types.
Generally speaking, adding methods to existing types that are intended to be subclassed is breaking. What if someone already has an interrupt method?
Their method will still get called!
But I have a more serious question - would you be OK with the default asyncio threadpool being changed to use interruptible threads?
But I have a more serious question - would you be OK with the default asyncio threadpool being changed to use interruptible threads?
No, for similar reasons that I don’t think it’s appropriate for this behavior to be the default for just plain threads.
Threads are and have been documented as having specific behavior. People can reasonably have built things that rely on that behavior. Changing it out from under them isn’t a good idea for something that’s been this way for as long as it has.
I/O can hang indefinitely, and it’s not at all unusual (in relative terms) for it to do so. If we design a thread interrupt system that can’t handle a bad network call that happens to run without a timeout, I don’t think we’ll have accomplished a lot.
I think I’m making sense of this idea. For example, I might have a protocol that requires waiting on a response before it is connected, and requires confirmation before it is gracefully ended, and both of those could be IO operations that take some time.
I think it’s useful to draw a distinction between a thread cancellation and an interrupt for process termination. Thread cancellation is a convenience, where we want to cancel the thread, but we wish to be able to choose to continue the process running indefinitely. An interrupt means that we no longer care if the end state is stable in the current process, we just want to be as graceful as possible, but end the process immediately.
If we want the thread to be capable of running indefinitely, cancellation alone shouldn’t forcibly terminate threads. It’s true that these __enter__ and __exit__ methods may be doing significant I/O, but the whole process is at risk unless everything is exited cleanly.
On the main thread, I can imagine wanting safe cancellation when pressing Ctrl-C at the CLI, which would not automatically propagate to threads, but that the main thread could propagate as it wants to. Then a second Ctrl-C (with some big fuzziness over exactly what it means to be a second Ctrl-C, and I’m not totally sure it can be clarified) would initiate a more forceful termination, which I think could intuitively include propagating termination to all threads immediately.
Similarly, I might want to retain the existing behavior that Ctrl-C is a force terminate, but have it propagate to threads for a final shutdown of those threads as well.
I think both of those ideas are, unfortunately, likely not reasonable if only for backward-compatibility concerns (though if we could, I like the idea of attempting a safe cancel first by default). But it may be possible to expose them as different modes of operation for the interpreter, via a function in sys (or a context manager).
sys.interrupt_mode('ESCALATE_MAIN_THREAD_IMMEDIATELY') # Current default
sys.interrupt_mode('CANCEL_THEN_ESCALATE')
sys.interrupt_mode('ESCALATE_IMMEDIATELY')
# Where ESCALATE means to do an unsafe interrupt
Threads are and have been documented as having specific behavior. People can reasonably have built things that rely on that behavior. Changing it out from under them isn’t a good idea for something that’s been this way for as long as it has.
There’s no documentation stating that threads will never receive an asynchronous exception. The documentation for SIGINT says it will only be delivered to the main thread, so the main source of asynchronous exceptions is avoided, but it’s not the only possibility. Like it or not, PyThreadState_SetAsyncExc is a documented and supported API - it’s not exposed to Python code, but C extensions can use it perfectly legitimately.
I’ll note that actually, (a limited form of) thread cancellation is easy. Just write a small C extension that uses PyThreadState_SetAsyncExc to raise a custom exception, and you’re done. It won’t interrupt blocking system calls, but to do that all you need is some OS-specific code (which is no harder to write in a C extension than it is in a language feature).
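For reference, the ctypes-only version of that recipe (no C extension needed) is well known; it relies on CPython implementation details and inherits all the safety caveats discussed above:

```python
import ctypes
import threading
import time

class ThreadCancelled(Exception):
    pass

def async_raise(thread_id, exc_type):
    """Schedule exc_type in the target thread via PyThreadState_SetAsyncExc.
    The exception is only raised between bytecodes, so it cannot interrupt
    a blocking system call."""
    ret = ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_ulong(thread_id), ctypes.py_object(exc_type))
    if ret == 0:
        raise ValueError("invalid thread id")
    if ret > 1:
        # Affected more than one thread: undo and fail loudly.
        ctypes.pythonapi.PyThreadState_SetAsyncExc(
            ctypes.c_ulong(thread_id), None)
        raise SystemError("PyThreadState_SetAsyncExc failed")

outcome = {}

def worker():
    try:
        while True:            # pure-Python loop, so the exception can land
            time.sleep(0.01)
    except ThreadCancelled:
        outcome["cancelled"] = True

t = threading.Thread(target=worker)
t.start()
time.sleep(0.1)
async_raise(t.ident, ThreadCancelled)
t.join(timeout=5)
print(outcome)  # {'cancelled': True}
```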
IMO, this discussion is drifting too far towards the “purity” end of the “practicality vs purity” scale. There are legitimate uses for thread cancellation. It’s difficult, and maybe impossible, to make cancellation 100% safe, but is that a deal-breaker? Clearly it is for you, but Python has always had a more relaxed attitude in this sort of case (hence the “practicality vs purity” Zen).
The hard bit is agreeing on acceptable semantics - making sure there’s a way to protect critical cleanup, etc. We may not be able to find something that works perfectly, but if there’s no willingness to compromise, we’ll just end up with nothing beyond the current status quo, and someone will finally get frustrated enough to write that C extension, and unsafe cancellation will become the norm[1]. That wouldn’t bother me a lot, personally, but it would be nice if we could agree on something a bit safer.
to the extent that anyone uses it - while cancellation is useful, it’s still a pretty niche requirement after all… ↩︎
But it may be possible to expose those as different modes of operation for the interpreter. So you could have a function in sys (or a context manager).

sys.interrupt_mode('ESCALATE_MAIN_THREAD_IMMEDIATELY') # Current default
sys.interrupt_mode('CANCEL_THEN_ESCALATE')
sys.interrupt_mode('ESCALATE_IMMEDIATELY')
# Where ESCALATE means to do an unsafe interrupt
Was thinking about these modes a bit more. I think the CANCEL_THEN_ESCALATE behavior isn’t going to do everything I want, it’ll only be able to handle the cases where the interpreter itself is still delaying a cancellation. However, we can handle other cases of cleanup taking too long and escalating the second keyboard interrupt explicitly, and I think this could work.
if __name__ == "__main__":
    with sys.interrupt_mode("CANCEL_THEN_ESCALATE"):
        big_thing_doer = BigThingDoer()
        try:
            big_thing_doer.begin_doing()
        finally:
            sys.interrupt_mode("ESCALATE")
            big_thing_doer.cleanup()
IMO, this discussion is drifting too far towards the “purity” end of the “practicality vs purity” scale. There are legitimate uses for thread cancellation. It’s difficult, and maybe impossible, to make cancellation 100% safe, but is that a deal-breaker? Clearly it is for you, but Python has always had a more relaxed attitude in this sort of case (hence the “practicality vs purity” Zen).
I don’t find cancellation to be a dealbreaker. I do find adding cancellation, which people have already expressed wanting to use arbitrarily, to existing code that was designed without it in mind significantly more problematic. Putting it on a new Thread subclass rather than on all threads allows new code to consciously be written with this in mind, and existing code to be updated as there is demand and maintainer time to properly consider it.
I’ll note that actually, (a limited form of) thread cancellation is easy. Just write a small C extension that uses
PyThreadState_SetAsyncExc to raise a custom exception, and you’re done. It won’t interrupt blocking system calls, but to do that all you need is some OS-specific code (which is no harder to write in a C extension than it is in a language feature).
I expect that anyone with the knowledge to do this isn’t going to. (For one thing, without knowing what the thread in question is in the middle of, doing so is likely to not actually be safe.) I certainly wouldn’t, because there are better ways to handle cancellation gracefully without this.
If we were to go back and redesign the language from the ground up, I’d actually advocate for more cancellation than has been proposed here, but it would come with actual means of deferring cancellation while in protected scopes, something which cannot adequately be papered over by special-casing __enter__ and __exit__.