What about interruptible threads
This post proposes a feature that is rarely available in other languages and that, I suppose, many would consider “bad programming”. I am aware that the topic may be seen as going against commonly accepted good practices for threading. I am writing this to explain why I consider the feature described here to be good programming, and even a very interesting feature for Python to have.
This is all about control-flow interactions between threads, done in a safe way.
The ideal & intuitive way versus the cruel reality
Let’s start with an example. Suppose I have two threads and I want to signal or stop the second one:
# in a perfect world
def myfunction():
    # ... do something, very long, very complex, and very nested

mythread = thread(myfunction)
# ... do something
if whatever_plan_change:
    mythread.terminate()
Well, this is not possible in Python with the current threading API. It is possible in other languages by killing the thread the way we kill a process. But resorting to this for normal operation is already not a great idea with processes, and it is an even worse idea when, instead of killing a process with its own memory and resources, you kill a thread that shares memory and resources with the other threads of the process (memory corruption and improper release of resources are likely to happen in an uncontrolled way).
Usually, given the current thread API, one would recommend creating a flag and letting the thread exit by itself instead:
# in the real world
def myfunction():
    while not mythread_exit:
        # ... do something, repeatedly, not nested

mythread_exit = False
mythread = thread(myfunction)
# ... do something
if whatever_plan_change:
    mythread_exit = True
You can notice that we had to change two important things from the previous code:
- We created a flag variable that must be shared between threads. It can be shared through a global variable (not recommended), through a pointer/reference, through a free variable since Python has closures, or even through an attribute of the thread object if you subclass the thread class.
  This change is often not a big deal in Python, but it assumes that every function that needs to stop the thread can access this flag variable (which is why global variables are often used instead of free variables).
- We changed the body of the thread function into a loop that checks the flag at every iteration. Alternatively, we could keep the same function body and place a check between each of the original code lines, but this would be more than ugly and inconvenient (here is an example, just to make your eyes water at the idea):
def myfunction():
    if mythread_exit: return
    do_some_stuff()
    if mythread_exit: return
    yet_more_stuff()
    if mythread_exit: return
    if some_condition():
        if mythread_exit: return
        do_an_action()
        if mythread_exit: return
        and_yet()
        if mythread_exit: return
So in conclusion, the flag trick obliges you to rewrite your thread as a loop, and this is what this post is all about, because many tasks cannot be meaningfully rewritten as a loop. There are tasks where nothing is looping, but the operations are simply time-consuming and we want to stop the thread for various reasons (to save computational or physical resources, or to prevent the thread from doing something that is no longer wanted when it manages hardware or shared resources).
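For reference, here is a runnable version of the flag pattern using the standard `threading.Event` instead of a bare variable. It is only a minimal sketch: the names `stop_requested`, `progress` and `worker` are made up for the illustration, and the `time.sleep(0.05)` call stands in for one short unit of real work.

```python
import threading
import time

stop_requested = threading.Event()   # the shared flag
progress = []                        # records how many units of work were done

def worker():
    step = 0
    # The thread can only notice the flag here, between two units of work.
    while not stop_requested.is_set():
        time.sleep(0.05)             # stand-in for one short unit of work
        step += 1
        progress.append(step)

t = threading.Thread(target=worker)
t.start()
time.sleep(0.2)                      # ... do something in the main thread
stop_requested.set()                 # the plan changed: ask the thread to exit
t.join(timeout=2)
```

The stop request only takes effect when the loop comes back to its condition, so if one unit of work were a single long call, the thread would keep running until that call returned.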
One could argue that any non-looping task can be rewritten as a loop using the state-machine pattern (like this example or a bare switch-case), or something similar with an event-callback system or a set of possible commands in the thread. Well, all of this is very intrusive for the code structure of the desired task:
- It will always lead to a very large amount of code if the task must be interruptible at many points (you will split the task into one function per original code line, writing one state or one callback per original code line).
- Furthermore, you might want to stop or signal the thread during a long operation that was not implemented by you but by some other library (and you cannot alter that code to fit into states or callbacks).
Looking at how this problem is addressed by different languages and libraries, we can see that C users will choose a switch-case state machine or conditional loops, C# users will choose event callbacks, Qt enforces event callbacks (even when there is only one thread), GTK likewise, Node.js too, and Python users will choose between the three options. Every time, fitting one of these three designs demands a lot of work from users who are not familiar with it, or who want to program a task that does not fit well into these split-code designs (like the above-mentioned long, sequential, nested or third-party code).
This is not to say that conditional loops, state machines and event-callback systems are bad (I’m a great fan of them on other occasions), just that they are ill-suited to tasks that cannot easily be split, for various reasons.
This is where interruptible threads come in.
a ray of hope with interrupts
The idea
The idea comes from the world of microchips, where there are no threads, but where unexpected cases that can happen at any moment in the code must still be addressed (cases due to the program or, more likely, to the physical world). On every microchip in the world, we can do the following (using the Arduino API in order to hide the register-level implementation):
void mycallback() {
    // do something; once the callback has finished, execution of main continues
}

main() {
    attachInterrupt(5, mycallback);  // declare the callback to run on a change on external interrupt 5
    attachInterrupt(CLOCK1, myothercallback);  // declare the callback to run on an internal flag of the microchip
    // do my complex, long, very nested stuff
    // if an interrupt is triggered, main is paused, the corresponding callback is called, and then main is resumed
    noInterrupts();  // disable interrupts (interrupt signals wait until they are re-enabled)
    // do some sensitive stuff that must not be paused
    interrupts();  // re-enable interrupts
}
This all works on a single thread (the microchip’s only thread), and it is also possible on every computer CPU when writing code at the kernel level.
How does that help in our case? Simple: I propose that we implement this kind of interruption mechanism in Python threads. It could look like the following:
from typing import Any, ContextManager

class Thread:
    def nointerrupts(self) -> ContextManager:
        '''
        return a context manager that disables interruptions as long
        as it is open; all exceptions received meanwhile are raised on
        context exit
        '''

    def interrupt(self, exception):
        '''
        send an exception to interrupt the thread; this can be done
        from any thread. if the thread has disabled interrupts, the sent
        exceptions are stacked in an ExceptionGroup and raised on enable
        '''

    # a method to join a thread and check for its possible errors
    def wait(self, timeout=None) -> Any:
        '''
        join the thread. if the thread exited with a return value,
        it is returned; if the thread exited with an exception, it
        is propagated here
        '''

    # maybe also this one, much more similar to the microchip's attachInterrupt()
    def attach(self, exception_type, callback):
        '''
        when an exception is received by a thread, the thread is paused and
        the callback attached to the exception is run. if this callback
        raises or there is no callback, the exception is raised in the thread
        '''
So we could program this:
def myfunction():
    with current_thread().nointerrupts():
        # ... do critical stuff here that shall not be interrupted
    # ... do something, very long, very complex, and very nested, that can be interrupted

mythread = thread(myfunction)
# ... do something
if whatever_plan_change:
    mythread.terminate()   # alias for mythread.interrupt(SystemExit)
    # also possible with any exception
    mythread.interrupt(MyCustomException('time to react, thread!'))
result = mythread.wait()
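The `wait()` part of this proposal can already be approximated with the existing `threading.Thread`. Here is a minimal sketch, assuming a hypothetical helper class named `ResultThread`; it only covers returning the result and propagating the exception, not `interrupt()` or `nointerrupts()`.

```python
import threading

class ResultThread(threading.Thread):
    """Hypothetical helper: a thread whose wait() returns or re-raises."""

    def __init__(self, func, *args, **kwargs):
        super().__init__(daemon=True)
        self._func = func
        self._func_args = args
        self._func_kwargs = kwargs
        self._result = None
        self._error = None

    def run(self):
        try:
            self._result = self._func(*self._func_args, **self._func_kwargs)
        except BaseException as exc:   # also catches SystemExit used as an interrupt
            self._error = exc

    def wait(self, timeout=None):
        self.join(timeout)
        if self.is_alive():
            raise TimeoutError("thread is still running")
        if self._error is not None:
            raise self._error          # propagate the thread's exception to the caller
        return self._result

def failing():
    raise ValueError("sensor unplugged")

ok = ResultThread(lambda: 6 * 7)
ok.start()
answer = ok.wait()

bad = ResultThread(failing)
bad.start()
try:
    bad.wait()
    propagated = None
except ValueError as exc:
    propagated = str(exc)
```

With this, the caller deals with a thread's failure at the point where it waits for it, the same way it would for a plain function call.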
Implementation
One might wonder, after all these esoteric considerations, whether this is even doable. The answer is yes! Both for Python and for compiled languages that already provide exceptions.
There is already a way to implement it without writing a single line of C code, using PyThreadState_SetAsyncExc() from the CPython API. The only limitation is that it can only send exception types to the thread to interrupt, so the exception cannot carry additional data, but we can work around that.
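Here is a minimal sketch of that approach through `ctypes`, assuming CPython. The names `interrupt`, `Interrupted` and `worker` are made up for the illustration, and this is not the full proposal: there is no `nointerrupts()` here, and the exception is only raised when the target thread gets back to executing Python bytecode, so a thread blocked inside a long C call will not see it until that call returns.

```python
import ctypes
import threading

class Interrupted(Exception):
    """Hypothetical exception type used to interrupt a worker thread."""

def interrupt(thread, exc_type):
    """Ask CPython to raise exc_type (a class, not an instance) in `thread`."""
    tid = ctypes.c_ulong(thread.ident)
    modified = ctypes.pythonapi.PyThreadState_SetAsyncExc(
        tid, ctypes.py_object(exc_type))
    if modified == 0:
        raise ValueError("no running thread with this identifier")
    if modified > 1:
        # Should not happen; cancel the request (NULL clears the pending exception).
        ctypes.pythonapi.PyThreadState_SetAsyncExc(tid, None)
        raise SystemError("PyThreadState_SetAsyncExc affected several threads")

started = threading.Event()
outcome = []

def worker():
    try:
        started.set()
        while True:
            sum(range(1000))          # pure-Python work, with no stop flag anywhere
    except Interrupted:
        outcome.append("interrupted")  # cleanup code would go here

t = threading.Thread(target=worker, daemon=True)
t.start()
started.wait()
interrupt(t, Interrupted)
t.join(timeout=5)
```

The worker body contains no check at all, yet it unwinds through its normal `except` clause, which is the behaviour the rest of this post builds on.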
This first way of interrupting is tied to the current CPython implementation, which uses the GIL. So we could be afraid that this feature could not be ported to other Python implementations, now or in the future. Luckily this is not a problem, since POSIX threads are designed to be notified through OS-level signals (simple integers), which work in threads the same way interrupts do on microchips. The best example is the SIGINT signal, which is sent to the main thread of a program when we type Ctrl+C in a shell. In addition to SIGINT, POSIX threads allow a great variety of signals (including some user-defined signals) to be sent from process to process and from thread to thread.
In fact, POSIX threads provide half of what I propose for interruptible threads in any C program. Thread signals only pause and resume the thread’s operations, unlike exceptions, which progressively unwind everything until they find a stack frame where the exception is handled. In C, stopping the operation on signal reception can be implemented by combining signals with the standard setjmp.h. Here is an example (though very unsafe, like everything regarding threading in C).
Most OSs in the world implement POSIX threads, and so does Windows with this port. And for Windows’ own threads, which do not support POSIX signals, there is still a way to receive interruptions through their C/C++ structured exceptions.
This means that even in a wonderful future where the GIL has been removed, interruptible threads will still be possible and fine. It would simply require reworking Python’s signal handling a bit, to let threading.Thread objects send and receive a chosen signal between them.
A Python implementation of interruptible threads using POSIX threads could work this way:
- the main thread no longer receives all the OS signals; instead, each thread receives a specific signal, let’s call it SIGEXC for instance
- when a Python thread initializes, it registers a C signal callback for SIGEXC
- when interrupting a thread, Python puts the exception in the target thread’s waiting list and sends a SIGEXC to that thread
- when the target thread’s C callback receives the signal, the OS thread implementation pauses the thread’s execution, including the Python code. The callback checks whether the thread has interrupts enabled:
  - if enabled: it sets the received exceptions for the next time the thread’s interpreter checks for exceptions
  - if disabled: it stacks the received exception to be checked when interrupts are re-enabled
- when a thread re-enables interrupts, it checks for stacked exceptions; if any are present, it calls the registered Python callbacks first (if any), then sets the exception in the normal operation flow and resumes the flow
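For comparison, the main thread can already be interrupted this way today, because Python runs its signal handlers in the main thread only. The following is a minimal sketch of that existing behaviour, not of the proposed SIGEXC mechanism. It assumes a POSIX system (for `SIGUSR1` and `signal.pthread_kill`) and that the script runs in the main thread; `PlanChanged`, `on_sigusr1` and `notifier` are hypothetical names.

```python
import signal
import threading
import time

class PlanChanged(Exception):
    """Hypothetical exception raised in the main thread by the signal handler."""

def on_sigusr1(signum, frame):
    # Python runs this handler in the main thread, between two bytecode instructions.
    raise PlanChanged("time to react, main thread!")

signal.signal(signal.SIGUSR1, on_sigusr1)

def notifier(main_ident):
    time.sleep(0.2)
    # Send the signal to the main thread, much like SIGEXC in the list above.
    signal.pthread_kill(main_ident, signal.SIGUSR1)

threading.Thread(target=notifier,
                 args=(threading.main_thread().ident,),
                 daemon=True).start()

events = []
try:
    deadline = time.monotonic() + 5.0
    while time.monotonic() < deadline:
        time.sleep(0.01)               # stand-in for long work with no stop flag
    events.append("finished normally")
except PlanChanged as exc:
    events.append(f"interrupted: {exc}")
```

Since the handler builds the exception itself, it can carry data, which the `PyThreadState_SetAsyncExc()` route cannot do directly.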
The unexpected must be handled through exceptions
there are always exceptions
Now you might point out that all this is not needed if the whatever_plan_change condition never happens, because the programmer has written code so good that nothing unexpected can happen. Well, I would answer that nothing in the real world is free of unexpected events. The physical world certainly is not, so a program interacting with physical things (any hardware device, network connections that can drop, machines with many effectors, robots and autonomous cars) needs a good exception-handling system. Even the controlled and predictable inner world of a computer is subject to unexpected events: wouldn’t you mind if your OS crashed whenever one program hit a segfault, or whenever you removed your USB stick without unmounting it first? Ambient radiation can also corrupt memory, however perfect the program is. Luckily, in the inner world of the computer, unexpected events are very rare, which allows entire OSs to be written in a language without dedicated exception handling (like C). Even a very intelligent system (like a human) performing a task it is used to (like walking down the street) in a relatively regulated environment can face rare and unexpected events: if a car crashes in front of him, I bet he will interrupt his walk toward the danger whatever state his walk is in, rather than wait for his walking procedure to complete. So in conclusion, only God shall be exception-free.
async programming doesn’t help much
You might also wonder why this should go through exceptions/interrupts rather than an asyncio-like system, where the language syntax automatically splits the code into a state machine whose events are handled by an event manager. I don’t think such a system is suitable, for three main reasons:
- The language syntax will produce a state machine, or some code split that can theoretically be seen as one. As a result, any interruption occurring in the middle of one of its state code blocks will wait for that block to complete before taking effect. With asyncio, this means that calling a non-async function makes your thread non-interruptible until that function has returned. Of course this is a big issue, since most computational functions are not (and should not be) implemented as async.
- This would require writing most of your code in async syntax, meaning you will put await in front of every function call so that your thread can be interrupted at any line. While a bit more readable than the aforementioned omnipresent if mythread_exit: return, it is definitely as cumbersome to write.
- Python is an interpreted language, so individual operations are already split at the bytecode level and the interpreter already checks for pending exceptions between operations. Receiving interrupts as exceptions therefore adds no cost to normal operation, contrary to awaiting everything.
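The first point can be observed directly with asyncio. In this small sketch, `task_body` and `main` are made-up names and `time.sleep(0.3)` stands in for any ordinary non-async function; the caller wants to cancel after about 50 ms, but the cancellation only goes through once the blocking call has returned and the task reaches an `await`.

```python
import asyncio
import time

async def task_body():
    time.sleep(0.3)            # ordinary blocking call: the event loop is stuck here
    await asyncio.sleep(10)    # first point where a cancellation can be delivered

async def main():
    task = asyncio.create_task(task_body())
    start = time.monotonic()
    await asyncio.sleep(0.05)  # we intend to cancel after about 50 ms
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return task.cancelled(), time.monotonic() - start

was_cancelled, elapsed = asyncio.run(main())
print(f"cancelled={was_cancelled} after {elapsed:.2f}s")
```

The task does end up cancelled, but only after the full duration of the blocking call rather than after the intended 50 ms.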
all as exception
My opinion is that the Python exception system is not just a debugging trick (although it is very convenient for debugging!), but a very good system for managing unexpected cases, whether they come from uncertainties in the program design (like bugs) or from uncertainties in the program’s execution context, especially the real world. Python exceptions are like a perfectly safe goto that travels up the call stack until it finds a scope that can handle the situation. This is very powerful.
Experimenting with interruptible threads
For now, I have implemented such threads on top of threading.Thread using the aforementioned PyThreadState_SetAsyncExc(), because that was easy. I can share this code if you want.
In my own case, I’m working on an advanced robotics project involving many sensors, effectors and of course a lot of computation, including deep learning, path planning and so on. It mixes a lot of libraries together and uses a lot of resources, which is why I started thinking about interruptible threads. I bet most projects in this field, or in the field of autonomous vehicles, have the same needs. The same goes for server software that has to deal with connection issues, bad client requests, etc.
I’ve been using Python for years, but only one month ago did I realize I could use interruptible threads (because my need for them kept growing over the past year). I must say it works quite well and greatly simplifies the control flow of my programs. I think this could be of great help for many threading applications in Python.
What do you think of it?
I hope I have clearly described the idea and the interest behind interruptible threads. I don’t think I have overlooked better ways to signal or stop threads (available in Python or not); if I have, please tell me. If you have observations, criticism or advice, you are welcome.
@python-developpers: if you agree on the interest of such a feature, do you think we could implement it in a future version of CPython? (I would write a PEP.)