What about interruptible threads

This post proposes a feature rarely available in other languages, one that I suppose many could consider “bad programming”. I’m conscious that this topic could be seen as going against commonly accepted good practices regarding threading. I write this to explain why I consider the described feature good programming, and even a very interesting feature for Python to have.

This is all about control-flow interactions between threads, done in a safe way.

The ideal & intuitive way versus the cruel reality

Let’s start with an example: suppose I have two threads and I want to signal or stop the second one:

# in a perfect world

def myfunction():
    # ... do something, very long, very complex, and very nested
mythread = thread(myfunction)

# ... do something
if whatever_plan_change:
    mythread.terminate()

Well, this is not possible in Python using the provided threading API. It is possible in other languages by killing the thread the way we kill a process. But reaching for this solution is not a great idea even for normal operations on processes, and it is an even worse idea when, instead of killing a process with its own separate memory and resources, you kill a thread sharing memory and resources with the other threads of its process (memory corruption and bad resource disposal will likely happen in an uncontrolled way).

Usually, given the current threading API, one would recommend creating a flag and letting the thread exit by itself instead:

# in the real world

def myfunction():
    while not mythread_exit:
        # ... do something, repeatedly, not nested
mythread_exit = False
mythread = thread(myfunction)

# ... do something
if whatever_plan_change:
    mythread_exit = True

You can notice we had to change two important things from the previous code:

  • we created a flag variable we need to share between threads. It can be shared using a global variable (not recommended), a pointer/reference, a free variable (since Python features them), or even a thread object’s attribute if you subclass the thread class.

    So this change is often not a big deal in Python, but it assumes that every function that needs to stop the thread can access this flag variable (and this is the reason why global variables are often used instead of free variables)

  • we changed the function body of the thread into a loop, checking the flag at every iteration. Alternatively we could keep the same function body and place a check between each of the original code lines, but this would be more than ugly and inconvenient (here is an example just to make your eyes cry at the idea):

    def myfunction():
        if mythread_exit:  return
        do_some_stuff()
        if mythread_exit:  return
        yet_more_stuff()
        if mythread_exit:  return
        if some_condition():
            if mythread_exit:  return
            do_an_action()
            if mythread_exit:  return
        and_yet()
        if mythread_exit:  return
    

So, in conclusion, the flag trick obliges you to rewrite your thread as a loop, and this is what this post is all about: many tasks cannot be meaningfully rewritten as a loop. There are tasks where nothing is looping, but the operations are simply time consuming, and we want to stop the thread for various reasons (to save computational/physical resources, or to prevent the thread from doing something that is no longer wanted when it manages hardware or shared resources)

One could argue that any non-looping task can be rewritten as a loop using the pattern of a state machine (like this example, or a bare switch-case), or something similar with an event-callback system or a set of possible commands in the thread. Well, all of this is very intrusive for the code structure of the desired task:

  • This will always lead to a very large amount of code if the task must be interruptible at many points (you end up splitting the task into one function, one state, or one callback per original code line).
  • Furthermore, you might want to stop/signal the thread during a long operation that was implemented not by you but by some other library (and you cannot alter that code to fit into states or callbacks)
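
To make that concrete, here is a minimal sketch (all names hypothetical) of the state-machine rewrite for a task of just two sequential steps; the recorded step names stand in for real work:

```python
# Hypothetical sketch of the state-machine pattern applied to a sequential task.
# 'should_exit' plays the role of the shared exit flag, checked at every iteration.
def state_machine_task(should_exit):
    steps = []           # records completed steps, standing in for real work
    state = "start"
    while state != "done" and not should_exit():
        if state == "start":
            steps.append("some_stuff")   # stand-in for do_some_stuff()
            state = "more"
        elif state == "more":
            steps.append("more_stuff")   # stand-in for yet_more_stuff()
            state = "done"
    return steps

state_machine_task(lambda: False)   # runs to completion: both steps recorded
state_machine_task(lambda: True)    # interrupted before the first step
```

Each original code line becomes its own state, which is exactly the code inflation described above.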

Looking at how this problem is addressed by different languages and libraries, we can see that: C users will choose a switch-case state machine or conditional loops, C# users will choose event callbacks, Qt enforces event callbacks (even when there is only one thread), GTK likewise, Node.js too, and Python users will choose between the three options. Every time, fitting one of these three designs demands a lot of work from users who are not familiar with it, or who want to program a task that does not fit well into these code-splitting designs (like the above-mentioned long-but-sequential, nested, or non-user-written code).

This is not to say that conditional loops, state machines and event-callback systems are bad (I’m a great fan of them on other occasions), just that they are irrelevant to tasks that are not particularly splittable, for various reasons.

Here come interruptible threads.

a ray of hope with interrupts

The idea

The idea comes from the world of microchips, where there are no threads, but where the issue of unexpected events happening at any moment in the code must still be addressed (events due to the program or, more likely, to the physical world). On every microchip in the world, we can do the following (using the Arduino API in order to hide the register-level implementation):

void mycallback() {
    // do something, once the callback is finished, the execution of the main continues
}

main() {
    attachInterrupt(5, mycallback);  // declare the callback for execution in case of a change on external pin 5
    attachInterrupt(CLOCK1, myothercallback);   // declare the callback for execution by an internal flag of the microchip
    
    // do my complex, long, very nested stuff
    // if an interrupt is triggered, the main is paused, the corresponding callback is called, and then the main is resumed
    
    nit();  // disable interrupts (interrupt signals will be held until the next enable)
    // do some sensitive stuff, that must not be paused
    eit();  // re-enable interrupts
}

This all works on a single thread (the microchip’s only thread), and the same is possible on every computer CPU when writing code at the kernel level.

How does that help in our case? Simple: I propose that we implement this kind of interruption mechanism in Python threads. It could resemble the following:

class Thread:
    def nointerrupts(self) -> 'context manager':
        ''' 
        return a context manager that disables interruptions as long 
        as it is open; all received exceptions will be raised on the 
        context exit 
        '''

    def interrupt(self, exception):
        ''' 
        send an exception to interrupt the thread; this can be done 
        from any thread. if the thread has disabled interrupts, the sent 
        exceptions will be stacked in an ExceptionGroup and raised on enable 
        '''
        
    # a method to join a thread and check for its possible errors
    def wait(self, timeout) -> 'result':
        ''' 
        join the thread. if the thread exited with a return value, 
        it is returned; if the thread exited with an exception, it 
        is propagated here 
        '''
    
    # maybe also this one, much more similar to the microchip's attachInterrupt()
    def attach(self, exception_type, callback):
        ''' 
        when an exception is received by a thread, the thread is paused and 
        the callback attached to the exception is run. if this callback 
        raises or there is no callback, the exception is raised in the thread 
        '''

So we could program this:

def myfunction():
    with currenthread().nointerrupts():
        # ... do critical stuff here that shall not be interrupted
    # ... do something, very long, very complex, and very nested, that can be interrupted
mythread = thread(myfunction)

# ... do something
if whatever_plan_change:
    mythread.terminate()  # alias to   mythread.interrupt(SystemExit)
    # also possible with any exception
    mythread.interrupt(MyCustomException('time to react, thread!'))  
result = mythread.wait()

Implementation

One might wonder, after all these esoteric considerations, whether this is even doable. The answer is yes! Both for Python and for compiled languages that already provide exceptions.

There is already a way to implement it without writing a single line of C code, using PyThreadState_SetAsyncExc() from the CPython API. The only limitation is that it can only send an exception type to the thread to interrupt, so there can’t be additional data in the exception, but we can work around that.
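
For reference, here is a sketch of that known ctypes recipe (the helper name interrupt_thread is mine, not an official API):

```python
import ctypes
import threading
import time

def interrupt_thread(thread: threading.Thread, exc_type: type) -> None:
    # Ask CPython to raise exc_type asynchronously in the target thread.
    # Note: only an exception *type* can be passed, as said above.
    res = ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_ulong(thread.ident), ctypes.py_object(exc_type))
    if res == 0:
        raise ValueError("invalid thread id")
    elif res > 1:
        # more than one thread state was affected: undo and complain
        ctypes.pythonapi.PyThreadState_SetAsyncExc(
            ctypes.c_ulong(thread.ident), None)
        raise SystemError("PyThreadState_SetAsyncExc failed")
```

The exception is only delivered when the target thread is executing Python bytecode, so a thread stuck in a long C call will not react until that call returns.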

This first way of interrupting is tied to the current CPython implementation with the GIL, so one could fear that this feature could not be ported to other Python implementations now or in the future. Luckily this is not a problem, since POSIX threads are designed to be notified using OS-level signals (simple integers) that work in threads in much the same way interrupts do on microchips. The best example is the SIGINT signal that is sent to the main thread of a program when we type Ctrl+C in a shell. In addition to SIGINT, POSIX threads allow sending a great variety of signals (including some user-defined signals) from process to process and from thread to thread.

In fact, POSIX threads already provide half of what I propose for interruptible threads in any C program. Thread signals only pause and resume the thread’s operations, unlike exceptions, which progressively unwind everything until finding a stack frame where the exception is handled. As far as C is concerned, stopping the current operation at signal reception can be implemented by combining signals and the standard setjmp.h. Here is an example (though very unsafe, as is everything regarding threading in C).

Most OSs in the world implement POSIX threads, and so does Windows with this port. And for Windows’ own threads that do not support POSIX signals, there is still a way to receive interruptions through their C/C++ structured exceptions.

This means that even in a wonderful future where the GIL has been removed, interruptible threads will still be possible and fine. It would simply require reworking Python’s signal handling a bit, to let threading.Thread objects send and receive a chosen signal between them.

A Python implementation of interruptible threads using POSIX threads could work this way:

  • the main thread no longer receives all the OS signals; instead, each thread receives a specific signal, let’s call it SIGEXC
  • when a Python thread is initializing, it registers a C signal callback for SIGEXC
  • when interrupting a thread, Python puts the exception in the target thread’s waiting list, then sends a SIGEXC to that thread
  • when the target thread’s C callback receives the signal, the OS thread implementation pauses the thread’s execution, including the Python code; the callback checks whether the thread has interrupts enabled
    • if enabled: it sets the received exception for the next time the thread’s interpreter checks exceptions
    • if disabled: it stacks the received exception for a check on re-enable
  • when a thread re-enables exceptions, it checks for stacked received exceptions; if present, it first calls the registered Python callbacks (if any), then sets the exception in the normal operation flow and resumes the flow

The unexpected must be handled through exceptions

there are always exceptions

Now you might point out that none of this is needed if the whatever_plan_change condition never happens, because the programmer has written code so good that nothing unexpected can occur. Well, I would answer that nothing in the real world is free of unexpected events. The real world certainly is not, so a program interacting with physical stuff (any hardware device, a network connection that can suffer disconnections, machines with many effectors, robots, autonomous cars) genuinely needs a good exception-handling system. Even the controlled and predictable inner world of a computer is subject to unexpected events: wouldn’t you mind if your OS crashed whenever one program encountered a segfault, or whenever you removed your USB stick without unmounting it first? Ambient radiation can also corrupt memory even when the program is perfect. Luckily, in the inner world of the computer, unexpected events are rare enough to allow an OS to be written in a language without dedicated exception handling (like C). Even a very intelligent system (like a human), performing a task it is used to (like walking down the street) in a relatively normed environment, can face rare and unexpected events (like a car crashing in front of him; I bet he will interrupt his walk away from the danger whatever his walk state is, not wait for his walk procedure to complete). So, in conclusion, only God shall be exception-free :wink:

async programming doesn’t help much

You might also wonder why this should go through exceptions/interrupts rather than an asyncio-like system, where the language syntax automatically splits the code into a state machine whose events are handled by an event manager. I don’t think such a thing is suited here, for mainly three reasons:

  • The language syntax will produce a state machine, or whatever code split that can theoretically be seen as a state machine. As such, any interruption occurring in the middle of one of its state code blocks will wait for the state procedure to complete before interrupting. With asyncio this means that calling a non-async function makes your task non-interruptible until that function has returned. This is of course a big issue, since most computational functions are not (and shall not be) implemented as async.
  • This would require writing most of your code in async syntax, meaning you put await in front of every function call so your task can be interrupted at any line. While a bit more readable than the aforementioned omnipresent if mythread_exit: return, it is just as cumbersome to write.
  • Python is an interpreted language, hence the individual operations are already split at the bytecode level and the exception state is already checked after every operation. So receiving interrupts as exceptions brings no additional cost to normal operations, unlike awaiting everything.
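
The first point can be demonstrated with a toy example of my own: a cancellation requested while a coroutine is inside a plain blocking call is only delivered at the next await point:

```python
import asyncio
import time

async def work(log):
    try:
        time.sleep(0.2)             # plain blocking call: the whole event loop is stuck here
        log.append("blocking done")
        await asyncio.sleep(10)     # cancellation can only be delivered at an await
    except asyncio.CancelledError:
        log.append("cancelled")
        raise

async def main():
    log = []
    task = asyncio.get_running_loop().create_task(work(log))
    # request cancellation almost immediately, "during" the blocking call
    asyncio.get_running_loop().call_later(0.01, task.cancel)
    try:
        await task
    except asyncio.CancelledError:
        pass
    return log

print(asyncio.run(main()))  # ['blocking done', 'cancelled']
```

Even though the cancel was requested at 0.01 s, the blocking call ran to completion; only then was the CancelledError delivered.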

all as exceptions

My opinion is that Python’s exception system is not just a debugging trick (though very convenient for debugging!), but a very good system for managing unexpected cases, arising from uncertainties in the program design (like bugs) but also from uncertainties in the program’s execution context, especially the real world. Python exceptions are like a perfectly safe goto that travels up the call stack until finding a scope that can handle the situation. This is very powerful.
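
As a trivial illustration of this “safe goto”: an exception raised deep in the call stack travels up through the frames that ignore it, until it reaches a scope that can handle the situation:

```python
def deep():
    raise RuntimeError("unexpected event")

def middle():
    deep()          # no handling here: the exception just travels through

def top():
    try:
        middle()
    except RuntimeError:
        return "handled at the right level"

print(top())  # handled at the right level
```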

Experimenting with interruptible threads

For now, I have implemented such threads on top of threading.Thread using the aforementioned PyThreadState_SetAsyncExc(), because that was easy. I can share this code if you want.

In my own case, I’m working on an advanced robotics project involving many sensors and effectors, and of course a lot of computation involving deep learning, path planning and so on. It mixes a lot of libraries together and manages a lot of resources; that is why I started thinking about interruptible threads. I bet most projects in this field, or in the field of autonomous vehicles, have the same needs, as well as server software that has to deal with connection issues, bad client requests, etc.

I’ve been using Python for years, but only one month ago did I realize I could use interruptible threads (because my need grew higher and higher over the past year). I must say it works quite well and greatly simplifies the control flow of my programs. I think this could be of great help for many threaded applications in Python.

What do you think of it?

I hope I clearly described the idea and the interest behind interruptible threads. I don’t think I have overlooked better ways to signal/stop threads (available in Python or not); if I have, please tell me. If you have observations, criticisms or even advice, you are welcome :slight_smile:

@python-developpers: if you agree on the interest of such a feature, do you think we could implement it in a future version of CPython? (I would then write a PEP)


I’m not an expert on threads, but I wanted to comment on a few things:

If you want to go against the prevailing wisdom, it is not enough to just say that interruptable threads are good, you need to explain why the prevailing wisdom that interruptable threads are bad is actually incorrect.

In other words, you need to refute the arguments against interruptable threads, not just contradict them.

I will admit, I have never understood why threads cannot be interrupted, but whatever reasons there are, you have to refute those reasons and show why they are mistaken, not merely argue by analogy that since single-threaded CPU machine code can be interrupted, we ought to allow multi-threaded Python code to do the same.

I don’t understand what you mean by “pointer/reference” and “freevariable” here.

All your criticisms of threading seem valid. That’s an argument against threading, it does not refute the prevailing view that interruptable threads are dangerous or harmful.

How does one call PyThreadState_SetAsyncExc from the C API without writing C code?

What happens if two, or more, threads each register themselves to receive the SIGEXC?


I’m also not extremely familiar with the implementation of threads, but code running in C instead of Python would make something like this somewhat confusing to the user. Some operations may be interruptible while others aren’t (e.g. a long-running calculation that goes through numpy), and there’s not much you can do to gracefully stop arbitrary C code


Using ctypes is the approach I’d expect in that case.


Of course, maybe my argument scheme is poor, but in the rest of the post I intended to prove that it is good, because everything can be done safely with respect to memory and resource management.
I admit I haven’t been explicit about the subtle difference between my proposition (stopping threads by sending them exceptions) and what is discouraged by best practices (killing threads). In the first way, you let the thread handle the exception the way it wants and exit in a controlled way; whereas in the second way you stop it in the middle of something, taking no measures to prevent memory leaks or deadlocks, or to exit from critical portions of code. As far as I can see, good practices currently discourage the second way because of its uncontrolled consequences (and I totally approve of that). But they say nothing about the first way, because it is not really an option yet.
So what I propose is actually not opposed to good practices (in my opinion). I’m merely conscious that it could be seen as opposed.

argue by analogy that since single-threaded CPU machine code can be interrupted, we ought to allow multi-threaded Python code to do the same.

Well, this is not exactly my point. I took inspiration from single-threaded CPU machines, but that’s just inspiration. In the considerations that follow, I intend to explain why I think it could be a good idea to transpose it to Python …

I don’t understand what you mean by “pointer/reference” and “freevariable” here.

Simply this:

  • share a flag using a freevar:
from time import sleep
import threading

def main():
    flag = 0
    def mytask():
        nonlocal flag  # this is only needed if 'flag' is assigned in this scope
        sleep(1)
        print(flag)
        # 'flag' lives in a closure cell shared with the scope of 'main', so it
        # can be modified in 'main', and even if 'main' returns, modifications
        # remain visible on the 'flag' variable in 'mytask'
    thread = threading.Thread(target=mytask)
    thread.start()
    flag = 1

# outputs '1'
  • share with a reference/pointer:
def main():
    flag = [0]
    def mytask(flag=flag):
        sleep(1)
        print(flag[0])
        # 'flag' is a reference to a list, so any modification to its content
        # in 'main' will be visible from 'mytask'
    thread = threading.Thread(target=mytask)
    thread.start()
    flag[0] = 1

# outputs '1'

How does one call PyThreadState_SetAsyncExc from the C API without writing C code?

This is possible using ctypes, because ctypes.pythonapi already exposes the symbols of the CPython C API

What happens if two, or more, threads each register themselves to receive the SIGEXC?

Nothing wrong should happen: each thread registers itself to receive only the SIGEXC sent to it explicitly. So if you send SIGEXC to one thread, no other thread should receive it (according to the POSIX threads specs)
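
On Unix, this per-thread targeting is in fact already exposed by Python as signal.pthread_kill(). A small (Unix-only) demonstration of my own, targeting the main thread from a worker; note that CPython still runs the Python-level handler only in the main thread, so this shows just the delivery half of the mechanism:

```python
import signal
import threading
import time

received = []
# Python-level handlers always run in the main thread, which is also
# the thread we target here
signal.signal(signal.SIGUSR1, lambda signum, frame: received.append(signum))

main_id = threading.get_ident()

def poker():
    # deliver SIGUSR1 to one specific thread, as POSIX allows
    signal.pthread_kill(main_id, signal.SIGUSR1)

t = threading.Thread(target=poker)
t.start()
t.join()

# the handler runs at the main thread's next bytecode check
deadline = time.monotonic() + 2
while not received and time.monotonic() < deadline:
    time.sleep(0.01)
```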

Yes, I agree that could be a bit confusing, but it is already the same when you type Ctrl+C and Python waits for C-implemented functions to complete (this can take a few seconds with libraries like Qt)
I think interruptible threads would still be interesting, because they would allow requesting that the thread handle the exception or exit “as soon as it is safe” (meaning as soon as the control flow is back in Python’s hands)

Unfortunately, I’m not an expert on threads either … I just discovered I could make interruptible threads thanks to the CPython API and the GIL, then I looked at POSIX and Windows threads to check whether it could be done independently :wink:

I am afraid that interruptions of a Python program are not safe, and a signal handler must be extremely limited in what it can do. If I understand correctly, the signal handler should not expect the Python data structures to be in a consistent state … so it should not change the flow of the Python code.

If a signal handler raises an exception, the exception will be propagated to the main thread and may be raised after any bytecode instruction. Most notably, a KeyboardInterrupt may appear at any point during execution. Most Python code, including the standard library, cannot be made robust against this, and so a KeyboardInterrupt (or any other exception resulting from a signal handler) may on rare occasions put the program in an unexpected state.

Maybe this writeup from @njs is relevant to the discussion. In this blog post he details all the pitfalls he had to overcome to make sure the Ctrl-C signal handling in trio worked properly.


@matmel, I will take a look at this then. In the meantime, the following is how I thought signal handling could be worked out.

@vbrozik:

I agree: a signal received at C level could occur at any moment, for instance during an object allocation, a reference-count operation, etc. I propose to split the signal handling into two parts: the C-level callback and the Python-level callback.
The C-level callback only receives the signal and stores it somewhere the Python interpreter can find it on resume (in the exception state, for instance). Then the C callback exits without changing anything else in the unknown Python state.
When the Python interpreter resumes and finishes its paused operation, it finds the exception state set by the signal. It can then execute the signal callbacks and raise an exception if needed.

KeyboardInterrupt (or any other exception resulting from a signal handler) may on rare occasions put the program in an unexpected state.

This, however, is a different matter I think: it’s not about corrupting the Python interpreter, but about cutting the Python flow at a moment we don’t want. I think the solution to this problem should be to suspend interruptions (all signals, including SIGINT) for such sensitive sequences.
Until now, to achieve this, there was no option other than replacing the callback for the whole Python program in order to disable it, then registering the original callback again after the critical sequence. Not convenient.
With the interruptible API I’m proposing, and in the example from the docs, we could do something like the following, to make sure any interrupt occurring during the __enter__ procedure is handled afterwards:

class SpamContext:
    def __init__(self):
        self.lock = threading.Lock()

    def __enter__(self):
        # If KeyboardInterrupt occurs here, everything is fine
        with currentthread().nointerrupts():
            # if SIGINT is received here, it will be delivered after the 'with' statement
            self.lock.acquire()
        # KeyboardInterrupt could occur just before the function returns

    def __exit__(self, exc_type, exc_val, exc_tb):
        with currentthread().nointerrupts():
            ...
            self.lock.release()

In fact, whether or not this idea of an interrupt system is approved: shouldn’t the with statement always block keyboard interrupts, so that they can never occur during __del__, __enter__ or __exit__ (or only if Python is blocked in one of those)?

Problems and solutions for signal handling in threads should follow those of signal handling in the main thread.

Yeah, this is one of those classic traps: it’s almost useful, it seems to work great in lots of cases, but if you want your software to be actually reliable then this kind of thread cancellation just… cannot be made workable. And sometimes you don’t discover that until you’ve already shipped it. (Like happened to Java and Win32. They have regrets).

For example:

try:
   do_stuff()  # <-- if an exception is raised here, the finally block will clean things up
finally:
   # <-- if an exception is raised here, whoops, no cleanup
   cleanup()

And putting a with nointerrupts() inside the finally block doesn’t help, because the interrupt might happen just before the call to with nointerrupts().

Or consider:

jobs = [...]
while jobs:
    job = jobs.pop()
    execute_job(job)

Suppose this gets cancelled. No problem! You can just look at the jobs list to see which jobs have executed and which haven’t. Except… that doesn’t work, because you can’t distinguish between an interrupt that fires before or after the pop. And this applies to like, any non-trivial state manipulation your program might do. Mayyyybe if you audit every single place in your program, and all third party libraries, that have any side-effects, and carefully disable interrupts etc., you could get your one program to work, but no-one can live like that.

(Thread cancellation actually does work great in Haskell, because you can make sure that the thread you’re cancelling has no possible side-effects. But Python isn’t Haskell :-).)

We mostly get away with it for KeyboardInterrupt, because most programs exit after KeyboardInterrupt, so any corrupted state gets wiped clean. And if once in 100 times, it does corrupt your program’s state and makes the program crash… well, crashing is like exiting, so the user won’t be too upset? (Thank you for playing Wing Commander.)

IMO PyThreadState_SetAsyncExc shouldn’t exist. There’s no way to write reliable programs using it, and it’s not even a good implementation of thread cancellation, because if a thread is stuck waiting on some I/O, you can’t interrupt it this way. It’s grandfathered in because it got added as a kind of experimental gimmick back in 2003, and I think IDLE used to use it? But you shouldn’t use it in new code.


There’s no function to directly raise an exception in another thread. One can suspend a thread and modify its context to call RaiseException (i.e. SuspendThread, GetThreadContext, SetThreadContext, ResumeThread), but I don’t see the point since the thread context could just be made to run the ‘signal’ handler instead of raising an exception.

Nowadays, QueueUserAPC2() can queue a ‘special’ user-mode APC (asynchronous procedure call) to a thread, which will interrupt a thread that’s running or in an alertable wait. It won’t interrupt a non-alertable wait, however, which is what Python generally uses for synchronous I/O and waiting on kernel objects. Alertable waits can be used instead of non-alertable waits, and synchronous I/O can be canceled via CancelSynchronousIo() (or use async I/O with an alertable wait).

These are all very good points, including what’s on your blog regarding Ctrl+C …
So Python’s SIGINT handling is only 99% reliable, and so is trio, and so would be any signal handling written in pure Python :frowning_face:

If you don’t mind spending even more time with me on this topic, we may have more options to explore. I’m afraid this would be more complex to implement than what I previously proposed … I was tempted to give up on this topic in the face of the difficulty, but since I proposed the idea, and since I think bringing Python to 100% reliability on both KeyboardInterrupt and signals is important, I suggest this:
what about finding a way to make SIGINT 100% reliable? And then maybe we could extend it to handle any incoming signal or exception?

First of all, this case:

lock.acquire()
try:
    pass
finally:
    lock.release()

cannot be helped whatever we do, since what comes before a try is not meant to be part of it; even with the biggest modifications to Python, we cannot know whether the lock acquisition is to be cleaned up or not. What I propose does not address this case.

The problem with signal handling, after CPython has secured all its C-level functions, is that a signal can arrive during any bytecode instruction; and since the interpreter checks between every instruction, it can land at any place in our code. An option to explore: could we change the fact that the interpreter checks between every instruction?

Suppose we have two more opcodes:

  • a NIT opcode, standing for “No further InTerrupts”

    It disables the check for interrupts for the current instruction and the following ones

  • a TIT opcode, standing for “Try enable InTerrupts”

    It enables the check for interrupts for the current instruction and the following ones, but only if the thread flag allowing interruptions is set; otherwise it is a no-op

Using these instructions, we ensure at bytecode generation that these opcodes always wrap: the finally block, and the calls to __enter__ and __exit__ in the with statement. And any GC operation, like __del__ methods, also disables interrupts the way these opcodes do.
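
To make the intended semantics precise, here is a toy pure-Python model of the NIT/TIT behaviour described above (entirely hypothetical, a model rather than an implementation):

```python
class InterruptState:
    """Toy model of one thread's interrupt state under the proposed opcodes."""
    def __init__(self):
        self.checking = True   # is the interpreter currently checking for interrupts?
        self.allowed = True    # thread flag consulted by TIT (the proposed
                               # 'nointerrupts' context manager would clear it)
        self.pending = []      # exceptions received while checks were disabled

    def nit(self):
        # NIT: stop checking for interrupts from here on
        self.checking = False

    def tit(self):
        # TIT: re-enable checks, but only if the thread flag allows it;
        # on re-enable, deliver any exception stacked in the meantime
        if self.allowed:
            self.checking = True
            if self.pending:
                raise self.pending.pop(0)
```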

The following examples are modified bytecode of python 3.8

  • For a finally clause, it looks like this (and, as explained above, an acquire placed before the try is not protected)

    try:
      # a signal can stop the content
      do_something()
    finally:
      # no signal can stop the finally block
      lock.release()   
    
    SETUP_FINALLY  (to NIT)
    
        LOAD_GLOBAL    do_something
        ...
    
    NIT    <----- we are starting the finally block and don't want to be interrupted
    
        LOAD_GLOBAL    lock
        LOAD_METHOD    release
        CALL_METHOD    0
        POP_TOP
    
    TIT
    RERAISE
    
  • for a with clause, it looks like this:

    # no signal can make entering or exiting the context fail
    with lock:
        # signal can still stop the content of the block
        do_something()
    
    LOAD_GLOBAL   lock
    NIT                         <----- we are entering __enter__ and don't want to be interrupted
    SETUP_WITH   (to NIT)
    TIT
    POP_TOP
    
        LOAD_GLOBAL    do_something
        ...
    
    NIT                      <----- we are entering __exit__ and don't want to be interrupted
    WITH_EXCEPT_START
    TIT
    POP_JUMP_IF_TRUE
    RERAISE
    

These examples should fix the problem of KeyboardInterrupt during cleanups, as long as cleanups are made only in context managers or in finally blocks. There will be moments where a Python program cannot respond immediately to SIGINT, and that is only for the good. (And it shouldn’t be for long, since such procedures for deallocation, context management, etc. are usually fast from a human perspective.) In the rare case where the user has put some long operation in one of those procedures, the program may take long to react to SIGINT. My opinion is that if the user wants to stop in the middle of a critical-by-design procedure, there is no safe choice other than stopping the whole process by sending it SIGKILL instead of SIGINT (as interrupting is very likely to break something). A C-level callback can still output the Python backtrace on such an occasion. So this could be implemented in the C signal callback when multiple SIGINTs are received in a short time, or it could be up to the shell to provide a different keyboard shortcut that sends SIGKILL.

For a bit more flexibility in those statements (in case they are not used only for cleanup), we can add two thread-global context managers, nointerrupts and interrupts, that play together with the NIT/TIT opcodes above:

  • nointerrupts disables the flag that allows the TIT opcode to reenable interrupts, for the duration of the context manager
  • interrupts enables that flag, for the duration of the context manager

The following is a complex example of nested initialization, cleanup and non-cleanup zones:

with nointerrupts:
    # TIT will not reenable interrupts in this block,
    # the block begins with a TIT
    # this is the initialization zone
    lock.acquire()

    # no signal can prevent getting into the 'try' block
    try:
        with interrupts:
            # the block begins with a TIT that reenables interrupts in this block
            # this is a non-cleanup zone
            do_something()
    finally:
        # this is the cleanup zone
        lock.release()
        with interrupts:
            # the block begins with a TIT that reenables interrupts in this block
            # this is a non-cleanup zone
            do_something()

# TIT will reenable interrupt outside the block,
# the block ends with a TIT
LOAD_GLOBAL  nointerrupts
NIT                 <----- disable interrupts
SETUP_WITH          <----- set a flag to indicate that interrupts won't be reenabled again on TIT
TIT                 <----- so does not reenable interrupts
POP_TOP

    LOAD_GLOBAL  lock
    LOAD_METHOD  acquire
    ...

    SETUP_FINALLY  (to NIT)

        LOAD_GLOBAL  interrupts
        NIT
        SETUP_WITH          <----- set a flag to indicate that interrupts can be reenabled again
        TIT                 <----- so reenable interrupts
        POP_TOP

            LOAD_GLOBAL  do_something
            ...

        NIT
        WITH_EXCEPT_START   <----- set the flag to indicate that interrupts cannot be enabled
        TIT                 <----- so do not enable
        ...

    NIT

        LOAD_GLOBAL  lock
        LOAD_METHOD  release
        ...

        LOAD_GLOBAL  interrupts
        NIT
        SETUP_WITH          <----- reset the flag so that interrupts can be enabled
        TIT                 <----- so enable
        POP_TOP

            LOAD_GLOBAL  do_something
            ...

        NIT
        WITH_EXCEPT_START   <----- set the flag to indicate that interrupts cannot be enabled
        TIT                 <----- so do not enable

    TIT
    RERAISE

NIT
WITH_EXCEPT_START    <----- reset the flag so that interrupts can be enabled
TIT                  <----- so enable

I don’t know what the opcodes for the with and finally statements are in more recent versions of python, but if there are specific opcodes for block starts and ends, the calls to NIT/TIT could also be integrated into those opcodes, so the bytecode sequence could be shorter.
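For reference, the opcodes a given interpreter actually emits for a with block can be inspected with the standard dis module; this small snippet prints the set of instruction names for a trivial guarded function (the exact names, e.g. SETUP_WITH versus BEFORE_WITH, vary with the python version):

```python
import dis

def guarded(lock, do_something):
    with lock:
        do_something()

# Collect the distinct opcode names used by the compiled function.
# The set differs across interpreter versions (e.g. SETUP_WITH in 3.10,
# BEFORE_WITH and exception tables in 3.11+).
names = {ins.opname for ins in dis.get_instructions(guarded)}
print(sorted(names))
```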


Killing threads is not bad practice, because it is already an inherent part of the language. If you really wanted to prevent bad practices you’d have to prevent everything that is equivalent to killing threads, and that would include exceptions.

If an exception is raised, a thread also does not get a chance to close any opened resources through normal code execution paths, and any mechanism that exists to deal with such a case can also be used for explicitly killed threads. The kill() function is equivalent to raising an exception at an arbitrary point in the code’s execution, which can already happen anyway, whether the thread is running or blocking on some operation (such as timeout exceptions). The only difference is the conceptual reason as to why or who is responsible for triggering the exception and hence the killing of the thread, which is irrelevant from a programming point of view. So kill() can just raise a KilledException, and the thread is responsible for implementing a try/except/finally or with statement to handle resources. Such a feature would greatly speed up python development because it would make concurrent programming much easier. The only thing required, which is already done for exceptions, is to respect the finally clause and give it a chance to run (or any other thread code defined to explicitly run on a kill/termination/exception), rather than really hard-stopping the thread, because that would be equivalent to linguistically allowing exceptions to be raised, but forbidding exception handling whatsoever.
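CPython already exposes a C-level hook close to this: PyThreadState_SetAsyncExc schedules an exception to be raised in another thread. The following is a sketch of the kill() described above using that hook via ctypes; note it is CPython-specific, and the exception is only delivered between bytecode instructions, so a thread blocked inside a C call only sees it after the call returns:

```python
import ctypes
import threading
import time

class KilledException(Exception):
    """Delivered inside the target thread; the name comes from the post."""

def kill(thread):
    # CPython-specific sketch: schedule an async exception in the target
    # thread.  The target can still run its except/finally clauses.
    tid = ctypes.c_ulong(thread.ident)
    res = ctypes.pythonapi.PyThreadState_SetAsyncExc(
        tid, ctypes.py_object(KilledException))
    if res == 0:
        raise ValueError("no such thread")
    if res > 1:
        # More than one thread state was affected: undo and report.
        ctypes.pythonapi.PyThreadState_SetAsyncExc(tid, None)
        raise SystemError("PyThreadState_SetAsyncExc failed")
```

With this, a worker looping in pure python code receives KilledException asynchronously and its finally blocks still run, which is exactly the semantics argued for above.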


It’s worth noting that a try/finally does NOT guarantee that the lock will absolutely be released. What it guarantees is that lock.release() will definitely complete successfully before any following code executes. If I’m understanding you correctly, you’re talking about the situation where an exception is raised (due to an interrupt) DURING the finally clause. That would result in the exception continuing to bubble up; if that means it gets to the thread’s top level, it will terminate the thread.

So maybe the problem here isn’t the try/finally, but what happens when a thread dies while holding a lock. A threading.RLock() might be a partial solution here, as it keeps track of exactly which thread owns it; though this ownership isn’t a public attribute. Broadly speaking, though, it sounds like you want some sort of handling for the situation of “this lock is owned by a thread that no longer exists, so we need to clean up the resource and reset the lock”. (How you would go about cleaning up the resource safely depends on your use-case.)
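That last idea can be sketched as a small wrapper around threading.Lock that remembers which thread owns it, so another thread can detect "owner died while holding the lock" and reset it. This is a hypothetical illustration, not race-free (the reclaim check itself would need synchronization in real code):

```python
import threading

class OwnedLock:
    """Hypothetical sketch: a Lock that tracks its owning thread so it
    can be reclaimed if the owner terminated without releasing it.
    Not race-free; for illustration only."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owner = None

    def acquire(self, timeout=-1):
        if self._lock.acquire(timeout=timeout):
            self._owner = threading.current_thread()
            return True
        return False

    def release(self):
        self._owner = None
        self._lock.release()

    def reclaim_if_owner_dead(self):
        # threading.Lock may be released by any thread, unlike RLock,
        # which is what makes this reset possible at all.
        if self._owner is not None and not self._owner.is_alive():
            self._owner = None
            self._lock.release()
            return True
        return False
```

How the protected resource itself gets cleaned up before the reclaim is, as noted above, use-case dependent.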

Such killing is exactly what I imagined

The only difference is the conceptual reason as to why or who is responsible for triggering the exception hence killing of the thread, which is irrelevant from a programming point of view

And I totally agree with that

An exception occurring during the finally clause was not my topic. I meant an exception can occur between the lock.acquire() and the try clause, if the exception is triggered by an interrupt (like a KeyboardInterrupt triggered by a SIGINT).
Any interruption during a finally clause could already be prevented by the existence of nit() and eit() functions (no-interrupts, enable-interrupts): the first command in the finally clause could be a call to nit() disabling interruptions, and the last command in the clause could be eit() to reenable interruptions once the lock is cleaned up.

A threading.RLock() might be a partial solution here

Yes and no. Sure, an RLock can prevent this specific example from running into an issue, but it was just an example. In practice, the lock causing the issue can be inside third-party code called in this thread, and the third-party code does not know we need the lock to be an RLock rather than a Lock or some other non-python primitive. I could also rewrite this example with any other kind of concurrent resource than a mutex, for which no reentrant variant exists.

I have since come to think of an alternative to what I proposed.

Previously, I proposed to add two opcodes NIT and TIT to atomically disable/enable interruptions during code execution, but being opcodes, they would have to be inserted into every code sequence generated and interpreted by python. So this would lead to some slow-down of the interpreter.

I think an easier and more performant way to check for interruptions would be to check for exceptions in the thread only at the end of each opcode that calls python code:

  • opcodes checking for exceptions:
    NOP, CALL_*, *_ATTR, *_SUBSCR, *_SLICE, UNARY_*, BINARY_*, …
    and all other opcodes potentially calling python code (they need to propagate exceptions raised by the code they call, or interruptions received during their execution).
    If the operation is implemented by compiled code rather than python code, it is up to the compiled code to raise exceptions (which any library already does), or to stop itself at signal reception when possible (like sleep and Lock.acquire already manage to).

    These opcodes should check for exceptions only AFTER their execution, so that when one is called in a context where interruption is disabled, it is executed before interruption is checked again at its end; and when it has the effect of disabling interruption, an interruption found at its end can be ignored.

  • opcodes not checking for exceptions:
    *_JUMP_*, RETURN_VALUE, LOAD_FAST, COPY, DUP, SWAP, POP_TOP, …
    and all other opcodes that cannot fail and cannot take long.
    Control-flow opcodes especially must not check for exceptions, to avoid reacting to an exception while setting up a try-finally clause or an enter-exit statement.

    The opcodes quoted here do not need to check for exceptions or interruptions since they cannot fail; this will only speed them up.

Reading the master branch of cpython, it seems it has already evolved into having instructions that check for exceptions and others that do not. I haven’t looked yet at how PyThreadState_SetAsyncExc works with it, but maybe it wouldn’t be that big a change to nicely support interruptible threads in the end :slight_smile:

Ahh. If I’m not mistaken, the end result would be the same, wouldn’t it? The thread terminates? So the issue at hand is broadly the same, and the idea of a “clean up the resource and free the leak” termination handler might still be worth visiting.