Revisiting PersistentTaskGroup with Kotlin's SupervisorScope

https://kotlinlang.org/api/kotlinx.coroutines/kotlinx-coroutines-core/kotlinx.coroutines/supervisor-scope.html
I found that Kotlin’s supervisorScope does exactly what I have been trying to achieve with aiotools.PersistentTaskGroup, while Kotlin’s coroutineScope corresponds to asyncio.TaskGroup.

Unlike coroutineScope, a failure of a child does not cause this scope to fail and does not affect its other children, so a custom policy for handling failures of its children can be implemented. See SupervisorJob for additional details. A failure of the scope itself (exception thrown in the block or external cancellation) fails the scope with all its children, but does not cancel parent job.
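
By contrast, today’s asyncio.TaskGroup (Python 3.11) behaves like coroutineScope: one failing child cancels its siblings and the whole group raises. A minimal sketch of that behaviour, for reference:

import asyncio

async def crasher():
    raise RuntimeError("boom")

async def sibling():
    try:
        await asyncio.sleep(10)
    except asyncio.CancelledError:
        print("sibling cancelled because another child failed")
        raise

async def main():
    try:
        async with asyncio.TaskGroup() as tg:
            tg.create_task(crasher())
            tg.create_task(sibling())
    except* RuntimeError:
        print("the whole group failed")

asyncio.run(main())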

There was a concern about the naming “persistent” in gh-90999, and now I’m inclined to use “SupervisorScope”. As you may have already noticed, it also resembles supervisord.

What are your thoughts? Would the Kotlin analogy make a better case for adding this API to asyncio?

Looking at the docs for aiotools.PersistentTaskGroup, I finally understand the key feature that you want: when a child task crashes with an unexpected exception, you don’t want its siblings to be cancelled.

I wonder if this can’t be done with a very simple wrapper for asyncio.TaskGroup that simply wraps each task (coroutine) in a try/except BaseException block that ignores all exceptions? (Maybe with the exception of KeyboardInterrupt and SystemExit, like TaskGroup and the rest of asyncio do.)

So we could do something like this (untested, and simplified):

import asyncio

class AltTaskGroup(asyncio.TaskGroup):
    def create_task(self, coro):
        async def wrapper():
            try:
                await coro  # `coro` is already a coroutine object, not a callable
            except (KeyboardInterrupt, SystemExit):
                raise
            except BaseException:
                # Swallow the child's failure so the group does not
                # cancel its siblings.
                pass
        return super().create_task(wrapper())
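
For example, with the wrapper above (equally untested), the crash is swallowed and the sibling runs to completion instead of being cancelled:

async def crasher():
    raise RuntimeError("boom")

async def survivor():
    await asyncio.sleep(0.1)
    print("survivor finished despite the sibling's crash")

async def main():
    # With plain asyncio.TaskGroup the RuntimeError would cancel `survivor`;
    # with AltTaskGroup the wrapper silently swallows it.
    async with AltTaskGroup() as tg:
        tg.create_task(crasher())
        tg.create_task(survivor())

asyncio.run(main())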

Your draft is equivalent to wrapping the child tasks with asyncio.shield().
The point is that the child tasks still must be cancellable when the parent task group is explicitly cancelled (i.e., shut down); see the sketch after the list below for why shielding alone falls short.
So I had to change the internal logic of TaskGroup to write PersistentTaskGroup. Here is what I needed:

  • When the persistent task group is cancelled or explicitly shut down: all child tasks should be cancelled and awaited. → This makes it different from simply shielding child tasks. Propagation of cancellation should be controlled by the task group instead of by individual tasks.
  • When a child task is cancelled: all other child tasks should remain intact.
  • When a child task raises an unhandled exception: an exception handler configured in the persistent task group is invoked (the default fallback is loop.call_exception_handler()). All other child tasks should remain intact.
  • If all child tasks have finished, the persistent task group should exit as well. (same as the original task group)
  • It should guarantee termination of all child tasks once the control flow has exited the persistent task group. (same as the original task group)
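
To make the shielding limitation concrete, here is a small stand-alone sketch using plain asyncio: cancelling the coroutine that awaits asyncio.shield() leaves the shielded task running, so a group built only on shielding could not cancel its children on an explicit shutdown without tracking them by hand.

import asyncio

async def worker():
    try:
        await asyncio.sleep(10)
    except asyncio.CancelledError:
        print("worker finally cancelled at loop shutdown")
        raise

async def supervisor():
    inner = asyncio.create_task(worker())
    try:
        # Shielding protects `inner` from cancellation of this coroutine...
        await asyncio.shield(inner)
    except asyncio.CancelledError:
        # ...so when the supervisor is cancelled, `inner` keeps running and
        # would have to be cancelled separately for a clean shutdown.
        print("supervisor cancelled; worker still running:", not inner.done())
        raise

async def main():
    sup = asyncio.create_task(supervisor())
    await asyncio.sleep(0.1)   # let both tasks start
    sup.cancel()               # "explicit shutdown" of the supervising scope
    await asyncio.gather(sup, return_exceptions=True)

asyncio.run(main())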

One of the reasons to rewrite the code is that I need to run the shutdown process from somewhere other than the __aexit__() handler, e.g., via a shutdown() method, when the persistent task group is used as a long-lived object instead of as a context manager. The current TaskGroup has all its shutdown routines inside __aexit__(), and they are not reusable from other methods. It’s not “refactored” or “designed” to be subclassed.
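
Not the actual implementation, just a rough sketch of the long-lived usage pattern described here; the names follow aiotools.PersistentTaskGroup as mentioned earlier, but treat the exact signatures as assumptions:

import asyncio
import aiotools  # PersistentTaskGroup as discussed above; exact API assumed

class Daemon:
    # A long-lived object that owns its task group instead of entering
    # an `async with` block.

    async def start(self) -> None:
        self._tg = aiotools.PersistentTaskGroup()
        self._tg.create_task(self._poll())

    def submit(self, coro) -> None:
        # Work keeps being added throughout the daemon's lifetime.
        self._tg.create_task(coro)

    async def _poll(self) -> None:
        while True:
            await asyncio.sleep(30)

    async def stop(self) -> None:
        # The explicit shutdown path referred to above: cancel and await all
        # remaining child tasks without going through __aexit__().
        await self._tg.shutdown()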

I wonder if maybe your proposed functionality could be implemented as a flag passed to the TaskGroup class. The main effect of the flag would be that if one task crashes, this shouldn’t cause all other tasks to be cancelled; instead the exception should be logged (either by loop.call_exception_handler() or in some other way). Also, you want a shutdown() method.

Would that work? If I am right then the main bikeshedding might have to be about the logging API to be used. I prefer there to be no coupling between asyncio and logging, so hopefully just calling loop.call_exception_handler() works for you?
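
For what it’s worth, reporting a crashed child through the loop’s handler needs no logging import at all. A sketch of what such a fallback could look like (the context keys are the standard ones the default handler understands; the function name is made up):

import asyncio

def report_crashed_child(task: asyncio.Task) -> None:
    # Could be attached with task.add_done_callback() inside the task group.
    if task.cancelled():
        return
    exc = task.exception()
    if exc is None:
        return
    task.get_loop().call_exception_handler({
        "message": "unhandled exception in supervised child task",
        "exception": exc,
        "task": task,
    })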

Yeah, that would be nice!
I’d like to be able to customize the error handler, with the default fallback being loop.call_exception_handler(), and I agree with you that it’s better to remove the coupling between asyncio and logging.

I’ll try to implement a modified version of TaskGroup with the optional flag.
Once it’s done, shall I make a pull request to the CPython repository?
Or would you prefer to see it in a separate repo (e.g., aiotools)?

I recommend that you work on a PR, so I can review it, and eventually merge it so it will become available in 3.12. I’m not going to be able to review code that goes into other repos.
