Server-oriented task scope design

achimnol · May 21, 2024, 8:57pm

So, I think we need to break the issue down into a set of more approachable tasks.

There are two big directions:

Implementing the intrinsic “task scope” entirely in the stdlib asyncio (e.g., an option to TaskGroup as some suggested, refactoring TaskGroup into TaskScope and TaskContext, etc.)
Splitting out the “task scope” implementation as a 3rd-party library after adding minimal supporting facilities in the stdlib asyncio

In practice, the second approach sounds better and would be more reusable in aiomonitor and other libraries. With hooks, existing 3rd-party libraries could be made to work with task scopes seamlessly.

Things to do in the stdlib asyncio

Add a hooking interface to task creation and termination.
- We could keep the vanilla task factory, while adding callbacks to the task lifecycle events.
  - Task factories should be the responsibility of event loop implementations.
  - Task hooks should be the main interface for libraries to implement task tracking.
- This will allow multiple libraries to add their own custom task trackers.
Add fine-grained tracing.
- e.g., hooks/callbacks for _step()
- This is not strictly required for “task scope”, but will be useful for observability libraries.
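
To make the idea concrete, here is a rough sketch of how such lifecycle hooks can be emulated today by wrapping the vanilla task with a task factory. The names `task_created_hooks`, `task_finished_hooks`, and `hooking_task_factory` are made up for illustration; asyncio has no such hook API today, and a real stdlib interface would presumably not need to occupy the task factory slot at all.

```python
import asyncio

# Hypothetical hook registries: asyncio has no such API today.
task_created_hooks = []
task_finished_hooks = []


def _notify_finished(task):
    for hook in task_finished_hooks:
        hook(task)


def hooking_task_factory(loop, coro, **kwargs):
    # Build a vanilla task, then fire the lifecycle callbacks around it.
    task = asyncio.Task(coro, loop=loop, **kwargs)
    for hook in task_created_hooks:
        hook(task)
    task.add_done_callback(_notify_finished)
    return task


async def main():
    events = []
    task_created_hooks.append(lambda t: events.append(("created", t)))
    task_finished_hooks.append(lambda t: events.append(("finished", t)))

    loop = asyncio.get_running_loop()
    previous = loop.get_task_factory()
    loop.set_task_factory(hooking_task_factory)
    try:
        task = asyncio.create_task(asyncio.sleep(0))
        await task
    finally:
        # Restore the factory so the loop's own shutdown tasks are not hooked.
        loop.set_task_factory(previous)
    return events, task


events, task = asyncio.run(main())
```

The limitation this shows is that only one task factory can be installed per loop, so two libraries using this trick would fight over it. That is the motivation for a separate hook interface that multiple trackers can register with.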

Things to do in a 3rd-party “task scope” implementation

Attach a task creation hook:
- Query the “current task scope” contextvar.
- Add the reference of the created task to it.
Attach a task termination hook:
- Query the “current task scope” contextvar.
- Remove the reference of the terminated task from it.
- (This could be replaced by using a WeakSet in the task scope, but it would be better to be explicit and keep the possibility of adding other actions here.)
The “current task scope” contextvar could be a tree of instances where we can access the parent and child scopes when needed.
Implement the task scope concept
- Concurrently shut down the child task scopes when explicitly closed.
- Do this recursively until the target task scope is shut down.
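
The steps above could be sketched roughly as follows. This is only a minimal illustration under several assumptions: `TaskScope`, `current_scope`, and `scope_task_factory` are hypothetical names, the creation/termination hooks are emulated with a task factory and a done callback because the stdlib hooks do not exist yet, and "shutting down" a scope is assumed to mean cancelling its tasks.

```python
import asyncio
import contextvars

# Hypothetical names: TaskScope, current_scope and scope_task_factory
# are illustrative only.
current_scope = contextvars.ContextVar("current_scope", default=None)


class TaskScope:
    def __init__(self, name):
        self.name = name
        self.tasks = set()
        self.children = []
        self.parent = current_scope.get()  # link into the scope tree
        if self.parent is not None:
            self.parent.children.append(self)

    async def shutdown(self):
        # Shut down child scopes concurrently (and recursively) first,
        # then cancel the tasks owned by this scope (assumed semantics).
        await asyncio.gather(*(child.shutdown() for child in self.children))
        pending = list(self.tasks)
        for task in pending:
            task.cancel()
        await asyncio.gather(*pending, return_exceptions=True)


def _on_task_finished(task):
    # Termination hook. A done callback runs in a copy of the context that
    # was current when it was registered, so this is the creating scope.
    scope = current_scope.get()
    if scope is not None:
        scope.tasks.discard(task)


def scope_task_factory(loop, coro, **kwargs):
    # Creation hook, emulated with a task factory.
    task = asyncio.Task(coro, loop=loop, **kwargs)
    scope = current_scope.get()
    if scope is not None:
        scope.tasks.add(task)
        task.add_done_callback(_on_task_finished)
    return task


async def main():
    loop = asyncio.get_running_loop()
    previous = loop.get_task_factory()
    loop.set_task_factory(scope_task_factory)
    try:
        root = TaskScope("root")
        token = current_scope.set(root)
        t1 = asyncio.create_task(asyncio.sleep(3600))
        child = TaskScope("child")  # becomes a child of root
        current_scope.set(child)
        t2 = asyncio.create_task(asyncio.sleep(3600))
        current_scope.reset(token)  # leave both scopes
        tracked = (set(root.tasks), set(child.tasks))
        await root.shutdown()
        return root, child, t1, t2, tracked
    finally:
        loop.set_task_factory(previous)


root, child, t1, t2, tracked = asyncio.run(main())
```

Here `t1` is tracked by the root scope and `t2` by the child scope, and shutting down the root cancels both. Setting and resetting the contextvar by hand is only for the demo; how entering and leaving a scope should look is the open question below.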

Things to think about

What is the “natural” representation of task scopes in the code?
- async with blocks
- A class with tree-manipulating methods (imagine the DOM API)
- Both?
What should the exception callback for each task scope look like?
- Just copy the event loop exception handler interface?
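
For reference, copying the event loop's interface might look something like the sketch below: a handler taking `(scope, context)` in place of `(loop, context)`, with a context dict that uses the same kind of keys as the loop's (`message`, `exception`, `task`). The `Scope` class and `report_failure` are hypothetical, and letting unhandled errors bubble up to the nearest ancestor handler is just one assumption about how the tree could behave.

```python
import asyncio


class Scope:
    # Hypothetical stand-in for a task scope; only the handler plumbing is shown.
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self._handler = None

    def set_exception_handler(self, handler):
        # handler(scope, context), mirroring loop.set_exception_handler().
        self._handler = handler

    def call_exception_handler(self, context):
        # Assumption: errors bubble up to the nearest ancestor with a handler.
        node = self
        while node is not None:
            if node._handler is not None:
                node._handler(self, context)  # pass the originating scope
                return True
            node = node.parent
        return False


def report_failure(scope, task):
    # Meant to be used as a done callback on a task tracked by `scope`.
    if task.cancelled():
        return
    exc = task.exception()
    if exc is not None:
        scope.call_exception_handler(
            {"message": "Task failed in scope", "exception": exc, "task": task}
        )


async def boom():
    raise ValueError("boom")


async def main():
    seen = []
    root = Scope("root")
    child = Scope("child", parent=root)
    root.set_exception_handler(
        lambda scope, context: seen.append((scope.name, repr(context["exception"])))
    )
    task = asyncio.create_task(boom())
    task.add_done_callback(lambda t: report_failure(child, t))
    await asyncio.wait([task])
    return seen


seen = asyncio.run(main())
```

In this sketch the failure happens in the child scope, which has no handler of its own, so the root's handler receives it along with the originating scope.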

Still, it would be nice to eventually put task scopes in the stdlib, so that observability libraries can rely on them always being there (e.g., asyncio.task_scope_tree or asyncio.root). Maybe this can be done after the initial experimentation in a 3rd-party library.

A potential concern is that the hooking interface may incur some extra performance overhead, but I think it is a worthwhile trade-off.

What are your thoughts?