Tracking the source of cancellation in tasks

Please set aside the cancel message itself and focus on the title of this thread: “Tracking the source of cancellation in tasks”. That is what I’d like asyncio to support.

It could be implemented using the cancel message, a new default argument or attribute of CancelledError, sys.audit() calls, or whatever else is more appropriate.
I’d like to hear the core devs’ and the community’s opinions on which option would be best.

The motivation for this request is as follows:

  • We need better debuggability and visibility into how coroutines and tasks behave when debugging complex systems written with asyncio.
  • In thread- or process-based concurrency, the stack location where an exception is raised is its most important piece of information.
  • In coroutine-based concurrency (asyncio), two stacks matter: the one where the exception is raised AND the one from which the exception is injected.
  • I’d like full visibility into the interactions between coroutines that control the logic flow of a task; currently there is only one such interaction: cancellation.
  • When writing my own PersistentTaskGroup and its test cases, using the cancel message parameter for this purpose helped me a lot in determining whether a task was cancelled from outside, by a sibling, or by itself.
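
For readers who have not used the cancel message, here is a minimal sketch of how it surfaces inside the cancelled coroutine, assuming Python 3.9+ where cancel() accepts msg. The names worker, demo and log are made up for illustration only.

```python
import asyncio


async def worker(log: list) -> None:
    try:
        await asyncio.sleep(3600)
    except asyncio.CancelledError as exc:
        # The message given to cancel(msg=...) arrives as the exception args.
        log.append(exc.args)
        raise  # always re-raise so the task really ends up cancelled


async def demo() -> list:
    log: list = []
    task = asyncio.create_task(worker(log))
    await asyncio.sleep(0)  # let the worker start and suspend in sleep()
    task.cancel(msg="explicit shutdown")
    await asyncio.wait([task])  # wait for completion without re-raising
    return log


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The message is free-form text chosen by whoever calls cancel(), which is why it only helps when every call site cooperates.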

Example (including a monkey-patch to async_timeout):

diff --git a/src/aiotools/taskgroup/persistent_compat.py b/src/aiotools/taskgroup/persistent_compat.py
index 357580f..1d56c1d 100644
--- a/src/aiotools/taskgroup/persistent_compat.py
+++ b/src/aiotools/taskgroup/persistent_compat.py
@@ -145,7 +145,7 @@ class PersistentTaskGroup:
         self._aborting = True
         for t in self._tasks:
             if not t.done():
-                t.cancel()
+                t.cancel(msg='explicit shutdown')

     async def shutdown(self) -> None:
         self._trigger_shutdown()
@@ -166,7 +166,7 @@ class PersistentTaskGroup:
             return ret
         except asyncio.CancelledError:
             if fut is not None:
-                fut.cancel()
+                fut.cancel(msg='cancel taskgroup scope')
             raise
         except Exception as e:
             # Swallow unhandled exceptions by our own and
diff --git a/async_timeout/__init__.py b/async_timeout/__init__.py
index 4188a98..a5866fc 100644
--- a/async_timeout/__init__.py
+++ b/async_timeout/__init__.py
@@ -216,7 +216,7 @@ class Timeout:
         return None

     def _on_timeout(self, task: "asyncio.Task[None]") -> None:
-        task.cancel()
+        task.cancel(msg="timeout")
         self._state = _State.TIMEOUT
         # drop the reference early
         self._timeout_handler = None

I’d like to get the same effect without manually patching every existing .cancel() call in the asyncio stdlib and third-party libraries, by changing one place: the cancel() method itself. Moreover, a human-provided message string does not guarantee any structure, so structured data (such as the caller’s module:function path string) that code can consume programmatically would be much more useful.
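
To make the idea concrete, here is a rough sketch of what “fixing one place” could look like today in user code. It is not a proposed stdlib implementation. TrackedTask, tracked_task_factory and the cancel_origin attribute are hypothetical names, and sys._getframe() is a CPython implementation detail, so treat this purely as an illustration.

```python
import asyncio
import sys


class TrackedTask(asyncio.Task):
    """Hypothetical Task subclass that records who called cancel()."""

    cancel_origin = None  # "module:function" of the most recent cancel() caller

    def cancel(self, msg=None):
        # Frame 1 is the direct caller of cancel() (CPython-specific API).
        frame = sys._getframe(1)
        module = frame.f_globals.get("__name__", "?")
        self.cancel_origin = f"{module}:{frame.f_code.co_name}"
        return super().cancel(msg=msg)


def tracked_task_factory(loop, coro, **kwargs):
    # Newer Python versions may pass extra keyword arguments (e.g. context).
    return TrackedTask(coro, loop=loop, **kwargs)


def shutdown(task: asyncio.Task) -> None:
    task.cancel()  # note: no hand-written message needed


async def main() -> str:
    asyncio.get_running_loop().set_task_factory(tracked_task_factory)
    task = asyncio.create_task(asyncio.sleep(3600))
    await asyncio.sleep(0)  # let the task start
    shutdown(task)
    await asyncio.wait([task])
    return task.cancel_origin


if __name__ == "__main__":
    print(asyncio.run(main()))
```

This only covers tasks created after the factory is installed, and the origin is visible on the task object rather than on the CancelledError itself, which is part of why built-in support would be preferable.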