A small reimplementation of gather: `quattro.gather`

Howdy,

I just released a re-implementation of asyncio.gather in quattro 23.1.0.

It leans on TaskGroups so it doesn’t leak tasks in case of child errors, and it’s super short (since the bulk of the logic is in a TaskGroup).

The differences are:

  • If a child task fails, the other unfinished tasks will be cancelled, just like in a TaskGroup.
  • quattro.gather accepts only coroutines, not futures or generators, just like a TaskGroup.
  • When return_exceptions is false (the default), an exception in a child task will cause an ExceptionGroup to bubble out of the top-level gather() call, just like in a TaskGroup.
  • Results are returned as a tuple, not a list.

It’s almost suspiciously short (9 lines of code) but I have alright test coverage on it so I guess it’s fine. Would be happy to contribute it to CPython if there’s interest (but we’d probably need a different name? And what to do with the current gather?).

If you (the potential user) are using TaskGroups anyway, why use any version of gather? Just stick your tasks in a list and access the list outside the taskgroup block. It’s 2 lines instead of 1 but it’s simpler even if it’s not shorter (because the API surface that you’re using is smaller), and you have more flexibility (e.g. you could use a dictionary instead of a list if that suits):

async with asyncio.TaskGroup() as tg:
    tasks = [tg.create_task(foo(i)) for i in range(3)]
print(*[t.result() for t in tasks], sep=", ")

A couple of reasons:

  • it’s less type-safe (the task group version is not type-safe at all, tasks will just be a list[Task[Any]], right?)
  • the task group version is less explicit about the what and more explicit about the how, so gather is more readable
  • the task group version has a slightly larger cognitive surface - I want to run coroutines in parallel and get the results, why do I need to care about spawning tasks and collecting their results

In my opinion, gather is an amazing abstraction when it’s appropriate. I was surprised when I went looking for it in Go and I failed to find it (although that was a while ago, maybe things have improved now). Instead I got folks trying to teach me about channels on Twitter.

it’s less type-safe (the task group version is not type-safe at all, tasks will just be a list[Task[Any]], right?)

Does it? That sounds like a bug in the current type hints. I don’t see a reason why it wouldn’t have the correct type list[Task[<foo return type>]]. Trying now on my computer, both mypy and pyright complain that TaskGroup is not even a member of asyncio :-\

the task group version is less explicit about the what and more explicit about the how, so gather is more readable

Gather means “run these tasks in parallel and put their results in a list”. That is exactly what the snippet says, just with those two parts written out explicitly. Having a routine whose function is explained as “do x and y” is usually a bad sign (especially one named “do_x_and_y()”, though gather has at least avoided that). But readability is subjective.

the task group version has a slightly larger cognitive surface - I want to run coroutines in parallel and get the results, why do I need to care about spawning tasks and collecting their results

If this is literally the only task-based API you use in a program then you’re right. But I’m imagining that any non-trivial asyncio program will use task groups extensively, so you’d already be using them elsewhere.

Certainly having this in a separate library isn’t hurting anyone. I just don’t think it needs to be in the standard library when it’s already achievable by composing such a small number of existing pieces.