Parallel streams, parallel loops and asyncio

Maybe it’s a bit premature, but since 3.13 there’s the possibility to get rid of the GIL, and we hope that free-threaded Python will keep improving in the future, so I think something like Java’s parallel streams would be useful.

Honestly, I’m not sure about the implementation. Python has map, filter, any, all, functools.reduce, all of itertools, etc. Making a parallel version of each of them is a no-go, I think. Maybe a parallel parameter, False by default?
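For illustration, here’s a rough sketch of what a parallel flag on a map-like function could look like, built on concurrent.futures (the name pmap and its signature are hypothetical, not an actual proposal):

```python
from concurrent.futures import ThreadPoolExecutor

def pmap(func, iterable, parallel=False, max_workers=None):
    """Hypothetical map() with a `parallel` flag, sketched with threads.

    With the GIL this mainly helps I/O-bound functions; under
    free-threading, CPU-bound functions could benefit as well.
    """
    if not parallel:
        return map(func, iterable)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Materialize the results before the executor shuts down.
        return iter(list(pool.map(func, iterable)))

print(list(pmap(lambda x: x * x, [1, 2, 3], parallel=True)))  # [1, 4, 9]
```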

But the real problem is loops. We already have async for. Maybe Python could also introduce parallel for/while?

And what about asyncio? Will it be able to run multiple loops in different threads, like WebFlux? Maybe that’s already how it works; it’s been years since I last programmed at a low level with asyncio.
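For reference, running an asyncio event loop in a dedicated thread and submitting coroutines to it from outside is already possible today; a minimal sketch of the pattern:

```python
import asyncio
import threading

# Run a dedicated asyncio event loop in a background thread, then
# submit coroutines to it from the main thread.
loop = asyncio.new_event_loop()
t = threading.Thread(target=loop.run_forever, daemon=True)
t.start()

async def work(x):
    await asyncio.sleep(0.01)
    return x * 2

# run_coroutine_threadsafe is the thread-safe way to hand a coroutine
# to a loop running in another thread; it returns a concurrent future.
fut = asyncio.run_coroutine_threadsafe(work(21), loop)
print(fut.result())  # 42

loop.call_soon_threadsafe(loop.stop)
t.join()
```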

Is no one interested? I thought easy parallelization of many built-in functions, of for loops, and of asyncio would be very attractive to the community.

In all fairness, I think your post belongs in the ideas forum.

Well, I thought my idea wasn’t structured enough. Anyway, I’ll try to move it to that section.

It’s already possible to do this, but not every application benefits from it today. More applications may find it useful with free-threading; without that, it’s mainly useful for mixing different kinds of event loops per thread (such as a GUI event loop in the main thread and an asyncio event loop in another thread for networking tasks).

There are also a few difficulties with this pattern for various “normal” tasks that the standard library doesn’t solve (though there is tooling available in library code outside of it). A major example is that most asyncio objects have only a very limited notion of thread safety and bind to event loops. It’s possible to have an async semaphore/lock/queue that is independent of any event loop and safe to use directly from multiple threads, but these and other objects in the asyncio module bind to event loops and are not usable like this.


Parallel loops are only possible for specific types of operations, typically those that are easily split across iterations (like simple for loops, but not all while loops or recursive calls).

For example, Python supports this with the multiprocessing module:

from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    with Pool(5) as p:
        print(p.map(f, [1, 2, 3]))

Well, this is multiprocessing, not multithreading. I know you can do the same with multithreading, but it would be much easier to write something like this:

parallel 5 for i in range(1, 4):
    f(i)

Side note: I would like Python to add annotations as in Java, so you could write something like this instead:

@parallel(5)
for i in range(1, 4):
    f(i)
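Just to show the idea is expressible today, here is a rough approximation of that hypothetical @parallel decorator with current syntax (since a decorator can’t be applied to a bare for loop, the loop body becomes a function):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel(workers):
    # Hypothetical decorator: turns the decorated "loop body" into a
    # function that maps itself over an iterable using a thread pool.
    def decorator(body):
        def run(iterable):
            with ThreadPoolExecutor(max_workers=workers) as pool:
                return list(pool.map(body, iterable))
        return run
    return decorator

@parallel(5)
def f(i):
    return i * i

print(f(range(1, 4)))  # [1, 4, 9]
```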

You’re right. This would indeed be a lot of work. Not as big as building a good JIT and making Python without the GIL almost as fast as it is with the GIL, but it’s hard anyway.

Both examples could also be implemented using multiprocessing. So, what makes multithreading special in this context? Since multithreading is currently limited by the GIL, I chose not to include an example using the map method of ThreadPoolExecutor.

As I mentioned in my previous post, the proposed syntax would only work in specific cases, namely those that can be handled by map() alone. In most situations, execution will be serial across multiple threads, since most algorithms are not perfectly parallel. This can actually lead to worse performance than running in a single thread.
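A small example of a loop that map() alone cannot parallelize, because each iteration depends on the result of the previous one:

```python
# A running maximum has a loop-carried dependency: iteration i needs
# the value computed at iteration i - 1, so the iterations cannot be
# dispatched independently to a thread pool.
def running_max(xs):
    out, cur = [], float("-inf")
    for x in xs:
        cur = max(cur, x)  # depends on the previous iteration
        out.append(cur)
    return out

print(running_max([3, 1, 4, 1, 5]))  # [3, 3, 4, 4, 5]
```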

I quote my first post as OP:

(Please ignore the errors: I wrote “since from 3.13” where I meant “since 3.13”. I’ll correct it.)