Maybe it’s a bit premature, but since 3.13 there’s the option to disable the GIL, and we hope Python without the GIL will keep improving, so I think something like Java’s parallel streams would be useful.
Honestly, I’m not sure about the implementation. Python has map, filter, any, all, functools.reduce, all the itertools, etc. I think making a parallel version of all of them is a no-go. Maybe a parallel param, False by default?
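Something like that could be prototyped today on top of concurrent.futures. Here is a rough sketch of what a parallel flag might look like (pmap and its parameters are invented names, not a real or proposed API):

from concurrent.futures import ThreadPoolExecutor

# Hypothetical sketch only: pmap, parallel and workers are invented
# names used for illustration.
def pmap(func, iterable, parallel=False, workers=None):
    if not parallel:
        return list(map(func, iterable))
    # ThreadPoolExecutor.map preserves input order, like map()
    with ThreadPoolExecutor(max_workers=workers) as ex:
        return list(ex.map(func, iterable))

print(pmap(lambda x: x * x, range(4), parallel=True))  # [0, 1, 4, 9]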
But the real problem is loops. We already have async for. Maybe Python could also introduce a parallel for/while?
And what about asyncio? Will it have the possibility to run multiple event loops in different threads, like WebFlux? Maybe that’s already how it works now; it’s been years since I’ve programmed at a low level with asyncio.
It’s possible to do this currently, but not every application benefits from it today; more applications may find it useful with free-threading. Without free-threading, it’s primarily useful for mixing different kinds of event loops per thread, such as a GUI event loop in the main thread and an asyncio event loop in another thread for networking tasks.

There are also a few difficulties with various “normal” tasks in this pattern that the standard library doesn’t solve (though there is tooling available in library code outside of it). A major example is that most asyncio objects have only a very limited notion of thread safety and bind to an event loop. It’s possible to write an async semaphore/lock/queue that is independent of any event loop and safe to use directly from multiple threads, but these and other objects in the asyncio module bind to an event loop and are not usable like this.
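For concreteness, here is a minimal sketch of that pattern: a second asyncio event loop running in a background thread, driven from the main thread (the function names are illustrative):

import asyncio
import threading

async def fetch(n):
    await asyncio.sleep(0.1)  # stand-in for real network I/O
    return n * n

def run_loop(loop):
    asyncio.set_event_loop(loop)
    loop.run_forever()

loop = asyncio.new_event_loop()
t = threading.Thread(target=run_loop, args=(loop,), daemon=True)
t.start()

# Submit a coroutine to the background loop from this thread.
# run_coroutine_threadsafe returns a concurrent.futures.Future,
# which is safe to use across threads.
future = asyncio.run_coroutine_threadsafe(fetch(7), loop)
print(future.result())  # 49

loop.call_soon_threadsafe(loop.stop)
t.join()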
Parallel loops are only possible for specific types of operations, typically those that are easily split across iterations (like simple for loops, but not all while loops or recursive calls).
For example, Python supports this with the multiprocessing module:
from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    with Pool(5) as p:
        print(p.map(f, [1, 2, 3]))
Well, this is multiprocessing, not multithreading. I know you can do the same thing with multiprocessing, but it would be much easier to write something like this:
parallel 5 for i in range(1, 4):
    f(i)
Side note: I would also like Python to add annotations as in Java, so you could write something like this instead (hypothetical syntax; @parallel is an invented annotation mirroring the parallel for above):
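@parallel(5)    # invented annotation, not a real decorator
for i in range(1, 4):
    f(i)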
You’re right. This would indeed be a lot of work - not as big as building a good JIT and making Python without the GIL almost as fast as it is with the GIL - but hard all the same.
Both examples could also be implemented using multiprocessing. So, what makes multithreading special in this context? Since multithreading is currently limited by the GIL, I chose not to include an example using the map method of ThreadPoolExecutor.
As I mentioned in my previous post, the proposed syntax would only work in specific cases, namely those that can already be handled by map() alone. In most situations, execution would still be effectively serial even when spread across multiple threads, since most algorithms are not perfectly parallel. That can actually lead to worse performance than running in a single thread.
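For example, a loop with a dependency between iterations cannot be split across threads the way map() splits independent calls, no matter how many workers are available:

# Each iteration reads the result of the previous one (a
# loop-carried dependency), so the iterations must run in order.
def running_totals(values):
    total = 0
    totals = []
    for v in values:
        total += v  # depends on the previous iteration
        totals.append(total)
    return totals

print(running_totals([1, 2, 3]))  # [1, 3, 6]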