I read the following sentence: “The `Pool.map` and `Pool.apply` will lock the main program until all processes are finished, which is quite useful if we want to obtain results in a particular order for certain applications. In contrast, the `async` variants will submit all processes at once and retrieve the results as soon as they are finished.” from this link: An introduction to parallel programming using Python's multiprocessing module

Yet I did not find it easy to understand. Could anyone please provide a simple example to illustrate the point?
The async variants return a promise of the result. `Pool.apply_async` and `Pool.map_async` return an object immediately after calling, even though the function hasn’t finished running. This object has a `get` method which will wait for the function to finish, then return the function’s result.
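To make the "promise" idea concrete, here is a minimal sketch. The worker `slow_square`, the helper `demo`, and the 0.2 s delay are made up for illustration; only `Pool`, `apply_async` and `get` come from `multiprocessing`.

```python
import time
from multiprocessing import Pool


def slow_square(x):
    """Hypothetical worker: pretend to work for 0.2 s, then return x squared."""
    time.sleep(0.2)
    return x * x


def demo():
    with Pool(processes=2) as pool:
        start = time.perf_counter()
        promise = pool.apply_async(slow_square, (7,))  # returns without waiting
        submit_time = time.perf_counter() - start

        value = promise.get()  # blocks here until the worker has finished
        total_time = time.perf_counter() - start
    return value, submit_time, total_time


if __name__ == "__main__":
    value, submit_time, total_time = demo()
    print(f"submitted in {submit_time:.3f}s, got {value} after {total_time:.3f}s")
```

The submit time should be close to zero, while the total time should be at least the worker's sleep, because all of the waiting happens inside `get()`.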
- `Pool.apply`: run a single function call in another process, using a worker from the pool instead of creating a new process for it. The call blocks until the result is ready.
- `Pool.map`: run a function over a list of arguments in parallel. The call blocks until all results are ready.
- `Pool.apply_async`: run a function in another process, but allow the main thread to keep running. Use this when you don’t need the result right now.
- `Pool.map_async`: run a function over a list of arguments in parallel, but allow the main thread to keep running. Use this when you don’t need the results right now.
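As a rough sketch of the blocking versus non-blocking pair, the snippet below runs the same work with `Pool.map` and with `Pool.map_async`. The names `wait_and_return` and `blocking_then_async` are hypothetical, and `sum(range(1000))` is only a stand-in for whatever the main process would do in the meantime.

```python
import time
from multiprocessing import Pool


def wait_and_return(t):
    """Hypothetical worker: sleep for t seconds, then return t."""
    time.sleep(t)
    return t


def blocking_then_async(delays):
    with Pool(processes=len(delays)) as pool:
        # Blocking: the main process does nothing else until map() returns.
        blocking = pool.map(wait_and_return, delays)

        # Non-blocking: submit the work, carry on, and collect the results later.
        pending = pool.map_async(wait_and_return, delays)
        main_work = sum(range(1000))  # runs while the workers are still sleeping
        non_blocking = pending.get()  # now wait for the workers
    return blocking, non_blocking, main_work


if __name__ == "__main__":
    print(blocking_then_async([0.3, 0.1, 0.2]))
```

Both calls return the results in the order of the input list; the only difference is where the main process waits.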
Of further note is `Pool.imap_unordered`, which is like running `Pool.apply_async` over a list of arguments and acting on each result-promise as soon as it completes, regardless of the order in which the calls were submitted.
Here is a tabulated example:
```python
from time import sleep
from multiprocessing import Pool


def f(t):
    # Sleep for t seconds, then hand t back as the result
    sleep(t)
    return t


p = Pool()
```
| call | result | took (s) |
|---|---|---|
| `p.apply(f, (0.1,))` | `0.1` | 0.102 |
| `p.map(f, [0.3, 0.1, 0.2])` | `[0.3, 0.1, 0.2]` | 0.302 |
| `r = p.apply_async(f, (0.1,))` | `<ApplyResult object>` | 0.0 |
| `r.get()` | `0.1` | 0.104 |
| `r = p.map_async(f, [0.3, 0.1, 0.2])` | `<MapResult object>` | 0.0 |
| `r.get()` | `[0.3, 0.1, 0.2]` | 0.303 |
| `r = p.imap(f, [0.3, 0.1, 0.2])` | `<IMapIterator object>` | 0.0 |
| `list(r)` | `[0.3, 0.1, 0.2]` | 0.302 |
| `r = p.imap_unordered(f, [0.3, 0.1, 0.2])` | `<IMapUnorderedIterator object>` | 0.0 |
| `list(r)` | `[0.1, 0.2, 0.3]` | 0.302 |
Note that the result of `list(p.imap_unordered(...))` is not in the same order as the input: results are yielded in the order the calls finish, so you can act on finished calls as they arrive (e.g. for logging).
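The ordering difference can be reproduced with a small script along these lines. It is only a sketch: `wait_and_return` and `compare_orderings` are made-up names, and the pool is sized to one worker per task (an assumption) so that every call starts right away.

```python
import time
from multiprocessing import Pool


def wait_and_return(t):
    """Hypothetical worker: sleep for t seconds, then return t."""
    time.sleep(t)
    return t


def compare_orderings(delays):
    # One worker per task (an assumption) so no call has to queue behind another
    with Pool(processes=len(delays)) as pool:
        in_order = list(pool.imap(wait_and_return, delays))
        as_completed = list(pool.imap_unordered(wait_and_return, delays))
    return in_order, as_completed


if __name__ == "__main__":
    in_order, as_completed = compare_orderings([0.3, 0.1, 0.2])
    print("imap:          ", in_order)      # always matches the input order
    print("imap_unordered:", as_completed)  # completion order, shortest sleep first in a typical run
```

`imap` holds back early finishers until every earlier item is ready, whereas `imap_unordered` yields whichever result arrives next, so its exact order depends on timing.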
@EpicWink has illustrated this well. I also want to put the simple concepts into context:
- `Pool.apply` and `Pool.map` are blocking: when you call them, you have to wait until the processes are finished.
- `Pool.apply_async` and `Pool.map_async` are asynchronous. You don’t wait for the results of the executed processes; instead you immediately get back a temporary result object (`AsyncResult`). On that object you can call `ready()` at any time to check whether the work has finished, `get()` to wait for the result and return it, and `successful()` once the work has finished to check that it completed without raising an exception.
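A small sketch of those `AsyncResult` methods in use follows. The workers `wait_and_return` and `always_fails` and the helper `inspect_results` are hypothetical; `ready()`, `wait()`, `successful()` and `get()` are the real methods of the result object.

```python
import time
from multiprocessing import Pool


def wait_and_return(t):
    """Hypothetical worker: sleep for t seconds, then return t."""
    time.sleep(t)
    return t


def always_fails(_):
    """Hypothetical worker that raises, to show how errors are reported."""
    raise ValueError("boom")


def inspect_results():
    with Pool(processes=2) as pool:
        ok = pool.apply_async(wait_and_return, (0.3,))
        bad = pool.apply_async(always_fails, (None,))

        ready_at_once = ok.ready()  # checked right after submitting; does not block
        ok.wait()                   # block until finished, without fetching the value
        bad.wait()

        report = {
            "ready_at_once": ready_at_once,
            "ok_ready": ok.ready(),
            "ok_successful": ok.successful(),
            "ok_value": ok.get(),
            "bad_successful": bad.successful(),
        }
        try:
            bad.get()  # re-raises the worker's exception in the main process
        except ValueError as exc:
            report["bad_error"] = str(exc)
    return report


if __name__ == "__main__":
    print(inspect_results())
```

Calling `successful()` before the work has finished raises an error, which is why the sketch calls `wait()` first.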