Differences between `Pool.map`, `Pool.apply`, and `Pool.apply_async`

I read the following passage in "An introduction to parallel programming using Python's multiprocessing module":

“The Pool.map and Pool.apply will lock the main program until all processes are finished, which is quite useful if we want to obtain results in a particular order for certain applications. In contrast, the async variants will submit all processes at once and retrieve the results as soon as they are finished.”

I find it hard to understand. Could anyone please provide a simple example to illustrate the point?

The async variants return a promise of the result. Pool.apply_async and Pool.map_async return an object immediately after calling, even though the function hasn’t finished running. This object has a get method which will wait for the function to finish, then return the function’s result.

  • Pool.apply: run a single function call in another process and wait for its result (useful when you want to use a process pool instead of creating a new process to run the function).
  • Pool.map: run a function over a list of arguments in parallel and wait until all results are ready. The results come back in the same order as the arguments.
  • Pool.apply_async: run a function in another process, but allow the main thread to keep running. Use this when you don’t need the result right now.
  • Pool.map_async: run a function over a list of arguments in parallel, but allow the main thread to keep running. Use this when you don’t need the results right now.
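
The difference between the blocking and async calls listed above can be sketched by timing how long each call holds up the main program. This is only an illustration: `slow_square`, `timed_calls`, the 0.2 s sleep and the pool size of 3 are hypothetical choices, not part of the answer above.

```python
from multiprocessing import Pool
from time import perf_counter, sleep


def slow_square(x):
    """Hypothetical worker: pretend to work for 0.2 s, then return x squared."""
    sleep(0.2)
    return x * x


def timed_calls(pool):
    """Time a blocking map and the submission step of map_async."""
    start = perf_counter()
    blocking_results = pool.map(slow_square, [1, 2, 3])  # waits for all results
    map_seconds = perf_counter() - start

    start = perf_counter()
    promise = pool.map_async(slow_square, [1, 2, 3])  # returns straight away
    submit_seconds = perf_counter() - start

    # The main program is free to do other work here before collecting.
    async_results = promise.get()  # blocks only now, until all results exist
    return map_seconds, submit_seconds, blocking_results, async_results


if __name__ == "__main__":
    with Pool(processes=3) as pool:
        map_s, submit_s, blocking, deferred = timed_calls(pool)
    print(f"map blocked for {map_s:.2f} s and returned {blocking}")
    print(f"map_async returned after {submit_s:.2f} s; get() gave {deferred}")
```

Both calls produce the same list; the only difference is when the main program has to wait. The `if __name__ == "__main__":` guard is there because platforms that start workers by spawning a fresh interpreter re-import the main module.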

Also worth noting is Pool.imap_unordered, which is like running Pool.apply_async over a list of arguments and acting on each result as soon as it is ready, in completion order rather than input order.
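
A minimal sketch of that behaviour, assuming a pool with at least as many workers as tasks so that every call starts at once. The names `wait_and_return` and `completion_order` and the chosen delays are made up for this illustration.

```python
from multiprocessing import Pool
from time import sleep


def wait_and_return(t):
    """Hypothetical worker: sleep for t seconds, then hand t back."""
    sleep(t)
    return t


def completion_order(pool, delays):
    """Collect results in the order the workers finish them."""
    finished = []
    for result in pool.imap_unordered(wait_and_return, delays):
        # Each result is yielded as soon as its call ends,
        # so this loop body could log or post-process it right away.
        finished.append(result)
    return finished


if __name__ == "__main__":
    delays = [0.6, 0.1, 0.3]
    with Pool(processes=3) as pool:
        print("imap_unordered:", completion_order(pool, delays))
        print("map:           ", pool.map(wait_and_return, delays))
```

With these delays the shortest sleep should come back first from imap_unordered, while map always returns the results in input order.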

Here is a tabulated example:

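from time import sleep
from multiprocessing import Pool

def f(t):
    sleep(t)  # simulate t seconds of work
    return t

p = Pool()  # on Windows/macOS, create the pool under `if __name__ == "__main__":`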
| call | result | took (s) |
|---|---|---|
| p.apply(f, (0.1,)) | 0.1 | 0.102 |
| p.map(f, [0.3, 0.1, 0.2]) | [0.3, 0.1, 0.2] | 0.302 |
| r = p.apply_async(f, (0.1,)) | <ApplyResult object> | 0.0 |
| r.get() | 0.1 | 0.104 |
| r = p.map_async(f, [0.3, 0.1, 0.2]) | <MapResult object> | 0.0 |
| r.get() | [0.3, 0.1, 0.2] | 0.303 |
| r = p.imap(f, [0.3, 0.1, 0.2]) | <IMapIterator object> | 0.0 |
| list(r) | [0.3, 0.1, 0.2] | 0.302 |
| r = p.imap_unordered(f, [0.3, 0.1, 0.2]) | <IMapUnorderedIterator object> | 0.0 |
| list(r) | [0.1, 0.2, 0.3] | 0.302 |

Note that the result of list(p.imap_unordered(...)) is not in input order: the results come back in the order the calls finish, so you can act on each finished call as it arrives (e.g. for logging).


@EpicWink has illustrated this well. I also want to put the basic concepts into context:

  • Pool.apply and Pool.map are blocking: the call does not return until the work is finished.
  • Pool.apply_async and Pool.map_async are asynchronous. They do not wait for the work to finish; instead they immediately return a placeholder result (an AsyncResult). You can poll it with ready(), call get() to block until the result is available, and, once the call has finished, use successful() to check whether it completed without raising an exception.
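
As a rough sketch of how the AsyncResult methods fit together, the snippet below submits one call and inspects the returned object before and after it finishes. `slow_double`, `inspect_promise` and the 0.3 s sleep are hypothetical and only serve the illustration.

```python
from multiprocessing import Pool
from time import sleep


def slow_double(x):
    """Hypothetical worker: sleep briefly, then return twice the input."""
    sleep(0.3)
    return 2 * x


def inspect_promise(pool):
    """Submit one call and report the AsyncResult state before and after."""
    promise = pool.apply_async(slow_double, (21,))
    ready_at_submit = promise.ready()  # the call is still sleeping at this point
    promise.wait()                     # block until the call has finished
    return ready_at_submit, promise.ready(), promise.successful(), promise.get()


if __name__ == "__main__":
    with Pool(processes=1) as pool:
        before, after, ok, value = inspect_promise(pool)
    print(f"ready() right after submit: {before}")
    print(f"ready() after wait(): {after}, successful(): {ok}, get(): {value}")
```

Here wait() is used so that successful() is only called on a finished result; calling get() directly would block in the same way and then return the value.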