I’m working in a code base with lots of statements like
trades = await self.get_new_trades()
My best understanding, based on reading the asyncio docs, is that to get any advantage from await we’d need to create asyncio tasks and then run a block of tasks in parallel, like
await task_1
await task_2
await task_3
or create a list of tasks, and then run them in parallel like
answers = await asyncio.gather(*tasks)
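To make the difference concrete, here is a minimal sketch comparing plain sequential awaits with the two patterns above. fake_api_call is a hypothetical stand-in for a real web API call (it just sleeps), so the timings are only illustrative.

```python
import asyncio
import time


async def fake_api_call(n: int) -> int:
    # Hypothetical stand-in for a network request; sleeping yields to the event loop.
    await asyncio.sleep(0.1)
    return n


async def one_at_a_time() -> list[int]:
    # Each await finishes before the next call starts, so the waits add up.
    return [await fake_api_call(n) for n in range(3)]


async def as_tasks() -> list[int]:
    # create_task schedules each coroutine right away; the awaits below
    # only collect results from tasks that are already in flight.
    tasks = [asyncio.create_task(fake_api_call(n)) for n in range(3)]
    return [await t for t in tasks]


async def with_gather() -> list[int]:
    # gather wraps the coroutines in tasks itself and keeps the input order.
    return await asyncio.gather(*(fake_api_call(n) for n in range(3)))


def timed(coro_func):
    # Run one of the coroutine functions above and measure wall-clock time.
    start = time.perf_counter()
    result = asyncio.run(coro_func())
    return result, time.perf_counter() - start


if __name__ == "__main__":
    for func in (one_at_a_time, as_tasks, with_gather):
        result, elapsed = timed(func)
        print(f"{func.__name__}: {result} in {elapsed:.2f}s")
```

On a typical machine the first variant should take roughly the sum of the three waits, while the other two should take roughly the longest single wait.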
In the code base there are 3 functions that we ourselves use a create_task call on. But about half the functions are declared with async def, and when we call them we have to use await.
My questions are:
is there some advantage to async & await that I’m missing?
Indeed, the reason this code base uses async is web API calls, where we make a batch of calls and then send them all to the server. (And then wait until the server has replied to all the calls.)
So I’m running functions asynchronously rather than in parallel.
(On the web server they presumably do run in parallel, but that is outside my domain.)
That still doesn’t answer the question of whether there is any point in declaring a function with async def f() if it’s always going to be called as await f().
Async tasks do not run in parallel but concurrently. Multiprocessing does run processes in parallel, since separate processes can be scheduled on different CPU cores. Concurrency means tasks overlap in time: one task can make progress while another is waiting.
The advantage of declaring functions with async is that your program can run tasks concurrently without blocking other tasks while waiting for I/O responses. For example, if you are communicating with an external program (e.g., another computer on the network), the response can take a long time relative to the speed of the CPU, and your program does not have to sit idly by while it waits. It can pause that task until a response has been received and tend to other tasks in the meantime.
So, as in the cartoon in the link that @franklinvp provided, while the burger is being made the couple can continue with their date instead of becoming robots sitting idly by (which is what would happen in a regular synchronous program).
And then it has to be awaited when called so the caller needs to be async and so on. That does not continue indefinitely though because at some point you get back to the sync code that starts the event loop using e.g. asyncio.run.
There will be a sync to async boundary somewhere so it might be possible to confine the async code to a small part of the codebase without losing the benefits of async. Hard to say though without seeing the code.
If all those functions are doing IO and they are ultimately called by the 3 top level async functions then they do need to be async. It might be that some of these are really just synchronous functions that don’t actually block on anything and don’t really need to be async though. Can’t say without seeing the code.
There can be other ways to split this like you can have async stuff to gather the IO in one thread but then push all the outputs into a queue that is consumed synchronously in another thread.
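A rough sketch of that thread-plus-queue split, assuming the IO side can be expressed as a single coroutine. All names here (fake_api_call, gather_io, consume, the _DONE sentinel) are made up for illustration and are not from the code base being discussed.

```python
import asyncio
import queue
import threading

_DONE = object()  # sentinel telling the consumer that no more items are coming


async def fake_api_call(n: int) -> int:
    # Hypothetical stand-in for a web API call.
    await asyncio.sleep(0.01 * (5 - n))
    return n


async def gather_io(out: queue.Queue) -> None:
    async def fetch_and_push(n: int) -> None:
        # put() on an unbounded queue.Queue returns immediately,
        # so it does not stall the event loop.
        out.put(await fake_api_call(n))

    await asyncio.gather(*(fetch_and_push(n) for n in range(5)))
    out.put(_DONE)


def consume(out: queue.Queue) -> list[int]:
    # Plain synchronous code: no async/await needed on this side.
    results = []
    while (item := out.get()) is not _DONE:
        results.append(item * 10)
    return results


def main() -> list[int]:
    out: queue.Queue = queue.Queue()
    # The event loop lives entirely inside this worker thread.
    io_thread = threading.Thread(target=lambda: asyncio.run(gather_io(out)))
    io_thread.start()
    results = consume(out)
    io_thread.join()
    return results


if __name__ == "__main__":
    print(sorted(main()))
```

With this layout only gather_io and its helpers are async; everything downstream of the queue is ordinary synchronous code.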
The short answer is that you benefit even if you don’t gather or spawn tasks yourself because the web framework you’re using will be spawning concurrent tasks. So things will be happening concurrently anyway.
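As an illustration of that point, here is a toy model in which the handler code only ever uses plain await, and a stand-in "framework" creates one task per incoming request. toy_framework and handle_request are hypothetical; a real framework does the equivalent internally, in its own way.

```python
import asyncio

events: list[str] = []


async def get_new_trades(request_id: int) -> list[int]:
    # Only plain `await` is used here; asyncio.sleep stands in for a web API call.
    events.append(f"request {request_id}: waiting on API")
    await asyncio.sleep(0.05)
    events.append(f"request {request_id}: got reply")
    return [request_id]


async def handle_request(request_id: int) -> list[int]:
    # No create_task or gather in the handler, just `await`.
    return await get_new_trades(request_id)


async def toy_framework() -> list[list[int]]:
    # Stand-in for a web framework: one task per incoming request.
    tasks = [asyncio.create_task(handle_request(i)) for i in range(3)]
    return [await t for t in tasks]


if __name__ == "__main__":
    print(asyncio.run(toy_framework()))
    print("\n".join(events))
```

The event log should show all three requests reaching the API wait before any reply arrives, even though no handler spawns a task itself: each await hands control back to the loop, which then runs the other requests.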
Thanks for your advice everyone! The reading material provided was also helpful.
It seems I didn’t understand the difference between waiting in the sense of await (“go away and I’ll let you know when you need to come back and pick up this thread”) and in the sense of asyncio.run() (“just wait until this is done”). (And the function coloring problem.)
@oscarbenjamin : a pretty good model of our code is this:
import asyncio
import random

async def f1():
    print(await f2())

async def f2():
    ans = await f3()
    # await x = do_stuff(ans) -> go to other async blocks elsewhere
    # await y = do_more_stuff(x) -> go to other async blocks elsewhere

async def f3():
    tasks = [f4(i) for i in range(10)]
    result = await asyncio.gather(*tasks)
    return result

async def f4(n: int) -> int:
    sync_func(n)
    return await echo_api(n)

async def echo_api(n: int) -> int:
    await asyncio.sleep(random.uniform(0.0, 1.0))
    return n

def sync_func(n):
    print(n, end=', ')

asyncio.run(f1())
I can make f1 and f2 regular functions by doing eg
but I can’t do the same to f3 because asyncio.run(asyncio.gather(*tasks)) throws an error and I can’t find a good way to do it. I’m inclined to assume doing that is a bad idea, and that’s why Python made it hard to do.
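For what it is worth, one possible way around this is to keep the gather call inside a small coroutine and hand that coroutine to asyncio.run. As far as I understand, asyncio.gather returns a future rather than a coroutine and is meant to be called while a loop is already running. The sketch below reuses the names from the model above; _gather_f4 is a hypothetical helper added for illustration.

```python
import asyncio
import random


async def echo_api(n: int) -> int:
    await asyncio.sleep(random.uniform(0.0, 0.1))
    return n


async def f4(n: int) -> int:
    return await echo_api(n)


async def _gather_f4() -> list[int]:
    # Hypothetical helper: gather is called inside a coroutine,
    # i.e. while the event loop is running.
    return await asyncio.gather(*(f4(i) for i in range(10)))


def f3() -> list[int]:
    # Sync-to-async boundary: asyncio.run starts a loop, runs the
    # coroutine to completion, and closes the loop again.
    return asyncio.run(_gather_f4())


def f2() -> list[int]:
    return f3()


def f1() -> None:
    print(f2())


if __name__ == "__main__":
    f1()
```

The trade-off is that asyncio.run cannot be called while another event loop is running in the same thread, so f1, f2 and f3 written this way can no longer be called from async code, and nothing else runs concurrently while f3 waits.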