Yes, it can be any other loop implementation that doesn't inherit from `BaseEventLoop`, although the most prominent one is uvloop.
From the user’s perspective, I think we’d be trading one footgun for another.
I'm not an expert, so let me contrast Python and JavaScript. JavaScript doesn't make a distinction between functions (which may or may not return a Promise) and coroutines, which leads to a known footgun of a coroutine "unexpectedly" beginning to execute upon instantiation, which the business logic is not prepared for, most notably if the code raises an exception before the first await.
At the same time, `create_task(…).cancel()` is a footgun too. Instead of thinking about the lesser of two evils, could these be considered orthogonal?
Wrt. code running faster: I'd take that with a grain of salt, because with a subtle semantics change like this, it's the same code but no longer the same outcome being compared.
I have certainly written asynchronous code that subtly relies on the current `create_task` semantics, broadly speaking that "I don't need to lock anything if the resource use doesn't span an await point". If I understand the proposal correctly, that assumption will be void.
I’m happy to fix my code and likewise libraries can be fixed, but we’d also potentially be breaking a lot of existing application code.
Can you elaborate on why it is unexpected, or is this merely a distinction without either side being particularly “better” than the other? I agree that there’s a difference here, resulting in a difference of execution order; I don’t know that either one is necessarily better, but I do know that the distinction (whenever it’s actually significant) is a subtle one that’s not easy to debug.
Thus I will be switching any and all Python asyncio code of mine to use eager tasks sooner rather than later.
Yes, exactly. Third-party loop providers.
They can upgrade eventually, but I prefer not to break them.
No, Python has different behavior. Even with eager tasks, the thrown exception doesn't bubble up to the code that called `create_task(...)`. Exception handling requires `await task` or `task.result()` in both lazy and eager modes.
Unexpected in terms of business logic looks something like this:

```python
state.task = asyncio.create_task(coro())

async def coro():
    myself = state.task
    ...
```
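To make that footgun concrete, here is a minimal runnable sketch under the default (lazy) semantics; the `State` class and names are invented for illustration:

```python
import asyncio

class State:
    task = None

state = State()

async def coro():
    # Under lazy semantics, create_task() returns before this body runs,
    # so state.task is already assigned by the time we get here. Under
    # eager semantics the body would start inside create_task(), before
    # the assignment in main() completes, and state.task would still be
    # None at this point.
    return state.task is not None

async def main():
    state.task = asyncio.create_task(coro())
    return await state.task

print(asyncio.run(main()))  # → True with lazy (default) task scheduling
```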
In terms of locking, it's something like this:

```python
await foo()
# no need to lock here because no other async code may run
bank.acct1 += 42
email1 = create_task(send_email())
bank.acct2 -= 42
email2 = create_task(send_email())
await gather(email1, email2)
```
The bank total is ensured with lazy task start, and IIUC it is not with eager tasks.
It's like the definition of "async code" is being changed from "body of an async function" to "trace through an async function starting from the first await point and ending in return/yield/raise". The latter is harder to reason about.
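A self-contained version of the bank example above may help; `Bank` and `audit` (standing in for `send_email`) are my own names for illustration:

```python
import asyncio

class Bank:
    acct1 = 0
    acct2 = 0

bank = Bank()

async def audit():
    # Stands in for send_email(): reads shared state. With lazy tasks
    # this body cannot run until transfer() suspends at gather(), so it
    # never observes the transfer half-applied.
    return bank.acct1 + bank.acct2

async def transfer():
    bank.acct1 += 42
    t1 = asyncio.create_task(audit())
    bank.acct2 -= 42                  # total is back to 0 here
    t2 = asyncio.create_task(audit())
    return await asyncio.gather(t1, t2)

print(asyncio.run(transfer()))  # → [0, 0] with lazy tasks; an eager t1
                                # would instead observe a total of 42
```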
For the JavaScript footgun I was referring to, see Automatic batching for fewer renders in React 18 · reactwg/react-18 · Discussion #21 · GitHub, which required opt-in when the behaviour was changed.
I'm not sure this translates directly to the Python world; YMMV.
Sorry, I'm coming to this late and don't really understand the distinction between "eager" and "lazy" tasks, but my understanding is that there's still no arbitrary context switching going on here; it's just that you need to be sure that (the part before the initial await in) `send_email` doesn't mess with your bank total. That's exactly the same as you'd need in sequential code, so it doesn't seem that surprising to me.
I do agree that it's a disconcerting change. I've been used to the idea that `create_task` puts a task into the event loop, to run "when the current function yields control", and that's now changed to the more nuanced "to start immediately but yield back to the current code", which requires some thinking to assure yourself it's OK, and frankly isn't as intuitive to me.
Of course, it’s possible I’ve completely misunderstood the distinction between “eager” and “lazy” here. But if so, I hope the announcement of the change when (if) it happens covers the new semantics clearly, with illustrative examples.
Hm, that's a bit of a showstopper. The lazy semantics imply that each coroutine is its own critical section (multiple critical sections when using await), but the eager semantics merge the initial part of a coroutine with the coroutine that creates it.
That looks like a major reason to expect problems, since this is something that asyncio has always promised.
With eager tasks the first chunk (from the beginning of the function to the first await) is also non-interruptible. You can still think of it as covered by a critical section as well. The only difference is that the first chunk is executed early, when the task is created.
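That early execution of the first chunk can be observed directly with the per-loop opt-in that exists in CPython 3.12+ (`asyncio.eager_task_factory`); the sketch below is guarded so it also runs, lazily, on older versions:

```python
import asyncio

async def main():
    loop = asyncio.get_running_loop()
    eager_available = hasattr(asyncio, "eager_task_factory")  # 3.12+
    if eager_available:
        loop.set_task_factory(asyncio.eager_task_factory)

    seen = []

    async def child():
        seen.append("first chunk")   # before the first await
        await asyncio.sleep(0)
        seen.append("second chunk")

    task = asyncio.create_task(child())
    # With eager tasks, child's first chunk has already run by now;
    # with lazy tasks, nothing in child has run yet.
    ran_eagerly = seen == ["first chunk"]
    await task
    return ran_eagerly == eager_available

print(asyncio.run(main()))  # → True on any version
```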
Sometimes code may want lazy behavior; it can be simulated very easily with `await sleep(0)`.
But in my experience the lazy request is very rare; most code works well in both modes.
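The `await sleep(0)` escape hatch looks like this: a deliberate first-line sleep defers the rest of the body to the event loop even under an eager task factory (the version guard is mine, so the sketch behaves the same everywhere):

```python
import asyncio

order = []

async def worker():
    await asyncio.sleep(0)       # opt back into lazy behavior on demand
    order.append("worker body")

async def main():
    loop = asyncio.get_running_loop()
    if hasattr(asyncio, "eager_task_factory"):   # CPython 3.12+
        loop.set_task_factory(asyncio.eager_task_factory)
    task = asyncio.create_task(worker())
    order.append("creator")      # still runs before worker's body
    await task

asyncio.run(main())
print(order)  # → ['creator', 'worker body'] in both lazy and eager modes
```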
I think the ability to avoid eager behavior on-demand may need to remain forever even if the default becomes eager.
When examining this at my day job, specifically during application shutdown, we set the task factory back to the default lazy one to get a reliable ordering that is independent of anything done by functions we might not control. Being sure other tasks aren't launched during several parts of shutdown is essential here, and we don't want to replace the task factory with something that will error; we want everything needed to still run, just after shutdown is structured and each component is told to stop accepting new work.
Outside of that part of the application lifecycle, I can’t imagine a strong reason why it would be needed, but I wouldn’t be surprised if there are applications that rely on it.
As for making it the default, I think this should happen slowly. One of the big "selling points" of async/await was the mental model people were taught: other coroutines/tasks won't be switched to unless the current coroutine yields or ends. This doesn't entirely break that model, but it does bend it, and it's going to take time for people to adapt their mental model to include "creating a task may run up until the first yield point within it".
Ok, guys.
I see a strong request to keep lazy tasks.
Thus, I can propose the following:
- `run()`/`Runner` accepts a new `eager_tasks` argument, which is `False` by default. The current behavior remains the default; if we want to switch it, we need to have a separate discussion in the future.
- `asyncio.create_task()` and `loop.create_task()` accept an `eager_start` argument to provide fine control over the created task color.
Opinions?
Folks are saying eager tasks might be breaking the contract of not switching tasks outside of suspension points. What about adding an async `create_eager_task` (or whatever) method, making it an explicit potential suspension point? The implementation wouldn't need to actually change.
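A hypothetical sketch of that shape; the wrapper below is invented purely to show the calling convention, and as noted, its body does not actually need to suspend:

```python
import asyncio

async def create_eager_task(coro):
    # Hypothetical API: because this function is async, callers must
    # write `await create_eager_task(...)`, making the call site look
    # like a potential suspension point, even though this body happens
    # not to suspend today.
    return asyncio.create_task(coro)

async def main():
    task = await create_eager_task(asyncio.sleep(0))
    await task
    return isinstance(task, asyncio.Task)

print(asyncio.run(main()))  # → True
```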
Do you propose declaring a `create_eager_task()` call a yield point, along with the `await v` expression?
I was thinking we make `create_eager_task` async, so you would have to await it when using it, hence introducing a suspension point, visually.
Sounds good, except for the longer name.
If we decide to go this way we probably need a new method in TaskGroup also.
Agreed. In my opinion TaskGroups are more important; I think a code base that uses task groups exclusively over `asyncio.create_task` is going to end up looking better in the long run, and doing structured concurrency by default. Maybe we can use this change to nudge folks towards task groups by making the API a little nicer there, and a little lower level in asyncio proper? Just brainstorming here.
`async def spawn(...)` looks short and natural, doesn't it?
I find "eager" to be a bit jargony and not so self-explanatory; a new user would have to look up what it means. `create_and_start()` or `create_and_start_now()` would, for me, express what's going on more clearly.
`spawn` sounds good. Some other options:

```python
async def start(...)             # short; describes that the task will start right away
async def create_and_start(...)  # sorts before `create_task` in autocomplete
```