When I call set_result() on a Future that is awaited by some coroutine, it doesn’t immediately wake the coroutine up. Instead, all Future callbacks are scheduled via loop.call_soon.
Would it be possible to add some kind of Future.eager_set_result/eager_set_exception that would execute all callbacks right away and effectively begin executing the awaiting coroutines immediately?
Is there maybe a way to do it now, with the current Future implementation? Maybe some hack?
This may be very useful in high-level networking libraries that receive data through non-async Protocol.data_received/buffer_updated callbacks. Libraries typically parse new data and at some point want to deliver a result to a user coroutine that is awaiting a Future.
Being able to do it eagerly would reduce overall read latency. In some cases it would also make it possible to reference the library’s read buffer through a memoryview and avoid memory copying.
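For what it’s worth, one possible hack along these lines against today’s internals. This is a fragile sketch, not a supported API: it relies on the pure-Python Future implementation (`asyncio.futures._PyFuture`), whose `_callbacks` list holds `(callback, context)` pairs, and `EagerFuture`/`eager_set_result` are made-up names:

```python
import asyncio

class EagerFuture(asyncio.futures._PyFuture):
    """Hack sketch: a Future whose done-callbacks can run synchronously.

    Depends on pure-Python Future internals (_callbacks), so it is
    fragile and not a supported API.
    """

    def eager_set_result(self, result):
        # Steal the pending done-callbacks so set_result() has nothing
        # left to schedule via loop.call_soon ...
        callbacks = self._callbacks[:]
        self._callbacks.clear()
        super().set_result(result)
        # ... then invoke them synchronously, in their saved contexts.
        for cb, ctx in callbacks:
            ctx.run(cb, self)
```

Calling eager_set_result() from a plain protocol-style callback resumes the awaiting coroutine before the callback returns. Calling it from inside a running task would raise, since the loop refuses to enter a task while another task is executing.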
I’m optimizing a hot loop in my Elixir-Python interop project. The Python side loops on await pipe_based_stream_reader.readexactly(n), and schedules code execution via loop.create_task.
The end-to-end[1] latency is dominated by loop.call_soon. I was able to reduce it from ~50 μs to ~39 μs just by calling loop.create_task(..., eager_start=True). By monkey-patching Future.set_result with @tarasko’s implementation above, I got it down further to ~28 μs.
(Blocking os.read calls instead of StreamReader got me to ~22 μs, but that’s problematic for various reasons.)
That is to say, there’s a ton of latency to unlock by skipping loop.call_soon. Just adding Future.eager_set_result would be a great step - I could implement a custom StreamReader based on eager Futures. Right now I can only achieve that with callback style, no awaits.
Maybe we could aim for something like loop.set_future_factory(lambda: loop.Future(loop=loop, eager_start=True))? That would nicely mirror how Tasks are treated.
Elixir caller → encoding request → BEAM Port system → pipe → Python → decoding request → eval → encoding response → pipe → BEAM Port system → decoding response → BEAM process send → Elixir caller ↩︎
Yes, totally agree so many things could benefit from eager futures.
To summarize, these are possible API changes for futures that are consistent with the eager task API:
# By default Futures are non-eager, but it is possible to override defaults
# with BaseEventLoop.set_future_factory(factory)
Future(*, loop=None, eager_start=False)
BaseEventLoop.set_future_factory(factory)
BaseEventLoop.get_future_factory()
BaseEventLoop.create_future(**kwargs)
# C implementation that optimizes eager future construction (compared to the pure Python lambda: loop.Future(loop=loop, eager_start=True))
asyncio.EagerFutureFactory(loop)
# It is possible to override the default and explicitly specify eagerness when the result is set
Future.set_result(result, eager_start=None)
Future.set_exception(exc, eager_start=None)
Did I miss something?
I have updated the draft PR with the proposed API changes.
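None of these loop-side hooks exist yet; to make the shape concrete, here is a rough prototype of how set_future_factory/get_future_factory/create_future could be layered on a custom loop today. The names come from the list above; the implementation is my assumption, not the draft PR’s:

```python
import asyncio

class FutureFactoryLoop(asyncio.SelectorEventLoop):
    """Sketch of the proposed future-factory hooks on a custom loop."""

    _future_factory = None

    def set_future_factory(self, factory):
        # factory: a callable returning a Future-compatible object, or None
        self._future_factory = factory

    def get_future_factory(self):
        return self._future_factory

    def create_future(self, **kwargs):
        if self._future_factory is not None:
            return self._future_factory(**kwargs)
        return super().create_future()
```

Code that calls loop.create_future() would then transparently receive whatever the factory builds, e.g. an eager-capable Future.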
I guess that the future factory approach is dangerous; it breaks assumptions for all users of the event loop.
Assume we have an HTTP framework A that is built with awareness of immediate callback calls. It is safe to use. Moreover, the delayed-futures mode doesn’t do much harm; it could only slow the whole system down a little.
Also, we have a library B that is meant to be used by user code, e.g. for implementing HTTP endpoints. If library B is written with the assumption that fut.set_result() behaves in the conventional style, and the loop’s future factory is switched to eager mode, the library could break, with consequences up to infinite loops and other unpredictable behavior.
On the other hand, the initial proposal with set_eager_result() and set_eager_exception() seems safe to me; an eager_start boolean argument, if it is keyword-only, also works fine. Task should forbid calling the eager versions of set_result()/set_exception(), as it already does for the lazy setters.
The summary is: only eagerness-aware code can work safely in this mode. Sometimes this optimization is acceptable, but applied everywhere it could lead to fragile errors.
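The contract at risk here can be shown in a few lines: with today’s Future, the statements after set_result() always run before any done-callback fires, and eagerness-unaware code may silently depend on that ordering. A minimal demo against the current (lazy) Future:

```python
import asyncio

async def main():
    loop = asyncio.get_running_loop()
    fut = loop.create_future()
    order = []
    fut.add_done_callback(lambda f: order.append("callback"))

    fut.set_result(1)
    # With today's lazy Future this line runs *before* the callback;
    # an eager future would invert the order, surprising code like
    # library B above.
    order.append("after set_result")

    await asyncio.sleep(0)  # let the call_soon-scheduled callback run
    return order

print(asyncio.run(main()))
# → ['after set_result', 'callback']
```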
Honestly, I think an eager task factory is also dangerous. Unfortunately, we added that factory first and later added eager_start argument to create_task().
I suggest not making the same mistake again.
We could likely have a way to set the factory within a context using contextvars. I’m not sure that’s worth it for the use cases that would benefit.
As a function or method that isn’t called by default, it can remain fully opt-in behavior, used only at the precise locations where it is intended and is both safe and beneficial. So I don’t have any objections to the functionality existing, but I share the same concerns about making it either a factory or the default behavior.
I don’t mind not having the factory; the proposal was just for symmetry with Task. I agree it’s problematic, and if I want to take responsibility, I can always have a custom loop that creates futures I like (maybe even based on Context).
Design-wise, I like f.set_result(r, eager_start=True) more than loop.create_future(eager_start=True), because the choice of eagerness stays on the caller side. Though most of the time, the same library controls both setting results and creating futures.
For optimizing existing code, I like loop.create_future(eager_start=True) more, exactly because I could have a custom loop that returns them when I choose. For example, I could instantly have an eager StreamReader, without having to copy its implementation. Same brittleness problem though - as a consumer, I can’t guarantee StreamReader is eagerness-safe.
Either implementation unlocks latency gains that were hard to get before.
I don’t mind not having the factory; the proposal was just for symmetry with Task.
Yes, same for me. I personally don’t need set_future_factory, just having set_eager_result, set_eager_exception will be sufficient.
I also think the fewer changes the better. A new API can be added later, but deprecating an API is a pain.
Design-wise, I like f.set_result(r, eager_start=True) more than loop.create_future(eager_start=True),
Me too. create_future is called in one place, while set_result can be called from multiple places, where more context may be available about whether going eager in that particular circumstance is safe or not.
Should eager_start be a keyword argument to set_result/set_exception, or is it better to add two new methods?
I have a slight preference for new methods, because library code can then test with getattr() whether a future supports eagerness.
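The getattr()-based detection would look something like this. Note that set_eager_result is the proposed, not-yet-existing method, so against today’s Future this always takes the fallback path:

```python
import asyncio

def deliver(fut: asyncio.Future, value) -> None:
    """Deliver a result eagerly when the Future supports it."""
    set_eager = getattr(fut, "set_eager_result", None)
    if set_eager is not None:
        set_eager(value)       # proposed eager path: callbacks run now
    else:
        fut.set_result(value)  # conventional call_soon-based path
```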
Echoing what @asvetlov said, set_eager_result should be async (and needs to be awaited). Otherwise we break a fundamental property of asyncio, that tasks only switch at explicit suspension points.
@Tinche It is a really subtle question.
The task should switch at an explicit yield point, I totally agree.
I see two use cases for set_eager_result():
Call from a regular callback, e.g. from Protocol.data_received() or Protocol.buffer_updated(), as the topic was initially raised. I guess immediate callback execution is safe here if the code is written well. For example, calling a user HTTP handler immediately after receiving the whole HTTP packet is totally fine; there is no need to wait for the next loop iteration. Moreover, for calling from regular callbacks, set_eager_result() should be a plain (non-async) function.
set_eager_result() is called from async code for some reason. Yes, here it could switch tasks implicitly, and I agree that is not a desirable action. Moreover, in this case await fut.set_eager_result() also doesn’t make any sense in terms of performance; we would have to jump between tasks through a series of loop.call_soon(task.__wakeup, …) calls anyway. That said, I propose forbidding set_eager_*() calls from async functions, i.e. where current_task() is not None.
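That restriction can be expressed as a small guard. One wrinkle: asyncio.current_task() raises when there is no running loop at all, which the sketch treats as "not in a task" (forbid_inside_task is a made-up helper name, not a proposed API):

```python
import asyncio

def forbid_inside_task(fn):
    """Sketch of the proposed rule: eager setters must not run in a task."""
    def wrapper(*args, **kwargs):
        try:
            task = asyncio.current_task()
        except RuntimeError:  # no running event loop at all
            task = None
        if task is not None:
            raise RuntimeError(
                "eager setters may only be called from non-async callbacks")
        return fn(*args, **kwargs)
    return wrapper
```

A plain loop.call_soon callback or Protocol.data_received would pass the check (current_task() is None there); any code running inside a task would get the RuntimeError.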
@tarasko yes, I think Future.set_eager_result() and Future.set_eager_exception() are fine. There is no need to blow up the public API without a clear use case.
And yes, I also slightly prefer new methods over adding arguments because of better introspection by getattr()
Is this really a fundamental property? This is already what happens when you call loop.create_task(eager_start=True). The intent to switch doesn’t even have to be explicit, since eager starts can be enabled for every task via loop.set_task_factory.
I disagree that it doesn’t make any sense - to me, it’s exactly the same use case as the “call from a regular callback” one, and exactly the same time latency reduction as in the sync case.
I don’t think calling eager_set_result from a task is any more problematic than create_task(eager_start=True) is (which can be created from a task), but it unlocks more use cases, like an async IO loop:
async def io_loop(
    reader: StreamReader,
    waiters: list[Future],
    buffer: list,
):
    while True:
        result = parse_data(await reader.readexactly(10))
        if waiters:
            waiters.pop().eager_set_result(result)
        else:
            buffer.append(result)
For a more real world example, in my library:
- The main loop is an async function. It reads requests & responses from a pipe (currently a StreamReader).
- Requests are run with create_task(eager_start=True).
- Responses are delivered with result_fut.set_result().
The created “user” tasks can send requests through StreamWriter, and await responses through result_fut.
I use eager_start=True here to immediately run these tasks, because it’s a significant reduction of end-to-end latency in the most common scenario of 1 user task at a time. I would want to use result_fut.eager_set_result for exactly the same reason, unblocking the user task as quickly as possible, because the single call_soon delay is comparable to a full IO round trip to the Elixir process.
In practice the answer is yes; that’s how we’ve been selling asyncio to the community for a long time now and that’s what the community expects. For proof you can read through the virtual threads discussion; folks in favor of asyncio very eagerly mention this.
You’re right that this rule got broken with eager task factories; that’s unfortunate. We don’t have to break it further; I’d make an argument that it’s important that we shouldn’t.
@kzemek could you provide a more realistic example?
Sorry, io_loop() looks too clumsy and synthetic, and I cannot understand your library’s design from such a short description.
Side question: does the fact that StreamReader/StreamWriter use standard ‘lazy’ futures affect the performance?
@asvetlov My real main loop is the async def _run_loop in py_src/snex/runner.py. I can try to distill a better example later tonight.
I admit it was a very synthetic attempt. The point was that there’s a ton of valid system designs that could benefit from eager_set_result, that currently pay high costs because of the call_soon overhead. Right now the name of the game in performance-sensitive asyncio code is having as few futures on the call path as possible, to the point that it feels like code golfing.
Very much so. StreamWriter is alright, as only drain is async, and it only blocks on a high watermark. I’m currently trying various hacked-together workarounds for the future overhead in StreamReader though, and they’re not pretty. But removing it reduces latency by about 20%, so it feels worth it.
I also find it a bit weird from the user perspective that it wouldn’t be allowed to use eager_set_result from async code. Especially given that create_task(…, eager_start=True) works fine from async code.
I can imagine that it may become somewhat annoying when implementing composed operations. Let’s say we have a non-async function that sets a result on a future; it might have to check asyncio.current_task() to decide whether it can use eager_set_result. In a simple code base this is not an issue, but in multi-layered code where the result can be set for a variety of reasons from multiple places, this may become a problem.
This sounds a bit wrong to me for some reason, but no strong opinion.
I personally don’t need it. Just being able to call eager_set_result from buffer_updated/data_received is enough for my case.