Asyncio without function colouring

I’m not interested in testing it without fully understanding the intended behavior I’m meant to be testing. Above, when you described the behavior, you described several things that are breaking changes, but are now saying all code will work unchanged.

If you want people to take it seriously, there has to be a clearer picture of what’s being proposed. An implementation is not a specification.

1 Like

Yes, and no. I listed the things you’ve picked up on because they could be incompatible changes. Any change can be an incompatible one; the art is judging the level of impact. I’m glad you’ve picked up on them. I do not have a good feel for the existing async landscape, so thank you for the feedback.

On inline generators (`(x for x in things)`), my own experience is that of surprise. `(x for x in things,2,3)` is a tuple, `(x for x in things,2)` is a tuple, `(x for x in things,)` is a tuple, but `(x for x in things)` is a generator. Inline async generators are, for me, a whole extra level of gotcha. That being said, async generators still work, and, if this incompatibility is a problem, the code generator could be adapted to make an async generator in this case.

On your library idea - I think it could work in many cases, maybe the majority, but quite a few libraries wouldn’t be fixed (ones with callbacks passed in, for example - is the callback async or not?). I also feel it’s a workaround for only one part of the problem.

On 3rd-party event loops - I have no feel for how much this is done, or whether they have their own Future classes. Do you have an example in mind?

Could you create a similar WASM demo so that anyone interested can try it out without compiling?

I don’t support this change at all - much for the reasons already posted by others.

I am writing, though, to point out that in many cases there are suitable workarounds for the things you list as motivation: enabling async code in special “dunder” methods such as `__setattr__` (I can imagine a pass-through setter on a network-backed object being convenient), or, mainly, avoiding the duplicate intermediate code otherwise needed so that a call path is composed purely of `async def` functions.

(As for properties, they work with `async def` functions out of the box.)

One workaround, when calling sync code from an async context where that sync code would downstream have to await things (become async again), is to take note of the currently running loop, call the synchronous code in a thread, and then, when going async again, schedule the async code on the original loop (and synchronously wait for it in the sync thread).
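For illustration, here is a minimal sketch of that pattern using plain asyncio primitives rather than any particular library (the function names are made up for the example): the sync code runs in a worker thread, and when it needs async work it schedules a coroutine back on the original loop and blocks until it completes.

    import asyncio

    def sync_part(loop):
        # Ordinary synchronous code, running in a worker thread.
        async def inner():
            await asyncio.sleep(0.1)
            return "done"
        # Schedule the coroutine on the original loop and wait for it
        # synchronously here, in the worker thread.
        return asyncio.run_coroutine_threadsafe(inner(), loop).result()

    async def main():
        loop = asyncio.get_running_loop()
        # Run the sync code off the loop, so the loop stays free to run inner().
        result = await asyncio.to_thread(sync_part, loop)
        print(result)

    asyncio.run(main())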

That is the approach I take in my “extraasync” package and its `sync_to_async` and `async_to_sync` function pair, and you would be most welcome to use it and/or collaborate on it.

Example of code using an async `__setattr__`:

    import asyncio

    from extraasync import async_to_sync, sync_to_async


    class A:
        def __setattr__(self, attr, value):
            # A synchronous dunder whose real work is asynchronous: the
            # coroutine is awaited for us on the host loop by sync_to_async.
            async def set():
                await asyncio.sleep(1)
                super(A, self).__setattr__(attr, value)
            sync_to_async(set)


    def trampoline(i):
        # Ordinary synchronous code, entered from async land via async_to_sync.
        a = A()
        print(f"starting set to {i}")
        a.b = i
        print(a.b)


    async def main():
        t1 = async_to_sync(trampoline, (1,))
        t2 = async_to_sync(trampoline, (2,))
        await asyncio.sleep(0.5)
        print("middle way")
        await asyncio.gather(t1, t2)


    asyncio.run(main())

The only thing, from the writer’s perspective, is that when calling sync code from async, if one ever intends to do async work inside that call, the sync code has to be called through the `async_to_sync` bridge.
If the code calling `sync_to_async` doesn’t perceive that it is in a context started this way, it will spawn an event loop for the current thread instead (which will be reused for further `sync_to_async` calls).

2 Likes

No, all of those things you say are tuples are actually syntax errors. A generator expression has to be surrounded by ().
`((x for x in y),2,3)`, `((x for x in y),)` - these are tuples.

Many thanks for the correction. This prompted me to investigate further, and so I rediscovered what caught me out originally:

After `x = [a for a in range(3)]` you can do `x[1]`; however after `x = (a for a in range(3))` you can’t, because x is a generator.
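A quick illustration of the difference:

    x = [a for a in range(3)]
    print(x[1])        # 1 - a list supports indexing

    x = (a for a in range(3))
    # x[1] would raise TypeError: 'generator' object is not subscriptable
    print(next(x))     # 0 - a generator is consumed lazily, one item at a time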

Yeah, this catches out a lot of people (that’s NOT a “tuple comprehension”). I don’t know if it’d help with remembering the distinction, but a tuple should generally be considered less like a list and more like a dataclass. The namedtuple from the standard library helps with this; yes, it does have a well-defined sequence to the parts, but it’s very definitely a specific set of parts that can be identified and named. In contrast, a list is generally able to grow and shrink. Comprehensions (list, dict, set) can start with any length of input, may potentially eliminate some elements, and could result in any amount of output; so on that basis, it doesn’t really make sense to have a “tuple comprehension”. It does, however, make sense to have a sort of “lazy list comprehension”, which is what a genexp is.

Maybe that’s not helpful, in which case feel free to ignore it :slight_smile:

1 Like

Yes, many of the dunder methods do work ‘out of the box’; for example `async def __lshift__(self, other)` can be used as `await (asyncthing << value)`. However, the mechanics of await get in the way quite quickly. For example, the sync

    ostream << 'My weight is ' << me.weight << ' kg\n'

in async becomes

    await (await (await (ostream << 'My weight is ') << me.weight) << ' kg\n')

If you need to await `__getattr__()` too,

    await (await (await (ostream << 'My weight is ') << await me.weight) << ' kg\n')

What was a slick way to output to a stream becomes somewhat unwieldy, masking the author’s intent with the mechanics.

Some dunder methods do not work. For example, `async def __setattr__(...)`: the return value from the call to `__setattr__()` is discarded. As that return value is the coroutine which needs to be awaited, not a great deal happens, except for the warning about the un-awaited coroutine.
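To make the failure mode concrete, here is a small self-contained illustration (the class and attribute names are just for the example):

    import asyncio

    class Remote:
        async def __setattr__(self, name, value):
            await asyncio.sleep(0)              # stand-in for real async work
            object.__setattr__(self, name, value)

    r = Remote()
    r.x = 1                  # the coroutine returned by __setattr__ is silently discarded
    print(hasattr(r, "x"))   # False - plus a "was never awaited" RuntimeWarning later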

Moving on to sync_to_async etc. Here I checked using asgiref’s implementations of these. `sync_to_async` executes the sync code on a separate thread through a thread pool. This brings limitations from the OS’s thread count, and multi-threaded interlock hazards (TBH, not usually a problem), but that’s not the real gotcha. I noticed this implementation had a pool of one thread, executing the pieces of code off a queue. It was easy to build a not entirely contrived case which deadlocked due to the thread-pool bottleneck. Two tasks, A and B, both went async→sync→async and, in the inner sync, waited for a future before setting the result of a different future. A third task waited a bit, set the result of A’s waited-for future, waited for A’s other future, then set B’s waited-for future. If A was created before B everything was OK; if B was created first, its sync section blocked the thread pool, so A’s sync section couldn’t execute and release the end of the sequence. IIRC I encountered this deadlock in my website - just one of the problems I hit trying to adopt asyncio there.
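The same bottleneck can be shown in a much more stripped-down form than my website code, using a one-thread pool directly (this is only an illustration of the hazard, not the asgiref implementation; a timeout is used so it doesn’t hang forever):

    import threading
    from concurrent.futures import ThreadPoolExecutor

    ready = threading.Event()

    def job_a():
        ready.set()                      # would unblock job_b...

    def job_b():
        # ...but job_b is occupying the pool's only thread while it waits,
        # so job_a never gets to run: a deadlock, broken here only by the timeout.
        return ready.wait(timeout=2)

    with ThreadPoolExecutor(max_workers=1) as pool:
        fb = pool.submit(job_b)          # takes the single worker thread
        fa = pool.submit(job_a)          # queued behind job_b
        print("job_b saw the event:", fb.result())   # False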

So why does ‘await anywhere’ help? The park-this-Task-while-it-waits mechanism is completely different, so it is allowed anywhere…

You don’t have to `sync_to_async` or `async_to_sync` at all, which avoids deadlocking through the thread pool.

You can monkey-patch sync libraries to be asynchronous-aware. As a worked example, I thought requests was a good candidate, as there’s a Stack Overflow question asking for exactly this. After a couple of unsuccessful approaches using asyncio’s connection system, I went to the bottom layer and monkey-patched socket. The socket class can only have its methods replaced, due to ssl’s implementation. Here’s an example monkey-patched function:

    # Note: `await` appears inside a plain `def` here - this code targets the
    # proof-of-concept interpreter, where await is allowed in any function.
    def recv_inner(self, do_recv, flags):
        if AsyncSocket.docalldirect(self, flags):
            # Caller expects a non-blocking interaction: call through directly.
            return do_recv(flags)
        AsyncSocket.setsubblocking(self, False)
        fd = self.fileno()
        loop = asyncio.get_event_loop()
        fut = loop.create_future()
        handle = loop.add_reader(fd, AsyncSocket.send_and_recv_cb, self, fut)
        try:
            while True:
                try:
                    return do_recv(flags)
                except (BlockingIOError, InterruptedError):
                    pass
                except BaseException:
                    raise
                # Park this task until the selector reports the fd readable.
                await fut
        finally:
            if handle is None or not handle.cancelled():
                loop.remove_reader(fd)

    def recv(self, bufsize, flags=0):
        return AsyncSocket.recv_inner(self, partial(rawsock.recv, self, bufsize), flags)

A little explanation:

  • `docalldirect()` - is the caller expecting a non-blocking interaction?
  • `setsubblocking()` - there’s a separation between what the socket user has set for non-blocking and what the underlying socket is set to; this sets the underlying socket’s blocking mode.
  • `send_and_recv_cb()` - after the select has triggered, this translates the socket’s state into the future’s result.

This is written assuming selector events (the default) and would need adapting for proactor events. If you dig into it further, you’ll realise that ssl does most of the communication directly, not via socket. However, once an optional callback was added so that Python code can do its selects, that too was very easy to make async-capable.
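For readers less familiar with the selector machinery, roughly the same parking pattern can be written in today’s standard asyncio (a simplified, standalone sketch, not the proof-of-concept code; current asyncio also exposes this ready-made as `loop.sock_recv()`):

    import asyncio
    import socket

    async def async_recv(sock: socket.socket, bufsize: int) -> bytes:
        loop = asyncio.get_running_loop()
        sock.setblocking(False)
        while True:
            try:
                return sock.recv(bufsize)            # succeeds once data is ready
            except (BlockingIOError, InterruptedError):
                fut = loop.create_future()

                def on_readable():
                    if not fut.done():
                        fut.set_result(None)

                loop.add_reader(sock.fileno(), on_readable)
                try:
                    await fut                        # park the task until readable
                finally:
                    loop.remove_reader(sock.fileno())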

The overall result: requests on my proof-of-concept Python, with the monkey patch, is async-ready without needing to adjust or add anything to requests. Moreover, any library which uses socket or ssl is now asynchronous-ready. Any sync library can be used async, as only the outermost entry point of a Task and the innermost thing-which-waits need to know that async is happening.

As it worked so well, I’ll now integrate the monkey patch into the proof-of-concept and make improvements to its timeout handling (which, in the monkey-patch demo, is incomplete) and loop-kind awareness (selector versus proactor loops).

The transparent promotion of `socket` to async, as you show, is impressive.
Still, as I first wrote, I don’t know if I’d back such changes.

This is mostly the reason for my other (and for this) reply: the example in my other post is exactly about `__setattr__`. What the `sync_to_async` and `async_to_sync` pair in the extraasync library allows is for synchronous code (including `__setattr__` and other dunder methods), which returns synchronous results, to call awaitable code and have it awaited - either in the “host” event loop, or in a new one if there is none running (which is then kept around to be reused in further `sync_to_async` calls).

(Extraasync will also avoid the deadlock situation you mention by throwing in more OS-level threads - if the need arises, I might put something in place to cap and fine-tune this behavior)

And as for:

Yes, that would also work with extraasync, by setting up `__lshift__` as I did `__setattr__` above - but other than that, I want to be away - very far - from this kind of code. :slight_smile: It may or may not be the main motivation behind the creation of C++, but don’t count me in.

  • For that, I’d rather use the new t-strings there, like in `ostream.write(flatten_awaitables(t"My weight is {me.weight} kg\n"))`
1 Like

I’m curious to know what ‘the need arises’ is. I’m guessing it’s when all the current threads in your pool are busy, but I’d love to know if it’s cleverer than that. In any case, your pool of threads is, due to OS limits, finite (100-200 quite often, I believe). This means it still has the same deadlock hazard; you’re just less likely to hit it. All the threads currently running synchronous jobs could be waiting on the one extra synchronous job so they can carry on, but that job can’t run as there’s no spare thread to run it.

In a Django (or other ORM) website, you lean heavily on `__getattr__()` etc., so most of your page-serving code will need to run sync. If you’re doing that using extraasync (or asgiref), that sync code will be running on a thread… per page served. Which loses the point of asyncio, that being threads-on-the-cheap for serving web pages by the 10,000s at a time.

I understand that you’ve found your happy place with asyncio as it is. This change is not for you (although I suspect that if you started using it, you’d quite like the freedom it gives). This feature is for website developers who want to use ASGI and all their ORM code, without having to decorate their code with transitions between sync and async, and without deadlock hazards. In other words, programmers who want to do their coding without needing the mechanics of asyncio, but still keep the benefits. (My opinion: function colouring is not a benefit :wink: )

To help you understand where I came from: I had a well-developed website and I tried asyncio-ing it. I was aware I’d need to decorate every function async… it was the unusable ORM, the deadlocks, and, where it did work, the awaits liberally sprinkled through expressions that got me to the conclusion that asyncio was practically unusable here. I didn’t set out to spend half a year adapting Python in a way that leaves existing async code working but makes awaiting an async def unnecessary - but now that I have, I’m happy to share it.

1 Like

You would have a better experience with asyncio if you didn’t try to force the patterns of synchronous APIs to work with it, or were more intentional about wrapping behavior. You only need a single await, even with operator overloading, by having the operator return a class that collects further operator use and is itself awaitable.
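A minimal sketch of that pattern (the class and names here are illustrative, not from any particular library): each `<<` only records its operand and returns a new awaitable builder, so the whole chained expression needs just one await, which performs the actual write and drain.

    import asyncio

    class StreamShift:
        """Collects `<<` operands and performs the I/O only when awaited."""

        def __init__(self, writer, parts=()):
            self._writer = writer                # an asyncio.StreamWriter
            self._parts = parts

        def __lshift__(self, value):
            # No I/O here - just accumulate and return a new chainable/awaitable builder.
            return StreamShift(self._writer, self._parts + (str(value),))

        def __await__(self):
            return self._flush().__await__()

        async def _flush(self):
            self._writer.write("".join(self._parts).encode())
            await self._writer.drain()

    async def output(writer, weight):
        ostream = StreamShift(writer)
        # One await for the whole chained expression:
        await (ostream << "My weight is " << weight << " kg\n")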

And despite the fact that plenty of people run websites with asyncio, you concluded the issue was with asyncio, not with sprinkling in async and await without considering how it works?

Notably, asyncio’s stream writers have separate `write(...)` and `drain()` methods, and only the latter is awaitable.

1 Like

That is the problem. If you have to do a ton of “consideration” about how to do it, then it becomes harder and harder to justify doing it. You can frame this as “user error”, but the fact of the matter is that what most people want is simply to have their code run faster without having to think too much about how exactly it’s doing that.

If people just want their code to run faster, asyncio is not going to do that. I’m not sure what you’re expecting of it.

It’s not all that different from adding threading everywhere without considering it. If you take code that wasn’t designed for concurrency and just blindly add concurrency, of course you are going to get issues like deadlocks.

People should be thinking about the code they write.

I went to take a look at Django’s own documentation of its async support here, and between my own experience with asyncio, other frameworks that use it, and the specific considerations Django has chosen to expose to its users directly, I’m pretty comfortable calling this “not a problem with asyncio”.

The extra consideration here looks mostly to be about how Django specifically wants certain things wrapped/written, not about asyncio, though I agree with @Liz that you can’t just add concurrency to code that never had it without at minimum considering code structure.

I’m not the biggest fan of Django myself, and find the async support provided by other frameworks to be more intuitive, because they were written with async patterns in mind, rather than the way Django evolved to add it[1]


  1. This isn’t meant as a slight, just as an acceptance of the fact that the expectations Django started with don’t work well for async, so any migration path for the framework to support it was going to be either awkward, breaking, or having completely diverging APIs for WSGI vs ASGI, and that awkwardness was pushed to their users with the choices they felt were best for their project. ↩︎

1 Like