PEP 828: Supporting 'yield from' in asynchronous generators

Unfortunately my experience isn’t exactly recent - other than removing uses of them, I haven’t touched async generators for some years now. From memory, the issues I had were to do with exceptions and cancellations - but you can get other issues along those lines even if you’re not using async generators, so indeed the root cause may have been somewhere else, and merely avoided by switching away from async generators. It also doesn’t help that I’m still mainly using Python versions 3.10 and 3.11, so things may well have improved in newer versions. I realise now that, for these reasons, I’m not really in a good position to argue here.

I am (broadly) aware of this distinction, though I was probably conflating some things. I think the asyncio module has some flaws, and separately I think the design of the async/await syntax has some flaws, and often these can overlap. (For clarity, I still believe they are extremely useful features - like I said, I love asyncio - I’m not bashing them.) Regardless, I understand that neither this nor my understanding of how async generators work is on topic for this thread.

To try and explain my reasoning for why I think it’s counterintuitive for yield from to mean “yield from sync in async” and for async yield from to mean “yield from async in async”: really, it’s all down to the fact that you use the same yield x syntax in both sync and async generators.

Regular generator:

def sync_gen():
  yield val

Async generator:

async def async_gen():
  yield val

Then, when you want some form of composition:

def sync_gen():
  yield from other_sync_gen()

By symmetry, the intuitive way to compose in an async generator would be:

async def async_gen():
  yield from other_async_gen()

That said, while I have been interpreting the async in async yield from to be associated with the yield (which is why it didn’t make any sense), I can see the logic now, thanks to Alyssa’s explanation, that you have to consider it to be associated with the (invisible) for instead.

I still don’t find it intuitive, and so it still feels “wrong” to me - but I can at least now see why this pattern has been chosen.

Interesting; I’m glad you raise this point.

Are you suggesting that with async yield from added to the language, issues with delegating exception handling to subiterators might be solved, in a way that the existing async for x in y: yield x does not? In other words, that async yield from might not be equivalent to this, but superior to it?

If so, I believe async yield from might hold more value.


Finally, I’d like to apologise for derailing this thread - indeed, discussions of general problems with async aren’t on topic here, and admittedly my original post was a bit of a knee-jerk reaction when I saw the proposal, based on my bad experiences with async generators. (Usually posts like these end up in a text file on my PC, never to be seen again… this one got past the filter, I suppose.) I was quite surprised that the problems I’ve run into haven’t been encountered more widely. I’m now pretty convinced that I don’t really know what I’m talking about on this subject, so even though I still find the syntax weird, I’ll withdraw my -1.

3 Likes

As a data point of sync generator co-routines being used in the wild, we have built a high-level control system for scientific data acquisition around them [1] and leverage all of the communication channels in and out of the co-routines.

[1] bluesky/bluesky on GitHub: experiment orchestration and data acquisition (we had the name before the other now more famous bluesky)

4 Likes

Thanks for the clarification, this makes more sense.

I do understand how one might intuit that yield from yields from an asynchronous generator, but that intuition breaks down once you understand more about how generators work. Specifically, yield doesn’t need an async keyword because yielding can’t actually invoke any asynchronous code. As in normal generators, yield is just the point where the generator suspends, which doesn’t involve any direct code execution. On the other hand, yield from does invoke code, so Python needs to know whether to do that synchronously or asynchronously.
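To make that distinction concrete in today's Python (without any PEP 828 syntax), here is a minimal sketch. All names are made up for illustration. The plain `yield` lines look identical in both loops; what differs is how the subiterator is driven, with `for` calling `__next__()` and `async for` awaiting `__anext__()`.

```python
import asyncio


def sync_numbers():
    yield 1
    yield 2


async def async_numbers():
    for n in (3, 4):
        await asyncio.sleep(0)  # an await point inside the subgenerator
        yield n


async def combined():
    # Synchronous iteration: __next__() is called directly, nothing is awaited.
    for n in sync_numbers():
        yield n
    # Asynchronous iteration: every step awaits __anext__().
    async for n in async_numbers():
        yield n


async def main():
    return [n async for n in combined()]


result = asyncio.run(main())
print(result)  # [1, 2, 3, 4]
```

The choice between the two loops is the same choice that yield from versus async yield from would have to express.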

Yeah – when async yield from is used, calls to asend, athrow, and aclose are also delegated to the subgenerator. That doesn’t happen with an async for loop. Granted, it’s certainly less common to need to do that for async generators, but maybe that’s just because async generators don’t support yield from.
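A small sketch of the non-delegation described here, runnable on current Python. The generator names and the recording lists are hypothetical. Values passed to `asend()` on the outer generator stop at the outer `yield`; the inner generator is only ever advanced by the `async for` loop, so it always receives `None`.

```python
import asyncio

seen_by_inner = []
seen_by_outer = []


async def inner():
    for i in range(2):
        sent = yield i
        seen_by_inner.append(sent)


async def outer():
    # Loop-based workaround: the loop drives inner() with __anext__() only.
    async for item in inner():
        sent = yield item
        seen_by_outer.append(sent)  # the caller's value stops here


async def main():
    gen = outer()
    results = [await gen.__anext__()]
    try:
        while True:
            results.append(await gen.asend("hello"))
    except StopAsyncIteration:
        pass
    return results


results = asyncio.run(main())
print(results)        # [0, 1]
print(seen_by_outer)  # ['hello', 'hello']
print(seen_by_inner)  # [None, None]
```

The same applies to `athrow()` and `aclose()`: with the loop, they act on the outer generator's frame rather than being forwarded to the inner one.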

Apart from using yield from to yield from an async generator (as I’ve explained why that’s a problem), is there anything else that you think would improve the syntax? To me, async yield from feels pretty natural and fits with Python’s existing conventions about sync/async, but I’m happy to hear other ideas.

2 Likes

Presumably async yield from agen desugars to async for i in agen: yield i.

Meanwhile the latter can be made conditional or some computation can be applied over i.

In which case, the benefit is marginal.

Arguably the same could be said for synchronous yield from. In fact that feature always puzzled me. Maybe it’s fair to call that a misfeature.

Anyway, allowing async yield from for completeness alone doesn’t seem right.

2 Likes

See this section:

The current workaround for the lack of yield from support in asynchronous generators is to use a for/async for loop that manually yields each item. This comes with a few drawbacks:

  1. It obscures the intent of the code and increases the amount of effort necessary to work with asynchronous generators, because each delegation point becomes a loop. This damages the power of asynchronous generators.
  2. asend(), athrow(), and aclose(), do not interact properly with the caller. This is the primary reason that yield from was added in the first place.
  3. Return values are not natively supported with asynchronous generators. The workaround for this is to raise an exception, which increases boilerplate.
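As an illustration of the third drawback, here is one possible shape of the exception-based workaround on current Python. The `Result` class and the generator names are hypothetical, and this is only a sketch of the pattern, not code taken from the PEP.

```python
import asyncio


class Result(Exception):
    """Hypothetical carrier for a subgenerator's 'return value'."""

    def __init__(self, value):
        super().__init__(value)
        self.value = value


async def read_chunks():
    total = 0
    for chunk in (b"ab", b"cde"):
        total += len(chunk)
        yield chunk
    # `return total` is a SyntaxError in an async generator today,
    # so the value is smuggled out in an exception instead.
    raise Result(total)


async def outer():
    try:
        async for chunk in read_chunks():
            yield chunk
    except Result as result:
        total = result.value
    yield f"{total} bytes".encode()


async def main():
    return [item async for item in outer()]


items = asyncio.run(main())
print(items)  # [b'ab', b'cde', b'5 bytes']
```

Every delegation point needs its own try/except around the loop, which is the boilerplate the list item refers to.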
2 Likes

One example is PEP 789

1 Like

I’ve fallen foul of “asend(), athrow(), and aclose(), do not interact properly with the caller” and therefore would like this addition 😀

2 Likes

It’s taken me a bit to get my head around it, but because of the really helpful explainer from Alyssa, it seems clear to me that we should have both yield from and async yield from, and for the same reasons. The semantic difference between them isn’t as obvious as I’d like, but when you’re dealing with mixing function colors that just is part of the complexity you have to take on.

I have personally been bitten by not being able to delegate exception handling to subiterators in coroutines, and it seems to me that, though I expect async yield from to be more common in coroutine generators, this reason applies equally well to synchronous-in-a-coroutine delegation to a generator.
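For comparison, this is what delegated exception handling looks like with the existing synchronous yield from (PEP 380). The names are invented for the example. An exception thrown into the outer generator reaches the subgenerator when yield from is used, but not when a plain loop re-yields the items.

```python
class Retry(Exception):
    pass


def worker():
    attempts = 0
    while True:
        try:
            yield f"attempt {attempts}"
        except Retry:
            attempts += 1  # handled inside the subgenerator


def via_yield_from():
    yield from worker()


def via_loop():
    for item in worker():
        yield item


gen = via_yield_from()
first = next(gen)
second = gen.throw(Retry())  # forwarded to worker(), which handles it

loop_gen = via_loop()
next(loop_gen)
try:
    loop_gen.throw(Retry())  # raised at via_loop's own yield instead
    handled_by_loop = True
except Retry:
    handled_by_loop = False

print(first, "|", second, "|", handled_by_loop)
```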

3 Likes

FTR, I have done a clarity update on the PEP. Most importantly, I added two new sections to “Rejected Ideas” regarding why the proposal doesn’t use yield from to delegate to asynchronous subgenerators.

7 Likes

It looks like discussion died down, so I’ve submitted PEP 828 to the SC. Thanks everyone!

11 Likes

I understand the appeal of consistency and this does make asynchronous generators a bit more consistent with other generators, but overall I’m -1 on this, for three reasons.

  1. The PEP is underspecified. The implementation relies on yet more exceptions acting as a vehicle for returning values and other flow control.
    PEP 492 and PEP 525 which introduce async and async generators explain the additional machinery required in detail. This PEP should do the same.

  2. Complexity of implementation: Looking at the reference implementation, it does seem to add quite a lot to the interpreter and the bytecode compiler. If this were an important or popular feature, that would be worth it, but it doesn’t feel like it is for this.
    If the implementation added less to the interpreter, or added a cleaner method for handling the 4-way branching of asynchronous iteration (return, yield, async-yield, raise) then I’d be a lot more positive.

  3. yield from is problematic in async functions, as it was initially added as a halfway house to await. When used in asynchronous functions, its secondary purpose of supporting send/yield stacks significantly complicates the implementation of its now primary purpose: iteration.

4 Likes

Thanks for the feedback, Mark.

To address your three points:

Which parts do you think need more specification?

There isn’t much new functionality to specify – this is, overall, a very small PEP. All the semantics outlined in PEP 380, PEP 492, and PEP 525 remain unchanged and are reused for this proposal. The only new (user-visible) functionality is a statement equivalent to yield from that calls asynchronous methods rather than synchronous ones. In the PEP, I simply delegated to the docs instead of restating what’s already documented.

Yeah, I acknowledged that complexity is subjective, so I guess we’ll have to agree to disagree here. I firmly think the implementation is pretty simple:

  1. For yield from, the subgenerator is just wrapped in an iterator object that returns each subgenerator value as an async_generator_wrapped_value object, which the interpreter already knows how to handle.
  2. For async yield from, we reuse most of the code for yield from, but we just call async functions (and emit the necessary bytecode to await them).

What cleaner method do you have in mind? I’m open to trimming down the implementation if it’s bothersome, but it’s unclear to me what’s messy about the current design.

But that doesn’t really change with this proposal, does it? Even if this PEP is rejected, we still have to keep that behavior. This proposal is just an expansion of what’s already implemented for async generators (in the bytecode interpreter, at least; I had to change a few minor things in genobject.c).

4 Likes

Which parts do you think need more specification?

What are the precise semantics of async yield from?
The semantics of async yield are already a bit weird, so a precise specification is needed.
Like what happens if the containing async generator is suspended at a yield point and I send to it, or throw into it, or send to its sub-iterator, or throw into its sub-iterator. Likewise, what happens when it is suspended awaiting?

What role does async_generator_yield_from have?
Also, the changes to StopAsyncIteration could be made more explicit.

What cleaner method do you have in mind?

Ideally, this would be implemented entirely in the bytecode compiler. Realistically, some small changes to the interpreter are probably needed, but they should be minimized.

When used in asynchronous functions, its secondary purpose of supporting send/yield stacks significantly complicates the implementation of its now primary purpose: iteration.

But that doesn’t really change with this proposal, does it?

It does, though. Otherwise you wouldn’t need to add the ability for async generators to return values.
Currently, you can either iterate, async for, or yield. async yield from does both, adding even more complex control flow, due to sub-iterators and being an expression, not a statement.

It would just operate on whatever subgenerator is currently executing. If async gen a is currently delegating to async gen b, then asend, athrow, and aclose will be sent to b. The exact behavior of those functions remains unchanged by this proposal.
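A rough approximation of that forwarding, written by hand in current Python, might look like the sketch below. It is deliberately simplified: it only forwards `asend()` and closes the subgenerator on exit, it does not forward `athrow()`, and it should not be read as the PEP's specified expansion. The names `a` and `b` follow the paragraph above; everything else is assumed for illustration.

```python
import asyncio


async def b():
    sent = yield "ready"
    while sent is not None:
        sent = yield f"b got {sent!r}"


async def a():
    inner = b()
    try:
        value = await inner.__anext__()
        while True:
            sent = yield value
            # Forward whatever the caller sent to the subgenerator.
            value = await inner.asend(sent)
    except StopAsyncIteration:
        pass  # b() finished, so the delegation is over
    finally:
        await inner.aclose()  # closing a() also closes b()


async def main():
    gen = a()
    out = [await gen.__anext__()]
    out.append(await gen.asend("x"))
    out.append(await gen.asend("y"))
    await gen.aclose()
    return out


out = asyncio.run(main())
print(out)  # ['ready', "b got 'x'", "b got 'y'"]
```

The amount of code needed for even this partial version is a reasonable indication of why a dedicated delegation syntax is being proposed.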

Please correct me if I’m wrong, but I feel like CPython-specific implementation details don’t belong in a language-specification PEP. Though, I could add a section to “Reference Implementation” if you’d like.

I think I can remove 2 of the 3 added instructions by switching to LOAD_ATTR + CALL for GET_ASEND and GET_ASYNC_YIELD_FROM_ITER. We’ll probably have to keep CLEANUP_ASYNC_THROW in order to handle StopAsyncIteration. Would you be okay with that?

I do agree that async generators are complicated and can get particularly nasty when you look at the details, but that’s already a problem, and I don’t currently feel that it’s made worse by this proposal (the only necessary modifications to the generator implementation are just removing or adding some PyAsyncGen_CheckExact calls in various places, and a tiny change to add the value attribute to the propagated StopAsyncIteration exception).
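For readers less familiar with the mechanism being mirrored: in synchronous generators, a `return` value already travels on the `value` attribute of `StopIteration`, and yield from unpacks it. The sketch below shows only that existing synchronous behaviour, with made-up names; how the equivalent attribute on `StopAsyncIteration` behaves is up to the proposal and is not shown.

```python
def sub():
    yield 1
    return "done"


def outer():
    # yield from evaluates to the subgenerator's return value.
    result = yield from sub()
    yield result


gen = sub()
next(gen)
try:
    next(gen)
except StopIteration as stop:
    returned = stop.value  # the return value rides on the exception

items = list(outer())
print(returned, items)  # done [1, 'done']
```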

Is there some plan to simplify generators that async yield from would otherwise block? If so, then I do see how this PEP could be troublesome, but otherwise I think the increase in complexity is pretty minimal and non-invasive.

4 Likes

I actually think it would really be valuable for the PEP to explicitly spell out the semantics in the same style that PEP 380 does, and to try to check that the intended implementation matches it. That makes sure that everybody understands the modifications being made to the PEP 380 semantics in the same way.

Also, the PEP says “async yield from is conceptually equivalent to” the simple for loop, which I think is importantly not the case; that’s part of why you want to add it!

Otherwise I like this!

3 Likes

I guess this isn’t too hard to add. I’ll write something up and put it in the PEP.

Note the word “conceptually” here. Maybe “roughly” would have been a better term? I’m trying to introduce the idea to those who aren’t already familiar with yield from or are otherwise confused about what async yield from would do. I think starting with a simple loop is a good way to familiarize the reader with the high-level behavior, rather than just diving into subgenerator delegation.

1 Like

I think roughly would be good, yeah.

1 Like

The phrase “roughly equivalent to” is common enough in PEPs that it’s almost its own piece of terminology. Or if you prefer, “broadly” would also work.

1 Like

Using the “roughly” comparison to introduce the concept, and then the full Python equivalent comparison in the specification section, should definitely help make it clear what is missing from the existing looping constructs (full delegation of send and throw handling).

2 Likes