Preventing `yield` inside certain context managers

njs · April 2, 2019, 7:32am

What’s this?

It’s a proposal that we make it possible to write context managers that prevent yielding out of their with block. This is something @yselivanov and I have been discussing because we need it for “structured concurrency” features in Trio and asyncio. It will become a PEP but I wanted to post something early to start discussion.

What’s wrong with `yield`?

Trio pioneered two new constructs, which are both implemented as context managers: cancel scopes, and nurseries. A cancel scope lets you delimit a block of code, and apply timeouts or otherwise cancel it. A nursery is a structured way to run a collection of concurrent tasks. We believe these are the right primitives to model concurrency in Python, and Yury is working on adding them to asyncio.

But, both of these constructs assume that “the code that runs inside the with block” is a well-defined thing. Generators and async generators break that assumption. If users write code like this:

def bad_agen1():
    with trio.CancelScope():
        yield  # yield inside 'with cancel scope' doesn't work

async def bad_agen2():
    async with trio.open_nursery():
        yield  # yield inside 'async with nursery' doesn't work

…then the generator’s caller’s code ends up running in between the calls to __(a)enter__ and __(a)exit__, and this corrupts the concurrency system’s runtime state.

If you’re lucky this produces a confusing crash at some later time, and if you’re unlucky it could produce incorrect results. We’d like to make it so attempts to yield in these contexts instead raises a prompt and informative error.

Is this a problem in practice?

Yes. Our experience with Trio is that new users hit this frequently. Trying to use a cancel scope or nursery inside an async generator is a natural thing that people often try. The resulting errors are hard to interpret or debug. The details of the problems are highly technical, so it’s hard to teach users which cases are OK and which ones aren’t.

For example, here’s a tricky case:

# https://github.com/HyperionGray/trio-websocket
from trio_websocket import open_websocket

async def my_agen():
    async with open_websocket(...) as websocket:
        while True:
            yield await websocket.get_message()

This looks innocent, but is actually broken. The websocket protocol requires a background task for each connection, so that PING frames can be handled promptly. Therefore, the open_websocket context manager encapsulates a nursery context manager inside it, which means that this code actually has a yield inside a nursery! As you can imagine it’s hard for new users to anticipate this, and even if you understand the theory about where yield is and isn’t allowed, it’s still easy to accidentally write buggy code if you don’t realize that the context manager you’re using has a cancel scope or nursery hidden inside it.

When we add structured concurrency to asyncio, then we expect asyncio users will run into the same issues.

There’s several years worth of discussion in these issues:

Raising an error seems unfriendly. Can’t we fix it so `yield` just works?

We’ve tried. For example, the proposals in PEP 533 and PEP 568 were originally motivated by problems we encountered with yield inside nurseries and cancel scopes, respectively. But now that we understand the issues better, we think the only general solution is to raise an error if someone attempts to yield inside these blocks. There are some special cases where PEP 533 and PEP 568 could help, but it’s not worth adding them just for that. (At least as far as Trio/asyncio go – they might be useful for other cases, I’m not taking a position that here.)

Here’s the fundamental issue: yield suspends a call frame. It only makes sense to yield in a leaf frame – i.e., if your call stack goes like A -> B -> C, then you can suspend C, but you can’t suspend B while leaving C running.

But, nurseries are a kind of “concurrent call” primitive, where a single frame can have multiple child frames that run concurrently. This means that if we allow people to mix yield and nurseries, then we can end up in exactly this situation, where B gets suspended but C is actively running. This is nonsensical, and causes serious practical problems (e.g., if C raises an exception, we have no way to propagate it). And this is a fundamental incompatibility between generator control flow and structured concurrency control flow, not something we can fix by tweaking our APIs. The only solution seems to be to forbid yield inside a nursery block.

If you want more details on all the specific problems that arise, and how they relate to this proposal, and to PEP 533 and PEP 568, then see this comment

So how could this actually work?

The basic idea is: when we enter one of these context managers, we set a flag in the runtime state that says "yield is not allowed", and then when we exit the context manager we restore the flag to its original state.

So let’s say we add a new entry to PyInterpreterState:

typedef struct {
    PyObject* yield_forbidden;
} PyInterpreterState;

When yield_forbidden is NULL, yields are allowed. When it’s non-NULL, yields are forbidden, and the object holds an error message explaining why yield was forbidden. The sys module gains a way to set and restore tstate->yield_forbidden:

with sys.forbid_yield(error_message):
    ...

The yield and yield from statements are modified to check this attribute, and if it’s non-NULL, do raise RuntimeError(yield_forbidden).

But, there are a bunch of important subtleties.

First, this should only apply to yield and yield from statements; we never want to forbid await:

async def myfunc():
    async with open_nursery():
        await sleep(1)  # This is fine

Currently await and yield from use the same opcode. The simplest solution would be to make the check: if (yield is forbidden && the current frame is a generator, not a coroutine) { raise RuntimeError }.

Another complication: we need to handle nested generators. We don’t actually want to forbid every yield that happens inside our with block; we only want to forbid yields that temporarily exit the with block. It’s fine if the code inside the with block iterates over a generator that has some internal yield. For example:

def inner_frame():
    yield "hi"

def outer_frame():
    with forbid_yield(...):
        # There's a 'yield' in inner_frame, but that's OK
        for obj in inner_frame():  # no error
            print(obj)
        # This 'yield' temporarily exits the with block, so it's illegal
        yield  # error

Therefore, we add an additional rule: when entering a generator via __next__, send, throw, or close, and when entering an async generator via __anext__, asend, athrow, or aclose, we set tstate->yield_forbidden = NULL. On exit, we restore the previous value. That makes the example above work as expected, because the forbid_yield in outer_frame doesn’t affect inner_frame.

And finally, we want to allow people to define their own context managers using @contextmanager and @asynccontextmanager, that wrap a no-yields-allowed context manager. This is especially subtle, as can be seen from an example:

@asynccontextmanager
async def open_websocket(...):
    async with open_nursery():
        yield Websocket(...)  # This yield is OK

async def caller():
    async with open_websocket(...):
        yield  # This should be an ERROR

Syntactically, there’s a yield inside the @contextmanager function. But semantically, this is totally different from a generator, which can be suspended/resumed/abandoned at any time. In a @contextmanager function, the yield is essentially replaced by the contents of the block where our new context manager is used (and @contextmanager goes to great lengths to preserve this illusion). So here we want the forbid_yield inside open_nursery to take effect in caller, not in open_websocket.

While subtle, this turns out to be fairly straightforward. We add a boolean attribute that can be set on generator objects and async generator objects – something like gen_obj.__passthrough_yield_forbidden__. By default, it’s False. @contextmanager and @asynccontextmanager should set it to True. (And so would a few other closely-related use cases, like @pytest_trio.trio_fixture.)

When this attribute is True, then it does two things:

It disables the check on yield/yield from. A generator/async generator with this attribute set is always allowed to yield, regardless of what tstate->yield_forbidden says.
It disables the save/restore when entering/exiting an (async) generator. This means that changes to the state are allowed to “leak out”.

So in our example, (1) means that yield Websocket(...) is allowed, even though open_nursery has set tstate->yield_forbidden to a non-NULL value. And (2) means that this setting is will remain effect when we return to the body of caller, so the yield in caller will fail, as we wanted.

Summing up

So the final version of all the pieces would be roughly:

Generator __anext__ and friends:

if (!self->__passthrough_yield_forbidden__) {
    saved_state = tstate->yield_forbidden;
    tstate->yield_forbidden = NULL;
}
// ... actual functionality here ...
if (!self->__passthrough_yield_forbidden__) {
    tstate->yield_forbidden = saved_state;
}

The YIELD opcode and friends:

if (tstate->yield_forbidden
    && !gen_obj->__passthrough_yield_forbidden__
    && !gen_obj->is_coroutine) {
    raise RuntimeError
 }

Other use cases?

Since PEP 568 is currently not accepted, context managers that set some context-local state – for example, decimal.localcontext – have a surprising interaction with generators, where yielding inside the context manager can let the local context accidentally “leak out” to the calling code. As long as this is the case, it might make sense to use the mechanism here to forbid yield inside localcontext and similar cases? But, that’s not the major motivation, and we’re not currently proposing to change the decimal module.

dimaqq · April 2, 2019, 7:44am

Quick comment: it would be easier for others to understand the issue if the post made fewer references to cancel scopes and nurseries, as these are kind of novel concepts.

Quick question: is the proposal to forbid yield only in the current block / frame, or also in any function that could be called from current block?

njs · April 2, 2019, 8:16am

It’s only in the current block/frame. (Plus some trickiness to make @contextmanager do the right thing.) Iterating over a generator is always fine. The problems all come from yielding out of a context manager, so that you’re sort-of-inside-it-and-sort-of-not at the same time.

gpshead · April 3, 2019, 4:05am

As a more general alternative, wouldn’t enhancing the context manager protocol to introduce a new __(a)yield_context__ method (and equivalent C API tp_ function if needed) work? Any context manager wanting to prevent callers from yielding would implement that such that it raises an exception thus blocking the yield and in the usual uncaught exception case exiting the context manager via __(a)exit__ due to said exception as desired such that the developer sees the error from where it originated.

njs · April 3, 2019, 5:03am

Heh, that’s PEP 521

Technically, the tricky part would be making @contextmanager / @asynccontextmanager work. (This is also a major problem with PEP 521 actually, that I didn’t recognize at the time.) Example:

@contextmanager
def inner():
    with some_cm:
        # you DON'T want to call some_cm.__yield_context__ here:
        yield

def outer():
    with inner():
        # but you DO want to call some_cm.__yield_context__ here:
        yield

This seems pretty tricky – like somehow @contextmanager would have to disable the __yield_context__ call for yields inside it, but capture the stack of active context managers, and then define its own __yield_context__ that forwards to all the CMs in the captured stack?

The implementation would be more complex also, because instead of a single pointer in the thread state, you’d have to track a stack of active context managers and call their __yield_context__ methods in the right order. Speed-wise, I’m a bit nervous about the cost of doing an extra Python attribute lookup for .__yield_context__ on every with block – last time I checked, failed attribute lookups were really slow, b/c they had to allocate an AttributeError object. But that might be out of date, or I suppose we could add tp_ slots for the context manager protocol. (We don’t have them currently.)

Anyway, I’m not sure the complexity and generality are worth it, given that we have a pretty narrow use case. But if people like this approach better, and if we can solve the issue with @contextmanager, then I guess it’s fine with me.

pf_moore · April 3, 2019, 7:51am

Agreed. I thought I understood the proposal when I first read it as I have a little bit of familiarity with trio. But on re-reading I found that I was less clear than I thought on what’s going on.

@njs While the background on the trio use case is very useful, could you maybe add an example of how this would be used in practice, using “artificial” code? A bit like your inner/outer example from the message where you mentioned PEP 521 (don’t know how to link to individual messages in Discourse, sorry) but in the context of the actual proposal, and with a little more explanation of why inner and outer might need to allow or disallow yields.

Basically, I get from the original post that trio has a legitimate need for this feature, and how it might work at the implementation level. But I don’t get a sense of how to recognise other cases where the feature would be useful, or how to use it in those cases.

njs · April 3, 2019, 8:37am

I don’t know any other cases where this is useful either :-). Probably there are not many, since generators are turning 18 next month, and I haven’t heard anyone else complaining about this so far.

The “structured concurrency” argument is that just like for and while loops are the right primitives for expressing looping control flow, and if/elif/else is the right primitive for expressing alternating control flow, nurseries are the right primitive for expressing concurrent control flow. Adding a new control flow primitive is a pretty rare thing, and generator control flow is itself pretty unusual, so it’s not too surprising that there’s some funky interactions. (Heck, when generators were initially added, yield was always forbidden inside try/finally, and with is just sugar for try/finally. It took 4 years to figure out the details to make them work together at all…)

I do want to emphasize though that nurseries aren’t unique to trio – they’re a generic tool for structuring concurrent programs. Asyncio wants to add nurseries too. And if for some reason you want to avoid all this async stuff and use threads instead, then implementing a threads-based nursery is probably a good way to do that.

FYI, the little timestamp in the upper-right-corner of each post is a permalink to that post specifically. (Also as you scroll up and down, the URL in your URL bar automatically changes to match the post you’re looking at, though I find it more cute than useful.)

smurfix · April 4, 2019, 1:52pm

@dimaqq: To put this another way, another name for a nursery is “task group”. A task group groups tasks, obviously. The point of Structured Concurrency, as applied to Python, is that conceptually all tasks in a taskgroup/nursery run entirely within the async with open_nursery() block – thus, the taskgroup’s async with block is exited only when all tasks have ended, and if/when a task raises an exception, that exception travels up the call stack until caught, cancelling any sibling tasks it encounters along the way.

Obviously this does not work when code within the async with block has yielded control to its caller: the exception should be raised in the nursery’s frame, but that is not live. This situation breaks all the nice guarantees that make thread programming with Structured Concurrency a whole lot easier than with Futures/Deferreds/what-have-you, which is why we’d like a solution that makes the breakage obvious, instead of causing difficult-to-track bugs and/or intractable exceptions in your async runtime.

@asynccontextmanager manages to restore these guarantees: again conceptually, the code in the async with block that calls the context manager behaves as if the yield statement called it. Convincing Python to do that is not that easy – just look at the code that implements this wrapper – but it works, and it lets us write easy-to-understand code that would otherwise be rather messy.

The downside is that currently we have the same syntax (a single yield in a function) for code that ends up meaning two entirely different things depending on whether it’s wrapped with @asynccontextmanager or not, hence the technical part of this discussion.

Personally I’d love to introduce some improved syntax for this, much as we replaced @coroutine/yield from with async def/await – but this use case is not as pervasive (and, frankly, not as annoying to type) as yield from was, so …

asvetlov · April 5, 2019, 9:18am

@njs I feel your pain.
Forbidding yielding outside of async context manager is understandable wish.

Would you provide a case when yield is still desired (except @asynccontextmanager case)?
In other words: should the feature be opt-in or opt-out?

bdarnell · April 5, 2019, 2:42pm

Tornado’s StackContext was another one (although it was deprecated and has been removed from Tornado 6.0). I had a hacky assertion trying to detect and diagnose inappropriate uses of yield in a with StackContext block. I would have used this feature if it had been available, but since StackContext was an experiment that’s no longer useful with modern async primitives, I wouldn’t consider it a strong indicator of demand.

njs · April 5, 2019, 6:05pm

The complete list of cases that I’m currently aware of, where someone might want to use __passthrough_yield_forbidden__ = True:

contextlib.contextmanager
contextlib.asynccontextmanager
contextlib backport packages distributed by third parties (contextlib2, async_generator, etc.)
Async-lib-supporting versions of @pytest.fixture, like the one implemented in pytest-trio.
We’ve been experimenting with a decorator that wraps an async generator to let yield work inside cancel scopes/nurseries. Essentially we move the async generator to run in a background task (so you have to use async with wrapped_agen() as ait, which opens a nursery and starts the generator running in it), and then “replace” all the yields with cross-task channel sends (in the same sense that @contextmanager “replaces” yield with the body of the with block; it’s also safe for the same reasons @contextmanager is safe). More details.

So looking at the usual array of design choices we have to make:

We can’t make __passthrough_yield_forbidden__ a private API that only contextlib gets to use, because there are users outside the stdlib
All the cases where you want to enable __passthrough_yield_forbidden__ involve careful analysis and complicated machinery to make it work. So they won’t mind if they have to do something explicit to enable it.
The cases where you don’t want __passthrough_yield_forbidden__ are the ones that naive users write all the time by accident.

So that’s why in my proposal, __passthrough_yield_forbidden__ is a public API that defaults to False, and requires some explicit request to set it to True.

asvetlov · April 5, 2019, 6:47pm

I understand __passthrough_yield_forbidden__ need.

Also, as you mentioned, “The cases where you don’t want __passthrough_yield_forbidden__ are the ones that naive users write all the time by accident.”

My question is: should sys.forbid_yield() be optional?

The alternative is: make yield forbidding mandatory.
It raises RuntimeWarning for Python 3.8, in 3.9 the warning will be converted into RuntimeError
from __future__ import forbid_yield enforces the behavior for 3.8 as well.

Context manager writers can adopt __passthrough_yield_forbidden__ if needed.

Casual users will fix their code to avoid the warning to be ready for py3.9.

From my understanding sys.forbid_yield() requires adding this call to almost any async context manager.
Sure, you will apply it to trio code even before Python 3.8 release.
I can do the same for aiohttp.
But aiohttp users can have own context managers (actually we have them on my work).
Adding forbid_yield() everywhere is cumbersome.
Moreover, there are naive users which will not read this thread/PEP/changenotes.
RuntimeWarning / RuntimeError will help them to fix subtle errors in their code.

Did I miss something?
Is there the case when yield forbidding is not desirable in async function context at least?
I’m not sure about regular functions, honestly I don’t care too must about them.
Maybe explicit sys.forbid_yield() call is the only way to manage the problem for synchronous context manager, but I expect too many misusages from asyncio users and their existing code (we had this mistake on my job a month ago).

njs · April 5, 2019, 8:09pm

Oh, I see!

To make sure everyone’s on the same page, I think the summary is:

In my proposal, at a very high-level, the design is: you can mix context managers and yield freely by default. And then you can use special mechanisms to designate two things: context managers where you don’t want to allow yields (via sys.forbid_yield), and yield statements where you do want to allow yields (via __passthrough_yield_forbidden__).

Originally I thought @asvetlov was asking whether we should flip the default for the __passthrough_yield_forbidden__ switch, so that yields are only forbidden when both the context manager and the yield statement are explicitly marked. But he was actually asking about flipping the default for context managers, so that by default yield is always forbidden inside all context managers, except if the yield statement is explicitly marked.

No, that’s not my understanding at all! In my understanding, the only context managers that want to use forbid_yield are (1) cancel scopes, (2) nurseries, (3) context managers that are composed out of cancel scopes or nurseries.

Here’s some examples of context managers where I was not planning to use forbid_yield:

with socket_object:
async with stream_object:
with contextlib.suppress(...):
with log_entry_and_exit():

Way back in 2005, PEP 342 decided to allow yield inside try/finally/with in general – the idea being that the finally/__exit__ block will be executed when the generator is GC’ed, if not before.

This does cause some problems, because __del__ methods execute in a weird reentrant context. PEP 342 potentially turns every generator finally/__exit__ into a __del__ method, so people are getting opted into the weird reentrancy issues without realizing it. But this is a general issue that affects sync and async code equally.

There’s a wrinkle here about async generators: they could have created some unique problems for async code, because their cleanup needs to happen in async context. This is why PEP 525 added the asyncgen_hooks mechanism. The asyncgen_hooks stuff doesn’t solve all the problems with generator cleanup, but it does fix the stuff that was unique to async generators. As long as your async library implements the hooks correctly, async generator cleanup has most of the same weirdnesses as regular generator cleanup, but at least it isn’t any worse.

So given that background: IMO if we want to revisit PEP 342’s decision to allow yield inside try/finally and with, then we should do that in general, not just for async code. I think it’s really unlikely that we could reverse the decision entirely, since it would be a huge compatibility break.

But even if we don’t go that far, we could still potentially change things to make this stuff less tricky. For example, PEP 533 suggests making most generator cleanup happen at a predictable time in a predictable context, instead of relying on the GC to do it at some arbitrary time in an arbitrary context. Or, we could make __del__ methods less tricky in general. For example, Java runs all finalizers in a separate thread, to reduce weird reentrancy issues. We could potentially do something like this in Python. PEP 556 discusses this some (though I think the details are a bit different); I guess the gilectomy fork does this too. The asyncgen_hooks are basically the same idea, except using separate tasks instead of separate threads, and restricted to just async generators, not __del__ methods in general. I guess somewhere in here there might be some useful changes we could make :-).

Unless you think we can actually disallow yield entirely inside try/finally/with, I think these ideas are orthogonal to this proposal, and if you want to discuss them further we should split that out into a separate thread.

Can you give more details on which context managers in aiohttp, and in aiohttp users code, would want to use forbid_yield? Can you tell us about what happened in your job? Like I said above, I have no idea what cases would want to use forbid_yield besides cancel scopes and nurseries, so if you have more use cases we definitely need to hear about it

njs · April 5, 2019, 8:21pm

Oh duh, async_timeout.timeout_after would totally want to use forbid_yield, since it’s basically doing the same thing as Trio’s cancel scopes.

graingert · September 26, 2021, 7:20am

it might be worth taking a look at making this a pep as work starts on asyncio.TaskGroup

ncoghlan · January 2, 2022, 2:20pm

I missed this when it was first posted, but discovered it via @ambv’s November update at Weekly Report, November 22 - 28 - Łukasz Langa

The basic idea sounds solid to me, but I’d suggest a slight different colour for the bikeshed:

sys.forbid_yield is probably OK, but I think sys.raise_on_yield would be clearer (with “string means RuntimeError” as a special case, but also accepting any inherently raisable object). I would also have sys.raise_on_yield(None) serve the purpose of allowing yields again from code that wants to permit it. The CM would need to include the storage to restore the previous setting when leaving the with statement body.
I don’t like __passthrough_yield_forbidden__ as an attribute name, but agree that functionality will likely be needed for performance reasons (I think updating the calling wrappers to use with sys.raise_on_yield(None): ... around the delegated calls would technically be enough to do the right thing, but it would also be problematic performance wise, as well as being a much more intrusive code change). I’d suggest calling the (async) generator attribute something like __ensure_yield_allowed__.

njs · January 3, 2022, 6:42am

The reason I didn’t include support for arbitrary raisable objects in the initial proposal was that raising an exception object is a mutating operation (it rewrites __traceback__, __context__, etc.), and in principle you could have multiple attempted yields inside the same context. If it’s a string, then we generate a new RuntimeError on each yield, so this isn’t a problem.

We could let the user pass some kind of exception factory, but that seemed like overkill for this case, plus we could always add it in the future if for some reason it turns out to be important.

ncoghlan · January 3, 2022, 12:36pm

Oh, that’s an excellent point. So I withdraw that suggestion, but any eventual PEP should specifically mention that rationale.

barry · January 3, 2022, 4:36pm

I’m trying to use Discuss more, so landed with Nick’s recent reply. I’m sure I don’t understand all the nitty gritty details, and the discussion goes back a couple of years. The question about this particular take on the problem (which is not yet PEPified, right?) is whether there’s a backward compatibility question to answer. Are you aware of any legitimate, non-buggy uses of this out in the wild?

njs · February 22, 2022, 6:49am

So the core proposal is to add a new feature to the interpreter. So it doesn’t do anything by itself – it only makes it possible to define context managers with these new semantics. So there’s no backcompat issue for the core proposal.

Where potential backcompat issues come in is in transitioning existing context managers to take advantage of this feature.

For Trio, that’s fine – this would just make us fail faster and more reliably in cases where we currently get internal corruption and explode messily. Technically there’s some backcompat issue in cases where people are getting lucky and “getting away with it” right now, but that’s something we can work out with our users.

For asyncio, it’s a bit more complicated – as of 3.10, there are no context managers that would use this, so there’s no problem. But in the current main branch, asyncio has TaskGroups, which do want these new semantics, but don’t enforce them. So if 3.11 is released in the current state, we’ll be in tough position where users will start (a) hitting the problems that this is supposed to prevent and getting frustrated and confused, and (b) writing code that mostly works in practice even though it’s broken, so if we fix it to fail-fast then they will also be frustrated and confused and it will effectively be a backcompat break.

So now that the asyncio folks are moving forward on TaskGroups, it makes this proposal much more urgent. Getting this in before 3.11 would avoid a lot of headaches down the road.