Context manager protocol extension

Hello all. I want to start a discussion of a context manager protocol v2, building on ideas from the PEP 707 discussion.

I propose to extend the context manager protocol with two new dunder methods:
def __enter_ex__(self, depth: int) -> _T
def __leave__(self, enter_result: _T, exception: Exception | None, depth: int) -> Any

The protocol changes are:

  1. __leave__ has access to the result of the __enter[_ex]__ call.
  2. __leave__ takes a single exception parameter instead of the 3-tuple exc_info form.
  3. The interpreter rebinds the with statement target to the return value of __leave__.
  4. Both __enter_ex__ and __leave__ take a depth argument equal to the number of enclosing with statements already entered with the same EXPR.

The syntax remains the same

with EXPR [as VAR]:
    BLOCK

And the specification translates into the following pseudo-code:

def default_enter_ex(obj, depth):
    return obj.__enter__()

def default_leave(obj, enter_result, exception, depth):
    if exception is None:
        handled = obj.__exit__(None, None, None)
    else:
        handled = obj.__exit__(
            type(exception), exception, exception.__traceback__)

    if not (handled or exception is None):
        raise exception

    return enter_result

mgr = (EXPR)
leave = getattr(type(mgr), "__leave__", None)
if leave is None:
    # Check if old protocol is supported.
    type(mgr).__exit__
    leave = default_leave

enter_ex = getattr(type(mgr), "__enter_ex__", None)
if enter_ex is None:
    # Check if old protocol is supported.
    type(mgr).__enter__
    enter_ex = default_enter_ex

depth = sum(i is mgr for i in interpreter.with_stack)

enter_value = enter_ex(mgr, depth)
VAR = enter_value # Only if "as VAR" is present
try:
    try:
        interpreter.with_stack.append(mgr)
        BLOCK
    finally:
        interpreter.with_stack.pop()
except Exception as e:
    # __leave__ should reraise exception or swallow it.
    leave_value = leave(mgr, enter_value, e, depth)
else:  # Or non-local goto.
    leave_value = leave(mgr, enter_value, None, depth)
VAR = leave_value  # Only if "as VAR" is present
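To make those semantics concrete, here is a runnable sketch that simulates the translation in plain Python. Everything here is hypothetical: simulate_with stands in for the interpreter, and Stopwatch is a made-up manager that uses __leave__ to rebind the target after the block.

```python
class Stopwatch:
    def __enter_ex__(self, depth):
        self.ticks = 0          # pretend clock
        return self

    def __leave__(self, enter_result, exception, depth):
        if exception is not None:
            raise exception     # propagate: suppression must be explicit
        return enter_result.ticks + 42   # the value VAR gets rebound to

def simulate_with(mgr, block, depth=0):
    """Simulate `with mgr as VAR: block(VAR)` under the proposed protocol."""
    enter_value = type(mgr).__enter_ex__(mgr, depth)
    try:
        block(enter_value)
    except Exception as e:
        return type(mgr).__leave__(mgr, enter_value, e, depth)
    return type(mgr).__leave__(mgr, enter_value, None, depth)
```

After the simulated block, the "target" is the return value of __leave__ rather than the enter result, which is the rebinding feature in action.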

The old protocol continues to exist unchanged and is documented as a simpler version.

I wrote a pure-Python gist which demonstrates how the new protocol works.

cc: @iritkatriel, @storchaka, @ncoghlan.


[snipped - uninteresting editing suggestions which were applied].

Thanks for your suggestions. I have updated my post, leaving only the specification.
I think your message should remain, so that people know there is a substantial edit history they can read :slight_smile:

Could you also cc people who you think might be interested in this discussion? I don’t feel brave enough to ping core devs myself.

Cool. You can leave a short comment on the other thread referring to this one.

I don’t like that you’d have to explicitly reraise the exception, if the return value of __exit__ is repurposed in the way that’s proposed. The way it is now, if the __exit__ method is only cleaning up resources, and not handling exceptions, then the arguments can be ignored and nothing needs to be explicitly returned. That leaves any exception to be handled elsewhere or produce a traceback. I think that’s generally what you’d expect to happen.

I wouldn’t want to see context managers being made more complicated to write for the benefit of this feature, which probably won’t be used as much as __exit__ methods that only do cleanup. And I especially wouldn’t want exceptions suppressed simply by neglecting to deal with them. If the exception is to be suppressed, it should be suppressed explicitly.

I also have a question about the “depth” argument. Is this meant as a kind of reference counting, so you’d only do cleanup when the last with statement having a particular context manager terminates? I think it looks like a useful feature, but I’m wondering if it will add a lot of overhead to the with statement in cases where the feature isn’t used. I’m also wondering if passing the value as an argument is essential. Could you add a new function to the standard library, name it something like sys.get_context_manager_depth, to get that information? That way you don’t need to know about this feature if you don’t need it, and just want to write a simple cleanup function.

I don’t mean to object to improvements being made to the context manager protocol at all! I just want to say with my critique—please try to keep things simple for the common uses.


Is there a good reason to not simply use a yield like the contextmanager decorator expects?

class MyObject:
    def __with__(self):
        try:
            yield self.value
        except ...:
            # error handling if you want it
        finally:
            # cleanup if you want it

with MyObject() as x:
    pass
# -->
mgr = MyObject()
cm_ = None
try:
    with_ = type(mgr).__with__
except AttributeError:
    try:
        enter_ = type(mgr).__enter__
    except AttributeError:
        raise TypeError("not a context manager") from None
    exit_ = type(mgr).__exit__
    enter_value = enter_(mgr)
else:
    cm_ = with_(mgr)
    enter_value = next(cm_)

try:
    BLOCK
except BaseException as e:
    if cm_ is not None:
        try:
            cm_.throw(e)
        except StopIteration:
            pass  # the generator handled the exception
    else:
        if not exit_(mgr, type(e), e, e.__traceback__):
            raise
else:
    if cm_ is not None:
        try:
            cm_.send(None)
        except StopIteration:
            pass  # the generator finished normally
    else:
        exit_(mgr, None, None, None)

I guess the good reason is that it’s harder to implement in native code… but should that be a big consideration? Maybe we can create a helper function to take two enter/exit MethodDefs and return a suitable __with__ object?


Thanks for your input.

Yes, the fact that the exception must be raised explicitly is the main drawback of this proposal.
The thing is, if we want to add the "rebinding after __leave__" feature, here are the alternatives I was considering:

  1. The return value is the rebind value; if you don’t want to suppress the exception, raise it.
  2. Like 1, but a magic sentinel value indicates the suppress flag (because True, False, or the raised exception could all be valid rebind values).
  3. Invoke both __leave__ and __exit__: the first for rebinding, the second for suppress detection.
  4. __leave__ returns a 2-tuple, where the first element is the suppress flag and the second is the rebind value.
  5. Like 4, but if the return value is not a tuple, there is no rebinding and the return value is the suppress flag.

I went with 1 because a magic value would be easy to confuse with a plain bool, 3 is no less wordy, and 4 and 5 are too cumbersome.


I looked at the first 10 context managers in the code I work with, and they fall into 2 main categories:

class FirstKind:
    def __exit__(self, *i_actually_dont_care):
        # Maybe cleanup code
        return False

class SecondKind:
    def __exit__(self, exc_type, exc, exc_tb):
        # Maybe cleanup code
        if isinstance(exc, type_i_want_to_catch):
            return True
        else:
            return False

With this proposal, the first kind can almost always stay as it is (remember, the old protocol is still supported).
And to update the second kind, I need to replace each return True with return enter_result and each return False with raise exception. But many of them could still stay on __exit__. Right now I know of only 2 context managers in my codebase that I would update to __leave__ to take advantage of the rebinding feature. And a few extra characters isn’t much of a price to pay.

For new context managers, if this is implemented, I will continue to use __exit__(self, *_) for the first kind (in case I do not need reentrancy), but will use __leave__ for the second kind.
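Concretely, a migrated second kind might look like this (a sketch of the proposed, not-yet-existing __leave__; ValueError stands in for type_i_want_to_catch):

```python
class SecondKindNew:
    def __leave__(self, enter_result, exception, depth):
        # Maybe cleanup code
        if isinstance(exception, ValueError):   # was: return True
            return enter_result                 # suppress, keep the binding
        if exception is not None:               # was: return False
            raise exception                     # propagate
        return enter_result                     # normal exit, keep the binding
```

The mechanical rule is exactly the one described above: `return True` becomes `return enter_result`, and `return False` becomes `raise exception`.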


Yes, it is a counter of how many times the same context manager has already been entered higher up the stack. I would say it is the same as __reduce__ vs __reduce_ex__: we have 2 methods instead of a sys.get_current_reduce_value. If you don’t need this feature, stay with __enter__ and ignore depth in __leave__ (you most likely already ignore the 3rd argument of __exit__).

Hm. I had not considered such an approach at all. At first glance it looks promising. I will think about its drawbacks, but I already see that there will be shenanigans to get the return value out of the iterator.

I really don’t like the idea that a context manager by default will swallow exceptions that it doesn’t explicitly re-raise. That seems like a very likely cause of subtle, hard-to-diagnose bugs. Suppressing exceptions should be the exception, rather than the rule[1] and as such, should be explicitly requested by the user.

I know that many context managers will use a higher level wrapper like contextlib.contextmanager, but IMO that’s not a reason for making the lower level mechanism unsafe.


  1. Sorry, I couldn’t resist! ↩︎


Why? There’s no need for a return value here - exceptions just get sent back into the generator and handled normally. The pattern already exists, and we can simplify the existing contextmanager decorator to:

class contextmanager:
    def __init__(self, target_function):
        self._func = target_function

    def __with__(self):
        return self._func()

(Besides, if you want a return value from a generator, you simply return it and it gets passed as the argument of the StopIteration exception. This was added around 3.3/3.4 with yield from.)
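For reference, this is how a generator's return value surfaces through StopIteration:

```python
def gen():
    yield 1
    return "final"   # becomes StopIteration.value

g = gen()
first = next(g)      # runs up to the yield
try:
    g.send(None)     # resume past the yield; the return stops the generator
except StopIteration as stop:
    result = stop.value
```

So a __with__ driver could pick the rebinding value out of `stop.value` if that design were chosen.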

The shenanigans will be around handling GeneratorExit from within the block, since we want it to propagate out but it won’t naturally propagate through a context manager that is itself a generator (it should get converted into a RuntimeError, unless this particular case was already considered, but I don’t remember it coming up at the time). But that will be possible to handle, I just didn’t include it in my example above.

A feature that the current context manager protocol lacks, and that some people have asked for, is the ability for the context manager to skip execution of the block. Maybe by using a special exception akin to StopIteration the new protocol could support this?


After remembering how generator’s send/throw work and playing around with implementation options I really like this approach. I wrote a pure-Python gist which simulates the functioning of this approach.

You were right, getting the rebinding value from StopIteration was really quite easy, but I ended up not liking the "the generator’s return value is the rebinding value" approach, because just forgetting to add return enter_result at the end of the function would rebind the with target to None.
For example, this naive empty cm.__with__

def __with__(self):
    yield self

will rebind the target to None; the correct way would be

def __with__(self):
    yield self
    return self

So, the new approach as a whole goes like this:
The context manager protocol is extended with a new dunder method def __with__(self, depth: int).
The depth argument is equal to the number of with statements up the frame stack that have already been entered with self as the statement expression.

The method must return a generator-iterator when called. This iterator must yield one or two values. The first value yielded will be bound to the targets in the with statement’s as clause, if any. At the point where the generator yields, the block nested in the with statement is executed. The generator is then resumed after the block is exited. If an unhandled exception occurs in the block, it is raised inside the generator at the point where the yield occurred. If the generator catches that exception and does not reraise it, the exception is suppressed; otherwise it propagates. If no exception occurred, or it has been suppressed, the generator can yield a second value, which again will be bound to the targets in the with statement’s as clause (if any). After that, the generator is closed.


That way all parts of the protocol are explicit. I.e. if you have not explicitly caught it, the exception is reraised. And if you have not explicitly yielded a second value, there is no rebinding. The enter result is available on exit because it lives in the same function’s local scope.
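A pure-Python driver for that specification might look like this (a sketch; drive_with stands in for the interpreter, Demo is a made-up manager, and depth handling is omitted for brevity):

```python
class Demo:
    def __with__(self):
        try:
            yield "inside"       # first yield: initial `as` binding
        except KeyError:
            yield "suppressed"   # caught and not reraised: suppressed, rebound
        else:
            yield "done"         # optional second yield: rebinding on success

def drive_with(mgr, block):
    """Simulate `with mgr as VAR: block(VAR)` under the sketched protocol."""
    gen = type(mgr).__with__(mgr)
    var = next(gen)                  # run up to the first yield
    try:
        block(var)
    except BaseException as e:
        try:
            var = gen.throw(e)       # raise at the yield point
        except StopIteration:
            return var               # suppressed, no second yield
        # a reraise inside the generator propagates out of gen.throw()
    else:
        try:
            var = next(gen)          # look for a second yield
        except StopIteration:
            return var               # no rebinding
    gen.close()
    return var
```

An uncaught exception in the generator propagates out of drive_with unchanged, which gives the "explicit suppression" behaviour described above.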

After more consideration I agree. I think if we decide to add the rebinding feature, it is better to go with "__leave__ should return a 2-tuple, where the first element is the suppress flag and the second is the rebind value; otherwise it should return a bool meaning only the suppress flag". That way, someone updating an __exit__ can just switch it to the new signature (keeping the same name for the exception) and the function behaves as before. However, with that approach I don’t like the rebinding feature as much; there are workarounds.


I am -1 on that. I don’t ever want to have to protect myself with

executed = False
with cm:
    executed = True
    ...

if not executed:
    panic

The way I think it should work is

class Manager:
    def can_with(self):
        return ...

    def __enter__(self):
        # In case you as a manager developer want to protect against unsafe execution
        if not self.can_with():
            raise Exception("Check manager.can_with() before entering")
       ...

manager = Manager()
if manager.can_with():
     with manager:
        ...

I see no need for such a feature outside of dirty/hacky one-day solutions.
But with the __with__ approach I can see how it could be implemented: just return before the first yield. I will reconsider this if there is a real use case and demand.


This more radical redesign is the kind of thing I was imagining when I heard about “redesigning the context manager”. Seeing your proposal, it looks like it addresses almost all of the points raised very neatly. It would be a little more work if one wants to switch existing code to the new method, but I like this way of doing it.

One thing: in this discussion (about adding a __leave__ method, which PEP 707 linked to), it talks about supporting the case where a base class switches to __leave__ while a derived class still uses __exit__. Would something similar need to be supported with this proposal, where a base class changes to __with__ and the derived class still uses __enter__/__exit__? Or do you just say the whole class hierarchy must use one or the other?


If we look at the original goals from PEP 707, they are:

  1. Better performance in the interpreter. If I understand right, the internal implementation has switched from the 3-value form of exceptions to the 1-value form, and there’s extra overhead wherever the 3-value form is still used. If we keep using __exit__, you don’t get the performance benefit of the 1-value exceptions, and there’s likely extra overhead from the interpreter needing to support both forms.

    The “depth” parameter would also add overhead to context managers just to keep track of that information. That cost applies whether or not the feature is used, and whether you use __enter__/__exit__ or __enter_ex__/__leave__. I don’t know if this would be significant, or if the savings from using the 1-value exception balance it out.

  2. Simplifying the language. If I understand right, everything that uses the 3-value exceptions should get an alternative with 1-value exceptions, and that’s been achieved, except for __exit__. The fact that the traceback parameter is rarely used, and can be gotten through other means, is one of the reasons for removing it. Adding a “depth” parameter makes it more complicated again.


When we’re writing a __with__ method, what is the advantage of having the depth parameter passed in, versus doing something like this:

def __with__(self):
    self._depth += 1
    depth = self._depth
    try:
        ... # do setup
        try:
            yield self
        finally:
            ... # do cleanup
    finally:
        self._depth -= 1

It seems easy enough to keep track of the information that way, if you need it. If you don’t, then you leave it out and there’s no performance overhead. Is there a situation where this doesn’t work?


The yield is where the block of code executes, so the yielded value (the first self) is what gets bound. Anything returned at the end is probably going to be ignored. If you want to suppress exceptions, wrap the yield in try/except.

I suggest referencing contextlib.contextmanager, since we already have an implementation of this. All we’d be doing is promoting it to “native” - there shouldn’t be any need to change any other semantics.

Oof, this could be a challenge, yeah. We might have to invert the logic to prefer __enter__/__exit__ if they’re defined (and not-None, so that a subclass can hide them if it wants) for a deprecation period (with a warning), and eventually switch to preferring __with__ and ignoring the old logic if it’s there.[1]

Generally you wouldn’t override __enter__/__exit__ anyway (if it’s designed well, you’d give people specific methods to override), so I imagine most types will be able to get the benefits immediately, even if we keep preferring the old way for now.


  1. A more complex approach could check the MRO and use the most derived implementation’s API. No idea whether that’s worth it though. ↩︎

Look back at this post and this post from the original discussion. Both suggest adding some way to return a value from the context manager when the with block is finished, not just when it starts. That’s where the thing with return values is coming from.

Okay, so the proposal is to enable something like this:

with timeit() as duration:
    code_under_test()

assert isinstance(duration, float)

Which is not possible today, because the as name is intended for use within the block, and the final value cannot be known until after the block has executed.

I’d worry about changing the user-visible semantics of with, as opposed to only the implementer’s semantics, though it doesn’t actually seem terrible to simply rebind the name again at the end. And if we go ahead with a generator-based __with__ approach then it’s certainly possible to handle a returned value.

Worth writing it up, at least, but I’d hold onto it loosely. It’s the kind of change that will get a proposal rejected while everything else is uncontroversial. It’s also likely that we’ll find some reason it’s not a good idea in the process of specifying the behaviour.
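Under a generator-based __with__, the timeit example could be sketched like this. Since no interpreter support exists, the generator is driven by hand here, the way the interpreter would drive it:

```python
from time import perf_counter

class timeit:
    def __with__(self):
        start = perf_counter()
        yield self                     # what `duration` is bound to in the block
        yield perf_counter() - start   # what it is rebound to afterwards

# Hand-driven equivalent of `with timeit() as duration: code_under_test()`:
gen = timeit().__with__()
duration = next(gen)          # enter: the block would run after this
duration = gen.send(None)     # leave: the second yield carries the rebind value
gen.close()
```

The second yield is exactly the user-visible rebinding being debated above.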

Here is a concrete example of where these changes would be used. This is not a fictional example; it may well occur in the wild, although perhaps real cases will only use some of the bits we are talking about.
So, it is a reentrant context manager which also wants to leave behind something representing the result of the interaction with the connection.

# What we have to do now.
class ConnectTo:
    def __init__(self, where):
        self.where = where
        self._depth = 0
        self._connection = None
        self._connection_result = None

    @property
    def last_connection_result(self):
        return self._connection_result

    def __enter__(self):
        if self._depth == 0:
            self._connection = get_connection(self.where)

        self._depth += 1
        return self._connection

    def __exit__(self, exc_type, exc, exc_tb):
        if isinstance(exc, ConnectionError):
            self._connection.do_fancy_stuff(exc)
            result = True
        else:
            result = False

        if exc is None:
            self._connection.commit()
            self._connection_result = self._connection.get_state()

        self._depth -= 1
        if self._depth == 0:
            self._connection.disconnect()
            self._connection = None

        return result

with (cm := ConnectTo('db')) as connection:
    ...
result = cm.last_connection_result

# What we can do with __enter_ex__ and __leave__.
class ConnectToEx:
    def __init__(self, where):
        self.where = where
        self._connection = None

    def __enter_ex__(self, depth):
        if depth == 0:
            self._connection = get_connection(self.where)

        return (self._connection, None)

    def __leave__(self, enter_result, exc, depth):
        connection, _ = enter_result  # __leave__ receives the __enter_ex__ result
        result = None
        if isinstance(exc, ConnectionError):
            connection.do_fancy_stuff(exc)

        elif exc is None:
            connection.commit()
            result = connection.get_state()

        if depth == 0:
            connection.disconnect()
            self._connection = None

        if exc is not None and not isinstance(exc, ConnectionError):
            raise exc

        return (None, result)

with ConnectToEx('db') as (connection, result):
    ...
result

# What we can do with __with__.
class ConnectToWith:
    def __init__(self, where):
        self.where = where
        self._connection = None

    def __with__(self, depth):
        if depth == 0:
            self._connection = get_connection(self.where)

        try:
            yield (self._connection, None)
        except ConnectionError as e:
            self._connection.do_fancy_stuff(e)
        else:
            self._connection.commit()
            yield (None, self._connection.get_state())

        finally:
            if depth == 0:
                self._connection.disconnect()
                self._connection = None

with ConnectToWith('db') as (connection, result):
    ...
result

As can be seen from this example, everything discussed here can be achieved now, but requires some boilerplate.

The main advantage is that you don’t have to maintain this yourself. It would be quite cheap for the interpreter to do (interpreter.with_stack.append(id(manager_object)) and interpreter.with_stack.count(id(manager_object))). I’ve faced bugs related to reentrancy several times, and it’s always incorrect handling of the depth counter (e.g. not decrementing it in one of the branches). If the counter lives inside the interpreter, these bugs should disappear.
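That interpreter-side bookkeeping can be sketched in a few lines (with_stack stands in for hypothetical per-thread interpreter state; nothing like it exists today):

```python
with_stack = []   # maintained by the with statement itself in this proposal

def current_depth(mgr):
    """How many enclosing with statements have already entered mgr."""
    return with_stack.count(id(mgr))

class Conn:
    pass

c = Conn()
d_outer = current_depth(c)     # what the outermost entry would see
with_stack.append(id(c))       # pushed on entry
d_inner = current_depth(c)     # what a nested entry would see
with_stack.pop()               # popped on exit, even on exception
```

Because the push/pop pairing sits in one place instead of in every manager, the "forgot to decrement in one branch" class of bugs goes away.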


I don’t think that’s a problem if we provide this kind of default __with__. If the __enter__/__exit__ code uses super(), then it will be an error; and if it doesn’t, as I understand it, the derived class doesn’t want to use the parent’s protocol anyway.

def default___with__(obj, depth):
    # Exception in __enter__ should be propagated.
    enter_result = obj.__enter__()
    try:
        yield enter_result
    except BaseException as e:
        if not obj.__exit__(type(e), e, e.__traceback__):
            raise
    else:
        obj.__exit__(None, None, None)

But as you correctly said, it’s pretty rare that a manager’s creator expects a derived class to overload the magic methods; they will provide additional methods instead.
So the main ‘drawback’ of the __with__ approach is that super().__with__ is pretty useless, and I expect people will add such methods anyway:

class Parent:
    def enter_context(self):
        ...
    def exit_context(self, exc):
        # default handlers

    def __with__(self):
        try:
            yield self.enter_context()
        except Exception as e:
            self.exit_context(e)

class Child(Parent):
    def __with__(self):
        try:
            yield self.enter_context()
        except MyException:
            ...
        except Exception as e:    # this is essentially super().__with__
            self.exit_context(e)

For me, both the depth and rebind ideas are not very important; there are workarounds to get them. It is quite possible that we will go without them, and they can become a potential __with_ex__ if needed later on.

I haven’t worked this through based on your example code, but I can imagine simpler ways to manage the boilerplate than changing the context manager protocol, such as:

  1. Make a ReentrantContextManager that keeps track of depth but can be subclassed like:
class ConnectToSubclass(ReentrantContextManager):
    # __enter__ and __exit__ supplied by superclass
    def __enter_ex__(...):
        ...
    def __leave__(...):
        ...
  2. Make a decorator like @contextmanager_reentrant that wraps something like the __with__ function or perhaps the class:
@contextmanager_reentrant
class ConnectToDecorator:
    # __enter__ and __exit__ supplied by decorator
    def __init__(...):
        ...
    def __with__(...):
        ...

Maybe there’s also a nice way to do this with generator functions like the existing @contextmanager decorator.

Either way the result could be something that achieves the suggested behaviour but wrapped up to work with the existing context manager protocol rather than requiring a new protocol.
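The first option might be sketched like this on top of the existing protocol. The hook names on_first_enter/on_last_exit are made up for illustration:

```python
class ReentrantContextManager:
    """Tracks nesting depth so subclasses only implement the hooks."""

    def __init__(self):
        self._depth = 0

    def on_first_enter(self):
        pass  # runs only on the outermost entry

    def on_last_exit(self):
        pass  # runs only on the outermost exit

    def __enter__(self):
        if self._depth == 0:
            self.on_first_enter()
        self._depth += 1
        return self

    def __exit__(self, exc_type, exc, exc_tb):
        self._depth -= 1
        if self._depth == 0:
            self.on_last_exit()
        return False  # never suppress by default


class DemoConn(ReentrantContextManager):
    def __init__(self):
        super().__init__()
        self.events = []

    def on_first_enter(self):
        self.events.append("open")

    def on_last_exit(self):
        self.events.append("close")

conn = DemoConn()
with conn:
    with conn:  # nested re-entry: no second "open"
        pass
```

Subclasses never touch the counter, which addresses the depth-handling bugs without any protocol change.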

The contrast in the given examples between “what we have to do now” and “what we can do with …” does not suggest to me that this alternative version of the context manager protocol would be easy to understand. The fact that some things are handled implicitly by the interpreter reduces the amount of explicit code but there is instead a cognitive burden to understand how the implicit behaviour of the (nested) with statements interact with the code that is visible. This is especially jarring for the implicit rebinding that takes place invisibly after the with statement. Usually a name is visible at the place in the code where it is rebound even if the binding is implicit like when using a decorator.

Both examples need if depth == 0 in at least two places, which hints that it might be cleaner to have separate method(s) called for the depth == 0 case. Also, apparently the only relevant property of depth is whether it is equal to zero, so if there were separate methods for that, maybe the depth argument would not be needed at all.


Reentrancy is pretty easy with a yield-based context manager:

def reentrant(obj):
    try:
        obj.open()
    except AlreadyOpenError:
        yield obj
    else:
        try:
            yield obj
        finally:
            obj.close()

There are a few important variants on this, which is why we don’t want to bake it into the protocol. Users of it can design the handling they need easily enough, whether it’s using instance state or local state.
