Context manager protocol extension

NeilGirdhar · April 20, 2023, 2:29pm

My view is that __enter__ and __exit__ are examples of methods that have an “augmenting pattern”; they should always call super because you don’t know what your superclass is, and you don’t know if it will have some important behavior.

I realize that we’re just hashing things out, but in the documentation, I think the examples should always show delegation to super as a good habit to get into. Here’s an example of unconditionally calling super using the AbstractContextManager as a stub provider:

from contextlib import AbstractContextManager
from time import perf_counter_ns
from types import TracebackType
from typing_extensions import Self, override

class Timer(AbstractContextManager['Timer']):
    @override
    def __init__(self) -> None:
        super().__init__()
        self.start = 0
        self.end = 0

    @override
    def __enter__(self) -> Self:
        super().__enter__()
        self.start = perf_counter_ns()
        return self

    @override
    def __exit__(self,
                 exc_type: None | type[BaseException],
                 exc_val: None | BaseException,
                 exc_tb: None | TracebackType,
                 /) -> None:
        super().__exit__(exc_type, exc_val, exc_tb)
        self.end = perf_counter_ns()

I think you’re absolutely right. As far as I can see, there’s no good way to implement inheritance with a __with__ as we see it. The tempting thing to do is to yield from the parent class’s __with__, but the problem with that is that we’re stuck yielding whatever the parent wanted to yield. And we can’t alter that in the local scope or else the parent won’t receive exceptions. Am I missing something?

If this is the case, then I agree with Andrej that this definition of __with__ is essentially broken with respect to inheritance. Adding auxiliary methods is not a reasonable compromise since parent classes (which are unknown) may not know about your auxiliary methods.

Maybe map could be altered to make this work?

    @override
    def __with__(self) -> Generator[Self]:
        self.start = perf_counter_ns()
        try:
            yield from map(lambda _: self, super().__with__())
        finally:
            self.end = perf_counter_ns()

Even so, this doesn’t seem simpler than the original code with __enter__ and __exit__.
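To make the exception-forwarding concern concrete, here is a small sketch using plain generators (no __with__ involved; the names parent, child_direct, child_mapped, and drive are made up for illustration). It suggests that yield from forwards a thrown exception to the inner generator, while wrapping that generator in map does not, because a map object has no throw method.

```python
events = []


def parent():
    # Stands in for a parent class's generator-based manager.
    try:
        yield "parent value"
    except ValueError:
        events.append("parent saw ValueError")
        raise


def child_direct():
    # Plain delegation: yields whatever the parent yields.
    yield from parent()


def child_mapped():
    # Replaces the yielded value, but map() hides the parent's throw().
    yield from map(lambda _: "child value", parent())


def drive(gen_func):
    """Advance to the first yield, throw ValueError in, report what happened."""
    events.clear()
    gen = gen_func()
    value = next(gen)
    try:
        gen.throw(ValueError("boom"))
    except ValueError:
        pass
    return value, list(events)
```

With direct delegation the parent's except clause runs; with the map wrapper the exception is raised in the child at the yield from, and the parent never handles it.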

steve.dower · April 20, 2023, 3:06pm

That’s not how inheritance (as a design pattern) works. This is encapsulation using subclassing. And Python doesn’t really do either that well - composition is preferred.

For inheritance, the base class provides the entry point to the common functionality and the subclass provides the implementation. In this case, that means the base class must be a context manager. Otherwise, you’re wrapping a context manager around a class that isn’t one, and so you can do whatever you like.

In essence, a base class that doesn’t provide subclassable behaviour isn’t really a base class.

I guess if you really wanted to subclass something that doesn’t provide proper extension points and override its behaviour while keeping the underlying functionality, you could do it this way:

class SubClass:
    def __with__(self):
        with super() as s:
            # do something else
            yield s

I’m pretty sure that would require special handling of super() objects by the with bytecodes, but that’s doable. It does, however, need to be specified.

NeilGirdhar · April 20, 2023, 3:16pm

Of course you’re right that composition is preferred (“composition over inheritance”). But you can’t always compose things. Sometimes you need inheritance because of polymorphism requirements. Personally, I think Python’s inheritance pattern is just fine.

Yes, and in the example I gave, the base class is AbstractContextManager, which is a context manager? Am I missing your point?

That’s a fascinating solution! It also gets rid of the try/finally, right? If this really can be done, then __with__ looks much more attractive!

steve.dower · April 20, 2023, 3:23pm

Right now it can’t, but presumably this just means we need to implement it:

Python 3.11.3 (tags/v3.11.3:f3909b8, Apr  4 2023, 23:49:59) [MSC v.1934 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> class Base:
...   def __enter__(self):
...     print("Base.__enter__")
...     return 'base'
...   def __exit__(self, *a):
...     print("Base.__exit__")
...
>>> class Sub(Base):
...   def __enter__(self):
...     print("Sub.__enter__")
...     return 'sub'
...   def __exit__(self, *a):
...     print("Sub.__exit__")
...   def f(self):
...     with super() as s:
...       print("Got", s)
...
>>> Sub().f()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 8, in f
TypeError: 'super' object does not support the context manager protocol

FWIW, my expected/hopeful output here would be:

>>> Sub().f()
Base.__enter__
Got base
Base.__exit__
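Until something like that is specified, one possible workaround is a tiny adapter that holds the bound parent methods. The with statement looks up __enter__ and __exit__ on the type of the object, and the super type does not define them, which is why the direct form fails. The sketch below assumes a hypothetical helper called SuperContext (not part of the standard library) and records events in a list instead of printing.

```python
events = []


class Base:
    def __enter__(self):
        events.append("Base.__enter__")
        return "base"

    def __exit__(self, *exc_info):
        events.append("Base.__exit__")
        return False


class SuperContext:
    """Hypothetical adapter around already-bound __enter__/__exit__ methods."""

    def __init__(self, enter, exit_):
        self._enter = enter
        self._exit = exit_

    def __enter__(self):
        return self._enter()

    def __exit__(self, exc_type, exc_val, exc_tb):
        return self._exit(exc_type, exc_val, exc_tb)


class Sub(Base):
    def __enter__(self):
        events.append("Sub.__enter__")
        return "sub"

    def __exit__(self, *exc_info):
        events.append("Sub.__exit__")
        return False

    def f(self):
        parent = super()
        # Only Base's methods are used here, not Sub's overrides.
        with SuperContext(parent.__enter__, parent.__exit__) as s:
            events.append(f"Got {s}")


Sub().f()
```

After the call, events holds the Base enter, the "Got base" line, and the Base exit, which matches the hoped-for output.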

Andy_kl · April 20, 2023, 3:24pm

One thing we should all keep in mind while discussing this is that there is also the async world and its async with, which should stay in line with the synchronous world. The main point is that we cannot suggest any pattern that uses yield from, because that is a syntax error in an asynchronous generator function.
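The restriction can be checked without running any coroutine by compiling small source strings. This is only a sketch; the helper name compiles and the inner/outer function names are illustrative.

```python
def compiles(source):
    """Return True if the source compiles, False on SyntaxError."""
    try:
        compile(source, "<sketch>", "exec")
    except SyntaxError:
        return False
    return True


# Delegation in a regular generator: allowed.
SYNC_YIELD_FROM = """
def outer():
    yield from inner()
"""

# The same delegation inside an async function: rejected by the compiler.
ASYNC_YIELD_FROM = """
async def outer():
    yield from inner()
"""

# The explicit loop that async generators have to use instead.
ASYNC_LOOP = """
async def outer():
    async for item in inner():
        yield item
"""
```

Note that the explicit async for loop only re-yields values; unlike yield from, it does not forward thrown exceptions to the inner generator.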

Andy_kl · April 20, 2023, 5:12pm

Yes, I’m aware of the ways to implement reentrancy. It was never a question of “the language does not allow it”, but rather a question of “which way is best for that concrete context manager”. And pretty often contextlib and/or some pattern satisfies the need. I wrote about “depth” to put the idea out there, but I don’t hold on to it in any way, so I’ll drop it. If someone has an example where it significantly simplifies the code, they can advocate for it.

But I will still advocate for the “rebind on block exit” idea.
First, we already have something similar:

try:
    1 / 0
except Exception as e:
    pass
e  # NameError at module level (UnboundLocalError inside a function): 'e' is unbound once the except block ends

So, “rebinding the name at exit of some syntax block” is not completely new for the language.
And because the __with__ approach needs an explicit statement for it (a second yield), I think it does not add that much cognitive burden.
After all, I expect people will use it mostly as:

with cm() as result:
    assert result is None
assert result is not None

with cm() as (enter, leave):
    assert enter is not None and leave is None
assert leave is not None

Am I understanding correctly that, in the matter of inheritance, our final goal with the __with__ approach is:


class Base:
    def __with__(self):
        print("Base.__with__ enter")
        try:
            yield self
        except Exception as e:
            print("Base.__with__ exception")
            raise e
        else:
            print("Base.__with__ exit")
        finally:
            print("Base.__with__ finally")


class Derived(Base):
    def __with__(self):
        try:
            with super() as enter_result:
                print("Derived.__with__ enter")
                yield self
        except Exception as e:
            print("Derived.__with__ exception")
            raise e
        else:
            print("Derived.__with__ exit")
        finally:
            print("Derived.__with__ finally")


def exception_case():
    with Derived() as d:
        print("exception_case body")
        raise Exception

    # prints:
    # Base.__with__ enter
    # Derived.__with__ enter
    # exception_case body
    # Base.__with__ exception
    # Base.__with__ finally
    # Derived.__with__ exception
    # Derived.__with__ finally

def no_exception_case():
    with Derived() as d:
        print("no_exception_case body")

    # prints:
    # Base.__with__ enter
    # Derived.__with__ enter
    # no_exception_case body
    # Base.__with__ exit
    # Base.__with__ finally
    # Derived.__with__ exit
    # Derived.__with__ finally
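Since __with__ and with super() do not exist yet, the ordering can only be approximated today. The sketch below emulates the two classes with contextlib.contextmanager, using a hypothetical method name managed in place of __with__ and super().managed() in place of with super(). It logs to a list so the sequence of events can be compared against the listings above; the real proposal might of course behave differently.

```python
from contextlib import contextmanager

log = []


class Base:
    @contextmanager
    def managed(self):
        log.append("Base enter")
        try:
            yield self
        except Exception:
            log.append("Base exception")
            raise
        else:
            log.append("Base exit")
        finally:
            log.append("Base finally")


class Derived(Base):
    @contextmanager
    def managed(self):
        try:
            # Stand-in for the proposed "with super() as enter_result".
            with super().managed():
                log.append("Derived enter")
                yield self
        except Exception:
            log.append("Derived exception")
            raise
        else:
            log.append("Derived exit")
        finally:
            log.append("Derived finally")


def run(raise_in_body):
    """Run one with block and return the recorded event order."""
    log.clear()
    try:
        with Derived().managed():
            log.append("body")
            if raise_in_body:
                raise ValueError("boom")
    except ValueError:
        pass
    return list(log)
```

Calling run(False) and run(True) gives the event order for the two cases under this emulation.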

NeilGirdhar · April 20, 2023, 5:42pm

I hope so. Except, I think Base should inherit from AbstractContextManager and should also wrap its yield in a with super().

steve.dower · April 20, 2023, 6:19pm

There’s no reason to do this. I don’t even know why we have that, apart from perhaps as documentation for people who read source code instead of docs.

Context managers are a protocol, not an override. Abstract base classes are way too limited and way too heavyweight to normalise them when they aren’t useful.

This is only because of a reference cycle created by the traceback that leads to massive memory leaks when you handle exceptions in a loop. The unbound exception name is a wart, and not a good precedent for future decisions - we don’t want more of these, we want less.

If you want to avoid leaking resources beyond a with statement, then you can clean them up. If except blocks had been implemented after context managers, they’d probably use similar semantics to that, but they came first and we decided to make things better on the second attempt.

So let’s not consider turning warts into principles of good language design.

Andy_kl · April 20, 2023, 7:05pm

Okay, it seems that before answering the question “how to express rebinding after the block” we first need to answer “do we want the protocol to allow rebinding at all”. This is a pretty important question because, unlike the depth thing, it is something the language does not currently allow in a clear way.

To give context to new readers, the problem was first raised here

In code it looks like

from time import perf_counter
class TimeIt:
    def __init__(self):
        self.elapsed = None

    def __enter__(self):
        self.start_time = perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.elapsed = perf_counter() - self.start_time

with TimeIt() as time_it:
    ...
print(time_it.elapsed)

# In case manager.__enter__ returns non-self
with (use_me_to_get_result := cm()) as enter:
    ...  # work with 'enter'
use_me_to_get_result.result

The proposed solution is to allow returning an extra value (the return value of __leave__ or the second yielded value of __with__). When such a value is present, the as target of the with statement is rebound to that value.

I’d like to hear input from everyone on whether this “rebind after block” feature is desirable. Maybe things like timeit and assertRaises are black sheep, rare cases that should use workarounds.

NeilGirdhar · April 20, 2023, 7:08pm

The reason to do this is so that anyone can inherit from multiple context manager classes without losing behavior. If you try to explicitly call parent classes, you could get into trouble when there are common base classes. I understand if people don’t ever want to use inheritance with the context managers that they write. But by not calling super, we prevent anyone else from inheriting from our context managers.

Okay, I see your point about “normalizing” this. People are free to write context managers that don’t support inheritance, just like they’re free not to call super in __init__. I personally like to keep my options open, so I make an effort to call super in both __init__ and __enter__/__exit__.

As long as calling super is easy, then I’m happy.

What’s the “wart” here?

pf_moore · April 20, 2023, 7:43pm

I think of the value produced by the context manager as the managed object and not as a result. So I’d expect to use it in the with block, not afterwards - making it be a result feels unnatural to me.

Examples like timeit and assertRaises are unusual cases, mainly because there’s no “managed object” in that sense. But viewing the returned value as an object that manages information about what happened in the block is not unreasonable, so they do (mostly) fit this model, and aren’t so much “rare cases that should use workarounds” as “unusual cases that use the pattern in a slightly different way”.

So in summary, no I don’t think the rebinding feature is important, and I think that a mutable value that collects information about the block execution is a perfectly acceptable way of handling the use cases that rebinding is aimed at.
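A minimal sketch of that "mutable value that collects information" pattern, using today's protocol. The names BlockReport and observed are invented for illustration; the manager records the elapsed time and any exception on the object bound by as, and suppresses the exception in a way loosely similar to assertRaises.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from time import perf_counter
from typing import Optional


@dataclass
class BlockReport:
    """Filled in by the context manager while and after the block runs."""
    elapsed: Optional[float] = None
    error: Optional[BaseException] = None


@contextmanager
def observed():
    report = BlockReport()
    start = perf_counter()
    try:
        yield report
    except Exception as exc:
        # Record and suppress the error (an assumption of this sketch).
        report.error = exc
    finally:
        report.elapsed = perf_counter() - start


with observed() as report:
    assert report.elapsed is None  # nothing recorded while the block runs
    1 / 0  # recorded on the report and suppressed by observed()
```

After the block, report.error and report.elapsed are populated on the same object, so no rebinding of the name is needed.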

steve.dower · April 20, 2023, 7:49pm

I meant (and quoted) inheriting from AbstractContextManager.

You can, and should, totally call super() when you want to inherit the behaviour of something later in your MRO. But you can’t rely on everyone mixing in with you also subclassing from the ABC, and as soon as someone doesn’t, you may find your super() chain is terminated early.
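The early-termination effect can be sketched with two small hierarchies. All class names here (Logging, PlainLock, AbcLock, Broken, Cooperative) are hypothetical. When one mixin does not derive from AbstractContextManager, the ABC ends up ahead of it in the MRO, and because the ABC's __enter__ and __exit__ do not call super(), the chain stops there.

```python
from contextlib import AbstractContextManager

calls = []


class Logging(AbstractContextManager):
    def __enter__(self):
        calls.append("Logging.__enter__")
        return super().__enter__()

    def __exit__(self, *exc_info):
        calls.append("Logging.__exit__")
        return super().__exit__(*exc_info)


class PlainLock:
    # A context manager by protocol only: no ABC, no super() calls.
    def __enter__(self):
        calls.append("PlainLock.__enter__")
        return self

    def __exit__(self, *exc_info):
        calls.append("PlainLock.__exit__")
        return None


class AbcLock(AbstractContextManager):
    def __enter__(self):
        calls.append("AbcLock.__enter__")
        return super().__enter__()

    def __exit__(self, *exc_info):
        calls.append("AbcLock.__exit__")
        return super().__exit__(*exc_info)


class Broken(Logging, PlainLock):
    # MRO: Broken, Logging, AbstractContextManager, ABC, PlainLock, object
    pass


class Cooperative(Logging, AbcLock):
    # MRO: Cooperative, Logging, AbcLock, AbstractContextManager, ABC, object
    pass


def run(cls):
    calls.clear()
    with cls():
        pass
    return list(calls)
```

In Broken, PlainLock's methods are never reached; in Cooperative, both classes take part.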

Implicit del from the user’s namespace. If the user wants to remove a name, the syntax should reflect that - learning that except will unbind the name shouldn’t be necessary (it just turned out to be less harmful than leaving the call stack reference cycle around).
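The implicit unbinding is easiest to observe inside a function, where the name is a local variable. This is only an illustrative sketch (the function names are made up); the second function shows the usual explicit alternative of copying the exception to another name.

```python
def name_after_except():
    try:
        1 / 0
    except Exception as e:
        pass
    # 'e' was implicitly deleted when the except block finished.
    try:
        e
    except UnboundLocalError:
        return "unbound"
    return "still bound"


def keep_exception():
    saved = None
    try:
        1 / 0
    except Exception as e:
        saved = e  # an explicit copy survives the implicit unbinding
    return saved
```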

oscarbenjamin · April 20, 2023, 8:08pm

This subtopic just keeps reminding me why I don’t like inheritance or at least I think it should be used much less than it is. Multiple inheritance is even worse and at best I would only use it for something like a mixin where the different base classes provide mostly orthogonal functionality e.g. if one provides context manager behaviour and the other provides something completely different. The idea that you should be able to inherit from two different context managers and somehow combine and extend their functionality using super in a subclass just sounds like a mess and absolutely a case where composition or something else should be used instead. Making something like this work in a multiple inheritance scenario would require very careful design that would not even be possible without a clear use case. There is no way that I would try to design for MI in a context manager class just to “keep my options open” without having any particular idea of what is actually needed.

I think it is better to start from the premise that most classes just should not be subclassed. Those that are expected to be subclassed should be very deliberately designed to provide a particular contract between super and subclass and which methods are to be implemented or called by each (just calling super is not enough to define this contract). There should be no presumption that “anyone can inherit from multiple context manager classes without losing behavior” just as there should be no presumption that it is okay to subclass any random class that was not carefully designed for subclassing.

NeilGirdhar · April 20, 2023, 8:23pm

That’s totally fine. If you mark those classes final, then it’s no problem if you don’t call super because you know who your parents are.

I think some of your apprehension is justified because some people use inheritance as lazy composition—I agree that that’s bad. When it comes to cooperative multiple inheritance, I find that in general, as long as all the classes inherit from an appropriate base and all call super, everything does actually just work.

Let’s look at a concrete example. Upthread, there’s discussion about baking reentrancy into context manager protocol because writing it yourself is “error-prone”. Suppose instead, we were to add the following base class:

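from collections.abc import Generator
from contextlib import AbstractContextManager
from typing import Any, Generic, TypeVar, override  # override: Python 3.12+, or use typing_extensions

T = TypeVar("T")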

class ReentrantContextManager(AbstractContextManager[T], Generic[T]):
    @override
    def __init__(self, **kwargs: Any) -> None:
        super().__init__(**kwargs)
        self.reentrancy_count = 0

    @override
    def __with__(self) -> Generator[T]:
        self.reentrancy_count += 1
        try:
            with super() as x:
                yield x
        finally:
            self.reentrancy_count -= 1

If you use this as a base class, and provided your class calls super in all code paths, then the reentrancy count is necessarily updated in all code paths.

I agree that composing this is often preferable.
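Since __with__ is only a proposal, here is a rough equivalent of the same idea written against the current protocol. It is a sketch rather than a drop-in design: the class name ReentrancyCounter and the helper observe_depths are assumptions, and the counter is adjusted around the super() calls in __enter__ and __exit__.

```python
from contextlib import AbstractContextManager


class ReentrancyCounter(AbstractContextManager):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.reentrancy_count = 0

    def __enter__(self):
        self.reentrancy_count += 1
        try:
            return super().__enter__()
        except BaseException:
            # Entering failed further up the chain: undo the increment.
            self.reentrancy_count -= 1
            raise

    def __exit__(self, exc_type, exc_val, exc_tb):
        try:
            return super().__exit__(exc_type, exc_val, exc_tb)
        finally:
            self.reentrancy_count -= 1


def observe_depths():
    """Record the counter before, during, and after nested entry."""
    cm = ReentrancyCounter()
    depths = [cm.reentrancy_count]
    with cm:
        depths.append(cm.reentrancy_count)
        with cm:
            depths.append(cm.reentrancy_count)
        depths.append(cm.reentrancy_count)
    depths.append(cm.reentrancy_count)
    return depths
```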

stereobutter · April 24, 2023, 8:26am

I’ve already quoted @njs in the discussion about PEP 707 but I think his piece of insight is even more relevant for this discussion

So new users constantly try to write their own __enter__/__exit__ methods […] and I think literally every person who’s ever tried this has gotten it wrong (mostly around exception handling details). At this point we don’t even try to debug; we just tell users to always use @contextmanager.

From a usability standpoint I think a single generator based __with__ inspired by @contextmanager instead of a pair of __enter__/__exit__ methods would be a clear usability improvement in addition to yielding (pun intended) a very natural syntax for a few features that the current context manager protocol lacks:

skipping the enclosed code block (already mentioned above)
accessing a value returned by the block from the context manager

def __with__(self):
    if some_condition:
        return  # early return means the enclosed block is skipped
    try:
        ...
        # binds 'self' to the name captured by the `as` clause 
        # and receives any value returned from the block 
        # into the variable 'block_return_value'
        block_return_value = yield self  
    except:
        ...
    else:
        ...  # potentially do stuff with 'block_return_value' here
    finally:
        ...
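For comparison, today's contextlib.contextmanager does not support skipping the block: a generator that returns before its first yield makes the with statement raise RuntimeError. The sketch below (with a made-up maybe manager and try_skip helper) shows that current behaviour; it is not an implementation of the proposed __with__.

```python
from contextlib import contextmanager


@contextmanager
def maybe(skip):
    if skip:
        return  # early return: the generator never yields
    yield "value"


def try_skip():
    """Return (whether the body ran, the error message if any)."""
    ran_body = False
    try:
        with maybe(skip=True):
            ran_body = True
    except RuntimeError as exc:
        return ran_body, str(exc)
    return ran_body, None
```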

NeilGirdhar · April 24, 2023, 9:41am

I think this makes context managers way too complicated. It’s already complicated enough to reason about them.

How does the code block return a value?

stereobutter · April 24, 2023, 9:54am

I meant that the __with__ method would receive the value (if any) returned from the block, e.g. "hello world" in the example below:

def example():
    with SomeContextManager() as cm:
        return "hello world"

NeilGirdhar · April 24, 2023, 10:21am

So you can only return a value from block if you’re also returning from the enclosing function?

This seems unnecessarily complicated.

stereobutter · April 24, 2023, 10:35am

To be clear, my idea of making the value returned from within the block accessible to the context manager (in addition to exceptions raised in the block) is not related to the idea of rebinding the variable bound by the as clause that some other people mentioned. I personally think using an attribute on the context manager is a far better solution than rebinding the variable in __exit__.

with Timer() as timer:
    ...

print(f"the above block took {timer.duration} to execute")

Rosuav · April 24, 2023, 1:59pm

I wouldn’t consider it a normal behaviour of a code block, but it should at least be considered. For example, I’ve often made small database query functions that look like this:

def get_thing(id):
    with conn, conn.cursor() as cur:
        cur.execute("select blah where id=%s", (id,))
        return cur.fetchone()

(rough approximation from memory), so in this instance, the block is in the process of returning when the context manager ends.
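As a point of reference for the current protocol, the sketch below (the Recorder class and example function are illustrative names) suggests that when a block exits via return, __exit__ still runs before the function returns, and it receives (None, None, None); the returned value itself is not passed to the context manager.

```python
class Recorder:
    """Remembers the arguments its __exit__ was called with."""

    def __init__(self):
        self.exit_args = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.exit_args = (exc_type, exc_val, exc_tb)
        return False


def example(recorder):
    with recorder:
        return "hello world"  # __exit__ runs while this return is in progress
```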