Yield-based contextmanager for classes

@contextlib.contextmanager is awesome for easily creating context managers without boilerplate. However, writing context managers for classes is still painful and boilerplate-heavy, especially when dealing with nested context managers.
It would be great to extend the yield-based syntax of @contextlib.contextmanager to classes (e.g. through a __contextmanager__ method).

Here is an example of how this would look on a (simplified) real use case I had.

@contextmanager
@dataclass
class File:
  path: str
  mode: str = 'r'

  def __contextmanager__(self):
    with tf.io.gfile.GFile(self.path, self.mode) as f:
      with h5py.File(f, self.mode) as h5_f:
        yield h5_f

  ...  # File has other methods

with File('/path/to/file.txt') as f:
  data = f['dataset']

Without __contextmanager__, implementing __enter__/__exit__ would be much more verbose and ugly. One would have to save the GFile and h5py.File context managers in attributes like self._gfile_context. It’s also not trivial at all to implement __exit__ correctly in case GFile or h5py.File suppresses the exception (i.e. GFile.__exit__() returns True).

The implementation could be a simple extension of contextlib.contextmanager. Here is a proof of concept:

import contextlib
from typing import ContextManager, Optional, TypeVar

_T = TypeVar('_T')

def contextmanager(cls):
  cm: Optional[ContextManager[_T]] = None

  def __enter__(self):
    nonlocal cm
    cm = self.__contextmanager__()
    return cm.__enter__()

  def __exit__(self, exc_type, exc_value, traceback):
    return cm.__exit__(exc_type, exc_value, traceback)

  cls.__enter__ = __enter__
  cls.__contextmanager__ = contextlib.contextmanager(cls.__contextmanager__)
  cls.__exit__ = __exit__
  return cls

Note: This implementation also works with inheritance:

@contextmanager
class FileWrapper(File):

  def __contextmanager__(self):
    with super().__contextmanager__() as f:
      yield f

The context of a context manager has to live somewhere (the manager), so if calling File returns a context manager, the state has to live in File. In other words, an extra attribute is always needed to implement __contextmanager__, there’s no way around it. You can limit the attribute to exactly one without any new magic though:

@contextmanager
def _file_context_manager(s):
    with tf.io.gfile.GFile(s.path, s.mode) as f:
      with h5py.File(f, s.mode) as h5_f:
        yield h5_f

@dataclass
class File:
    path: str
    mode: str = "r"
    _ctx: Any = None

    def __enter__(self):
        if self._ctx is not None:
            raise RuntimeError("can't enter again")
        self._ctx = _file_context_manager(self)
        return self._ctx.__enter__()

    def __exit__(self, et, ev, tb):
        if self._ctx is None:
            raise RuntimeError("not entered")
        ctx = self._ctx
        self._ctx = None
        return ctx.__exit__(et, ev, tb)

(No comment on whether the interface is good design or not in the first place.)

I find myself explaining this trick to people all the time, so I agree it would be nice if there was a standard version I could point them to.

In my case, it’s because Trio has some important context managers that people want to wrap (cancel scopes + nurseries). So new users constantly try to write their own __enter__/__exit__ methods that manually call Trio’s __enter__/__exit__ methods + do other things, and I think literally every person who’s ever tried this has gotten it wrong (mostly around exception handling details). At this point we don’t even try to debug; we just tell users to always use @contextmanager.

One thing that makes me a bit uncomfortable is the handling of the magic contextmanager state. There’s the question of what to call the magic object attribute, but even more so, it’s weird to have a method that implicitly sets object state – what if someone calls __enter__ twice and one of the context manager objects overwrites the other? But I think these can be solved [1], and I’d be in favor of adding this to contextlib. @ncoghlan and @yselivanov, IIRC you’re the contextlib maintainers; what do you think?

[1] e.g. make the second call to __enter__ raise an error instead of overwriting internal state.
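For concreteness, footnote [1] could look something like the following single-use variant (a sketch only; the _cm_state attribute name and the Resource demo class are my own invention, not part of the proposal):

```python
import contextlib

def contextmanager(cls):
    """Sketch of footnote [1]: store the live context manager on a
    uniquely-named instance attribute and refuse a second __enter__."""
    def __enter__(self):
        if getattr(self, '_cm_state', None) is not None:
            raise RuntimeError("can't enter again")
        self._cm_state = contextlib.contextmanager(self.__contextmanager__)()
        return self._cm_state.__enter__()

    def __exit__(self, exc_type, exc_value, traceback):
        cm = getattr(self, '_cm_state', None)
        if cm is None:
            raise RuntimeError('not entered')
        self._cm_state = None
        return cm.__exit__(exc_type, exc_value, traceback)

    cls.__enter__ = __enter__
    cls.__exit__ = __exit__
    return cls

@contextmanager
class Resource:
    """Hypothetical demo class."""
    def __contextmanager__(self):
        self.active = True
        try:
            yield self
        finally:
            self.active = False
```

The attribute doubles as debugging state: you can inspect obj._cm_state to see whether an instance is currently entered.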

Maybe I misunderstood. My implementation doesn’t require any new attribute, yet seems to work (it uses a closure instead). Did I miss something?

Yes, I know about the self._ctx trick using @contextlib.contextmanager, but is it really the best Python can do? My point is that this is verbose, easy to get wrong (e.g. handling the re-entrant case), and has to be duplicated over and over.

Re-entrant context managers are a good point. I fixed my implementation to support this (and also added a few checks to get something closer to production code):

def contextmanager(cls):
  """Yield-based contextmanager for classes."""
  # Use cls.__dict__ instead of hasattr to support inheritance
  if '__contextmanager__' not in cls.__dict__:
    raise NotImplementedError(f'Missing {cls.__name__}.__contextmanager__')
  if '__enter__' in cls.__dict__ or '__exit__' in cls.__dict__:
    raise ValueError(
        f'{cls.__name__}.__enter__/__exit__ should not be defined when using '
        '@contextmanager'
    )

  context_stack: List[ContextManager[_T]] = []

  def __enter__(self):
    cm = self.__contextmanager__()
    context_stack.append(cm)
    return cm.__enter__()

  def __exit__(self, exc_type, exc_value, traceback):
    if not context_stack:
      raise RuntimeError('Context manager was never __enter__.')
    cm = context_stack.pop()
    return cm.__exit__(exc_type, exc_value, traceback)

  cls.__enter__ = __enter__
  cls.__contextmanager__ = contextlib.contextmanager(cls.__contextmanager__)
  cls.__exit__ = __exit__
  return cls

This is just one of the possible implementations. There might also be implementations relying on inheritance and an abstract method.
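Such an inheritance-based variant might look roughly like this (a sketch only; the YieldContextManager name and the per-instance stack are assumptions of mine, not part of the proposal):

```python
import abc
import contextlib

class YieldContextManager(abc.ABC):
    """Hypothetical base class: subclasses only implement __contextmanager__."""

    @abc.abstractmethod
    def __contextmanager__(self):
        """A generator that yields exactly once, as for @contextlib.contextmanager."""

    def __enter__(self):
        # Lazily create a per-instance stack, so the class stays re-entrant
        # and separate instances never share state.
        stack = self.__dict__.setdefault('_cm_stack', [])
        cm = contextlib.contextmanager(self.__contextmanager__)()
        stack.append(cm)
        return cm.__enter__()

    def __exit__(self, exc_type, exc_value, traceback):
        stack = self.__dict__.get('_cm_stack')
        if not stack:
            raise RuntimeError('__exit__ called without matching __enter__')
        return stack.pop().__exit__(exc_type, exc_value, traceback)

class Counter(YieldContextManager):
    """Hypothetical demo subclass."""
    count = 0

    def __contextmanager__(self):
        self.count += 1
        try:
            yield self
        finally:
            self.count -= 1
```

Compared with the decorator, this trades a line in the class statement (the base class) for not needing any class rewriting at decoration time.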

@contextmanager
class MyObject:
  count = 0

  def __contextmanager__(self):
    self.count += 1
    try:
      yield
    finally:
      self.count -= 1

obj = MyObject()
with obj:
  with obj:
    assert obj.count == 2

The proposal looks OK to me, actually.

The re-entrant version looks plausible to me as well.

I’ve had two prior goes at this (once in the original PEP which we rolled back because it was impossible to explain and once when I considered making a public API for the somewhat nasty hack that lets ContextDecorator work), and using the decorator’s closure as additional storage space potentially addresses some of my concerns with previous attempts.

One challenge with a closure, though, is thread safety - every instance will get the same storage, as the closure is associated with the class definition rather than each instance.

Lazily initialised storage on the instance object is one way to avoid that. Context variables could potentially offer another option.

Thread safety is another good point. Couldn’t weakref.WeakKeyDictionary be used for this? Something like:

def contextmanager(cls):
  """Yield-based contextmanager."""
  context_stacks = weakref.WeakKeyDictionary()

  def __enter__(self):
    cm = self.__contextmanager__()
    context_stacks.setdefault(self, []).append(cm)
    return cm.__enter__()

  def __exit__(self, exc_type, exc_value, traceback):
    if not context_stacks.get(self):
      raise RuntimeError('Context manager was never __enter__.')
    cm = context_stacks[self].pop()
    return cm.__exit__(exc_type, exc_value, traceback)

  cls.__enter__ = __enter__
  cls.__contextmanager__ = contextlib.contextmanager(cls.__contextmanager__)
  cls.__exit__ = __exit__
  return cls

I believe this should work if each thread uses different instances of cls. However, if threads try to open/close context managers concurrently on the same instance, I believe there would still be an issue. I’m not really familiar with context variables, but I’ll have a look.

If you want to allow arbitrarily deep enter/exit nesting in arbitrarily many threads, you could do something like:

def contextmanager(cls):
    stack_holder = threading.local()

    def __enter__(self):
        cm = self.__contextmanager__()
        if not hasattr(stack_holder, "stack"):
            stack_holder.stack = []
        stack_holder.stack.append(cm)
        return cm.__enter__()

    # ... you get the idea

Personally I’d be fine with the super-simple version where __enter__ can only be called once (what real context manager objects need to handle getting reentered arbitrarily?), and the state just uses a uniquely-named attribute on the class (might make debugging easier?). But all these options are pretty simple to implement, and if @ncoghlan prefers the fancy version then that’s good enough for me.
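Completing the threading.local sketch above into something runnable (my assumption of how it might be finished; the Counter demo class is mine), each thread gets its own stack, though instances used within one thread still share it:

```python
import contextlib
import threading

def contextmanager(cls):
    """Sketch: each thread sees its own stack of live context managers
    (still shared across instances within a thread, as in the sketch above)."""
    stack_holder = threading.local()

    def __enter__(self):
        cm = self.__contextmanager__()
        if not hasattr(stack_holder, 'stack'):
            stack_holder.stack = []
        stack_holder.stack.append(cm)
        return cm.__enter__()

    def __exit__(self, exc_type, exc_value, traceback):
        if not getattr(stack_holder, 'stack', None):
            raise RuntimeError('Context manager was never entered.')
        return stack_holder.stack.pop().__exit__(exc_type, exc_value, traceback)

    cls.__enter__ = __enter__
    cls.__contextmanager__ = contextlib.contextmanager(cls.__contextmanager__)
    cls.__exit__ = __exit__
    return cls

@contextmanager
class Counter:
    """Hypothetical demo class."""
    count = 0

    def __contextmanager__(self):
        self.count += 1
        try:
            yield self
        finally:
            self.count -= 1
```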

Using threading.local sounds nice. I still think we need to keep each instance’s state separate to avoid bugs like:

all_contexts = [MyObject() for _ in range(3)]
for cm in all_contexts:
  cm.__enter__()
for cm in all_contexts:  # Should be reversed(all_contexts)
  cm.__exit__(None, None, None)  # Oops, closing in the wrong order

On the contrary, I think this is an important feature.
For real examples of re-entrant context managers, here is the first that comes to mind, but I’m sure there are many others:

import threading

lock = threading.RLock()
with lock:
  with lock:  # re-entrant acquire succeeds without blocking
    pass

To inspect the state, another idea is to attach it to the __contextmanager__ function. Setting the state on the instance is possible, but might add complexity when dealing with inheritance, immutable classes, or classes that override __getattribute__.

It isn’t just thread safety that needs to be considered, but also async safety. Hence the prospect of using context variables rather than thread local variables.

I think it’s worth the effort of designing something that’s thread-safe, async-safe, and re-entrant, rather than having to document the limitations.
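For illustration, here is a contextvars-based sketch (my assumption of what such a design could look like, not a vetted implementation). Each thread, and each asyncio task, runs in its own contextvars.Context, so each sees its own stack:

```python
import contextlib
import contextvars

def contextmanager(cls):
    """Sketch: a ContextVar holds the per-class stack of live context
    managers, so each thread and each async task sees its own copy.
    Caveat: a task that inherits an already-populated stack from its parent
    context still shares the underlying list object, so this is a sketch,
    not a complete solution."""
    stack_var = contextvars.ContextVar(f'{cls.__name__}._cm_stack')

    def __enter__(self):
        stack = stack_var.get(None)
        if stack is None:
            stack = []
            stack_var.set(stack)
        cm = self.__contextmanager__()
        stack.append(cm)
        return cm.__enter__()

    def __exit__(self, exc_type, exc_value, traceback):
        stack = stack_var.get(None)
        if not stack:
            raise RuntimeError('Context manager was never entered.')
        return stack.pop().__exit__(exc_type, exc_value, traceback)

    cls.__enter__ = __enter__
    cls.__contextmanager__ = contextlib.contextmanager(cls.__contextmanager__)
    cls.__exit__ = __exit__
    return cls

@contextmanager
class Counter:
    """Hypothetical demo class."""
    count = 0

    def __contextmanager__(self):
        self.count += 1
        try:
            yield self
        finally:
            self.count -= 1
```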