Context managers inside generators: __genter__ and __gexit__

Hi guys.
I would like to use a context manager inside a generator, but in my case the enter and exit steps themselves could yield.

An example:

def generator():
    o = Obj()
    with o as obj:
        obj.use()

class Obj:
    def __genter__(self):
        for i in range(5):
            yield i
        print("Object finally ready to be used.")
        return self

    def __gexit__(self):
        return True

    def use(self):
        print("Using the object")

Usage:

g = generator()
while True:
    next(g)

I.e. something similar to:
async with <async_fcn>:

Do you think something like that would be possible?

Hmm, this looks like an awkward use of a context manager and generator; they need to know too much about each other. What does it mean for the generator to yield values? Presumably this isn’t actually some form of asynchronicity, or you’d just use async with, so they must be some sort of lazy values; but without context, it’s hard to see what’s going on.

Is it possible to separate the preparation from the actual context management, either before or after? For example:

def generator():
    o = Obj()
    yield from o.prepare()
    with o as obj:
        obj.use()

or

def generator():
    o = Obj()
    with o as obj:
        yield from obj.prepared_values()
        obj.use()

or possibly:

def generator():
    o = Obj()
    with o as headers:
        yield from headers
        o.use()

These would all work, but it depends on how you’re using this context/gen pair as to how much sense they do or don’t make.
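
As a rough illustration, the first variant could be made runnable as below. The prepare() method and the bodies of __enter__ and __exit__ are assumptions invented for this sketch; only the shape of generator() comes from the suggestion above.

```python
class Obj:
    def prepare(self):
        # Hypothetical preparation step that yields intermediate values.
        for i in range(5):
            yield i

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False  # do not suppress exceptions

    def use(self):
        return "Using the object"


def generator():
    o = Obj()
    yield from o.prepare()   # preparation happens before the with block
    with o as obj:
        yield obj.use()


values = list(generator())
```

Here the context manager stays an ordinary one, and all yielding happens in the generator body itself.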

Hi Chris.
Of course, it can be worked around.
The generator does not need to know anything about context manager. The generator is just a generator and does not need to know if the values come from context manager or from any other code.
The only issue is that we cannot yield from a context manager, i.e. there is no way for the context manager’s entry or exit to yield.

Imagine this code though:

mutex = Mutex()
def generator():
    with mutex:  # wait for mutex
        do_something()  # mutex is acquired

Note that the Mutex class here has nothing to do with an OS multithreading mutex, nor is it asyncio.Lock; it is my own component.
Of course, I could work around it without a context manager:

yield from mutex.lock()  # wait for mutex
try:
    pass  # access shared state
finally:
    mutex.release()

But you surely admit that this is where context managers are very useful.

Or, I could:

with mutex:  # this does not wait for the mutex; it would be more elegant if entering the context acquired it
    yield from mutex.lock()  # better, but still not nice use of the context manager
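
For concreteness, here is a minimal sketch of the try/finally workaround with a toy cooperative mutex and a round-robin scheduler. The Mutex implementation, the worker() function and run() are all assumptions made for illustration; the actual component is not shown in this thread.

```python
class Mutex:
    """Toy cooperative mutex (hypothetical, not the real component)."""

    def __init__(self):
        self.locked = False

    def lock(self):
        # Hand control back to the scheduler until the mutex is free.
        while self.locked:
            yield "waiting"
        self.locked = True

    def release(self):
        self.locked = False


mutex = Mutex()
trace = []


def worker(name):
    yield from mutex.lock()      # wait for mutex
    try:
        trace.append(f"{name} acquired")
        yield "working"          # hold the mutex across one scheduler step
    finally:
        mutex.release()
        trace.append(f"{name} released")


def run(tasks):
    # Minimal round-robin scheduler: step every task until all have finished.
    tasks = list(tasks)
    while tasks:
        for task in list(tasks):
            try:
                next(task)
            except StopIteration:
                tasks.remove(task)


run([worker("A"), worker("B")])
```

Worker B keeps yielding "waiting" to the scheduler until worker A has released the mutex.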

Yes, but it doesn’t make a lot of sense in general for a mutex to cause a generator to yield values. What kind of generator is this? What type of values is it producing?

If it’s a coroutine, I’d recommend using async with instead, since that’s what it’s designed for. But a generator normally is like a lazy list of values, and every generator will make different kinds of things - a prime number generator is very different from a tree-walking generator is very different from a directory-listing generator. (And all of them are very different from a coal generator, which does not generate coal. Man, I love English.) But anything can use a mutex. So what does it mean, conceptually, for the mutex to be able to yield values on behalf of the caller?

“Yes, but it doesn’t make a lot of sense in general for a mutex to cause a generator to yield values.”
It makes perfect sense. If you try to emulate a mutex in your own simulation environment, then the mutex has to yield to say “I am waiting; run me again when I can acquire the mutex”, which is exactly what happens when you do:

lock = asyncio.Lock()
async with lock:
    pass # access shared state

Here the lock in fact yields (async is just syntactic sugar). The difference is that there the yield is handled by asyncio’s event loop, whereas I have my own. In my case it is not a coroutine.

However, you have steered the discussion towards mutexes. That was just an example. There can be any other situation where you want your context manager to yield until the correct value is available.

This is not a generator function, because it does not contain yield or yield from.

It is a generator function, because

with o as obj:  # here the yield is hidden, because this in fact would do: yield from obj.__genter__()
    obj.use()

since __genter__() is a generator.

The compiler does not know that. The type of o is not known until this line is executed.

In any case, a with statement that yields values would be a great surprise. You would need a different keyword (or sequence of keywords) for this.

… so… it’s a coroutine.

… so… it’s not a coroutine.

What is it yielding? Is it a coroutine or isn’t it? What are you trying to do, and can you PLEASE provide more context? It makes a difference.

So far, from what you’re showing, it looks like this is a coroutine, which would mean building something on top of async/await is going to be far more suitable than building something on top of generators. Yes, the code is nearly identical, but (a) it’s not identical, notably with features like async with that simply don’t exist in generators, and (b) you can’t easily make generator-generators but you can pretty straight-forwardly make an async generator, so if you use generators to pretend to do async code, you’re cutting out the opportunity to ever use generators.

Of course, it was just a simplification.
We already have “async with” to tell the interpreter to translate to yielding.
For generators, we could have “yield with” or something similar.

It is a generator.
A coroutine is a special generator for asyncio.
I do not want to use asyncio. Nothing on top of async/await.

Regarding your comments:
(a) “async with simply don’t exist in generators” - hence the request here: I am asking for something like it to exist for generators.
(b) “you can pretty straight-forwardly make an async generator, so if you use generators to pretend to do async code” - can you show me how?

A coroutine is more general than Python’s asyncio module. Python’s generator functions are one form of coroutine, and asynchronous functions are another form of coroutine.

So I ask again: What are you doing with generators and why is it not better suited to async functions? You do NOT have to use asyncio to use async functions. I’m not going to show you how to “use generators to pretend to do async code”, I’m going to show you how to do async code.

import time

class Thing:
	async def __aenter__(self):
		print("Entering a thing")
		return self
	async def __aexit__(self, *args):
		print("Exiting a thing")

class Delay:
	def __await__(self):
		print("Delays, delays...")
		return iter([123])

async def coro():
	print("Hi, I'm a coroutine")
	await Delay()
	async with Thing() as obj:
		print("I have", obj)
	print("The context manager is all done now.")
	await Delay()
	print("Bye!")

waitfor = coro()
while True:
	try: waitfor.send(None)
	except StopIteration: break
	time.sleep(2)

Not a hint of asyncio in sight. This is async functions done manually. Obviously to be useful, you’d have to have a proper event loop, but this is a simple proof of async/await in an extremely vanilla context.

Generators != asynchronous functions.

No, it is not a generator function. Maybe you want it to be a generator function in the future but right now it is not a generator function.

The first technical issue is this question: how does the compiler know at compile time that o is a context manager containing a yield?

The compiler has to decide at compile time whether the def statement builds a regular function or a generator function. It does that by looking for yield inside the function block. If there is no yield, and the value of o is unknown until runtime, how does the compiler decide what to do?
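
A small sketch of that compile-time decision, using inspect.isgeneratorfunction from the standard library. The function names plain and gen are made up for the example, and neither function is ever called, so the type of o never comes into play:

```python
import inspect


def plain(o):
    # No yield anywhere in the body: compiled as a regular function,
    # whatever o turns out to be at runtime.
    with o as obj:
        obj.use()


def gen(o):
    # A single yield in the body is enough to make this a generator function.
    with o as obj:
        obj.use()
        yield


plain_is_generator = inspect.isgeneratorfunction(plain)
gen_is_generator = inspect.isgeneratorfunction(gen)
```

The first check comes out False and the second True, purely from the source text of each function.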

But before we even get into the technical details, let’s talk about the most critical issues:

  • What are you trying to do?
  • Why are you trying to do it this way?
  • What is this supposed to do, at runtime?

I’ve read this entire thread, and I still don’t understand the motive behind it, or the use-case.

I think the motive is that you want to write asynchronous code without using asyncio, but I’m not sure.

  1. Why do you want to avoid asyncio?
  2. Have you tried using an old-style generator-based coroutine?

The core devs aren’t going to spend many hundreds of person-hours designing, discussing, implementing and testing a complex new feature if the answers are just “no particular reason” and “no I haven’t”.

In your first post you showed us this usage:

g = generator()
while True:
    next(g)

which is great, but I have no idea what that is supposed to do and how it will differ from the exact same code run today.

To explain your proposal, it’s not enough to just show a sample of how you use your feature; you also have to tell us what that code will do.

In another post, you have this:

with mutex: # wait for mutex

followed just ten lines later:

with mutex:  # this will not wait for mutex

o_O

That looks like a contradiction to me. The exact same line of code will both wait for mutex and not wait for mutex. Is this something to do with quantum mechanics? How is the reader supposed to know which case it is, if the comment is missing or inaccurate? How does the interpreter decide between the two behaviours?

Yes, that was two alternative ways to achieve a goal. In the first (and more desirable) form, with mutex: means “wait for the mutex”; in the second form, with mutex: just means “I’m using the mutex”, and yield from mutex.lock() means “wait for the mutex to be available”.

But this is exactly what async functions are good at. And it’s NOT what generators are good at. I’m still at a loss as to why this somehow HAS to be a generator function.
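
For what it is worth, the “wait for the mutex” form can be sketched with async with and a hand-rolled scheduler, with no asyncio involved. Everything here (Suspend, this Mutex, worker, run) is an assumption made up for illustration, not code from the package under discussion:

```python
class Suspend:
    """Awaitable that hands control back to the scheduler exactly once."""

    def __await__(self):
        yield "waiting"


class Mutex:
    """Toy cooperative mutex usable with async with (hypothetical)."""

    def __init__(self):
        self.locked = False

    async def __aenter__(self):
        while self.locked:       # "with mutex" really does wait here
            await Suspend()
        self.locked = True
        return self

    async def __aexit__(self, *exc_info):
        self.locked = False


mutex = Mutex()
trace = []


async def worker(name):
    async with mutex:
        trace.append(f"{name} acquired")
        await Suspend()          # hold the mutex across one scheduler step
    trace.append(f"{name} released")


def run(coros):
    # Minimal round-robin scheduler driving coroutines with send(None).
    pending = list(coros)
    while pending:
        for coro in list(pending):
            try:
                coro.send(None)
            except StopIteration:
                pending.remove(coro)


run([worker("A"), worker("B")])
```

The waiting is hidden inside __aenter__, which is the behaviour the first form of with mutex: was asking for.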

Guys, the issue is that I misled you by thinking my level of explanation was good enough, so I introduced some ‘uncertainties’ which I have to clarify now.

First of all, I want something like this:

def generator():
    o = Obj()
    yield with o as obj:  # this is the request of my post: can we have something like "yield with" which would compile into generator obj.__genter__() feeding?
        obj.use()

class Obj:
    def __genter__(self):
        for i in range(5):
            yield i
        print("Object finally ready to be used.")
        return self

    def __gexit__(self):
        return True

    def use(self):
        print("Using the object")

I hope now it is clear.
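
One possible reading of what the proposed “yield with” would expand to, written in today’s Python. This is only a guess at the intended semantics: it assumes __gexit__ is also a generator, and it ignores exception details and suppression entirely.

```python
class Obj:
    def __genter__(self):
        for i in range(5):
            yield i
        return self              # becomes the value of "yield from"

    def __gexit__(self):
        # Assumed to be a generator as well; the empty yield from makes it one.
        yield from ()
        return True

    def use(self):
        return "Using the object"


def generator():
    o = Obj()
    # Hand-written stand-in for:  yield with o as obj:
    obj = yield from o.__genter__()
    try:
        yield obj.use()
    finally:
        yield from o.__gexit__()


values = list(generator())
```

Because the yield from is written out explicitly, the compiler can see that generator() is a generator function.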

As for why I need that: first of all, I was not aware that I could easily do it with async/await without introducing asyncio etc. I tried it (thanks @Rosuav) and it works. I also found that I can use async/await together with generators (both can be fed with send()), which is very good.
However, I cannot use async/await for this particular problem (working around “yield with” with “async with”), because it creates a coroutine dependency: the caller of the coroutine has to be a coroutine too. That means I would need to switch the whole project to coroutines. So far I have not decided whether this is the right solution, because it creates further dependencies for the users of my package, who would have to use coroutines as well.

Unfortunately, you’ve hit on something pretty fundamental. Coroutines tend to infect everything they touch. Whether you ended up forcing the matter through generators or switching to async functions, the exact same thing would happen.
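
A tiny sketch of that “infection”, with made-up function names: a plain function that calls a coroutine function only gets a coroutine object back, so it either has to become a coroutine itself or drive the object by hand the way a scheduler would.

```python
import inspect


async def inner():
    return 42


async def outer():
    # To use inner()'s result, outer has to be a coroutine and await it.
    return await inner() + 1


def plain_caller():
    coro = outer()                   # calling only creates a coroutine object
    assert inspect.iscoroutine(coro)
    try:
        coro.send(None)              # drive it by hand, as a scheduler would
    except StopIteration as stop:
        return stop.value            # only reached because nothing suspended


result = plain_caller()
```

This only works so simply because nothing in the chain ever suspends; as soon as something does, the plain caller needs a real scheduling loop.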

This probably isn’t a major problem, as there are (broadly speaking) two solutions:

  1. Separate “stuff that can wait” from “stuff that can’t”, and make sure that the ONLY things that go into the second category really truly cannot ever be slow. This is kinda a pain sometimes, when you discover that something needs to switch categories, but it’s what most people do.
  2. Decide that async def is simply how you start functions from now on, and await func() is how you call them. There’s a bit of overhead but at least it’s consistent. There’ll be some oddities, such as that you’d have to use __new__ rather than __init__ to set up your classes (since init can’t be a coroutine), but for the most part, it should work.
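
A minimal sketch of the __new__ idea from option 2, assuming a hypothetical Connection class. With an async __new__, calling the class returns a coroutine rather than an instance, so __init__ is never invoked and construction has to be awaited:

```python
class Connection:
    # Hypothetical class, for illustration only.
    async def __new__(cls, host):
        self = object.__new__(cls)
        self.host = host         # any awaits needed for setup would go here
        return self


async def main():
    conn = await Connection("example.invalid")
    return conn.host


# Drive the coroutine by hand; no event loop is needed for this sketch.
coro = main()
try:
    coro.send(None)
except StopIteration as stop:
    host = stop.value
```

An async factory classmethod would be a more conventional way to get the same effect, at the cost of a slightly different call syntax.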

One concept that I’m toying with is that a coroutine that isn’t immediately awaited should be implicitly “spun off” as a separate task. I’m not entirely sure how to accomplish this, but it might make this sort of code a bit easier to work with. The trouble is that it might introduce a lot of overhead, since there’s no easy way to check whether a coroutine is about to be awaited; that means every coro call has to become a stand-alone task, and then if you wait for it, you wait for the task. But that overhead might be worthwhile for the simplicity of being able to do this:

async def get_ip_address(hostname):
    """TODO: use DNS (not gethostbyname) so it works asynchronously"""

async def write_log_entry(line, **kwargs):
    if "hostname" in kwargs and "ip" not in kwargs:
        kwargs["ip"] = await get_ip_address(kwargs["hostname"])
    # log with all available information

async def do_stuff():
    write_log_entry("Doing stuff with Google", hostname="google.com")
    await stuff()
    write_log_entry("Doing stuff with Twitch", hostname="twitch.tv")
    await stuff()
    # etc

For this to work, a coroutine would have to immediately execute down to its first await point, and then the awaitable returned would have to be added to the queue of tasks (if you’re using asyncio, that’s asyncio.create_task, but if you’re doing your own asynchronicity, there’s probably a list of “stuff we’re waiting on” somewhere). Whatever comes back from that would need to still be awaitable, but now as a task. I think it can be done with a decorator but I’m not entirely sure.
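
A partial sketch of the decorator idea using asyncio.create_task. The spin_off name is made up, and this version does not run the coroutine eagerly down to its first await: the task only starts on the event loop’s next iteration, and it needs a running loop at call time.

```python
import asyncio
import functools


def spin_off(func):
    """Hypothetical decorator: calling the coroutine function schedules a task."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Requires a running event loop. The returned Task is still awaitable.
        return asyncio.create_task(func(*args, **kwargs))
    return wrapper


log = []


@spin_off
async def write_log_entry(line):
    await asyncio.sleep(0)       # stand-in for a slow lookup such as DNS
    log.append(line)
    return line


async def do_stuff():
    write_log_entry("first")                    # not awaited: runs in the background
    result = await write_log_entry("second")    # awaited: waits for that task
    await asyncio.sleep(0.01)                   # let any background task finish
    return result


result = asyncio.run(do_stuff())
```

A real version would also need to keep references to the un-awaited tasks and deal with their exceptions, which this sketch leaves out.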