Oh my mistake. I guess I got it backwards.
Yeah, the else only catches if there is no break. It makes sense because if it breaks, then we never hit the return statement, so we won’t be able to catch anything in else.
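For reference, a minimal sketch of the for/else rule described above. The function name `how_loop_ended` is made up for illustration: the else clause runs only when the iterator is exhausted, and a break skips it.

```python
def how_loop_ended(numbers):
    """Report whether the loop ran to exhaustion or was left via break."""
    for n in numbers:
        if n < 0:
            break  # leaving via break skips the else clause
    else:
        # Reached only when the iterator was exhausted without a break.
        return "exhausted"
    return f"broke at {n}"


print(how_loop_ended([1, 2, 3]))   # exhausted
print(how_loop_ended([1, -2, 3]))  # broke at -2
```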
I suspect you’re more likely to find support for adding something like a return_value property[1] to generator objects than for adding else as syntax. With this, your opening example would be more like:
gen = generator()
for value in gen:
    print(value)
else:
    print(gen.return_value)
[1] that raises RuntimeError before return has executed
The existing use case for this capability is primarily a historical one: returning values from generators dates from a time when all Python coroutines were just generators with a suitable decorator wrapped around them. It remains in place for backwards compatibility with decorator based alternatives to the native async def coroutine definition syntax.
Personally, if we were to add dedicated syntax for “iterate and get result”, I’d just stick the as clause directly in the main for loop header rather than requiring the else clause:
for value in generator() as result:
    print(value)
print(result)
I’d still be -0 myself (I don’t think the need comes up often enough to add dedicated syntax to handle it). I’d be -1 on altering the expected lifecycle of generator return values by having any references to the exhausted generator also keep the return value alive.
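To illustrate the lifecycle concern, here is a small sketch of today's behaviour as I understand it on CPython. `Payload` and the helper function are hypothetical names used only for this demonstration. The return value travels on the StopIteration instance, so it can be released as soon as the handler finishes, even while the exhausted generator object is still referenced.

```python
import gc
import weakref


class Payload:
    """Stand-in for a large return value (it must support weak references)."""


def generator():
    yield 1
    return Payload()


def return_value_released_after_handler():
    gen = generator()
    next(gen)  # consume the single yielded value
    try:
        next(gen)  # the generator returns, so StopIteration is raised
    except StopIteration as stop:
        ref = weakref.ref(stop.value)  # watch the value without owning it
    # `stop` is unbound when the except block ends, dropping the exception.
    gc.collect()  # should not be needed on CPython, but is harmless
    # `gen` is still referenced here, yet nothing keeps the value alive.
    return gen, ref() is None


_, released = return_value_released_after_handler()
print(released)
```

On CPython this is expected to print True. A result attribute stored on the generator would instead keep the payload alive for as long as `gen` is.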
Offering another way to encapsulate this functionality in a utility library if it’s a frequent need in a given application (the resemblance between IterResult and asyncio.Future is not coincidental):
class IterResult:
    def __init__(self):
        self._value_set = False
        self._value = None
    def __repr__(self):
        if self._value_set:
            value = f"value={self._value!r}"
        else:
            value = "<value not set>"
        return f"{type(self).__name__}({value})"
    @property
    def value(self):
        return self._value
    @value.setter
    def value(self, value):
        self._value_set = True
        self._value = value
    def result(self):
        if not self._value_set:
            raise RuntimeError("Iteration result is not set")
        return self.value

def capture_result(iterable):
    itr = iter(iterable)
    result = IterResult()
    def iterator():
        result.value = yield from itr
    return iterator(), result

def generator():
    for i in range(3):
        yield i
    return -1
>>> itr, result = capture_result(generator())
>>> result
IterResult(<value not set>)
>>> result.value
>>> result.result()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 20, in result
RuntimeError: Iteration result is not set
>>> list(itr)
[0, 1, 2]
>>> result
IterResult(value=-1)
>>> result.value
-1
>>> result.result()
-1
Maybe it could make sense to offer something along these lines in itertools (or the third party library more-itertools), but the idiom is rare enough that it doesn’t seem unreasonable to expect projects that use it heavily to define their own helper functions to handle it more neatly.
tl;dr Imo there are still some applications for coroutines that aren’t Coroutine objects (i.e. generators since PEP-380), but OP’s issue combined with how generators are typed makes it very unergonomic to write and use such coroutines.
Adding a +1 to some kind of ergonomic way to get a generator’s return value.
I don’t have a good idea on how… tbh I like the idea of a new property or method on Generator objects to hold the result, but I’m not knowledgeable enough in memory management to fully appreciate why Alyssa seems to not like the idea–I’ll just take her word for it that it’s Not Great.
Much of my motivation comes from an old blog post: Coroutines as an Alternative to State Machines. It’s old enough that “coroutine” is used in its more computer scientific sense, as Coroutine objects did not yet exist in Python. I found it an interesting idea, and even applied the concept in my own application. The examples there do not use return or even yield from, interestingly: only generator.send was actually important there.
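For readers who haven't seen the pattern, a minimal sketch of a generator used as a state machine driven purely by generator.send. The `door` machine and its event names are invented here and are not taken from the blog post.

```python
def door():
    """Two-state machine: each yield reports the state and waits for an event."""
    while True:
        # "closed" state: stay here until an "open" event arrives.
        event = yield "closed"
        while event != "open":
            event = yield "closed"
        # "open" state: stay here until a "close" event arrives.
        event = yield "open"
        while event != "close":
            event = yield "open"


machine = door()
states = [next(machine)]  # prime the generator; it stops at the first yield
for event in ["knock", "open", "knock", "close"]:
    states.append(machine.send(event))

print(states)  # ['closed', 'closed', 'open', 'open', 'closed']
```

The current state is simply the position in the function body, so no explicit state variable or transition table is needed.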
In my case, I do need return. My generator has the type Generator[bool, SomeEnum, Client]. Initially, it was Generator[Client | None, SomeEnum, None].[1] This was ugly, but, worse, it meant I could never truly know when the caller would have access to the Client object. As the programmer, I’m pretty sure I know, but Pyright won’t shut up about it if I write the natural for-loop:
attempts = 0
steps = iter(SomeEnum)
step = next(steps)
g = mygen(creds)  # type: Generator[Client | None, SomeEnum, None]
for test in g:
    if attempts > max_tries:
        break
    if test:
        cl = g.send(step)  # Advance to the next state
        # And yes, there's a bug here with both `test` and `cl` coming from
        # yield. Bear with me, please! The important part is that `cl` may be
        # None | Client on every loop, and the same would be true of `test`.
        step = next(steps)
        continue
    attempts += 1
do_some_stuff_with_client(cl)  # Error: object of type `None` cannot be assigned to `Client`
With the “cleaner” signature, there’s still a problem:
for test in g:  # type of g: Generator[bool, SomeEnum, Client]
    if attempts > max_tries:
        break
    if g.send(step):
        step = next(steps)
        continue
    attempts += 1
else:
    do_some_stuff_with_client(test)  # Error: object of type `bool` cannot be assigned to `Client`
    # There's no way to get the Client out of the generator once it's closed.
Even if I wrap it in another generator, like some in this thread have suggested:
def outter_gen(g):
    final = yield from g
    yield final

g = outter_gen(mygen(creds))
next(g)
for test in g:
    ...
else:
    do_some_stuff_with_client(test)  # Error: object of type `bool` cannot be assigned to `Client`
    # But `test` really _is_ a Client now!
I get to write a straightforward for loop, but…
Because outter_gen’s yield type will be the yield type of mygen and the return type is erased, the type checker has no way to know that the type of test in the else block is Client. We could return from outter_gen to avoid erasing the return type, but then we’re right back where we started. (Unless the idea of variadic yield types a la tuple didn’t immediately make you throw up in your mouth…)
If I want to both capture the return value and have a happy type checker, I have to go with this:
while True:
    if attempts > max_tries:
        break
    try:
        if g.send(step):
            step = next(steps)
            continue
    except StopIteration as e:
        cl = e.value
        break
    attempts += 1
do_some_stuff_with_client(cl)
which… ew.
And the sad part is: the “coroutines as state machines” idea is actually really nice–refactoring the actual function in my app with this pattern dropped the cyclomatic complexity by five or six points and the end result is so much easier to read. However, the pattern is obscure to the point of being arcane because Coroutines cannot be used this way and the fact that Generators are coroutines (in the theoretical sense) is buried in twenty-year-old PEPs. And, even if one comes across an old blog post like I did or figures it out from first principles (like I wish I had), this pattern is discouraged by the poor ergonomics surrounding coroutines that aren’t async coroutines.
It would be great if the else clause could rebind the loop’s variable with whatever was in StopIteration.value, but then we run into the classic problem of “sometimes None is what you want.” A non-generator Iterator always puts None there, but a Generator might have an Optional[T] return type, so how is the runtime to know if it should rebind the loop variable for one iterator but not another? Maybe a PEP-661 sentinel gets used instead of None, but then we’d be waiting until, what, 3.22 before all supported Python versions have the new semantics? New syntax like else as y or for x in g as y could be “faster,” but what should that syntax be? The shortest path would be a generator.result property, but that has implications for garbage collection I don’t fully understand.
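A small sketch of the ambiguity, using a hypothetical `drain` helper that exhausts an iterator by hand. A plain iterator, a generator with no return statement, and a generator that explicitly returns None all leave the same thing in StopIteration.value.

```python
def drain(iterator):
    """Exhaust an iterator manually and return whatever StopIteration carried."""
    while True:
        try:
            next(iterator)
        except StopIteration as stop:
            return stop.value


def returns_nothing():
    yield 1


def returns_none():
    yield 1
    return None  # a deliberate Optional[T] result that happens to be None


def returns_value():
    yield 1
    return "done"


print(drain(iter([1])))          # None
print(drain(returns_nothing()))  # None
print(drain(returns_none()))     # None
print(drain(returns_value()))    # done
```

From the value alone, the first three cases cannot be told apart, which is why rebinding the loop variable would need a sentinel or new syntax.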
[1] And yes, the Client does need to be constructed in the generator. It comes from a library I don’t control and it attempts to authenticate upon instantiation. That makes it useful for testing whether the server is ready to accept connections, but it also means I can’t pass a pre-made client into the generator responsible for doing that test, so the generator has to pass the Client back to the caller “somehow,” ideally in a well-typed manner.
This was discussed and rejected way back when yield from was being
designed, as an alternative to attaching the return value to the
StopIteration instance. The argument against it was that it would keep
the return value alive longer than needed by the vast majority of use cases.
My solution to your immediate problem would probably involve yielding
some kind of wrapper object that can contain either a client or
something else.
However, I don’t think I would agree that using a generator like this is
particularly elegant, since figuring out what your code is trying to
accomplish is hurting my brain. I can’t help feeling that there has to
be a better way of approaching the whole thing, but I don’t have enough
of the big picture to say what it is.
On further thought, it occurs to me that, whether we realized it or not, what we’re asking for with “make generator returns first-class” is, essentially, to have something like yield for that doesn’t “infect” whatever it’s used in (by turning it into a generator function).
await coro is basically syntax sugar for yield from coro.__await__(), so what we’re indirectly (accidentally) asking for is native async without await.
Which, yeah, that’d be great, actually, but realistic? If it were that easy, we’d probably have something completely different from async/await, no?
All I wanted coming into this was to loop all the yielded values and capture the returned value at the end of the loop without a ton of boilerplate, to fill the weird gap in between Coroutines, where we only care about the return, and Generators, where we (usually) only care about the yield. I certainly don’t want to relitigate how to design async Python.
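The await / yield from relationship mentioned above can be seen by driving a native coroutine by hand, with no event loop involved. This is only a sketch; `Pause` and `add_one` are made-up names.

```python
class Pause:
    """Hypothetical awaitable: suspends once, then produces a value."""

    def __init__(self, value):
        self.value = value

    def __await__(self):
        yield "paused"     # surfaces to whoever is driving the coroutine
        return self.value  # becomes the result of the `await` expression


async def add_one():
    number = await Pause(41)
    return number + 1


coro = add_one()
first = coro.send(None)  # run up to the yield inside __await__
try:
    coro.send(None)      # resume; the coroutine now runs to its return
except StopIteration as stop:
    result = stop.value  # same delivery mechanism as a generator's return

print(first, result)  # paused 42
```

The coroutine's return value arrives on StopIteration exactly like a generator's does. The difference is that await unwraps it for the caller, whereas a for loop discards it.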
It’s not hard to create a wrapper that will achieve this:
class Returnit:
    def __init__(self, gen):
        self.gen = gen

    def __iter__(self):
        return self

    def __next__(self):
        try:
            return next(self.gen)
        except StopIteration as e:
            # Remember the generator's return value, then let the loop end.
            self.value = e.value
            raise

def mygen():
    for x in (1, 2, "buckle", "my", "shoe"):
        yield x
    return "bubblegum"

r = Returnit(mygen())
for x in r:
    print("Yielded:", x)
print("Returned:", r.value)
% python3 forloop_withreturn.py
Yielded: 1
Yielded: 2
Yielded: buckle
Yielded: my
Yielded: shoe
Returned: bubblegum
%
Can be enhanced to handle send and throw as needed.
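One possible shape for that enhancement, as a sketch only. `SendableReturnit` and `echo` are hypothetical names, and the wrapper assumes it is handed a real generator so that send and throw exist on the wrapped object.

```python
class SendableReturnit:
    """Wrap a generator, forwarding next/send/throw and saving its return value."""

    def __init__(self, gen):
        self.gen = gen

    def __iter__(self):
        return self

    def _step(self, call, *args):
        try:
            return call(*args)
        except StopIteration as stop:
            self.value = stop.value  # capture the return value, then re-raise
            raise

    def __next__(self):
        return self._step(next, self.gen)

    def send(self, value):
        return self._step(self.gen.send, value)

    def throw(self, exc):
        return self._step(self.gen.throw, exc)


def echo():
    received = yield "ready"
    while received != "stop":
        received = yield f"got {received}"
    return "finished"


r = SendableReturnit(echo())
print(next(r))       # ready
print(r.send("hi"))  # got hi
try:
    r.send("stop")
except StopIteration:
    print(r.value)   # finished
```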
You’re not wrong, but that’s even more boilerplate than a while True try except StopIteration block. The issue is not that it can’t be done or that obvious wrappers don’t exist. I actually have a perfect wrapper type already sitting in my app that just needs to be passed in to the generator as an argument.
The issue is that it’s surprising behavior, in the context of how everything else in Python works, and using it is harder than it needs to be. Everything else that returns a value, including Coroutine (paired with await), could be replaced by the value that it returns without changing anything else at the call site. Somehow Generator is the only exception: you have to use specifically a while loop and catch a specific exception, or else invent your own ad hoc Future type just to get the return value—forget about referential transparency—unless you’re running the generator from another generator in a yield from, which gets back to my previous comment.
Chuhan put it succinctly:
The thing is Python does support return statement in a generator function. Then it doesn’t make sense that the return value is not treated as a first-class citizen, and being hidden from the surface. If we figured that it’s pointless to use return statement in a generator function, then it should be well removed completely from the language.
I think Coroutine and Generator are now sufficiently differentiated this could be done, were it decided that non-async coroutines aren’t something that Python needs to support.
Sudden thought: since this all (probably) goes away if for doesn’t eat the StopIteration, maybe the simplest syntax fix is to allow an except (or finally) clause in for statements? Then, the programmer can easily extract the return value from the exception, and the runtime doesn’t have to deal with unnecessary references to the return value in the majority of cases where it’s not going to be used (because the programmer didn’t add an except clause). The semantics of the else clause don’t change, nor how the loop variable is bound, no new bits in the for statement itself (like as y), and the only new bit is already familiar to Python programmers. (It’s still not ideal in terms of referential transparency, but it would be easy to grok and use, I think?)
There was another discussion about what a hypothetical “loop control object” could do.
for x in iterable as loop1:
    ...
Storing the StopIteration value could be one of its uses. Two other simple ones (also usable in while loops) are: an iteration counter (like enumerate) and a boolean “break command executed” flag. There were also several other ideas.
None of these features alone is worth a special syntax. But maybe they together could shift the balance.
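Without new syntax, part of this can be approximated today with a wrapper. The sketch below is only an emulation under my own assumptions: `LoopControl` and its attribute names are invented. It covers the iteration counter and the StopIteration value, and the "break executed" flag can only be inferred from the loop not having been exhausted.

```python
class LoopControl:
    """Hypothetical stand-in for a loop control object, built as a wrapper."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self.count = 0          # number of items handed to the loop body
        self.exhausted = False  # True once the underlying iterator finished
        self.result = None      # StopIteration.value, if the iterator finished

    def __iter__(self):
        return self

    def __next__(self):
        try:
            item = next(self._it)
        except StopIteration as stop:
            self.exhausted = True
            self.result = stop.value
            raise
        self.count += 1
        return item


def generator():
    for i in range(3):
        yield i
    return -1


loop1 = LoopControl(generator())
for x in loop1:
    pass
print(loop1.count, loop1.exhausted, loop1.result)  # 3 True -1

loop2 = LoopControl(generator())
for x in loop2:
    if x == 1:
        break
print(loop2.count, loop2.exhausted, loop2.result)  # 2 False None
```

Note that `not loop2.exhausted` is only a proxy for "break was executed": an exception escaping the loop body would leave the same state.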