I don’t think it’s worth the cognitive overhead. You would have to start reading every generator expression statement to see if someone wrote if or while. Right now you only have to care if something is for or if after that initial for.
I also don’t think this pattern comes up often enough to warrant the addition. Generator expressions are already a productivity optimization and don’t open up new doors of possibilities, so in this sort of instance I would say just construct the list manually.
Yeah, that’s fair. I’m biased about the cognitive overhead part, because of course what I wrote makes sense to me hah. But I can’t deny that it would only be convenient for a small percentage of folks.
Would it make any sense to allow while in a list (or generator, more importantly) comprehension? For example, you can currently filter an entire list or generator like:
[item for item in iterable if item < 10]
But suppose I wanted to do the equivalent of itertools.takewhile, that is, consume elements only for as long as some condition holds. I feel like it’s succinctly summed up by:
[item for item in iterable while item < 10]
without having to import this function. Given that this function exists already, this idea is probably already a low priority, but I'd be interested to see what people think.
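To make the difference between the two spellings concrete, here is a small sketch using what Python offers today. The sample values in `iterable` are made up for illustration; the point is only that an `if` filter inspects every element, while `itertools.takewhile` stops at the first failure.

```python
from itertools import takewhile

# Hypothetical sample data, chosen so the two results differ.
iterable = [1, 5, 12, 3, 20, 7]

# The `if` clause filters: every element is examined, all that pass are kept.
filtered = [item for item in iterable if item < 10]

# takewhile stops consuming at the first element that fails the test.
taken = list(takewhile(lambda item: item < 10, iterable))

print(filtered)  # [1, 5, 3, 7]
print(taken)     # [1, 5]
```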
I tried this, but it doesn’t work in current Python:
>>> def until(x):
...     if x > 10: raise StopIteration
...     return True
>>> [ a for a in range(20) if until(a) ]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <listcomp>
  File "<stdin>", line 2, in until
StopIteration
It doesn’t, and variants that mess with the iteration in other ways will end up becoming RuntimeError instead. IMO this is a good thing; it’s not possible in an unrolled loop to have a function interrupt the loop, and it would be very surprising if that happened in a comprehension:
def breaker():
    """Calling this function is like having a break statement"""

for a in range(20):
    if a > 10: breaker()
Would you like it if there were code that could be placed in breaker() that would behave as described? It’d be pretty confusing.
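As a quick check of the behaviour described above, the sketch below runs the `until` trick in both a list comprehension and a generator expression. It assumes Python 3.7 or later, where PEP 479 is always in effect; the variable names `list_outcome` and `gen_outcome` are just for illustration.

```python
def until(x):
    # Attempt to end the iteration early from inside the filter.
    if x > 10:
        raise StopIteration
    return True

# List comprehension: the StopIteration simply propagates as an error.
try:
    [a for a in range(20) if until(a)]
    list_outcome = "no error"
except StopIteration:
    list_outcome = "StopIteration"

# Generator expression: a StopIteration raised inside the generator body
# is converted to RuntimeError (PEP 479).
try:
    list(a for a in range(20) if until(a))
    gen_outcome = "no error"
except RuntimeError:
    gen_outcome = "RuntimeError"

print(list_outcome)  # StopIteration
print(gen_outcome)   # RuntimeError
```

Either way, nothing is silently truncated; the function cannot act as a hidden break.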
So there are basically two approaches available for interrupting a list comprehension part way: changing the iteration, and enabling the break statement in some way. Given the way that a comprehension is stacked, I’m not enthusiastic about the second option, but if someone comes up with a really good syntax for it, my opinion might change. Not holding my breath.
Changing the iteration is what we currently have with takewhile. That’s why I think it makes more sense, if syntax were to be added, to make it a variation of the for loop. Consider, also:
stuff = [a for a in things if cond(a) for _ in range(a)]
stuff = [a for a in things if cond(a) while looper(a)]
In the first example, we definitely expect that the condition is checked once, and the inner loop is entirely inside that check. So what would it mean to have a while loop inside that? I’m not sure it makes the right sort of sense. OTOH, if a for loop can have a condition attached to it, this would simply not be valid, as the only loop is the for.
stuff = []
for a in things while cond(a):
    stuff.append(a)
# or #
stuff = [a for a in things while cond(a)]
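For comparison with the proposed spelling, here is a sketch of how the first comprehension above (the one that is valid today, with an `if` followed by an inner `for`) unrolls. The values of `things` and the body of `cond` are placeholders chosen for illustration; the condition is checked once per `a`, and the inner loop sits entirely inside that check.

```python
# Hypothetical inputs, for illustration only.
things = [1, 2, 3]

def cond(a):
    return a != 2

# Valid today: `if` guards the inner `for`.
stuff = [a for a in things if cond(a) for _ in range(a)]

# The same thing written as nested statements.
unrolled = []
for a in things:
    if cond(a):              # checked once per element of `things`
        for _ in range(a):   # inner loop runs only when the check passes
            unrolled.append(a)

print(stuff)  # [1, 3, 3, 3]
```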
REXX has this kind of concept. A REXX loop is always introduced with the DO keyword, and then it can have any number of clauses after that:
var = initial [to limit] [by increment] [for count]
Your classic “count by numbers” loop is do i = 1 to 10, equivalent to for i in range(1, 11) (REXX is double-inclusive with its range). Saying do i = 1 to 10 by 3 is like for i in range(1, 11, 3). And do i = 1 for 7 will do seven loop iterations and then stop, kinda like using islice on the iterator.
But the concept I want to focus on is do i = 1 to 10 while cond(). It’ll loop, just like any other, but check the condition each iteration. Once the function returns false, the loop will stop.
REXX doesn’t have the idea of “iterate over this collection”, so Python definitely wins there, but IMO it’s worth considering the possibility of adding conditions to the loop itself.
(For completeness: do 10 means “iterate ten times”, and is broadly equivalent to for _ in range(10); and do until cond() is slightly different from a negation of a while loop in that it’s checked at the end of the loop rather than the start - like C’s do-while loop. No direct Python equivalent. There’s also do forever which can’t be combined with other clauses, because a simple do … end will run once - it’s REXX’s equivalent of an indented block in Python, or a pair of braces in C.)
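As a rough illustration, the REXX clauses mentioned above can be approximated with existing Python tools. This is a loose mapping rather than an exact translation, and the `cond` function here is hypothetical (it takes the loop variable as an argument, which the REXX `cond()` in the examples does not).

```python
from itertools import count, islice, takewhile

# do i = 1 to 10          (REXX ranges include both endpoints)
one_to_ten = list(range(1, 11))

# do i = 1 to 10 by 3
by_three = list(range(1, 11, 3))

# do i = 1 for 7          (seven iterations, then stop)
for_seven = list(islice(count(1), 7))

# do i = 1 to 10 while cond()
def cond(i):
    # Hypothetical condition, made up for this example.
    return i * i < 30

with_while = list(takewhile(cond, range(1, 11)))

print(by_three)    # [1, 4, 7, 10]
print(for_seven)   # [1, 2, 3, 4, 5, 6, 7]
print(with_while)  # [1, 2, 3, 4, 5]
```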
It’s been discussed various times on, e.g., the Python-Ideas mailing list. Here’s one example that I stumbled across while searching for an unrelated matter.
“Use takewhile, a while-loop, or a for-loop with a break.”
Sure, we can do that, but that reasoning equally applies to regular list comprehensions as well. We added comprehensions as a more readable, easier to use, alternative to imperative style for-loops and functional style map() and filter(). The same applies here: the only difference is that takewhile loops are a bit less common than filter.
“Comprehensions have a correspondence to nested for- and if-statements, and this would break the correspondence.”
No, it doesn’t break the correspondence, it merely modifies it to include a term that corresponds to something spelled differently.
A comprehension like [expr for x in seq if cond] maps neatly to nested for- and if-blocks:
accumulator = []
for x in seq:
    if cond:
        accumulator.append(expr)
except that the whole thing is buried inside a hidden function. Changing the if to a while would be just a small modification:
# [expr for x in seq while cond]
accumulator = []
for x in seq:
    # NOT "while cond"
    if not cond: break
    accumulator.append(expr)
We still have a correspondence, with a change of spelling of “while” to “if not … break”.
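Here is a runnable sketch of that correspondence, with concrete stand-ins for `seq`, `cond` and `expr` (all three are assumptions picked for the example). Since the `while` clause itself is not valid Python, the loop result is compared against `itertools.takewhile`, which is the closest existing equivalent.

```python
from itertools import takewhile

# Hypothetical stand-ins: cond is `x < 10`, expr is `x * 2`.
seq = [1, 3, 5, 12, 2, 4]

# Proposed: [x * 2 for x in seq while x < 10]
accumulator = []
for x in seq:
    if not x < 10:
        break                  # the "while" becomes "if not ...: break"
    accumulator.append(x * 2)

# What you would write today to get the same result.
with_takewhile = [x * 2 for x in takewhile(lambda x: x < 10, seq)]

print(accumulator)     # [2, 6, 10]
print(with_takewhile)  # [2, 6, 10]
```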
That’s okay. We also have for...else and while...else where the else statement has no connection to the if...else version, so having two meanings of “while” is no big deal.