Add optional `if break:` suite to `for/while`

zhangyx · November 27, 2024, 4:08pm

But they will be easier to type and look less like a variable name.

jamestwebber · November 27, 2024, 4:20pm

That’s true, one would have to use if not break if you wanted a stand-alone equivalent.

Basically the syntax for for/else remains entirely unchanged, and the soft keyword break after a loop allows for testing additional conditions in a generic way.

MegaIng · November 27, 2024, 4:41pm

It would not be a soft keyword, it would continue to be a hard keyword, since e.g. break = ... would not be valid. It would be similar to __debug__, None, False, True.

A soft keyword only makes sense in a position similar to if, match, case, i.e. introducing a statement. A soft keyword can never reasonably be inside an expression since it would never be clear if it’s a keyword or variable name.

jamestwebber · November 27, 2024, 4:43pm

Ah, sorry I didn’t understand the definitions properly.

NeilGirdhar · November 27, 2024, 5:47pm

Typing typically doesn’t matter when it comes to code because code is read many, many more times than it’s written.

I understand this concern, but I think (ubiquitous?) syntax highlighting will distinguish these enough until your mind gets used to a keyword with an underscore in it.

mikeshardmind · November 27, 2024, 5:56pm

I assume this would be tracked per scope appropriately, and that a break or lack thereof in do_some_stuff_no_matter_what() can’t change the outcome. If so, I think it’s fine. It would essentially be turning the cases I already use a “break check” variable into a syntactic feature.
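For reference, the “break check” variable pattern in current Python looks roughly like this. The names first_match, items and predicate are placeholders, not taken from the thread:

```python
def first_match(items, predicate):
    """Return (found, item), using an explicit flag to remember the break."""
    broke = False          # the hand-written "break check" variable
    item = None
    for item in items:
        if predicate(item):
            broke = True   # has to be set by hand right before the break
            break
    # Code placed here runs whether or not the loop broke.
    if broke:
        return True, item
    return False, None
```

The flag is only ever written inside the function that owns the loop, which is presumably the kind of per-scope tracking a syntactic version would need to match.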

jamestwebber · November 27, 2024, 6:06pm

I’d definitely want scope tracked appropriately but I haven’t thought that deeply about what that means in complicated scenarios. It might turn out to be unworkable.

hprodh · December 2, 2024, 8:53am

There might be a lot of code that already uses the variable name “on”, so I think on break could be error prone.
Additionally, imagine the worst debugging hell possible when your eyes miss a misplaced indentation at the line where you use if break… I think this is clearly not a good idea either.

xitop · December 2, 2024, 8:41pm

Please allow another proposal: an optional “loop object” holding data related to a for/while loop. For the discussed functionality we need just a single boolean flag. Examples:

# if_break example
for x in xlist as searchloop:
    if cond(x):
        break
if searchloop.break_executed:
    print(f"found {x}")
else:
    print("not found")


# example 2 (more compact than if_break)
for x in xlist as xloop:
    if cond(x):
        break
do_something(x if xloop.break_executed else None)


# enumerate example (unrelated)
for item in sequence as myloop:
    print(f"index={myloop.counter}, value={item}")

(update: added example #2)

Lucas_Malor · December 2, 2024, 9:05pm

My two cents: I agree with people that find if_break can be confused with a variable. I prefer ifbreak, onbreak, fubar or whatever.

Anyway it’s just a detail, I like the idea.

seberg · December 2, 2024, 9:07pm

Shouldn’t have mentioned it on break with a space, even if I don’t quite follow what the problem would be since break is a keyword.

I still think except break: would be the best choice anyway, even if I happily admit that at first sight it seemed odd to me. And while some other ideas feel more obvious in some ways, except break seems to do a good job in all aspects to me, with little serious failure (including except break as value making sense if one wants it eventually).

ptmcg · December 2, 2024, 11:41pm

+1 on the original proposal, after changing if break to if_break (though I find the introduction of an “_”-containing keyword visually jarring, and would prefer ifbreak)
+1 for the happy serendipity that while/[ifbreak]/[else] and for/[ifbreak]/[else] can make for a nice clarification on what the heck that else thing is after the while or for, especially when currently the else is just dangling there, and code in the loop before break gets moved into the ifbreak block helps the reader to remember “oh yeah, that else is there for when we didn’t break”
+1 for the OP’s reserve in proposing the simplest change that supports this kind of code restructuring; if the loop has multiple breaks under different conditions, then this calls for preserving those conditions in some state variable, not bloating this simple syntax
-1 for replacing or synonymizing the else here with ifnotbreak or similar; as mentioned elsewhere, for-else and while-else have been there since the dawn of Python time, and have been a source of confusion for just about as long I expect. But I think the addition of this feature will make the post-loop else more self-evident, just as when writing if condition: ... else: ... in current Python, there is no need for else not condition:.
-0 there is still the need if breaking out from a nested loop to rebreak in all the containing loops; however, this syntax does support just doing ifbreak: break in those outer loops, so at least the throwaway found variable is no longer required.
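
To illustrate the last point, here is how breaking out of two nested loops is usually written today, once with a throwaway flag and once with the for/else idiom. Function and variable names are illustrative only:

```python
def find_with_flag(grid, needle):
    """Locate `needle`, using a throwaway flag to re-break the outer loop."""
    found = False
    position = None
    for r, row in enumerate(grid):
        for c, elem in enumerate(row):
            if elem == needle:
                found = True
                position = (r, c)
                break
        if found:          # re-break in the containing loop
            break
    return position


def find_with_else(grid, needle):
    """Same search, using for/else so that no flag is needed."""
    position = None
    for r, row in enumerate(grid):
        for c, elem in enumerate(row):
            if elem == needle:
                position = (r, c)
                break
        else:
            continue       # inner loop finished without break: next row
        break              # inner loop broke: leave the outer loop too
    return position
```

Under the proposal as described in this thread, the `if found: break` line would presumably become `ifbreak: break` after the inner loop, with no flag to maintain.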

tim.one · December 3, 2024, 10:04pm

Ya, I’m not trying at all to address that here, although, as you note, it is a little simpler if there are “few” nesting levels involved.

But breaking is only part of that puzzle space: there’s also, at times, a need to continue at an outer nesting level.

A more general approach (which I’m not at all adding to the current proposal) would be for break and continue to grow an optional new “label”, and allow for labelling loop statements.

while dat := buffer.read(1024) label getdata:
    for b in dat:
        if b == IGNORE_RECORD:
            continue getdata
        if b == FINAL:
            break getdata
if_break:
    # do stuff unique to a `dat` block containing FINAL

Is that realistic? I sure hope not

jamestwebber · December 3, 2024, 10:11pm

Might as well introduce a new keyword for that syntax, perhaps goto…?

jamestwebber · December 4, 2024, 12:30am

I was kidding. But I do think a labeled break is almost the same thing, and shouldn’t be encouraged for similar reasons.

dg-pb · December 4, 2024, 5:20am

From the class of more general extensions, this is my new favourite.

The main benefit is of course that this is a more general interface to the loop that can be used for many different things.

Given the potential capabilities of such an extension, I would say the syntax is justified.

Slightly verbose, but can just use shorter names:

for x in xlist as lp:
    if cond(x):
        break
if lp.broken:
    print(f"found {x}")
else:
    print("not found")

hprodh · December 4, 2024, 8:24am

This seems brilliant!
It fuses the loop and context manager concepts, yielding automatic flags, implicit access to loop values, and generalized exception management over the whole loop.
Some builtin class (e.g. LoopContextManager) would be made for this, maybe implementing e.g. __on_break__, __on_continue__, __on_end__, __on_next__ alongside the existing __enter__ and __exit__ methods.
All overridable by subclassing!
This could remove a lot of redundant loop-management code, I guess.

Of course it is already possible to do this in some way:

for i in (LCM:= LoopContextManager(range(N))):
    ...

Yet it might imply non-standardized method names.
Also, the following usage could simplify a lot of code:

for i in range(N) as gatherloop(gather=['data_1', 'data_2']):
    data_1 = some_operation(i)
    data_2 = some_other_operation(i)
gathered_data = (gatherloop['data_1'], gatherloop['data_2'])  # lists of the consecutive values of the flagged variables at the end of every iteration

thus sometimes removing a lot of lines such as data_1_l = [] before the loop and data_1_l.append(data_1) inside it.
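
For comparison, a sketch of how the same gathering can be written in current Python without the per-variable append lines. some_operation and some_other_operation are stand-ins defined here only so the example runs:

```python
def some_operation(i):
    return i * 2          # placeholder computation


def some_other_operation(i):
    return i + 100        # placeholder computation


N = 4

# Collect both per-iteration values as tuples, then split them into two lists.
rows = [(some_operation(i), some_other_operation(i)) for i in range(N)]
data_1_l = [row[0] for row in rows]
data_2_l = [row[1] for row in rows]
```

This only works when the loop body is simple enough to fold into a comprehension, which seems to be the limitation the gatherloop idea is trying to remove.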

Also, while not directly related, the issue addressed by the original proposal finds another solution, as well as other non-addressed issues:

for i in range(N) as loop(on_break=func_1, on_end=func_2, on_except=func_3, on_continue=func_4, on_next=func_4, on_iter=func_5, gather_vars=[varname1, varname2]):
    ...

(Here I wrote both on_next and on_iter, because I think one can be run at the start and the other at the end of each loop iteration.)

dg-pb · December 4, 2024, 9:15am

Current else has a meaning of “if loop has run its complete course uninterrupted”. And although many people are struggling with it, the concept is IMO pretty solid.

But, if_break introduces certain inaccuracy in semantics.

def func():
    for i in range(5):
        return None
    if_break:
        do_sth()
    else:
        # IF NO BREAK
        do_sth_else()

So else semantically indicates that no break has happened. However, it will not execute when the loop is exited by return, even though no break has happened in that case either.
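
The same asymmetry already exists with today’s for/else and can be observed directly. A small sketch; the events lists are only there to record which branches ran:

```python
def returns_from_loop(events):
    for i in range(5):
        return None             # leaves the loop without a break
    else:
        events.append("else")   # skipped: the function has already returned


def runs_to_completion(events):
    for i in range(5):
        pass
    else:
        events.append("else")   # runs: the iterator was exhausted


events_a, events_b = [], []
returns_from_loop(events_a)
runs_to_completion(events_b)
print(events_a, events_b)  # [] ['else']
```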

Great idea - one can already play with this concept to get a feel for it:

class LoopManager:
    def __init__(self, it):
        self.it = iter(it)
        # Assume interrupted until the iterator is seen to be exhausted.
        self.interrupted = True

    def __iter__(self):
        return self

    def __next__(self):
        try:
            return next(self.it)
        except StopIteration:
            # Normal completion: the loop consumed every item.
            self.interrupted = False
            raise

for i in (lcm := LoopManager(range(10))):
    if i > 5:
        break
if lcm.interrupted:
    print('A. BROKEN')
else:
    print('A. FINISHED')


for i in (lcm := LoopManager(range(10))):
    pass
if lcm.interrupted:
    print('B. BROKEN')
else:
    print('B. FINISHED')

# A. BROKEN
# B. FINISHED

encukou · December 4, 2024, 9:54am

That’s a powerful concept, one we arrived at in a brainstorming session coming from a different direction.
Consider wanting to process some data in a loop, and reporting individual failures in an ExceptionGroup:

errors = []
for value in data:
    try:
        process(value)
    except Exception as exc:
        exc.add_note(f'handling {value}')
        errors.append(exc)
if errors:
    raise ExceptionGroup('processing failed', errors)

This feels like the kind of situation where you could lift the error-handling logic into a generic context manager, but, AFAIK the usage for an ExceptionGroupBuilder context manager can’t look much better than the following. You need an “outer” context to know when the ExceptionGroup should be raised, and an “inner” context to capture individual exceptions.

with ExceptionGroupBuilder(ExceptionGroup, 'processing failed') as builder:
    for value in data:
        with builder(value):
            process(value)

If we had a fused iterator and context manager, that could be:

for value with ExceptionGroupBuilder(data, ExceptionGroup, 'processing failed'):
    process(value)

or

for value in data with ExceptionGroupBuilder(ExceptionGroup, 'processing failed'):
    process(value)

Half of the inconvenience – needing an inner context – often occurs with unittest’s subTest, where you often see with in a for.
And solving the other half – the outer context – would allow:

for line with open('file.txt'):

with the file closed even on break or return.
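
For comparison, the current spelling needs the with wrapped around the for to get the same guarantee. A minimal sketch; first_nonblank_line and the temporary file are made up for the demo:

```python
import os
import tempfile


def first_nonblank_line(path):
    """Return the first non-blank line, or None if there is none."""
    with open(path) as f:                 # outer context owns the file
        for line in f:
            if line.strip():
                return line.rstrip("\n")  # early return still closes f
    return None


# Demo on a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("\n\nhello\nworld\n")
try:
    result = first_nonblank_line(tmp.name)
finally:
    os.remove(tmp.name)
```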

For the OP use case, we might have:

for value with data as searchloop:
    ...
if searchloop.interrupted:
    ...

which would try to call a new __with_iter__, and if that didn’t exist it would default to giving you a “loop info/control” object.
And maybe that could allow breaking/continuing outer loops:

for row with rows as row_loop:
    for elem with row as column_loop:
        if elem == 'needle':
            row_loop.interrupt()

But, that’s now a very different idea…

dg-pb · December 4, 2024, 10:28am

All below seem to put a lot of weight and complexity on syntax, while it can be done quite elegantly without it:

exc_group = ExceptionGroup('processing failed', [])
for value in data:
    exc_group.capturing(lambda: process(value))
if exc_group:
    raise exc_group

Or even:

with ExceptionGroup('processing failed', []) as exc_group:
    for value in data:
        exc_group.capturing(lambda: process(value))

I can’t see any need to mix context managers with loops.
Unless there is something that I am missing here?

I would not use with here…
But yes, this path of introducing internal LoopManager object would pave the way for extensions such as this.

Maybe:

for row in rows as row_loop:
    for elem in row as column_loop:
        if elem == 'needle':
            break row_loop

Such that:

def break(loop=None):
    if loop is None:
        loop = get_inner_loop()
    loop.interrupt()