Support unchecked iterables as tuple assignment sources

Gouvernathor · February 22, 2024, 11:43pm

It would basically allow turning the following syntax:

A, B, C = range(3)

which is a no-boilerplate version of an IntEnum and (in my experience) commonly used, into something like the following:

A, B, C = itertools.count()
# or rather
A, B, C, *_ = itertools.count()

The problem is that the first example raises an exception because (I imagine) count() does not raise StopIteration after 3 values have been emitted, and the second example obviously freezes to death trying to exhaust the count in order to fill the catch-all variable.
Obviously we can’t change the meaning of the first syntax because a lot of code relies on this raising, and we can’t change the second for even more obvious reasons.
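
For concreteness, here is a minimal sketch of the current behaviour as I understand CPython's unpacking: the failing assignment raises ValueError and, as a side effect, pulls one item beyond the three targets. The name `counter` is only illustrative.

```python
import itertools

counter = itertools.count()
try:
    A, B, C = counter  # three targets, endless source
except ValueError as exc:
    message = str(exc)  # complains about too many values to unpack

# The unpacking fetched a fourth item (3) to check for exhaustion,
# so in CPython the next value handed out is 4, not 3.
following = next(counter)
```

So even the failing form is not side-effect free on the source iterator.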

A possible syntax would be this:

A, B, C, * = itertools.count()

It would just instruct the interpreter not to check whether the iterable raises StopIteration after C gets a value, and instead leave it alone.

Thoughts ?

kknechtel · February 22, 2024, 11:59pm

Currently, one can write

A, B, C = itertools.islice(itertools.count(), 3)

But I agree that this is very clumsy for the task. It requires a library import if our source iterable isn’t already from itertools (one might want, for example, to “unpack” the first few lines of a file); it requires counting the unpack variables; and it requires another level of nesting.

Gouvernathor · February 23, 2024, 12:15am

Oh I posted this in the wrong part of the site for some reason, I wanted to go to Ideas. Is it possible to move the thread ?

Well, on its face, it’s just a longer version of the range(3) syntax. I understand the other possible uses of that feature (and the more, the merrier).
My base goal though was to remove the hardcoded 3 in the original line, which allows adding elements to the unpack (= values to the enumeration) without having to update that value, and also makes merge conflicts easier.

tjreedy · February 23, 2024, 1:12am

Since target lists are not objects, it is hard to count them at runtime. The following simulates doing so

for name, value in zip(('A', 'B', 'C'), (1,2,3,4,5)):
    globals()[name] = value
    
print(A, B, C)
# 1 2 3

but this ‘cure’ seems to me worse than the ‘disease’ of the user having to count. It only works for globals since compiling functions works best when the compiler can list and count targets.

kknechtel · February 23, 2024, 2:22am

Right; hence the desire for a syntactic feature that effectively counts them at compile time. Or rather, a syntactic feature that would grant access to a variant on the UNPACK_SEQUENCE opcode that doesn’t care if the unpacked iterable has more elements.
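
As a rough illustration of "counted at compile time": in CPython the number of targets ends up as the argument of the UNPACK_SEQUENCE instruction. The sketch below only inspects bytecode with the standard `dis` module; the source string and the names `code` and `unpack_args` are made up for the example.

```python
import dis

# Compile a plain tuple assignment without executing it.
code = compile("A, B, C = source", "<example>", "exec")

# Collect the argument of every UNPACK_SEQUENCE instruction.
unpack_args = [
    ins.argval
    for ins in dis.get_instructions(code)
    if ins.opname == "UNPACK_SEQUENCE"
]
```

Here `unpack_args` holds the target count the compiler baked in, which is the number a non-checking variant of the opcode would presumably reuse.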

chepner · February 23, 2024, 3:41pm

I think it creates a special case that looks very similar to something with a very different meaning, just so you can define a large number of variables that should probably be keys in a dict instead.

constants = dict(zip(names, values))

ntessore · February 23, 2024, 3:51pm

+1 for using islice(), as this is the more general solution where you can keep the remaining iterator around for further processing
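
A small sketch of that point, using an in-memory file (`io.StringIO`) as a stand-in for a real one; the contents and variable names are invented for illustration.

```python
import io
from itertools import islice

# Stand-in for an open text file; iterating yields one line at a time.
lines = io.StringIO("title\nauthor\ndate\nbody 1\nbody 2\n")

# Unpack exactly three lines; islice stops without reading a fourth.
title, author, date = islice(lines, 3)

# The underlying iterator is still usable for the remaining lines.
rest = list(lines)
```

The hardcoded 3 is still there, which is the objection raised earlier in the thread, but nothing past the third line is consumed.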

Gouvernathor · February 23, 2024, 5:22pm

You mean that the current implementation of tuple assignment first exhausts the iterator, then computes the shape (and number of elements) of the target ?
But in any case, I think the counting of targets can be done from the AST, so I don’t understand why it would be hard to access that data at runtime.
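
To show what I mean by counting from the AST, here is a minimal sketch with the standard `ast` module. It only demonstrates that the count is available when the source is parsed, not how the interpreter would use it; the source string is made up.

```python
import ast

tree = ast.parse("A, B, C = source")
assign = tree.body[0]          # the ast.Assign node
target = assign.targets[0]     # an ast.Tuple holding the target names
n_targets = len(target.elts)   # number of names on the left-hand side
```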

jamestwebber · February 23, 2024, 5:42pm

Is the functional syntax for an IntEnum really that much more onerous?

E = IntEnum('E', ['A', 'B', 'C'])  # <- easy to extend!

The only downside is that you have to type E.A (or whatever name you use) instead of A, which doesn’t seem like a huge problem and possibly would help readability in the long term.

alicederyn · February 24, 2024, 1:05pm

Not the only downside. They no longer behave like simple integers, they have a different type, they print differently, they break reasonable C code.

kknechtel · February 24, 2024, 1:52pm

The point, as I understood it, is to be able to support partially unpacking an arbitrary iterable (for example, to read the first few lines of a file and store them in separate named variables). It is not simply about assigning increasing integers 0..N-1 to some variables (or otherwise setting up useful names for those integers).

mikeshardmind · February 24, 2024, 2:11pm

Do we really need more magic in syntax? Even if we assume there is value in this beyond the motivating example, itertools.islice exists for this, as was pointed out above, and the benefits are “saving” a relatively common import and not specifying the number of assignment targets. You can also write it without islice if the import is that bothersome:

it = iter(some_iterator)
a, b, c, d = (next(it) for _ in range(4))

Without other examples of why this would be important enough to justify new syntax, we are left with the motivating example: numeric constants without an IntEnum, without caring about specific names being bound to specific numbers (presented equivalently to enum.auto()), while minimizing git diffs. I can’t think of a time when I cared about a constant actually being an int without also caring that its value matched what was expected (FFI use). Even so, this is also possible another way:

from types import SimpleNamespace

names = ["a", "b", "c", "d"]  # only line that changes, and can be modified to not use enumerate if you need to care about the actual numeric values
NumericConstants = SimpleNamespace(**{name: num for num, name in enumerate(names)})

Or even more simply (and the way I’d do it myself), just add a line for each constant:

a = 0
b = 1
c = 2
# what do we do when adding one?
d = 3

and import it if that gets too noisy from something like a file named constants.py

alicederyn · February 24, 2024, 2:14pm

with open(...) as f:
  header_line = next(f)
  separator_line = next(f)

I think I’d prefer to see this for that example. I don’t think the proposed new syntax makes it particularly clear that iteration stops at the , * — it looks too much like the syntax for continuing to consume values.

jamestwebber · February 24, 2024, 3:13pm

This enum usage was presented as the motivation for a much more general syntax change. There might be other use cases but they weren’t in the OP.

Gouvernathor · February 26, 2024, 8:48am

I had no idea that existed ! It looks great ! Though it doesn’t solve all of the issues.

Yes it was, for me at least.

Yup. And I don’t care that much that the objects aren’t real ints when using enum, it’s more that people have to understand how enum works at least in a basic way in order to understand the code, and that it requires more overhead. Even with the functional example given, I’m not sure all IDEs would highlight the string which defines a constant when you put the cursor on it, or take the string into account when renaming the constant, and so on.
Your solution of one line per constant adds git diff issues when inserting a constant in the set, which happened quite a few times in my experience (since you have to renumber all the following constants). That’s why I went with range in the first place.

alicederyn · February 26, 2024, 10:17am

They are real ints, but they’re instances of a subclass of int.

Why not just use a regular enum then?

I don’t understand what you mean here.

Gouvernathor · February 28, 2024, 12:42pm

GET_A_DOG = 0
GIVE_A_TREAT = 1
PET_THE_DOG = 2

Now, say I want to insert a new constant for buying the treat, between the state of getting the dog and the state of giving the treat. The result is this:

GET_A_DOG = 0
BUY_A_TREAT = 1
GIVE_A_TREAT = 2
PET_THE_DOG = 3

You see that all the lines starting from the insertion get a diff, because every single constant after the insertion point has to have its hardcoded value updated.
Whereas when using the range recipe, you just have to add BUY_A_TREAT, and change range(3) to range(4).

Axe319 · February 28, 2024, 1:01pm

I apologize if I’m missing something, but isn’t this the use case for enum.auto?

Gouvernathor · February 28, 2024, 1:07pm

Yes, if you put all these lines inside a class S(enum.IntEnum): along with an import. But as I said earlier, Enums are a tough thing to learn how to make work:

Daverball · February 28, 2024, 1:15pm

It doesn’t need to be an enum.IntEnum in order for enum.auto() to work. The enum module is well documented, I don’t think it is that difficult to learn or teach for that matter. Enum is a pretty common data structure in programming languages, the only potential trip-up is that it’s implemented using a metaclass in Python rather than being a builtin type with its own syntax/keyword.
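
A minimal sketch of what that looks like with a plain Enum, reusing the constant names from earlier in the thread; the class name `Step` is just a placeholder.

```python
import enum

class Step(enum.Enum):
    GET_A_DOG = enum.auto()
    BUY_A_TREAT = enum.auto()   # newly inserted; no other line changes
    GIVE_A_TREAT = enum.auto()
    PET_THE_DOG = enum.auto()

# Values are assigned in definition order.
values = [step.value for step in Step]
```

Note that auto() starts counting at 1 by default rather than 0, so this is not a drop-in replacement for the range() recipe if the exact numbers matter.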

As far as motivating examples go I think yours is on the weak side. Not that it’s a bad idea in general, but it probably doesn’t come up often enough that islice isn’t a good enough alternative, if you don’t want to exhaust the iterable. Syntax changes need to provide a lot more value in order to be justifiable.