Using unpacking to generalize comprehensions with multiple elements

alexprengere · October 3, 2023, 4:14pm

For a while we have been able to use the unpacking syntax when building iterables:

a = [1, 2, 3]
b = [4, 5, 6]

[*a, *b]  # [1, 2, 3, 4, 5, 6]
{*a, *b}  # {1, 2, 3, 4, 5, 6}

A common pattern when refactoring code is to transform simple loops into comprehensions:

from collections import namedtuple

Flight = namedtuple("Flight", ["departure", "arrival"])

f1 = Flight("PAR", "NYC")
f2 = Flight("LON", "MAD")
flights = [f1, f2]

# Not so good
points = []
for f in flights:
    points.append(f.departure)

# Better
points = [f.departure for f in flights]

But in the case where you need to extract more than one value in each loop iteration, this no longer works as well:

points = []
for f in flights:
    points += (f.departure, f.arrival)

from itertools import chain

# Without relying on itertools.chain, no way to make this into a comprehension
points = list(chain.from_iterable((f.departure, f.arrival) for f in flights))

I was thinking that we could update the syntax to allow for:

points = [*(f.departure, f.arrival) for f in flights]

The unpacking nicely mirrors the [*a, *b] syntax where you expand one iterable.
As this is currently a syntax error, I believe this would be backward compatible, but I might have missed something.

Edit: sorry, I just saw "Why can't iterable unpacking be used in comprehension?", which looks closely related.

zware · October 3, 2023, 4:36pm

You can nest another loop in the comprehension:

points = [leg for f in flights for leg in (f.departure, f.arrival)]

It’s a bit less than ideal with an extra tuple creation and loop, but works. Edit: although, on another look I realize that the tuple creation was already in the original, and there’s an implicit loop in the unpacking anyway. So maybe this isn’t too bad, just a bit unintuitive.

And the original still works as well.
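
For comparison, here is a small sketch that runs the three spellings discussed so far side by side and checks that they agree. It reuses the Flight example from the first post; the names loop_points, chain_points and nested_points are only illustrative.

```python
from collections import namedtuple
from itertools import chain

Flight = namedtuple("Flight", ["departure", "arrival"])
flights = [Flight("PAR", "NYC"), Flight("LON", "MAD")]

# 1. Explicit loop, extending the list with a tuple on each iteration
loop_points = []
for f in flights:
    loop_points += (f.departure, f.arrival)

# 2. itertools.chain.from_iterable over a generator of tuples
chain_points = list(chain.from_iterable((f.departure, f.arrival) for f in flights))

# 3. Nested loop inside a single list comprehension
nested_points = [leg for f in flights for leg in (f.departure, f.arrival)]

print(nested_points)  # ['PAR', 'NYC', 'LON', 'MAD']
```

All three produce the same flat list, so the choice between them is mostly about readability.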

redhog · October 4, 2023, 12:35pm

This is, however, a pretty unintuitive syntax in Python, and many people get it wrong, while * and ** expansion are obvious…

alexprengere · October 4, 2023, 1:27pm

After some digging, it turns out this syntax was considered as part of PEP 448, in the variations section. Quoting that section:

Earlier iterations of this PEP allowed unpacking operators inside list, set, and dictionary comprehensions as a flattening operator over iterables of containers:

>>> ranges = [range(i) for i in range(5)]
>>> [*item for item in ranges]
[0, 0, 1, 0, 1, 2, 0, 1, 2, 3]
>>> {*item for item in ranges}
{0, 1, 2, 3}

This was met with a mix of strong concerns about readability and mild support. In order not to disadvantage the less controversial aspects of the PEP, this was not accepted with the rest of the proposal.
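
As a point of reference, the flattening shown in the quoted PEP example can be written with syntax that is already valid. This is only a sketch of the existing alternatives, not something taken from the PEP; the names flat_list, flat_set and flat_chain are made up for illustration.

```python
from itertools import chain

ranges = [range(i) for i in range(5)]

# Nested comprehension: the inner "for" plays the role of the proposed *item
flat_list = [x for item in ranges for x in item]
flat_set = {x for item in ranges for x in item}

# itertools equivalent for the list case
flat_chain = list(chain.from_iterable(ranges))

print(flat_list)  # [0, 0, 1, 0, 1, 2, 0, 1, 2, 3]
print(flat_set)   # {0, 1, 2, 3}
```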

Since this PEP was written and accepted 10 years ago, and most people are now familiar with the syntax it introduced at the time, I wonder if things would go differently today for that shelved part.

Rosuav · October 4, 2023, 1:41pm

That’s why PEPs are retained as historical documents! We can go back and look at it, and consider revisiting things. The arguments against this part of the proposal are laid out there: readability. Basically that’s all. So in order to bring it up for renewed discussion, someone has to be willing to champion it, and show that (a) it’s actually pretty readable, and/or (b) the value of it is enough to justify adding it.

Don’t forget that, for every new piece of syntax, there are costs; various tools need to understand them, other Python implementations have to support them, future Python developers have to handle more situations, etc.

So, what are the use-cases for this? One very common line of argument is “here’s how this could be used in the Python standard library”. If you do a bit of research into that, you might find some worthwhile transformations. Post those here, showing “how it now is” vs “how it could be with comprehension unpacking”, and if you don’t have enough examples from there, poke through some other popular libraries or major codebases.