Why no tuple comprehension?

Marco_Sulla · December 7, 2019, 6:15pm

As title.

PS: When I started to study Python, I expected that (x for x in it) would return a tuple, but it returns a generator.

steven.daprano · December 7, 2019, 11:29pm

Partly due to historical reasons, partly due to lack of need for tuple
comprehensions.

Historical reasons: the first comprehension added to Python was list
comprehensions, using square brackets. Why lists? Because we copied the
feature from Haskell, and that’s what they did.

The second comprehension added was generator comprehensions, using round
brackets (parentheses). It was only many versions later, in Python 3,
that dict and set comprehensions were added, but by then round brackets
were already used for generator comprehensions.

The other reason is that compared to generators, lists, sets and dicts,
needing to create tuples from a loop is comparatively uncommon.
Generators are the most important: if we could only have one kind of
comprehension, it would be generator comprehensions. Lists are probably
the second most useful and common. Tuples are the least: tuples are most
commonly created from a small number of heterogeneous items,
representing a record or struct:

(24, 'word', 2.5)

rather than a long sequence of homogeneous items, like lists, so there
is less need for a tuple comprehension. If you need a tuple, use a
generator comprehension and call tuple:

tuple(expr for name in items if cond)
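
For example, a minimal sketch of the difference (the names `numbers`, `gen` and `even_squares` are made up for illustration):

```python
numbers = range(10)

# Parentheses alone produce a lazy generator, not a tuple.
gen = (n * n for n in numbers if n % 2 == 0)

# Passing the generator expression to tuple() builds the tuple.
even_squares = tuple(n * n for n in numbers if n % 2 == 0)

print(type(gen).__name__)  # generator
print(even_squares)        # (0, 4, 16, 36, 64)
```

When the generator expression is the only argument, the extra pair of parentheses can be dropped, which is why the call reads almost like a comprehension.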

storchaka · December 12, 2019, 8:58am

I agree with everything Steven said, and want to add one more reason. Unlike list, set and dict, tuple is immutable. You cannot extend a tuple by adding an element; you can only create a new tuple, and this makes the complexity of the comprehension quadratic:

result = ()
for item in iterable:
    result += (item,)

It is possible to do this in linear time, but you need to use a temporary list:

result = []
for item in iterable:
    result.append(item)
result = tuple(result)  # convert once, after the loop

It can be written using a list comprehension:

result = tuple([item for item in iterable])

Or, in a more efficient but more obscure form:

result = (*[item for item in iterable],)

A tuple comprehension is less fundamental than a list comprehension and can be expressed using one, and that is already the optimal form. So there is no benefit in adding special syntax for it.

For the same reason there is no frozenset comprehension.
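
To make the comparison concrete, here is a small sketch of the three approaches above wrapped in functions. The function names are made up for illustration, and no timings are shown because they depend on the machine and Python version.

```python
def build_by_concatenation(iterable):
    """Quadratic: every += copies the whole tuple built so far."""
    result = ()
    for item in iterable:
        result += (item,)
    return result


def build_via_list(iterable):
    """Linear: grow a temporary list, then convert it once."""
    result = []
    for item in iterable:
        result.append(item)
    return tuple(result)


def build_via_unpacking(iterable):
    """Linear: unpack a list comprehension into a tuple display."""
    return (*[item for item in iterable],)


data = range(2000)
same = (
    build_by_concatenation(data)
    == build_via_list(data)
    == build_via_unpacking(data)
)
print(same)  # True
```

All three produce the same tuple; only the amount of copying differs.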

Marco_Sulla · December 12, 2019, 8:20pm

No? And what about _PyTuple_Resize()?

asvetlov · December 12, 2019, 8:34pm

This is a private API.
I’m sure you understand the difference well enough.

Marco_Sulla · December 12, 2019, 8:40pm

Not at all. A tuple comprehension could use the private tuple API. Steven’s reasons, even if I do not agree with “tuples do not need comprehensions because they are less used”, make more sense than Storchaka’s.

storchaka · December 13, 2019, 5:59am

How could it help? Do you know what this function does and what its performance characteristics are?

steven.daprano · December 13, 2019, 10:58am

Marco Sulla wrote:

‘even if I do not agree with “tuples do not need comprehensions
because they are less used”’

Which part don’t you agree with?

(1) Tuple comprehensions are less used.

(2) We don’t need special syntax for tuple comprehensions.

I had a quick scan of my code, and I found several hundred list and
generator comprehensions. I found a dozen or so set and dict
comprehensions. It would have been more except that a lot of my code
has to work in Python 2 as well as 3.

And I found exactly two uses of tuple(comprehension).

So for my code, part (1) is certainly true. Less than one percent of
my comprehensions are turned into tuples.

For part (2), I suppose it doesn’t matter whether you or I agree on
whether we need tuple-comprehension syntax. It’s too late: the round
bracket comprehension (…) is used for generator comprehensions.

Marco_Sulla · December 13, 2019, 9:33pm

It’s used by tuple() itself.

They can’t be less used; they do not exist.
What you’re talking about is not a tuple comprehension but tuple(it), that is, the conversion of an iterable to a tuple. That is obviously slower than a real tuple comprehension, because you have to create the iterable first and then make a function call. Remove the iterable creation, and you have a real comprehension.

Yes, I know… I was just wondering if there’s an alternative syntax. Maybe <>?

Anyway, I think benchmarks are clear:

>>> from timeit import timeit
>>> y = range(1000)
>>> timeit("[*y]", globals={"y": y})
11.902503897203133
>>> timeit("tuple(y)", globals={"y": y})
11.917629012838006
>>> y = list(y)
>>> timeit("[*y]", globals={"y": y})
1.9247933391015977
>>> timeit("tuple(y)", globals={"y": y})
1.797155560925603
>>> y = (x for x in y)
>>> timeit("[*y]", globals={"y": y})
0.13140351907350123
>>> timeit("tuple(y)", globals={"y": y})
0.17855184921063483

The time is more or less the same. If Python had a tuple comprehension, we could have tuples instead of lists at the same speed, but with less memory consumption:

>>> import sys
>>> sys.getsizeof(list(y))
56
>>> sys.getsizeof(tuple(y))
40
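
One caveat when reproducing these measurements: a generator can be consumed only once, so once something has iterated over `y` it is empty, and later calls convert an empty iterable. The sketch below rebuilds the generator for every measurement. The helper `fresh` is hypothetical, the run count of 1000 is an arbitrary choice, and the sizes reported by `sys.getsizeof` are specific to CPython and vary by version and platform, so no numbers are shown.

```python
import sys
from timeit import timeit


def fresh():
    """Return a new 1000-item generator (hypothetical helper)."""
    return (x for x in range(1000))


gen = fresh()
first = tuple(gen)   # consumes the generator
second = tuple(gen)  # the generator is exhausted, so this is empty

# Time both conversions with a fresh generator on every run.
list_time = timeit("[*fresh()]", globals={"fresh": fresh}, number=1000)
tuple_time = timeit("tuple(fresh())", globals={"fresh": fresh}, number=1000)

# Compare memory on non-empty containers.
list_size = sys.getsizeof(list(fresh()))
tuple_size = sys.getsizeof(tuple(fresh()))

print(len(first), len(second))
print(list_time, tuple_time)
print(list_size, tuple_size)
```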