Propose Review of Single Element Tuple Handling in Python

chepner · February 1, 2024, 4:13pm

The syntax is weird because there’s no real corresponding mathematical notion of a 1-tuple to draw from. We had to make it up.

In math, there are ordered pairs, ordered triples, etc. that arise from the Cartesian product of 2, 3, etc. sets. But what’s the product of 1 set (or of zero sets, for that matter: what’s an empty tuple)?

Python drew on this idea to define a single tuple type that is essentially more of an immutable list than a representation of mathematical tuples. In some sense, there is no need for either 0-tuples or 1-tuples: the 0-tuple is just None in disguise, and a 1-tuple is just the first element dressed up for dinner. They only really exist so that we can have a single tuple type instead of multiple 2-tuple, 3-tuple, 4-tuple, etc types.

Practically speaking, there are situations where a 1-tuple can be semantically different from its sole element, but that goes hand-in-hand with other design decisions in the type system. (Haskell, for example, does quite well without a single tuple type, and has no real notion of 1-tuples. The lone value of the unit type is effectively the empty tuple, as suggested by the notation () for both the type and its value, but other higher-order types pretty much fill any imaginable need for a 1-tuple.)

peterschay · February 1, 2024, 4:57pm

I completely agree.

It’s unfortunate that a precious punctuation character, the comma, has been used for an unusual, albeit cool and interesting construct, the one-item tuple.

Overall I love Python’s syntax and nearly all the decisions and additions over the years. I even like the walrus!

We have error hints now that often correctly suggest “perhaps you forgot a comma?” – maybe there is a way to detect common mistakes with an extra comma?

sirosen · February 1, 2024, 5:33pm

I’m showing my pedantic streak today, but I’m going to insist on not thinking of it this way.

1, 2 is a tuple in python (as an expression). It’s great! It’s symmetric with an unpacking assignment (statement):

x = a, b
a, b = 1, 2

1, 2 is a tuple because there’s a comma. Parentheses are optional. This is nice [1].

So 1, being a single-element tuple is just a “predictable” degenerate case of the existing comma-based tuple syntax.
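A minimal sketch of the rule described above, using made-up variable names: the comma is what builds the tuple, and parentheses on their own only group.

```python
pair = 1, 2        # the comma builds the tuple; no parentheses required
single = 1,        # the same rule applied to one element
grouped = (1)      # parentheses alone only group: this is the int 1
empty = ()         # the empty tuple is the one form spelled without a comma

a, b = pair        # unpacking mirrors the construction syntax
(only,) = single   # a 1-tuple unpacks the same way

print(type(pair).__name__, type(single).__name__, type(grouped).__name__)
# prints: tuple tuple int
```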

I do agree that the result is unintuitive for beginners and an occasional “gotcha” even for experienced developers. But why it’s there, and how it’s actually consistent with the rest of the language, is not as simple as this phrasing suggests.

We have error hints now that often correctly suggest “perhaps you forgot a comma?” – maybe there is a way to detect common mistakes with an extra comma?

Linting and autofixing isn’t part of the language. That’s the domain of tools – CLIs, IDEs, etc. So no discussion of Python changes is needed.

It’s actually a pretty accessible space. You can write a linter.
Try writing something using the ast package from the stdlib or libcst. (ast is pretty great and easy to use, I recommend actually doing this. libcst is harder to use but also great.)
Then, test your linter: clone some projects – maybe even cpython – and run the linter on them. Are the results good?

Probably you’ll just be flagging tons of valid and correct usage though.
If you can find a rule which is consistently providing value and which linters should implement, but which mainstream tools (e.g., black, ruff, flake8, flake8-bugbear) don’t have today, you can go and advocate for it.
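As a rough starting point for the experiment suggested above, here is a small sketch built on the stdlib ast module. The function name find_bare_one_tuples and the sample source are hypothetical, and the rule itself (flag an assignment whose right-hand side is a one-element tuple written without enclosing parentheses) is only one possible heuristic. It assumes Python 3.8 or later, where ast.get_source_segment exists and a parenthesized tuple's reported location includes its parentheses.

```python
import ast
from typing import List


def find_bare_one_tuples(source: str) -> List[int]:
    """Return line numbers of assignments whose value is a 1-tuple
    that is not wrapped in its own parentheses (e.g. `b = 2,`)."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.Assign, ast.AnnAssign)):
            continue
        value = node.value
        if isinstance(value, ast.Tuple) and len(value.elts) == 1:
            # Recover the exact source text of the tuple expression.
            segment = ast.get_source_segment(source, value) or ""
            # Heuristic: an explicit 1-tuple like `(2,)` starts with "(" and ends with ")".
            parenthesized = segment.startswith("(") and segment.endswith(")")
            if not parenthesized:
                hits.append(node.lineno)
    return sorted(hits)


# Hypothetical sample: only line 3 uses the bare trailing comma.
SAMPLE = "class Foo:\n    a = 1\n    b = 2,\n    c = (3,)\n"
print(find_bare_one_tuples(SAMPLE))
```

A rule this simple will also flag intentional uses such as `fields = "name",`, so running it over real projects is the way to find out whether it gives more signal than noise.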

[1] Debatable. But I like it.

peterschay · February 1, 2024, 5:46pm

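It’s elegant; however, commas do not only create tuples.
For example, tuple(1,) does not build a one-item tuple: inside a call the comma is just a trailing argument separator, so this is the same as tuple(1) and fails with a TypeError.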
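A short sketch of that point, with a hypothetical helper name: inside a call, a trailing comma does not create a tuple, so tuple(1,) passes the int 1 as the single argument, and tuple() rejects it because an int is not iterable.

```python
def try_tuple_call():
    """Call tuple(1,) and return the exception it raises, if any."""
    try:
        return tuple(1,)       # identical to tuple(1): one int argument
    except TypeError as exc:
        return exc


result = try_tuple_call()      # a TypeError instance: int is not iterable
wrapped = tuple((1,))          # the inner (1,) is the 1-tuple; tuple() copies it
from_list = tuple([1])         # any iterable works as the single argument

print(type(result).__name__, wrapped, from_list)
# prints: TypeError (1,) (1,)
```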

This is part of the language, and very nice. Are there people who do not like it? Probably.

$ python -c 'def foo(a=1 b=2)'
  File "<string>", line 1
    def foo(a=1 b=2)
              ^^^
SyntaxError: invalid syntax. Perhaps you forgot a comma?

jamestwebber · February 1, 2024, 5:51pm

I don’t think the error message is part of the language. It’s part of one implementation of the language, but not required to be compliant with the language specification.

Rosuav · February 1, 2024, 6:01pm

Rosuav · February 1, 2024, 6:03pm

peterschay, quoting sirosen:

Linting and autofixing isn’t part of the language.

This is part of the language, and very nice. Are there people who do not like it? Probably.
$ python -c 'def foo(a=1 b=2)'
  File "<string>", line 1
    def foo(a=1 b=2)
              ^^^
SyntaxError: invalid syntax. Perhaps you forgot a comma?

What you’re seeing there is two somewhat distinct pieces of behaviour.

1. It is an error to have def foo(a=1 b=2):
2. Having determined that the code is definitely wrong, what MIGHT you have intended?

The second one is closely related to linting, but linting is done on valid code. There’s very little provision in the language for valid code to produce messages like this (SyntaxWarning is used extremely sparingly, and only for constructs that are almost certainly wrong, often ones that are going to become outright errors in the future).
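To make the three outcomes concrete, here is a small sketch that compiles a snippet and reports whether it is rejected, accepted with a SyntaxWarning, or accepted silently. The helper name classify is made up for illustration, and the `x is 1` case assumes CPython 3.8 or later, which warns about identity comparison with a literal.

```python
import warnings


def classify(source: str) -> str:
    """Return 'error', 'warning', or 'clean' for a piece of source code."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")   # make sure no warning is suppressed
        try:
            compile(source, "<example>", "exec")
        except SyntaxError:
            return "error"                # cannot be parsed at all
    if any(issubclass(w.category, SyntaxWarning) for w in caught):
        return "warning"                  # valid, but almost certainly a mistake
    return "clean"                        # valid; the compiler says nothing


print(classify("def foo(a=1 b=2): pass"))  # not parseable
print(classify("x = 1\ny = x is 1"))       # parseable, but suspicious
print(classify("x = 1,"))                  # parseable and meaningful: a 1-tuple
```

The last case is the one this thread is about: a stray trailing comma produces ordinary valid code, so the compiler has nothing to report.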

peterschay · February 1, 2024, 6:08pm

For sure! Earlier in the thread I was suggesting that it would be nice if a common category of extra-comma errors could be detected, as with the missing comma case.

Rosuav · February 1, 2024, 6:11pm

Ah. To be clear here, I did not mean “error” in the sense of “bug”, but in the much narrower sense of the SyntaxError. It is impossible to parse the token sequence def foo(a=1 b=2) into a valid Python syntax tree. In contrast, python = 1, from your original concern is NOT a syntactic error; it may very well be a bug, but it is perfectly meaningful. Thus it is valid code - code that has a real and reasonable meaning, albeit perhaps not the one you wanted.

peterschay · February 1, 2024, 6:22pm

Good point. It’s not possible to detect common problems on the spot, since it’s valid code. There are often exceptions very closely related to the problem, but then that probably is entering the domain of linters.
If it were possible to augment an exception message for something like this without much risk of being incorrect, that would be great – but I see it’s not as simple. I suppose it’s already too late, when the exception happens:

>>> class Foo:
...   a=1
...   b=2,
...   c=3
... 
>>> x = Foo()
>>> n = x.a + x.b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'tuple'

chepner · February 1, 2024, 6:54pm

Precious? There are a number of places where commas are used without defining a tuple, one-item or otherwise. Parentheses are used to disambiguate these. 1, vs (1,) is more like 3+5 vs (3+5): only necessary in certain contexts, but harmless in others.

[1,2]: a list with two elements
[(1,2)]: a list with one element
print(1,2): a function call with two arguments
print((1,2)): a function call with one argument.
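The four cases above can be checked directly. This is only an illustrative sketch; count_args is a hypothetical helper that reports how many positional arguments it received.

```python
def count_args(*args):
    """Return the number of positional arguments passed in."""
    return len(args)


two_items = len([1, 2])          # comma separates list elements
one_item = len([(1, 2)])         # parentheses make the pair a single element
two_args = count_args(1, 2)      # comma separates call arguments
one_arg = count_args((1, 2))     # parentheses make the pair a single argument

print(two_items, one_item, two_args, one_arg)
# prints: 2 1 2 1
```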

Rather, I think self-imposed grammar restrictions made it impossible to find a suitable set of explicit delimiters for tuples, relegating them to the parentheses-optional form that we do use. (The only pair of matched delimiters available without going to Unicode is <...>, and using them raises issues, for example, of identifying < as the opening of a tuple or as a comparison operator.)

peterschay · February 1, 2024, 8:56pm

Thank you and everyone for replying to my ideas/questions.

That’s why punctuation and other symbols are “precious” - there are a limited number of them and using symbols while balancing many goals in the language design is an artistic challenge.

A one-item tuple is an odd duck and undeserving of all this attention, IMHO. It was lucky to get its comma in the first place.

Multi-item tuples with the comma(s) are natural and aesthetically pleasing to beginners and experts alike.

peterschay · February 1, 2024, 9:11pm

Cython uses them!

chepner · February 1, 2024, 9:23pm

Cython is a separate language with its own parser.

chepner · February 1, 2024, 9:24pm

What do you think can’t use a comma because of how 1-tuples are defined?

peterschay · February 1, 2024, 9:55pm

A beginner.

Using the comma for something so esoteric creates pitfalls and readability barriers for beginners due to this very special, almost academic purpose.

Overusing a symbol dilutes its meaning and confuses people. Try to figure out what the _ means in Scala!

jamestwebber · February 1, 2024, 10:09pm

Tuples aren’t esoteric though, they’re a fundamental type. Seems good to learn about it early?

peterschay · February 1, 2024, 10:14pm

Yes and yes. But a one-item tuple is a curiosity, and shouldn’t merit confusion-causing use of a fundamental symbol like the comma.
Commas are perfectly suited for multi-item tuples.

jamestwebber · February 1, 2024, 10:23pm

Okay ¯_(ツ)_/¯ I don’t think that this is particularly more confusing than anything else in programming, it just takes a little experience.

In any case it would be impossible to change this now, so I think this one is pretty much done.

peterschay · February 1, 2024, 10:49pm

Is there a way to close a topic? We finished it!