Syntax error for naked trailing comma for 1-tuple

boxed · November 2, 2022, 10:13am

Python has a foot gun with the naked trailing comma. I got bitten by this again today:

url = invoice.url[len(prefix):],

The trailing comma shouldn’t be there, so url is now a tuple, and that fails much later because the url variable is passed around a bit before it’s used.
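A minimal sketch of the failure mode described above. The `prefix` and `invoice_url` values are made-up stand-ins (the original `invoice` object isn't shown), so treat this as an illustration rather than the actual code:

```python
# Hypothetical stand-ins for the values in the post above.
prefix = "https://example.com"
invoice_url = "https://example.com/invoices/42"

url = invoice_url[len(prefix):],    # stray trailing comma: url is a 1-tuple
fixed = invoice_url[len(prefix):]   # what was intended: a plain str

print(type(url).__name__, url)
print(type(fixed).__name__, fixed)

# The assignment itself succeeds; the error only shows up later,
# when something finally treats the value as a string.
try:
    url.startswith("/invoices")
except AttributeError as exc:
    print("fails far from the assignment:", exc)
```

Nothing is raised on the assignment line, which is why the eventual traceback tends to point somewhere else entirely.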

In my time on the Unofficial Django Discord I see this mistake quite often. And very often it takes quite a while for anyone to notice what the problem is, even when several very senior developers are looking at a full traceback and the code.

I think naked trailing commas for 1-tuples should be deprecated and then changed to causing a syntax error directing the user to use the explicit (x,) syntax.

barry-scott · November 2, 2022, 5:36pm

Since this is valid and will be used correctly in existing code it cannot be changed to SyntaxError.

Do the linters check for this? Pylint, mypy etc?

Kurt · November 2, 2022, 6:18pm

I think linters are the right solution here. And perhaps a corresponding pep8 recommendation? Something like “The form (x,) is preferred for 1 element tuples”. This would ensure that linters are soon updated accordingly.

effigies · November 2, 2022, 6:25pm

Pylint flags this, and black reformats it. Flake8 is silent.

test.py:

"""Test module"""

a = 1,

$ pipx run pylint test.py
************* Module test
test.py:3:0: R1707: Disallow trailing comma tuple (trailing-comma-tuple)

------------------------------------------------------------------
Your code has been rated at 0.00/10 (previous run: 0.00/10, +0.00)
$ pipx run black --diff test.py
--- test.py	2022-11-02 18:23:19.291318 +0000
+++ test.py	2022-11-02 18:24:25.810325 +0000
@@ -1,3 +1,3 @@
 """Test module"""
 
-a = 1,
+a = (1,)
would reformat test.py

All done! ✨ 🍰 ✨
1 file would be reformatted.
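For anyone curious what such a check could look like, here is a rough sketch using the standard `ast` module. It is not how Pylint or black do it; `naked_one_tuples` is a hypothetical helper, and it relies on parenthesized `Tuple` nodes starting at the opening parenthesis (the case in Python 3.8 and later, as far as I know). It will miss odd spellings like `a = (1),`.

```python
import ast


def naked_one_tuples(source):
    """Return sorted line numbers where a 1-tuple is written without parentheses."""
    # col_offset is a UTF-8 byte offset, so index into encoded lines.
    lines = source.encode("utf-8").splitlines()
    hits = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Tuple) and len(node.elts) == 1:
            line = lines[node.lineno - 1]
            # A parenthesized tuple starts at "("; a naked one starts at its element.
            if line[node.col_offset:node.col_offset + 1] != b"(":
                hits.add(node.lineno)
    return sorted(hits)


sample = "a = 1,\nb = (1,)\nc = 1, 2\nd = [1,]\n"
print(naked_one_tuples(sample))
```

Only the first line of `sample` is a naked 1-tuple; the parenthesized tuple, the 2-tuple and the list are left alone.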

boxed · November 2, 2022, 7:24pm

will be used

Well… yes. That’s why I argued for a deprecation warning now, and a syntaxerror somewhere down the line.

linters

This is just not a good enough solution. This is mostly a problem for exactly the type of people who don’t use linters: beginners.

notatallshaw · November 2, 2022, 11:56pm

Not good enough in what sense?

Beginners are learning and there are many tools to help them learn. IDEs often recommended to beginners, such as PyCharm and VS Code, can have built-in linting and help.

Python as a language can’t anticipate every mistake a beginner might make, and trying to code for them at the expense of everyone else in the language doesn’t seem like a good trade off. Thus, making a syntactically backwards-incompatible change for this one case of a beginner not using a tool to help them out doesn’t seem like a good trade off.

But perhaps you had thought of some other solution?

boxed · November 3, 2022, 2:58am

Not good enough in what sense?

In the sense that it doesn’t make a dent in the problem

Python as a language can’t anticipate every mistake a beginner might make and trying to code for them at the expense of everyone else in the language doesn’t seem like a good trade off.

That’s not the case here though. I’m not suggesting a change that will make it easier for beginners and worse for experienced. I’m suggesting a change that makes it better for everyone. At worst neutral for experienced devs or those who use black on-save.

And I am still arguing for a, potentially very long, deprecation period. Could be a decade! But the future is long, so I think that’s fine.

notatallshaw · November 3, 2022, 3:12am

Backwards incompatible changes to the language are worse for large code bases that use Python, and they have caused a lot of pain in Python in the past, as they can significantly delay projects from upgrading to new versions of Python. And the issue is that it can affect code they don’t control, as the problem can arise in your dependencies or further downstream in transitive dependencies. Any benefit needs to be weighed against this.

boxed · November 3, 2022, 6:06am

100% agree. Not arguing against that at all.

vovavili · November 3, 2022, 8:08am

One useful case for this is concisely unpacking an iterable into an ordered collection if you’re in an interactive Python session or feel like code golfing:

s = "qwerty"
t = *s,

Which saves you one keystroke over:

t = [*s]
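A quick check of the two spellings, for reference. Note that they are not interchangeable: the trailing-comma form builds a tuple, the bracket form builds a list.

```python
s = "qwerty"
t = *s,     # naked trailing comma: unpacks into a tuple
u = [*s]    # square brackets: unpacks into a list

print(type(t).__name__, t)
print(type(u).__name__, u)
print(t == tuple(u))   # same elements, different container type
```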

storchaka · November 3, 2022, 10:59am

It is not just valid syntax, it is widely used syntax. I think that raising an error or even just a warning about it would lead to much higher costs for testing, debugging and rewriting the existing code and adding wrappers for third-party code that is not updated as quickly. This would lead to chaos for several years.

boxed · November 3, 2022, 11:57am

Widely used? Do you have a source for that? Since pylint warns about it and black reformats it, it seems to me that it should not be common at all.

And if it is in fact common, it’s always interesting to know how many % of those uses are latent bugs that someone either hasn’t been bitten by yet, or has worked around by adding a [0] somewhere else because they don’t know why they got a 1-tuple.

boxed · November 3, 2022, 12:03pm

Hah. Googled a bit and found there’s a third option: a 1-tuple created by mistake and directly thrown on the ground, causing no issue. A majority of these for example: PYL-R1707 · Trailing comma tuple detected

AndersMunch · November 3, 2022, 12:42pm

I imagine it varies wildly by codebase. Some never use it, others don’t hesitate to.

The standard library has its share, I found 61 in 3.11. Some are innocuous errors, but most are clearly intentional. None are obvious errors, though that can be hard to tell.

My script to find them:

import glob
import sys
import token
import tokenize

total_count = 0
for arg in sys.argv[1:]:
    for fn in glob.glob(arg):
        count = 0
        prev_tok = None
        try:
            # Use a context manager so each file is closed after scanning.
            with open(fn, encoding='utf-8') as f:
                for tok in tokenize.generate_tokens(f.readline):
                    # NEWLINE ends a logical line; if the token right before it
                    # is ',', the statement ends in a naked trailing comma.
                    if tok.type == token.NEWLINE:
                        if prev_tok is not None and prev_tok.string == ',':
                            print("%s:%s:%s: %s" % (fn, tok.start[0], tok.start[1], tok.line.rstrip()))
                            count += 1
                    prev_tok = tok
        except UnicodeDecodeError:
            print("couldn't read", fn)
        else:
            total_count += count
print(total_count, "total")

Note that this only looks for statement-ending commas, and will not find trailing commas in assignment targets.
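To make that limitation concrete, here is a small sketch that applies the same token rule to an in-memory string. `trailing_comma_lines` and the `sample` source are made up for illustration; the rule itself (a `,` token immediately before `NEWLINE`) is the one from the script above.

```python
import io
import token
import tokenize


def trailing_comma_lines(source):
    """Return line numbers of logical lines whose last token is a comma."""
    hits = []
    prev = None
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == token.NEWLINE and prev is not None and prev.string == ",":
            hits.append(tok.start[0])
        prev = tok
    return hits


sample = (
    "a = 1,\n"                  # statement-ending comma: found
    "b, = a\n"                  # comma in an assignment target: not found
    "for value, in [(1,)]:\n"   # comma in a loop target: not found
    "    pass\n"
    "c = 2,  # note\n"          # comma followed by a comment: not found
)
print(trailing_comma_lines(sample))   # [1]
```

The last case is missed because the COMMENT token sits between the comma and NEWLINE, so counts from this rule are a lower bound.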

apalala · November 3, 2022, 12:47pm

With yearly releases of Python versions, deprecation of this error-prone syntax makes sense, even if it takes some years to make it a syntax error.

I agree that newcomers won’t enable a linter, and will eventually get hit by these difficult-to-debug errors.

pochmann · November 3, 2022, 2:16pm

I’ve used it quite a few times and wouldn’t like to lose it.

Some thoughts:

Parentheses for the 1-tuple would reduce consistency with the other lines:

foos = 1, 2
bars = 3,
quxs = 4, 5, 6

I couldn’t quickly out-comment like this anymore:

tests = test1, #test2, test3

If it applies to targets as well, I couldn’t do

for value, in query_results:

anymore and it would reduce consistency with loops like for x, y in points:. If it doesn’t apply to targets, then we lose consistency between targets and tuples.
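A runnable version of the target and out-commenting cases, for readers who haven't seen them before. `query_results` and the `test1`..`test3` values are placeholder data, assumed here only so the snippet executes.

```python
# Hypothetical single-column rows, e.g. what a database cursor might return.
query_results = [(1,), (2,), (3,)]

values = []
for value, in query_results:    # the comma unpacks each 1-tuple row
    values.append(value)
print(values)

# Placeholder objects standing in for real tests.
test1, test2, test3 = "t1", "t2", "t3"
tests = test1, #test2, test3
print(tests)                    # still a tuple, now with one element

# Unpacking also checks the row length, unlike indexing with [0].
try:
    for value, in [(1, 2)]:
        pass
except ValueError as exc:
    print("row with two columns is rejected:", exc)
```

With parentheses required, the loop target would have to be spelled `for (value,) in query_results:`.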

NeilGirdhar · November 3, 2022, 5:54pm

I think all of those examples make reading your code unnecessarily more difficult. The for loop example probably needs a comment to remind the reader that there’s a comma there.

NeilGirdhar · November 3, 2022, 5:59pm

I still sometimes forget to remove a trailing comma, not notice it, and try to figure out why something’s not working. So I don’t think blocking this syntax is “at my expense”. On the contrary, it protects code writers from these mistakes.

Blocking the syntax also protects code readers from having to make sense of code that doesn’t have clear comments.

I think everyone would benefit from such a change in the long run. As others have mentioned, there would be a little bit of pain in the short run. But since many of the large codebases are linted, it’s not clear that there’s that much pain.

boxed · November 3, 2022, 7:11pm

Consistency is subservient to practicality though.

I personally would assume all your examples were bugs if I read them while reviewing code.

pochmann · November 3, 2022, 7:14pm

I don’t see why. It’s perfectly valid normal code.

(Except the example with out-commenting wouldn’t go into a review; I do that just temporarily while developing.)