What's going on with SyntaxWarning: invalid decimal literal?

kg583 · May 5, 2024, 7:55pm

Currently, a snippet like

[i+1for i in range(10)]

will emit SyntaxWarning: invalid decimal literal, due to the 1for. The warning announces the deprecation of this syntax (it was changed from a DeprecationWarning, but never mind that choice for now), which stems from parser inconsistencies documented in this GH issue.
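A quick way to see which diagnostic a given interpreter produces is to compile the snippet with warnings recorded. This is only a sketch: the helper name check is made up for illustration, and the outcome depends on the Python version (a warning on current releases, possibly a SyntaxError in some later one).

```python
import warnings


def check(source):
    """Compile `source` as an expression and report (status, messages).

    `check` is a hypothetical helper for illustration only.
    status is "ok", "warning", or "error".
    """
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones a default filter would hide.
        warnings.simplefilter("always")
        try:
            compile(source, "<snippet>", "eval")
        except SyntaxError as exc:
            return "error", [exc.msg]
    if caught:
        return "warning", [f"{w.category.__name__}: {w.message}" for w in caught]
    return "ok", []


if __name__ == "__main__":
    # The spaced version compiles cleanly; the result for the unspaced one
    # depends on the interpreter version.
    print(check("[i+1 for i in range(10)]"))
    print(check("[i+1for i in range(10)]"))
```

Nothing is executed here; the diagnostic comes from compilation alone, which is why compile() is enough to trigger it.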

Said issue led quickly to a PR from @storchaka to implement the mentioned warnings. It seems there was initial hope to turn these warnings into errors as soon as 3.12; this has (thankfully) been put off, though it is now unclear when it will happen.

Fundamentally, this is a good warning to have. Examples like the above code are ambiguous (especially when involving hex and octal literals) and almost always a mistake by the programmer. I like that it’s a warning.
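To illustrate the hex case, the standard tokenize module shows where a numeric literal ends. This is a minimal sketch, assuming tokenize keeps splitting such input instead of rejecting it (which may not hold in future versions); number_and_name_tokens is a hypothetical helper name.

```python
import io
import tokenize


def number_and_name_tokens(source):
    """Return (token type name, text) pairs for NUMBER and NAME tokens.

    Hypothetical helper for illustration; other token kinds are skipped.
    """
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
        if tok.type in (tokenize.NUMBER, tokenize.NAME)
    ]


if __name__ == "__main__":
    # Looks like a comprehension over (1, 2), but see how the literal is split.
    for kind, text in number_and_name_tokens("[0x1for x in (1, 2)]\n"):
        print(kind, text)
```

On the interpreters where this runs, the hex literal swallows the f: the first two tokens are the number 0x1f and the name or, so what reads like a for comprehension is really an or expression.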

What I don’t like is the eventual plan to turn them into full-fledged errors, at least without any further regard to the points I’m about to list.

1. There are some selfish motivations for being against these errors: code golfing. A few people have mentioned the utility of these whitespace savings in code golf, where the goal is to accomplish some task in as few characters of source code as possible. No real regard was given to these commenters; not that they should inherently hold such sway, but direct responses would have been nice.

At any rate, errors would break thousands of solutions on sites like the Code Golf Stack Exchange and code.golf; the latter of these is also a “live” competition, in the sense that no-longer-passing solutions to holes are removed, so we would expect sweeping deletions. Either that, or a “break” in Python as a language on these sites, <3.x and >=3.x, which seems a bit silly. The only time a Python version has caused such a break was 2 → 3, which makes perfect sense given its scope. Every version since has only helped golfers, e.g. 3.8 with the walrus operator :=.

2. More generally, the depth of discussion surrounding this change has been a bit disappointing. @storchaka’s PR was written and merged rather quickly following the simple collection of examples of relevant parser bugs, which is admirable, but also a tad concerning. I need only link the final comment in a still-open GH issue opposing the warnings outright. I don’t completely agree with exander77’s sentiments, but I do agree with the comment. @mdickinson recommended in that thread that responses move to the ideas forum, so here we are.

I’m somewhat reminded of the @classmethod @property debacle, wherein “class properties” got added then removed within 2 versions thanks to some resulting issues elsewhere in the codebase. For as slowly and carefully as Python development rightly moves, I’m rather annoyed by these instances of hasty decision-making.

In all, I don’t think it’s unreasonable to claim that, while fixing the parser and warning programmers of the upcoming changes needed to do so is great, the roadmap for making it all happen is poor. I am peeved that the concerns raised by exander77 went unaddressed, not because they specifically have not received a response (the issue is from January), but because the quotes they collected did not receive adequate responses at the time (over 3 years ago at this point).

Of course, asking to retract progress entirely would be fruitless and just plain dumb, so my actual “idea” boils down to just one request: an interpreter flag or environment variable. Let us use the old parser, bugs and all. As far as I can tell, it’d be pretty inexpensive.

It’s either that, let “backwards compatibility be damned” and kill all the old golfs (that’s a rather upsetting quote to hear from Guido to be honest), or some secret third thing, which I’m happy to explore.

MegaIng · May 5, 2024, 8:17pm

IMO, as a general rule, code golfing should not be considered when designing a general purpose language that is supposed to be used in production and be used to teach beginners. While it’s always fun to see such solutions, they will use and abuse every little (mis-) feature and bug in a language that exists. Python doesn’t respect people messing with bytecode, internal CPython ABI or internal stdlib API details when talking about backwards compatibility. Of course, breaking changes for any community are annoying, but some communities are more important than others. If the only problem caused by making this a syntax error is “code golfers will be annoyed”, then I am sorry, but the reduction in confusion and surprises for everyone else is just worth it.

(this doesn’t address the other points you raised. I don’t have an opinion on that)

kg583 · May 5, 2024, 8:49pm

If the only problem caused by making this a syntax error is “code golfers will be annoyed”, then I am sorry, but the reduction in confusion and surprises for everyone else is just worth it.

That’s fine; I would just appreciate the core team saying outright “no, we don’t care, sorry” (or “sure, here’s a flag”). The change going through in some capacity seems inevitable, and in that regard I cannot provide adequate opposition to it as a whole.

pf_moore · May 5, 2024, 9:11pm

@storchaka is part of the core team. So is Guido. What more exactly are you looking for in terms of “the core team saying outright” that this isn’t sufficient reason to block this change? If it helps, I’m also a core dev, and I’ll also say that preserving the validity of code golf solutions isn’t something we care about.

kg583 · May 5, 2024, 9:35pm

I’ll also say that preserving the validity of code golf solutions isn’t something we care about.

Cool. Thank you. The golf sites will prepare accordingly for whenever it goes through.

The linked issue/mailing list comments were only “implicitly” responded to (one of them received two reactions and nothing more, which doesn’t leave a great taste in my mouth). exander’s collected comments were similarly unaddressed, though he did not move discussion here as suggested.

storchaka · May 6, 2024, 10:10am

My original concern, expressed in the 2018 thread, was about inconsistency of the Python parser. Why are 0else and 1or recognized by the parser, while 0or is an error? It is because there is special code for resolving the ambiguity between a number followed by else and a floating point literal, but not between a number followed by or and an octal literal. As a general solution that would eliminate future ambiguities, I proposed requiring whitespace between a number and the following word. At first, this idea was rejected. But new examples were discovered in 2021 (unambiguous to the parser, but ambiguous to humans). This time Guido spoke out in favor of banning such syntax, and he was supported by other active core developers. No one spoke against it this time, so there was no reason to delay this change. References to code golf came up immediately, but they were not taken as a serious argument.
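The inconsistency can be probed with a small sketch like the one below. The helper classify is a made-up name, not part of any library, and the results for the first two unspaced forms depend on the interpreter version (accepted with a warning now, possibly an error later).

```python
import warnings


def classify(source):
    """Return "ok", "warning", or "error" for compiling `source`.

    Hypothetical helper for illustration only.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # do not let filters hide anything
        try:
            compile(source, "<snippet>", "eval")
        except SyntaxError:
            return "error"
    return "warning" if caught else "ok"


if __name__ == "__main__":
    samples = [
        "1 if 0else 2",   # "0e" could start a float exponent; special-cased
        "1or 2",          # decimal literal directly followed by "or"
        "0or 1",          # "0o" is read as the start of an octal literal
        "1 if 0 else 2",  # conventional spacing, for comparison
    ]
    for src in samples:
        print(f"{src!r}: {classify(src)}")
```

The third sample fails because the tokenizer commits to an octal literal after seeing 0o and then finds no octal digit, while the first two have a way to back out.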

kknechtel · May 7, 2024, 1:31am

For what it’s worth, the issue described in 2021 was also publicized on Stack Overflow, apparently the same day as the mailing list exchange.