(Edit: I started writing this before @Nineteendo reposted the catalog of warnings from the previous thread, but was interrupted, and then didn’t realise this thread had moved on when I came back and finished it.)
It isn’t that they happen before __main__ starts running, but rather that, if they happen at all, then they happen when modules are imported. With eager imports, that’s often during test discovery rather than during test execution. They also end up being reported for all code in dependencies, not just the code that is actually used by the running application (or the code under test).
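To make the timing concrete, here is a minimal sketch that separates the compile step (which is what an import triggers first) from actually running the code. The source string and the `<demo>` filename are made up for illustration, and the `is` comparison is used only because that warning exists on all currently supported Python versions.

```python
import warnings

# Illustrative source: comparing against an int literal with "is".
SOURCE = "def f(x):\n    return x is 1\n"

# Step 1: compile only. This is the step that importing a module performs
# before any of the module's code runs.
with warnings.catch_warnings(record=True) as compile_time:
    warnings.simplefilter("always")
    code = compile(SOURCE, "<demo>", "exec")

# Step 2: execute the compiled code and call the function it defines.
with warnings.catch_warnings(record=True) as run_time:
    warnings.simplefilter("always")
    namespace = {}
    exec(code, namespace)
    namespace["f"](1)

print("at compile time:", [str(w.message) for w in compile_time])
print("at run time:", len(run_time), "warnings")
```

The warning shows up in the first list even though `f` has not been called yet, and calling `f` afterwards reports nothing further.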
On that front, I realised that we should explicitly note all the syntax warnings that are currently emitted (just by searching the CPython code base for SyntaxWarning):
- unknown backslash escape sequences
>>> "\g"
<python-input-5>:1: SyntaxWarning: invalid escape sequence '\g'
'\\g'
- invalid octal escape sequences
>>> "\400"
<python-input-17>:1: SyntaxWarning: invalid octal escape sequence '\400'
'Ā'
- subscripting a type known not to support subscript lookup
>>> lambda: 1[2]
<python-input-23>:1: SyntaxWarning: 'int' object is not subscriptable; perhaps you missed a comma?
<function <lambda> at 0x705dfb04b920>
- subscripting a known container type with an invalid literal type
>>> lambda: [][""]
<python-input-3>:1: SyntaxWarning: list indices must be integers or slices, not str; perhaps you missed a comma?
<function <lambda> at 0x705dfb04bb00>
- using `is` (or `is not`) to compare against a non-singleton literal
>>> int is 1
<python-input-6>:1: SyntaxWarning: "is" with 'int' literal. Did you mean "=="?
False
- assertions with non-empty tuples (e.g. attempting to use parentheses to span multiple lines)
>>> assert (False, "message")
<python-input-8>:1: SyntaxWarning: assertion is always true, perhaps remove parentheses?
- attempting to call things that definitely aren’t callable
>>> lambda: []()
<python-input-9>:1: SyntaxWarning: 'list' object is not callable; perhaps you missed a comma?
<function <lambda> at 0x705dfb04bce0>
- running keywords up against a preceding numeric literal
>>> 1and 2
<python-input-21>:1: SyntaxWarning: invalid decimal literal
2
- the new warning for control flow (`return`, `break`, `continue`) that exits a `finally` block, added by PEP 765
Tangent: I found the “perhaps you missed a comma?” hints to be a bit cryptic, so I filed an issue to suggest amending the hint wording to be more explicit.
If we were to make emitting syntax warnings a new runtime opcode instead of emitting them eagerly at compile time, I think that would solve a bunch of problems with them:
- they’d always be emitted when the code ran, regardless of whether the code was precompiled or not
- in test cases, they’d be emitted at test execution time, not test discovery time
- it would fix the current problems with them not having correct location information without needing major changes to pass module info to the various compilation APIs
- in dependencies, we’d only get syntax warnings for the code we actually run rather than implicitly scanning every line of code
- the various warning actions like `always` would work properly with them
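As a rough sketch of the behaviour being proposed (not of how it would be implemented), the deferred emission can be emulated in pure Python with `warnings.warn_explicit`. Everything named here is hypothetical: `emit_deferred_syntax_warning` stands in for the new opcode, `REGISTRY` stands in for the module's `__warningregistry__`, and the filename, line numbers and second message are invented.

```python
import warnings

# Hypothetical stand-in for the proposed opcode; nothing like this exists in
# CPython today. It reports a stored warning when the surrounding code runs.
def emit_deferred_syntax_warning(message, filename, lineno, registry):
    warnings.warn_explicit(
        message, SyntaxWarning, filename, lineno, registry=registry
    )

# Plays the role of the defining module's __warningregistry__.
REGISTRY = {}

def suspicious(x):
    # The compiler would insert this step; the location is made up.
    emit_deferred_syntax_warning(
        '"is" with \'int\' literal. Did you mean "=="?',
        "example.py", 12, REGISTRY,
    )
    return x == 1

def never_called(x):
    # Never executed below, so its warning is never reported.
    emit_deferred_syntax_warning("never reported", "example.py", 20, REGISTRY)
    return x

def count_warnings(action, calls=2):
    """Call suspicious() repeatedly under one filter action and count reports."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter(action)
        for _ in range(calls):
            suspicious(1)
    return len(caught)

print("default:", count_warnings("default"))
print("always:", count_warnings("always"))
```

Because the warning goes through the normal runtime machinery with a registry, `default` reports it once per location while `always` reports it on every call, and code that never runs stays silent.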
A potential technical difficulty with the idea is that not all syntax warnings are emitted in the opcode generation step: some warnings related to invalid escape sequences in f-strings and t-strings are emitted in the tokenizer, while the one for missing whitespace between numeric literals and a trailing keyword is emitted by the lexer. (While the PEP 765 warnings are currently emitted in the AST parsing phase, that caused unanticipated problems, so they’re going to move to the opcode generation step with most of the others.)
(cc @storchaka since this is a potentially different approach to solving the warning location problem)