Merge typed_ast back into CPython

brettcannon · November 22, 2018, 6:44pm

To try to bring this back around then, I think the feature request is:

# type: ... comments be translated to appropriate type hints on AST nodes
# type: ignore somehow be represented
Maybe support # noqa

It seems some have suggested solving all three by preserving line comments in the AST and then letting mypy handle the comments itself. I know @ambv wondered:

I do remember there being a discussion somewhere about trying to standardize on the code quality comment structure (and I feel like @barry was in on that conversation for some reason), but I can’t find a reference to it.

To me, it seems like if we can reasonably carry at least line-level comments with the appropriate AST node that’s the most general solution. But if that turns out not to work out well then adding in support for just the type comments since they are standardized in a PEP and have direct relations to the appropriate AST nodes makes sense to me.

barry · November 24, 2018, 3:58am

I’ll have to see if I can remember where that happened. I feel like it was on the code-quality mailing list, but it could have been the mypy or flake8 issue tracker. The problem we were running into was needing to add tag comments for multiple tools and getting forced too far off the right side.

It feels like this could be PEP worthy. Maybe comments aren’t the best way to specify this (maybe they are though!). If you’re going to modify the AST, would it be possible to add syntax that made this kind of thing explicit? A with statement? A statement decorator? Even if we stick with comments, some kind of standardization might be useful.

guido · November 26, 2018, 3:53pm

(I just found out that Astroid also uses typed_ast.)

guido · November 26, 2018, 4:04pm

It is used to parse type comments used in type signatures, which contain syntax that’s not a valid Python expression, e.g. # type: (int, int) -> str.
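To illustrate the point about signature comments, here is a minimal sketch. It assumes Python 3.8 or later, where the standard `ast.parse()` accepts `mode="func_type"`; it does not use typed_ast itself, and the variable names are only for illustration.

```python
import ast

# The body of a signature type comment, i.e. the text after "# type:".
signature = "(int, int) -> str"

# As an ordinary expression this is a syntax error because of the "->".
try:
    ast.parse(signature, mode="eval")
    is_expression = True
except SyntaxError:
    is_expression = False

# In "func_type" mode the same text parses into an ast.FunctionType node.
func_type = ast.parse(signature, mode="func_type")
arg_names = [arg.id for arg in func_type.argtypes]
return_name = func_type.returns.id

print(is_expression, arg_names, return_name)
```

The dedicated mode exists precisely because the arrow form has no meaning in the regular expression grammar.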

guido · November 26, 2018, 4:22pm

I’d like to propose to reduce the scope of the project to just the features that typed_ast currently adds (thanks Nathaniel for referencing the list). I guess I’ll have to work on a PR for this, based on typed_ast. Łukasz has said he’d merge it. While adding # type: noqa support would be nice, the linters currently all have a solution for that, and I propose to punt on that for now.

The PR I have to produce should roughly mimic the instructions in typed_ast for updating, except instead of making a free-standing fork I’d do it as a PR for CPython (master, 3.8).

My only worry would be the code to conditionally parse older versions of the syntax. For mypy this would require supporting 3.4 and up, basically all versions which are still officially supported in some way. (For 2.7 we have a separate backport of ast in typed_ast that in general does not need maintenance.)

After this has been accepted into CPython, producing a new version of typed_ast would then be hugely simpler: instead of the elaborate update process mentioned above it would just require copying a bunch of files from CPython into the typed_ast repo with minimal modifications. (This wouldn’t save any effort for the upcoming typed_ast version, but it would make future ones much less work, so we can start supporting new Python versions in mypy sooner.)

guido · January 17, 2019, 5:06pm

I’d like to circle back on this. I now have a thorough understanding of what typed_ast does, and I think it would be straightforward to port it upstream. We’d need to define two new tokens to represent # type: ignore and # type: <whatever>, and tokenizer code to recognize these. Then we need a new flag to be passed to the tokenizer (via the parser) that enables this behavior. We make a small number of changes to Grammar (inserting optional TYPE_COMMENT tokens) and to Python.asdl (adding fields to a few node types to hold the optional type comment), and a fair number of changes to ast.c to extract the type comments.

By default, ast.parse() does not return type comments, since this would reject some perfectly good Python code (with a type comment in a place where the grammar doesn’t allow it). But passing a new flag will cause the tokenizer to process type comments and the returned tree will contain them.
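A rough sketch of that opt-in behavior follows. It assumes the flag takes the form it has in released Python 3.8 and later, namely the `type_comments=True` keyword of `ast.parse()`; the sample source string is made up for illustration.

```python
import ast

source = (
    "x = []  # type: List[int]\n"
    "y = compute()  # type: ignore\n"
)

# Default: type comments are treated like any other comment and dropped.
plain = ast.parse(source)
plain_comment = plain.body[0].type_comment

# Opt-in: the tokenizer recognizes type comments and the tree carries them.
typed = ast.parse(source, type_comments=True)
typed_comment = typed.body[0].type_comment

# "# type: ignore" comments are collected on the Module node by line number.
ignored_lines = [ti.lineno for ti in typed.type_ignores]

print(plain_comment, typed_comment, ignored_lines)
```

The source is only parsed, never executed, so undefined names such as `compute` and `List` are harmless here.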

I could produce a PR with this in a few days (having just gone over most of the process for porting typed_ast from 3.6 to 3.7).

There’s one more feature I’d like to push for – a feature_version flag that modifies the grammar slightly so it resembles an older version of Python (going back to 3.4). This is used in mypy to decouple the Python version you’re running from the Python version for which you’re checking compatibility (useful when checking code that will be deployed on a system with a different Python version installed). I imagine this would be useful to other linters as well, and the implementation is mostly manipulating whether async and await are keywords. But if there’s pushback to this part I can live without it – the rest of the work is still useful.

Thoughts?

UPDATE: I created an issue: https://bugs.python.org/issue35766

guido · January 29, 2019, 8:04pm

I’d like to merge this. I have everything working (not counting the optional backwards compatibility feature I’ll submit separately). Can I get someone to review it? It might actually make 3.8.0a1!

guido · February 4, 2019, 5:23pm

(And yes this did make it in!)