Resources for diving into the internals of Python type checkers?

Hello there!

I’ve been a passionate user of typing and mypy in my Python projects and would like to understand the behind-the-scenes mechanics. I haven’t read the source code of any Python type checker yet, but I have experience with type checking algorithms in Haskell and ML. I’ve also contributed to a few open-source Python projects and am interested in the low-level details of type checkers in Python.

My main objective is to understand the bridge between the Python Enhancement Proposals (PEPs) related to typing and the actual implementation in type checkers like mypy. While the PEPs and discussions provide a high-level overview, I’m keen on any resources that offer a deeper dive – be it code walkthroughs, tutorials, or any other form of documentation.

For instance, I’m curious about how type checkers manage and implement type rules. Is there some kind of meta language defined for this, or is it primarily procedural Python code? By “meta language”, I mean a domain-specific language or structure used to define type rules and logic. (I’m sure I could figure this out by reading the source, but thinking about it is what prompted this question.)

Additionally, are there blogs or individuals in the Python community that focus specifically on typing internals that I should follow, apart from the Mypy blog?

It may well be that the best approach is to just start reading the code of some type checker. As a user, I’m familiar with mypy. Is that a good place to start? Also, and I realize this probably depends on which checker I look into, is there a friendly community where a newcomer to the internals could ask questions after doing their best to understand the code?

Thank you in advance for any guidance or resources you can share!

Thanks for your interest! In general, I don’t think there are in-depth resources of the kind you are asking for. I’ll try to answer your specific questions and link to the few resources that exist.

As far as I know, no type checker uses a “meta language” of the kind you outline. Typing rules are simply implemented in code, and every new type system feature that extends the language requires some new code in the type checker.
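To make “implemented in code” concrete, here is a minimal sketch of what a procedural typing rule can look like. It is a toy built on the standard library’s `ast` module, not how mypy or Pyright are actually structured; the names `ToyChecker` and `check_expression` are made up for illustration, and it only understands literals and `+`.

```python
import ast


class ToyChecker(ast.NodeVisitor):
    """Toy checker: infers a type name for literals and `+` expressions."""

    def __init__(self):
        self.errors = []

    def visit_Constant(self, node):
        # Rule: a literal has the type of its Python value.
        return type(node.value).__name__

    def visit_BinOp(self, node):
        left = self.visit(node.left)
        right = self.visit(node.right)
        # Anything this toy does not model is treated as "Any" and never
        # reported, so one error does not cascade into several.
        if "Any" in (left, right) or not isinstance(node.op, ast.Add):
            return "Any"
        # Rule: int + int -> int, str + str -> str.
        if left == right and left in ("int", "str"):
            return left
        self.errors.append(
            f"col {node.col_offset}: unsupported operand types for +: "
            f"{left} and {right}"
        )
        return "Any"

    def generic_visit(self, node):
        # Every other node kind is unknown to the toy.
        return "Any"


def check_expression(source):
    """Return (inferred type name, list of error messages) for one expression."""
    tree = ast.parse(source, mode="eval")
    checker = ToyChecker()
    inferred = checker.visit(tree.body)
    return inferred, checker.errors


if __name__ == "__main__":
    for src in ("1 + 2", "'a' + 'b'", "1 + 'a'"):
        print(src, "->", check_expression(src))
```

Each rule is just a branch in a visitor method, so supporting a new feature means writing another method or another branch. That is the sense in which every new type system feature needs new code in the checker.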

I am not aware of any blogs dedicated to the internals of typing.

Mypy has a wiki on GitHub (the python/mypy wiki) explaining some internals. Mypy is written in Python, but in my experience, the codebase isn’t always easy to make sense of.

Pyright is another popular type checker. It has some documentation of its internals (https://github.com/microsoft/pyright/blob/main/docs/internals.md), but it is written in TypeScript.

The best place to ask questions about internals of Python typing right now may be the #type-hinting channel on the Python Discord.


Different type checkers use different approaches because they’re supporting different needs:

  • mypy was the first PEP 484 typechecker and is best at bulk typechecking entire directories.
  • pyre is another typechecker for bulk typechecking with a special focus on speed, because it was needed to typecheck very large company-internal Python projects.
  • pyright powers in-IDE typechecking for VS Code and therefore specializes in incremental typechecking, because typechecking has to happen frequently on files the user is actively typing in.
  • pytype uses a fairly different type inference strategy than all the other typecheckers I mentioned. It appears to be designed to work well on Python code with minimal explicit type annotations. (I have the least experience with this typechecker.)

So the answer to your question depends on the typechecker.

Yeah, I think reading the source code of a typechecker you already know as a user is a good place to start.

In the case of mypy specifically, which I’m most familiar with, it’s worth checking out the mypy wiki on GitHub as a starting point.