Common static type analysis tools

I would like to see some standard functionality which deals with types of objects.

I have seen that mypy has tons of code which collects type information about entities in a program. I suppose that other static type checking programs have to do much of the same things.

So I would like to see as much functionality as is generally applicable to static type checking made available in a separate package, or even included in typing as a sub-module. It should include anything that a static checker “is expected to do” as mentioned in various PEPs.

Here are some possibilities:

  1. Classes representing static types of whatever complexity. A common base class for all type info objects. Any type object is an instance of this type info class (the type’s class doesn’t have to be an actual subclass; it could use isinstance() customization). This includes overloads and callables.

  2. A variable has a scope, which is a module, or a class or function within that module. It may also have a type. A name appearing in a scope can be a variable with that name in an enclosing scope, or possibly a variable imported from another module.

  3. Parse code (including stubs) and build a tree of class and function scopes.

  4. Parse expressions and get their types, using known type information for elements of the expression.

  5. Set the type of a variable, such as when processing an assignment or other name binding operation. Detect changes in the variable’s type.

  6. Type narrowing in conditional blocks. Within these blocks, some variables will have a narrowed type.

  7. Filter out unreachable code from the program’s syntax tree. Should interpret sys.platform and sys.version_info comparisons. Should interpret condition expressions, as in an if statement, if they evaluate to a constant value (at least in reasonably simple cases). An optional set of always-true or always-false names can be provided.

  8. Subtleties of runtime execution. A variable might be bound to different types while its scope is being executed. A local variable in a class refers to a global variable of the same name before it is actually bound within the class def. Calling a function can have side effects on variables in enclosing scopes.
    There should be the capability of executing a module step by step, updating the status of variables at each step.

  9. Special treatment of imports. Any imported name is the same variable as the same name at the module scope in the target module. For import *, examine __all__ in the target module, if it exists, and import all the given names; otherwise if the target module is fully analyzed (including stubs), import all the public names; otherwise, the existing public names will be imported, but there may be more names to import later, as the target module is further analyzed.

  10. Handling import cycles and unresolved names. If a particular name is imported circularly from one module to the next, and not defined in any of them, then this is an error in the program, and the name will be undefined everywhere. If the name is defined in one module, it will be propagated to all the modules. For import *, any public name in any module is propagated to the entire cycle. All cyclic imports should be detected and reported.

  11. Interpret typing.TYPE_CHECKING as being False, but any code guarded by it (that is, import statements) should be kept separately. The imports are not considered part of import cycles. Names imported here can be used to provide values for otherwise undefined variables.
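
To make item 7 concrete, here is a minimal sketch of unreachable-code filtering built only on the standard ast module. It is an illustration under stated assumptions, not an existing API: the names static_truth, Pruner, prune, EXAMPLE and the DEBUG flag are all hypothetical, only single comparisons against literals are handled, and sys.version_info is approximated by a (major, minor) tuple.

```python
import ast
import operator
import sys

# Comparison operators this sketch is willing to evaluate statically.
OPS = {ast.Eq: operator.eq, ast.NotEq: operator.ne, ast.Lt: operator.lt,
       ast.LtE: operator.le, ast.Gt: operator.gt, ast.GtE: operator.ge}


def static_truth(test, version=sys.version_info[:2], platform=sys.platform, known=None):
    """Return True or False if `test` can be decided statically, else None."""
    known = known or {}
    if isinstance(test, ast.Constant):
        return bool(test.value)
    if isinstance(test, ast.Name):
        return known.get(test.id)  # None when the name is not in the table
    if isinstance(test, ast.UnaryOp) and isinstance(test.op, ast.Not):
        inner = static_truth(test.operand, version, platform, known)
        return None if inner is None else not inner
    if isinstance(test, ast.Compare) and len(test.ops) == 1:
        targets = {"sys.version_info": version, "sys.platform": platform}
        target = targets.get(ast.unparse(test.left))
        op = OPS.get(type(test.ops[0]))
        try:
            literal = ast.literal_eval(test.comparators[0])
        except (ValueError, TypeError):
            return None  # right-hand side is not a plain literal
        if target is None or op is None or type(literal) is not type(target):
            return None
        return op(target, literal)
    return None


class Pruner(ast.NodeTransformer):
    """Replace each statically decidable `if` with the branch that runs."""

    def __init__(self, **env):
        self.env = env

    def visit_If(self, node):
        self.generic_visit(node)  # prune nested ifs first
        truth = static_truth(node.test, **self.env)
        if truth is None:
            return node
        # Keep a `pass` so an emptied block remains syntactically valid.
        return node.body if truth else (node.orelse or [ast.Pass()])


def prune(source, **env):
    """Parse `source`, drop unreachable branches, and return new source text."""
    return ast.unparse(Pruner(**env).visit(ast.parse(source)))


EXAMPLE = '''
import sys
if sys.version_info >= (3, 8):
    a = 1
else:
    a = "old"
if sys.platform == "win32":
    b = 1
else:
    b = 2
if DEBUG:
    c = 1
if unknown():
    d = 1
'''

print(prune(EXAMPLE, version=(3, 12), platform="linux", known={"DEBUG": False}))
```

The known argument plays the role of the optional set of always-true or always-false names. Conditions that cannot be decided, such as the call to unknown(), are left in place. A real implementation would also need to handle and/or, chained comparisons, and match statements.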

I have some code which will do some of the things I’ve mentioned above.

  1. Parses a Python module and builds a tree of Scope objects. Determines where each name is actually located (either in the same scope or some enclosing scope). It understands lambdas and comprehensions. It handles walrus expressions in comprehensions.
  2. Parses the Python module and builds a parallel tree of Namespace objects. These are dynamic versions of Scopes, and can track assigns, reassigns, and deletions of variables. It handles lookups of local class variables that aren’t currently assigned a value, by searching the module namespace. It tracks the type of each variable whether or not it has a current value.
    Any code within, or enclosed within, any function is considered to be not executed at the time it is parsed, but any binding operations will set a type (if known) to the variable in whatever namespace it exists. Any other code is executed immediately and will assign or delete a value from the variable, as well as provide a type.
  3. Ignores unreachable code, as determined by analyzing if and match statements and constant truth values. This is essential in preventing extra information in the scopes tree and contradictory types assigned to variables.
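
For illustration only (this is not the code described above), the class-variable lookup from part 2 can be checked with the standard library alone. The sketch below uses the symtable module to show that a name bound anywhere in a class body is local to that class, then executes the same source to confirm that a read before the binding falls back to the module namespace and skips the enclosing function. SOURCE and find_scope are hypothetical names chosen for the example.

```python
import symtable

SOURCE = '''
x = "global"
def outer():
    x = "enclosing"
    class C:
        before = x   # x is bound later in C, so this read skips outer()
        x = "class"
        after = x
    return C
C = outer()
'''


def find_scope(table, name):
    """Depth-first search for the first nested scope called `name`."""
    for child in table.get_children():
        if child.get_name() == name:
            return child
        found = find_scope(child, name)
        if found is not None:
            return found
    return None


# Static view: what the compiler decided about `x` inside class C.
module_scope = symtable.symtable(SOURCE, "<example>", "exec")
class_scope = find_scope(module_scope, "C")
x_in_class = class_scope.lookup("x")
print("local to C:", x_in_class.is_local())
print("free in C:", x_in_class.is_free())

# Dynamic view: run the module in a fresh namespace and inspect the class.
namespace = {}
exec(SOURCE, namespace)
C = namespace["C"]
print("before:", C.before)
print("after:", C.after)
```

A Namespace tree as described above has to reproduce exactly this behavior: the static scope says x belongs to C, while the value seen at any given step depends on which bindings have already executed.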

By the way, parts 1 and 2 will serve to document the Python scoping rules, and they come with a thorough test program which constructs namespace trees and verifies the actual and expected values for variables. I personally feel that this would be an important contribution to the Python community and I am looking for somebody to take on the maintenance and distribution of this code.