PEP 561’s module resolution order seems incorrect

The typing spec details the module resolution order that type checkers are supposed to follow when resolving a module name to a concrete source file containing Python code. The resolution order is mostly derived from PEP 561[1]:

  1. Stubs or Python source manually put in the beginning of the path. Type checkers SHOULD provide this to allow the user complete control of which stubs to use, and to patch broken stubs or inline types from packages. In mypy the $MYPYPATH environment variable can be used for this.
  2. User code - the files the type checker is running on.
  3. Stub packages - these packages SHOULD supersede any installed inline package. They can be found in directories named foopkg-stubs for package foopkg.
  4. Packages with a py.typed marker file - if there is nothing overriding the installed package, and it opts into type checking, the types bundled with the package SHOULD be used (be they in .pyi type stub files or inline in .py files).
  5. Typeshed - only for modules in the standard library.

In my opinion, the typing spec is incorrect here in one important respect, and should be amended.

Precedence of the stdlib over site-packages

If at runtime, I have the (long defunct) asyncio backport installed from PyPI, import asyncio will not resolve to the module I have pip-installed. Instead, it will resolve to the stdlib asyncio module: at runtime, modules in the standard library nearly always take precedence over the same module name in site-packages.[2]
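
A quick way to check this claim on a given interpreter is to ask the import system where a module would be loaded from. This is only a sketch: `resolves_to_stdlib` is a hypothetical helper name, and it assumes Python 3.9+ and a conventional on-disk standard library (not a zipped or embedded one).

```python
import importlib.util
import sysconfig
from pathlib import Path


def resolves_to_stdlib(module_name):
    """Return True if `import module_name` would load a file from the stdlib directory."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None or not Path(spec.origin).is_absolute():
        # Missing, namespace, built-in, and frozen modules have no file path to compare.
        return False

    paths = sysconfig.get_paths()
    origin = Path(spec.origin).resolve()
    stdlib_dir = Path(paths["stdlib"]).resolve()
    # site-packages often lives *inside* the stdlib directory, so exclude it explicitly.
    site_dirs = [Path(paths["purelib"]).resolve(), Path(paths["platlib"]).resolve()]
    in_site_packages = any(origin.is_relative_to(d) for d in site_dirs)
    return origin.is_relative_to(stdlib_dir) and not in_site_packages


print(resolves_to_stdlib("asyncio"))
```

If the claim above holds, this should report that asyncio comes from the standard library even when a package of the same name is installed into site-packages.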

I experimented locally by installing the asyncio backport package and then manually adding a py.typed file to the site-packages/asyncio directory in my virtual environment. After doing so, it appeared that mypy would still correctly resolve import asyncio to [the typeshed stub for] the stdlib module, accurately understanding the semantics at runtime. Pyright, however, did not: it resolved import asyncio to the py.typed package in site-packages. But although mypy “gets it right” and pyright “gets it wrong”, pyright is the one that is accurately following the spec here according to the module resolution order given above.

I believe the spec here is incorrect: to accurately resolve modules in a way that reflects Python’s runtime semantics, typeshed’s stubs for the standard library must take higher priority than modules installed into site-packages.

Where typeshed must come last

The original module resolution order in PEP 561 mentioned that vendored typeshed stubs for third-party modules should be placed last in the module resolution order, alongside vendored typeshed stubs for the standard library. In this respect, I think PEP 561 got things right.

At the time when PEP 561 was written and accepted (in 2017-2018), mypy (at the time, by far the most significant type checker) vendored all of typeshed’s third-party stubs. It stopped doing so in 2021, but pyright and pyre still vendor either some or all of typeshed’s third-party stubs.

Although not all type checkers vendor typeshed’s third-party stubs nowadays, if a type checker does vendor any of these stubs, I think it’s important that these vendored third-party stubs should continue to come last. If a user installs a non-typeshed stubs package (say, docutils-stubs), it would be highly confusing and frustrating for the user if the type checker nonetheless continued to resolve import docutils to the type checker’s stubs for docutils in its vendored copy of typeshed.

Having typeshed come last is also useful if a user needs to keep using an older version of docutils because of some runtime incompatibility, but their type checker only vendors typeshed’s stubs for the latest version. Because vendored third-party stubs from typeshed take lower priority than site-packages, the user could manually install an older version of the stubs package, which would then override the vendored stubs from typeshed.

Unlike the original version in PEP 561, the current version of the module resolution order given in the typing specification does not specify where vendored typeshed stubs for third-party packages should come.

Proposed update to the module resolution order in the typing spec

Pursuant to the above arguments, I propose that we make three changes to the module resolution order:

  1. Move typeshed’s standard-library stubs higher in the resolution order, above any items that originate from site-packages.
  2. With the change from point (1), it becomes more important that type checkers provide a clear and easy way for users to override the vendored copy of typeshed’s standard-library stubs with a custom directory of standard-library stubs if they want to. Most major type checkers already implement this (mypy provides the --custom-typeshed-dir option; pyright provides the typeshedPath configuration-file option; pyre provides the --typeshed option). The proposed updated wording formally specifies that doing so is highly encouraged.
  3. Bring back an explicit mention in the typing spec of where vendored typeshed stubs for third-party packages (if there are any that the type checker has chosen to vendor) should come in the module resolution order (last!).

The revised module resolution order that I propose is as follows:

  1. Stubs or Python source manually put in the beginning of the path. Type checkers SHOULD provide this to allow the user complete control of which stubs to use, and to patch broken stubs or inline types from packages. In mypy the $MYPYPATH environment variable can be used for this.
  2. User code - the files the type checker is running on.
  3. Typeshed stubs for the standard library. These will usually be vendored by type checkers, but type checkers SHOULD provide an option for users to provide a path to a directory containing a custom or modified version of typeshed; if this option is provided, type checkers SHOULD use this as the canonical source for standard-library types in this step.
  4. Stub packages installed into site-packages - these packages SHOULD supersede any installed inline package. They can be found in directories named foopkg-stubs for package foopkg.
  5. Packages in site-packages with a py.typed marker file - if there is nothing overriding the installed package, and it opts into type checking, the types bundled with the package SHOULD be used (be they in .pyi type stub files or inline in .py files).
  6. If the type checker chooses to additionally vendor any third-party stubs from typeshed, these SHOULD come last in the module resolution order.
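
To make the difference between the two orderings concrete, here is a toy model of the resolution steps. Everything in it is hypothetical: the source labels, the `SOURCES` contents, and the `resolve` function are illustrative only and do not reflect how any real type checker is implemented.

```python
# Each source maps to the top-level module names it can provide.
# The contents mirror the examples in this post (asyncio backport, docutils-stubs).
SOURCES = {
    "extra_paths": set(),
    "user_code": {"myapp"},
    "typeshed_stdlib": {"asyncio", "os"},
    "stub_packages": {"docutils"},          # e.g. docutils-stubs in site-packages
    "py_typed_packages": {"asyncio"},       # asyncio backport with a py.typed marker
    "typeshed_third_party": {"docutils"},   # vendored third-party stubs, if any
}

# The order currently given in the spec (typeshed stdlib stubs last).
CURRENT_ORDER = [
    "extra_paths", "user_code", "stub_packages",
    "py_typed_packages", "typeshed_stdlib",
]

# The proposed order (typeshed stdlib stubs above anything from site-packages).
PROPOSED_ORDER = [
    "extra_paths", "user_code", "typeshed_stdlib",
    "stub_packages", "py_typed_packages", "typeshed_third_party",
]


def resolve(module, order, sources=SOURCES):
    """Return the label of the first source in `order` that provides `module`."""
    for source in order:
        if module in sources[source]:
            return source
    return None


print(resolve("asyncio", CURRENT_ORDER))
print(resolve("asyncio", PROPOSED_ORDER))
print(resolve("docutils", PROPOSED_ORDER))
```

Under this model the current order picks the site-packages backport for asyncio, while the proposed order picks the typeshed stdlib stub, and an installed docutils-stubs package still wins over vendored third-party stubs.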

  1. It was slightly amended in “[spec] Update typeshed language to conform to reality” by srittau (Pull Request #1571, python/typing, GitHub).

  2. Yes, I know it is possible to install a package in site-packages that prioritizes itself over the stdlib, by clever use of .pth files. But this is hardly common practice nowadays, and not something that I think type checkers should worry themselves with.

Reading through the mypy source code, it seems there is a comment here explicitly noting that typeshed is deliberately placed higher than site-packages in its implementation of module resolution, for this very reason: mypy/mypy/modulefinder.py at 8dd268ffd84ccf549b3aa9105dd35766a899b2bd · python/mypy · GitHub

I’ve put up a PR implementing these proposed changes here: Tweak the typing spec's module resolution to more closely emulate Python's runtime semantics by AlexWaygood · Pull Request #1772 · python/typing · GitHub.

Minor wording suggestions can be made directly on the PR, but please post any substantive feedback on the proposed changes on this Discourse thread.

I agree that the runtime behavior should inform the typing spec here. How does the runtime implement this mechanism? I may have the wrong mental model, but I thought that the runtime simply resolved imports in the order dictated by the sys.path variable, and site-packages is just one of the paths in sys.path. In other words, I didn’t think that the runtime treated site-packages any differently from other paths. Is that incorrect?

I’m aware that the runtime treats a few specific imports (like sys) specially and completely skips the normal import resolution paths for security reasons.

@erictraut, I think your mental model is correct (though I’m not an expert on Python’s import system). I believe it’s just that site-packages always comes last in sys.path at runtime, so it always has lower precedence than first-party user code (represented by the empty string in sys.path) or the various entries for the standard library.

With a virtual environment activated:

Python 3.12.3 (main, Apr 30 2024, 10:12:02) [Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path
['', '/Users/alexw/.pyenv/versions/3.12.3/lib/python312.zip', '/Users/alexw/.pyenv/versions/3.12.3/lib/python3.12', '/Users/alexw/.pyenv/versions/3.12.3/lib/python3.12/lib-dynload', '/Users/alexw/dev/typeshed/.venv/lib/python3.12/site-packages']

Without a virtual environment activated:

Python 3.12.3 (main, Apr 30 2024, 10:12:02) [Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path
['', '/Users/alexw/.pyenv/versions/3.12.3/lib/python312.zip', '/Users/alexw/.pyenv/versions/3.12.3/lib/python3.12', '/Users/alexw/.pyenv/versions/3.12.3/lib/python3.12/lib-dynload', '/Users/alexw/.pyenv/versions/3.12.3/lib/python3.12/site-packages']

Your mental model of sys.path is correct. The discrepancy arises only because typeshed doesn’t correspond to any single entry on the runtime sys.path. The stdlib portion of typeshed belongs at a different location in the search order than the third-party portion of typeshed.
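
The ordering shown in the sys.path listings above can be checked programmatically. The following is a rough sketch: `site_packages_after_stdlib` is a hypothetical helper, and it identifies site-packages entries purely by directory name, which is an assumption that may not hold for unusual layouts.

```python
import sys
import sysconfig


def site_packages_after_stdlib(paths, stdlib_dir):
    """Return True if every site-packages entry in `paths` follows `stdlib_dir`.

    Returns None when `stdlib_dir` is not itself an entry of `paths`.
    """
    if stdlib_dir not in paths:
        return None
    stdlib_index = paths.index(stdlib_dir)
    site_indices = [
        i for i, entry in enumerate(paths)
        # Assumption: site directories are named site-packages or dist-packages.
        if entry.rstrip("/\\").endswith(("site-packages", "dist-packages"))
    ]
    return all(i > stdlib_index for i in site_indices)


# Check the interpreter that is running this snippet.
print(site_packages_after_stdlib(sys.path, sysconfig.get_paths()["stdlib"]))
```

Passing the path list explicitly keeps the function easy to try against a pasted sys.path, such as the two listings earlier in this thread.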

Ah right, I forgot that site-packages is typically last in sys.path.

The change @AlexWaygood is proposing here makes sense then, and the proposed wording looks good to me.

This is a marginal suggestion, but it might be nice for the standard library to go after stub packages (but still before py.typed). It’s possible users are relying on current behaviour to check against alternate standard library stubs. E.g. you could use it to write a stub package for MicroPython’s stdlib or something.

This would trade off the unlikely case of “there’s a stdlib backport package and a separate stubs package specifically just for that backport” for “users have more options to customise their stdlib type checking” (but maybe the claim is this is an anti-feature)

Separately, should we also maybe mention typeshed’s VERSIONS file here? I could see folks writing new tooling just not knowing about it.

Edit: actually, maybe this suggestion is a little annoying, and users can already do whatever they want via the MYPYPATH-style mechanism in bullet point 1. So unless someone with a non-hypothetical use case shows up, I’m happy to pretend I never posted this.

Another consideration… what about typing_extensions? Is it considered a stdlib module for purposes of this discussion? It’s included in the typeshed stdlib type stubs, but it is also installed in site-packages. Prior to the proposed change, this subtlety didn’t matter. Now we may need to specifically call out the intended behavior. Or maybe it’s enough of an edge case that it doesn’t matter?

Hmm, I don’t know. I feel somewhat opposed to treating any part of site-packages as having higher precedence than the standard library. If you really want a third-party package to have higher precedence than the standard library for some reason, there are ways of doing that, either via the “extra paths” mechanism specified in step (1) of the module resolution order (MYPYPATH in mypy, stubPath for pyright, etc.), or via --custom-typeshed-dir.

If anybody is relying on the current behaviour, moreover, it would work with pyright but not with mypy since, as already noted, mypy already appears to give the typeshed stdlib higher precedence than anything in site-packages.

On mentioning typeshed’s VERSIONS file: I think we definitely should, but this update seemed significant enough on its own, so I had planned to propose that separately.
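
For readers who have not come across it, typeshed's stdlib VERSIONS file records which Python versions each standard-library module exists in, one `module: min-max` entry per line with an open-ended maximum allowed. Below is a minimal parsing sketch. The function names are hypothetical, the sample text is an illustrative excerpt in that format rather than a copy of the real file, and the "fall back to the parent package" rule for unlisted submodules is my reading of the format.

```python
SAMPLE_VERSIONS = """\
# Illustrative excerpt only; consult typeshed's stdlib/VERSIONS for real entries.
asyncio: 3.4-
asyncio.taskgroups: 3.11-
distutils: 3.0-3.11
tomllib: 3.11-
"""


def _to_tuple(version):
    """Convert '3.11' into (3, 11) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def parse_versions(text):
    """Map each module name to (min_version, max_version or None)."""
    result = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        module, _, version_range = line.partition(":")
        low, _, high = version_range.strip().partition("-")
        result[module.strip()] = (_to_tuple(low), _to_tuple(high) if high else None)
    return result


def is_available(versions, module, python_version):
    """Check the most specific listed entry for `module`, falling back to parents."""
    parts = module.split(".")
    while parts:
        name = ".".join(parts)
        if name in versions:
            low, high = versions[name]
            return low <= python_version and (high is None or python_version <= high)
        parts.pop()
    return False


versions = parse_versions(SAMPLE_VERSIONS)
print(is_available(versions, "distutils", (3, 12)))
```

A type checker that ignores this file would resolve imports of modules that do not exist on the targeted Python version, which is why it seems worth mentioning in the spec.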

I’m not sure this proposal really changes much with respect to typing_extensions. Prior to this update, type checkers were expected to resolve typing_extensions imports to the stub for typing_extensions in typeshed’s stdlib directory; following this update, that will still be the case. If anything, I think this proposed update clarifies where typing_extensions should be resolved to, since it’s now unambiguous that typeshed’s stdlib stubs should always take precedence over anything in site-packages.

I’m not sure I fully understand your concern here. I suppose the language could be tweaked to more strongly state that typeshed should be seen as the single source of truth when it comes to the standard library – but I’m not sure that would be a good idea; it might be overspecifying things. For example, I think it would be reasonable to allow type checkers to fall back to inferring types from the standard library at runtime if typeshed doesn’t include a certain standard-library module that’s being imported in user code.

If anything, I think this proposed update clarifies where typing_extensions should be resolved to, since it’s now unambiguous that typeshed’s stdlib stubs should always take precedence over anything in site-packages.

Yes, I agree with your analysis.

I’ve submitted this proposed revision to the Typing Council for a pronouncement: Typing spec update revising typeshed's position in the module resolution order · Issue #30 · python/typing-council · GitHub