Beyond just preventing bugs, increased type annotation coverage in Python code is correlated with better IDE experiences and better performance for other tools that depend on type annotations.
In 2025, Meta is partnering with Quansight to improve the quality of types for third-party packages by contributing inline types or writing type stubs, as well as by building ways to help measure and maintain type annotation coverage. We hope this will benefit the entire community.
We have come up with a list of top libraries where we think adding coverage can benefit developers, and we would also like to hear from members of the typing community, type stub maintainers and library authors.
Which libraries would you like to see have better coverage? Please comment below if you have any ideas or suggestions.
While the Array API package isn’t commonly downloaded, it’s the probable future of libraries written for NumPy, PyTorch, TensorFlow, etc. In my opinion, it’s more important than it seems.
(Guess this is as good a place as any to thank @jorenham for all his work!)
Measuring type annotation coverage; now that’s interesting! I’ve been looking for a tool that’s able to measure the coverage of the type-tests for a while now, but haven’t been able to find anything. Is this also what you’re talking about here? Or are you talking about measuring something like the “% of the public API that has type annotations”?
I’d say Numba, which has no typing coverage at all. As far as I can tell, there aren’t any human-made stub-packages out there. If I ever find the time to do so (which doesn’t feel very likely at the moment), I might even have a go at it myself.
Wagtail is also on my typing wish-list, which, thanks to the very impressive django-stubs (shoutout to @sobolevn), might even be possible to do now.
And last time I checked, pyscript and pyodide were also untyped. For what it’s worth, typing these two is probably a lot easier than typing numba or wagtail.
I think there’s a lot to gain in terms of usability and maintainability from typing the scientific ecosystem. In particular, making generic versions of objects like NumPy arrays (so you can have something that looks like X: Array[Float32, UInt32] and y: Array[Classes], where Classes might be the levels of your variable) and DataFrames like Polars and pandas (so you can have df: DataFrame[Users], where Users is a class defined somewhere else, perhaps even in your backend or another application).
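As a rough illustration of what that could look like, here is a minimal sketch; Array, DataFrame, and Users are hypothetical placeholders to show the shape of the annotations, not existing NumPy/Polars/pandas APIs:

```python
# A sketch only: Array, DataFrame, and Users are hypothetical placeholders,
# not real library APIs.
from dataclasses import dataclass
from typing import Generic, TypeVar

DType = TypeVar("DType")
Schema = TypeVar("Schema")

class Array(Generic[DType]):
    """An array parameterized by its element type."""

class DataFrame(Generic[Schema]):
    """A dataframe parameterized by a row schema."""

@dataclass
class Users:
    id: int
    name: str

def fit(X: Array[float], df: DataFrame[Users]) -> None:
    """Signature-only example; a type checker can now verify the schema at call sites."""
```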
There are also challenges for popular C++ projects that are packaged as Python wheels, where they need to generate Python types for a C++ API. OpenCV, which has over 80k stars on GitHub, is one example: in this issue, input types are not properly preserved across a lot of generic functions.
I have been working on the boto3-stubs project for quite a while. It provides type annotations for boto3 and botocore. There are also the types-aiobotocore and types-aioboto3 packages for the unofficial AWS SDKs.
Since boto3 is the most downloaded package on PyPI, this package improves type coverage for a large number of projects.
However, there are some issues:
PyCharm still has a bug that causes high CPU usage on @overload functions with Literals (this is the reason why boto3-stubs-lite exists)
building type annotations for a specific version of boto3 requires some extra work from the user, and is not consistent with how you usually use third-party type annotations
usually it is impossible to check which keys in botocore output structures are required or optional. The best way is to mark them manually (see the sketch below).
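For anyone unfamiliar with what “marking them manually” involves, here is a minimal sketch using TypedDict and NotRequired; the structure name and keys are made up for illustration and are not actual boto3-stubs output:

```python
# Hypothetical example: the class name and keys are illustrative only,
# not generated boto3-stubs definitions.
from typing import TypedDict

from typing_extensions import NotRequired  # typing.NotRequired on Python 3.11+

class ExampleListOutputTypeDef(TypedDict):
    Items: list[str]             # assumed to always be present in the response
    NextToken: NotRequired[str]  # only present when the result is paginated
```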
I would be happy if someone could help me to improve the project.
Thanks for all the suggestions everyone, this has been very helpful!
We’ve started reaching out to maintainers and creating PRs, notably with some good progress in pandas-stubs by @MarcoGorelli.
We’ll start reaching out to more maintainers in this thread in the coming weeks to learn more about workflow and pain points.
In the short-term, we plan to make more direct contributions to packages related to the scientific stack, but we’re also looking at opportunities to build tools/automation that could be useful for everyone.
Thanks - some of this work landed in the new release from this week, which resolves several issues related to untyped arguments (e.g. I was pleased to see CI fail in Narwhals when the newest pandas-stubs release meant that some # type: ignores were no longer necessary)
Hats off to the main pandas-stubs maintainer (Irv Lustig) for keeping things moving quickly and providing great feedback!
Sharing my response to Danny in the Pyscript repo here for visibility:
There is an initial release of stubs for the WebAssembly port of MicroPython, which includes an initial stub for the pyscript module and is published to PyPI: micropython-webassembly-stubs · PyPI
That is based on the MicroPython-specific tools I created, and partly on stubs created by pyright --createstub.
The challenge is to create processes and tools to maintain consistent stubs and documentation without increasing the effort or complexity of doing so for maintainers.
Missing typing functionality:
One thing that is missing specifically for MicroPython is that today it is not possible for type checkers to select stubs based on the MicroPython version, nor on the port (webassembly in this context).
So rather than maintaining one set of stubs with conditions, multiple packages are needed, increasing the complexity for both maintenance and use.
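For comparison, this is a sketch of the kind of conditional that type checkers already understand in CPython-oriented stubs (the function names are made up); there is currently no analogous hook for a MicroPython version or port:

```python
# example.pyi (sketch; new_api and wasm_only are hypothetical names)
import sys

if sys.version_info >= (3, 12):
    def new_api() -> int: ...

if sys.platform == "emscripten":
    def wasm_only() -> None: ...
```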
IDE Tooling:
To simplify maintenance, IDEs could/should provide better support for mixed-language files:
.c files with embedded documentation in .rst or .md, plus Python samples
.pyi files with embedded .rst/.md, which in turn embed Python samples
It’s exciting to see the progress here. Augmenting and improving the type information in top libraries helps the entire ecosystem, so this work is very impactful! Thanks to all who are contributing to this initiative.
In case you’re not already aware, I added a feature to pyright a while ago that allows library and stub authors to find places where they are missing type annotations within their library’s public interface. The pyright --verifytypes command will also generate a “type completeness” score from 0 to 100%. I guess this isn’t the same as measuring the coverage of type tests, but it should give you some insight into the coverage. Documentation can be found here. If you see ways to improve this tool, let me know.
Pyright’s --verifytypes is indeed a very useful tool. If I understand correctly, the “type completeness” is the relative number of public(?) symbols in the Python sources that Pyright is able to fully determine the type of. Did I get that right?
@MarcoGorelli recently suggested that we might be able to use this score in NumPy to help us prevent typing-related regressions. His idea was to require a certain lower bound on the type completeness score in CI. Ideally that would be 100%, but at the moment there are parts of NumPy (mostly numpy.ma) that are not fully annotated yet. And since --verifytypes fails for any score below 100%, it would require a custom wrapper script.
Another challenge is that we weren’t able to get pyright to exclude certain irrelevant parts of the codebase, such as the unit tests (which live in-project in numpy). The exclude config does not seem to be used by --verifytypes, and there is also no # pyright: ignore[reportVerifyTypes] or similar.

To illustrate: running pyright --ignoreexternal --verifytypes numpy (without the wrapper script, and with pyright==1.1.402) on the current main branch of numpy, the type completeness score is something like 44%, but the majority of the reported errors are related to the tests. So if we required a score of e.g. >40% now, adding unit tests could cause the CI to fail, because it would be misinterpreted as a typing regression. This is something a wrapper script could also work around, but I can imagine that we’re not the only ones who would find it useful if --verifytypes could exclude certain parts of the codebase.
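For reference, a minimal version of the kind of wrapper script discussed here could look like this; the 40% threshold is arbitrary, and the JSON field names are assumptions to double-check against pyright’s --outputjson documentation:

```python
# Sketch of a CI wrapper around `pyright --verifytypes`; the threshold is
# arbitrary, and the JSON field names ("typeCompleteness",
# "completenessScore") should be verified against pyright's documentation.
import json
import subprocess
import sys

MIN_SCORE = 0.40  # lower bound to enforce; ideally this ratchets up over time

# pyright exits non-zero when the score is below 100%, so don't use check=True
proc = subprocess.run(
    ["pyright", "--ignoreexternal", "--verifytypes", "numpy", "--outputjson"],
    capture_output=True,
    text=True,
)
report = json.loads(proc.stdout)
score = report["typeCompleteness"]["completenessScore"]

print(f"type completeness: {score:.1%} (required: {MIN_SCORE:.1%})")
sys.exit(0 if score >= MIN_SCORE else 1)
```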
When I was talking about “type coverage” in my earlier post, I was thinking of an analogue to “testing coverage” metrics that tools like coverage.py provide.
In NumPy, we test the bundled stubs using type-tests. Specifically, we have a collection of .pyi files (in numpy/typing/tests/data) that use typing.assert_type for acceptance testing (the reveal subdirectory), and .pyi files with # type: ignores for rejection testing (the fail subdirectory). This way, “running” the tests is the same as running the static type-checker (currently only mypy).
So what I had in mind when I heard “type coverage”, was something like the % of the stubs that are covered by these type-tests, analogous to the (runtime) test coverage of e.g. coverage.py.
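To make that concrete, here is a minimal sketch of the two kinds of type-tests; mylib and clip are hypothetical, not actual NumPy test content:

```python
# reveal-style acceptance test: assert the type the checker infers
# (mylib.clip is a hypothetical stubbed function returning float)
from typing import assert_type

import mylib

assert_type(mylib.clip(0.5, 0.0, 1.0), float)

# fail-style rejection test: this call is expected to be a type error;
# if the stub regresses and the error disappears, the now-unused
# `# type: ignore` is itself reported (e.g. via --warn-unused-ignores)
mylib.clip("not a number", 0.0, 1.0)  # type: ignore
```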
In the case of libraries with C extensions, a different form of “type coverage” could also be useful: the % of (public) symbols that are “covered” by stubs (or inline annotations). But I believe that pyright won’t be able to detect that a public module written purely in C has no annotations. In the case of NumPy, there’s quite a lot of C code, so there’s a chance that Pyright’s current “type completeness” score is too optimistic.
But I understand that such a feature would require pyright to be equipped with stubtest-like runtime capabilities, which I’m guessing is considered to be out-of-scope for Pyright.