I did some analysis of stubs packages on PyPI and thought the data might be interesting to share here.
For the purposes of this analysis, I define a stubs package as a package whose name follows the “-stubs” naming convention from PEP 561. This is probably not a complete listing, since some packages are obviously stubs but do not follow this convention, such as boto3-stubs-lite.
The code to generate the results is in my fork of @lolpack’s type coverage script: GitHub - yangdanny97/type_coverage_py
An updated HTML report of top 2k packages can be viewed here: Git-Forge HTML Preview
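To make the naming-convention definition above concrete, here is a minimal sketch of how a package name could be classified. It is not taken from the linked script; `normalize` and `base_package` are hypothetical helper names, and the name normalization follows the usual PyPI rule of lowercasing and collapsing runs of `-`, `_` and `.`.

```python
import re
from typing import Optional

STUBS_SUFFIX = "-stubs"


def normalize(name: str) -> str:
    """Normalize a project name: lowercase, runs of -, _ and . become one dash."""
    return re.sub(r"[-_.]+", "-", name).lower()


def base_package(name: str) -> Optional[str]:
    """Return the package a "-stubs" distribution targets, or None otherwise."""
    normalized = normalize(name)
    if normalized.endswith(STUBS_SUFFIX) and len(normalized) > len(STUBS_SUFFIX):
        return normalized[: -len(STUBS_SUFFIX)]
    return None


def is_stubs_package(name: str) -> bool:
    """True only for names that follow the "-stubs" suffix convention."""
    return base_package(name) is not None
```

Under this check, rfc3986-stubs maps back to rfc3986, while boto3-stubs-lite is not recognized, which is exactly the gap in coverage mentioned above.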
How common are stubs packages?
Not that common - of the top 8k downloaded packages, 106 (1.3%) have stubs packages, and 15 (0.18%) are stubs packages. Separate stubs packages are more common in the most widely-used packages - 15 of the top 100 packages have corresponding stubs packages.
How about typeshed? Of the top 2000 packages, 114 have typeshed stubs while only 74 have stubs packages. Interestingly, 10 packages have both typeshed stubs and stubs packages.
Full data here: pastebin
Are stubs packages up to date?
Having stubs in a separate PyPI package with a different release schedule and potentially different maintainers raises the question of whether these stubs are reliable and up to date.
To answer this question, I looked up each package with a corresponding stubs package and compared their latest release dates. I also compared the release frequency of packages vs. stubs.
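The lookup described above could be done along these lines. This is only a sketch, assuming the PyPI JSON API (`https://pypi.org/pypi/<name>/json`) as the data source and treating the newest file upload time as the "latest release" date; the actual script may define it differently. The helper names are hypothetical.

```python
import json
from datetime import datetime
from typing import Optional
from urllib.request import urlopen


def fetch_metadata(name: str) -> dict:
    """Download project metadata from the PyPI JSON API (needs network access)."""
    with urlopen(f"https://pypi.org/pypi/{name}/json") as resp:
        return json.load(resp)


def _parse_time(value: str) -> datetime:
    # Older Python versions reject a trailing "Z" in fromisoformat, so swap it
    # for an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def latest_upload(metadata: dict) -> Optional[datetime]:
    """Newest upload time across all release files, or None if there are none."""
    times = [
        _parse_time(f["upload_time_iso_8601"])
        for files in metadata.get("releases", {}).values()
        for f in files
    ]
    return max(times, default=None)


def release_count(metadata: dict) -> int:
    """Number of versions that have at least one uploaded file."""
    return sum(1 for files in metadata.get("releases", {}).values() if files)
```

Calling `latest_upload(fetch_metadata("rfc3986"))` and the same for `"rfc3986-stubs"` would give the two dates to compare. Keeping the parsing separate from the download makes it easy to run against saved responses.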
Staleness
We define staleness as the days between the latest release of the main package and the latest release of the stubs package. A negative value means that the stubs package is newer than the main package.
Ideally, we would expect stubs to be updated soon after a main package release, so well-maintained stubs packages should have a small negative staleness value.
In practice, this ideal scenario is rare. Staleness varies hugely, and the median stubs package is about 280 days stale. As expected, stubs for more popular packages are more likely to have low staleness.
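The staleness metric defined above can be sketched as follows. The package names and dates are made up for illustration and are not from the dataset; `staleness_days` is a hypothetical helper name.

```python
from datetime import date
from statistics import median


def staleness_days(main_latest: date, stubs_latest: date) -> int:
    """Days by which the stubs lag the main package.

    Positive: the main package was released after the stubs (stubs are stale).
    Negative: the stubs were released after the main package.
    """
    return (main_latest - stubs_latest).days


# Hypothetical (main latest release, stubs latest release) pairs.
pairs = {
    "pkg-a": (date(2024, 3, 1), date(2024, 3, 4)),  # stubs follow 3 days later
    "pkg-b": (date(2024, 6, 1), date(2023, 6, 1)),  # stubs a year behind
    "pkg-c": (date(2016, 1, 1), date(2023, 1, 1)),  # main package dormant
}

values = {name: staleness_days(main, stubs) for name, (main, stubs) in pairs.items()}
median_staleness = median(values.values())
```

Here pkg-a is the ideal case (a small negative value), pkg-b is stale, and pkg-c is a large negative outlier where the main package simply stopped releasing.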
See the attached chart for a detailed breakdown. Full data here: pastebin
There are a lot of positive and negative outliers here, with a wide variety of reasons:
- rfc3986-stubs: The stubs are several years ahead of the main package, because the main package hasn’t had a release in years. Presumably, users are downloading/consuming the latest package source directly.
- pyjarowinkler-stubs: The main package hasn’t been updated since 2016, but in 2023 someone made a stubs package as a practice project.
Update Frequency
Sometimes, stubs packages are made as one-off practice projects or are unmaintained/poorly maintained compared to the main package. The maintainers of the stubs may not be the same as for the main package, and it’s clear that most stubs are not set up to be automatically bumped when the main package is released.
In contrast, typeshed has stubsabot, which submits PRs to bump typeshed’s stubs when the main package releases, and any API changes are flagged to maintainers by CI.
Some data: the median package in this analysis has 46 releases, while the median stubs package has just 3. Out of the 106 stubs packages I looked at, over a fifth (23) have only a single release and another 19 have only two releases; 81/106 have <10 releases.
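For reference, summary figures like these can be derived from a list of per-package release counts. A minimal sketch, with a hypothetical `summarize` helper and made-up counts rather than the real data:

```python
from statistics import median


def summarize(release_counts):
    """Median release count plus the share of packages with few releases."""
    n = len(release_counts)
    return {
        "median": median(release_counts),
        "single_release_share": sum(c == 1 for c in release_counts) / n,
        "two_release_share": sum(c == 2 for c in release_counts) / n,
        "under_ten_share": sum(c < 10 for c in release_counts) / n,
    }


# Hypothetical release counts for eight stubs packages.
stubs_counts = [1, 1, 2, 3, 3, 5, 12, 40]
summary = summarize(stubs_counts)
```

Running the same function over the main packages' counts gives the comparison between the two medians.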
To me this raises a lot of questions about how much we can trust standalone stubs packages vs. a centralized stubs repository like typeshed. The latter appears to be far better maintained and is a more popular way to distribute types across existing libraries.
Where should types go?
I’m curious to hear what everyone’s thoughts are on how best to add static typing for popular packages.
For library maintainers who don’t want to annotate their code directly with types, it seems like there are three options.
1. Generate stubs alongside the code and set up CI to ensure they are updated along with the source code.
2. Generate the stubs, ship them as a separate package, and set up CI to ensure it is released along with the main package.
3. Generate the stubs in typeshed.
#1 or #2 would probably need to be done on a case-by-case basis and might be a hard sell for some package maintainers. If I were a package maintainer I would probably prefer #1 or adding the types inline over having a separate stub package because it seems like less work.
#3 seems like it would have the most consistent standards since all the infra and CI is already set up & all the typecheckers use it, but if we were to add stubs for, say, 1000 packages into typeshed I feel like the maintainers wouldn’t be able to keep up with updates.
Some possible ways of reducing the maintenance burden for typeshed:
- Have some way of marking stubs as fully-generated, and automatically regenerate them each time. This would probably require big improvements to the current stub generation tooling, and the resulting stubs would be lower-quality than handwritten ones.
- Have some way of marking ownership of typeshed stubs, so that when a version is bumped a bot automatically opens an issue against the package’s source repo reminding the maintainers to also update typeshed, instead of relying on typeshed maintainers to stay on top of all the updates.
Thoughts?