Function Coverage for Official Python Documentation

At the most recent Documentation Community Team meeting, a question arose about what fraction of public functions are documented at docs.python.org. It was prompted by a recent GitHub issue in which a user found a public function in the source code of the importlib module, did not find it documented, and proposed a contribution in a PR.

We think it’d be a net gain to systematically determine which public functions have corresponding documentation. If anybody knows of prior discussion about this, could you link us to that? I’ve checked the existing communication channels and couldn’t find anything for search terms like documentation coverage.

Also, we would welcome feedback about building a coverage report for the official Python docs, for example a generated file along these lines:

  • importlib [95%]
    importlib.metadata.distribution
    importlib.metadata.distributions
    importlib.metadata.metadata
  • io [100%]
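
A rough sketch of how such a file could be rendered, assuming the names listed under a module are the ones found to be undocumented. The function name `format_report` and the shape of its input (total function count plus a list of missing names per module) are made up for illustration, and the counts in the usage example are placeholders rather than real measurements.

```python
def format_report(results):
    """Render a plain-text coverage report.

    `results` maps a module name to a (total, missing) pair, where `total`
    is the number of functions considered and `missing` is the list of
    names with no documentation found. This input shape is an assumption.
    """
    lines = []
    for module_name in sorted(results):
        total, missing = results[module_name]
        documented = total - len(missing)
        # Integer percentage, rounded down; an empty module counts as 100%.
        percent = 100 if total == 0 else 100 * documented // total
        lines.append(f"• {module_name} [{percent}%]")
        # List the undocumented names, indented under their module.
        lines.extend(f"  {name}" for name in sorted(missing))
    return "\n".join(lines)


# Placeholder numbers, chosen only to reproduce the layout above.
example = {
    "importlib": (60, [
        "importlib.metadata.distribution",
        "importlib.metadata.distributions",
        "importlib.metadata.metadata",
    ]),
    "io": (10, []),
}
print(format_report(example))
```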

We hadn’t discussed how to build it, so thoughts on that would be welcome, too. Otherwise, one seemingly straightforward way to get started is to take the names from dir(<module_name>) and search Doc/library/<module_name>.rst for each function.
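
To make that starting point concrete, here is a minimal sketch. It assumes the text of the module's .rst file has already been read into a string, treats a whole-word match anywhere in that text as "documented" (a deliberately crude test), and skips underscore-prefixed names, which is my assumption. The name `undocumented_functions` is hypothetical.

```python
import importlib
import inspect
import re


def undocumented_functions(module_name, rst_text):
    """Return function names from dir(module) that never appear in rst_text."""
    module = importlib.import_module(module_name)
    missing = []
    for name in dir(module):
        # Assumption: leading-underscore names are out of scope.
        if name.startswith("_"):
            continue
        obj = getattr(module, name)
        # Keep the scope to functions; classes, submodules and constants are skipped.
        if not (inspect.isfunction(obj) or inspect.isbuiltin(obj)):
            continue
        # Crude check: the name appears as a whole word somewhere in the docs text.
        if not re.search(rf"\b{re.escape(name)}\b", rst_text):
            missing.append(name)
    return missing


# Tiny stand-in for the contents of Doc/library/textwrap.rst.
sample_rst = ".. function:: wrap(text, width=70)\n.. function:: fill(text, width=70)\n"
print(undocumented_functions("textwrap", sample_rst))
```

In a CPython checkout, `rst_text` would come from reading Doc/library/<module_name>.rst. A plain word search will produce false positives (a name mentioned in passing counts as documented), so this is only a first approximation.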

We would also welcome thoughts on edge cases around what counts as a “public function” that should be in the denominator of the coverage calculation. Naively, it could be any function exposed on the module, but someone brought up an edge case in a standard library module that is intentionally not documented, so we would layer on hardcoded exceptions like that as they arise.

This would be quite nice. It could be combined with “Machine-readable Specification for Deprecated and Removed APIs of CPython” to exclude deprecated APIs from the coverage.

I have assumed that most everything is covered, but I must admit I don’t really know that. Good question.

I should think a tool could be put in the tools directory. It should exclude idlelib and tests.

PS. It does not have to be initially perfect.

The title above is misleading as this thread has nothing to do with docstrings. Should be “Function coverage …”.

How do you define “public”? We’ve always maintained that if it’s not documented on docs.python.org, it’s not public.

If you’re thinking that every function, class or variable whose name doesn’t start with an underscore is public, that’s not the intention of the core development team.

How do you define “public”? We’ve always maintained that if it’s not documented on docs.python.org, it’s not public.

The goal is to check whether the documentation site has every function* that is intended to be there. If “public” is the wrong word for “intended to be used by Python coders” we can switch to some more preferred word.

If you’re thinking that every function, class or variable whose name doesn’t start with an underscore is public…

No, that’s not where we were headed. It’s more like “for every importable function, check if it is documented unless it: A) starts with an underscore, B) is named main, C) is on the following list of intentionally undocumented functions for any reason including deprecation…”. So a combination of rules and an explicit blacklist of additional functions to not document. The end result is a systematic way to check which “public” (<-- maybe substitute new word here) functions are neither documented nor marked as intentionally undocumented.
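
As a first illustration of those rules, here is a small sketch. The exclusion list and its entries (`example_module`, `legacy_helper`) are placeholders I made up, not real decisions about any stdlib module, and `should_be_documented` is a hypothetical name.

```python
import inspect

# Hypothetical list of intentionally undocumented functions, keyed by module.
# The entries are placeholders only.
INTENTIONALLY_UNDOCUMENTED = {
    "example_module": {"legacy_helper"},
}


def should_be_documented(module_name, name, obj, exclusions=None):
    """Apply rules A, B and C to decide whether a name belongs in the denominator."""
    if exclusions is None:
        exclusions = INTENTIONALLY_UNDOCUMENTED
    # Scope: functions only, for now.
    if not (inspect.isfunction(obj) or inspect.isbuiltin(obj)):
        return False
    # A) starts with an underscore
    if name.startswith("_"):
        return False
    # B) is named main
    if name == "main":
        return False
    # C) is on the list of intentionally undocumented functions
    if name in exclusions.get(module_name, set()):
        return False
    return True


def candidate_functions(module, exclusions=None):
    """Return the sorted names in `module` that pass all three rules."""
    return sorted(
        name
        for name, obj in vars(module).items()
        if should_be_documented(module.__name__, name, obj, exclusions)
    )
```

Anything returned by `candidate_functions` that is not found in the docs would then be reported as neither documented nor marked as intentionally undocumented.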

I can take a first pass at this and either post here or open a draft PR to demonstrate the above in code.

* I think keeping the scope to just functions is a sensible way to get started, but classes/variables could come after (or first, if folks here feel strongly).

This still sounds odd to me. You will find tons of unintended importable functions (etc.) not to mention methods.

When you find something undocumented you should assume that’s intentional.

Many times the argument is used: “It’s not documented so you shouldn’t use it, and if you do it’s your problem if it changes in a later release.”

Even the presence of a docstring has no weight here.

Your idea may work for certain other libraries but I don’t think it would add value for the Python stdlib maintainers.

@tjreedy Fixed now; thanks!