At the most recent Documentation Community Team Meeting, a question arose regarding the fraction of public functions that are documented at docs.python.org. This was kickstarted by a recent GitHub issue where a user found a public function in the source code for the importlib module, did not find it documented, and proposed a contribution in a PR.
We think it’d be a net gain to systematically determine which public functions have corresponding documentation. If anybody knows of prior discussion about this, could you link us to that? I’ve checked the existing communication channels and couldn’t find anything for search terms like documentation coverage.
Also, we would welcome feedback about building a coverage report for the official Python docs, for example a generated file like:
We haven’t discussed how to build it, so thoughts on that would be welcome, too. Otherwise, one seemingly straightforward way to get started is to compare dir(<module_name>) against Doc/library/<module_name>.rst, searching the file for each function name.
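To make that starting point concrete, here is a minimal sketch, not a proposed implementation. The helper name names_missing_from_docs is hypothetical, and it assumes the caller has already read the module's .rst file into a string. It does a plain whole-word text search, so a name mentioned only in passing would still count as documented.

```python
import inspect
import re


def names_missing_from_docs(module, rst_text):
    """Return function names exposed by `module` that never appear in `rst_text`.

    Hypothetical helper: a naive comparison of dir(module) against the text
    of the module's documentation page.
    """
    missing = []
    for name in dir(module):
        obj = getattr(module, name)
        # Only look at functions for now; classes and variables are skipped.
        if not inspect.isroutine(obj):
            continue
        # Whole-word search, so "wrap" is not satisfied by "unwrap".
        if not re.search(rf"\b{re.escape(name)}\b", rst_text):
            missing.append(name)
    return missing
```

From a CPython checkout, this could be called with importlib.import_module("<module_name>") and the contents of Doc/library/<module_name>.rst. No filtering of private names is applied here, so the output would be noisy until rules for the denominator are agreed on.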
Also, we welcome considerations around edge cases regarding what is a “public function” that should be in the denominator of the coverage calculation. Naively, it could be any exposed function on the module, but someone brought up an edge case in a standard library module that is intentionally not documented, so we would layer on hardcoded exceptions like that as they arise.
How do you define “public”? We’ve always maintained that if it’s not documented on docs.python.org, it’s not public.
If you’re thinking that every function, class or variable whose name doesn’t start with an underscore is public, that’s not the intention of the core development team.
How do you define “public”? We’ve always maintained that if it’s not documented on docs.python.org, it’s not public.
The goal is to check whether the documentation site has every function* that is intended to be there. If “public” is the wrong word for “intended to be used by Python coders” we can switch to some more preferred word.
If you’re thinking that every function, class or variable whose name doesn’t start with an underscore is public…
No, that’s not where we were headed. It’s more like “for every importable function, check if it is documented unless it: A) starts with an underscore, B) is named main, C) is on the following list of intentionally undocumented functions for any reason including deprecation…”. So a combination of rules and an explicit blacklist of additional functions to not document. The end result is a systematic way to check which “public” (<-- maybe substitute new word here) functions are neither documented nor marked as intentionally undocumented.
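As a rough illustration of those rules (a sketch under assumptions, not a final design): the names INTENTIONALLY_UNDOCUMENTED, should_be_documented and candidate_functions are hypothetical, and the single entry in the exclusion set is a made-up placeholder rather than a real standard library function.

```python
import inspect

# Hypothetical placeholder entry; the real list would be curated by hand
# as intentionally undocumented functions come up.
INTENTIONALLY_UNDOCUMENTED = frozenset({
    ("examplemodule", "legacy_helper"),
})


def should_be_documented(module_name, name, obj,
                         excluded=INTENTIONALLY_UNDOCUMENTED):
    """Apply the rules in order: functions only, then A, B, and C."""
    if not inspect.isroutine(obj):
        return False  # scope is limited to functions for now
    if name.startswith("_"):
        return False  # rule A: leading underscore
    if name == "main":
        return False  # rule B: named main
    if (module_name, name) in excluded:
        return False  # rule C: explicitly listed as intentionally undocumented
    return True


def candidate_functions(module, excluded=INTENTIONALLY_UNDOCUMENTED):
    """Return the sorted names in `module` that the rules say need documentation."""
    return sorted(
        name
        for name in dir(module)
        if should_be_documented(module.__name__, name,
                                getattr(module, name), excluded)
    )
```

The names returned by candidate_functions would form the denominator of the coverage calculation; anything in that list that is missing from the docs is then either a documentation gap or a new entry for the exclusion list.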
I can take a first pass at this and either post here or open a draft PR to demonstrate the above in code.
* I think keeping the scope to just functions is a sensible way to get started, but classes/variables could come after (or first, if folks here feel strongly).