While reading the PEP 727 discussion, I noticed that many posters were of the opinion that the goals of that PEP can be achieved with docstrings. I started thinking about how the stdlib could be extended to help parse and validate docstrings, and I wanted to share my ideas for further discussion.
The inspect module is extended with the following functions: inspect.getfielddoc(object), inspect.validatefielddoc(object), and inspect.register_docstr_parser(docstr_parser), which work as follows:
inspect.getfielddoc(object) will parse the docstring and return a dictionary where each key is the name of a field from the parsed docstring, and each value is a pair (type, description) with the type and field description from the parsed docstring. The reason for returning a pair is that some developers document the types in the docstring and don’t use type annotations. This API would primarily be consumed by documentation generators like Sphinx. I’m not sure exactly how to represent the documented return value in the dictionary, but I’m leaning towards storing it under the key "->". It could also be possible to have this function return documented exceptions.
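To make the return shape concrete, here is a minimal sketch of what getfielddoc could look like. It is written as a standalone function because inspect.getfielddoc does not exist today, and it assumes Sphinx-style fields (:param:, :type:, :returns:, :rtype:) purely for illustration; the regular expression and the example function power are my own assumptions, not part of the proposal.

```python
import inspect
import re

# Hypothetical sketch: inspect.getfielddoc is a proposed name, not a stdlib function.
# Only simple one-line Sphinx-style fields are recognised here.
_FIELD = re.compile(r"^:(param|type|returns?|rtype)\s*(\w*):\s*(.*)$")


def getfielddoc(obj):
    """Return {field name: (type, description)} parsed from obj's docstring."""
    doc = inspect.getdoc(obj) or ""
    types, descriptions = {}, {}
    for line in doc.splitlines():
        match = _FIELD.match(line.strip())
        if not match:
            continue
        kind, name, text = match.groups()
        if kind == "param":
            descriptions[name] = text
        elif kind == "type":
            types[name] = text
        elif kind in ("return", "returns"):
            descriptions["->"] = text  # the return value is stored under "->"
        else:  # rtype
            types["->"] = text
    # Keep documented order; a missing type or description becomes None.
    names = list(descriptions) + [n for n in types if n not in descriptions]
    return {n: (types.get(n), descriptions.get(n)) for n in names}


def power(base, exponent):
    """Raise base to an exponent.

    :param base: The number to raise.
    :type base: float
    :param exponent: The power to raise it to.
    :returns: The result.
    :rtype: float
    """
    return base ** exponent


fields = getfielddoc(power)
# fields["exponent"] is (None, "The power to raise it to.") because no type was documented.
```

Returning None for an undocumented type keeps the pair shape uniform, which is one possible answer to the question of developers who document types in the docstring versus those who rely on annotations.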
inspect.validatefielddoc(object) will parse the docstring and validate that the fields in the docstring and function definition are the same. This function could have some flags, e.g. making validation fail unless each field has a description, although separate functions could be used instead of flags. This API would primarily be consumed by linters and LSPs.
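As a rough illustration of the validation idea, the sketch below compares the parameters found by inspect.signature with the :param: fields in the docstring. Everything here is an assumption on my part: the standalone validatefielddoc function, the require_descriptions flag name, the choice to return a list of problems rather than a boolean, and the Sphinx-style field syntax.

```python
import inspect
import re

# Hypothetical sketch: inspect.validatefielddoc is a proposed name, not a stdlib function.
_PARAM = re.compile(r"^:param\s+(\w+):\s*(.*)$")


def documented_params(obj):
    """Map each ':param name:' field in obj's docstring to its description."""
    doc = inspect.getdoc(obj) or ""
    matches = (_PARAM.match(line.strip()) for line in doc.splitlines())
    return {m.group(1): m.group(2) for m in matches if m}


def validatefielddoc(obj, require_descriptions=False):
    """Return a list of problems; an empty list means the docstring matches."""
    documented = documented_params(obj)
    declared = list(inspect.signature(obj).parameters)
    problems = [f"undocumented parameter: {n}" for n in declared if n not in documented]
    problems += [f"unknown documented field: {n}" for n in documented if n not in declared]
    if require_descriptions:  # assumed flag name, mirroring the flag idea above
        problems += [
            f"missing description: {n}"
            for n, text in documented.items()
            if n in declared and not text
        ]
    return problems


def scale(x, factor):
    """Scale a value.

    :param x: The value to scale.
    :param ratio: Renamed to factor, but the docstring was not updated.
    """
    return x * factor


problems = validatefielddoc(scale)
```

A linter could report each entry of problems as a separate diagnostic, which is why this sketch returns a list instead of a single pass/fail result.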
inspect.register_docstr_parser(docstr_parser) lets the consumer of the first two APIs register docstring parsers. A docstring parser is a function that takes a string as input and returns the same kind of dictionary returned by inspect.getfielddoc(object). If the consumer wants to parse numpy-style docstrings, they can register a NumpyDocstrParser with inspect.register_docstr_parser(NumpyDocstrParser()), and now inspect.getfielddoc and inspect.validatefielddoc will work with all objects using numpy-style docstrings. A consumer can register multiple docstring parsers, each of which will be tried in turn until a non-empty dictionary is returned. If this registration procedure can’t work for reasons I’m unaware of, the docstring parser could instead be provided directly to inspect.validatefielddoc. (inspect.getfielddoc wouldn’t be needed in that case.)
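The registration mechanism could be sketched roughly as follows. The module-level list, the toy NumpyDocstrParser class, and the sphinx_parser function are all hypothetical stand-ins written for this example; a real numpy-style parser would need to handle sections, multi-line descriptions, and much more.

```python
import inspect
import re

# Hypothetical sketch of the proposed registry; none of this exists in inspect today.
_parsers = []


def register_docstr_parser(docstr_parser):
    """Add a callable that maps a docstring to {name: (type, description)}."""
    _parsers.append(docstr_parser)


def getfielddoc(obj):
    """Try each registered parser in turn until one returns a non-empty dict."""
    doc = inspect.getdoc(obj) or ""
    for parser in _parsers:
        fields = parser(doc)
        if fields:
            return fields
    return {}


class NumpyDocstrParser:
    """Toy parser: only 'name : type' lines followed by one indented description line."""

    _header = re.compile(r"^(\w+) : (.+)$")

    def __call__(self, docstring):
        fields = {}
        lines = docstring.splitlines()
        for i, line in enumerate(lines):
            match = self._header.match(line)
            if match:
                following = lines[i + 1] if i + 1 < len(lines) else ""
                description = following.strip() if following.startswith(" ") else None
                fields[match.group(1)] = (match.group(2), description)
        return fields


def sphinx_parser(docstring):
    """Toy parser: only one-line ':param name: description' fields, no types."""
    found = re.findall(r"^:param (\w+): (.*)$", docstring, flags=re.MULTILINE)
    return {name: (None, text) for name, text in found}


register_docstr_parser(sphinx_parser)
register_docstr_parser(NumpyDocstrParser())


def area(width, height):
    """Compute an area.

    Parameters
    ----------
    width : float
        Horizontal extent.
    height : float
        Vertical extent.
    """
    return width * height


fields = getfielddoc(area)
```

Here sphinx_parser finds nothing in the numpy-style docstring and returns an empty dictionary, so the lookup falls through to NumpyDocstrParser. One consequence of a global registry like this is that registration order matters and is shared by every consumer in the process, which may be an argument for the alternative of passing the parser in directly.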
A note on the names of the first two functions: their names are not snake_case to mirror the inspect.getdoc name.
This proposal is based on the idea that we want to codify, in the stdlib, the concept of field documentation in docstrings. Whether or not this proposal would achieve that goal (I have zero experience implementing documentation generators and linters), it is useless if we don’t wish to put this feature in the stdlib. I think it would be most fruitful to start by discussing whether the API proposed above, or any similar kind of API for working with field descriptions in docstrings, is suitable for the stdlib. If there is consensus that such an API is a good fit for the stdlib, then we can continue discussing its details. I included the proposed API a) to serve as inspiration for the discussion and b) because I’ve been thinking about it for about a week and needed to get it out of my system.