By Leonard Dye via Discussions on Python.org at 16Sep2022 23:33:
I found this example of ‘docstrings’. This means that our code is to
end up being more verbose than the code itself. Why is this, if Python
is written Pythonic style, it’s good pseudocode on it’s own.
Well, if one follows the precepts of Literate Programming, that can be a
good thing: Literate programming - Wikipedia
Anyway, what you’ve got below is pretty wordy. I’ll show what I would
write by comparison. Remembering that this is a matter of taste and
inclination in some degree.
There’s a section in the Python docs about docstrings too:
Just seems like over kill, since most of the Python code I have seen
have little or no docstrings. Is this how it is really to be done in
the real world? I’m looking for a correct concept for my standard for
programming. Are docstrings put in at the end when the code has been
tested and works?
A docstring should help the user use the code, telling them what
parameters it accepts and their meaning, and what the outcomes are. If
pertinent, some discussion about limitations or bugs. For a lot of small
things, little is needed.
But the code is how the function implements things. (a) the
docstring should help someone who doesn’t have ready access to the code,
or if the code is difficult and (b) the docstring should describe what
the user can expect of the code in terms of outcomes. How that’s done
internally isn’t relevant. And of course the implementation might
change.
Docstrings, like opening comments (they often displace opening comments
either fully or partially), are often, perhaps preferably, written
before the code is written, if the code is nontrivial. FOr myself,
writing a docstring for a tricky function helps me get clear in my mind
what has to be done. If I can’t write a docstring, the precise operation
of the function is often unclear.
If I’m in prototype mode, where I know what I’m trying to solve but
still mucking around with how I’m going to solve it, I’d likely write a
1 line docstring, if any, until the function is stable.
And for genuinely trivial stuff I write the docstring later, often. My
linters whinge about missing docstrings ![:slight_smile: :slight_smile:](https://emoji.discourse-cdn.com/apple/slight_smile.png?v=12)
Let’s look at your example. I like MarkDown myself:
https://daringfireball.net/projects/markdown/syntax
It is pretty succinct, and it’s also the markup we use on the Discourse
forums
I tend to use GitHub flavoured MarkDown, when the dialect
matters: GitHub Flavored Markdown Spec
So, to your example:
class Animal:
“”"
A class used to represent an Animal
…
Attributes
says_str : str
a formatted string to print out what the animal says
name : str
the name of the animal
sound : str
the sound that the animal makes
num_legs : int
the number of legs the animal has (default 4)
Happy days. What the class is for, what public attributes it provides.
I’d write this in markdown like this:
class Animal:
''' A class used to represent an Animal.
Attributess:
* `says_str`: a formatted string to print out what the animal says
* `name`: the name of the animal
* `sound`: the sound that the animal makes
* `num_legs`: the number of legs the animal has, default `4`
'''
Methods
says(sound=None)
Prints the animals name and what sound it makes
“”"
I’d omit this entirely because each method will also have its own
docstring. But the leading section might have explainatory text or
examples if the class has significant complexity.
def init(self, name, sound, num_legs=4):
“”"
Parameters
----------
name : str
The name of the animal
sound : str
The sound the animal makes
num_legs : int, optional
The number of legs the animal (default is 4)
“”"
Again, I’d write:
def __init__(self, name, sound, num_legs=4):
''' Initialise the `Animal`.
Parameters:
* `name`: the name of the animal
* `sound`: the sound the animal makes
* `num_legs`: optional `int`, default `4`: the number of legs the animal
'''
Their says()
docstring:
def says(self, sound=None):
“”"Prints what the animals name is and what sound it makes.
If the argument `sound` isn't passed in, the default Animal
sound is used.
Parameters
----------
sound : str, optional
The sound the animal makes (default is None)
Raises
------
NotImplementedError
If no sound is set for the animal or passed in as a
parameter.
"""
I’d write:
def says(self, sound=None):
''' Print what the animal's name is and what sound it makes.
Parameters:
* `sound`: optional `str`, default `None`:
the sound the animal makes
If `sound` is provided and no sound is set for the animal
this method will raise `NotImplementedError`.
'''
You don’t even need to have a separate section for the parameters; I do
it when there’s more than a few. If I can put the relevant detail in the
opening sentence/paragraph, no need for something distinct. Example:
@typechecked
def deduce_type_bigendianness(typecode: str) -> bool:
''' Deduce the native endianness for `typecode`,
an array/struct typecode character.
'''
The type annotations (:str
, ->bool
) already describe the types of
the parameters and the return, and the function does something very
simple.
Here’s another function whose docstring embeds the parameters in the
opening sentence, but elaborates on the meaning of the unit
parameter:
def as_datetime64s(times, unit='s', utcoffset=0):
''' Return a Numpy array of `datetime64` values
computed from an iterable of `int`/`float` UNIX timestamp values.
The optional `unit` parameter (default `'s'`) may be one of:
- `'s'`: seconds
- `'ms'`: milliseconds
- `'us'`: microseconds
- `'ns'`: nanoseconds
and represents the precision to preserve in the source time
when converting to a `datetime64`.
Less precision gives greater time range.
'''
Now, the builtin help()
Python function doesn’t do anything very
complicated with the docstrings, but it does bundle up the class
docstring with all the method docstrings. Example:
>>> import cs.timeseries
>>> help(cs.timeseries.TypeCode)
Help on class TypeCode in module cs.timeseries:
class TypeCode(builtins.str)
> TypeCode(t)
>
> A valid `array` typecode with convenience methods.
>
> Method resolution order:
> TypeCode
> builtins.str
> builtins.object
>
> Methods defined here:
>
> struct_format(self, bigendian)
> Return a `struct` format string for the supplied big endianness.
>
> ----------------------------------------------------------------------
> Class methods defined here:
>
> promote(t) from builtins.type
> Promote `t` to a `TypeCode`.
and so on.
And various documentation tools will gather all this stuff up for you. I
cobbled together something to produce README.md
files for my modules
built from their docstrings. As a bonus, the README.md
also gets
presented in PyPI.
So to continue the extended example with cs.timeseries
:
It isn’t beautiful, but it is at least fairly effective. The
cs.timeseries
module is fairly long, and a lot of the docstrings are
verbose to get things straight in my own head.
The smaller your code, the less verbosity you will usually need.
Cheers,
Cameron Simpson cs@cskk.id.au