Slash and star again in stdlib docs! (function signatures)

It was an EB decision that the Sphinx docs should include correct signatures: Function signatures should use slash/star as needed by nedbat · Pull Request #1344 · python/devguide · GitHub (see also the discussion thread: Editorial Board decisions), and this is documented in the Python Developer’s Guide.

Is this for new code only, or does the EB have plans to change the existing docs?

The current state looks horrible. Take, e.g., the builtin functions page. Already there we have signatures with slashes (e.g. bool, int or complex) and without (e.g. any or bin). On the Built-in Types page, some of those signatures are documented with the funny square-bracket syntax (which isn't actually explained anywhere). See range() as an example, and the CPython issue: Built-in function `range` params discrepancy across versions · Issue #125897 · python/cpython · GitHub.

It was argued (see this as an example) that omitting some required markers from the function signature makes the docs much more usable for readers, especially newcomers. But I believe that can sometimes be wrong. Take the current range(start, stop, step=1) signature from the builtin functions page. Does step=1 mean (1) just that range has a default value for the third argument, or (2) that it also accepts a keyword argument step? For range, (1) is true, but not (2). Yet nearby we have, e.g., round(number, ndigits=None) with the same syntax, which does accept the keyword argument ndigits (both (1) and (2) hold).
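
To make the two readings concrete, here is a small check that can be run on a current CPython. The helper name accepts_keyword is made up for this illustration, and the result for range reflects CPython's behavior at the time of writing; it would change if builtins gained keyword support.

```python
def accepts_keyword(func, *args, **kwargs):
    """Return True if func(*args, **kwargs) does not raise TypeError."""
    try:
        func(*args, **kwargs)
    except TypeError:
        return False
    return True


# Reading (1) only: step has a default but cannot be passed by name.
print(accepts_keyword(range, 1, 10, step=1))

# Readings (1) and (2): ndigits has a default and can be passed by name.
print(accepts_keyword(round, 1.234, ndigits=2))
```

Both signatures look the same in the docs, yet only the second call succeeds.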

We can start adding the required syntax markers to the docs (again, see Doc: Make functions.html readable again. by JulienPalard · Pull Request #99476 · python/cpython · GitHub), but maybe there is a better way. Why not add support for keyword arguments in builtin functions? Then the “simple” signatures (like range(start, stop, step=1)) will be correct. This will also lower the mental barrier between C-coded functions and pure-Python code (where keyword arguments are allowed by default). IIRC, PyPy’s builtins, implemented in RPython, allow keyword arguments too.

I would guess that before the Argument Clinic was invented this was a complex task, and that positional-only arguments for builtins are a historical artifact of that. But now it’s more or less trivial.

A possible issue is some overhead in argument processing. I tested the conversion for cmath’s functions in my PR and got a ~5% performance penalty:

On main:
$ ./python -m timeit -r11 -s 'from cmath import sin;z=1j' 'sin(z)'
1000000 loops, best of 11: 312 nsec per loop

With patch:
$ ./python -m timeit -r11 -s 'from cmath import sin;z=1j' 'sin(z)'
1000000 loops, best of 11: 330 nsec per loop

Maybe that’s not too high a price for consistent, readable and accurate docs. I would expect this to hold, more or less, for most builtins and the stdlib.

Fixing either the docs or the signatures is fine by me. I think it is important to be accurate. It would be great to add types as well (slightly off topic).

I’d argue we should add functionality (in this case, the keyword argument) before we take away documentation.

The critical part is that reading the documentation and then reading some example code should make sense. It’s not a precise specification of how Python (particularly CPython) is implemented - it’s a description of how Python should be used.

I should note that in many cases (take the cmath module above as an example) we will not touch the documentation at all.

This is more or less a question of implementation. Should we make this change in one shot, in a single release, or do it gradually? Can we backport the changes in argument processing (and thus the changes in the docs)? Etc. Let’s focus on this later and first decide whether the idea is acceptable or not.

Well, for me a signature line like range(start, stop, step=1) means exactly the latter. And it tells us (wrongly) that we can do range(1, 10, step=1) or even range(start=1, stop=10, step=1).

No, it’s a change in behaviour, not a bugfix.

Well, let’s stick with the simple example of cmath.sin(). Right now it’s documented both in the docstring:

>>> help(cmath.sin)
Help on built-in function sin in module cmath:

sin(z, /)
    Return the sine of z.

and in the Sphinx docs:

cmath.sin(z)
    Return the sine of z.

Strictly speaking, both signatures can’t be true. If we prefer the second meaning, support for the keyword argument might be considered a bugfix. If the first, this will indeed be a compatibility break. Though it will affect code only if someone relies on a very special behavior, i.e. that sin(z=1) raises a TypeError.
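
The discrepancy can also be checked programmatically. This is a minimal sketch using only inspect.signature from the stdlib; the printed values assume a CPython where cmath.sin is still positional-only, as in the help() output quoted above.

```python
import cmath
import inspect

sig = inspect.signature(cmath.sin)
print(sig)  # expected to match the help() output above, i.e. (z, /)

param = sig.parameters["z"]
print(param.kind is inspect.Parameter.POSITIONAL_ONLY)

try:
    cmath.sin(z=1j)
except TypeError as exc:
    # The "very special behavior" some code could, in theory, rely on.
    print("TypeError:", exc)
else:
    print("keyword argument accepted")
```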

IMO functions such as cmath.sin() must not accept keyword arguments, but only have positional-only parameters. Also, positional-only parameters should be documented like that in Sphinx.

Generally and historically, we fix these issues by aligning the docs with the implementation. Not the other way around. A docs fixup (or rather: clarification) is never a breaking change, as opposed to pretty much any behavioural change, often no matter how subtle.

Totally agree with this. This is also true for builtins like for example max and min, IMO.

Why not? If this function were implemented in pure Python (it’s a sort of implementation detail, right?), you would have to make an extra effort to disallow keyword arguments. See e.g. mpmath:

>>> from mpmath import *
>>> mp.sin(1)
mpf('0.8414709848078965')
>>> mp.sin(x=1)
mpf('0.8414709848078965')
>>> fp.sin(x=1)
0.8414709848078965

(As mpmath only recently dropped support for 3.8, there was no way to avoid this behavior.)

Positional-only parameters have some benefits: (1) argument names aren’t part of the API and (2) argument processing is slightly faster.
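
Benefit (1) can be sketched in pure Python. The functions scale_v1, scale_v2 and tag below are hypothetical names used only for this illustration.

```python
def scale_v1(value, /, factor=2):
    return value * factor


def scale_v2(x, /, factor=2):
    # The first parameter was renamed from value to x. No caller can break,
    # because the name could never be used at a call site.
    return x * factor


def tag(name, /, **attrs):
    # A positional-only name is not reserved, so callers may still pass
    # name=... and it lands in attrs.
    return name, attrs


print(scale_v1(3), scale_v2(3))
print(tag("a", name="anchor"))  # ('a', {'name': 'anchor'})
```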

The price is an unusual syntax (compared with other languages, for example). I trust e.g. Raymond’s teaching experience, and his opinion that this poses problems for newcomers, unless people start directly with the language reference or carefully read the Python Tutorial. But when they search for recipes on “how to do such and such in Python”, IMO there is very little chance they will see a “/”-powered function definition.

This is in a different category, due to there not being an obvious name for the parameter, and even if there were, it wouldn’t improve readability to use it (it’s obvious what sin(<one argument>) is doing, but not so obvious what range(1, 10, 2) is doing).

One at a time, as usual. That may mean one module rather than one function, but we always try to avoid doing mass changes across the code base.

Making a positional-only argument named isn’t a breaking change, and the performance impact is negligible (and avoidable by users, who can simply not use the name and avoid 99.9% of the cost).

I agree with you. I just presented a generic argument that applies everywhere, rather than one specific opinion which people are likely to take and misapply (e.g. to sin).

And sometimes this is blatantly an anti-user thing to do :wink:

Making the docs more complex and technical to avoid changing the implementation slightly doesn’t actually help anyone except those who insist on coding to the docs and never testing their code. If updating the implementation is what’s needed to keep the docs straightforward and satisfy the more pedantic among readers, then let’s just update the implementation (or present a good argument for not doing it - and I’d be totally okay with sin including a line like “The argument x cannot be provided like x=<value>”, which is vastly more friendly and helpful than sin(x, /)).

It’s less disruptive than a breaking change.

Nobody’s proposed a breaking change yet, that I’ve noticed. I certainly haven’t.

Yes, but the point is not about using the parameter name; most people don’t do that even in pure Python. The goal is a simpler (yet accurate) function signature.

I think the performance impact should be investigated on a case-by-case basis. Probably, for some parts of the stdlib, in some scenarios, this might be important and we will prefer the “/”-powered API on those grounds. (But at first glance I don’t see obvious candidates for this.)

Note also (see my original post) that you can’t avoid some performance impact even if you aren’t using argument names. Here is a quick test:

+-----------+--------+----------------------+
| Benchmark | ref    | patch                |
+===========+========+======================+
| sin(1j)   | 394 ns | 407 ns: 1.03x slower |
+-----------+--------+----------------------+

# bench.py
import pyperf
from cmath import sin

runner = pyperf.Runner()
z = 1j
runner.bench_func('sin(1j)', sin, z)

BTW, the EB decision thread mentioned that previous discussions (in GitHub issues for the cpython and SC repos) were considered, but not this previous forum thread, which proposed another direction: Signatures, a call to action

I don’t think this thread is relevant here. Yes, the inspect module has known shortcomings that aren’t fixed yet, e.g. multiple signatures, or annotations support in extension modules.

But this is not an obstacle for the docs: Sphinx can already display multiple signatures or show annotations, and we don’t use something like autodoc to get this stuff from the inspect module’s introspection capabilities. I think there is more or less a consensus that if a callable supports several logical signatures, the documentation should list them all, one by one, each being a correct (pure-Python) signature. That’s already possible in Sphinx. E.g. for range this will be:

class range(stop, /)
class range(start, stop, step=1, /)
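
For comparison, this is roughly how the same pair of logical signatures could be spelled in pure Python with typing.overload. The name my_range is hypothetical and the body simply delegates to the real range; it is a sketch of the idea, not how CPython implements range.

```python
from typing import overload


@overload
def my_range(stop: int, /) -> range: ...
@overload
def my_range(start: int, stop: int, step: int = 1, /) -> range: ...


def my_range(*args: int) -> range:
    # One runtime implementation serves both documented signatures.
    if not 1 <= len(args) <= 3:
        raise TypeError(f"my_range expected 1 to 3 arguments, got {len(args)}")
    return range(*args)


print(list(my_range(3)))
print(list(my_range(1, 10, 3)))
```

Each overload line is a valid signature on its own, which is the property the documented signatures should have as well.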

And most callables don’t suffer from this problem. E.g. in the cmath module only one function has multiple signatures.
