I agree with Petr and the earlier comment by Alex that we should focus on what works best for the reader. In general, I believe spelling out “or” or “optional” would cause the least user confusion.
I’ve come around to using English phrasing for docs.
I like using “int, optional” (or a longer form as Petr shows) instead of either “int or None” or “int | None”. This is especially true where the argument isn’t meant to be passed as an explicit None, and None only stands in for an omitted argument.
I like having terms like “file-like” available for describing the type of arguments. Of course those should be linked to a description somewhere if it’s available.
I hope “number” would be a linkable term that mentioned the specifics, though I fear different functions might have a different idea of what a number could mean. If there isn’t a crisp unambiguous definition to link to, then we should use a written description like Petr shows.
Sorry, the Editorial Board dropped the ball here. We’ve been asked for a formal decision (at least on ‘or’ vs ‘|’) in our GitHub tracker. We discussed this at today’s EB meeting, and decided that
a. we should start by deciding on the least controversial items, and
b. I should write up what the EB thinks (mixing in my own views marked as such).
We can then have a slightly more focused discussion here and the EB will hopefully confirm the final decision at our next meeting and then record it both in GH issue 7 and our own EB change log.
So here goes, from least to most controversial:
1. Least controversial is to use “or” instead of “|”. We will recommend using “or” here.
2. Also uncontroversial is allowing pseudo-types like “file-like”. My own idea: link these to a glossary entry, creating one if needed.
3. I’m not so keen on using “optional” for anything, and I think I have mostly convinced the other two EB members present today of this. The reason is that in common parlance “optional” could mean “you can leave this out” but in the Python Type System (PEP 484 and friends) the notation Optional[T] means T or None. We ought to distinguish between “None is one of the allowed values”, and “you can leave out this argument”. The latter should be expressed by stating “the default is X”. If None triggers special behavior (as it does for s.split(), where s.split(None) triggers behavior that cannot be expressed by passing a string) that should be written up explicitly following the phrase stating the default. If the default is some private value (there are some functions that do this, e.g. the initial (3rd) argument to reduce()) this will just have to be explained in the docs.
4. I’m personally not convinced that it would be wise to recommend “number” for anything, and there was no consensus in today’s EB meeting. Maybe to beginners it’s “obvious” that this is short for “int or float”, but for many it would be confusing: does it allow Fraction? Or Decimal? Or Complex (part of the numeric tower)? It may make sense in JavaScript, where there’s only one numeric type, but I believe it’s too confusing in Python, which has the “numbers” module and PEP 3141.
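The distinctions in the last two items can be checked at the interpreter. This is only a minimal sketch using the standard library; the helper name `number_checks` and the sample values are made up for illustration.

```python
from decimal import Decimal
from fractions import Fraction
from functools import reduce
import numbers

# "optional" vs. "None is allowed": None (the default) selects whitespace
# splitting, which no explicit separator string can reproduce.
text = "a  b"                      # two spaces between the letters
default_split = text.split()       # argument left out
none_split = text.split(None)      # explicit None behaves like the default
space_split = text.split(" ")      # a real separator keeps the empty field

# reduce() distinguishes "no initial value" from every real value,
# so its default cannot be written as an ordinary public object.
with_initial = reduce(lambda x, y: x + y, [], 0)
try:
    reduce(lambda x, y: x + y, [])
    empty_error = None
except TypeError as exc:
    empty_error = type(exc).__name__

# "number": which values qualify depends on the definition you pick.
def number_checks(value):
    """Hypothetical helper: three plausible readings of 'number'."""
    return (
        isinstance(value, (int, float)),    # the "int or float" reading
        isinstance(value, numbers.Real),    # the numeric-tower reading
        isinstance(value, numbers.Number),  # the broadest reading
    )

print(default_split, none_split, space_split, with_initial, empty_error)
for sample in [1, 1.5, Fraction(1, 2), Decimal("1.5"), 1 + 2j]:
    print(type(sample).__name__, number_checks(sample))
```

Fraction satisfies `numbers.Real` without being an int or a float, while Decimal and complex only satisfy `numbers.Number`, so the three readings really do disagree.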
Based on principle (a) above I expect that most likely the EB will approve items (1) and (2), closing issue 7, and put items (3) and (4) (and any other question brought up in this thread) off till a future date – or perhaps even deciding to discourage those usages.
Discuss, please! (Silence will be taken as assent.)
After (fairly) recently sync’ing my PR with main, I discovered that this syntax is surprisingly not supported for attributes. I need one of the Sphinx wizards to chime in here (poke @AA-Turner): Is it a bug? Is it a missing feature? Perhaps we are missing a Sphinx extension? I don’t know.
If we cannot use this markup for attributes, I would instead keep the status quo.
Sorry, I’m missing something here. What exactly is it that we want Sphinx to do? I think I’ve been assuming that English (like “int or str”) would be written explicitly, not generated from another notation.
Can you show us what you had hoped would happen, and what is happening instead?
When I initially proposed my PR, only parameters (not attributes) had such a markup.
My PR proposes to change all the |s to or.
I expected the :type: option of .. attribute:: to be parsed and interpreted exactly as the :type: and :param: options of .. class::, .. function::, and .. method::. Instead, for the former, Sphinx produces a syntax error.
For me, I write “int” when I really want an integer value, and “float” when the value can be any real number, so in practice “float” really means “float or int”. I think cases where passing an int instead of a float causes a problem are very rare, now that 1/2 == 0.5. If that were the case, I would document it explicitly.
Edit: And I completely agree with the 3 other points.
As I said, Sphinx does not allow this syntax for attribute types. This is the problem I’m trying to communicate. I’m not sure how to more clearly express myself, so I suggest taking a look at the commit log on the mentioned PR.
That’s how Python’s type system works too. But we’re inventing new conventions here that don’t rely on the user understanding such subtleties, and if e.g. sleep() were to be documented as taking a float, some users would be reluctant to say sleep(60), using (uglier) sleep(60.0) instead, or even write (inefficient) sleep(float(n)) if n was an int. So by convention we use int or float for such cases. There are also likely extensions that require a float and barf on an int, since you need to write a separate code path for each.
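A small sketch of the point about int versus float. `wait_strict` is a hypothetical function standing in for an API that insists on a real float; it is not taken from any library.

```python
import time

def wait_strict(seconds):
    """Hypothetical API that accepts only a real float (illustration only)."""
    if not isinstance(seconds, float):
        raise TypeError(f"expected float, got {type(seconds).__name__}")
    return seconds

# time.sleep() takes an int or a float; both calls are fine.
time.sleep(0)
time.sleep(0.0)

# At runtime an int is not an instance of float, even though it
# converts to an equal float value here.
int_is_float = isinstance(60, float)
same_value = float(60) == 60

# A strict API like the hypothetical one above rejects the int.
try:
    wait_strict(60)
    strict_result = "accepted"
except TypeError:
    strict_result = "rejected"

print(int_is_float, same_value, strict_result)
```

Documenting such a parameter as “int or float” tells the reader which of the two kinds of function they are dealing with, without requiring them to know the type-system convention.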
After discussions with Erlend and Ned, I’ve opened this PR for Sphinx to clarify :type:. Specifically for the attribute, data, and property directives, :type: means two entirely different things depending on whether it is part of the directive argument/options or the directive body.
There are options for changing Sphinx’s behaviour here, but I would welcome thoughts on what Sphinx should do. The info-list options date back to v0.4 (2008), just after Python 3.0 beta 1 was released. Type annotations were far less developed then than they are now, and perhaps as a result of this Sphinx’s info lists support three container syntaxes (list of int, list(int), and list[int]), as well as both the or and | union operators. It might make sense to standardise on “runtime typing” in documentation, or we might want to extend support for or and of to the parts of Sphinx where they currently aren’t permitted.
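For readers who have not used info field lists, here is a minimal sketch of the English-style notation inside a docstring. The function `fetch` and its parameters are hypothetical; only the `:param:`, `:type:`, `:returns:`, and `:rtype:` field names and the “or” and `list[...]` forms are the Sphinx features being discussed.

```python
def fetch(url, retries=3, timeout=None):
    """Hypothetical function documented with Sphinx info field lists.

    :param url: Address to fetch.
    :type url: str
    :param retries: Number of attempts; the default is 3.
    :type retries: int
    :param timeout: Seconds to wait; the default is ``None`` (no limit).
    :type timeout: int or float or None
    :returns: The response, one line per item.
    :rtype: list[str]
    """
    # Placeholder body: the docstring is the point of this sketch.
    return []

print(fetch.__doc__)
```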
Hopefully it’s back here. The EB got together with @AA-Turner, and we came up with the following:
Because this is the “Library reference”, we should prefer precision and accuracy.
We recommend using the | notation wherever it applies, in favor of or.
To solve the problem of including things like file-like object etc., we recommend putting those in string quotes (either ' or "), and modifying the rendering code to look in those strings for linkable topics like file-like object.
Adam has volunteered to make the necessary changes to Sphinx (or a Sphinx package, wherever this lives).
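To make the quoted-string idea concrete, here is a rough sketch of how a renderer might tell real types from quoted pseudo-types. It is an assumption for illustration only: `classify_type_parts` is a hypothetical helper, not Sphinx code and not the planned implementation.

```python
import re

# Matches a part wrapped in matching single or double quotes.
_QUOTED = re.compile(r"""^(['"])(.+)\1$""")

def classify_type_parts(expr):
    """Hypothetical helper: split on "|" and flag quoted parts as terms."""
    parts = []
    for raw in expr.split("|"):
        text = raw.strip()
        match = _QUOTED.match(text)
        if match:
            # Quoted text would be looked up as a linkable glossary topic.
            parts.append(("term", match.group(2)))
        else:
            # Anything else is treated as an ordinary type name.
            parts.append(("type", text))
    return parts

print(classify_type_parts("int | 'file-like object' | None"))
```

The naive split would mishandle a quoted term that itself contains “|”; a real implementation would need a proper tokenizer.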