Counter-argument: "it's not typing". Annotated was designed to annotate types, true, but its documentation also says:
As such, Annotated can be useful for code that wants to use annotations for purposes outside Python's static typing system.
Unfortunately the typing module is the only meaningful location in the standard library for Doc (next to Annotated). There isn't a generic documentation-related standard module (maybe we could create one?). Even if many people do not use or think of type annotations primarily as documentation, they are still documentation.
Counter-argument: "it's too verbose". Not really. Spacing aside, you can alias Annotated as A, then A[int, Doc("string")] is ~10 characters more than int (excluding the description itself) for each documented parameter. And from that we could subtract the average length of parameter names, since we don't have to repeat them in function/class docstrings.
Counter-argument: "it makes big signatures". Big signatures can be collapsed (partly or completely) with adequate tooling. Even without Doc, signatures with annotated types can grow quite big, so it would make sense to me for IDEs and text editors to allow collapsing each parameter in a signature, maybe showing just the (truncated) unannotated type and the (truncated) default value.
Counter-argument: "it's not pretty, or not as pretty as function docstrings". That is personal preference.
Counter-argument: "accepting it in the stdlib puts pressure on the ecosystem to adopt it". A recent thought I had: true, but the opposite is true too. By not providing a standard way to structure docs data, pressure is put on the ecosystem to rely on non-standard, divergent, unspecified and sub-optimal docstring styles.
Anyway, I hope I'm not starting the debate all over again. I'm actually happy if Doc lives in a third-party library. I just wanted to share my perspective again as someone who works a lot with docstring styles (and enjoys them less and less). I hope we can continue working and discussing towards a standard solution even after this PEP is withdrawn or rejected.
It already does, as typing_extensions.Doc, and you are free to use it. We'll probably keep it in typing_extensions indefinitely even if the PEP gets withdrawn or rejected, for backwards compatibility reasons.
You are free to use it in your own code using the typing-extensions version. If usage of typing_extensions.Doc becomes widespread, that will be a good argument for accepting the PEP and putting it in the standard library.
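As a rough sketch of what using the typing-extensions version looks like today: the function `greet` below is a made-up example, and the fallback class is only a stand-in (an assumption that mirrors the single `documentation` attribute) so the snippet also runs where typing-extensions is missing or too old to ship Doc. It also shows that code which does not care about the extra metadata never sees it.

```python
from typing import Annotated, get_type_hints

try:
    from typing_extensions import Doc
except ImportError:
    # Stand-in for illustration only, used when typing_extensions.Doc is unavailable.
    class Doc:
        def __init__(self, documentation: str, /) -> None:
            self.documentation = documentation


# Hypothetical example function; only the annotations matter here.
def greet(
    name: Annotated[str, Doc("Who to greet")],
) -> Annotated[str, Doc("The greeting")]:
    return f"Hello, {name}!"


# By default the Annotated wrapper is stripped, so plain type consumers see `str`.
plain = get_type_hints(greet)

# With include_extras=True the Doc metadata is preserved for tools that want it.
rich = get_type_hints(greet, include_extras=True)
name_doc = rich["name"].__metadata__[0].documentation
```

Here `plain` maps both `name` and `return` to `str`, while `name_doc` holds the description attached to `name`.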
If @ is deemed too cryptic by many, then how about -?
- can be intuitively read as em dash, used in places where a set of parentheses or a colon might otherwise be used:
def some_function(
    some_parameter: SomeType - "Some documentation goes here",
) -> SomeReturn - "Some details about this return type":
    """ Documenting the function itself here """
And as a bonus, - is both a binary and a unary operator, so there's the potential to allow a parameter to be documented without a type:
def some_function(
    some_parameter: SomeType - "Some documentation goes here",
    **kwargs: - "Additional keyword arguments"
) -> SomeReturn - "Some details about this return type":
    """ Documenting the function itself here """
Some might argue that em dash is really a long dash, usually represented in ASCII as two dashes, which we can also consider as an alternative for better aesthetics:
def some_function(
    some_parameter: SomeType -- "Some documentation goes here",
    **kwargs: -- "Additional keyword arguments"
) -> SomeReturn -- "Some details about this return type":
    """ Documenting the function itself here """
This would make the syntax more in line with SQL comments.
I don't think it's a problem, because unary minus on a string currently produces a TypeError. There's otherwise no intuitive use for it as a unary operator on a string anyway.
Yes, but my point is just that we can keep the type annotation optional.
Even if this was worth pursuing, and I don't think it is, @ was used in that hypothetical because there is no conceivable future need for matmul on types. Using - is problematic due to the potential for difference types in the future (i.e. Iterable[str] - str, being any iterable of strings that isn't just a string), paired with strings being used for forward references in typing.
Keep in mind that plenty of other objections to this idea were brought up in this discussion that apply regardless of what the syntax would actually be.
Good point, so type - 'string' is problematic because 'string' can be used as a forward reference to a type, but I think type -- 'string' can still work because - 'string' can be made a Doc('string'), which type.__sub__ can special-case and turn into Annotated[type, Doc('string')].
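To make that mechanism concrete, here is a small emulation that runs on current Python. Everything in it is hypothetical: `DocStr` stands in for a str that supports unary minus (a real str raises TypeError), `Documentable` is a metaclass playing the role of a special-cased `type.__sub__`, and `Doc` is a minimal stand-in class.

```python
from typing import Annotated, get_args


class Doc:
    """Minimal stand-in for a documentation marker."""

    def __init__(self, documentation: str) -> None:
        self.documentation = documentation


class DocStr(str):
    """Hypothetical str whose unary minus produces a Doc."""

    def __neg__(self) -> Doc:
        return Doc(str(self))


class Documentable(type):
    """Hypothetical metaclass emulating a special-cased type.__sub__."""

    def __sub__(cls, other):
        # Only a Doc on the right-hand side is treated as documentation;
        # anything else (e.g. a forward-reference string) is left alone.
        if isinstance(other, Doc):
            return Annotated[cls, other]
        return NotImplemented


class SomeType(metaclass=Documentable):
    pass


# Parsed as SomeType - (-DocStr(...)): unary minus first, then binary minus.
annotation = SomeType -- DocStr("Some documentation goes here")
base, doc = get_args(annotation)
```

In this sketch `SomeType - "OtherType"` still raises TypeError, which is the slot a future difference type could claim without clashing with the `--` form.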
Yes, the syntax just occurred to me and I thought I'd share it here just in case the other arguments against this idea get resolved later.
EDIT: Actually, I think I may have just resolved what I believe to be the biggest argument against this idea, namely that the original proposal makes the annotation too verbose, to the point that it adversely affects readability.
So instead of:
def some_function(
    some_parameter: Annotated[SomeType, Doc("Some documentation goes here")],
    **kwargs: Annotated[Any, Doc("Additional keyword arguments")]
) -> Annotated[SomeReturn, Doc("Some details about this return type")]:
    """ Documenting the function itself here """
It can now be:
def some_function(
    some_parameter: SomeType -- "Some documentation goes here",
    **kwargs: Any -- "Additional keyword arguments"
) -> SomeReturn -- "Some details about this return type":
    """ Documenting the function itself here """
which looks a lot less verbose and more readable to me.
Perhaps it's enough to nudge this proposal back on track for reconsideration?
Just as a refresher on the arguments against: I read the first ~60 comments, and most of the objections were to the syntax and to the social pressure.
I continue to think that the primary benefit of this PEP isn't even the syntax, so much as the standardization of any location at all where this information can live for runtime inspection.
A standard location for the information enables better syntax (and bikeshedding) in some future PEP. It enables a standard post-processor for "attribute docstrings", so that you don't need to traverse the AST to obtain them. It enables a post-processor for NumPy/Google docstring parsers to put that information into the standard location.
The point is that it becomes easier for libraries and tools that need runtime inspection to "just" inspect the annotations for Doc instances, rather than support five different, mutually incompatible options.
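A minimal sketch of what "just inspect the annotations" could look like for a tool. The helper name `collect_docs` is made up for illustration, and the fallback `Doc` class is only a stand-in (assuming a single `documentation` attribute) for environments without a typing-extensions release that provides Doc.

```python
from typing import Annotated, Any, get_args, get_origin, get_type_hints

try:
    from typing_extensions import Doc
except ImportError:
    # Stand-in for illustration only.
    class Doc:
        def __init__(self, documentation: str, /) -> None:
            self.documentation = documentation


def collect_docs(func) -> dict:
    """Map each documented parameter name (and "return") to its Doc text."""
    docs = {}
    for name, hint in get_type_hints(func, include_extras=True).items():
        if get_origin(hint) is not Annotated:
            continue  # plain annotation, nothing attached
        # get_args() yields the underlying type first, then the metadata items.
        for meta in get_args(hint)[1:]:
            if isinstance(meta, Doc):
                docs[name] = meta.documentation  # last Doc wins in this sketch
    return docs


def some_function(
    some_parameter: Annotated[int, Doc("Some documentation goes here")],
    undocumented: str = "",
    **kwargs: Annotated[Any, Doc("Additional keyword arguments")],
) -> Annotated[bool, Doc("Some details about this return type")]:
    """Documenting the function itself here"""
    return True


docs = collect_docs(some_function)
```

A docstring parser for NumPy or Google style could target the same dictionary shape, so that downstream tools only ever have to read one format.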
I'm actually struggling with this in a current project, to the point that I have to copy and paste between functions, for fear of not being consistent. Having to type it once would be of great use.
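One way "type it once" might look with Doc is a shared annotated alias. This is only a sketch: the alias `Axis` and the functions `total` and `largest` are invented for illustration, and the fallback `Doc` class is a stand-in for environments without typing_extensions.Doc.

```python
from typing import Annotated, get_type_hints

try:
    from typing_extensions import Doc
except ImportError:
    # Stand-in for illustration only.
    class Doc:
        def __init__(self, documentation: str, /) -> None:
            self.documentation = documentation


# The description is written once and reused wherever the parameter appears.
Axis = Annotated[int, Doc("0 to reduce down each column, 1 to reduce across each row")]


def total(rows, axis: Axis = 0):
    """Sum a list of equal-length rows along `axis`."""
    groups = zip(*rows) if axis == 0 else rows
    return [sum(group) for group in groups]


def largest(rows, axis: Axis = 0):
    """Take the maximum of a list of equal-length rows along `axis`."""
    groups = zip(*rows) if axis == 0 else rows
    return [max(group) for group in groups]


# Both functions expose the same description at runtime.
total_axis_doc = get_type_hints(total, include_extras=True)["axis"].__metadata__[0]
largest_axis_doc = get_type_hints(largest, include_extras=True)["axis"].__metadata__[0]
```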
I'd be cautious of using this to deduplicate parameter descriptions. I've had description deduplication working before and immediately regretted doing it. Reading an API reference that's full of duplication has all the caveats of reading code that's full of duplication, i.e. it's hard to find what you're looking for past all of the repetitive boilerplate that you've read a gazillion times already.
I found it a lot better to just write a canonical description once and then rely on cross references. Then all you have to duplicate is the fairly immutable string see :func:`xyz`.
Eh, this seems like such an overgeneralization without having the example source code.
Take pandas for example: do you know how many methods across Series and DataFrame have the inplace or axis argument? Would it not be more jarring to have logically equivalent but differently written descriptions of the same argument in two methods? And for what reason? Because two different developers wrote the two methods?
AFAIK pandas handles a lot of this with runtime docstring manipulation through decorators.
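For readers who haven't seen that pattern, here is a generic sketch of decorator-based docstring sharing. It is not pandas' actual implementation; the names `SHARED_PARAMS` and `with_shared_docs` and the two functions are made up for illustration.

```python
# Shared descriptions, written once (hypothetical wording).
SHARED_PARAMS = {
    "axis": "axis: 0 to operate on rows, 1 to operate on columns.",
    "inplace": "inplace: if True, modify the object in place and return None.",
}


def with_shared_docs(func):
    """Fill {placeholders} in func's docstring from SHARED_PARAMS."""
    # Docstrings can be stripped (e.g. python -OO), so guard against None.
    if func.__doc__ is not None:
        func.__doc__ = func.__doc__.format(**SHARED_PARAMS)
    return func


@with_shared_docs
def drop(labels, axis=0, inplace=False):
    """Drop the given labels.

    {axis}
    {inplace}
    """


@with_shared_docs
def sort(axis=0, inplace=False):
    """Sort along an axis.

    {axis}
    {inplace}
    """
```

After decoration, `help(drop)` and `help(sort)` show identical wording for the shared parameters, while the source only contains the placeholders, which is exactly the trade-off being debated here.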
What meaningful piece of information would you put in the description for a dataframe parameter that's applicable every time it's used and isn't already implied by the type hint? "A pandas dataframe"? "The dataframe to process"? Unless there's some unique property a specific function's dataframe requires, I'd say any description for that parameter is more noisy boilerplate than signal.
My bad, only now do I see you're talking about a dataframe method rather than a parameter.
And to be honest, to me the runtime docstring manipulations that a lot of the scientific packages do are already leaning towards the wrong side of what I'm describing. 100% of the help for numpy.sum() for example is just generic numpy.
I suppose I don't consider it impossible for deduplicated descriptions to be useful. I merely think it's more likely to be abused in the name of "don't repeat yourself" and leave us with an ecosystem made up of vague, boilerplatey API references.
I don't consider deduplicating descriptions by making the code less useful to developers a good tradeoff. (I'm not saying that pandas' replacements at runtime are this, but I think we already have clear examples of how noisy Doc[...] gets in diffs and merge conflicts, as well as several hundred lines of just parameters for a single function.)
Documentation tools already have a way to handle repeated notes, warnings, and descriptions without repeating them literally: you reference them and they are inserted in place. If you want your documentation to improve, the best way is to take the time to set up documentation tooling and write more of it yourself. When more of it is assembled programmatically, with no surrounding hand-written documentation, readers will always notice that it was stitched together, and documentation generated this way with no adjustments has always felt worse to read and reference.
I'm aware this means that users need to view rendered documentation, and I think that's acceptable. Libraries have gotten significantly more complex with time, and we have the tools not only to render and host documentation, but also to make downloadable versions of that rendered documentation available, and to ensure that users can build it locally themselves.
I disagree. I think Python code should always be fully understandable from its source, including docstrings. Anytime I see bespoke documentation includes [1] in docstrings, I get frustrated and my flow is ruined.
I would hope that deduplication can be argued against, and taught to be avoided, but still used in specific cases where the benefits outweigh the downsides, as determined by experience. I of course have not been on the receiving end of spam PRs.
Also, I would expect that basically no users download offline documentation for Python libraries, and even fewer build the docs from source.
Because this came up over and over again, I must second this and add my own point of view. Having taken a look at the oft-cited FastAPI signature, I find it far more readable, in all contexts, than the old unstandardized docstring parameter-list alternatives.
The docstring approach requires jumping around the source code, whereas this new style keeps the documentation tightly coupled to the parameter. Arguments against this tight coupling remind me of over-abstracted code bases where the proliferation of tiny functions requires the reader to jump about the text, all while maintaining a "mental call stack".
Both docs approaches lead to good auto-generated docs (though docstrings are prone to error whenever a signature changes), but generated docs or other "readable" docs should never be necessary. This PEP improves source code readability, lowers the total LOC, makes linting docs easier, eases maintenance, and reduces the surface area for documentation-caused bugs.
Obviously, function signature readability is a matter of opinion, so I am offering a strong one. The source code is the documentation, and this PEP improves the source code.
That's a good point. Here's what I'd expect, in a normal case:
def foo(
    ...
    bar: Annotated[Bar, Doc("A short description of this param")],
    ...
) -> None:
    """Docstring begin...
vs
def foo(
    ...
    bar: Bar,
    ...
) -> None:
    """Docstring begin...

    Args:
        bar (Bar): A short description of this param
So 6 LOC (+ remaining docstring) for new style and 9 LOC (+ remaining docstring) for Google style. The extra 3 lines are 1) newline before Args: section, 2) Args: line, 3) doc line. It becomes more of a wash if the argument docstrings are long.
Worst case scenario at scale seems to be +1 LOC for every function parameter.
JP
PS: As much as I value explicit defs and behavior, I'd probably do from typing import Annotated as A or similar if this became standard.
def foo(
    ...
    bar: A[Bar, Doc("A short description of this param")],
    ...
) -> None:
    """Docstring begin...
Or untyped Doc annotation option:
def foo(
    ...
    bar: Doc["A short description of this param"],
    ...
) -> None:
    """Docstring begin...
Or a typed Doc annotation option with doc as 2nd positional:
def foo(
    ...
    bar: TDoc[Bar, "A short description of this param"],
    ...
) -> None:
    """Docstring begin...
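For what it's worth, both shorthand spellings can be emulated at runtime today. The sketch below is purely hypothetical: neither a subscriptable `Doc` nor `TDoc` exists in typing or typing_extensions, so `Doc` here is a local stand-in with an added `__class_getitem__`, and `TDoc` is an invented helper that expands to the long form.

```python
from typing import Annotated, Any, get_args


class Doc:
    """Local stand-in; the subscript form is a hypothetical extension."""

    def __init__(self, documentation: str) -> None:
        self.documentation = documentation

    def __class_getitem__(cls, text: str):
        # Untyped option: Doc["text"] expands to Annotated[Any, Doc("text")].
        return Annotated[Any, cls(text)]


class TDoc:
    """Hypothetical typed option with the description as 2nd positional."""

    def __class_getitem__(cls, params):
        tp, text = params
        # TDoc[Bar, "text"] expands to Annotated[Bar, Doc("text")].
        return Annotated[tp, Doc(text)]


class Bar:
    pass


def foo(
    bar: TDoc[Bar, "A short description of this param"],
    extra: Doc["Anything goes here"] = None,
) -> None:
    """Docstring begin..."""


bar_type, bar_doc = get_args(foo.__annotations__["bar"])
extra_type, extra_doc = get_args(foo.__annotations__["extra"])
```

This only covers runtime behaviour. Static type checkers would presumably need special support for either spelling, since a bare string inside a subscript normally reads as a forward reference.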