PEP 727: Documentation Metadata in Typing

mikeshardmind · September 30, 2024, 9:16pm

That’s really not the case, we have real-world examples from use in FastAPI as a test bed that were brought up earlier in the thread.

Doug Hoskisson:

I first got interested in this PEP because…

I had a situation where I put a lot of information in a doc string, and then I made a new class and I thought “I want that same information in the doc string for this class.”

But, of course I don’t want to duplicate information - violating DRY - maintaining the same info in 2 places. So how can I have the same information in 2 doc strings without having to maintain it in 2 places?

It’s not clear to me whether this PEP even addresses that, but that’s why I got interested in it.

When I first saw the proposal, I was like “err… um… maybe…”,
but then seeing this example https://github.com/tiangolo/fastapi/blob/df4c501136c76a2ef83e3c7e8330c15b5f84491b/fastapi/applications.py#L51-L646
I’m more like “no”.

How am I supposed to see the parameters to the function? I don’t want to have to scroll through 600 lines of documentation to just get an idea of what parameters the function takes. It’s a significant amount of work for my eyes to pick them up. And the default values are so far away from the parameters.

And the FastAPI example also involves referencing other docs by URL, so if we were to hold to what some people said is a goal of not doing that, it would be even worse.

JPHutchins · September 30, 2024, 9:39pm

Sorry that this was unclear. The old Google style would be, in the worst-case scenario, +1 LOC per parameter vs this new style, because the old style requires the parameter name (and maybe type) to be repeated on two lines.

I think that the FastAPI init is quite an extraordinary case in which there are many parameters and each one has multiline documentation, going so far as to include example usage. Because this forces the type to be on its own line, and because the docstrings begin on a newline for clarity, I will concede that the FastAPI init would be fewer LOC in a traditional style vs the new style. Nevertheless, I find the doc style of that init signature to improve the readability of the function signature, source code, and documentation overall because the documentation is tightly coupled with the parameter.
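For concreteness, here is a minimal sketch of the two styles being compared. The function names are made up for illustration, and `Doc` is a small stand-in class defined in the snippet (an assumption that mirrors the single-string marker the PEP describes), not an import from a real package.

```python
from dataclasses import dataclass
from typing import Annotated


@dataclass(frozen=True)
class Doc:
    """Stand-in for the PEP 727 ``Doc`` marker (assumed single field)."""

    documentation: str


def scale_google(value: float, factor: float = 2.0) -> float:
    """Scale a value.

    Args:
        value: The number to scale.
        factor: The multiplier to apply.
    """
    # Google style: each parameter name appears in the signature and again
    # in the ``Args:`` section.
    return value * factor


def scale_annotated(
    value: Annotated[float, Doc("The number to scale.")],
    factor: Annotated[float, Doc("The multiplier to apply.")] = 2.0,
) -> float:
    """Scale a value."""
    # Annotated style: the description sits next to the parameter it describes.
    return value * factor
```

With short one-line descriptions the two come out at a similar length; the difference discussed above shows up once each parameter carries multi-line documentation.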

rsdenijs · October 1, 2024, 4:27pm

There is no standard way of making see :func:`xyz` jump to the docs being referenced, no?
I want to put the docs for frequently used variables in a central place, explicitly not disrupting the flow of the code, but leveraging the IDE to just look at the variable if I want to, just like with any other variable definition. I don’t want to compile and jump back and forth between the IDE and the docs, that is a really bad dev experience IMO.

bwoodsend · October 1, 2024, 5:54pm

see :func:`xyz` is Sphinx syntax. It won’t do anything clever for just calling help(some_function) in a REPL. I think PyCharm might try to render Sphinx docstrings (it’s a while since I used it) and therefore possibly cross references. The general answer though is no, it’s a genuine caveat that most of the ways you can look at a docstring are just plain text.
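A small sketch of that caveat, using two made-up functions: the Sphinx role is stored verbatim in the docstring, so anything that only reads `__doc__` (such as `help()` or `inspect.getdoc`) sees plain text rather than a link.

```python
import inspect


def xyz():
    """Do the real work."""


def wrapper():
    """Thin wrapper; see :func:`xyz` for details."""


# The role markup comes back unchanged; nothing resolves it to a reference.
doc = inspect.getdoc(wrapper)
print(doc)
```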

On the flip side though, hovering the mouse over a function and getting a help dialog so large that it doesn’t fit on the screen isn’t particularly great either.

mikeshardmind · October 1, 2024, 6:17pm

If the alternative of standardizing the big 3 docstring formats happened instead of this, it’s possible that more IDEs would be willing to add support for following/viewing references in these standard formats. For Sphinx at least, though, this would mean that IDEs would either only support fully qualified references from the project plus the standard library and builtins, guess when it comes to intersphinx, or need to be able to understand and look for intersphinx (not that this is difficult, but it is a consideration).

There are certainly other less disruptive ways here that we can look to for improving people’s ability to navigate documentation without needing to open a fully rendered docs site.

pawamoy · October 13, 2024, 9:32pm

I was just made aware of “docments”: Docments – fastcore.

Without docments, if you want to document your parameters, you have to repeat param names in docstrings, since they’re already in the function signature. The parameters have to be kept synchronized in the two places as you change your code. Readers of your code have to look back and forth between two places to understand what’s happening. So it’s more work for you, and for your users.

Furthermore, to have parameter documentation formatted nicely without docments, you have to use special magic docstring formatting, often with odd quirks, which is a pain to create and maintain, and awkward to read in code.

That rings a bell

bwoodsend · October 14, 2024, 12:05am

God help us if writing a parameter name more than once becomes a crime…

def do_nothing(x):
    return x  # Ahh! I wrote 'x' twice! The duplicity, it burns!

teobe · October 16, 2024, 8:32am

I’ve been reading through the conversation with some mixed feelings. I currently use reST docstrings and don’t feel like switching to Annotated+Doc because of decreased readability (just my personal opinion).

Maybe, similarly to how | is now preferred over Union, there could be a way to simplify Annotated, leveraging type.__matmul__ to convert some_type @ (ann1, ann2, ...) to Annotated[some_type, ann1, ann2, ...]?

Then we could write docs as follows, letting the “real” type have its priority over the “annotations” (@ as in Annotation?) while reading this:

def some_function(
    some_parameter: SomeType
    @ (
        Doc("Some documentation goes here"),
    ),
    some_quantity: float
    @ (
        Doc("""
        Some multiline documentation
        """),
        Unit("m/s"),
    ),
) -> SomeReturn @ Doc("Some details about the return value"): ...

Just wondering if this could stimulate the conversation further (given the scope, it could be a separate PEP as well).

Note: I’m using Unit from annotated-types just to show how this could play with other annotations.
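The idea can be tried out today for user-defined classes with a metaclass. This is only a sketch under stated assumptions: `AnnotatableMeta` is a hypothetical name, `Doc` is a stand-in defined in the snippet, and builtins such as `float` would still need the proposed `type.__matmul__` change, since `type` does not currently define that operator.

```python
from dataclasses import dataclass
from typing import Annotated, get_args


@dataclass(frozen=True)
class Doc:
    """Stand-in for the PEP 727 ``Doc`` marker (assumed single field)."""

    documentation: str


class AnnotatableMeta(type):
    """Hypothetical metaclass: ``cls @ metadata`` builds ``Annotated[cls, ...]``."""

    def __matmul__(cls, metadata):
        # Accept either a single annotation or a tuple of annotations.
        if not isinstance(metadata, tuple):
            metadata = (metadata,)
        # Subscripting with a tuple is the same as Annotated[cls, a, b, ...].
        return Annotated[(cls, *metadata)]


class SomeType(metaclass=AnnotatableMeta):
    pass


def some_function(
    some_parameter: SomeType @ (Doc("Some documentation goes here"),),
) -> None:
    pass


hint = some_function.__annotations__["some_parameter"]
print(get_args(hint))
```

Because the result is an ordinary `Annotated` object, existing tools that read `Annotated` metadata would not need to know the `@` spelling was used.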

DanCardin · October 16, 2024, 12:27pm

I feel the need to keep beating the drum of: Insofar as this is useful for individuals, great, but it’s ultimately largely a stylistic choice that it feels like there’s not much point in debating. And to me, it feels like it’s missing the primary value of the PEP.

But again, (imo) this PEP ultimately provides a standard location for this information to exist at runtime (for runtime introspection libraries, like pydantic/fastapi/sphinx) whether or not users ultimately are using that syntax. And it also enables one to document class attributes or module members, or other things that don’t exist in a context where there is a signature with a __doc__ to attach to (again, useful only to programmatic introspection, otherwise a comment would suffice).

Perhaps I’m being too optimistic, but in my mind this PEP enables there to exist “off-the-shelf” adapters for “attribute docstring”, “google format docstring”, “numpy format docstring” that normalize those forms into Doc annotations, such that they’re more readily accessible during runtime introspection across tools.

If this PEP were accepted on those grounds, then you could conceive of future PEPs that made it more concise or ergonomic to use as a user. Certainly making Annotated more concise is a cross-cutting concern well beyond the scope of Doc annotations.
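As a rough illustration of what "a standard location at runtime" could look like to an introspection tool, here is a minimal sketch. `collect_docs`, `Settings`, and `connect` are hypothetical names, `Doc` is a stand-in defined in the snippet, and only top-level `Annotated` hints are handled (an `Annotated` nested inside another type is ignored).

```python
from dataclasses import dataclass
from typing import Annotated, get_args, get_origin, get_type_hints


@dataclass(frozen=True)
class Doc:
    """Stand-in for the PEP 727 ``Doc`` marker (assumed single field)."""

    documentation: str


def collect_docs(obj) -> dict[str, str]:
    """Map each annotated name on a function or class to its Doc text."""
    docs = {}
    # include_extras=True keeps the Annotated wrapper instead of stripping it.
    for name, hint in get_type_hints(obj, include_extras=True).items():
        if get_origin(hint) is Annotated:
            # get_args returns (underlying_type, *metadata).
            for meta in get_args(hint)[1:]:
                if isinstance(meta, Doc):
                    docs[name] = meta.documentation
    return docs


class Settings:
    # A class attribute has no per-attribute __doc__ to attach text to.
    retries: Annotated[int, Doc("How many times to retry.")] = 3
    name: str = "default"


def connect(
    host: Annotated[str, Doc("Server hostname.")],
    port: int = 5432,
) -> Annotated[bool, Doc("True on success.")]:
    return True


print(collect_docs(Settings))
print(collect_docs(connect))
```

The same helper works for functions and classes, which is the point being made about attributes and module members.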

dimaqq · November 19, 2024, 4:57am

Now here’s the question, does Doc document the type itself or all variables of said type?

ssweber · November 19, 2024, 11:52am

Thanks for sharing this. I quite like the simplicity of docments! Now to add support for it in IDEs…

def add(
    a:int, # the 1st number to add
    b=0,   # the 2nd number to add
)->int:    # the result of adding `a` to `b`
    "The sum of two numbers."
    return a+b

mikeshardmind · November 19, 2024, 1:45pm

Beating the drum on this does nothing if you aren’t going to address things that are seen as showstoppers for this PEP.

1. By placing documentation in Annotated, you undercut the ability to prevent docstrings from being loaded into memory (-OO), which is used in some embedded cases to limit memory use to what’s needed to run the application.
2. It causes absurd git diffs and function parameter lists that span hundreds of lines. This isn’t a good user or maintainer experience in real-world code (the PEP author’s library was shown as an example of this).
3. It creates ambiguity over what is being documented, the parameter or the type. Annotated is largely used to add additional information outside of the type system to a declaration, but some of the examples people have for this describe the parameter and not the type, while others want to reuse it, documenting the type for all places it is used. This seems to indicate that we might be better off with standardizing a way to attach docs to type alias statements such that 1 and 2 aren’t an issue either.
4. It’s unclear why we need this, when standardizing the existing docstring formats can give the same benefits without any of the other issues, and without requiring every library using one of the big 3 existing docstring formats to have churn to support this.
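The first point can be checked with a minimal sketch. It relies on `compile(..., optimize=2)`, which applies the same docstring stripping as running with -OO; the `area` function is made up, and a plain string stands in for `Doc(...)` as the `Annotated` metadata.

```python
import textwrap
from typing import get_args

SOURCE = textwrap.dedent('''
    from typing import Annotated

    def area(radius: Annotated[float, "Radius in metres."]) -> float:
        """Return the area of a circle."""
        return 3.14159 * radius ** 2
''')

# optimize=2 is the compile-time equivalent of running with -OO.
namespace = {}
exec(compile(SOURCE, "<sketch>", "exec", optimize=2), namespace)
area = namespace["area"]

# The docstring has been stripped...
print(area.__doc__)  # None
# ...but the documentation stored in the annotation is still in memory.
print(get_args(area.__annotations__["radius"]))
```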

MegaIng · November 19, 2024, 2:26pm

A somewhat unrelated FR might be to add an option to strip annotations. This would also prevent dataclasses from working, but maybe there is some version of this that could work. (E.g. decorator names that make the compiler keep annotations or something)

Melendowski · November 19, 2024, 3:55pm

By placing documentation in Annotated, you undercut the ability to prevent docstrings from being loaded into memory (-OO) which is used in some embedded cases to limit memory use to what’s needed to run the application

I find this point less important. Individuals doing embedded work can simply work around it, as I am sure they are doing currently. In applications where -OO is necessary, the individual is more than likely already taking extra care in what 3rd party libraries they can even use, due to the fact that many popular ones do runtime manipulation of docstrings and using -OO breaks the library.

DanCardin · December 1, 2024, 1:15am

Of these points, I think only #3 is potentially relevant to the point I’m making. 1/2 are, again, about subjective per-user DX. Whereas my whole point is, none of that is relevant because this PEP doesn’t need to be a feature meant for end-users. It’s fundamentally providing space for a thing that doesn’t exist.

I don’t find this especially damning personally. fastapi, pydantic, sqlalchemy, my own libraries, and other libs increasingly use Annotated to stash data relevant to the field, but which is irrelevant to the type. That seems largely the point of Annotated, ergo it makes sense to use it for this feature (or at least I don’t think it’s confusing the purpose of anything).

a way to attach docs to type alias statements

How is that not just FooType = Annotated[T, Doc("Foo")]? I feel like this is very much the point of why this is a fundamental difference vs strings being attached at the much looser function/class scope.
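A minimal sketch of that alias pattern, with hypothetical names (`UserId`, `load_user`, `delete_user`) and a stand-in `Doc` defined in the snippet. It also shows where the "parameter or type" question comes from: nested `Annotated` types are flattened, so a per-parameter `Doc` ends up in the same metadata tuple as the alias-level one.

```python
from dataclasses import dataclass
from typing import Annotated, get_args


@dataclass(frozen=True)
class Doc:
    """Stand-in for the PEP 727 ``Doc`` marker (assumed single field)."""

    documentation: str


# Documented once, at the alias, and reused wherever the alias appears.
UserId = Annotated[int, Doc("Opaque numeric identifier of a user.")]


def load_user(user_id: UserId) -> str:
    return f"user-{user_id}"


def delete_user(
    # Extra, parameter-specific documentation layered on top of the alias.
    user_id: Annotated[UserId, Doc("The user to delete.")],
) -> None:
    pass


print(get_args(load_user.__annotations__["user_id"]))
print(get_args(delete_user.__annotations__["user_id"]))
```

A consumer would need some rule for which entry describes the type and which describes the parameter; this sketch does not pick one.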

I really don’t think any sort of docstring format gives you the same benefit. And I think it would be a net reduction in complexity for everything that needs to introspect docstrings if they could “just” use a set of docstring2doc/attribute_docstring2doc transformers that normalized everything, and tools for which this is a relevant concern wouldn’t need to all have their own bespoke handling of the many different ways fields can be documented.
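To make the "docstring2doc" idea a bit more tangible, here is a deliberately naive sketch of such a transformer for a Google-style `Args:` section. `google_docstring_to_docs` is a hypothetical name, `Doc` is a stand-in defined in the snippet, and multi-line parameter descriptions are not handled; a real adapter would need a proper docstring parser.

```python
import inspect
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    """Stand-in for the PEP 727 ``Doc`` marker (assumed single field)."""

    documentation: str


# Matches "name: text" and "name (type): text".
_PARAM_LINE = re.compile(r"^(\w+)(?:\s*\([^)]*\))?:\s*(.+)$")


def google_docstring_to_docs(func) -> dict[str, Doc]:
    """Hypothetical adapter: read a Google-style ``Args:`` section into Docs."""
    docs = {}
    in_args = False
    for line in (inspect.getdoc(func) or "").splitlines():
        stripped = line.strip()
        if stripped == "Args:":
            in_args = True
            continue
        if in_args:
            # A blank or unindented line ends the Args section.
            if not line.startswith(" "):
                in_args = False
                continue
            match = _PARAM_LINE.match(stripped)
            if match:
                docs[match.group(1)] = Doc(match.group(2))
    return docs


def add(a, b=0):
    """Add two numbers.

    Args:
        a: The first number.
        b (int): The second number.

    Returns:
        The sum.
    """
    return a + b


print(google_docstring_to_docs(add))
```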

wu-clan · October 24, 2025, 4:29am

I am not against this standard; it looks good and is in line with the purpose of Annotated, but the IDE must provide a collapsing feature for it, otherwise it will be a disaster, especially when you try to read the source code signature; you will go crazy.

wu-clan · October 24, 2025, 4:32am

If you are currently reading the fastapi source code, it will be painful; please also provide an automatic collapse plugin before this PEP is accepted.

tiangolo · October 24, 2025, 3:33pm

Thanks all for the discussion and feedback here!

Some final updates to hopefully, finally, finish this discussion.

PEP 727 is now withdrawn.

@Jelle helped me handle updating that part, as it was taking me some time to get back to this.

It seems the current best approach is to keep this idea out of the standard library, as a third-party tool, with a much more constrained scope.

I just published GitHub - fastapi/annotated-doc: Document parameters, class attributes, return types, and variables inline, with Annotated. for this, just for my tools (FastAPI, etc), and whoever, if anyone, wants to re-use the same idea.

I migrated FastAPI to use this new annotated_doc.Doc. This would make it easier for the typing_extensions team to decide if, when, and how to deprecate and remove Doc from there.

@pawamoy also just added support for annotated-doc in GitHub - mkdocstrings/griffe-typingdoc: Griffe extension for PEP 727 – Documentation Metadata in Typing. (that was fast), so that the FastAPI reference docs continue to work normally.

I’ll continue to use this idea and this new small tool for my own projects, but now as an external effort with a much smaller scope, in my own constrained corner of Python, so it shouldn’t affect anyone concerned here about having this in the standard library.

Thanks all!

guido · October 24, 2025, 4:53pm

It’s a bummer, really. I have an application that needs this – we use a library, typechat, that uses this: the class definitions including Doc comments form a schema that ends up in a prompt that guides an AI to generate an appropriate JSON, which is automatically (by typechat) translated back into instances of the schema classes.

Example: search_query_schema.py.

JoBe · October 24, 2025, 6:52pm

That’s unfortunate to hear, I was really looking forward to this (and I thought the PEP got accepted at some point), as this would have been useful for so many things, e.g.: Help text generation for explaining parameters.

Perhaps a similar feature could be introduced alongside this proposal.