PEP 727: Documentation Metadata in Typing

I have great respect for your work with FastAPI and Typer, the latter of which I’m an avid user of, and you’ve inspired me to use type annotations for run-time use in my own code. However, I’m not a fan of this proposal for reasons that have already been brought up (mostly that it adds information in one of the most information dense parts of a function definition). It mirrors my least favorite part of Typer, that the help text goes right next to the parameters in the function header (not saying it’s a bad design choice, I honestly can’t think of any other way to do it). I think it’s ok in Typer, even if it makes working on documentation harder IMO, but I would not want to see that on the regular in other code.

For those wanting to discuss how the stdlib could be used to get the same information out of docstrings, I started a new topic. I see such an API as orthogonal to PEP 727, which would codify the existing approach to documentation.

2 Likes

Thanks for writing this PEP! :slight_smile:

I haven’t double-checked whether I’m repeating something someone else already pointed out – apologies if so. I’m commenting on the PEP as written, having done only a cursory read of the discussion here.


I want to push back on this statement, albeit because :sparkles: nuances :sparkles:.[1]

pyproject.toml as a whole wasn’t “yet another standard”. It came from a place of needing to determine build-time dependencies generically without code execution, a problem for which we had no existing solution.

PEP 518 has an extensive rationale for why a novel approach and new file was proposed for it. PEP 517 has a similarly long rationale for enabling an ecosystem of build-backends, rather than forcing everyone to use setuptools which had proven difficult to evolve.

(I’m likely biased, since packaging changes tend toward really long rationales for design choices, but I was surprised at how short the rationale section is in this PEP :sweat_smile:)

The piece within this that I agree on being “one more way to do things” is the [tool] table within pyproject.toml. It originated as “let’s keep build configuration and allow build tool-specific configuration too”, and it evolved into “let’s allow all tool-specific configuration” since “what is a build tool” is an arbitrary and somewhat useless line of argument to have. I don’t think a similar line of argument applies here.

(and yes, with the power of hindsight, I do wish a better job was done of setting community expectations around this new file and the tool table. :sweat_smile:)

To use the parallel drawn: even though no one outlawed tool-specific configuration files, or non-TOML configuration files, nearly every popular development tool has had its users request, argue for, or contribute support for pyproject.toml’s [tool]-based configuration. The expectation that having this standard would not mean library/tooling authors get pushed to adopt it is at odds with that experience. It’s not a 1:1 situation, obviously, but I think some of the learnings transfer.

PS: The above is all based off of memory, so I might be wrong about some detail on this. It’s been 4-7 years, and I was still a curious teen back when the initial discussions were taking place. Please correct me if I got something wrong here!


I’ll echo this sentiment.

This was my first reaction while reading this PEP as well as the initial discussion here. I think introducing one-more-way, especially as standard syntax/library code, ought to cover more thoroughly why improving existing solutions isn’t a viable option or, at least, offer stronger reasoning for that choice than what’s provided. For me, I’d set the bar at “we tried and it’s not feasible because (list of socio-technical reasons)”, but I’m also cognisant that not everyone might think it needs to be that “high”.

To draw (again!) on the referenced parallel of pyproject.toml, the rationales for both PEP 517 and PEP 518 extensively discuss why those options were chosen over the existing solutions. Alternative approaches were considered and tried over the course of years before we got to the point of discussing a new file for conveying that information.


At the risk of being reductive, the main selling points of this proposal (to me, anyway) seem to be programmatic access to the docstrings and the ability to reuse the arguments.

However, to actually extract accurate type alias information you need to either execute or pseudo-execute the code (i.e. implement all the type system “magic” of handling certain ifs, try-excepts, etc., or at least keep logical track of aliases while respecting namespacing). Compare that to docstrings, where all the information can be fetched directly from an AST, without executing the module or emulating any execution.

Both of these, of course, share a restriction: dynamic logic doesn’t work with either static-analysis approach. Arguably, the type-system-based model allows a tiny amount of dynamism in exchange for significant complexity.

While Sphinx’s built-in autodoc executes code and could adapt to this with some tractable reworking, the AST approach is how sphinx-autoapi works. (IIUC, mkdocstrings does this too – don’t quote me on that!) As currently written, this PEP would require AST-based API documentation generation tools to take on the complexity of extracting documentation from types that might be aliased, where those aliases might themselves be defined conditionally, since the type system supports that.
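To illustrate the kind of thing an AST-only tool would have to chase, here’s a hedged sketch (the alias and module layout are invented; Doc stands in for the PEP’s proposed metadata object):

import sys
from typing import Annotated

from typing_extensions import Doc  # stand-in for the proposed doc()/Doc metadata

if sys.platform == "win32":
    PathArg = Annotated[str, Doc("A Windows path, including the drive letter.")]
else:
    PathArg = Annotated[str, Doc("A POSIX path.")]


def read_config(path: PathArg) -> None:
    """Load configuration from *path*."""

A docstring-based tool only needs the AST of the module defining read_config; a doc()-based tool has to resolve PathArg across assignments, imports, and conditionals (or just execute the code) before it can recover the string.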

Realistically, IMO this means that AST-based documentation generation tools won’t be supporting this PEP’s proposed model (not fully, anyway).

OTOH, removing support for it (type aliases and/or all the conditional magic that comes with the type system) would mean that the only thing this provides is an alternative syntax that could live outside the standard library and is a strictly equivalent alternative to standardising on a docstring format ecosystem-wide. It’d drastically weaken the argument for adopting this model as the standard model.

This could be, of course, reduced by adding restrictions on how the TypeAlias is assigned and managed, but that’s a symptom that we’re not using the right abstraction model here IMO. :slight_smile:

While this isn’t a showstopper issue, it is certainly an argument against the proposed model IMO.

This actually leads me nicely into…


… The open issue about mixing documentation and typing. From the PEP:

It could be argued that this proposal extends the type of information that type annotations carry, the same way as PEP 702 extends them to include deprecation information.

Yes, but it’s not a strong argument IMO. :wink:

To use PEP 702 as the point of comparison: it discusses why the type system is a good vehicle for that information, type checkers actually use the information, their doing so results in useful behaviour, and the PEP includes references to prior art in other ecosystems showing that deprecation information leveraging the typing mechanisms isn’t a novel concept.

None of those are true for this PEP, IMO. To be clear: I’m not saying this PEP needs to do the same things or make the same arguments, but that it needs to provide a stronger rationale for its design choices (especially the socio-technical choice of not trying to settle/tackle the lack-of-standardisation problem).


Given that one of the motivating goals of this PEP is a richer errors/IDE experience, with the motivation explicitly calling out the lack of ability to syntax-check in IDEs, it’s interesting that it doesn’t bless a markup format. By not picking one, we’d be punting the problem to a different layer (nothing can flag invalid markup in the strings you put in Annotated). While “it’s unclear what to pick” is something I agree with, I think the decision not to pick a markup format is a consequential one.

IMO, it’d be worth calling out in a more visible manner in the rationale (or rejected ideas, or wherever the authors deem appropriate). Even a position like “we don’t think it’s as important to have syntax checking for the markup in those strings” would be fine here – it’s a judgement call, and I think the PEP makes the right one – but it seems like an important-enough detail for implementors and users to call out more visibly.


  1. I’ve once again failed xkcd: Duty Calls because this statement stuck out to me and it’s now 1:50am. ↩︎

9 Likes

I’ve seen more than one person mention that Annotated isn’t necessarily restricted to typing, and I feel that’s only half-true at best.

You mentioned one component of this – the fact that it comes from the typing namespace – but that’s mostly cosmetic. It could also be aliased into a new home if there were sufficient appetite for that.

The other thing is that Annotated requires a type annotation. So you can’t opt in to using a feature built on Annotated unless you also opt in to applying a type to whatever you’re annotating.
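A minimal sketch of that constraint (the strings here are just placeholders for whatever metadata object a tool defines):

from typing import Annotated, Any

name: Annotated[str, "The user's display name"]  # metadata always rides on a type...
note: Annotated[Any, "Free-form text"]           # ...or on an explicit escape hatch like Any
# Annotated["Just documentation"]                # TypeError: needs a type plus at least one annotation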


Is there no value in this being a tiny package on pypi? Most of the participants in this thread could create such a thing in an hour or two – it hardly needs a complex testsuite or other difficult infrastructure.

I’m not sure I follow why it should go into typing_extensions or typing, which seems to be a key part of the rollout plan here. A package can run ahead of the stdlib, and then transform into a rolling backport if it is pulled in.
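For a sense of scale, a rough sketch of what such a tiny package might contain (names invented; this is not a published library):

from typing import Annotated, get_type_hints


class Doc:
    """Marker object carrying a documentation string inside Annotated[...]."""

    def __init__(self, documentation):
        self.documentation = documentation


def param_docs(func):
    """Collect per-parameter documentation strings from Annotated metadata."""
    hints = get_type_hints(func, include_extras=True)
    docs = {}
    for name, hint in hints.items():
        for meta in getattr(hint, "__metadata__", ()):
            if isinstance(meta, Doc):
                docs[name] = meta.documentation
    return docs


def greet(name: Annotated[str, Doc("Name of the person to greet.")]) -> str:
    return f"Hello, {name}!"


print(param_docs(greet))  # {'name': 'Name of the person to greet.'}

The rollout question then seems social rather than technical: whether tools would agree to look for that particular marker without a stdlib blessing.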


Regarding rst vs markdown… As an older millennial (I can’t say I’m a younger developer anymore! :sweat_smile:), markdown is way more comfortable to me. But I would still argue for rst, on the grounds that it is much better defined. Markdown is to rst as yaml is to toml.

If markdown is used, I think that also adds the burden of defining which flavor and subset of markdown is allowed. And I just can’t see all that effort as being worth it, in exchange for, mostly, single backticks rather than double.

4 Likes

Thanks for the PEP, @tiangolo!

I’ve used Annotated as a way of providing per-parameter documentation in the past, and I can see the attraction of standardising it so that third-party tools can understand this kind of pattern. However… I’m afraid I also just find the proposed syntax here much too verbose. It wouldn’t be something I’d consider using in my own code, unfortunately: I think readability of code is pretty important. I also think this sentence in the PEP isn’t entirely true:

Nevertheless, this verbosity would not affect end users as they would not see the internal code using typing.doc().

End users would very much be affected by the increased verbosity if they called help() on a function to get documentation in the interactive REPL (unless the plan is to have inspect.signature or pydoc strip the documentation from the function signature before displaying it in the output of help()).

Similar to Laurie’s comments in PEP 727: Documentation Metadata in Typing - #50 by EpicWink, I wonder if a better solution might be adding new special forms to typing? Even if you don’t like the idea (there are lots of disadvantages – see below), it might be worth discussing in the “rejected ideas” section. Perhaps you could have something like this: Param and Attr would be treated identically to Annotated when it comes to type-checking (the metadata is completely ignored by type checkers), but at runtime you could enforce that exactly one metadata element is provided and that it is a string, with the expectation that people provide a documentation string for tools like Sphinx to use:

from dataclasses import dataclass
from typing import Param, Attr

def foo(
    x: Param[int, "must be less than 5"],
    y: Param[str, "must be four or more characters"]
) -> None: ...


@dataclass
class Foo:
    x: Attr[str, "needs to be at least five characters"]
    y: Attr[int, "Must be less than 10"]

Pros of this alternative idea:

  • It’s less verbose; I find it much more elegant

Cons of this alternative idea:

  • We’d need to add new special forms to typing that type checkers would need to add support for, whereas the current proposal works “out of the box” with type checkers
  • Annotated[] is supposed to be the general-purpose way of adding metadata to annotations that’s irrelevant for type-checking; adding special forms for this purpose would be sort-of an implicit admission that Annotated is too unwieldy for some use cases
  • The specification would probably be more complex – how would these new special forms interact with Annotated[], or with get_type_hints?
2 Likes

As a maintainer of Jedi I was asked to provide feedback here. Thanks for reaching out!

I generally agree that it would be nice to have documentation for params in a more structured way. There are a few things that could use improvement:

  1. I feel like there should be some sort of typing.get_doc() call for annotations at runtime. It’s just very normal that one is able to extract information from the typing module. Unlike PyCharm/mkdocs/VSCode, Jedi isn’t just a static analyzer, it can also be used within IPython/Jupyter and extract runtime information.
>>> import inspect, typing
>>> def f(a: int): ...
...
>>> a = inspect.signature(f).parameters['a']
>>> a.annotation
<class 'int'>
>>> typing.get_doc(a.annotation)
  2. Like others have pointed out: it feels a bit wrong to use Annotated for this. If this is really a Python-stdlib-provided feature, I would probably prefer something like Doc[str, "The name of the user"], which is also shorter and more concise.
def foo(
    x: Doc[str, "Lorem Ipsum"],
    y: int,
) -> Doc[bool, "Lorem Ipsum Dolor"]:
    ...

I feel like that’s way more readable and for the tools like Mypy/Jedi/VSCode/etc, this is not a lot of extra work.

1 Like

This feels like the key thing here. If I can indulge in a little history, originally annotations had no defined semantics, and they were explicitly for people to use for whatever they wanted, to encourage experimentation. Typing was always one anticipated purpose, but not the only one. But nobody really did much with annotations - there were a few experiments, but not much mainstream use.

So then the decision was made that annotations were for typing - it was always an intended use case, and no-one else had come up with other compelling uses, so let’s make the decision.

But once types became accepted, and common, people started finding other things that could be attached to variables, parameters, etc, and all those use cases that had never come up before started appearing. And so we have Annotated, marking things as deprecated, and discussions like this, about how we can cram non-typing annotations into the type system.

Maybe what we need to do is take a step back, and think about how we can make non-typing uses of annotations sit more comfortably with type annotations? Not with Annotated, which frankly feels like a hack (and an unappealing, clumsy one at that…), but with some form of first-class way of allowing other use cases to grab a part of the annotation space for themselves, separately from typing.

Something like Doc works a bit like this, although it’s still not obvious how I’d use it to attach a docstring to something I didn’t want to declare a type for.

I guess what I’m really saying is that Annotated doesn’t really work, because it’s based on a presumption that anything that’s not a type is a “second class citizen”. And what we need is to re-think Annotated and produce something that’s less biased towards the “everyone uses types” mindset that tends to prevail in the typing community (for obvious and understandable reasons).

17 Likes

My current opinions:

  • Documentation is not typing. It would be nice if docstrings were available without the typing module.
  • It would be really nice if docstrings were available both at runtime and statically (from the AST or CST).
  • I worry that annotations and Annotated are being overused. I’d like annotations to stay just type hints.

So I prefer one of these approaches:

  • Add new syntax for function argument docstring.
    • e.g. def (a "comment a" : int, b "comment for b" : str)
  • Add some formalized text structure for function docstrings.
    • At least C# has this, although I don’t like XML.
    • PHPDoc looks like a de-facto standard.
12 Likes

I’d also like new syntax for this, maybe

def add(a: int ("an integer"), b: int ("another integer")) -> int ("result"):
    return a + b

Those would be parsed as function calls, and you can’t avoid that, because runtime introspection may want to use actual function calls to produce an object that represents the type.

I like the idea of having documentation closer to the definition of the field. It’s easier when you have both of them in the same place.

I also like the idea of having a standardized convention of defining documentation per attribute and thus a standardized way to introspect these.

However, I have a few problems with the proposed syntax:

  1. It feels awkward to use the Annotated feature for this. The need to import something from typing in order to achieve this does not seem fun.

  2. Documentation may be quite long; writing it in the field definition will mostly “force” you to write the annotation over multiple lines (assuming you want to adhere to PEP 8).
    That can disturb the ease of reading the actual definition.

While I assume it would be a harder implementation to write, may I suggest an alternative like expanding the current __doc__ feature to work in other contexts?

Something that will (roughly) look like so:

def foo():
    """
    Currently, this documentation will be automatically set on foo.__doc__
    """

class Bar:
    a: int
    """"
    The new feature allows this is documentation to be automatically set on Bar.a.__doc__
    """"
1 Like

Similar to the class attribute docstrings, “K&R style” declarations could be nice, as they separate the runtime and “static” (type, docstring) information, while being backward compatible. (It would still need a new place to store parameter docstrings, e.g. a new Documented[] in __annotations__.)

def frobnicate(widget, param, value):
    """Set widget's parameter to value."""

    widget: BaseWidget
    """The widget to frob."""

    param: str
    """The parameter to frob."""

    value: int
    """Parameter value to set."""
7 Likes

I released typing-extensions 4.8.0 yesterday with support for typing_extensions.Doc: typing-extensions · PyPI
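For anyone who wants to try it out, a small example of the shape it takes (the metadata object exposes the string via its documentation attribute):

from typing import Annotated, get_type_hints

from typing_extensions import Doc  # new in typing-extensions 4.8.0


def create_user(name: Annotated[str, Doc("The name of the newly created user.")]) -> None: ...


hints = get_type_hints(create_user, include_extras=True)
for meta in hints["name"].__metadata__:
    if isinstance(meta, Doc):
        print(meta.documentation)  # The name of the newly created user.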

2 Likes

I’ve updated this PEP 727 example to use typing-extensions:

3 Likes
Collapsing because it's a bit off-topic:

I find @ntessore’s suggestion very interesting.

  • it makes signatures ultra short and readable
  • typing information is immediately available at the beginning of the function body
  • parameter documentation as well

That doesn’t solve the case for documenting return values, or other common things like exceptions, warnings or deprecations. But it makes me wonder if this suggestion could be expanded a bit more:

def frobnicate(widget, param, value):
    """Set widget's parameter to value."""

    widget: BaseWidget
    """The widget to frob."""

    param: str
    """The parameter to frob."""

    value: int
    """Parameter value to set."""

    ...

    warnings.warn(
        "The `value` parameter is deprecated and will be removed in a future version",
        DeprecationWarning,
    )
    """value: When the `value` parameter is used."""
    # This docstring is here to document the warning.
    # The deprecation is detected thanks to DeprecationWarning,
    # and `value:` at the beginning lets the analysis tools know
    # that the subject of the deprecation is the `value` parameter.
    # `frobnicate:` instead would target the function itself.

    if condition:
        raise exceptions.CustomError("message")
        """When a certain condition is met."""

    ...

    return foo(bar)
    """optional_name: A transfooed bar."""

As much as I like it, it’s a static-only solution: none of these docstrings can be picked up at runtime.

1 Like

Writing up my thoughts on this after thinking about it for a bit and reading this discussion.

You can always use type aliases:

Users = Annotated[list[User], doc("A paginated list of users")]

def foo(users: Users): ...

As pointed out above this is great for reusability:

Consider the case of an API that accepts an API key or similar; that type gets used in multiple endpoints. What Annotated lets you do is form something like:

APIKey = Annotated[
    str,
    FromHeader("x-api-key"),  # web framework metadata
    StringConstraint(pattern=r"\w{32}"),  # data validation metadata
    doc("The user's API key"),  # the doc stuff discussed in this PR
]

As long as the various tools can understand each other, the web framework can also use the doc() part and the StringConstraint() part to generate its JSON schema. I often find this beneficial, if nothing else, to give gross large types a meaningful name and to clean up the function declaration.

I do recognize that it can be strange to have this information before the type is used in a function signature. But it’s not like docstrings were any closer to the function parameter (see the comment above about scrolling back and forth). So the situation is still not all that grave: unlike with docstrings, you can click on the : Users part and be taken to the definition, be it 2 lines above or in a completely different file. In fact, sometimes you want to move that somewhere else, as in the case of the APIKey type I showed above.

Nonetheless, I do agree that, as it exists right now, Annotated is way too verbose, especially with the import. I wish there were a way to make it a builtin, or that we could use some valid but otherwise unused syntax to avoid having to type out Annotated all over. I don’t see that as a reason not to use it – rather, to the contrary: if it becomes popular and is used a lot, we just need to figure out a way to make it less verbose to use.

Regarding standardizing this via a PEP: I empathize with both sides of the argument. I think the answer to this is to experiment as a 3rd party library first but get good buy-in from the ecosystem at the same time. I feel that the ecosystem can fall into this rut of chicken and egg: no one implements anything until it’s “official” but we can’t make it official until there’s extensive usage in the wild. I won’t sit here and say that IDE and tooling developers should all just put in more work to implement experimental proposals like this, but I will tip my hat off to the folks that do like pyright, typing-extensions, mkdocstrings/Griffe and others.

The ideal solution to me (which was somewhat mentioned above) would be to preserve docstrings added to variables and parameters, thus allowing examples like this to work at runtime:

APIKey: Annotated[str, ...]  # or not using Annotated, doesn't matter
"""The API key for the user"""

class Foo:
    key1: APIKey
    key2: APIKey
    """Overridden doc for APIKey"""
    key3: str
    """A brand new doc"""

def foo(
    key1: APIKey,
    key2: APIKey
    """Overridden doc for APIKey""",
    key3: str
    """A brand new doc""",
):
    """A docstring for the function, without documentation for the parameter"""

The class version and free variable version pretty much work and IDEs support them; there’s just no information at runtime, so FastAPI, Pydantic, etc. can’t use them. The function version would require syntax changes. I think this option is better overall but harder to implement, since it really does need buy-in for syntax changes before it can be viable and adopted by IDEs and other tooling. So maybe doc() is a good starting point to build towards this and explore uses of Annotated.

6 Likes

I prototyped a library last year called sigdoc that implemented something very similar to this, but as runtime __doc__ generation (and obviously no static analysis/generation support).

sigdoc uses separate P(arameter) and R(eturn) types within the Annotated sections. Both support a type_hint= arg (to override very verbose runtime resolved hints) and P supports a default= arg (to tidy str representations or describe dynamic/conditional defaults). The class/function is then decorated with @document, which stitches together the main __doc__ + the Annotated metadata into a new __doc__. @document accepts a style argument that determines what format to output (numpydoc, etc). I never got around to adding a Raises annotation.
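To make that concrete, here’s a rough, self-contained sketch of the general pattern described (an approximation of the idea, not sigdoc’s actual API):

import inspect
from typing import Annotated, get_type_hints


class P:
    """Parameter metadata for use inside Annotated[...]."""

    def __init__(self, description, *, default=None):
        self.description = description
        self.default = default


class R:
    """Return-value metadata for use inside Annotated[...]."""

    def __init__(self, description):
        self.description = description


def document(func):
    """Stitch Annotated metadata into a numpydoc-ish __doc__ at runtime."""
    hints = get_type_hints(func, include_extras=True)
    lines = [inspect.getdoc(func) or "", "", "Parameters", "----------"]
    returns = []
    for name, hint in hints.items():
        for meta in getattr(hint, "__metadata__", ()):
            if isinstance(meta, P):
                lines.append(f"{name} : {hint.__origin__.__name__}")
                lines.append(f"    {meta.description}")
            elif isinstance(meta, R):
                returns += ["", "Returns", "-------", f"    {meta.description}"]
    func.__doc__ = "\n".join(lines + returns)
    return func


@document
def scale(x: Annotated[float, P("Value to scale.")]) -> Annotated[float, R("The doubled value.")]:
    """Double a value."""
    return x * 2.0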

With the standardization in this PEP, I think sigdoc could either be greatly simplified (to only do the __doc__ generation) or, even better, made obsolete.

I think the current PEP’s decision to punt on additional metadata or standardizing a docstring style is reasonable.


In this PEP, should __doc__, help(), and other places where docstrings are rendered at runtime include the doc(...) info, as they normally would with “traditional” docstrings?

That would probably require picking a standard/default docstring style to render and generally be a bit too magical if it were to update __doc__. Though, it might ease:

  • using doc() with older tools that are unaware of it (probably not worth any constraints on new tools’ ability to render it how they like though)
  • use in libraries without changing how users inspect/debugging (though again, help already includes func sig)

This might bring more complexity than benefit long term.


I think this was partially mentioned above, but doc(...) could be powerful with ParamSpec and Concatenate to allow documentation for even dynamically added/modified parameters. This ability to add/remove would be one other advantage over parsing from standardized (but still static) docstrings.
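For concreteness, a hedged sketch of what that might look like (the decorator is invented; whether documentation tools would actually surface Doc metadata from inside a Concatenate is an open question):

import threading
from typing import Annotated, Callable, Concatenate, ParamSpec, TypeVar

from typing_extensions import Doc

P = ParamSpec("P")
R = TypeVar("R")

LockParam = Annotated[threading.Lock, Doc("Lock that is held while the wrapped call runs.")]


def with_lock(func: Callable[P, R]) -> Callable[Concatenate[LockParam, P], R]:
    """Prepend a positional-only `lock` parameter to the wrapped function."""

    def wrapper(lock: LockParam, /, *args: P.args, **kwargs: P.kwargs) -> R:
        with lock:
            return func(*args, **kwargs)

    return wrapper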

This could get tricky in the unlikely event this PEP does any of the __doc__ manipulation I mentioned above.

I hope this never becomes the norm for documentation. ParamSpec and Concatenate are for wrapping other functions, and just generating something that says it abstractly wraps another function, instead of the author of the wrapper documenting the purpose of the wrapping, seems like documentation becoming something for machines and not for people.

7 Likes

I’m not sure where I implied this should just generate something abstract. Isn’t the “author of the wrapper” the one who creates the specific ParamSpecs - and thus able to document what they want?

If you’re using ParamSpec to:

  • add a parameter like a lock, why shouldn’t the new param be documented for users?
  • remove a parameter that was written by hand in the """docstring""", then the docstring is now wrong

In other words, this would allow easily documenting ParamSpec params for the user.

I’m sorry if I misunderstood, but the use you described, seemingly in support of this, is definitely stitching together documentation.

This is the behavior I never want to become the norm. It’s not useful to a human reader.

I’m not. ParamSpec is useful for handling user-provided callbacks with arguments and (for the user) ensuring they match. It doesn’t do much more than that; it’s very limited. I use it (and Concatenate) with a decorator pattern for route handling. It doesn’t make sense for me to document anything by type here: I don’t know what the user-provided type is, and if I did, or if I were enforcing one, I’d use a protocol, not a ParamSpec. ParamSpec doesn’t make sense to use entirely internally, as it provides worse checks in the case where you already know the args and kwargs.

The best documentation I can add for such a decorator is “This decorator inspects the type hints of the provided function to generate an IPC route and register this function as handling it. The first argument must be of type IPCContext and will be injected prior to the IPC route arguments being handled.” The typing on this just warns the user if they didn’t include a parameter for IPCContext.

2 Likes

ParamSpec doesn’t do this, as was pointed out already. If you’re looking for this, you actually want one of a few other proposals, and good documentation could be linked to in each relevant function.

There’s the proposed ability to use typing.Unpack on typing.TypedDict for kwargs. Then if you have a bunch of functions with the same kwargs and purpose, the TypedDict can represent them and have an appropriate docstring which is local to the kwargs. This can be done without the proposed typing.Doc but is also just broadly useful even for more expressive typing.
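A hedged sketch of that first option, using PEP 692’s Unpack with a TypedDict (names invented; needs Python 3.12, or Unpack from typing_extensions on older versions):

from typing import TypedDict, Unpack


class RenderOptions(TypedDict, total=False):
    """Keyword options shared by all of the render_* functions.

    width: output width in pixels.
    height: output height in pixels.
    """

    width: int
    height: int


def render_png(path: str, **options: Unpack[RenderOptions]) -> None: ...


def render_svg(path: str, **options: Unpack[RenderOptions]) -> None: ...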

Or there’s the proposal to extract and re-use kwargs from other functions, which is explicitly about direct re-use and may provide a good solution for that problem.