PEP 727: Documentation Metadata in Typing

I ask for your guidance about how to proceed.

As I previously mentioned, my recommendation is to abandon the use of Annotated for documentation and build on the existing docstring mechanisms that are supported by existing tools.

Here’s another idea that addresses most (but perhaps not all) of the issues you’re trying to solve for FastAPI. You could stick with one of the three standard function docstring formats (all of which are already supported by existing tools) and then write a simple CI script that validates that the docstrings are consistent with the parameters in your function signatures. This is similar to what I suggested but without the added work of writing a spec or publishing a library. You could do this entirely within your own CI/build system with relatively little work.
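As a rough illustration of the CI check described above, here is a minimal sketch that assumes Google-style docstrings with an `Args:` section. The helper names `documented_params` and `check_docstring` and the two sample functions are hypothetical, invented for this example. It is not a full docstring parser, and it ignores sections such as `Returns:` and `Raises:`.

```python
import inspect
import re


def documented_params(docstring: str) -> list[str]:
    """Return the parameter names listed in a Google-style ``Args:`` section."""
    names: list[str] = []
    in_args = False
    entry_indent = None
    for line in inspect.cleandoc(docstring).splitlines():
        if line.strip() == "Args:":
            in_args = True
            continue
        if not in_args or not line.strip():
            continue
        indent = len(line) - len(line.lstrip())
        if indent == 0:
            break  # an unindented line (e.g. "Returns:") ends the Args section
        if entry_indent is None:
            entry_indent = indent  # the first entry fixes the indentation level
        if indent != entry_indent:
            continue  # deeper lines are continuations of a description
        match = re.match(r"\*{0,2}(\w+)\s*(?:\([^)]*\))?\s*:", line.strip())
        if match:
            names.append(match.group(1))
    return names


def check_docstring(func) -> list[str]:
    """Report mismatches between a function's signature and its docstring."""
    sig_params = [
        name for name in inspect.signature(func).parameters
        if name not in ("self", "cls")
    ]
    doc_params = documented_params(func.__doc__ or "")
    problems = []
    for name in sig_params:
        if name not in doc_params:
            problems.append(f"{func.__name__}: parameter '{name}' is not documented")
    for name in doc_params:
        if name not in sig_params:
            problems.append(f"{func.__name__}: docstring documents unknown '{name}'")
    return problems


def say_hi(name: str, excited: bool = False) -> str:
    """Greet a user.

    Args:
        name: The user name.
        excited (bool): Whether to end the greeting with an exclamation mark.
    """
    return f"Hi {name}{'!' if excited else '.'}"


def say_bye(name: str) -> str:
    """Say goodbye.

    Args:
        user: The user name.
    """
    return f"Bye {name}."
```

Here `check_docstring(say_hi)` returns an empty list, while `check_docstring(say_bye)` reports both the undocumented `name` and the stale `user` entry. A CI job could run this over a package's public functions and fail when any list is non-empty.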


I’m not sure who would be willing to put in the effort to try to bring consensus among the several current conventions…

I think you may be overestimating the work involved in documenting, standardizing, and parsing one of the existing formats, but I understand if you’re not interested in championing such an effort.

For what it’s worth, I suspect that you would receive quite a bit of community help if you were to pursue this approach. :slight_smile:


I added the features I’m interested in to the PEP…

Yes, I saw those, but I didn’t think they were compelling.

  • Editing & Rendering: These are already supported by all popular IDEs and language servers for existing docstring formats. The proposed PEP doesn’t add anything here other than a redundant mechanism.
  • Deduplication and elimination of inconsistencies: As I suggested in my alternative proposal, there are other (arguably better) ways to address this through tooling, and this would benefit the thousands of library maintainers and millions of developers who have already invested in existing docstring formats.
  • Reuse of documentation using type aliases: As I’ve said, the latest draft of the PEP is very problematic and inconsistent in how it treats type aliases. If you were to fix what I think is a significant design flaw, the proposal would no longer address this problem.
  • Access to documentation at runtime: This is also addressed by my alternative proposal.
  • No microsyntax to learn for newcomers: The proposed PEP requires newcomers to learn yet another way to specify documentation — one that differs from existing conventions. The microsyntax rules are not onerous to learn, and AI-powered copilots already know how to generate them. As mentioned in my alternative proposal, tooling could further simplify this through automation and validation.
  • Parameter documentation for ParamSpec: This is already handled today with traditional function docstrings. The proposed PEP adds no new value here.

I presume that only a subset of these are problems that you personally face as the maintainer of FastAPI. For example, the “microsyntax” issue is presumably not a problem for you. It would be useful to understand which of these problems represent pain points for you personally. That might help us find a solution that meets your needs. I’m guessing that you’re primarily focused on “deduplication and elimination of inconsistencies”?


I see that Pyright now supports some non-standard formats (e.g. strings under variable names)

Pyright has long treated string literals that appear immediately after an attribute or type alias declaration as a docstring for that symbol. Other IDEs do this as well.

Attribute docstrings are discussed in PEP 257. It says: “String literals occurring immediately after a simple assignment at the top level of a module, class, or __init__ method are called ‘attribute docstrings’.” PEP 257 was written prior to the introduction of type aliases, but I think it’s reasonable for tools to treat type aliases consistently in this regard.
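Because attribute docstrings are ordinary syntax, a tool can find them with the standard `ast` module alone. The following is a minimal sketch, not how Pyright or any other tool actually implements it. The `attribute_docstrings` helper and the sample `SOURCE` are hypothetical, and the sketch covers only module-level and class-level assignments, leaving out `self.x = ...` inside `__init__`.

```python
import ast

SOURCE = '''
UserId = int
"""Identifier of a user account."""

class Point:
    x: float = 0.0
    """Horizontal coordinate."""

    y: float = 0.0
'''


def attribute_docstrings(source: str) -> dict[str, str]:
    """Map assigned names to the string literal that directly follows them."""
    docs: dict[str, str] = {}
    for scope in ast.walk(ast.parse(source)):
        if not isinstance(scope, (ast.Module, ast.ClassDef)):
            continue
        prefix = f"{scope.name}." if isinstance(scope, ast.ClassDef) else ""
        # Pair every statement with the statement that comes right after it.
        for stmt, following in zip(scope.body, scope.body[1:]):
            if isinstance(stmt, ast.Assign):
                target = stmt.targets[0]
            elif isinstance(stmt, ast.AnnAssign):
                target = stmt.target
            else:
                continue
            is_string = (
                isinstance(following, ast.Expr)
                and isinstance(following.value, ast.Constant)
                and isinstance(following.value.value, str)
            )
            if isinstance(target, ast.Name) and is_string:
                docs[prefix + target.id] = following.value.value
    return docs


docs = attribute_docstrings(SOURCE)
```

With the sample source, `docs` contains entries for `UserId` and `Point.x`, and nothing for `Point.y`, which has no trailing string.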

It’s important to understand that attribute and type alias docstrings rely only on constructs that are built in to the language. They have no dependency on third-party libraries or their (non-standardized) behaviors. Pyright has no knowledge of any third-party libraries or their behaviors beyond the type information provided by them. That’s a hard-and-fast rule that I’m unwilling to violate — for the same reason that it would be completely inappropriate for the TypeScript compiler to have intrinsic knowledge of certain third-party libraries.

Third-party libraries do not go through the same level of design scrutiny that stdlib does, they are not guaranteed to retain consistent semantics over time, they are not version-locked with Python releases, and they do not have the same backward compatibility guarantees as stdlib. If you want pyright to add support for a feature that requires knowledge of specific classes, these classes must be part of the stdlib, and their intended usage and semantics must be documented in the official Python documentation.

It’s possible that you could convince the pylance team to add support for an annotated-doc library at the language server level. They’ve done this in a (very small) number of cases where there is demand voiced by many pylance users over a sustained time period. However, adding support at the pylance layer is less robust than something implemented at the type checker layer. For example, it wouldn’t be possible for pylance to retain parameter comments for decorated methods that use ParamSpec because that requires the type checker to track the documentation across type transforms. I can’t speak for the pylance team, but given the process they use for prioritizing features, I think it’s unlikely that you could convince them to do this as an experimental feature. You could reach out to them and gauge their interest.


I appreciate that you are trying to address a real problem that you face in maintaining FastAPI. Unfortunately, I don’t think your proposal is aligned with what is best for the Python community.

Implementing your current proposal in FastAPI with the hopes that you can force tools vendors to support another non-standardized documentation mechanism is not the approach I would use. You might succeed in the long run, but you will create additional fragmentation and potentially generate ill will in the process.


I want to make it clear that the statements above are my own opinions. I’m not speaking for the recently-formed Typing Council, of which I’m a member. There may be other viewpoints among the other TC members. We haven’t had any discussions on this topic yet because it hasn’t been submitted to us for consideration.


This doesn’t really sound like a simple script to write.

Actually, current popular IDEs support rendering, not editing. Maybe PyCharm does, but VS Code doesn’t provide any help while editing docstrings.

All the other solutions you propose to these points mean building something new from scratch, a new library, for a problem that is solvable by using Python syntax (as in this PEP). It’s a valid point that it’s solvable by writing a parser, linter, refactoring tools, new editing features, etc. But all of that is work that hasn’t been done; it can’t be compared to the work that was already done to parse, lint, refactor and edit standard Python syntax.

The microsyntax definitely is a problem for me. Editors don’t give me good support for editing these formats, nothing beyond letting me type characters inside a multiline string.

I understand PyCharm has somewhat better support (if not perfect), but I wanted to be able to stay in VS Code, and I also didn’t want to depend on an extended feature provided by some editors that is not really writing Python code.

I want syntax highlighting and red underlines for syntax errors, missing names, and extra names. Renaming a parameter should update the name in the docstring, removing a parameter should remove its docstring, and reordering parameters should reorder the docstring. Checking that two similar functions have the same documentation for a parameter should be doable without having to write and/or use an entire custom docstring parser.

I understand, thank you. Maybe this is a valid approach that wouldn’t affect the rigorousness required for Pyright. Totally understandable it wouldn’t be as robust as Pyright, but it might be a simpler and more volatile way to try the experiment of using this idea.

This is a biased and very strong claim against my intentions and I’m sad to hear you say that.

I have mentioned several times that it can work as a test bed for this experiment. Since the beginning, the problem has been that something new like this would require a standard, but a standard would require consensus, consensus would require some real-world usage, and real-world users would need tooling support, so it’s a cyclic dependency. I took the hit in FastAPI of putting in the work to use it despite not having the tooling support I wanted, in order to try this out.

Of course, I hoped tools would consider adding at least some experimental support for it given it would only affect a controlled group (FastAPI users) but at the same time it would provide user data for the experiment from real users. I’m taking the risk of making my project and users the A/B test group.

Your claims completely disregard any possible good intentions as motivation.

Thank you for the clarification.


About the PEP, I understand that what you are asking is for me to retire it: to abandon the idea of putting documentation in Annotated.

That’s definitely the shortest path of action.

Nevertheless, I would be willing to go through a few more editing iterations before discarding the idea completely.

That’s where I ask for your guidance.

I’ll remove the mentions of transferring type alias documentation.

From your point of view, what else in the current state of the PEP makes it immediately discardable? I need to know what else to remove before editing.

From my perspective, the inconsistent treatment of type aliases is the biggest issue. Thanks for offering to make that change.

The other issue that I mentioned earlier is that the PEP doesn’t indicate whether the documentation should be rendered as markdown or plain text. I recall that you were reluctant to take a stand on this, but without specifying it, library maintainers and IDEs will inevitably make inconsistent assumptions.

This is great feedback, I’ll work on it now.

Meanwhile, let’s talk about Markdown.

I personally would favor Markdown, but I know this topic is probably more controversial. I’m trying to think about how to reconcile both ideas: having a defined expected behavior and not ruling out other preferences.

I imagine people favoring non-Markdown flavors would probably be already comfortable with their current in-docstring systems, so there’s a lower chance that non-Markdown users would use this, which makes me think most potential users of this would favor Markdown, like me.

So, how about adding another optional argument, format, to define the format of the string, similar to how pyproject.toml defines the format of the README, with Markdown as the default?

A content_type argument with explicit MIME types (which is what pyproject.toml supports) seems too long to type in code, which is why I propose format instead: a shorter syntax with short names for the main predefined formats and Markdown as the default.

E.g.:

from typing_extensions import Annotated, Doc

def say_hi(name: Annotated[str, Doc("The user name", format="rst")]): ...

# OR

def say_hi(name: Annotated[str, Doc("The user name", format="md")]): ...

# Default to Markdown

def say_hi(name: Annotated[str, Doc("The user name")]): ...
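To make the proposal above concrete, here is a minimal sketch of how a tool might read such metadata at runtime. It deliberately uses a local stand-in `Doc` class, because the `format` argument is only the idea floated in this post and is an assumption here, not an existing API. The `parameter_docs` helper is likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Annotated, get_args, get_origin, get_type_hints


@dataclass(frozen=True)
class Doc:
    """Stand-in for the proposed ``Doc``; the ``format`` field is hypothetical."""

    documentation: str
    format: str = "md"  # Markdown as the default, as proposed above


def say_hi(name: Annotated[str, Doc("The user name", format="rst")],
           excited: bool = False) -> str:
    return f"Hi {name}{'!' if excited else '.'}"


def parameter_docs(func) -> dict[str, Doc]:
    """Collect the ``Doc`` metadata attached to a function's parameters."""
    docs: dict[str, Doc] = {}
    # include_extras=True keeps the Annotated wrapper instead of stripping it.
    hints = get_type_hints(func, include_extras=True)
    for name, hint in hints.items():
        if name == "return" or get_origin(hint) is not Annotated:
            continue
        for meta in get_args(hint)[1:]:  # the first argument is the real type
            if isinstance(meta, Doc):
                docs[name] = meta
    return docs


docs = parameter_docs(say_hi)
```

Since the metadata are plain objects, a renderer could branch on `docs["name"].format`, and checking that two functions document a parameter identically would reduce to an equality comparison.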

What do you think @erictraut? (or anyone else).

Frankly, I think that if a documentation string for an argument needs to use Markdown or ReST, it’s already way too long for me to be happy seeing it in the argument list of a function (as opposed to the docstring, which is intended for long-form textual information). And equally, if you’re writing enough text to care about ReST vs Markdown, saving a few characters doesn’t seem worth the cost of not conforming with the existing (pyproject.toml) approach.


I agree that in most cases it will be a very simple string, that would actually be the same content in Markdown, ReST, or anything else.

I can only think of its usefulness for adding bold or italics. :man_shrugging:

But still, here we are, it has to be defined in some way, I guess.


@erictraut here’s the PR updating the usage of this for type aliases: “PEP 727: Specify `Doc` in type aliases documents the type alias symbol, update rejected ideas” (Pull Request #3581, python/peps). It would be great if you could check it to confirm it addresses your concerns around type aliases.

Regarding Markdown or anything else, I’ll update it afterwards in subsequent PRs as needed, depending on what we conclude here in the conversation, but I wanted to start with the type alias as the conclusion for that is quite clear already.

So why do you even need editor support if the typical case will be a short string of text with (almost) no markup? I’m confused.


The intent of my comment was just to point out that the status quo is an acceptable outcome here. The previous post was written as if to suggest we must change things in some way and it was just a matter of details.

The proposals in this thread are definitely more powerful, but they come with downsides.


pylint has had support for this for many years, with support for all three main styles described here. You could use it to create a CI check that everything is still consistent or just copy the code and change it to your desired format. (Contributions are also welcome of course)


:-1: Putting doc per-arg: cue the arguments about how to render them, how to sort them, etc.
:+1: Hand-write the docstring. Arguments don’t live in a vacuum (they relate to each other, are sometimes constrained by each other, etc). The function is the relationship between inputs and outputs, so it should be documented as a whole. Docstring is also the only reasonable way to document *args and **kwargs

I probably wouldn’t use this PEP and I’d resent the pressure to use it if it were accepted :upside_down_face:


Since it hasn’t been mentioned, I thought I’d share the PyCharm features here as prior art of how Google Style docstrings can be parsed and provide a good user experience (it wouldn’t be hard for Pylance/VS Code to implement this I bet).

PyCharm supports setting your docstring format to “Google”:

When this is enabled, typing """ after a function signature makes PyCharm pre-populate the Args section of the docstring:

If a parameter is missing or misspelled, you get a warning:


Pressing Alt+Enter quick fix on the missing parameter will allow for auto-inserting it into the docstring:

The docstrings are rendered nicely for the user when using Quick-doc:


Argument names are autocompleted:

The section title blocks are autocompleted:

Google Style Docstring Community Support

Google style docstrings are already supported in a variety of other tools:


I’d like to add that one of the reasons I use the built-in Sphinx RST fields-based style is that those other styles cause performance hits during parsing. As an example, the darglint tool that validates them can get thousands of times slower as the style is switched :man_facepalming: (see “Performance issues in `google` and `numpy` style parsers”, Issue #186, terrencepreilly/darglint).

OTOH, it seems like there’s a pylint plugin that might not suffer from this problem.


After dedicating several hours to thoroughly reviewing this PEP, along with the accompanying discussion and related PEPs mentioned in the comments, I felt compelled to share my perspective. My experience includes a strong involvement in the FastAPI community, where I gained substantial familiarity with its codebase and design principles. However, due to a shift in personal priorities, my contributions diminished in the past year. Recently, as I revisited FastAPI to update myself on the latest developments, I was struck by the significant increase in code complexity, notably in files that expanded from approximately 900 lines to 4500 lines. While I deeply respect the efforts of @tiangolo, I found that the extensive documentation blended into the codebase hindered readability to the point that I just wasn’t able to read it anymore.

Upon examining this PEP, I gained a better understanding of its objectives. Nevertheless, I respectfully hope it does not proceed for the following reasons:

  • Practicality: The proposal will adversely affect code readability.
  • Conceptual clarity: Integrating documentation within the codebase may dilute the coherence of the code itself. Jumping between Python syntax and natural language creates a rather large mental burden, as I found while going through FastAPI again…
  • Social implications: The proposed changes could impose undue professional pressure to adopt practices that, despite being optional, may not seem realistic in practice.

That said, I acknowledge the PEP’s intention to enhance documentation accessibility. I was particularly intrigued by the discussion on type aliases, considering how they could potentially streamline documentation at the beginning of files, albeit at the expense of adjacency to the documented symbol (which was, after all, cited as a benefit in this PEP).

I align with @erictraut’s suggestion that formalizing docstrings could capture the PEP’s benefits without compromising code clarity. Despite some resistance to microsyntaxes within the PEP, I believe refining docstring practices could address current gaps for various stakeholders, including editors, developers, and documentation specialists, without necessitating documentation placement adjacent to symbol declarations, which could ultimately complicate the codebase.

Maybe offtopic

Although I don’t have any notable credentials that would lend credibility to such an effort, I would definitely be open to helping out. There are many more benefits in having a formalized docstring convention (although the discussions on implementation details will probably be… interesting :wink: ). I have collapsed this because it belongs more in a different topic.


(Reviving this old thread! Thanks for tolerating the necro.)

Is there an example for how enum values might be handled?

I note that the PEP explicitly considers class variables as dataclass-style declarations, where there’s usually a type annotation to be wrapped in Annotated. However, enums also use class variables for their values, and these typically don’t have individual type annotations, so it seems like there’s nowhere to annotate. From reading the PEP and this thread, this appears to not have been explicitly considered yet.

Example: we have a codebase that uses the string-under-definition style with enums, and it seems to work well with IDEs and doc generators:

import enum


class Fruit(enum.Enum):
    banana = enum.auto()
    "actually a berry"

    tomato = enum.auto()
    "really is"

My editor will then preview the Fruit.banana docs when relevant.