Wouldn't it be awesome if docstrings were stamped with where they were defined?

From discussion at

Tools like Sphinx report errors in docstrings along with the location of the error.

Some docs are long, and “error on line 174” of the docstring is not nearly as good as “error on line 347” of the file.

Docstrings, however, are often copied by decorators, and it’s not immediately clear where the docstring originally came from.
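To illustrate the copying problem, here is a minimal sketch using functools.wraps. The names logged and answer are made up for the example. The wrapper ends up with the same __doc__ string, but its code object points at the wrapper, not at the function where the docstring was written; inspect.unwrap can recover the original only when the decorator sets __wrapped__, as functools.wraps does.

```python
import functools
import inspect


def logged(func):
    """Hypothetical decorator that copies metadata, including __doc__."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@logged
def answer():
    """Return the answer."""
    return 42


# The docstring was copied onto the wrapper ...
print(answer.__doc__)
# ... but the wrapper's code object belongs to `wrapper`, not `answer`.
print(answer.__code__.co_name, answer.__code__.co_firstlineno)

# functools.wraps sets __wrapped__, so the original can still be found.
original = inspect.unwrap(answer)
print(original.__code__.co_name, original.__code__.co_firstlineno)
```

A decorator that copies __doc__ by hand without setting __wrapped__ leaves no such trail, which is the case the proposal is aimed at.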

What if the parser stamped doc strings specifically with their origin, file foo.py line 98, col 8?

P.S. String operations would, by default, forget this source stamp.
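A rough sketch of what "forgetting the stamp" could look like, using a hypothetical str subclass (StampedStr and its origin attribute are invented here, not an existing Python feature). Built-in str methods return plain str objects even when called on a subclass instance, so the stamp disappears after any operation.

```python
class StampedStr(str):
    """Hypothetical str subclass carrying (filename, lineno, col) of its origin."""

    def __new__(cls, text, filename, lineno, col):
        self = super().__new__(cls, text)
        self.origin = (filename, lineno, col)
        return self


doc = StampedStr("  Return the answer.  ", "foo.py", 98, 8)
print(type(doc).__name__, doc.origin)

# str methods return a plain str, so the stamp is lost.
stripped = doc.strip()
print(type(stripped).__name__, hasattr(stripped, "origin"))
```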

This is about tool-specific ‘errors’ in Python-legal strings that happen to be a __doc__ attribute of some Python object.

The full error message you reported in the link is
/code/operator/ops/__init__.py:docstring of ops:42: WARNING: py:attr reference target not found:..., which implies there is a bit more after the final :.

In such cases, add the line number of the line just before the ops.__init__ docstring to the offset Sphinx reports (42). If py:attr is literal text in the string, one could also search for it. If it is instead part of a generic message template, consider suggesting that Sphinx error messages include the literal offending text so it can be searched for.
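The arithmetic above can be sketched with the ast module. The helper names docstring_lineno and file_line are made up for this example, and it only handles the module docstring and top-level functions and classes. It also assumes the reported offset counts lines from the first line of the docstring as written in the source; if a tool cleans or reflows the docstring before counting, the result may be off.

```python
import ast


def docstring_lineno(source, name=None):
    """Return the 1-based file line where a docstring starts, or None.

    With name=None the module docstring is used; otherwise the top-level
    function or class called `name` is looked up.
    """
    tree = ast.parse(source)
    node = tree
    if name is not None:
        node = next(
            n for n in tree.body
            if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
            and n.name == name
        )
    first = node.body[0] if node.body else None
    if (
        isinstance(first, ast.Expr)
        and isinstance(first.value, ast.Constant)
        and isinstance(first.value.value, str)
    ):
        return first.value.lineno
    return None


def file_line(source, doc_offset, name=None):
    """Map a 1-based line offset inside a docstring to a file line."""
    start = docstring_lineno(source, name)
    if start is None:
        return None
    # (start - 1) is the line just before the docstring.
    return (start - 1) + doc_offset


SAMPLE = '# comment\n"""Module doc.\n\nSecond paragraph with :py:attr:`missing`.\n"""\n'
print(file_line(SAMPLE, 3))  # → 4
```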

The tokenizer splits code into tokens represented by 5-tuples that include the start and end positions, so everything is initially ‘stamped’ with its origin. The parser produces an AST whose nodes carry the same information. Here is the relevant portion of an ast.dump, with attributes included, of a function def with a docstring.
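For the tokenizer side, a small sketch with the standard tokenize module shows the 5-tuple for a docstring token. The SOURCE string is an assumed reconstruction of a function like the one dumped here, indented by three spaces.

```python
import io
import tokenize

# Assumed source, similar to the function in the dump.
SOURCE = 'def f():\n   """a long\n   docstring\n   covering multiple lines"""\n'

tokens = list(tokenize.generate_tokens(io.StringIO(SOURCE).readline))

# Each token is a 5-tuple: (type, string, start, end, line).
string_tok = next(t for t in tokens if t.type == tokenize.STRING)
print(len(string_tok))
print(string_tok.start, string_tok.end)  # → (2, 3) (4, 29)
```

The start and end pairs are (line, column), and they line up with the lineno, col_offset, end_lineno and end_col_offset values in the dump.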

    FunctionDef(
      name='f',
      args=arguments(),
      body=[
        Expr(
          value=Constant(
            value='a long\n   doctring\n   covering multiple lines',
            lineno=2,
            col_offset=3,
            end_lineno=4,
            end_col_offset=29),
          lineno=2,
          col_offset=3,
          end_lineno=4,
          end_col_offset=29),

The parser sees only a string constant that happens to be the first expression in the function body. The compiler notices this juxtaposition and attaches the compiled str object to the compiled function object as its __doc__ attribute. The compiler also produces some location tables (details have changed between versions) as part of the compiled code object, for use in tracebacks and other tools.
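A minimal sketch of both steps, assuming a source string similar to the dumped function: the parser's node carries the position, and after compilation the string shows up as __doc__ while the code object keeps only its own location data such as co_firstlineno. The exact whitespace in __doc__ can differ between Python versions, so it is printed without an expected value.

```python
import ast

# Assumed source, similar to the function in the dump.
SOURCE = 'def f():\n   """a long\n   docstring\n   covering multiple lines"""\n'

tree = ast.parse(SOURCE)
func = tree.body[0]

# To the parser, the docstring is just the first expression in the body.
doc_node = func.body[0].value
print(type(doc_node).__name__)
print(doc_node.lineno, doc_node.col_offset,
      doc_node.end_lineno, doc_node.end_col_offset)  # → 2 3 4 29

# The compiler attaches the string to the function object as __doc__.
namespace = {}
exec(compile(tree, "<example>", "exec"), namespace)
f = namespace["f"]
print(repr(f.__doc__))

# The code object records where the function starts, not where __doc__ came from.
print(f.__code__.co_firstlineno)  # → 1
```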

I have no idea what Sphinx does to check docstrings rather than doc text, but for static docstrings identified by the compiler, location data is available if the source is available.