Another attempt at docstrings for names and parameters--using "||"

jamestwebber · October 16, 2024, 2:43pm

Personally I wouldn’t write multi-line descriptions in the signature, because I think it’s hard to read, and I certainly wouldn’t put a block of text related to the return value there.

I don’t have familiarity with docments myself, and I don’t think the author(s) (Jeremy Howard?) are active in this forum. It’s quite likely that it doesn’t address all of these issues because it was built for a different purpose. It’s just an example of what a comment-based syntax could look like.

pawamoy · October 16, 2024, 2:46pm

So where would you write these multi-line descriptions when you need them? I find the return block weirdly placed too, but can’t think of a better alternative.

I’ll see if I can reach out to the docments authors and bring them here.

jamestwebber · October 16, 2024, 2:49pm

If I need a longer block of text to describe something, I put it in the docstring or equivalent. My preference is to keep per-parameter documentation fairly brief: it’s more of a reminder to a user who has already read the documentation, not a replacement.

petercordia · October 16, 2024, 2:50pm

Do you think we’d have one annotations extension to add docstrings to class variables, and an entirely separate system for what I proposed?
I think they are too deeply connected for that to happen. Before Python had type hints, people used to put type hints in the docstring of functions, like so:

def function_with_types_in_docstring(param1, param2):
    """Example function with types documented in the docstring.
  
    :type param1: int
    :type param2: str
    :rtype: bool
    """

You would see ‘variable docstrings’ being used the same way if that ends up being the only way to properly type hint dataclass/attrs properties.

So then, if the ‘docstrings for variables’ is implemented in a manner that is not compatible with adding proper type hints for dataclass, we might never get proper type hints for dataclass. And that would really bother me.
Adding to this, if I have proper type hints I often don’t need a docstring. (Partially because you can make custom types.) But if I have a docstring I still need proper type hints.

Adding to my original post: I forgot that you’d also need to be able to annotate the default value explicitly. The get_type can probably be expressed with a normal type hint, so it could also be reasonable to have:

@attrs.define
class MyDataClass():
    a: str = attrs.field(converter=int, default='33')
    || set_type = str|float|int
    || default = '33'
    || "A variable that is correctly type hinted in the class constructor"

If these 2 things were implemented separately, what would we end up with?

@attrs.define
class MyDataClass():
    a: str = attrs.field(converter=int, default='33')
    || "A variable that is correctly type hinted in the class constructor"
    ## set_type = str|float|int
    ## default = '33'

??

I really don’t see that happening.
And again, that would be bloody annoying, if we can’t have good type hints for classes because we half-arsed the docstrings.

So no, I think it is worth making this change well, if and when it is made.

pawamoy · October 16, 2024, 2:52pm

I see, thanks. So prose first, and summarized structured information second.

jamestwebber · October 16, 2024, 2:52pm

Honestly I don’t understand why set_type needs an explicit type hint in the first place. Why can’t that be provided by the generated converter method?

pawamoy · October 16, 2024, 2:58pm

Peter Gerlagh:

@attrs.define
class MyDataClass():
    a: str = attrs.field(converter=int, default='33')
    || set_type = str|float|int
    || default = '33'
    || "A variable that is correctly type hinted in the class constructor"

I would never want to write such metadata. I maintain Griffe and mkdocstrings (extracting and rendering docs respectively), and how we handle (or would handle) attrs, pydantic, and the many other third-party libraries that provide dataclass-like APIs, is through extensions that perform additional static (or dynamic) analysis to extract types or default values from the code. The default value is right there in the AST!
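To illustrate the point that the default is already present in the AST, here is a minimal sketch using only the standard library `ast` module. It is not how Griffe or its extensions are implemented; the helper name `field_defaults` is hypothetical, and it assumes that any annotated class attribute assigned from a call with a `default=` keyword is a field definition.

```python
import ast

# The source mirrors the example quoted above. It is only parsed, never
# executed, so attrs does not need to be installed.
SOURCE = '''
@attrs.define
class MyDataClass:
    a: str = attrs.field(converter=int, default='33')
'''


def field_defaults(source: str) -> dict:
    """Hypothetical helper: map attribute names to their default, as source text."""
    defaults = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        for stmt in node.body:
            # Keep only annotated assignments whose value is a call,
            # e.g. `a: str = attrs.field(...)`.
            if not (
                isinstance(stmt, ast.AnnAssign)
                and isinstance(stmt.target, ast.Name)
                and isinstance(stmt.value, ast.Call)
            ):
                continue
            for kw in stmt.value.keywords:
                if kw.arg == "default":
                    # ast.unparse (Python 3.9+) turns the node back into source.
                    defaults[stmt.target.id] = ast.unparse(kw.value)
    return defaults


print(field_defaults(SOURCE))
```

A real extension would also need to confirm that the call really is `attrs.field` and handle factories and other ways of declaring defaults, which this sketch ignores.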

petercordia · October 16, 2024, 3:03pm

At the moment Pylance only sees the type hint and “this is a field”. So it ends up being interpreted as

MyDataClass(a: str =  ...) -> MyDataClass

where ... means “there is a default value but I don’t know what it is”.

I’m not certain how it all works in the background, but the understanding I got based on another discussion on this forum is that you shouldn’t expect type checkers to be too clever, because some things that a human sees instantly are hard to program or expensive to compute.

I just want to be able to write type hints that result in the (correct) interpretation

MyDataClass(a: str|float|int = '33') -> MyDataClass

That doesn’t seem like a lot to ask.

@pawamoy
Just saw your contribution.
Great that your systems are working well; mine aren’t.
I don’t know what Griffe and mkdocstrings are; perhaps I should adopt them.
But I also think it is valuable to be able to do things by hand. Particularly in Python, where it is possible to use very low levels of sophistication, and for example to write programs directly in the terminal.

pawamoy · October 16, 2024, 3:10pm

Yes, type checkers cannot realistically support all third-party libraries, or even just the popular ones, and all their ways of performing type validation / conversion dynamically. But these type checkers often support extensions too, so I would say that developers of third-party libs should be responsible for writing extensions for these type checkers to make them work with their libs. For example, you can find many Mypy plugins to support various third-party libs.

I don’t think standardizing metadata for all the libs at once in a generic manner is feasible.

jcampbell05 · October 16, 2024, 5:59pm

I was wondering if some of this couldn’t just be a special decorator in Python?

In which case it would be better placed as a community library.

pawamoy · October 17, 2024, 12:43am

It’s not possible to decorate a parameter slot, or a return type, unfortunately. You’d probably end up decorating the function definition, which once again separates the documentation from the things it documents, and which once again would require repeating the parameter names. That’s no better than a static docstring.
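A minimal sketch of what such a decorator-based approach tends to look like. The decorator `document` and the attribute `__param_docs__` are hypothetical names invented for this illustration, not an existing library API.

```python
def document(**docs):
    """Hypothetical decorator: attach per-parameter descriptions to a function."""
    def decorator(func):
        func.__param_docs__ = docs  # hypothetical attribute name
        return func
    return decorator


# Every parameter name has to be spelled twice: once here, once in the signature.
@document(
    name="Who to greet.",
    punctuation="Appended after the name.",
    returns="The full greeting.",
)
def greet(name: str, punctuation: str = "!") -> str:
    return f"Hello, {name}{punctuation}"


print(greet.__param_docs__["name"])
```

Renaming a parameter in the signature without updating the decorator call silently leaves the documentation out of date, which is the same drawback a docstring has.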

petercordia · October 27, 2024, 8:55am

I was just reminded of yet another reason why I want to be able to annotate variables with a default value. The fairly common pattern

def f(a: list | None = None):
    if a is None:
        a = []
    ...

which would be much clearer if it could be displayed as if I had written

def f(a: list = []):
    ...

Linters recommend the first pattern because with a mutable default it is too easy to introduce bugs by modifying a, and my linters don’t even allow a: list = None, since the type hint is technically incorrect.

But for any users of the function that’s boilerplate and implementation details, and what the user needs to know is that if they specify something it should be a list, and if they don’t specify something it will be as if they specified a=[].

Of course deferred evaluation would be an even better solution. But we don’t have deferred evaluation yet either.

Nineteendo · October 27, 2024, 11:27am

Is this an option for you? (Intentionally not using Sequence)

def f(a: list | tuple = ()):
    if isinstance(a, tuple):
        a = list(a)
    ...

Sadly, this doesn’t work for e.g. sys.stdout (which is frequently monkey-patched):

import contextlib, io, sys

def foo(fp=sys.stdout):
    fp.write("foo\n")

with contextlib.redirect_stdout(io.StringIO()) as stdout:
    foo()
    assert stdout.getvalue() == "foo\n"  # AssertionError
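The assertion fails because the default is bound to whatever `sys.stdout` was when `foo` was defined. A sketch of the usual workaround, which is the same None sentinel pattern discussed earlier in the thread, looks up `sys.stdout` at call time instead:

```python
import contextlib
import io
import sys


def foo(fp=None):
    # Resolve sys.stdout when the function is called, not when it is defined,
    # so a redirected or monkey-patched stdout is picked up.
    if fp is None:
        fp = sys.stdout
    fp.write("foo\n")


with contextlib.redirect_stdout(io.StringIO()) as stdout:
    foo()

captured = stdout.getvalue()
```

This works, but the signature again shows only `fp=None`, so the effective default still has to be documented separately.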

petercordia · October 27, 2024, 1:26pm

Oh that’s actually a very nice idea, particularly since I do like to guarantee that function arguments aren’t modified by the function.
I can actually leave out the type conversion, so long as I exchange a+b for [*a, *b].
And I suppose I could do something similar with dict and frozendict. Etc.

jph00 · December 1, 2024, 12:20am

Hi, I created docments and I was asked to provide input here. It sounds like pretty much everything I would say has already been said, but feel free to ask me any questions. As has been mentioned, the idea behind the system is that if you have more than a line of stuff to say about a parameter, then you would put that in the docstring or other documentation. Note that with docments you can either place the comment to the right of the parameter you’re documenting on the same line, or you can place it on an empty line above. Anyhoo let me know if there’s anything else I can help with. We’ve been using it in nbdev for a few years now so there’s quite a few docs out there now that use it, and everyone seems pretty happy with it. We also use it in fastcore.script to automate argparse parameter creation BTW.

Prior to creating docments we used a custom annotations system (this was well before generic type reflection existed), but docments has been an improvement because it minimises dupe code, which reduces the opportunity for mistakes to crop up, and makes maintenance easier.
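To give a feel for the comment-based style, here is a rough sketch that pairs each parameter with a comment on the same line, using only the standard library. It is not the docments implementation and does not use the fastcore API; the function `add` and the helpers `comments_by_line` and `param_docs` are hypothetical, and the sketch ignores comments placed on the line above a parameter, which docments also accepts.

```python
import ast
import io
import tokenize

# Hypothetical function written in the comment-per-parameter style.
SOURCE = '''
def add(
    a: int,  # the first operand
    b: int = 0,  # the second operand
) -> int:  # the sum of a and b
    return a + b
'''


def comments_by_line(source: str) -> dict:
    """Map each line number to the text of the comment on that line."""
    comments = {}
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            comments[tok.start[0]] = tok.string.lstrip("#").strip()
    return comments


def param_docs(source: str) -> dict:
    """Pair each parameter (and the return annotation) with a same-line comment."""
    func = ast.parse(source).body[0]
    comments = comments_by_line(source)
    docs = {}
    for arg in func.args.args:
        if arg.lineno in comments:
            docs[arg.arg] = comments[arg.lineno]
    if func.returns is not None and func.returns.lineno in comments:
        docs["return"] = comments[func.returns.lineno]
    return docs


print(param_docs(SOURCE))
```

Because the names, annotations and defaults come from the signature itself, nothing is written twice, which is the reduction in duplicated code mentioned above.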

pawamoy · December 1, 2024, 1:35pm

Thanks Jeremy! You answered my question above. If you ever hear that your users would like docments to be supported in MkDocs/mkdocstrings, don’t hesitate to redirect them to me (mkdocstrings).

jph00 · December 2, 2024, 12:40am

Oh great! Sorry if I’m being slow, but can you clarify – is that something that you currently support? Or you’re interested in hearing if users would like it supported?

pawamoy · December 2, 2024, 1:52am

It’s not supported yet, but could be if users ask for it, yes.

zhangyx · December 2, 2024, 5:05pm

How about a new string prefix?

doc"I am the doc-string for a"
a = 1

b = 2 doc"I describe b"
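For comparison with the proposed prefix, PEP 257 already describes "attribute docstrings": plain string literals placed immediately after a simple assignment. The interpreter does not keep them at runtime, but tools can recover them from the source. A minimal sketch of that lookup, with the hypothetical helper name `attribute_docstrings`, might be:

```python
import ast

SOURCE = '''
a = 1
"I am the docstring for a"

b = 2
'''


def attribute_docstrings(source: str) -> dict:
    """Hypothetical helper: map names to the string literal that follows them."""
    body = ast.parse(source).body
    docs = {}
    # Look at each statement together with the one that follows it.
    for stmt, following in zip(body, body[1:]):
        is_simple_assign = (
            isinstance(stmt, ast.Assign)
            and len(stmt.targets) == 1
            and isinstance(stmt.targets[0], ast.Name)
        )
        is_string = (
            isinstance(following, ast.Expr)
            and isinstance(following.value, ast.Constant)
            and isinstance(following.value.value, str)
        )
        if is_simple_assign and is_string:
            docs[stmt.targets[0].id] = following.value.value
    return docs


print(attribute_docstrings(SOURCE))
```

This only covers module-level names; class bodies and annotated assignments would need the same treatment.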