Summary tables as an API overview

Nodd · October 19, 2024, 4:29pm

The Python documentation is really great, but the long pages can be intimidating and hard to navigate. As an example, what are the methods of the str type? You have to scroll through the 47 methods in the middle of the larger builtins page to check what you can do with a str object. One solution is to add summaries at the top, which give a quick overview of the functions of a module or the methods of a class, as was done for the statistics module.
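For a quick look at that list without scrolling the docs page, a short sketch can enumerate the public methods of str from the interpreter. The exact count depends on the Python version, so the snippet prints it rather than assuming 47.

```python
# List the public (non-underscore) methods of str.
public_methods = [
    name for name in dir(str)
    if not name.startswith("_") and callable(getattr(str, name))
]

print(len(public_methods))   # varies by Python version
print(public_methods[:5])    # first few names, in alphabetical order
```

This gives the names only; a summary table in the docs would add the one-line descriptions.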

Pros:

Newcomers can quickly grasp what a module or a class is for
Advanced users can quickly recall the list and check the spelling (is it isfile or is_file?)
Documentation would be more similar to other tools or languages, as was discussed in another topic
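
The spelling question in the second point is a real one: both spellings exist in the standard library, in different modules. A minimal illustration, using a temporary file so that it runs anywhere:

```python
import os.path
import tempfile
from pathlib import Path

with tempfile.NamedTemporaryFile() as tmp:
    # os.path spells it without an underscore and takes the path as an argument...
    via_os_path = os.path.isfile(tmp.name)
    # ...while pathlib spells it with an underscore, as a method on the Path object.
    via_pathlib = Path(tmp.name).is_file()

print(via_os_path, via_pathlib)
```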

Cons:

It would make the documentation pages even longer
Multiple parts of the documentation would need to be kept in sync when adding new members

If this is seen as a benefit, I’m volunteering to add these tables to some modules. I’m fine with an answer like “propose a PR and see how it goes”, but if there is some consensus first it would make my work easier.

willingc · October 19, 2024, 8:37pm

Hi @Nodd,

I know that there has been discussion in the past year about improving the string docs. @trey had some ideas about what he would like to see.

You can follow discussions of the Python docs community meetings: Documentation Community documentation. We welcome attendees to these meetings and are happy to add discussion topics to a future agenda. See also: GitHub - python/docs-community: Community management for documentation contributors and the Docs Workgroup. We typically meet the first Tuesday of the month and post a reminder for agenda gathering here, for example: Documentation community meeting: Tuesday 1st October, 2024

There is not yet a consensus on the best way forward.

Nodd · October 19, 2024, 8:51pm

Thank you for the links, I’ll look into those.

To be clear, str was just an example. I was thinking of starting with pathlib or shutil, which would be easier.

willingc · October 19, 2024, 9:15pm

Totally cool to pick one of those, outline your approach here, and gather consensus here. Thanks!

BrenBarn · October 19, 2024, 9:26pm

I think it’s a good idea, although in many cases I think it’d be even better to actually split the documentation for many types out to separate pages. Maybe this doesn’t need to be done for every type, but for types like str that are ubiquitous and have many useful methods, I think it would be much handier than having them buried in the midst of a long page about all builtins.

ncoghlan · October 20, 2024, 1:14am

One of the technical problems we have with that kind of change is that we don’t have a nice way to keep deep links to the old structure working.

The summary tables at least mitigate the problem without breaking anything (they’re just a hassle to write in the first place, hence their ad hoc inclusion).

BrenBarn · October 20, 2024, 1:42am

If you mean links like https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str, I don’t think they need to keep fully “working”. There can still be a brief paragraph at that point in the docs that says “Textual data in Python is handled with str objects, or strings. See full documentation on the str type here (link to new page).”

ncoghlan · October 20, 2024, 1:53am

No, I mean the anchor links to the individual methods, as used by intersphinx semantic references.

The internal ones would be automatically adjusted, but external ones would be broken until the affected docs were rebuilt.

Manually defined links in old mailing list and forum posts, and on sites like stack overflow, would just be broken.
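
To make the intersphinx point concrete, here is a rough sketch of how a downstream project ends up with a deep link. It assumes the version 2 "objects.inv" layout (four plain-text header lines, then zlib-compressed lines of the form "name domain:role priority uri dispname"); the inventory below is a tiny in-memory stand-in with illustrative entries, and `load_inventory` is a hypothetical helper, not part of Sphinx.

```python
import re
import zlib

# A tiny in-memory stand-in for a Sphinx "objects.inv" (version 2) file.
# The entries are illustrative, not copied from the real Python inventory.
HEADER = (
    b"# Sphinx inventory version 2\n"
    b"# Project: Example\n"
    b"# Version: 1.0\n"
    b"# The remainder of this file is compressed using zlib.\n"
)
ENTRIES = (
    b"str.join py:method 1 library/stdtypes.html#$ -\n"
    b"str.split py:method 1 library/stdtypes.html#$ -\n"
)
INVENTORY = HEADER + zlib.compress(ENTRIES)

# name, domain:role, priority, uri, display name
LINE_RE = re.compile(r"(.+?)\s+(\S+)\s+(-?\d+)\s+?(\S*)\s+(.*)")


def load_inventory(data: bytes, base_url: str) -> dict[str, str]:
    """Map each object name to the full URL its entry points at."""
    # Skip the four plain-text header lines; the rest is zlib-compressed.
    body = data.split(b"\n", 4)[4]
    targets = {}
    for line in zlib.decompress(body).decode("utf-8").splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue
        name, _role, _priority, uri, _dispname = match.groups()
        # A trailing "$" is shorthand for "the object name".
        if uri.endswith("$"):
            uri = uri[:-1] + name
        targets[name] = base_url + uri
    return targets


links = load_inventory(INVENTORY, "https://docs.python.org/3/")
print(links["str.join"])
```

The page name is baked into the resolved URL at build time. If the method moved to another page, the inventory would change, but any downstream HTML already built would keep the old URL until it was rebuilt.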

BrenBarn · October 20, 2024, 1:55am

Yeah, well, I guess we have to choose between breaking those links and having a giant file where it’s difficult for humans to find the information they want. I know which one I’d prefer.

ncoghlan · October 20, 2024, 2:06am

I don’t think we do, since it would be possible to use a different file name for a restructured version and make the old page an orphan with explicitly defined anchors containing semantic links to the updated locations.

Nodd · October 20, 2024, 10:50am

I agree that splitting some pages would be an improvement, but even if that were done, having summary tables would be useful. str will still have 47 members! Both improvements are orthogonal for the most part, so let’s start with the easier part.

willingc · October 20, 2024, 5:05pm

Folks, I want to get a bit more concrete on a specific example. @Nodd mentioned shutil — High-level file operations — Python 3.13.0 documentation as one page with many methods.

So, brainstorming about this page, I see several possible approaches:

Expand the depth of the contents sidebar so that all the methods are displayed without the need to scroll the page
Add a page contents (or table)
A separate page per method

The docs community have discussed possibilities in the past. It makes sense to reboot the discussion at the November meeting. cc/ @hugovk

Nodd · October 20, 2024, 5:35pm

Well, I started with math… I tried two versions:

Multiple tables with subsections:

One (big) table with intermediate headers:

The first version adds another level of sections, which mirrors the sections of the rest of the page. It’s less compact, looks weird when there are only a few functions in a section (see Angular conversion for example), and it’s a bit more complicated to integrate into the page flow. The advantage is that it’s not a single huge table.

The second version is easier to insert and looks cleaner, but it’s a bit too big.

A third possible option is to insert a separate table in each section. It would look good but it would lose the “what is in this module” effect.

I have not pushed these changes yet; I want to check the workflow first. It’s my first time proposing a modification to CPython.

BrenBarn · October 20, 2024, 5:57pm

This seems okay for this case. This is different from str because the entire page is about top-level functions in one module. The problem with str (and other builtins) is that all methods for all builtin types are dumped into one big page.

I don’t see the advantage of this over having the contents in the sidebar. The advantage of the sidebar is that it is separately scrollable so you can jump to any method no matter where you are on the page. With a table at the top you always have to scroll back to the top to find what you want.

This seems like overkill to me for a case like this where most of the functions have relatively modest-sized documentation. A separate page for one method could be useful for cases like Popen that have long individual docs, but I think in most cases it’s useful to have related methods on one page.

There is another option you didn’t mention, which is to split each section onto a separate page. I probably wouldn’t do that here as the last section has only one function, but in some other cases it could be appropriate.

For me the relevant considerations are conceptual units and overall size of the page. I think there is a “sweet spot” of medium-sized pages where you can conveniently jump from one to another and/or use Ctrl-F to search around, but not lose track of where you are. A separate page for each method/function is usually unwarranted. The most egregious cases are ones like the mentioned builtin types one where a huge number of quite distinct types and their methods are all on one page. But usually having a single type and its methods documented on a single page seems good to me.

hugovk · October 20, 2024, 6:12pm

Good idea.

I’ve cleared the HackMD for the next meeting, please add new agenda items to it, and welcome all! Documentation community meeting: Tuesday 5th November, 2024

willingc · October 20, 2024, 6:23pm

@Nodd @BrenBarn Thanks for the feedback and examples. It makes it much easier to visualize.

I’ve opened an experimental PR that tweaks a config setting for the sidebar. We would definitely need to adjust CSS to make it look better as well.

It’s rendered here if you wish to play around with it: 3.14.0a1 Documentation

hugovk · October 20, 2024, 6:44pm

Some of the side menus are far too long and crowded. In January, we added a setting to prevent them getting even longer and more crowded: python/cpython#114318.
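
The thread does not name the setting. As an illustration of the kind of switch involved, Sphinx has a `toc_object_entries` option (available since Sphinx 5.2) that controls whether documented objects get their own table-of-contents entries. Whether this is the exact setting meant here is an assumption.

```python
# conf.py (fragment), illustrative only.
# Assumption: this is the kind of setting being discussed; the thread does not name it.

# When True (Sphinx's default), documented objects such as functions and
# methods get their own entries in the table of contents.
toc_object_entries = False
```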

I think tables at the top can be useful, especially if in many cases you want to find a thing, and when done, can close the page. No need to navigate back to the top again. See the built-in functions for one example. It would be good if Sphinx could generate them rather than manually building them.

I think it’s worth looking into separate pages per method, or at least breaking some of the huge pages into more manageable chunks. Some of our pages are huge, which makes them hard to navigate and bad for SEO.

Another idea is to change the theme. At the core sprint, I made a quick demo using the PyData Sphinx Theme, as used by projects such as NumPy. This is a quick demo, the header and footer and other custom stuff would need adjusting, but a big benefit is there’s a less cluttered left-hand menu, and it also adds a right-hand menu. We’d also benefit from the huge amount of work being put into the theme, including lots of accessibility work.

jamestwebber · October 20, 2024, 7:15pm

I wonder if it is possible to have both modes available using the same source material and some toggle in the URL or somewhere else. I haven’t thought deeply about the layout but it seems like it should be possible^[1].

There’s something poetic about re-creating the “all one page” vs “HTML tree” options from 30 years ago.

[1] It might not be easy with Sphinx though.

Nodd · October 20, 2024, 7:48pm

There is autosummary. It uses autodoc behind the scenes, so it uses the docstrings and not the Sphinx documentation.
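
To illustrate the docstring-based approach in plain Python, here is a minimal sketch that builds a name/description listing from the first line of each docstring. This is not autosummary itself, only the same idea; `summary_rows` is a hypothetical helper, and shutil is used because it was mentioned above as a candidate.

```python
import inspect
import shutil


def summary_rows(module):
    """Return (name, first docstring line) pairs for a module's public functions."""
    names = getattr(module, "__all__", None) or [
        n for n in dir(module) if not n.startswith("_")
    ]
    rows = []
    for name in names:
        obj = getattr(module, name, None)
        if not inspect.isfunction(obj):
            continue  # skip exceptions, constants, and anything missing
        doc = inspect.getdoc(obj) or ""
        first_line = doc.splitlines()[0] if doc else ""
        rows.append((name, first_line))
    return rows


rows = summary_rows(shutil)
width = max(len(name) for name, _ in rows)
for name, first_line in rows:
    print(f"{name.ljust(width)}  {first_line}")
```

Note that `inspect.isfunction` only matches functions written in Python. For a C-implemented module such as math, a check like `inspect.isbuiltin` would be needed instead, and the result still reflects the docstrings rather than the hand-written documentation.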

Is it fine to open multiple PRs to be able to easily preview the different options?

AA-Turner · October 20, 2024, 8:31pm

An alternative to this which ‘resolves’ the long attribute names problem is PR 125757:

The proposed solution is stopping attribute names from breaking over lines and adding horizontal scroll.

The contents in the sidebar do become much longer, but the pages themselves are quite long - and this does help faster access to each name. Personally I think the trade-off is worth it, though others may disagree.

A