Hello,
At the Docs meeting, everyone was in favor of adopting the Diátaxis framework for Python documentation. Let me explain what that means to me so we can get a rough consensus and start work.
Everyone who wants to work on docs editing should read the Diátaxis docs. They’re short and (as you’d expect) well written, and will probably answer your questions if you have any.
The main point, for me, is to keep the target audience in mind when writing docs. There are four kinds of target audience, and thus four kinds of docs. The first goal should be to avoid mixing them, rather than to cover all four quadrants. Mixing them confuses all readers; not including the HOWTO guide or the tutorial just means those readers will go to Stack Overflow or Real Python instead.
Another point, from “How to use Diátaxis” (the last section, but a very important one), is “Work one step at a time”. Adopting Diátaxis doesn’t mean we need to restructure the existing docs now. Rather, it should allow us to find small tasks for anyone who wants to help. That currently seems to be a bottleneck – “improve the docs” is so vast and directionless that we can’t seem to get anything started. But if we adopt a framework for the big picture, we can take small steps toward it.
So, here’s the proposal: Let’s adopt Diátaxis as a guide for Python docs, and start improving.
Anyone against?
I feel you are stifling the discussion by saying “all questions are answered by reading that document”. It has a frigging video link. Do I really have to watch that?
My immediate question would have been “what’s the north star for docs.python.org” but I’m afraid that that’s answered by reading the framework, so I will just stop caring.
This comment reads a bit off to me? As a bystander who’s familiar with Diátaxis, on one hand I feel compelled to answer the “do I really have to watch the frigging video link?”, but on the other hand you seem to have decided to stop caring already, so I’m not sure if it’s worth explaining anything or not.
I’ll make an attempt to find a middle ground by adding a short answer with my point of view. @encukou stated that the Diátaxis docs “probably” answer folks’ questions, which doesn’t rule out the possibility that there are some unanswered questions (so definitely not “stifling the discussion”). In any case, I agree with @encukou that the docs are “short and well written” - each section in the quadrant has its own description (tutorials, guides, reference, explanation), there are some more background docs, and no, I don’t think watching the video is mandatory (it is an abridged and summarized version of the docs - quite entertaining to watch I must say, since Daniele is so eloquent).
I am not seeing clearly what this proposal means. The existing tutorial maps to the “Tutorial” part of this method, I suppose. There are already HOWTOs; does it mean that there should be more? The language reference and stdlib reference documentation is already there; should some part of it go in “Explanation”?
Concretely, what is the scope of one HOWTO or Explanation or Reference document in the stdlib docs? One stdlib module? One general topic that may encompass several modules (GUIs, networking, i18n, logging, async I/O…)? One class of stdlib modules (e.g. data formats grouping json, configparser, tomllib, etc.)?
The point of proposing Diátaxis, as I get it, is just to put a way of thinking in place, not to force a specific plan to be immediately developed. The page pointed to at the top of this thread (the How to Use page) takes considerable pains to try to convince you not to make a huge restructuring plan, but rather to look at something you’re already in and see if it could benefit from a well-considered change, and if it could, do it. Evolutionarily you’ll get to a better place.
A “North Star” typically isn’t a huge restructuring plan — it’s just a vague idea of an end goal that informs the evolutionary steps taken. Which seems to be just what Diataxis suggests. The difference is that Diataxis can only provide general guidance, whereas a North Star for docs.python.org would provide specific long term goals.
For example, should the Library docs try to be a specification? Then should the various “how to” and mini tutorials it contains be moved elsewhere?
I’m familiar with the framework (it’s related to what I do in my $job), but I first ran across it via https://documentation.divio.com - so I’m wondering what the relationship is between these sites, if you know.
I think the four perspectives from Diátaxis are good ones. I understand the benefit of not trying to boil the ocean by deciding on a complete restructuring. But I think we need a bit more guidance about how to approach this.
For example, suppose we look at a page that seems to mix how-to with reference. Should we declare it a reference page, and delete the how-to content? Or should we shuffle the content so that we can label the first part How To and the second part Reference? Or should it become two separate pages, and where do they each go?
I think we can do more to choose a path for people trying to make improvements, so that the result is cohesive.
That’s not a North Star, that’s a concise summary of the Diataxis model (it would have been helpful if Petr had included this in his initial message :-).
A North Star should provide a bit more specific guidance. Here’s a strawman (i.e. something meant to be criticized, not the final word):
We should strive to separate the four types of content that Diataxis recognizes (tutorials, guides, explanations and references), into separate trees and cross-link them, rather than intersperse them in a single document.
We should standardize the terms; my specific proposal is to drop “how-to” in favor of “guide” – I find “how-to” as a noun very awkward.
We should provide convenient “home directories” for each of the four types of documentation in the repo under Doc.
We could either move all current documents to the most appropriate subdirectory (and then do the rest of the work incrementally as people prioritize their time) or we could leave the existing documents in place (for now, or forever) and just restructure the top-level ToC for docs.python.org.
Let me walk through two examples.
First let’s look at the socket module. This page is clearly written as a pure reference, terse but complete and unambiguous. The only thing missing is a link to the Socket Programming HOWTO, which is actually more an explanation than a guide.
Contrast this with the argparse module. This is a fairly complicated API, and I often have to look up details on this page – how to define an on/off flag, how to accept a list of filenames, how to do subcommands, and so on.
(Even though I use “how to” here, I really just want the API reference – these are all things I’ve done before but I’ve forgotten exactly how and the reference tells me what I need to know: which keyword arg controls a certain behavior, etc.)
Unfortunately, this reference page is very chatty, because there are lots of examples, all presented in REPL form, that feel a bit like they would fit better in a tutorial or guide. Then again, references often do need examples, so maybe this is fine, and it’s true that I can quickly navigate to any specific method or class using the sidebar ToC.
Another positive for argparse is that there is a separate tutorial, written in tutorial style, and linked from the argparse reference page. Still, I feel that it tries too hard to be both reference and guide, and it isn’t terse enough (this is an explicit Diataxis recommendation for references).
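To make the argparse lookups mentioned above concrete, here is a minimal sketch of the three things named (an on/off flag, a list of filenames, and a subcommand). The program name `demo`, the `--verbose` option, the `cat` subcommand, and the sample filenames are all made up for illustration; only the `argparse` calls themselves come from the standard library.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # "demo", "--verbose" and "cat" are hypothetical names for illustration.
    parser = argparse.ArgumentParser(prog="demo")

    # On/off flag: absent means False, present means True.
    parser.add_argument("--verbose", action="store_true")

    # Subcommands: the chosen name is stored in args.command.
    subparsers = parser.add_subparsers(dest="command", required=True)

    # List of filenames: nargs="+" collects one or more values into a list.
    cat = subparsers.add_parser("cat")
    cat.add_argument("files", nargs="+")

    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is
# self-contained.
args = build_parser().parse_args(["--verbose", "cat", "a.txt", "b.txt"])
print(args.verbose, args.command, args.files)
```

This is the kind of detail (which keyword argument controls which behavior) that a terse reference entry can answer directly, without a long run of REPL examples.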
I believe it would depend on the specifics of each individual case. But generally, the preferred approach here would probably be to make two pages and cross-link them, one in the topic-specific reference category, and one in the “HOW-TO” category, at least under the current docs structure.
If we do this, IMO it should be near the end, once the docs better fit the model and we have consensus on the best top-level structure. I believe it would be of more immediate value and less bikeshedding to focus first on improving/restructuring individual docs (especially “hot” ones, like the great example you’ve identified), which is what Diataxis urges, and keeps the focus first on “making the docs better” rather than “moving everything around”.
Speaking of “hot” paths, do we have any analytics on which docs pages get hit the most and least? That would likely be helpful in both prioritizing which get worked on first, and which important content should be more visible.
This is a great example, and one I actually run into myself a lot, including just recently. I’ve found myself scrolling through walls of text on the argparse page (that put even my messages to shame) just to find the one reference bit I’m looking for. The headings sometimes help, but I’m not always sure what param is the one I actually want, and most other pages don’t have them. Ironically, I often find it faster to just look at snippets of my own argparse-using code to remind myself how to do something.
Funny enough, I actually forgot that existed, since it’s not a heading and my eye slides right over it at the top, unlike e.g. the logging module where the structure makes it very visible. It might be better to put it in a similar admonition, with the link set off rather than buried in the prose.
A systematic framework for technical documentation authoring.
The following paragraphs then go on to concisely explain its fundamental principles, and the How to use Diataxis page presents more specific guidance on implementation.
So, as a conceptual framework based on a sound theoretical and empirical foundation, it is more concrete and directly actionable than just a “way of thinking” or a set of whimsical, vague and eclectic aphorisms like the Zen of Python, but certainly more abstract and flexible than software tooling that enforces a rigid structure upon the docs.
The logging documentation originally was guilty of mixing tutorial and reference information. There were a fair few grumbles about how hard it was to grok, and I got positive feedback after I reorganised it into reference, tutorial and cookbook - which are three of the four Diátaxis categories. The remaining category - explanation - would seem to be more of documenting “why it is the way it is”, e.g. design notes, or a survey of prior art / other similar systems. Or have I got that wrong?
One of the good things about adopting a methodology or system is that it helps to formalize what people are doing, making them more effective by helping them to categorize their work.
One of the bad things about adopting a methodology or system is that people spend a lot of time discussing what fits into each category.
I don’t think that the explanation category is meant for historical background (the Motivation and Rationale sections in a PEP are better suited for that).
To me, the explanation corresponds more to the “theory of operation” chapter that I recall seeing in mainframe documentation sets many decades ago. Its purpose is to help the reader learn the important abstractions underlying a particular API/module/etc. I wrote PEP 483 (“The Theory of Type Hints”) to serve this purpose for type hinting.
In my experience, it tends to be self-regulating in open source projects (and yes, we do have a tendency to bikeshed)… if a methodology is too heavyweight or intrusive, developers will just end up ignoring it, or doing only part of it, and things eventually stabilize on something, which might not be what you proposed at first.
Of those, I’d pick “a way of thinking”.
Diátaxis gives us some shared terminology, and hints for how to answer the practical questions. Perhaps we’ll find areas where it’s not a good fit for Python docs – but even then we’ll have a way to talk about this (“let’s not write exposition for this”, “we need two different kinds of tutorial here”, “these docs mix explanation with howto, but it actually works pretty well in this case”).
There seems to be little opposition to Diátaxis itself, so I’ll use its terms and assume everyone involved knows what they are.
Coming up with something specific is the next step. Thanks to Guido for going ahead with that!
Yes.
To start with, we can move them to separate sections within a single document, and leave reorganization for when we have a bunch of them to move around.
I’d like to discuss preserving URLs before any reorganization. (Please open a new thread if you want to start on that.)
Also, I don’t think we need exactly four top-level categories. It’s OK to continue separating language reference from stdlib reference, or the Python tutorial from specific library tutorials.
Yes, many of the HOWTOs are actually explanations or tutorials. (A naming clash is bound to happen when you introduce new terminology.)
“Guide” sounds great!
I assume the repo (and URL) structure should continue to match the doc trees, so this feels like a part of reorganization.
I think a lot of the docs can actually stay where they are – for example, most of the Library Reference should be kept, but the non-reference parts of it should be moved to new homes.