We currently publish the documentation in six formats: HTML (website), HTML (zip/tar archive), PDF, EPUB, plain text, and texinfo. The first is found online (d.p.o), and the rest via the downloads page.
For all of May, June, and July, updates/builds for the non-HTML files were unintentionally stopped, and (to our knowledge), no one noticed. These archive (non-HTML) builds consume the vast majority of the compute resources used to build the documentation, and take up a not-inconsiderable amount of volunteer time trying to maintain them. If they have little or no use, we should consider no longer supplying them.
Last year, we removed the letter PDF without complaint. I’m now proposing that we remove the PDF build entirely. The LaTeX compilation process is both slow and resource intensive, and not producing PDF documentation would mean we can update the rest of the documentation faster.
For context, a brief survey of other programming languages shows the following:
Columns: Language, HTML, PDF, EPUB, Other
Languages surveyed: Python, JS, C (WG14), C#, Go, Java, Julia, Kotlin, PHP, R, Ruby, Rust, Swift
Please let us know if you rely on the PDF or other non-HTML formats, or equally if you know of others who are using them. We want to avoid breaking people’s workflows, so we are keen to hear use-cases for the archived/downloadable documentation.
As a crude indicator of opinion, please find a survey below:
I voted “I only use HTML”, but even though that sounds very non-committal, I’m actually taking a stronger stance: I had no idea that these other options were even still available, and had not looked for them in many years. And I can’t imagine ever directing someone to them. Last I checked, pretty much everyone has a web browser, so if you need a local (non-internet) copy of the docs, an archive of HTML files should be suitable for the vast majority of people.
Do the PDFs offer any security features (eg cryptographic signing)?
I only use the HTML documentation and I’m in favor of lessening workload so I voted for that.
I’ll also throw out the possibility that focusing on HTML might allow for some nice designs that aren’t possible with the PDF or EPUB formats, e.g. interactive components to show how a part of the library works under different conditions (a common example in other documentation is the OS).
That’s not a request for such features, just thinking that having a single medium to focus on might make it more viable.
I dropped these builds in Flask and other projects for the same reasons. They’re slow, hard to debug/maintain, and not as well supported by Sphinx themes (which mainly focus on HTML). Very rarely, we get a request for PDF or ePub. But anyone can run sphinx-build on a local checkout to get whatever format they want.
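For anyone unsure what "run sphinx-build on a local checkout" looks like, here is a rough sketch that wraps the `sphinx-build -b <builder> <source> <output>` invocation. The helper names (`sphinx_command`, `build_docs`) and the example paths are made up for illustration, and it assumes Sphinx is installed and that the project's docs build without extra setup, which may not hold for every project.

```python
import subprocess

# Builder names that ship with Sphinx. Note that "latex" only produces
# .tex sources; turning those into a PDF still needs a LaTeX toolchain.
BUILDERS = ("html", "latex", "epub", "text", "texinfo")


def sphinx_command(builder, source_dir, output_dir):
    """Return the sphinx-build command line for one output format."""
    if builder not in BUILDERS:
        raise ValueError(f"unsupported builder: {builder!r}")
    # Same as typing: sphinx-build -b <builder> <source_dir> <output_dir>
    return ["sphinx-build", "-b", builder, source_dir, output_dir]


def build_docs(builder, source_dir, output_dir):
    """Run the build; raises CalledProcessError if sphinx-build fails."""
    subprocess.run(sphinx_command(builder, source_dir, output_dir), check=True)
```

With a checkout whose docs live in a `docs/` directory (a hypothetical layout), `build_docs("epub", "docs", "build/epub")` would write the EPUB output under `build/epub`.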
We’re able to do this already: Sphinx allows including content based on the output format. Supporting fewer formats probably wouldn’t make a difference here.
I voted keep all because I know that people definitely use the PDF downloads, and I know people who have literal binders of Python docs at their house. While I use HTML and PDF interchangeably, the PDF docs are great for printing as a literal side-to-side reference on smaller monitors, and EPUB is nice to have. I voted yes because there are people who like these alternatives.
EDIT: On second thought, people could run sphinx-build if they need it, but I feel that adds unnecessary complication.
However, I don’t really think we need EPUB and the other formats, as they are obsolete and not widely used other than for ebooks and whatnot. I just didn’t see that on the poll.
Removing most of them seems fine, but is there a reason to stop providing plain text? These take up less space than the HTML ones and I’m guessing are not so compute-intensive to produce. I tend to think having a plaintext version of things is a useful fallback alternative for extreme lo-fi cases where any kind of “rendering” is off the table.
I care very much about offline documentation and I consume the Python documentation almost exclusively through Kapeli’s “Dash” app… which uses… the HTML.
I love epubs and ebooks generally, but the form of the Python documentation and the access pattern of interacting with it is such that I cannot imagine wanting to load it up in Books.app or whatever and read through it by turning virtual pages.
How do IDEs show the documentation on hover? If that’s plaintext or some other format, please keep that. It’s my main way of using the Python docs.
I also use the website HTML, and the downloadable HTML archive for Internet-less times.
Once upon a time I had no smart phone but had an ebook reader. Having the epub version was important to me back then, as I had no other way of reading the docs unless I was online at my desktop. The navigation wasn’t great IIRC, requiring many bookmarks to land at the interesting parts.
I guess some users might still have a similar setup, but smartphones and being online are much more common nowadays, and the HTML version should work well for desktops/laptops. So I voted to only keep the HTML version, but hard numbers should be more important than our opinions to reach a decision IMHO.
Ideally, smartphone apps like Python 3.12 offline HTML docs would be available to cover the need for offline docs for the vast majority of people who have these devices. For desktop users, Dash and Zeal offer a great offline experience.
The Python community is known for being an inclusive community.
Removing non-HTML formats, especially PDF, is an exclusionary initiative. (Sorry if these words hurt someone.)
Occasionally, someone asks for the PDF format on the Brazilian community Telegram. The last time was 12th July, for instance.
I guess the people who use these files the most are newcomers from places with poor internet around the world, looking for the tutorial and then the library reference. These people won’t come here to vote in this survey or comment on it.
In fact, I believe it could be improved. For example, providing individual section files, rather than a single file with the entire documentation.
We also can’t forget that until recently, the docs download page was in an almost secret place. It isn’t surprising that some of us don’t remember that this exists.
I completely acknowledge needs of people with poor network access, but aren’t they adequately served by ePub? (and we used to have *.info in those old long gone days when info was still a Thing).
it’s plausible to assume that people who rely on these formats due to their infrastructure environment are less likely to follow online discussions on that matter. so there may be a problem with underrepresentation of those who’d be most affected by such change.
technically for sure, but an unfortunate truth also is that consumer devices tend to support ePub rarely and PDF always out of the box or with very common software.
I have no personal need for the PDF docs these days, but I agree that if I were looking for a doc format to use offline, I’d be looking for PDF in pretty much every case. ePub is viable, but it’s definitely not what I’d choose by default, and HTML is only really viable on desktop platforms where you can store the whole directory tree locally and point a browser at it.
Whether that’s enough justification for the cost of maintaining the PDF docs, I can’t honestly say.