Best practice for documentation & its installation

_wiz · March 24, 2023, 10:59pm

Hi!

TLDR: What is the pythonic answer to documenting (offline) a CLI program that needs additional information (e.g. input file format), more than what reasonably should be put in argparse help output?

Long version:
A friend and I are currently re-writing a perl CLI program in python. Coming from Unix, we already have documentation in man page format (but also conversions of these documents to HTML).

The current best practice for packaging seems to be using pyproject.toml, and I’ve managed to get a working binary script installed using [project.scripts].

Then I looked at installing the documentation. I browsed a couple programs (written in Python) I had lying around and noticed that most of them used setuptools’ data_files keyword (in setup.cfg); these are however not supported in pyproject.toml AFAICT, at least not with the setuptools build backend.

I don’t really care much which markup I use for describing the program, but I didn’t find a suggested format, installation method, or installation location.

Suggestions welcome!

Thanks,
Thomas

abravalheri · March 25, 2023, 12:19am

Hi @_wiz, if you really want you can still use data-files with setuptools: Configuring setuptools using pyproject.toml files - setuptools 67.6.0.post20230308 documentation.

However, this is discouraged/not recommended because it will result in an OS-specific solution, and because it might not work well with the different ways a Python package can be installed (virtual envs, pipx, etc)…

As far as I know there is no good solution for distributing packages with docs in a way that is platform independent.

Because of that, you might be better off using OS-level packaging/distribution mechanisms.

(Unless docstrings are enough for you, in that case pydoc may work: pydoc — Documentation generator and online help system — Python 3.11.2 documentation)
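For illustration, here is a minimal sketch of the docstring route. The function `parse_input` and its docstring are hypothetical, made up for this example; `pydoc.render_doc` with the `pydoc.plaintext` renderer is the standard-library call that produces the same text `pydoc` would show in a terminal.

```python
import pydoc


def parse_input(path: str) -> list:
    """Parse an input file.

    The file is expected to be UTF-8 text with one record per line.
    """
    # Hypothetical parser: the point here is the docstring, not the parsing.
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


def help_text() -> str:
    """Render the docstring of parse_input as plain text via pydoc."""
    return pydoc.render_doc(parse_input, renderer=pydoc.plaintext)


if __name__ == "__main__":
    print(help_text())
```

This only covers what fits comfortably in docstrings; longer material such as a full file-format reference probably needs one of the other approaches in this thread.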

mgorny · March 25, 2023, 5:37am

If you’re basically looking for a way to install stuff in /usr/share/doc or similar, the current packaging standards don’t really permit that. I’m not saying it isn’t possible, but you’re pretty much relying on undefined behavior, and Linux distributions will have to move the files to the correct directories anyway.

If you just prefer having it installed but the program doesn’t need to access it, I think the best you can do is to put it into a subdirectory of your project, publish it somewhere online, make sure it’s included in the sdist, and perhaps add a big fat warning for packagers that installing docs offline is important.

If you want your program to be able to access it, or it is simply crucial for all users of the package to have it, then I suppose you can install it as package data, i.e. stuff it inside the package’s directory alongside the .py files, and make sure it ends up installed in site-packages. Then you can use importlib.resources to access these files, and I suppose you can write some help system on top of that.
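A minimal sketch of the package-data idea, assuming a hypothetical package called `mytool` that ships a `docs/format.txt` file (both names are invented for illustration). To stay self-contained it creates the package in a temporary directory instead of installing it; in a real project the file would sit next to the .py files and be declared as package data. It needs Python 3.9+ for `importlib.resources.files`.

```python
import importlib
import sys
import tempfile
from importlib.resources import files
from pathlib import Path

DOC_TEXT = "Input files are UTF-8, one record per line.\n"


def make_demo_package(root: Path) -> None:
    """Create a throwaway package 'mytool' (hypothetical) with bundled docs."""
    pkg = root / "mytool"
    (pkg / "docs").mkdir(parents=True)
    (pkg / "__init__.py").write_text("", encoding="utf-8")
    (pkg / "docs" / "format.txt").write_text(DOC_TEXT, encoding="utf-8")


def read_doc(package: str, name: str) -> str:
    """Return the text of a documentation file shipped inside `package`."""
    resource = files(package) / "docs" / name
    return resource.read_text(encoding="utf-8")


def demo() -> str:
    """Build the demo package, make it importable, and read its bundled doc."""
    with tempfile.TemporaryDirectory() as tmp:
        make_demo_package(Path(tmp))
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            return read_doc("mytool", "format.txt")
        finally:
            # Undo the import-path changes made for the demo.
            sys.path.remove(tmp)
            sys.modules.pop("mytool", None)


if __name__ == "__main__":
    print(demo(), end="")
```

A subcommand such as `mytool help format` (again hypothetical) could call something like `read_doc` and send the result to a pager.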

Technically, you could also print the path to that subdirectory and tell people to browse it, but that will only work when the package is loaded from files installed directly into the filesystem, which isn’t the only supported case.
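To illustrate that caveat, here is a sketch that imports a hypothetical package `mytool_zipped` from a zip archive, where the bundled doc has no path of its own on disk. `importlib.resources.as_file` hands out a temporary real file in that case, so a path can still be given to an external viewer, but only for as long as the `with` block is open. The package and file names are assumptions made for the example.

```python
import importlib
import sys
import tempfile
import zipfile
from importlib.resources import as_file, files
from pathlib import Path

DOC_TEXT = "Input files are UTF-8, one record per line.\n"


def build_zipped_package(archive: Path) -> None:
    """Write a zip containing the hypothetical package 'mytool_zipped'."""
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("mytool_zipped/__init__.py", "")
        zf.writestr("mytool_zipped/docs/format.txt", DOC_TEXT)


def demo() -> tuple:
    """Import the package from the zip and read its doc through a real path."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / "bundle.zip"
        build_zipped_package(archive)
        sys.path.insert(0, str(archive))
        importlib.invalidate_caches()
        try:
            resource = files("mytool_zipped") / "docs" / "format.txt"
            # as_file extracts the resource to a temporary file if needed.
            with as_file(resource) as real_path:
                existed = real_path.is_file()
                text = real_path.read_text(encoding="utf-8")
        finally:
            # Undo the import-path changes made for the demo.
            sys.path.remove(str(archive))
            sys.modules.pop("mytool_zipped", None)
    return existed, text


if __name__ == "__main__":
    print(demo())
```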

takluyver · March 27, 2023, 8:28pm

I don’t think it’s as bad as that. The standards do allow for what setuptools calls data-files and Flit calls external-data. Where those get installed is not guaranteed, but it is mostly predictable, and it can be used to install man pages if that’s your goal. On at least some Linux distros, man will even find man pages inside an activated virtualenv.

You shouldn’t assume man pages are readily available to all users, though. Beyond docstrings, we don’t really have a standard system for delivering documentation with Python packages.

What we do have is Read the Docs: most medium/large Python projects have docs written in Sphinx or MkDocs hosted there (or sometimes on things like GitHub Pages). RTD also offers PDF and EPUB downloads if you want the docs for offline reference. Smaller projects might make do with just a README file in the repository.

pradyunsg · March 27, 2023, 8:37pm

Actually, it is guaranteed in some sense! They end up in the data directory specified by the corresponding scheme. You can see the specific location by running python -m sysconfig and seeing literally the first path it prints.
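The same lookup can be done from code. A small sketch: `sysconfig.get_path("data")` returns the data directory of the default install scheme for the running interpreter. The `share/man/man1` suffix is an assumption about how the data files were declared in the build configuration, and an installer may use a different scheme (for example a per-user install), so treat the result as a likely location rather than a guarantee.

```python
import sysconfig
from pathlib import Path


def data_dir() -> Path:
    """Data directory of the default install scheme for this interpreter."""
    return Path(sysconfig.get_path("data"))


def expected_man_dir(section: int = 1) -> Path:
    """Where man pages would land, assuming a target of share/man/man<section>."""
    return data_dir() / "share" / "man" / f"man{section}"


if __name__ == "__main__":
    print("data directory:", data_dir())
    print("man pages (assumed layout):", expected_man_dir())
```

Inside an activated virtualenv the data directory is normally the environment root, which is how a man page can end up somewhere `man` is able to find it.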