Thanks for the great summary @steve.dower; I agree with pretty much everything you said.
For the first bullet point, there's also the related but broader point of mapping build and runtime dependencies. That requires a mapping mechanism that is generic (e.g., take PyPI package names as canonical, and allow conda and other package managers to map those to their own names). I believe [grayskull](https://github.com/conda/grayskull) guesses this mapping in an ad-hoc fashion and gets it right about 90% of the time, and that's very helpful for recipe creation - but not enough. If there were a registry, that would also allow pip to avoid re-installing dependencies that are already present.
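To make the registry idea concrete, an entry could be as small as one table per PyPI-canonical name, with each ecosystem mapping to its own package name. Everything below (the file layout, key names, and the distro package names) is a hypothetical sketch for illustration, not an existing format:

```toml
# Hypothetical registry entry - nothing reads this format today.
# The PyPI name is canonical; each key maps to that ecosystem's name.
[scipy]
conda-forge = "scipy"
debian = "python3-scipy"   # assumed distro name, for illustration
fedora = "python3-scipy"   # assumed distro name, for illustration
```

With something like this, grayskull could look mappings up instead of guessing, and pip could check whether a dependency is already present before building from source.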
That said, being able to declare native dependencies would be a great start. I'd be interested in writing a PEP on this topic and helping move it forward.
To give an example of how much goes missing, here are the build dependencies of SciPy:
- dependencies declared in `pyproject.toml`:
  - numpy (with still incomplete versioning, due to Steve's 2nd point)
  - meson-python
  - Cython
  - pybind11
  - pythran
  - wheel
- dependencies that cannot be declared (a strawman sketch for declaring these follows the list):
  - C/C++ compilers
  - a Fortran compiler
  - BLAS and LAPACK libraries
  - `-dev` packages for Python, BLAS and LAPACK, if headers are packaged separately
  - pkg-config or system CMake (for dependency resolution of BLAS/LAPACK)
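As that strawman, here is a hypothetical `[external]` table in `pyproject.toml`. The table name, keys, and dependency identifiers are all invented for illustration; no build tool or installer understands this today:

```toml
# Hypothetical syntax, invented for this example - not supported by
# pip or any build backend today.
[external]
build-requires = [
    "c-compiler",
    "cxx-compiler",
    "fortran-compiler",
    "blas",
    "lapack",
    "pkg-config",   # or cmake, for BLAS/LAPACK detection
]
```

Package managers like conda or apt could then map such generic names to concrete packages via a registry like the one sketched above; even installers that cannot act on the list could at least surface it when a source build fails.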
And SciPy is still simple compared to other cases, like GPU or distributed libraries. Right now we just start a build when someone types `pip install scipy` and there's no wheel. And then fail halfway through with a hopefully somewhat clear error message. And then users get to read the HTML docs to figure out what they are missing. At that point, even a "system dependencies" list that pip can only show as an informative error message at the end would be a big help.