Python Packaging Strategy Discussion - Part 1

rgommers · January 11, 2023, 9:53am

I agree with both Oscar’s comment and Steve’s reply. With the minor note that I don’t think build system usage will ever converge. It’s conceivable build backends do though, since they’re a pretty thin layer in between a couple of pyproject.toml hooks and invoking the actual build system. So it’s not inconceivable that, for example, scikit-build-core and meson-python would merge in the future and have a configuration option for whether to use CMake or Meson.

Overall we’re in decent shape here - there’s work to do on build backends and build systems, but nothing in the overall Python packaging design for that is currently blocking or in clear need of changes.

Ralf Gommers:

steve.dower:

To end with a single, discussion-worthy question, and bearing in mind that we don’t just set the technology but also the culture of Python packaging: should we be trying to make each Python user be their own system integrator, supporting the existing integrators, or become the sole integrator ourselves?

I like @pf_moore’s answer a lot. The “want to be their own integrator” users are important (and over-represented on this forum), and that should continue to be supported. However, the average user doesn’t want to do this, they want things to work well without too much trouble. So I’d also go for supporting the existing integrators better.

Like @pradyunsg, I also have more thoughts than fit in a Discourse post - will go write a blog post too:)

Getting back to the big picture strategy discussion, here is that blog post: Python packaging & workflows - where to next? | Labs. It’s an attempt at a comprehensive set of design choices and changes to make yes/no. There’s a long version, and a short version with only the key points. I’ll post that short version below.

The most important design changes for Python packaging to address native code
issues are:

Allow declaring external dependencies in a complete enough fashion in
pyproject.toml: compilers, external libraries, virtual dependencies.
Split the three purposes of PyPI: installers (Pip in particular) must not
install from source by default, and must avoid mixing from source and binary
features in general. PyPI itself should allow uploading sdist’s for
redistribution only rather than intended for direct end user use.
Implement a new mode for installers: only pure Python (or -any) packages
from PyPI, and everything else from a system package manager.
To enable both (1) and (3): name mapping from canonical PyPI names to other names.
Implement post-release metadata editing capabilities for PyPI.

Equally important, here are the non-changes and assumptions:

Users are not, and don’t want to become, system integrators,
One way of building packages for everything for all Python users is not feasible,
Major backwards compatibility breaks will be too painful and hard to pull
off, and hence should be avoided,
Don’t add GPU or SIMD wheel tags,
Accept that some of the hardest cases (complex C++ dependencies, hairy native
dependencies like in the geospatial stack) are not a good fit for PyPI’s
social model and require a package manager which builds everything in a
coherent fashion,
No conda-abi wheels on PyPI, or any other such mixed model.

On the topic of what needs to be unified:

Aim for uniform concepts (e.g., build backend, environment manager, installer) and a multitude of implementations,
Align the UX between implementations of the same concept to the extent possible,
Build a single layered workflow tool on top (ala Cargo) that,
- allows dropping down into the underlying tools as needed,
- is independent of any of the tools, including what package and environment
  managers to use. Importantly, it should handle working with wheels, conda
  packages, and other packaging formats/systems that provide the needed
  concepts and tools.