Building compiled code is really, really hard. There's a lot of code in setuptools/distutils dedicated to supporting different situations and compilers, and it's still really, really basic. You can't do basic things like compile the files in a single target in parallel or select a C++ standard (like C++11) without coding them up yourself. NumPy famously had 13,000+ lines of code dedicated to building with distutils - quite a bit of it was there to support Fortran compilation, which wasn't built in either. Things like cross-compilation support, WebAssembly support, etc. are all hacks that we've just lived with. And if you want to depend on another library, you are almost always on your own, having to construct the command-line invocations for each possible compiler.
The path forward for most of these projects is to use a tool designed to build compiled code (CMake or Meson), and the backends that drive them are getting to be pretty good (scikit-build-core and meson-python). These tools handle most compilers and multiple languages, have great support for all the things you might expect these days (like multithreaded builds), and support libraries that export their build configuration. It was really fun to sit down with people working on various projects at SciPy and show them that an 800+ line setup.py could be replaced by <20 lines of CMake and a simple scikit-build-core configuration (mostly PEP 621 metadata; I used `hatch new --init` to convert the metadata automatically), and it also worked in more places than the old one did!
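As a sketch of what that looks like (hypothetical project and file names; the real CMake will vary by project):

```toml
# pyproject.toml
[build-system]
requires = ["scikit-build-core"]
build-backend = "scikit_build_core.build"

[project]
name = "mypkg"  # hypothetical
version = "0.1.0"
```

```cmake
# CMakeLists.txt
cmake_minimum_required(VERSION 3.15...3.27)
project(${SKBUILD_PROJECT_NAME} LANGUAGES CXX)

# FindPython locates the interpreter and headers on each platform
find_package(Python COMPONENTS Interpreter Development.Module REQUIRED)

# src/main.cpp is a hypothetical extension module source
python_add_library(_core MODULE src/main.cpp WITH_SOABI)
install(TARGETS _core DESTINATION ${SKBUILD_PROJECT_NAME})
```

The backend runs CMake for you, points it at the right Python, and packs the installed files into a wheel; all the per-compiler handling that setuptools makes you hand-roll is CMake's job.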
Hatchling is ~5K lines of code; I think you are looking at the whole repo, which includes Hatch itself. Hatch is basically a replacement for nox/tox and PDM combined (minus locking), so it's going to be a bit large.
Hatchling does have dependencies, though; a really minimal backend without any (save a vendored copy of tomli) is flit-core. It doesn't have the custom plugin feature you'd like, however.
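For scale, a complete flit-core configuration fits in a few lines (hypothetical project name):

```toml
[build-system]
requires = ["flit_core >=3.4,<4"]
build-backend = "flit_core.buildapi"

[project]
name = "mypkg"  # hypothetical; flit-core expects a matching module or package
version = "0.1.0"
description = "Example package"
```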
There's a library for this, extensionlib, and it's on my todo list. Getting it right will be tricky, as there are a lot of details in building an extension. I'm planning on relying heavily on the experience gained with scikit-build-core. PDM-backend, Poetry-core, and Hatchling all have the ability to add custom build steps, including ones that build binaries, and they all tend to have problems, since there are details like the compiled extension needing to control the tags of the output wheel - only the compiled extension knows if it needs the normal Python ABI, ABI3, no Python ABI at all, etc. There's probably an issue every few weeks on cibuildwheel from some user who tried to set this up with Poetry-core themselves, where not all platforms / cross-compiles work because Poetry wasn't really designed for binary extensions.
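To make the tag problem concrete, here are the three kinds of wheel filenames involved (illustrative project name and platform; the tag semantics themselves are standard):

```
mypkg-1.0-cp312-cp312-manylinux_2_17_x86_64.whl  # normal CPython ABI: one wheel per Python version
mypkg-1.0-cp38-abi3-manylinux_2_17_x86_64.whl    # stable ABI (abi3): one wheel covers 3.8+
mypkg-1.0-py3-none-manylinux_2_17_x86_64.whl     # compiled code with no Python ABI (e.g. ctypes/cffi)
```

A generic backend defaults to tags matching the current interpreter (or py3-none-any for pure Python), so if the extension-building step can't override them, the resulting wheels are either over- or under-constrained.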
This would be great, but would be tricky (even assuming you meant "sdist", not "wheel"). Every backend has a different method. Hatchling's is the best, IMO, and is what scikit-build-core is modeled on too: it starts from the .gitignore file rather than asking git, so it doesn't depend on git being available but still has a sensible default. I'd like a way to specify "src" directories, too; if there were a standard way to specify them, tools like Ruff wouldn't have to be told about them separately. This is the biggest problem with the ultra-simple flit-core: getting the includes right always involves manually listing patterns in pyproject.toml.
I'd imagine something like this being really nice:

```toml
packages-dir = "src"
ignore-file = "**/.gitignore"
include = ["**.schema.json"]
exclude = ["/docs"]
```
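For comparison, roughly how you'd spell that in Hatchling today - per-backend configuration rather than a standard (hypothetical package name; see the Hatchling docs for exact pattern semantics):

```toml
[tool.hatch.build.targets.wheel]
packages = ["src/mypkg"]  # hypothetical

[tool.hatch.build.targets.sdist]
include = ["*.schema.json"]
exclude = ["/docs"]
```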
Going all the way to wheel would be even harder to standardize. It might force projects to lay themselves out on disk in a specific way, and especially when it comes to adding built extensions, everyone likes something different. Should the compiled code live next to the Python code and be filtered out when making the package? Should they be in separate folders (remember, some packages are primarily compiled code with an optional Python binding)? Things like Hatchling's force-include are fantastic, but probably not something that you could standardize.
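For reference, Hatchling's force-include maps arbitrary on-disk paths into the distribution (hypothetical paths below), which is exactly the sort of escape hatch that resists standardization:

```toml
[tool.hatch.build.targets.wheel.force-include]
"../artifacts/lib.so" = "mypkg/lib.so"
```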