Some thoughts about easing a migration path away from setup.py

A standard for describing “what files go into the wheel” would indeed be very nice. There’s a bunch of complexity to be navigated to make sure such a standard leaves a way open for backends to handle special cases like generated files (compiled extensions being the obvious example, but tools that generate a _version.py from metadata are another case, one that applies even to pure Python files). But a “works for straightforward cases” solution would be very welcome.

The trick is that someone has to put together a proposal, and get buy in from all the various backends. I don’t know if the backends feel that the options have been explored sufficiently that they are ready to standardise something, but it feels like we’re at least close to that point.

2 Likes

For me, “could be streamlined if not for backwards compatibility” is already a problem worth solving. Call it an aesthetic sense or something. It doesn’t particularly bother me if others don’t see value in it.

However, I’d like to be able to offer something that feels really minimal, while still offering adequate control to handle a few more complex cases. I also want to have something that is really just a backend that is not even associated with a larger toolchain.

Agreed. My solution is essentially: here are a few canned recipes, if you want something else then you can write the code to choose the files. At least for now. But the point is, it’s a plugin-style interface. Just like how PEP 517 describes hooks and a backend implements them and a frontend calls them - this backend would describe and call hooks that the user has an option to implement.
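
To make that concrete, here is roughly the kind of user-implemented hook I’m picturing; the module name, the config key, and the hook signature are all made up for illustration, not an existing API:

# pyproject.toml (hypothetical):
#   [tool.somebackend]
#   file-hook = "build_hooks"   # import name of a module next to pyproject.toml

# build_hooks.py -- the backend imports this and calls choose_files() instead of
# its built-in recipe, much as a PEP 517 frontend calls build_wheel on a backend.
import pathlib
import shutil

def choose_files(source_dir: pathlib.Path, staging_dir: pathlib.Path, config: dict) -> None:
    """Copy everything under src/ into the staging area, skipping caches."""
    for path in (source_dir / "src").rglob("*"):
        if path.is_file() and "__pycache__" not in path.parts:
            target = staging_dir / path.relative_to(source_dir)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)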

There isn’t a standard for a manifest file, and there certainly isn’t a standard for files equivalent to Setuptools’ MANIFEST.in. I also quite dislike the idea of having to update a manifest file every time I add a new source code file to the project. Especially when the most common cases are “everything in src/”, “everything version-controlled in src/”, and perhaps the same but with some simple regex filters.

The existing large setup.py files I’ve seen seem to be mostly code that figures out a bunch of paths and somehow gathers them into the right attributes of some Distutils subclass, and then invokes whatever Distutils magic I assume actually runs the compiler. The issue for me is that there’s a lot of boilerplate involved in putting things in the places Setuptools/Distutils wants them to be.

My hope is that, in the refactoring that my architecture would force, these setup scripts would have a much clearer separation between the requirements-gathering and compiler-invocation steps. Then the former could eventually migrate to PEP 725 or whatever we end up with.

I assume that there will always be a need for some kind of script-driven compiler invocation step; it doesn’t seem feasible to provide metadata that allows a backend to deduce all the necessary command lines in the right order.

… But maybe that’s worth thinking about more.

Building compiled code is really, really hard. There’s a lot of code in setuptools/distutils dedicated to supporting different situations and compilers, and it’s still really, really basic. You can’t do basic things like compiling the files of a single target in parallel threads, or selecting a C++ standard (like C++11), without coding them up yourself. NumPy famously had 13,000+ lines of code dedicated to building with distutils - quite a bit of it was helping support Fortran compilation, which wasn’t built in either. Things like cross-compile support, WebAssembly support, etc. are all hacks that we’ve just lived with. And if you want to include another library, you are almost always on your own, having to construct the command line invocations for each possible compiler.

The path forward for most of these projects is to use a tool designed to build compiled code (CMake or Meson), and backends that do this are getting to be pretty good (scikit-build-core and meson-python). These tools handle most compilers and multiple languages, have great support for all the things you might expect these days (like multithreaded builds, etc.), and support libraries that export build configurations. It was really fun to sit down with people working on various projects at SciPy and show them that an 800+ line setup.py could be replaced by <20 lines of CMake and a simple scikit-build-core configuration (mostly PEP 621, and I used hatch new --init to convert the metadata automatically), and it also worked in more places than the old one did!

Hatchling is 5K lines of code. I think you are looking at the whole repo, which includes hatch. Hatch is basically a replacement for nox/tox and PDM combined (minus locking), so it’s going to be a bit large.

It does use dependencies, though - a really minimal example without any dependencies (save a vendored copy of tomli) is flit-core. It doesn’t have the custom plugin feature you’d like, though.

There’s a library for extensionlib, and it’s on my todo list. Getting it right will be tricky, as there are a lot of details when building an extension. I’m planning on relying heavily on the experience gained with scikit-build-core. PDM-backend, Poetry-core, and Hatchling all have the ability to add custom build steps, including those that build binaries, and they all tend to have problems, since there are details like the compiled extensions needing to control the tags of the output wheel - only the compiled extension knows if it needs the normal Python ABI, ABI3, no Python ABI at all, etc. There’s probably an issue every few weeks on cibuildwheel from some user who tried to set this up with Poetry-core themselves, and finds that not all platforms / cross-compiles work, because Poetry wasn’t really designed for binary extensions.

This would be great, but would be tricky (even assuming you meant “sdist”, not “wheel”). Every backend has a different method. Hatchling is the best, IMO, and is what scikit-build-core is modeled on too; it starts from the .gitignore so it doesn’t depend on git being available, but is still a sensible default. I’d like a way to specify “src” directories, too; if there was a standard way to specify them, then tools like Ruff wouldn’t have to also be told about them separately. This is the biggest problem with the ultra-simple flit-core; getting the includes right always involves manually listing patterns in pyproject.toml.

I’d imagine something like this being really nice:

[source]
packages-dir = "src"
ignore-file = "**/.gitignore"
include = ["**.schema.json"]
exclude = ["/docs"]

Going all the way to wheel would be even harder to standardize. That might force projects to lay themselves out on disk a specific way, and especially when it comes to adding built extensions, everyone likes something different. Should the compiled code live next to the Python code, and be filtered out when making the package? Should they be in separate folders (remember, some packages are primarily compiled, with an optional Python binding)? Things like Hatchling’s force-include are fantastic, but probably not something that you could standardize.

9 Likes

Also, I highly recommend becoming familiar with what exists before trying to write something new. Make sure you have at least written one Hatchling plugin. Find an example or two of packages that do not fit the current workflows well. I know trampolim had a simple build.py. You might find that what is needed is just documentation for what exists.

It is somewhat annoying that many of the backends are tied to frontends, but the situation is getting better. PDM supports any PEP 621 backend. Hatch is supposed to soon. I’ve been using Hatchling for quite a while, and I’m just now beginning to learn a bit of Hatch. I’ve never used Flit, but have used flit-core for years. Etc.

5 Likes

I guess there isn’t a good way, besides actually digging into the implementation, to understand what is supported…?

Yes, I think I saw you mention something like this. I’d be interested to see the input and output from that effort. What are tools like CMake and Meson doing that is so magical and needs tens of kLOC of Python to emulate? And how much wrapping does the backend need to do? I mean, can it subprocess.call an existing application, or does it have to have its own built extensions, or what?

Strange. Why wouldn’t it be the other way around?

I thought that ABIs existed on a per-library basis. How does this work if multiple extensions need to be compiled for the same wheel?

I like the basic shape of this and I might try implementing something analogous in the [tool] table just to see it. As regards wheels, my thinking was: it’s the plugin’s job to figure out where the C code is and where the corresponding object files should go; and then the C code is filtered out from the wheel. (or Fortran, whatever, mutatis mutandis) It’s not clear to me why there isn’t always some kind of Python binding - how else will the code be called? At any rate, the correct transformation from sdist to un-built wheel seems arbitrarily complex, as does the organization of the build artifacts - I can’t fathom describing it in data. But I don’t see a problem with the conceptual flow of lay things out → compile C code → put object files in places → clean up.

(Maybe there should be separate hooks for the phases of that…?)

Skimming through the trampolim source it seems like it uses .trampolim.py for that kind of configuration. I don’t see anything useful in the way of documentation, but the overall design seems like it might be close to what I envisioned.

The differences are all just about how you decide which files to include. That is why I would propose just to have a manifest file like:

dir: proj
mod: src/proj/__init__.py -> proj/__init__.py
mod: src/proj/mod.py -> proj/mod.py
...

Basically a file that explicitly lists which individual files are included and where they are supposed to go. I understand that many people would not want to type this out, but that is why it should be generated by a tool. Different tools can have different ways of generating the manifest, but the manifest itself should be very simple and easy to understand for both humans and computer programs.
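
Consuming such a manifest would be trivial for any tool; roughly something like this (a hypothetical reader, just to show how little is involved):

import pathlib
import shutil

def apply_manifest(manifest: pathlib.Path, source_root: pathlib.Path, staging: pathlib.Path) -> None:
    # "dir:" lines create directories, "mod:" lines map a source file to its
    # location inside the staging tree for the wheel.
    for line in manifest.read_text().splitlines():
        kind, _, rest = line.partition(":")
        if kind.strip() == "dir":
            (staging / rest.strip()).mkdir(parents=True, exist_ok=True)
        elif kind.strip() == "mod":
            src, _, dest = rest.partition("->")
            target = staging / dest.strip()
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source_root / src.strip(), target)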

I don’t like it because it has implicit defaults and too many configuration options, making it difficult to reason about exactly what it is doing, short of running it and then unpacking the sdist to diff the contents. In a project with many contributors it needs to be very easy to understand when a change will result in files being added to or removed from either the sdist or wheel.

I can see that being nice but I would want that as the input to a manifest generating tool so I can run commands like:

$ hatch generate-manifest
dir: proj
mod: src/proj/__init__.py -> proj/__init__.py
...

Then I would want commands like hatch validate-manifest, hatch update-manifest etc that can check if the manifest matches the rules laid out in the config and maybe hatch build could warn or error if the manifest seems to be out of date. You could run hatch validate-manifest in CI or even have hatch update-manifest run automatically on PRs.
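
The CI check itself could be as simple as regenerating and diffing. Here is a toy version in which the “rules” are just “everything under src/” and the committed file is assumed to be called MANIFEST; a real generator would of course apply whatever rules are configured:

import pathlib
import sys

def generate_manifest(root: pathlib.Path) -> str:
    # Trivial stand-in for a rule-driven generator: list every file under src/.
    lines = sorted(
        f"mod: {p.relative_to(root).as_posix()} -> {p.relative_to(root / 'src').as_posix()}"
        for p in (root / "src").rglob("*")
        if p.is_file()
    )
    return "\n".join(lines) + "\n"

def main() -> int:
    root = pathlib.Path.cwd()
    committed = (root / "MANIFEST").read_text()   # hypothetical file name
    if committed != generate_manifest(root):
        print("MANIFEST is out of date", file=sys.stderr)
        return 1
    return 0

if __name__ == "__main__":
    sys.exit(main())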

what? lol

1 Like

Yes precisely, I mentioned that in the final point here User experience with porting off setup.py - #11 by ofek

As Henry and everyone else has mentioned, please look at what already exists and then circle back with what you find insufficient.

It has what I would argue are the best defaults for users, and in fact what users expect to happen by default. Additionally, there is no guessing, because the defaults when you don’t specify inclusion/exclusion options have been documented forever:

1 Like

Are you aware that a shared library (a .so on Unix, a .pyd - a renamed .dll - on Windows) can be imported directly by Python, and its exposed methods called, without any sort of wrapper Python code? If you’re not, then you really need to read up on that feature, because otherwise you’re not going to understand the background of how C extensions (can) work.
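
For instance (the directory, module name, and function below are made up; the point is just that a plain import statement loads the compiled file, with no .py wrapper involved):

import sys

# Suppose a build produced build/lib/spam.cpython-312-x86_64-linux-gnu.so
sys.path.insert(0, "build/lib")

import spam              # the interpreter loads the .so directly
print(spam.__file__)     # .../build/lib/spam.cpython-312-x86_64-linux-gnu.so
print(spam.add(1, 2))    # calling a function defined in C, assuming the module exposes one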

3 Likes

Some thoughts about easing a migration path away from setup.py

From my point of view, setuptools is setup.py, setup.py is setuptools.

Maybe we rather want to help projects migrate away from setuptools when it makes sense. Maybe we want to help other build backends get better at doing the things setup.py enables, when it makes sense. But I am not convinced it is worth pursuing a deep change to setuptools so that it no longer needs to rely on setup.py. Or rather, it’s up to setuptools to decide that for themselves. Let setuptools be.

1 Like

If you want to learn about that, I highly recommend that you look at Meson’s source code (GitHub - mesonbuild/meson: The Meson Build System). It’s ~75 kLOC of Python code. (CMake is comparatively harder to dive into since it’s a ~700 kLOC behemoth of C++.) The command

cloc --by-file --vcs=git .

is a good start to see where the complexity is.

You will find things like: a build config DSL (because C and C++ are not standardized well enough for fully declarative config to work well); a backend for translating everything to Ninja build definitions (for fast and incremental builds) but also XCode and MS Visual Studio; 1kLOC just for finding Boost; lots of data about the idiosyncrasies of compilers of various platforms; framework-specific stuff like invoking Qt’s “meta-object compiler” or GNOME’s binding generation tools; cross-compilation; and so on.

Realistically, if every project reimplements the subset of these things that it needs, it’s going to reimplement them badly.

1 Like

Assuming they don’t type it out, this just kicks the can down the road. Either you need the tool to somehow get it right all the time, or you have to tell the tool what the rules are, and then you need a format for that input. Either way, a manifest like that isn’t solving the problem; it’s just recording the solution (and whether that happens before or after the actual file copy doesn’t matter much).

Aside from the feature request to make and work with a manifest as a separate task, I think you are both on the same page, actually. But while such a manifest file could be validated and such, I’m not convinced that it actually facilitates (re)building a wheel.

I relied on an online repository counting tool, applied too naively - the issue was already pointed out. Sorry about the confusion; anyway, my planned feature set is surely even less ambitious.

I checked out extensionlib - unfortunately I couldn’t get a clear sense of how it works, in particular how it actually helps with producing the extension modules - what I saw only seems to help organize the code that would do so. Also, by my reading it would not be PEP 621 compliant to add a [[project.extensions]] array.

However, I do strongly agree with the idea of separating extension building from wheel packing. I just figure the easiest way for the extension builder to communicate the desired location of the build artifact - given that it’s going to operate within an isolated environment that will ultimately hold the wheel contents - is to just put the build artifact in the appropriate place in that environment. I don’t really want to define a separate API for that. Although I guess that avoids coupling to that design decision, for other build backends that want to work differently… ?

Thinking about it some more, I think I must have been. But I will need to understand it in more detail.

This is a fair assessment and I don’t mean to backseat the Setuptools team. The config scripts I imagine would not be compatible with Setuptools, nor my implementation with a current setup.py. Rather, the goal is to take inspiration from setup.py, and design something that people accustomed to setup.py could figure out easily enough, as the next step after moving static bits to pyproject.toml. I’m starting my analysis from Setuptools because it’s the obvious starting point: it’s what Pip uses by default, and what makes all those existing setup.py files work. But I’m neither trying to refactor Setuptools into oblivion nor asking anyone else to do the same; instead, I’m building upward from the example code in PEP 517 (which seems to be exactly what the PEP intended to happen).

Frightening, and impressive. I agree that it would be far better to leave it to those who already have a headstart on the task. But given that, I’m now more interested in how to interface to it. It looks like there is no API and the intended interface is all command-line, and that you basically use it just to produce the necessary artifacts? I see some stuff in the documentation about installing things, but it seems to refer to system-level stuff, so not compatible with the wheel format. (The examples seem focused on standalone C executables anyway.)

I guess there’s also the option of using Ninja directly, but it seems like at some level there’s always going to be some interface layer somewhere that does a subprocess.call etc. to invoke the C-building system.

When I dig through all the layers (Setuptools → distutils build command → build_clib or build_ext command → ccompiler base class (the command’s compiler attribute) → _compile in an implementation class → spawn back in the base → top-level spawn function), I do in fact end up at such a wrapper. (I don’t know why I had any doubt I would.) It’s just that all the intermediate layers seem to be trying to implement some part of what Meson etc. do; and you’ve very much convinced me not to try to implement any of that myself.

So now I’m firmly convinced that I just want to smooth out that step a little bit, which honestly is pretty much what I originally had in mind. People who want to shell out to Meson can install Meson and do that. People who just want to make one manylinux wheel, and know exactly which gcc commands they want, can use those commands directly instead. There just needs to be a hook at the right point in the process to do that, and a wrapper for things like logging and collecting errors from each invocation.
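
Roughly the kind of wrapper I mean (nothing here is an existing API, just the shape: the user supplies the exact commands, the backend runs them, logs output, and collects failures):

import subprocess
import sys

def run_build_commands(commands: list[list[str]], cwd: str) -> None:
    failures = []
    for cmd in commands:
        print("+", " ".join(cmd), file=sys.stderr)
        result = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
        sys.stderr.write(result.stdout + result.stderr)
        if result.returncode != 0:
            failures.append((cmd, result.returncode))
    if failures:
        raise RuntimeError(f"{len(failures)} build command(s) failed: {failures}")

# e.g. run_build_commands([["gcc", "-O2", "-fPIC", "-shared",
#                           "-o", "proj/_helper.so", "src/_helper.c"]], cwd=".")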

what I saw only seems to help organize the code that would do so

Yes, that is literally the purpose, and only that. Whether you are using CMake, Rust, or whatever else to build extensions, it doesn’t matter: the interface is the same and standardized.

Okay, I’m glad I understood properly.

When you said

If “this” can include other designs for the same fundamental idea, then I’m happy to help. But from what I understood of the interface described by extensionlib, I didn’t really like it. From experience, it’s hard to explain this kind of thing, and it depends a lot on subjective personal preferences. I think it will be easiest for me to express my own ideas in code.

Can you please describe briefly not the interface but very high level what you think the components of building Python packages are/should be conceptually?

edit: specifically a wheel, forget about all other possible outputs

1 Like

This is the flow I imagine for building a wheel.

  1. A build frontend invokes the PEP 517 build_wheel hook.

  2. The build backend creates a temporary folder that will contain the files to be packed. (Aside from build isolation, this is the easiest way to handle the requirement that the source folder may be read-only.)

  3. The backend parses pyproject.toml and produces a combined config object from the frontend’s config_settings and the appropriate [tool] table. It remembers the [project] table for later metadata creation.

  4. The backend invokes a “manifest” hook, which is responsible for copying necessary files and folders into the temporary folder - laid out as they would be for an sdist. Normally this will use a built-in hook provided by the backend (which in turn may care about the config), but it can be user-defined for more control.

  5. The backend invokes zero or more “build” hooks, which are responsible for invoking compilers as needed. There can be several that handle separate extensions, or one that oversees the entire process (possibly doing its own imports of helpers), or none for a pure Python wheel.

  6. The backend invokes a “cleanup” hook, which is responsible for any necessary rearrangement and deletion of C source files. After this step, the packages for the wheel should be in src/, and certain other subfolders at top level can be used to specify the wheel’s data files. Anything else at top level will be used for metadata at most. The default cleanup hook basically just enforces “src layout”.

  7. Metadata is generated based on any README, LICENSE etc. files that remain at top level. (This is deferred in case the cleanup hook does something especially tricky.)

  8. The backend reorganizes and packs the appropriate folders into the wheel, and (most likely) removes the temporary folder. It returns the wheel’s basename to the frontend, per PEP 517.

Building sdists would be essentially the same for the first four steps. It would skip steps 5 and 6, and have different/simpler rules for steps 7 and 8.

As I understood your idea, the separation here is between step 5 and everything else.

Re-reading this, I realize I didn’t decide how/where the wheel tags are computed. I guess the cleanup hook is the most sensible place for that.
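
To make the steps above concrete, here is a deliberately oversimplified sketch of how they might hang together inside the PEP 517 hook. Every helper name (and the backend name “somebackend”) is invented, and dist-info/RECORD handling is skipped entirely:

import pathlib
import shutil
import tempfile
import tomllib
import zipfile

def default_manifest_hook(source: pathlib.Path, staging: pathlib.Path, config: dict) -> None:
    # step 4: the built-in "copy src/ plus top-level metadata files" recipe
    shutil.copytree(source / "src", staging / "src")
    for name in ("pyproject.toml", "README.md", "LICENSE"):
        if (source / name).exists():
            shutil.copy2(source / name, staging / name)

def build_wheel(wheel_directory, config_settings=None, metadata_directory=None):
    source = pathlib.Path.cwd()
    staging = pathlib.Path(tempfile.mkdtemp())                           # step 2
    pyproject = tomllib.loads((source / "pyproject.toml").read_text())
    config = {**pyproject.get("tool", {}).get("somebackend", {}),
              **(config_settings or {})}                                  # step 3
    default_manifest_hook(source, staging, config)                        # step 4 (user-overridable)
    for hook in config.get("build-hooks", []):                            # step 5: compile extensions
        ...                                                               # import and call each one
    # step 6 (cleanup hook) and step 7 (metadata generation) omitted here
    name = "example-0.1-py3-none-any.whl"                                 # the tag really comes from step 6
    with zipfile.ZipFile(pathlib.Path(wheel_directory) / name, "w") as whl:
        for path in (staging / "src").rglob("*"):                         # step 8
            if path.is_file():
                whl.write(path, path.relative_to(staging / "src").as_posix())
    shutil.rmtree(staging)
    return name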

2 Likes

Thanks! I now understand what you were talking about.

As always I am in favor of 5 since that is the concept behind extensionlib but:

4.) At face value it’s wasteful compared to just putting everything in a source distribution, but actually this would be an improvement because many tools build the wheel from the source distribution and therefore an unpacking step would no longer be necessary. I would be in favor, except I don’t think this optimization realistically will be accepted, because the standards would have to be updated and every backend would have to change. Since this is just an optimization, I don’t see this happening.

6.) I think this is trying to do too much and is largely unnecessary if we have 5 because the outputs would be known and therefore can be removed. Anything extra should be the purview of build backends and other tools.

Maybe I should have been clearer that this is only the design I’m expressing in my own project.

To build sdists, there has to be some kind of step that decides what goes into an sdist. I expect that almost everyone will be able to use the default, but it’s a clear separate step in my design so I might as well expose the hook. Aside from that, once we already have the decision to copy files to a build folder, “everything laid out as it should be for the sdist” seems to me like the most natural starting point for a wheel build.

I see the opportunity there for an optimization, but I’m not trying to push it on others (at least, not yet). Many other toolchains want to verify explicitly that the sdist can be unpacked to build a wheel, and indeed that’s what build does by default. In fact, since PEP 517 doesn’t specify an interface for building both at once, I could only take advantage of the optimization by exposing a config setting (a flag for build_wheel that means to pack the sdist as well), and then I couldn’t communicate to the frontend about it. So, doing it properly would take a new PEP, and I don’t know how well that would be received.

In terms of cleanup, maybe it won’t be necessary in general, but again I am just exposing a hook in my own design. But my thinking is that someone might want to write per-extension hooks that leave the .so files etc. in the simplest places, and then a single overall hook that figures out where they go. Or maybe they all need to be linked together at the end somehow.

There’s also the issue of wheel tags, which has to happen somewhere. Maybe the build system is responsible for figuring out what the platform is, which then determines the wheel tags. Maybe it has to do something different to target different Python versions. I guess this is something where I’d have to talk to cibuildwheel users to get a better idea.
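
For the simple cases, I imagine something like this using the packaging library (the function name is mine, and cross-compiles, abi3, and limited-API builds would obviously need more than this):

import packaging.tags

def wheel_tag(has_extensions: bool) -> str:
    if not has_extensions:
        return "py3-none-any"
    best = next(iter(packaging.tags.sys_tags()))   # most specific tag for the running interpreter
    return f"{best.interpreter}-{best.abi}-{best.platform}"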

What matters is that it should be easy to understand when something of potential consequence is being changed like files being added or removed from sdist or wheel. If you have a VCS-controlled manifest file then it is very clear when the contents of the release artefacts are being changed: all changes are explicitly visible in the diff of any pull request whether that means changing the contents of the files or changing which files are included.

What also matters is what it is exactly that can be made standard. The different tools like setuptools, poetry, hatch etc have all made opinionated decisions about how to specify the configuration of which files are included in sdist/wheel and it seems unlikely that we could get them to agree on a single standardised approach for this configuration. What can be standardised though is a very simple manifest file format that makes no implicit or opinionated decisions and that any tool can easily output or consume.

Indeed. In fact, “I prefer tool X’s opinionated decision” sounds like one of the main reasons someone would choose that tool. Part of the point of PEP 517, as I understand it, was to enable that kind of expression.

To be clear, do you imagine that there would be tools that produce a manifest but don’t build a wheel? And tools that expect the manifest file to exist rather than using their own scheme? (Or perhaps they’d offer a switch to override their [tool]-specific config with the manifest… ?)

I can see value in that, but I’d be opposed to mandating that any particular toolchain supports such a flow.

I guess the format is not quite as straightforward as it sounds, so there would be some point in standardization because there are actual decision points. At least, I can think of one: how to represent folder structure (either with some hierarchical organization - maybe involving indentation - or else by explicitly giving the full path for every file).