Some thoughts about easing a migration path away from setup.py

Continuing the discussion from User experience with porting off setup.py:

I want to keep on this line of inquiry for a bit. My original title was going to be “What would it take to get rid of setup.py for good?”, but this seems so involved that even I’d like to step back a bit. My first goal here is to enable people to replace their current setup.py configuration with something that is still Python code, but simpler, not constrained by Setuptools/Distutils boilerplate, and not dependent on Setuptools.

The following replies caught my attention:

My synthesis of these ideas, after also looking at some of the examples of setup.py that were shown in that thread, and looking at some of the Setuptools code:

  • In the world where everyone is at least not doing anything actively deprecated (even if they aren’t following best practices), the only remaining role of Setuptools is as a PEP 517 backend.
  • Viewed purely as a PEP 517 backend, Setuptools is absurdly heavyweight.
    • It vendors its own version of Distutils and then builds on top of that. Prior to Python 3.12, it needs to be able to replace the standard-library Distutils, which entails more code for the monkeypatching and even more code just to warn the user about the consequences of that monkeypatching.
    • Then this system is intended to be able to parse a command line by dynamically looking up the classes that implement its commands, and having those class implementations in turn delegate to each other and do a bunch of other complex stuff.
    • In order to actually communicate with this system, anyone whose project involves compiling some non-Python stuff will be expected to subclass the build_ext command implementation, and then call a setup function provided by Setuptools, and pass a keyword argument like cmdclass={"build_ext": my_build_ext}, and also separately pass in a list of Extension instances, which then I guess interacts with the build_extensions method somehow.
    • Meanwhile, the script that’s importing Setuptools and calling the setup method was itself probably invoked from Setuptools anyway! And it has to use exec rather than import to do that, and it potentially has to run the script multiple times, with different monkey-patches each time - so that it can separate the process of asking the script for dynamically-discovered dependencies, vs. having it invoke whatever Distutils machinery to spawn processes for compilers. (Install dependencies, not build dependencies, btw. I think. It’s hard to keep track.)
    • And in spite of all this complexity, there apparently isn’t a good way to pass config options to the setup script that isn’t the now-deprecated approach of running it directly as a script. Which was a huge part of the original complaint in the previous thread, after all. [1]
  • On the other hand, there’s an example in PEP 517 showing that a minimally compliant - albeit user-hostile - backend is not many lines of code at all (a sketch along those lines follows this list). Embellishing that for tasks like generating the metadata shouldn’t be too difficult, as all the necessary code already exists in various places.
  • Without something like PEP 725, there’s no way to specify platform-specific requirements. The easiest way to describe these sorts of things is in code, by letting Python check environment variables (although it’d be nicer if it could get that information from pyproject.toml, naturally… ?) and things like os.name and sys.platform, and compute data like lists of C source files that represent separate extension modules, or paths for libraries that the code will link against, etc.
  • But because of how Setuptools got to where it is now, the control flow is backwards and the setup.py script is dependent on Setuptools. It can’t just provide the necessary information to another backend. But actually, that would be backwards anyway: it would require standardization of the format for that information, which is at least as hard as solving PEP 725 and getting everyone’s buy-in.
    • As long as the “configuration” has to be running arbitrary Python anyway, it would make more sense to do the “dependency injection” thing, and have the backend call an explicit entry point in the setup script and provide utilities to that entry point, representing the sorts of functionality that the base build_ext command currently provides. It’d offer a callback that could accept lists of library dirs etc., instead of having to make a subclass and communicate the information via self.compiler or whatever.
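
Here is the sketch promised above: a minimally compliant backend in the spirit of the example in PEP 517. It assumes a pure-Python project, hardcodes the name and version as placeholders, and skips real metadata generation entirely - which is exactly the kind of embellishment that would still need to be added.

# A rough sketch only: the project name and version are hardcoded
# placeholders, real metadata generation (METADATA, RECORD) is skipped,
# and the sdist naively packs the whole working directory.
import os
import tarfile
import zipfile

PKG, VERSION = "example", "0.1"


def build_sdist(sdist_directory, config_settings=None):
    """Mandatory PEP 517 hook: pack the source tree into a .tar.gz."""
    name = f"{PKG}-{VERSION}"
    path = os.path.join(sdist_directory, name + ".tar.gz")
    with tarfile.open(path, "w:gz") as tar:
        tar.add(os.curdir, arcname=name)
    return os.path.basename(path)


def build_wheel(wheel_directory, config_settings=None, metadata_directory=None):
    """Mandatory PEP 517 hook: pack the package into a .whl (a zip file)."""
    name = f"{PKG}-{VERSION}-py3-none-any.whl"
    path = os.path.join(wheel_directory, name)
    with zipfile.ZipFile(path, "w") as whl:
        for root, _, files in os.walk(PKG):
            for f in files:
                whl.write(os.path.join(root, f))
        # A real backend must also write METADATA and RECORD here.
        whl.writestr(f"{PKG}-{VERSION}.dist-info/WHEEL",
                     "Wheel-Version: 1.0\nGenerator: sketch\n"
                     "Root-Is-Purelib: true\nTag: py3-none-any\n")
    return name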

I propose to create, as a proof of concept, a minimal PEP 517 backend that can be told to invoke Python scripts for customization purposes, which only considers itself responsible for setting up metadata and packing. Because I think I’m clever or something, I plan to call it stptls.

Of course, anyone who wanted to use this would have to rewrite their setup.py against a new interface, but they should be able to keep most of the core logic intact, just refactored. And of course it breaks command-line setup.py install etc. invocations, but the entire point is to deprecate those anyway. I don’t really expect major projects to use this, but it’d be nice to prove that it’s usable. I’d even be willing to try to port and refactor some large existing setup.py files just to show what it could look like.

On the upside, such configurations could be simplified and made backend-independent, and also the new versions would (I think) be easier to migrate to a PEP 725-like standard if and when we get something that works there.

An outline of the design:

  • In pyproject.toml, [tool.stptls.hooks] (or something like that) specifies dotted names of configuration scripts. When a name starts with a ., the backend will try to do a relative import of one of the hooks it provides itself. Otherwise, it will try to do an absolute import of a setup script provided in the repository.
    • Every such script is expected to provide a specific named function, which will be called with a config object based on pyproject.toml contents plus perhaps some useful callbacks.
  • To build an sdist, it creates a temporary folder, runs the manifest hook, creates metadata, creates the .tar.gz from the entire temporary folder, and puts that in dist/.
    • The manifest hook is responsible for putting the necessary files into the temporary folder. There’s a builtin hook that just copies everything, and another that attempts to use a git checkout.
  • To build a wheel, it creates a temporary folder, runs the manifest hook, [2], runs the compile hook, creates metadata, creates the .whl from the package root, and puts that in dist/.
    • The compile hook is provided with callbacks to invoke compilers and such, perhaps something useful for setting up build isolation, I don’t know yet. It would be responsible for the same calculations that current setup.py scripts do to figure out what needs to be compiled and where everything is located; then instead of delegating to build_ext.run or whatever, it would explicitly invoke the callbacks that were given to it (see the sketch after this outline).
    • The default, built-in compile hook does nothing.
    • I’ll figure out something for package data later.
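
As a purely illustrative sketch of the shape I have in mind - the hook name, the config attributes, and the tools callbacks are all invented here, since none of this exists yet - a user-written compile hook might look like:

# Purely illustrative: the hook name, the config attributes, and the
# `tools` callbacks are all invented here; none of this exists yet.
import sys


def compile_hook(config, tools):
    """Hypothetical compile hook called by the backend while building a wheel.

    `config` would carry the parsed pyproject.toml data plus the staging
    folder; `tools` would carry the callbacks the backend offers in place
    of build_ext (compile, link_extension, ...).
    """
    sources = ["src/_speedups/core.c"]
    libraries = []
    if sys.platform == "win32":
        libraries.append("ws2_32")      # platform-specific choices live in code
    elif sys.platform.startswith("linux"):
        libraries.append("m")

    objects = tools.compile(sources, include_dirs=["src/_speedups"])
    tools.link_extension("example._speedups", objects, libraries=libraries)

# Wired up in pyproject.toml as something like:
#   [tool.stptls.hooks]
#   compile = "build_hooks.compile_hook"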

This ought to take all the Distutils-command-line-invocation-related boilerplate out of the process, and optimize for the common case of pure Python distributions - you just skip the compile hook, and then whatever the manifest hook left behind is suitable for both sdists and wheels.

Thoughts?


  1. There isn’t really a way to do that at all, of course; Setuptools will fake it by hacking sys.argv before exec’ing the script. Which, of course, is part of why it can’t run the script via import. But when Setuptools runs the script for you, this “doesn’t count” as doing the ugly deprecated thing. So Setuptools needs even more code so that when the exec’d script imports the Setuptools machinery, it can detect whether the whole thing started from Setuptools exec’ing it. ↩︎

  2. At this point, if it was asked to build an sdist as well, I suppose it could do that here instead of repeating work. Except I don’t think that PEP 517 really facilitates that workflow… ? ↩︎

1 Like
  1. Setuptools is the way it is because of backward compatibility requirements. Writing an alternative backend is a fine goal, but you shouldn’t criticise setuptools for its design if your alternative doesn’t support all of the legacy code setuptools has to.
  2. @ofek has a proposal for an extension builder interface that might be of interest to you.
  3. Your proposed backend sounds interesting for its minimalist nature. I look forward to seeing it when you complete it.

You’re getting fairly close to both hatchling and PDM-backend with the approach/mechanisms that you are describing, albeit with slightly less structure to the actual hooks.

I would suggest looking at what these projects are doing and building a few toy-ish examples doing compilation, code generation, and not-1:1 source to installed file mapping to get some experience with those backends.

1 Like

To be clear: I don’t mean to denigrate or blame anyone for the current state of things. I just want to illustrate the complexity of the system, and how much of it is unnecessary for typical use cases, specifically in the context of users who are trying to move past deprecations. My understanding is that the lion’s share of the backward compatibility requirements in question relate to uses that are intended not to be supported in the future.

Is this a PEP in the works? Or do you otherwise have a link I could check out?

That’s… basically hatchling?

Edit: I posted at the same time as @pradyunsg.

Ah, I see Hatchling is specifically the backend used by Hatch - I had been wondering about the distinction. I also see it’s by Ofek, heh. I also see it’s on the order of 50 kloc, which is quite a bit more than I imagined would be necessary to exhibit my idea (at least one, maybe two orders of magnitude). But maybe interfacing to a compiler, etc. is a lot harder than I think…

It doesn’t seem as if PDM’s backend is available separately from the overall project, unless I’m missing something.

Edit: I think what I have in mind, feature-wise, is more like wheel, except that it does offer library use (and sdists) and does not add functionality to Setuptools (or provide a CLI, since that will be build’s job). Apparently wheel is a bit under 10 kloc, though that includes a vendored packaging. So that’s… something.

Building a backend that just supports pure Python wheels is harder than you think, I suspect. You’ll need to support PEP 621 to read metadata from pyproject.toml, you’ll need to parse and validate various fields like version, dependencies, etc., you’ll need to create files like RECORD, you’ll need to support editable installs (PEP 660), …

The example in PEP 517 is extremely simplistic, and out of date in terms of packaging standards that were introduced since that PEP.
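
For a sense of the surface area, these are the hooks a present-day backend typically ends up exposing once PEP 660 editable installs and metadata preparation are taken into account - shown as bare stubs, since the hard part is what has to happen inside them (PEP 621 parsing, RECORD generation, wheel tag computation, and so on):

# The two mandatory PEP 517 hooks:
def build_sdist(sdist_directory, config_settings=None): ...
def build_wheel(wheel_directory, config_settings=None, metadata_directory=None): ...

# Optional under PEP 517, but frontends like pip make good use of them:
def get_requires_for_build_sdist(config_settings=None): ...
def get_requires_for_build_wheel(config_settings=None): ...
def prepare_metadata_for_build_wheel(metadata_directory, config_settings=None): ...

# Needed for `pip install -e .` to work, per PEP 660:
def build_editable(wheel_directory, config_settings=None, metadata_directory=None): ...
def get_requires_for_build_editable(config_settings=None): ...
def prepare_metadata_for_build_editable(metadata_directory, config_settings=None): ...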

It’s available from PyPI (it has to be, to be usable as a backend): pdm-backend · PyPI

2 Likes

I don’t really see what problem you are trying to solve here.

The case of pure Python wheels is already well handled by many different backends. In fact, while the design could be streamlined if not for backwards compatibility, I think that this case was always handled fine by setuptools. The cases where setup.py can be replaced by a static config are also the cases where the setup.py itself looks basically declarative and worked fine already. It was fine for users and it was fine for maintainers (of pure Python projects). Some things could be improved, but there was no need to abandon setuptools and setup.py to make those improvements. We also now have multiple alternative backends to setuptools that can handle pure Python projects and that were designed afresh, without the backwards compatibility constraints that setuptools has.

The basic problem of just specifying which files should go into the sdist, which should go where in the wheel, and then specifying some metadata like the project version is not really that complicated. It is easy to over-complicate solutions to this simple problem, though: when I look at e.g. how to tell hatch what files to include, I can’t help but think that I would rather just list all of the files explicitly in a version-controlled manifest file. I don’t want to have to interpret a complicated combination of different tool-specific include, exclude, glob, .gitignore etc. rules to try to imagine which files will end up in the sdist and the wheel and whether or not the changes in a pull request would affect that. I mention hatch here, but the same problem applies to all of them, and it is made worse by the fact that there are multiple tools that all do it differently.

It would be nice if tools like hatch would provide a way to generate or validate explicit manifest files. I don’t see why the configuration that is used to generate the sdist/wheel for a simple pure Python project needs to be any more complicated than a few pieces of metadata and a manifest file though.

The difficult part is and always was building projects that have non-Python code and external non-Python runtime and build dependencies. We don’t need another “build backend” that doesn’t do anything to help with the cases that are actually difficult. If there are problems with the existing build backends for pure Python projects then it would be better just to improve them (and their documentation).

5 Likes

A standard for describing “what files go into the wheel” would indeed be very nice. There’s a bunch of complexity to be navigated to make sure such a standard leaves a way open for backends to handle special cases like generated files (compiled extensions being the obvious example, but tools that generate a _version.py from metadata are another case, that applies to pure Python files). But a “works for straightforward cases” solution would be very welcome.

The trick is that someone has to put together a proposal, and get buy in from all the various backends. I don’t know if the backends feel that the options have been explored sufficiently that they are ready to standardise something, but it feels like we’re at least close to that point.

2 Likes

For me, “could be streamlined if not for backwards compatibility” is already a problem worth solving. Call it an aesthetic sense or something. It doesn’t particularly bother me if others don’t see value in it.

However, I’d like to be able to offer something that feels really minimal, while still offering adequate control to handle a few more complex cases. I also want to have something that is really just a backend that is not even associated with a larger toolchain.

Agreed. My solution is essentially: here are a few canned recipes, if you want something else then you can write the code to choose the files. At least for now. But the point is, it’s a plugin-style interface. Just like how PEP 517 describes hooks and a backend implements them and a frontend calls them - this backend would describe and call hooks that the user has an option to implement.

There isn’t a standard for a manifest file and there certainly isn’t a standard for files equivalent to Setuptools’ MANIFEST.in. I also quite dislike the idea of having to update a manifest file every time I add a new source code file to the project. Especially when the most common cases are “everything in src/”, “everything version-controlled in src/”, and perhaps the same but with some simple regex filters.
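
For instance, the canned recipe for the first of those cases could be about this much code (the hook name and the config attribute are placeholders I am inventing for illustration):

# A sketch of a canned "everything under src/" hook; the hook name and the
# config attribute (staging_dir) are placeholders invented for illustration.
import shutil
from pathlib import Path


def manifest_hook(config):
    """Copy every file under src/ into the staging folder used for the sdist."""
    src = Path("src")
    for path in src.rglob("*"):
        if path.is_file():
            dest = Path(config.staging_dir) / path.relative_to(src)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, dest)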

The existing large setup.py files I saw seem to be mostly code that figures out a bunch of paths and somehow gathers them into the right attributes of some Distutils subclass, and then invokes whatever Distutils magic I assume actually runs the compiler. The issue for me is that there’s a lot of boilerplate involved in putting things in the places Setuptools/Distutils wants them to be.

My hope is that, in the refactoring that my architecture would force, these setup scripts would have a much clearer separation between the requirements-gathering and compiler-invocation steps. Then the former could eventually migrate to PEP 725 or whatever we end up with.

I assume that there will always be a need for some kind of script-driven compiler invocation step; it doesn’t seem feasible to provide metadata that allows a backend to deduce all the necessary command lines in order.

… But maybe that’s worth thinking about more.

Building compiled code is really, really hard. There’s a lot of code in setuptools/distutils dedicated to supporting different situations and compilers, and it’s still really, really basic. You can’t do basic things like compile files in a single target in threads and support C++ standards (like C++11) without coding them up yourself. NumPy famously had 13,000+ lines of code dedicated to building with distutils - quite a bit of it was helping support Fortran compiles, which was not built in either. Things like cross compile support, WebAssembly support, etc. are all hacks that we’ve just lived with. And if you want to include another library, you are almost always on your own, having to construct the command line invocations for each possible compiler.

The path forward for most of these projects is to use a tool designed to build compiled code (CMake or Meson), and backends that do this are getting to be pretty good (scikit-build-core and meson-python). These tools handle most compilers, support different languages, have great support for all the things you might expect these days (like multithreaded builds, etc.), and support libraries that export build configurations. It was really fun to sit down with people working on various projects at SciPy and show them that an 800+ line setup.py could be replaced by <20 lines of CMake and a simple scikit-build-core configuration (mostly PEP 621, and I used hatch new --init to convert the metadata automatically), and it also worked in more places than the old one did!

Hatchling is 5K lines of code. I think you are looking at the whole repo, which includes hatch. Hatch is basically a replacement for nox/tox and PDM combined (minus locking), so it’s going to be a bit large.

It does use dependencies, though - a really minimal example without any dependencies (save a vendored copy of tomli) is flit-core. It doesn’t have the custom plugin feature you’d like, though.

There’s a library for extensionlib, and it’s on my todo list. Getting it right will be tricky, as there are a lot of details when building an extension. I’m planning on relying heavily on the experience gained with scikit-build-core. PDM-backend, Poetry-core, and Hatchling all have the ability to add custom build steps, including those that build binaries, and they all tend to have problems, since there are details like the compiled extensions needing to control the tags of the output wheel - only the compiled extension knows if it needs the normal Python ABI, ABI3, no Python ABI at all, etc. There’s probably an issue every few weeks on cibuildwheel from some user who tried to set this up with Poetry-core themselves and found that not all platforms / cross-compiles work, because Poetry wasn’t really designed for binary extensions.

This would be great, but would be tricky (even assuming you meant “sdist”, not “wheel”). Every backend has a different method. Hatchling is the best, IMO, and is what scikit-build-core is modeled on too; it starts from the .gitignore so it doesn’t depend on git being available, but is still a sensible default. I’d like a way to specify “src” directories, too; if there was a standard way to specify them, then tools like Ruff wouldn’t have to also be told about them separately. This is the biggest problem with the ultra-simple flit-core; getting the includes right always involves manually listing patterns in pyproject.toml.

I’d imagine something like this being really nice:

[source]
packages-dir = "src"
ignore-file = "**/.gitignore"
include = ["**.schema.json"]
exclude = ["/docs"]

Going all the way to wheel would be even harder to standardize. That might force projects to lay themselves out on disk a specific way, and especially when it comes to adding built extensions, everyone likes something different. Should the compiled code live next to the Python code, and be filtered out when making the package? Should they be in separate folders (remember, some packages are primarily compiled, with an optional Python binding)? Things like hatchling’s force-include are fantastic, but probably not something that you could standardize.

9 Likes

Also, I highly recommend becoming familiar with what exists before trying to write something new. Make sure you have at least written one hatchling plug-in. Find an example or two of packages that do not fit well with the current workflows. I know trampolim had a simple build.py. You might find that what is needed is just documentation for what exists.

It is somewhat annoying that many of the backends are tied to frontends, but the situation is getting better. PDM supports any PEP 621 backend. Hatch is supposed to soon. I’ve been using hatchling for quite a while, and I’m just now beginning to learn a bit of Hatch. I’ve never used flit, but have used flit-core for years. Etc.

5 Likes

I guess there isn’t a good way, besides actually digging into the implementation, to understand what is supported… ?

Yes, I think I saw you mention something like this. I’d be interested to see the input and output from that effort. What are tools like CMake and Meson doing that is so magical that it needs tens of kloc of Python to emulate? And how much wrapping does the backend need to do? I mean, can it subprocess.call an existing application, or does it have to have its own built extensions, or just what?

Strange. Why wouldn’t it be the other way around?

I thought that ABIs existed on a per-library basis. How does this work if multiple extensions need to be compiled for the same wheel?

I like the basic shape of this and I might try implementing something analogous in the [tool] table just to see it. As regards wheels, my thinking was: it’s the plugin’s job to figure out where the C code is and where the corresponding object files should go; and then the C code is filtered out from the wheel. (or Fortran, whatever, mutatis mutandis) It’s not clear to me why there isn’t always some kind of Python binding - how else will the code be called? At any rate, the correct transformation from sdist to un-built wheel seems arbitrarily complex, as does the organization of the build artifacts - I can’t fathom describing it in data. But I don’t see a problem with the conceptual flow of lay things out → compile C code → put object files in places → clean up.

(Maybe there should be separate hooks for the phases of that…?)

Skimming through the trampolim source it seems like it uses .trampolim.py for that kind of configuration. I don’t see anything useful in the way of documentation, but the overall design seems like it might be close to what I envisioned.

The differences are all just about how you decide which files to include. That is why I would propose just to have a manifest file like:

dir: proj
mod: src/proj/__init__.py -> proj/__init__.py
mod: src/proj/mod.py -> proj/mod.py
...

Basically a file that explicitly lists which individual files are included and where they are supposed to go. I understand that many people would not want to type this out but that is why it should be generated by a tool. Different tools can have different ways of generating the manifest but the manifest itself should be very simple and easy to understand both for humans and computer programs.

I don’t like it because it has implicit defaults and has too many configuration options making it difficult to reason about exactly what it is doing short of running it and then unpacking the sdist to diff the contents. In a project with many contributors it needs to be very easy to understand when a change will result in files being added or removed from either the sdist or wheel.

I can see that being nice but I would want that as the input to a manifest generating tool so I can run commands like:

$ hatch generate-manifest
dir: proj
mod: src/proj/__init__.py -> proj/__init__.py
...

Then I would want commands like hatch validate-manifest, hatch update-manifest, etc., that can check whether the manifest matches the rules laid out in the config, and maybe hatch build could warn or error if the manifest seems to be out of date. You could run hatch validate-manifest in CI or even have hatch update-manifest run automatically on PRs.
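
To illustrate how simple checking such a file could be, here is a rough sketch of a validator for the ad-hoc manifest format above (the file name MANIFEST and the format itself are just the hypothetical ones from this discussion, not anything a tool currently implements):

# Verify that every source file listed in the manifest still exists; only
# "mod:" lines are checked in this sketch.
from pathlib import Path


def validate_manifest(path="MANIFEST"):
    missing = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        kind, _, rest = line.partition(":")
        if kind.strip() == "mod":
            source = rest.split("->")[0].strip()
            if not Path(source).is_file():
                missing.append(source)
    if missing:
        raise SystemExit("manifest out of date, missing: " + ", ".join(missing))


if __name__ == "__main__":
    validate_manifest()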

what? lol

1 Like

Yes precisely, I mentioned that in the final point here User experience with porting off setup.py - #11 by ofek

As Henry and everyone else has mentioned, please look at what already exists and then circle back with what you find insufficient.

It has what I would argue are the best defaults for users, and in fact what users expect to happen by default. Additionally, there is no guessing, because the defaults when you don’t specify inclusion/exclusion options have been documented forever.

1 Like

Are you aware that a shared library (.so on Unix, .pyd on Windows) can be imported directly by Python, and its exposed methods called, without any sort of wrapper Python code? If you’re not, then you really need to read up on that feature, because otherwise you’re not going to understand the background of how C extensions (can) work.

3 Likes

Some thoughts about easing a migration path away from setup.py

From my point of view, setuptools is setup.py, setup.py is setuptools.

Maybe we rather want to help projects migrate away from setuptools when it makes sense. Maybe we want to help other build backends get better at doing the things setup.py enables, when it makes sense. But changing setuptools in depth so that it does not need to rely on setup.py anymore? I am not convinced that is worth pursuing. Or rather, it’s up to setuptools to decide for themselves. Let setuptools be.

1 Like

If you want to learn about that, I highly recommend that you look at Meson’s source code (GitHub - mesonbuild/meson: The Meson Build System). It’s ~75 kLOC of Python code. (CMake is comparatively harder to dive into since it’s a ~700 kLOC behemoth of C++.) The command

cloc --by-file --vcs=git .

is a good start to see where the complexity is.

You will find things like: a build config DSL (because C and C++ are not standardized well enough for a fully declarative config to work well); backends for translating everything to Ninja build definitions (for fast and incremental builds), but also to Xcode and MS Visual Studio; 1 kLOC just for finding Boost; lots of data about the idiosyncrasies of compilers on various platforms; framework-specific stuff like invoking Qt’s “meta-object compiler” or GNOME’s binding generation tools; cross-compilation; and so on.

Realistically, if every project reimplements the subset of these things that it needs, it’s going to reimplement them badly.

1 Like