Continuing the discussion from User experience with porting off setup.py:
I want to keep on this line of inquiry for a bit. My original title was going to be “What would it take to get rid of `setup.py` for good?”, but this seems so involved that even I’d like to step back a bit. My first goal here is to enable people to replace their current `setup.py` configuration with something that is still Python code, but simpler, not constrained by Setuptools/Distutils boilerplate, and not dependent on Setuptools.
The following replies caught my attention:
My synthesis of these ideas, after also looking at some of the examples of `setup.py` that were shown in that thread, and looking at some of the Setuptools code:
- In the world where everyone is at least not doing anything actively deprecated (even if they aren’t following best practices), the only remaining role for Setuptools is as a PEP 517 backend.
- Viewed purely as a PEP 517 backend, Setuptools is absurdly heavyweight.
- It vendors its own version of Distutils and then builds on top of that. Prior to 3.12 it needs to be able to replace the standard library Distutils, which entails more code for the monkeypatching and even more code just to warn the user of the consequences of that monkeypatching.
- Then this system is intended to be able to parse a command line by dynamically looking up the classes that implement each command, and having those class implementations in turn delegate to each other and do a bunch of other complex stuff.
- In order to actually communicate with this system, anyone whose project involves compiling some non-Python stuff will be expected to subclass the `build_ext` command implementation, and then call a `setup` function provided by Setuptools, and pass a keyword argument like `cmdclass={"build_ext": my_build_ext}`, and also separately pass in a list of `Extension` instances, which then I guess interacts with the `build_extensions` method somehow.
- Meanwhile, the script that’s importing Setuptools and calling the `setup` function was itself probably invoked from Setuptools anyway! And it has to use `exec` rather than `import` to do that, and it potentially has to run the script multiple times, with different monkey-patches each time - so that it can separate the process of asking the script for dynamically-discovered dependencies, vs. having it invoke whatever Distutils machinery to spawn processes for compilers. (Install dependencies, not build dependencies, btw. I think. It’s hard to keep track.)
- And in spite of all this complexity, there apparently isn’t a good way to pass config options to the setup script that isn’t the now-deprecated approach of running it directly as a script. Which was a huge part of the original complaint in the previous thread, after all. [1]
- On the other hand, there’s an example in PEP 517 showing that a minimally compliant - albeit user-hostile - backend is not many lines of code at all. Embellishing that for tasks like generating the metadata shouldn’t be too difficult, as all the necessary code already exists in various places.
- Without something like PEP 725, there’s no way to specify platform-specific requirements. The easiest way to describe these sorts of things is in code, by letting Python check environment variables (although it’d be nicer if it could get that information from `pyproject.toml`, naturally… ?) and things like `os.name` and `sys.platform`, and compute data like lists of C source files that represent separate extension modules, or paths for libraries that the code will link against, etc.
- But because of how Setuptools got to where it is now, the control flow is backwards and the `setup.py` script is dependent on Setuptools. It can’t just provide the necessary information to another backend. But actually, that would be backwards anyway: it would require standardization of the format for that information, which is at least as hard as solving PEP 725 and getting everyone’s buy-in.
  - As long as the “configuration” has to be running arbitrary Python anyway, it would make more sense to do the “dependency injection” thing, and have the backend call an explicit entry point in the setup script and provide utilities to that entry point, representing the sorts of functionality that the base `build_ext` command currently provides. It’d offer a callback that could accept lists of library dirs etc., instead of having to make a subclass and communicate the information via `self.compiler` or whatever.
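To make the “dependency injection” idea a bit more concrete, here is a rough sketch of the shape it could take. None of this is an existing API: `BuildTools`, `add_extension`, `ExtensionRequest` and `configure` are names made up for illustration, and the "compile" step only records what was asked for.

```python
import sys
from dataclasses import dataclass, field


@dataclass
class ExtensionRequest:
    """What a setup script asks the backend to compile (hypothetical shape)."""
    name: str
    sources: list
    library_dirs: list = field(default_factory=list)


class BuildTools:
    """Utilities the backend would inject into the setup script (hypothetical)."""

    def __init__(self):
        self.requests = []

    def add_extension(self, name, sources, library_dirs=()):
        # A real backend would invoke a compiler here, or queue the work;
        # this sketch only records the request.
        self.requests.append(
            ExtensionRequest(name, list(sources), list(library_dirs))
        )


def configure(tools, platform=sys.platform):
    """Entry point that would live in the project's setup script."""
    sources = ["src/core.c"]
    # Platform checks stay plain Python, as in existing setup.py files.
    if platform.startswith("win"):
        sources.append("src/compat_win.c")
    else:
        sources.append("src/compat_posix.c")
    tools.add_extension("mypkg._core", sources, library_dirs=["/usr/local/lib"])


# The backend drives the script, not the other way around.
tools = BuildTools()
configure(tools, platform="linux")
```

The point of the shape is that the script never imports the backend: it only receives an object and calls methods on it, so a different backend could supply its own object with the same methods.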
I propose to create, as a proof of concept, a minimal PEP 517 backend that can be told to invoke Python scripts for customization purposes, which only considers itself responsible for setting up metadata and packing. Because I think I’m clever or something, I plan to call it `stptls`.
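For a sense of scale, here is a sketch of how small one of the two mandatory PEP 517 hooks can be. This is not the example from the PEP and not `stptls` itself: the project name and version are hard-coded placeholders (a real backend would read them from `pyproject.toml`), and `build_wheel` is left out entirely.

```python
import io
import os
import tarfile

# Hypothetical hard-coded identity; a real backend would read pyproject.toml.
NAME = "example_pkg"
VERSION = "0.1"


def _pkg_info():
    # Minimal core metadata: only the required fields.
    return f"Metadata-Version: 2.1\nName: {NAME}\nVersion: {VERSION}\n"


def build_sdist(sdist_directory, config_settings=None):
    """PEP 517 hook: write {name}-{version}.tar.gz and return its basename.

    Frontends call this with the source tree as the current directory.
    """
    base = f"{NAME}-{VERSION}"
    filename = f"{base}.tar.gz"
    with tarfile.open(os.path.join(sdist_directory, filename), "w:gz") as tar:
        # The sdist holds a single top-level directory containing at least
        # pyproject.toml and PKG-INFO.
        tar.add("pyproject.toml", arcname=f"{base}/pyproject.toml")
        data = _pkg_info().encode("utf-8")
        info = tarfile.TarInfo(f"{base}/PKG-INFO")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return filename
```

This packs no actual source files, so it is only useful as a baseline; the manifest logic is what would fill in the rest.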
Of course, anyone who wanted to use this would have to rewrite their `setup.py` to a new interface, but it should be able to keep most of the core logic intact, just refactored. And of course it breaks command-line `setup.py install` etc. invocations but the entire point is to deprecate those anyway. I don’t really expect major projects to use this, but it’d be nice to prove that it’s usable. I’d even be willing to try and port and refactor some large existing `setup.py` files just to show what it could look like.
On the upside, such configurations could be simplified and made backend-independent, and also the new versions would (I think) be easier to migrate to a PEP 725-like standard if and when we get something that works there.
An outline of the design:
- In `pyproject.toml`, `[tool.stptls.hooks]` (or something like that) specifies dotted names of configuration scripts. When the name starts with a `.`, the backend will try to do a relative import of a hook that it provides. Otherwise, it will try to do an absolute import of a setup script provided in the repository.
  - Every such script is expected to provide a specific named function, which will be called with a config object based on `pyproject.toml` contents plus perhaps some useful callbacks.
- To build an sdist, it creates a temporary folder, runs the `manifest` hook, creates metadata, creates the `.tar.gz` from the entire temporary folder, and puts that in `dist/`.
  - The `manifest` hook is responsible for putting the necessary files into the temporary folder. There’s a builtin hook that just copies everything, and another that attempts to use a git checkout.
- To build a wheel, it creates a temporary folder, runs the `manifest` hook [2], runs the `compile` hook, creates metadata, creates the `.whl` from the package root, and puts that in `dist/`.
  - The `compile` hook is provided with callbacks to invoke compilers and such, perhaps something useful for setting up build isolation, I don’t know yet. It would be responsible for doing the same calculations that current `setup.py` scripts do now for simply figuring out what needs to be compiled and where everything is located; then instead of delegating to `build_ext.run` or whatever, it would explicitly invoke the callbacks that were given to it.
  - The default, built-in `compile` hook does nothing.
  - I’ll figure out something for package data later.
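The hook lookup from the first item of the outline could be sketched roughly like this. The package path `stptls.hooks` and the entry point name `run` are assumptions (the post only says "a specific named function"), and since `stptls` does not exist yet, the demonstration at the end uses the standard library's `json` package as a stand-in.

```python
import importlib

BUILTIN_PACKAGE = "stptls.hooks"  # hypothetical home of the built-in hooks
ENTRY_POINT = "run"               # hypothetical function every hook module defines


def load_hook(dotted_name, builtin_package=BUILTIN_PACKAGE, entry_point=ENTRY_POINT):
    """Resolve a dotted hook name to the callable the backend should invoke."""
    if dotted_name.startswith("."):
        # Leading dot: relative import of a hook shipped with the backend.
        module = importlib.import_module(dotted_name, package=builtin_package)
    else:
        # Otherwise: absolute import of a script provided by the repository.
        module = importlib.import_module(dotted_name)
    return getattr(module, entry_point)


# Stand-in demonstration with a package that is guaranteed to exist.
relative = load_hook(".decoder", builtin_package="json", entry_point="JSONDecoder")
absolute = load_hook("json", entry_point="loads")
```

For the absolute case to find a script in the repository, the backend would presumably need the project root on `sys.path` while the hook is loaded; that detail is left out here.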
This ought to take all the Distutils-command-line-invocation-related boilerplate out of the process, and optimize for the common case of pure Python distributions - you just skip the `compile` hook, and then whatever the `manifest` hook left behind is suitable for both sdists and wheels.
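As a sketch of the shared part of both builds, assuming hooks are plain callables: a "copy everything" `manifest` hook, a do-nothing default `compile` hook, and a staging step that runs them in order. The function names and signatures (`copy_everything`, `no_compile`, `stage`) are invented for illustration, and metadata generation and archiving are omitted.

```python
import shutil
import tempfile
from pathlib import Path


def copy_everything(source_root, staging_dir):
    """Built-in style manifest hook: copy the whole project tree."""
    shutil.copytree(source_root, staging_dir, dirs_exist_ok=True)


def no_compile(staging_dir):
    """Default compile hook: nothing to do for pure Python projects."""


def stage(source_root, manifest_hook=copy_everything, compile_hook=no_compile):
    """Create the temporary folder and run the hooks in order.

    An sdist build would archive the result right after the manifest hook;
    a wheel build would archive it after the compile hook has also run.
    """
    staging_dir = Path(tempfile.mkdtemp(prefix="stptls-"))
    manifest_hook(Path(source_root), staging_dir)
    compile_hook(staging_dir)
    return staging_dir
```

With the defaults, the staged folder is the same for an sdist and a wheel, which is the pure Python case described above.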
Thoughts?
[1] There isn’t really a way to do that at all, of course; Setuptools will fake it by hacking `sys.argv` before exec’ing the script. Which, of course, is part of why it can’t run the script via `import`. But when Setuptools runs the script for you, this “doesn’t count” as doing the ugly deprecated thing. So Setuptools needs even more code so that when the exec’d script imports the Setuptools machinery, it can detect whether that started from Setuptools exec’ing it.

[2] At this point, if it was asked to build an sdist as well, I suppose it could do that here instead of repeating work. Except I don’t think that PEP 517 really facilitates that workflow… ?