User experience with porting off setup.py


build has a dependency on packaging and pyproject_hooks. If they are installed normally, that is problematic because other tools also depend on them, and you can no longer get a consistent environment when the packaging or pyproject_hooks version wanted by another tool is incompatible with the one wanted by build. Effectively, you would be adding dependency constraints on all Python environments. Vendoring is not good either, because build also has an importable API: you would easily get two copies of the same package (packaging and build._vendored.packaging) imported into the same Python process.

I am sympathetic to the desire for a unified toolchain, but I don’t think the stdlib is the best way to go about it.

Also note that it has been the case for a long time that you need an extra tool, twine, to upload packages to PyPI. This is nothing new under the sun.

This is a legitimate concern.

So? What’s wrong with that? I know of a handful of projects that vendor their dependencies and then import them into package-local namespaces so they don’t conflict with modules from site-packages. Pip is in that list. I admit it is a crude and somewhat less efficient solution. But it works.
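
For illustration, this is what the pattern looks like in pip’s case (importing pip’s internals like this is not a supported API; it is shown only to demonstrate the namespacing):

```python
# pip bundles its dependencies under the pip._vendor package, so its internal
# imports reference the bundled copy rather than site-packages:
from pip._vendor.packaging.version import Version as VendoredVersion

# A separately installed packaging (if one is present) is a distinct module,
# so both copies can coexist in the same process without clashing.
from packaging.version import Version as InstalledVersion

print(VendoredVersion("23.1"), InstalledVersion("23.1"))
```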

For the rare project that wants to import build as an API and also needs to use packaging and/or pyproject_hooks, it seems to me that build could gain an API that forces it to import the global version of dependencies to avoid the multiple version problem. Again, a pattern I’ve seen in the wild.
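
As a rough sketch with entirely made-up names (build offers nothing like this today), the “prefer the environment’s copy, fall back to the bundled one” import sometimes called debundling looks like this:

```python
# Hypothetical debundling shim inside a tool that vendors its dependencies:
# use the packaging already installed in the environment if there is one,
# and only fall back to the bundled copy otherwise.
try:
    from packaging import version            # globally installed packaging
except ImportError:
    from mytool._vendor.packaging import version  # hypothetical bundled copy

print(version.Version("1.0") < version.Version("2.0"))
```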

I also believe this is sub-optimal and somewhat user-hostile. But the set of people who want to upload packages is smaller than the set who want to build them, which in turn is smaller than the set who want to install packages. So it isn’t the highest priority to address.

I will note that once there exists a unified packaging tool that does everything under the sun, this problem becomes moot: a tool to install packages has to be shipped out of the box anyway, and if that tool also does building and uploading, you get those features for free without having to debate whether to include another tool.

6 Likes

Pip does not have an importable API, though.

You also have to consider that packaging standards are quite in flux, and if everyone gets build preinstalled (or if it even becomes part of the stdlib), then rolling out changes will be much harder. Pip has “please upgrade” automatic notices to help with that, but as an installation tool, it is privileged.

1 Like

I pinned versions of build dependencies in my pyproject.toml, and the OpenIndiana package maintainer filed an issue saying that this breaks downstream packaging when my pinned version differs from whatever OpenIndiana ships.

This is exactly the kind of [undocumented] negative downstream packaging effect I was worried about when adopting pyproject.toml :confused:

So I guess as a package maintainer I have to choose between determinism and the convenience of downstream packagers.

If I ignore pyproject.toml and just invoke python setup.py build from a [deterministic] virtualenv [using a requirements.txt with all versions pinned], I can have both.

Maybe the problem here is downstream distro packagers aren’t yet using modern packaging workflows. But something tells me they won’t like the new workflows for a myriad of reasons (including the non-determinism and the fact that the build frontend really likes to download things from the Internet).

3 Likes

You might want to read this discussion: PEP 665: Specifying Installation Requirements for Python Projects. PEP 665, which proposed a lock file format (with hashes for security), was rejected due to lack of sdist support.

Yes, this is a key problem with pyproject.toml. It is actually not generic: the constraints you write in it are specific to how you build wheels for PyPI; they do not apply to other distros. This is not really documented anywhere and is widely misunderstood by distro packagers.

This is not a pyproject.toml (metadata) issue; it is a consequence of build isolation. You can still have both here: just set up your virtualenv the way you did before and turn off build isolation with, e.g., python -m build -wnx (build a wheel, with no isolation, skipping the build dependency check).

4 Likes

This is a significant problem, because there is no other way to communicate this information to distros. I would have thought that the dependency specification in pyproject.toml should represent compatible ranges, such that if everything is built from source, the build can be expected to succeed.

Separately, of course, there is a need to pin specific versions of dependencies when generating the specific wheels that go to PyPI. Ideally that would be specified in some other way, though. Maybe that means you need a separate pyproject.toml file for e.g. cibuildwheel, or is it possible to configure these in a [tool.cibuildwheel] section somehow?

If I understand correctly, numpy uses version pinning because of ABI compatibility between PyPI wheels, but there should really be a different way to say "this wheel requires this exact other binary wheel".

Not well. Pip’s vendoring causes problems for Linux distributors who don’t like vendoring (for legitimate reasons), and debundling causes problems for pip.

It’s the best we can do, not a good approach to recommend.

If a lack of bootstrapping is a problem, would it be feasible to move build (and its dependencies) into the stdlib on the same terms as importlib.metadata and importlib.resources? I don’t know how painful these are from a maintainer’s perspective, but from a downstream developer’s perspective it’s very nice to be able to say “I need XYZ feature in importlib.metadata, so I will add the dependency importlib_metadata >= X; python_version < Y, and will drop it when python_requires >= Y”.
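
In code, that usually ends up as the familiar conditional import (the 3.10 cut-off here is just an arbitrary stand-in for the Y above):

```python
import sys

# Use the stdlib module on new enough Pythons, and the PyPI backport
# (importlib_metadata) on older ones; 3.10 is only an example cut-off.
if sys.version_info >= (3, 10):
    from importlib import metadata
else:
    import importlib_metadata as metadata

print(metadata.version("pip"))
```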

I get that it would be a sort of undoing of the distutils purge, but build and packaging seem at least as important as importlib.*, and that approach seems to work to balance the flexibility/stability tradeoff of stdlib vs PyPI.

2 Likes

Not at all. Everything that this tool would have to do is specified by PEPs, which gives us stability that distutils never had (specifically, distutils had to invoke platform-specific compilers, which are not covered by our own PEPs).

The challenge is persuading the maintainers of these libraries to give up their current flexibility. We’re somewhat failing that with importlib.resources at the moment, leading to more drastic API changes over time than some of us are comfortable with. But I think with a clearly specified API and willing maintainers, there’s nothing actually stopping this from happening.


Also, we’re onto a very new topic here. Whoever continues this path, please start a new thread.

4 Likes

I would not recommend pinning build dependencies in pyproject.toml, for the same reasons that project.dependencies or setup.py install_requires should not be pinned or bounded more tightly than strictly necessary.

The difference when using pyproject.toml is that it enables build isolation by default. But you can disable that with pip’s --no-build-isolation and build’s --no-isolation options, so as to use the same deterministic build environment you had when invoking setup.py directly.

So you can have pyproject.toml and determinism.
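
As a sketch, assuming I am reading build’s importable API correctly (ProjectBuilder runs the backend hooks in the current interpreter’s environment rather than in an isolated one), a pinned virtualenv plus something like this gives a deterministic build; the requirements file name is just an example:

```python
# Run inside a virtualenv where build and all (pinned) build requirements
# were installed, e.g. with: pip install build -r requirements-build.txt
from build import ProjectBuilder

builder = ProjectBuilder(".")              # directory containing pyproject.toml
wheel = builder.build("wheel", "dist/")    # no isolated build environment here
print(f"built {wheel}")
```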

1 Like

With cibuildwheel the environment is always isolated, but could you use something like CIBW_BEFORE_BUILD='pip install -r requirements-build-pinned.txt'?

How would you pass this over to someone who does pip install foo and ends up installing from the sdist?

Is it not expected that anyone with the sdist should be able to reproduce the same versions of build requirements that were used to generate the PyPI wheels?

1 Like

They can, but it’s not expected to be the default.

The default is that they get a successful build of their own. To get a build with matching dependencies they’re going to have to find another reference (such as a doc page) that explains how to do it.

The difference is that for the default, they no longer need to find that page. Pre-pyproject.toml, it was impossible to build any package without looking up its specific docs.

1 Like

This is a valid use case, but my point was about the difference between building with a standard build frontend and direct setup.py invocation. I wanted to highlight that both stand on the same footing with respect to build reproducibility.

It seems unfortunate that you have to choose between having pinned build dependencies and having build isolation. Ideally for deterministic builds you would want both together.

1 Like

Was it? I suppose that for the majority of projects, python setup.py bdist_wheel would have worked fine?

I’m not sure why, if you are pinning dependencies, you wouldn’t also want a clean environment for them…?

Well, yeah, probably. “Any” was a bit hyperbolic. If you had the right dependencies available and configured, and the project had hacked distutils/setuptools sufficiently to hide any extra steps behind bdist_wheel (and those hacks hadn’t broken), chances are it was going to work.

I recently went through about 30 commonly used packages (and none were the “tricky” ones) to set up rebuilds, and only a minority Just Worked. (Though figuring out how to run tests was much harder than how to build - at least there are only two real choices for building.)

1 Like

This isn’t including pure-Python projects, I assume?