Support for build-and-run-time dependencies

njs · June 13, 2019, 8:16am

When you say “via pep517”, do you mean specifically this python package?

I think for recent setuptools, if you put a requirement in setup_requires, and you’re using a PEP 517-aware frontend (like the pep517 package, or recent pip), then anything in setup_requires gets converted into get_requires_for_build_wheel. So… if I’m right, then either setup_requires should work for your use case, or else, if it doesn’t, then that might mean that you actually do need the expensive packages already installed before you can call hooks.get_requires_for_build_wheel().
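To make the hook in question concrete, here is a minimal sketch of a PEP 517 backend module exposing get_requires_for_build_wheel(). It is a hypothetical in-tree backend, not setuptools' actual build_meta code, and the SETUP_REQUIRES list is an assumed stand-in for whatever a project passes as setup_requires.

```python
# Hypothetical minimal backend module; NOT setuptools' real build_meta code.

# Assumed stand-in for the setup_requires list a project passes to setup().
SETUP_REQUIRES = ["cython>=0.29", "numpy"]


def get_requires_for_build_wheel(config_settings=None):
    """PEP 517 optional hook: return extra PEP 508 requirement strings.

    A frontend installs these, in addition to build-system.requires,
    before it calls build_wheel().
    """
    return list(SETUP_REQUIRES)


if __name__ == "__main__":
    print(get_requires_for_build_wheel())
```

The point of the sketch is only that the frontend has to be able to import and call the backend before these requirements are known, which is why anything the backend needs at import time cannot be declared this way.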

anon5950365 · June 13, 2019, 8:24am

only at build-time or also at run-time?

I also need it at runtime but there install_requires works so that’s not an issue

So… if I’m right, then either setup_requires should work for your use case

setup_requires is fundamentally incompatible with Cython because it uses easy_install, and easy_install’s sandbox breaks the Cython compiler. (It makes assumptions about module unloading and reloading that don’t hold true for native code modules like Cython, and that messes up global state.) Unless someone decides to undeprecate and actually fix it, it’s sadly not an option. edit: some more info on the setup_requires issue: Question: odd error, disappears on second run: AssertionError: PyTypeTest on non extension type · Issue #2730 · cython/cython · GitHub

edit2: unless that changed in recent setuptools of course. but hasn’t setup_requires been deprecated? what’s the alternative for the future? I’ve always been pointed back to build-system.requires when I asked about that

edit3: I actually made a setuptools ticket for this (Please consider undeprecating setup_requires · Issue #1742 · pypa/setuptools · GitHub), but closed it because build-system.requires was suggested as an alternative and at the time I didn’t know that it had this metadata analysis duration issue. So maybe I should just reopen that ticket.

pf_moore · June 13, 2019, 8:42am

That’s what I thought, too.

That specific statement is about setuptools. I don’t know what the setuptools project’s plans are for setup_requires, but the comment at the top of the build_meta code mentions it, so maybe they intend to (or already do) handle it in a more PEP 517 compatible way somehow.

Is setup_requires deprecated? I couldn’t see any mention of that in the setuptools documentation. As to fixing it, that’s something that would need to be raised on the setuptools tracker.

Overall, I think your issues here sound like they would be better explored on the setuptools tracker. If there’s something that needs a standards-level change, then that’s something that should be raised here, but I’d prefer it if the setuptools maintainers were able to confirm that they weren’t able to handle your requirements within the build tool before we escalated the discussion to standards that all build tools would be required to add support for.

anon5950365 · June 13, 2019, 8:54am

Is setup_requires deprecated?

So far, that appears to be the plan, in favor of build-system.requires: setup.py install doesn't work for packages with setup_requires on older macOS · Issue #1320 · pypa/setuptools · GitHub. I have now brought this up here in the hope that this is reconsidered: Please consider undeprecating setup_requires · Issue #1742 · pypa/setuptools · GitHub

Overall, I think your issues here sound like they would be better explored on the setuptools tracker.

I agree, I kind of forgot setup_requires is even an option due to its deprecation. But it seems it might make the most sense to figure out on the setuptools issue tracker whether it maybe shouldn’t be deprecated, or whether there’s some other alternative I’m not aware of. Will let you know how things turn out.

edit: corrected wrong link

pradyunsg · June 13, 2019, 9:49am

Ok then, let’s move any further discussion on @anon5950365’s issue to https://github.com/pypa/setuptools/issues/1742.

FRidh · October 11, 2019, 9:48am

I did not go through the whole thread but I understand the point that is made.

In Nixpkgs Python recipes we have three types of dependencies:

Native build inputs. These are e.g. the build system and the interpreter that runs the build. Corresponds to build-system.requires or setup_requires.
Build inputs. These are typically libs we link against.
Runtime inputs. These are the install_requires, and these do not need to be present when building a wheel, only when installing it.

This distinction is especially important in the case of cross-compilation.

pradyunsg · October 13, 2019, 9:25am

@FRidh where can I find documentation on how nix handles these dependencies?

FRidh · October 13, 2019, 11:10am

The Nixpkgs manual: https://nixos.org/nixpkgs/manual/ See 10.15.2.2.1.1. buildPythonPackage parameters, the final part before the next section. Note that it builds on the general section on dependencies (for which the explanation is way too complicated).

uranusjr · January 2, 2021, 10:16pm

Waking up this old thread. I spent quite some time recently considering different aspects and possible solutions to this, and the more I think, the more I feel it is best to have this “build-and-run-time-dependency” separate to build-system.requires.

There are two different kinds of dependencies during build-time. One is the tool used to build a project, e.g. setuptools and flit-core, which needs to be installed to create a wheel from an sdist, but is ultimately not related to the wheel created. The other is projects that affect how the project at hand works at run-time, like how Numpy works for Scipy and Pandas. For the first variant, nobody really cares about whether the run-time version matches build-time, and in some situations we really don’t want them to match, so project X can use build tool ~=2.0 to build, while Y uses ~=3.0. But for the second variant, the version really must somehow match (for ABI compatibility), and preferably the front-end should know about this to leverage caching so Numpy is not built multiple times. The two are fundamentally different, so we should treat them differently, instead of trying to make build-system.requires fit both usages.

I feel “build-and-run-time dependency” is not a good name for this idea, so I’ll call this new dependency specification “build dependencies” from now on, and call what build-system.requires was designed for something like “(build) backend dependencies” instead. This new kind of build dependencies should be specified separately from backend dependencies. When a frontend populates the backend’s environment, it should (using pip’s terminology) use the currently known run-time dependency information as constraints, to determine what versions of build dependencies to install. After building the project at hand, it should in turn add the chosen build dependency versions as run-time dependency constraints, and detect conflicts between them.
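The two-way flow described above could look roughly like the following toy model. Everything in it is an assumption for illustration: versions are plain tuples, constraints are half-open (low, high) ranges, and AVAILABLE is a made-up index. A real frontend would use PEP 440 specifiers and a proper resolver.

```python
# Toy model only: versions are tuples, constraints are half-open ranges.
# AVAILABLE is a made-up package index, not real release data.
AVAILABLE = {"numpy": [(1, 17), (1, 18), (1, 19), (1, 20)]}


def pick_build_version(name, build_range, runtime_range=None):
    """Pick the newest available version inside both ranges, or None."""
    low, high = build_range
    if runtime_range is not None:
        # Known run-time information acts as a constraint on the build.
        low = max(low, runtime_range[0])
        high = min(high, runtime_range[1])
    candidates = [v for v in AVAILABLE[name] if low <= v < high]
    return max(candidates) if candidates else None


def runtime_constraint_from_build(version):
    """After the build, constrain run-time to the version built against."""
    # Assumption: same minor series is required; real ABI policies differ.
    major, minor = version
    return (version, (major, minor + 1))


# Build accepts numpy >=1.17,<2.0 while run-time already requires >=1.18,<1.20.
chosen = pick_build_version("numpy", ((1, 17), (2, 0)), ((1, 18), (1, 20)))
print(chosen)
print(runtime_constraint_from_build(chosen))

# A conflict shows up as "no candidate" and would be reported to the user.
print(pick_build_version("numpy", ((1, 17), (1, 19)), ((1, 20), (1, 21))))
```

In this model the frontend fails with no candidate instead of backtracking, which matches the expectation that conflicts would be left for the user to resolve manually.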

Some other unorganised thoughts I have on this:

Since this new build-time dependency thing is not a part of the build system, the key should not live under [build-system].
Maybe it’s a good idea to put this in [project] (PEP 621)? Something like project.build-dependencies.
This can create difficult problems for dependency resolution if two packages both depend on (say) Numpy. Dependency metadata is only available after build dependencies are installed (at least for now), but build dependencies can only be determined when the front-end knows about run-time dependencies. There are probably some clever ways to make this work, but I expect front-ends to break this circular dependency in the easiest way, and fail when a build dependency pin causes run-time dependency conflicts, requiring the user to resolve them manually.
Whether build-dependencies automatically implies dependencies doesn’t really matter. It’s probably easier to require project authors to list them twice since that avoids possible edge cases when some don’t want a build dependency to be installed at run-time. I have no idea how this may be useful, but people more skilled in (ab-)using Python packaging stuff than I am will probably want it eventually.

uranusjr · January 2, 2021, 10:36pm

For completeness, there is a way to still use one field for both kinds of dependencies. The trick is that the frontend must communicate run-time dependency information as an argument to get_requires_for_build_wheel(), so the back-end can return build dependencies. But the problem is what the frontend should provide. The frontend does not really know what the backend needs to know to generate build dependencies, so it will need to pass all run-time dependency constraints (probably as PEP 508 requirement strings).

This can definitely work, but it also exposes dependency resolution internals handled by the frontend. The backend will also be burdened with making sense of requirements, which is otherwise done in the front-end. But it is actually cleaner in some sense: get_requires_for_build_wheel() would handle this problem in the backend, while the proposed build-dependencies specification does it in the frontend. But since full run-time dependency information is in the front-end, it would be difficult to design what the frontend needs to pass into get_requires_for_build_wheel() so the backend can know enough to implement this.

Edit: Another possibility that just crossed my mind is we could make the frontend handle get_requires_for_build_wheel()'s return value more smartly. Instead of interpreting these requirements as-is, the frontend can combine run-time dependency information and choose versions that match both run-time dependency information and get_requires_for_build_wheel(). This is actually cleaner: build-system.requires declares backend dependencies, get_requires_for_build_wheel() declares build dependencies, and all dependency resolution logic lives in the frontend. But are we allowed to interpret get_requires_for_build_wheel() this way?

brettcannon · January 4, 2021, 11:56pm

Is the reason you want this standardized that the end user is going to specify the “build dependency”, and that will then indirectly influence the build of some other dependency that gets pulled in? My assumption is “yes” since otherwise it’s a back-end concern and thus not something we necessarily have to be involved in.

uranusjr · January 5, 2021, 5:04am

Not necessarily the build, but yes. The main issue is that the back-end needs a way to communicate that it needs different front-end behaviour between

[build-system]
requires = ["foo", "bar"]

and

[build-system]
requires = ["foo"]

[project]
build-dependencies = ["bar"]

kpfleming · January 5, 2021, 12:24pm

This seems like a very good way to describe the problem, and in other language ecosystems they are referred to as tools and libraries. The built package has no run-time dependencies on the versions of the tools used to build it, but it can have run-time dependencies on the versions of the libraries used to build it. CPython itself has this distinction: the interpreter binary’s behavior and dependencies are not affected by the version of Autoconf which was used to build it, but are heavily dependent on the version of the C standard library which was used to build it.

It seems reasonable to me that if there is going to be a syntax for expressing these distinct dependencies, a very clear vocabulary should be chosen for that purpose. Putting both types of dependencies into fields with very similar names will likely lead to user confusion.

FRidh · January 7, 2021, 5:58pm

Non-Python run-time dependencies are so far not handled at all. Here one also needs to consider several cases:

1) libraries one links against during build-time, which means the headers need to be available during build-time
2) libraries that are directly opened by Python code (e.g. using ctypes)
3) executables

How could these be handled?

1) is handled by the build system. Because there is also a run-time dependency, 2) will also apply.
2) should probably be handled in pyproject.toml, just as Python run-time dependencies could go there.
3) same as 2)

I would like 2) and 3) to be declared separately. In Nixpkgs we could potentially use these lists to automatically patch the code to hardcode the dependencies.

uranusjr · January 7, 2021, 11:16pm

From Python’s perspective, anything that does not expose a Python interface is considered a data file, so I expect 2) and 3) are going to be fundamentally treated the same.

Also, the build system interface only deals with having the front-end make the correct versions of build-time libraries/tools available to the back-end; how the things made available are interpreted is entirely internal to the back-end or library, and should be discussed in a separate topic IMO.

uranusjr · January 7, 2021, 11:27pm

Back to the original issue, what should be a good way to move this forward? I personally quite like my last solution:

build-system.requires maintains its current behaviour.
The requirements returned by get_requires_for_build_wheel should be resolved with run-time requirements.

This does not add any new interfaces, nor change the installer’s behaviour for most people. Only those already using get_requires_for_build_wheel with incompatible build- and run-time requirements will be affected, which I would guess is very few, if any at all. People using this via setuptools’s setup_requires are likely not affected: any conflicts would already have been a problem before build isolation was a thing, since setuptools installs those into the run-time environment in legacy mode anyway.

Does this require an update to PEP 517, or can this be implemented directly in the installer (pip)?

pf_moore · January 8, 2021, 8:23am

Without making any comment on whether this is a good idea or not, I think it would need a clarifying explanation added to PEP 517. We’d need to document somewhere that installers should treat the two sources of build requirements differently, for the benefit of potential installers other than pip.

takluyver · January 8, 2021, 11:32am

I think we need some more detail on how the new kind of requirements should be handled.

At present, if a wheel is built from an sdist, installer tools can and do cache that wheel to use for subsequent installations. But this doesn’t necessarily work if the build also depends on the packages in the target environment where a package is to be installed. I.e. if I install h5py from source in an environment that already has numpy 1.19.5, and it is therefore built using that version, that build can’t be reused in an environment with an older numpy.
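One way to picture the caching concern: if the cache key for a locally built wheel also covered the versions it was built against, a build made with one numpy would not be reused with another. The sketch below is an assumption about how such a key could be formed, not how pip's wheel cache actually works; the function name and version numbers are illustrative.

```python
import hashlib


def wheel_cache_key(sdist_name, sdist_version, build_pins):
    """Build a cache key that also reflects the versions built against.

    build_pins maps a build dependency name to the exact version present
    at build time, e.g. {"numpy": "1.19.5"}.
    """
    parts = [f"{sdist_name}=={sdist_version}"]
    # Sort so the key does not depend on dict insertion order.
    parts += [f"{name}=={version}" for name, version in sorted(build_pins.items())]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


# Illustrative versions only.
key_new = wheel_cache_key("h5py", "3.1.0", {"numpy": "1.19.5"})
key_old = wheel_cache_key("h5py", "3.1.0", {"numpy": "1.17.0"})
print(key_new != key_old)
```

An exact-version key like this is the most conservative choice; it never reuses a build across numpy versions, even where the ABI would in fact be compatible.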

And we would also need some way to specify how to build packages for publication, with the broadest possible compatibility (e.g. build using the oldest practical version of numpy for each Python version). At present, we put this in the build dependencies, and use --no-build-isolation if we want to build it in a specific environment.

uranusjr · January 8, 2021, 2:52pm

This is already an issue for projects that publish wheels, since those wheels also need to somehow support multiple upstream versions and are also cached. A dependency offering a C API would have to provide some kind of ABI compatibility policy to make itself usable, and the dependants need to apply the policy when publishing wheels, so pip install pandas numpy would not break when numpy breaks ABI compatibility.

In practice, a project having such dependencies would likely need to compute the run-time dependency dynamically as well to match the build environment. This is one other aspect I like about the get_requires_for_build_wheel approach; it forces projects to implement a custom back-end and take care of the run-time dependency calculation.
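As a sketch of what such a dynamic calculation could look like inside a custom back-end: the function below derives a run-time requirement string from the version that was present at build time. The compatibility policy it encodes (newer releases within the same major version are ABI-compatible) is an assumption made for the example, not a statement about any real project's policy.

```python
def runtime_requirement(name, built_against):
    """Derive a run-time requirement from the version used at build time.

    Assumed policy (hypothetical): the dependency stays ABI-compatible for
    newer releases within the same major version.
    """
    major = int(built_against.split(".")[0])
    return f"{name}>={built_against},<{major + 1}"


# A back-end would write this into the wheel's metadata as Requires-Dist.
print(runtime_requirement("numpy", "1.19.5"))
```

A back-end following this approach would call something like it after the build environment is populated, so the published metadata always matches what the wheel was compiled against.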

FRidh · January 9, 2021, 10:54am

Sorry, you’re talking about Python dependencies during build-time of which some will also still be run-time dependencies. And because they may contain native code, the ABI needs to be considered. Somehow I started thinking about the non-Python deps. The same still applies, but indeed the extra parts 2) and 3) are irrelevant here for the discussion and should be discussed elsewhere.

Yes, in most ecosystems you indeed declare separately the parts you retain a dependency on during runtime. As mentioned by me before, in Nixpkgs we separate between the build-time only deps (“tools” / nativeBuildInputs) and run-time deps available during build-time (“libraries” / buildInputs). That means sometimes we need to list a dependency twice (both a “tool” and a “library”), but that’s for good reason: you need to for cross-compilation, because in that case the dependency is in fact not the same.

Now, I think separating the two here in pyproject.toml is a good idea, because it allows you to check that you do not retain any unwanted run-time dependencies (Nixpkgs point of view, it is the opposite in most other systems I think). The items in project.build-dependencies will then correspond to a single wheel in case of a native build, and two wheels in case of cross-compilation.