Handling optional C dependencies

My package has a pure Python version and a faster C++ version. I currently handle this in the following way:
setup.py

if packaging:
    # packaging environments must always build the C++ version
    build_c_version()
else:
    try:
        build_c_version()
    except Exception:
        # fall back to the pure Python implementation if the C++ build fails
        build_python_version()

However to build the C++ version I need some additional dependencies:
pyproject.toml

[build-system]
requires = [
    "cmake'",
    "ninja; platform_system!='Windows''",
    "oldest-supported-numpy",
]

This is problematic, since those might fail to install, which would cause the whole build to fail even though they are only needed for the C++ build. Is there already a way to mark these dependencies as optional (do not fail the build if they fail to install), or is there a workaround I can use to inject these dependencies only for the C++ build?

I don’t know what build system you are using, but the pyproject.toml-based build system has a mechanism to dynamically get build-time dependencies.

I think setuptools’ setup_requires maps to that when building with pyproject.toml.

If setup_requires does not work for you, you can also try to wrap the setuptools backend:

https://setuptools.pypa.io/en/latest/build_meta.html#dynamic-build-dependencies-and-other-build-meta-tweaks
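For reference, the pattern described on that page looks roughly like the following; this is a minimal sketch, assuming the extra requirements are the ones from your pyproject.toml above:
backend.py

# In-tree wrapper around the setuptools backend (sketch based on the linked docs).
# pyproject.toml would then point at it with:
#   [build-system]
#   requires = ["setuptools"]
#   build-backend = "backend"
#   backend-path = ["."]
from setuptools import build_meta as _orig
from setuptools.build_meta import *  # re-export the standard PEP 517 hooks

def get_requires_for_build_wheel(config_settings=None):
    # add the C++ build dependencies on top of what setuptools itself needs
    return _orig.get_requires_for_build_wheel(config_settings) + [
        "cmake",
        "ninja; platform_system!='Windows'",
    ]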

I don’t know what build system you are using

The C++ version is built using scikit-build and the pure Python version using setuptools.

you can also try to wrap the setuptools backend

How would this work? As far as I can see it allows me to override get_requires_for_build_wheel and return a list of dependencies to install. So the build would look like this:

  1. get_requires_for_build_wheel
  2. install the returned requirements
  3. build_wheel
    3.1) try to build the C++ version
    3.2) fall back to the Python version

However, I do not understand how this allows me to make dependencies optional (try to install them, but ignore it if the installation fails).

Ok, sorry. I see now.

If you are looking for something deterministic (i.e. you know beforehand whether you want to do the C++ build or not), you can use conditional logic in get_requires_for_build_wheel (e.g. by detecting an environment variable, or by checking config_settings). But if you don’t know it beforehand and want to decide dynamically when the dependencies fail to install, I think the unfortunate answer is “the packaging ecosystem around PEP 517 does not support this kind of use case right now”, but maybe @pradyunsg has a different answer.
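For the deterministic case, such a wrapper could look roughly like this; a sketch in which the environment variable RAPIDFUZZ_BUILD_EXTENSION and the build-extension config_settings key are made-up names:
backend.py

# Only request the C++ build dependencies when explicitly asked for.
import os
from setuptools import build_meta as _orig
from setuptools.build_meta import *  # re-export the standard PEP 517 hooks

def _want_cpp(config_settings):
    # hypothetical opt-in switches: an environment variable or a config_settings key
    if os.environ.get("RAPIDFUZZ_BUILD_EXTENSION") == "1":
        return True
    return bool((config_settings or {}).get("build-extension"))

def get_requires_for_build_wheel(config_settings=None):
    deps = list(_orig.get_requires_for_build_wheel(config_settings))
    if _want_cpp(config_settings):
        deps += ["scikit-build", "cmake", "ninja; platform_system!='Windows'"]
    return deps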

setuptools will not supervise the installation of the build dependencies; that is up to the frontend (e.g. build or pip)…

Have you considered splitting the package in 2, one for the C++ and one for the pure Python?

If you are looking for something deterministic (i.e. you know beforehand whether you want to do the C++ build or not), you can use conditional logic in get_requires_for_build_wheel

Right now I provide deterministic behavior for packaging environments, which should always build the C++ version, since building the pure Python version there is pointless. In addition, users on unsupported platforms can enforce this behavior by setting an environment variable.
For users building from source, I generally do not know for sure that all dependencies will compile, since those dependencies do not provide wheels for these platforms either.

Have you considered splitting the package in 2, one for the C++ and one for the pure Python?

Yes this might be the cleanest solution:

  • package
  • package-cpp
  • package-py

where package provides the extras python and cpp. When no extras are given, it installs package-cpp if a wheel is available and otherwise package-py. This way most users get the fast implementation, and users on platforms without wheels still have a way to enforce the use of the C++ version.
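As a rough sketch of the wrapper’s metadata (package names are placeholders, and the “wheel if available, otherwise pure Python” default selection is not expressed here):
pyproject.toml

[project]
name = "package"
version = "1.0.0"

[project.optional-dependencies]
cpp = ["package-cpp"]
python = ["package-py"]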


Is there a way to split packages in this way and still keep them maintainable? I need to perform these changes in three of my projects, so keeping this maintainable is a major concern.

  1. I would like to avoid splitting into three repos. That makes it hard for contributors to find where to apply changes, and many changes would be required in both the C++ and Python versions at the same time. In addition, it makes the release process more complicated.
  2. Tests are included in the sdist, since many package managers want to run them after building the binary. However, I would like to avoid duplicating the tests, since they are mostly the same for both implementations.

As a reference I currently use the following project structure: GitHub - maxbachmann/RapidFuzz: Rapid fuzzy string matching in Python using various string metrics

You can have all of them live in the same repository, evolving together. Each will need to be in a subdirectory and have its own pyproject.toml within that.

Can you share the sources of your project, so that it’s easier for the folks here to understand what configuration you have?

Can you share the sources of your project, so that it’s easier for the folks here to understand what configuration you have?

You can have all of them live in the same repository, evolving together. Each will need to be in a subdirectory and have its own pyproject.toml within that.

I assume this would mean a structure like the following:

/rapidfuzz/*
/rapidfuzz-cpp/*
/rapidfuzz-py/*

I am, however, unsure how this would affect the tests. Currently I simply test both implementations in the same place: RapidFuzz/tests/distance/test_Levenshtein.py at main · maxbachmann/RapidFuzz · GitHub, which avoids duplicating all the tests. In addition, the projects need to include some top-level files like the license and readme.

Your build process could copy those files to the subdirectory before building the sdist (make sure to include them in MANIFEST.in). I would suggest storing this build process in a shell script (and using it in CI, tox etc).
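As an illustration, that copy-and-build step could be as small as the following; a Python sketch of the idea, where the file and directory names are assumptions based on the layout above:
build_all.py

# Copy the shared top-level files into each sub-project, then build it.
import shutil
import subprocess
import sys
from pathlib import Path

SHARED_FILES = ["README.md", "LICENSE"]
SUB_PROJECTS = ["rapidfuzz", "rapidfuzz-cpp", "rapidfuzz-py"]

for project in SUB_PROJECTS:
    project_dir = Path(project)
    for name in SHARED_FILES:
        # the copies must also be listed in each MANIFEST.in to end up in the sdist
        shutil.copy(name, project_dir / name)
    # build the sdist and wheel with the "build" frontend
    subprocess.run([sys.executable, "-m", "build", str(project_dir)], check=True)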

As for running those tests against the desired package, I would simply run them against the installed version and have a tox configuration to install each one individually.

FWIW, this doesn’t need to be three packages. You can have just two: projectname and projectname-speedups. projectname can hold the Python implementation, with preferential use of projectname-speedups, which holds the C++ implementation.
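The import-time preference in projectname could then follow the usual try/except pattern; a sketch with made-up module names:

# projectname/__init__.py
try:
    # hypothetical module provided by projectname-speedups (the C++ implementation)
    from projectname_speedups import levenshtein
except ImportError:
    # pure Python fallback shipped with projectname itself
    from projectname._py_fallback import levenshtein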

What is the simplest way to detect whether a platform supports the wheels that can be uploaded to PyPI? E.g. for *-cp310-cp310-musllinux_1_1_x86_64.whl I can test that the platform is x86_64 and that the Python implementation is CPython 3.10. However, I am unsure how to check for musllinux. I tried to use platform.libc_ver(), but this appears to work only for glibc:

$ podman run -ti python:3.10-buster
Python 3.10.5 (main, Jul 12 2022, 11:43:42) [GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import platform
>>> platform.libc_ver()
('glibc', '2.28')
>>> quit()
$ podman run -ti python:3.10-alpine
Python 3.10.5 (main, Jun  7 2022, 19:23:05) [GCC 11.2.1 20220219] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import platform
>>> platform.libc_ver()
('', '')
>>> quit()
$ podman run -ti python:3.9-alpine
Python 3.9.13 (main, May 25 2022, 21:34:36) 
[GCC 11.2.1 20220219] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import platform
>>> platform.libc_ver()
('', '')
>>> quit()
$ podman run -ti python:3.8-alpine
Python 3.8.13 (default, May 25 2022, 21:40:28) 
[GCC 11.2.1 20220219] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import platform
>>> platform.libc_ver()
('', '')
>>> quit()

Edit: packaging.tags.sys_tags seems to do the job
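For example, something along these lines checks whether one of the published wheel tags (using the musllinux tag from above) is accepted by the current interpreter and platform:

from packaging.tags import Tag, sys_tags

published = Tag("cp310", "cp310", "musllinux_1_1_x86_64")
supported = set(sys_tags())  # all tags the running environment accepts
print(published in supported)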