This is a variant of the selector packages idea and of the build backend approach that is sort of suggested here. There is a good summary of some of the problems this is trying to solve here so I won’t repeat that.
The basic problem is that we want a user to be able to do pip install foo and end up with different wheels installed depending on some property of their system that is not currently expressible in packaging metadata. The property in question might be the CUDA version or some property of their CPU or GPU etc. It is easy for the maintainer of the project being installed to write a small piece of code that checks this property, but much harder or impossible for the maintainers of tools like pip to maintain environment/system checks that would make this work for all projects.
There is also a tension between the desire for static metadata and static dependency resolution and the requirement for some level of dynamism. A frequent concern with previous attempts at solving this has been that the proposed mechanisms allow arbitrary code execution when installing or when resolving dependencies. Here I have attempted to come up with something that is as close to static as possible while still containing the unavoidable dynamic part.
I will describe this in terms of someone doing pip install python-flint just because I know that case well. I described here that it would be useful to be able to provide x86-64 wheels that are built for newer architecture versions like x86-64-v3 in order to be able to use things like AVX instructions. The particular case of x86_64 variants is potentially better handled through more traditional platform tags but let’s just ignore that for now…
Firstly we reduce the problem to two parts:
- Having variant wheels based on some property.
- Providing a way to select among variant wheels.
For the first part you could extend the platform tag in the wheel filename using e.g. + as a separator to have wheels like:
python_flint-0.6.0-cp312-cp312-win_amd64.whl
python_flint-0.6.0-cp312-cp312-win_amd64+x86_64_v3.whl
python_flint-0.6.0-cp312-cp312-win_amd64+x86_64_v4.whl
The extra parts of the platform tags are “extended platform tags” and are just arbitrary strings. PEP 517 build backends should possibly not output wheels with names like this by default, but tools like cibuildwheel might provide options to rename the built wheels like this, as they already do when renaming wheels to e.g. manylinux. The suggestion is that you should be able to make wheels with these names and upload them to PyPI or put them in a local wheelhouse directory.
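As a rough sketch of how a tool might handle such names, the extended part can be split off from the standard platform tag. The split_platform_tag helper below is hypothetical, not an existing packaging API:

def split_platform_tag(wheel_filename):
    # Split the platform tag of a wheel filename into its standard part
    # and the optional extended part after "+" (hypothetical scheme).
    stem = wheel_filename.removesuffix(".whl")
    platform_tag = stem.split("-")[-1]           # e.g. "win_amd64+x86_64_v3"
    base, _, extended = platform_tag.partition("+")
    return base, extended                        # extended is "" if absent

>>> split_platform_tag("python_flint-0.6.0-cp312-cp312-win_amd64+x86_64_v3.whl")
('win_amd64', 'x86_64_v3')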
The next step is selecting the right wheel. The problem here is that we usually put the metadata that tools like pip consume into the wheels themselves, but pip install python-flint does not yet know which wheel to look at. For that I suggest having an additional file that sits alongside the wheels so that the wheelhouse or PyPI index page looks like:
python-flint-0.6.0.tar.gz
python-flint-0.6.0-wheel-selector.toml
python_flint-0.6.0-cp312-cp312-win_amd64.whl
python_flint-0.6.0-cp312-cp312-win_amd64+x86_64_v3.whl
python_flint-0.6.0-cp312-cp312-win_amd64+x86_64_v4.whl
A tool such as pip should check for the *-wheel-selector.toml file before choosing a wheel. This file tells it how to decide which wheel to choose. The wheel selector file is a TOML file that is statically analysable by a dependency resolver. Its contents look like:
# python-flint-0.6.0-wheel-selector.toml

[wheel-selector]
variables = ["x86_64_version"]

[selector.x86_64_version]
requires = ["cpuversion >= 1.0"]
function = "cpuversion:get_x86_64_psABI_version"

[selector.x86_64_version.wheel_tags]
"x86-64" = [""]
"x86-64-v2" = [""]
"x86-64-v3" = ["x86_64_v3", ""]
"x86-64-v4" = ["x86_64_v4", "x86_64_v3", ""]
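Because this is plain TOML, a resolver or locker can read it without executing anything. A minimal sketch using the standard library (assuming the sub-table layout above and Python 3.11+ for tomllib):

import tomllib

with open("python-flint-0.6.0-wheel-selector.toml", "rb") as f:
    selector = tomllib.load(f)

for var in selector["wheel-selector"]["variables"]:
    spec = selector["selector"][var]
    # The requirement, the function to call, and every possible return
    # value with its allowed extended tags are all visible statically:
    print(spec["requires"])     # ['cpuversion >= 1.0']
    print(spec["function"])     # 'cpuversion:get_x86_64_psABI_version'
    for value, tags in spec["wheel_tags"].items():
        print(value, "->", tags)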
The cpuversion requirement is an installable package (e.g. from PyPI) that provides a get_x86_64_psABI_version function like:
>>> import cpuversion
>>> cpuversion.get_x86_64_psABI_version()
'x86-64-v3'
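Purely for illustration, on Linux such a function might be implemented by mapping /proc/cpuinfo flags onto the psABI micro-architecture levels. This is a hypothetical sketch of cpuversion, not a real package, and the flag sets shown are abbreviated:

# Hypothetical sketch; a real package would need cpuid-based code paths
# for Windows/macOS and the complete psABI flag lists.
V2 = {"cx16", "lahf_lm", "popcnt", "sse4_1", "sse4_2", "ssse3"}
V3 = V2 | {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "movbe", "xsave"}
V4 = V3 | {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}

def get_x86_64_psABI_version():
    flags = set()
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("flags"):
                flags = set(line.split(":", 1)[1].split())
                break
    # Check the highest level first and fall back to plain x86-64.
    for name, required in [("x86-64-v4", V4), ("x86-64-v3", V3), ("x86-64-v2", V2)]:
        if required <= flags:
            return name
    return "x86-64"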
When pip reads the wheel selector TOML file it should install cpuversion and call the indicated function. The acceptable return values are exactly the strings listed as keys in the wheel_tags table, where each key is a possible return value and its value is an ordered list of allowable extended platform tags. The empty string allows a wheel without any particular extended platform tag. The order of the list indicates preference, so usually the first item should be selected, but any of the items gives a valid install. If cpuversion.get_x86_64_psABI_version() returns 'x86-64-v3' then the allowable wheel files are
python_flint-0.6.0-cp312-cp312-win_amd64+x86_64_v3.whl
python_flint-0.6.0-cp312-cp312-win_amd64.whl
and the _v3 wheel is the preferred one.
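The installer-side selection logic can then be quite small. A sketch, where select_wheel is a hypothetical helper that elides error handling and reuses the filename splitting from earlier:

def select_wheel(value, wheel_tags, available_wheels):
    # Try each allowed extended tag in preference order and return the
    # first wheel whose extended platform tag matches ("" = plain wheel).
    for ext_tag in wheel_tags[value]:
        for filename in available_wheels:
            base, extended = split_platform_tag(filename)
            if extended == ext_tag:
                return filename
    return None                                  # fall back to the sdist

With value 'x86-64-v3' and the listing above this returns the +x86_64_v3 wheel first, and would fall back to the plain win_amd64 wheel if the variant were missing.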
From a project maintainer’s perspective it is ideal to be able to do this without needing to make separate packages, either for the different variants or for a separate selector package that then installs the main package. Adding a single file alongside the wheels is the nicest way to handle wheel selection from that side. In more complex situations I suppose that the variant wheels can still be used to pull in different dependencies, as is the idea for selector packages.
From a dependency resolver’s perspective I imagine that this fits neatly into a part of the process that any resolver already needs to handle. Basically, at some point there is a requirement for a project, a version is selected, and then there is a list of release artefacts from which one must be chosen. The requirement never says which artefact to choose, so the resolver somehow chooses a wheel/sdist from the given version based on some preference system. The suggestion here just alters how that choice is made. I might be massively underestimating the complexity of how this fits into the broader resolution though…
This does involve some arbitrary code execution as part of the install because we need to call a function from the installed cpuversion package. However, this is more limited than in some other proposals and there are some other advantages.
The arbitrary code execution comes from calling a function in the cpuversion module. In practice there would not be many such modules, so it could be feasible for someone to maintain a list of allowed modules that are vetted and that work for the dependencies that they want to install. Also, if particular packages like cpuversion, cudaversion etc. were established, then an installation tool could potentially vendor them to avoid executing arbitrary external code.
For a dependency locking tool it is impossible to know what the output of get_x86_64_psABI_version() might be on the target machine. However, the tool can see all the possible values that the function is allowed to return and can also see what the implications of those different values would be. A locking tool could exhaustively resolve all possible cases for the version and produce a lockfile that accommodates all of them, as sketched below. Alternatively a locking tool might recognise that the empty string "" matches all cases and could choose the wheel with no extra platform tags. A final option is that the locking tool could provide a way to specify which values to choose, like:
locktool -r requirements.txt --selector-tags x86_64_version=x86-64-v2
locktool -r requirements.txt --extra-wheel-tags x86_64_v3
Equally, installers like pip could use the same options for handling extra platform tags during dependency resolution.
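To make the exhaustive option concrete: since the wheel_tags keys are the only values the function may return, a locker could enumerate every case statically without ever running the selector function. A sketch reusing the hypothetical select_wheel and the spec parsed from the TOML earlier:

wheels = [
    "python_flint-0.6.0-cp312-cp312-win_amd64.whl",
    "python_flint-0.6.0-cp312-cp312-win_amd64+x86_64_v3.whl",
    "python_flint-0.6.0-cp312-cp312-win_amd64+x86_64_v4.whl",
]
for value in spec["wheel_tags"]:    # every value the function may return
    print(value, "->", select_wheel(value, spec["wheel_tags"], wheels))
# x86-64     -> ...win_amd64.whl
# x86-64-v2  -> ...win_amd64.whl
# x86-64-v3  -> ...win_amd64+x86_64_v3.whl
# x86-64-v4  -> ...win_amd64+x86_64_v4.whl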
Probably many of the details above are not right and need more careful thought, and I have also avoided going into the details of more complicated cases like multiple extended platform tags. My intention here is to present a dynamic resolution scenario that is as constrained as possible so that I can ask:
- Is this level of dynamism acceptable to those who want to make locking tools or want to minimise arbitrary code execution etc.?
- Is this sufficient to handle all cases like GPUs etc. where people want dynamic resolution?