I’m not filing a bug here because I don’t know which project is responsible for this case. I’d like to discuss what the best solution is, so that I can propose it to the right team. I also have a workaround that seems to work.
OK, let me start with a trivial example before going into the motivating use case. Essentially, build isolation can install build dependencies that are too new for the environment the package is about to be installed into. I encountered this with pyworld in a Docker container.
$ docker run -it --rm nvcr.io/nvidia/pytorch:21.05-py3 sh -xc 'pip install pyworld==0.3.3 && python -m pyworld --version'
+ pip install pyworld==0.3.3
Collecting pyworld==0.3.3
Downloading pyworld-0.3.3.tar.gz (218 kB)
|████████████████████████████████| 218 kB 5.4 MB/s
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing wheel metadata ... done
Requirement already satisfied: numpy in /opt/conda/lib/python3.8/site-packages (from pyworld==0.3.3) (1.20.1)
Requirement already satisfied: cython>=0.24 in /opt/conda/lib/python3.8/site-packages (from pyworld==0.3.3) (0.28.4)
Building wheels for collected packages: pyworld
Building wheel for pyworld (PEP 517) ... done
Created wheel for pyworld: filename=pyworld-0.3.3-cp38-cp38-linux_x86_64.whl size=978625 sha256=b7ea7be12c881f278fd1e254c75024962b34955b3a62fde28d0d819f2406ce1e
Stored in directory: /root/.cache/pip/wheels/5f/6d/ee/6d4d4f8dfe7731ef094f74a7f52359d4b3fd2273d9ac9cf16a
Successfully built pyworld
Installing collected packages: pyworld
Successfully installed pyworld-0.3.3
+ python -m pyworld --version
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/runpy.py", line 185, in _run_module_as_main
mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
File "/opt/conda/lib/python3.8/runpy.py", line 144, in _get_module_details
return _get_module_details(pkg_main_name, error)
File "/opt/conda/lib/python3.8/runpy.py", line 111, in _get_module_details
__import__(pkg_name)
File "/opt/conda/lib/python3.8/site-packages/pyworld/__init__.py", line 7, in <module>
from .pyworld import *
File "pyworld/pyworld.pyx", line 1, in init pyworld.pyworld
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
Essentially, pip creates an isolated build environment, installs the newest numpy there to compile the extension, and then installs the resulting pyworld wheel into an environment with an older numpy, which leaves that environment binary-incompatible.
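To make the failure mode concrete, here is a minimal sketch of the check that would have flagged it. The function names and the simplified version parsing are my own illustration, not something pip or numpy provides. It assumes plain release versions like 1.20.1 and the rule of thumb that an extension compiled against a newer numpy than the one present at runtime is at risk.

```python
import re


def release_tuple(version: str) -> tuple:
    """Parse the leading numeric release part of a version string.

    Hypothetical helper: handles plain releases such as "1.20.1",
    ignores any suffix (rc, dev, local tags), and pads to three parts.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version: {version!r}")
    parts = [int(part) for part in match.group(0).split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def built_against_newer_numpy(build_version: str, runtime_version: str) -> bool:
    """Return True when the build-time numpy is newer than the runtime numpy."""
    return release_tuple(build_version) > release_tuple(runtime_version)


# The runtime numpy in the log above is 1.20.1. The build-time version is not
# shown in the log, so "1.21.0" is only a placeholder for "something newer".
print(built_against_newer_numpy("1.21.0", "1.20.1"))
```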
The simplest workaround is to use --no-build-isolation. This works for this trivial case.
The larger motivating case, and the reason for the question, is: what do I do when I have a larger requirements.txt that includes, say, both pyworld and numpy? The numpy I’m about to install may differ from the one I previously installed. So I end up in a situation where there are potentially 3 different environments:
(1) the environment I’m starting with (which may or may not already have numpy installed)
(2) the build environment with the newest build dependencies at the time of install
(3) the install environment
What I’m wondering is what the best practice is here. I can see a few solutions that all seem pretty imperfect. I know the best solution is likely to be “avoid build isolation”, but that seems very tedious and a bit opposed to PEP 517/518, which I think are an overall huge improvement to the Python packaging ecosystem. Let me outline what I see as possible solutions, and you can tell me which are already idiomatic and which solutions I’m missing.
Solution 1) Raise an error or warning
Should pyworld be producing a wheel that is only compatible with one version of numpy once it’s built? I personally don’t know how to author a setup.py that detects the currently installed numpy version and updates install_requires, but maybe that’s trivial.
I noticed that the source package can work with many versions of numpy, but once built it has much stricter constraints. Is there a model for dealing with the difference between broad source compatibility and narrow binary compatibility with dependencies?
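For what it’s worth, here is a rough sketch of what such a setup.py helper might look like. Everything here is hypothetical: the helper names are mine, and it assumes (my understanding, not something I have confirmed for every release) that an extension built against one numpy keeps working with newer numpy versions but not older ones, so the build-time version becomes a lower bound.

```python
def numpy_runtime_requirement(build_numpy_version: str) -> str:
    """Turn the numpy version seen at build time into a runtime lower bound."""
    return f"numpy>={build_numpy_version}"


def install_requires_for_build() -> list:
    """Compute install_requires inside setup.py (hypothetical helper)."""
    try:
        import numpy  # whichever numpy is present in the build environment
    except ImportError:
        # No numpy at build time: fall back to the loose requirement.
        return ["numpy"]
    return [numpy_runtime_requirement(numpy.__version__)]


# In setup.py this would be passed along as:
#   setup(..., install_requires=install_requires_for_build())
```

The catch is that the wheel’s metadata then differs from what the sdist advertises, and I don’t know how well installers cope with that.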
Solution 2) Avoid build isolation
This seems like the obvious solution, but it potentially leads to a longer process than before. In particular, pip install --no-build-isolation -r requirements.txt has the potential to break other packages in requirements.txt. That might lead to a process that’s now 3 steps long:
pip install numpy==some-version
pip install --no-build-isolation pyworld==some-version
pip install -r requirements.txt # I sure hope the versions didn't change
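If it helps the discussion, the three steps can be scripted. This is only a sketch: the function names are made up, and the pins in the usage line are the versions from the log above, used as stand-ins for “some-version”.

```python
import subprocess
import sys


def staged_install_commands(numpy_pin: str, pyworld_pin: str,
                            requirements: str = "requirements.txt") -> list:
    """Build the three pip invocations in the order they must run."""
    pip = [sys.executable, "-m", "pip", "install"]
    return [
        pip + [numpy_pin],                            # 1. pin numpy first
        pip + ["--no-build-isolation", pyworld_pin],  # 2. build against it
        pip + ["-r", requirements],                   # 3. everything else
    ]


def run_staged_install(commands: list) -> None:
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


commands = staged_install_commands("numpy==1.20.1", "pyworld==0.3.3")
for command in commands:
    print(" ".join(command[1:]))  # show the plan without running it
```

Note that with --no-build-isolation the other build dependencies (Cython, in pyworld’s case) also have to be present in the environment already, which happens to be true in the image above.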
Maybe there’s a solution to specify this in requirements.txt? Then there’s still a total ordering question, but at least we’re back to 2 steps instead of 3. Some day I hope for this to be 1 step.
Solution 3) Enforce constraints in build isolation
I don’t think pip has an API for this, but is there a way to have pyproject.toml specify that the build dependencies should be constrained by what’s about to be installed from requirements.txt? Or a way to direct the build frontend to use some additional constraints when building the wheel? It looks like PEP 517 has get_requires_for_build_wheel. I wonder whether pip should take the intersection of the requested packages with pyproject.toml’s build-system.requires dependencies.
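One thing that may partially cover this, as far as I can tell: pip reads its options from PIP_* environment variables, and since build dependencies are installed by a pip subprocess, setting PIP_CONSTRAINT might carry constraints into the isolated build environment. I have not verified this across pip versions, so treat it as an assumption. The sketch below (helper names are hypothetical, and it only understands simple name==version lines) extracts the relevant pins from requirements.txt and prepares the environment.

```python
import os
import re


def pinned_constraints(requirement_lines, names):
    """Keep only the simple 'name==version' pins for the given packages."""
    wanted = {name.lower() for name in names}
    pins = []
    for line in requirement_lines:
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        match = re.fullmatch(r"([A-Za-z0-9._-]+)\s*==\s*(\S+)", line)
        if match and match.group(1).lower() in wanted:
            pins.append(f"{match.group(1)}=={match.group(2)}")
    return pins


def build_env_with_constraints(constraints_path, base_env=None):
    """Copy the environment and point PIP_CONSTRAINT at a constraints file.

    Assumption: the pip subprocess that installs build dependencies
    honours this variable.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["PIP_CONSTRAINT"] = constraints_path
    return env


lines = ["numpy==1.20.1  # runtime pin", "pyworld==0.3.3"]
print(pinned_constraints(lines, ["numpy"]))
```

The pins would be written to a file, and the returned environment passed to the pip invocation (for example via subprocess.run(..., env=env)).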
Solution 4) Don’t use pip; use conda
I appreciate that this entire issue has to do with binary compatibility between dependencies. This really smells like a non-portable Python package, which seems to suggest that a tool that manages binary compatibility, like conda, is the way to go. However, that feels more like a workaround than a solution, and IMHO conda isn’t ready to be the only dependency manager.
(personal rant: At least I’ve never been able to create reproducible environments with conda, despite several tactics for creating lock files, and conda itself says you’ll probably eventually use pip anyways. It feels like it’s not ready for reproducible builds, whereas pip has made huge progress in recent years towards reliability and reproducibility and is inches away from handling every use case IMHO. Currently I use pip-compile + pip for reproducible environments and it’s mostly great)