Build isolation can lead to invalid install

I don’t have a particular bug here because I don’t know who’s responsible for this case. I’m interested in engaging with the discussion about what the best solution is here, so that I can propose solutions to the right team. I also have a workaround that seems to be working.

Ok, let me start with a trivial example before going into the motivating use case. Essentially build isolation can install build dependencies that are too new for the environment they’re about to get installed in. I encountered this with pyworld in a docker container.

$ docker run -it --rm nvcr.io/nvidia/pytorch:21.05-py3 sh -xc 'pip install pyworld==0.3.3 && python -m pyworld --version'
+ pip install pyworld==0.3.3
Collecting pyworld==0.3.3
  Downloading pyworld-0.3.3.tar.gz (218 kB)
     |████████████████████████████████| 218 kB 5.4 MB/s 
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Requirement already satisfied: numpy in /opt/conda/lib/python3.8/site-packages (from pyworld==0.3.3) (1.20.1)
Requirement already satisfied: cython>=0.24 in /opt/conda/lib/python3.8/site-packages (from pyworld==0.3.3) (0.28.4)
Building wheels for collected packages: pyworld
  Building wheel for pyworld (PEP 517) ... done
  Created wheel for pyworld: filename=pyworld-0.3.3-cp38-cp38-linux_x86_64.whl size=978625 sha256=b7ea7be12c881f278fd1e254c75024962b34955b3a62fde28d0d819f2406ce1e
  Stored in directory: /root/.cache/pip/wheels/5f/6d/ee/6d4d4f8dfe7731ef094f74a7f52359d4b3fd2273d9ac9cf16a
Successfully built pyworld
Installing collected packages: pyworld
Successfully installed pyworld-0.3.3
+ python -m pyworld --version
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/runpy.py", line 185, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/opt/conda/lib/python3.8/runpy.py", line 144, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/opt/conda/lib/python3.8/runpy.py", line 111, in _get_module_details
    __import__(pkg_name)
  File "/opt/conda/lib/python3.8/site-packages/pyworld/__init__.py", line 7, in <module>
    from .pyworld import *
  File "pyworld/pyworld.pyx", line 1, in init pyworld.pyworld
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

Essentially what happens is pip creates an isolated build environment, installs the newest numpy to compile extensions, then installs pyworld into an environment with an older version of numpy thus creating an incompatible environment.

The simplest workaround is to use --no-build-isolation. This works for this trivial case.

The larger motivating case, and the reason for the question is, what do I do when I have a larger requirements.txt that includes, say, pyworld and numpy both? The numpy I’m about to install may be different than the one I have previously installed. So I end up in a situation where there’s potentially 3 different environments.
(1) the environment I’m starting with (which may or may not already have numpy installed)
(2) the build environment with the newest build dependencies at the time of install
(3) the install environment

What I’m wondering is what is the best practice here. I can see a few solutions that all seem pretty imperfect. I know the best solution is likely to be “avoid build isolation”, but that seems very tedious and a bit opposed to PEP 517/518, which I think are an overall huge improvement to the python package ecosystem. Let me outline what I see as possible solutions here, and you tell me what’s already idiomatic, and you tell me what solutions I’m missing.

Solution 1) Error/warning out
Should pyworld be producing a wheel that is only compatible with one version of numpy once it’s built? I personally don’t know how to author a setup.py that detects the currently installed numpy version and updates install_requires, but maybe that’s trivial.

I noticed that the source package can work with many versions of numpy, but once built it has much more strict constraints. Is there a model for dealing with the difference between broad source compatibility, but narrow binary compatibility with dependencies?

Solution 2) Avoid build isolation
This seems like the obvious solution but it potentially leads to a longer process than it used to be. In particular pip install --no-build-isolation -r requirements.txt has the potential to break other packages in requirements.txt. That might lead to a process that’s now 3 steps long

pip install numpy==some-version
pip install --no-build-isolation pyworld==some-version
pip install -r requirements.txt  # I sure hope the versions didn't change

Maybe there’s a solution to specify this in requirements.txt? Then there’s still a total ordering question, but at least we’re back to 2 steps instead of 3. Some day I hope for this to be 1 step.

Solution 3) enforce constraints in build isolation
I don’t think pip has an API for this, but is there a way to have pyproject.toml specify that the build dependencies should be constrained by what’s about to be installed with requirements.txt. Or to have a way to direct the build frontend to use some additional constraints when building the wheel. It looks like PEP517 has get_requires_for_build_wheel. I wonder if it should be the case that pip take the intersection of requested packages with pyproject.toml’s build-system.requires dependencies.

Solution 4) don’t use pip; use conda
I appreciate that this entire issue has to do with binary compatibility between depdencies. This really smells like a non-portable python package, which seems to suggest a binary compatibility tool like conda is the way to go. However, that feels more like a workaround than a solution, and IMHO conda isn’t ready to be the only dependency manager.

(personal rant: At least I’ve never been able to create reproducible environments with conda, despite several tactics for creating lock files, and conda itself says you’ll probably eventually use pip anyways. It feels like it’s not ready for reproducible builds, whereas pip has made huge progress in recent years towards reliability and reproducibility and is inches away from handling every use case IMHO. Currently I use pip-compile + pip for reproducible environments and it’s mostly great)

I know you do not seem too keen on testing conda, but conda handles this case pretty well. For your reproducible build rant, have you looked at conda-lock, an official tool of the conda ecosystem that should help for that?

I don’t know what the NumPy API compatibility promises are, but it seems you may need to specify the NumPy requirements for both build-system and project.dependencies such that this skew doesn’t happen.

Nope.

It’s not clear where your numpy came from, but the “best practice while using build isolation” is to have a wheel cache or index that contains everything you install. That way, when you need to build something, the isolated environment will pull an identical wheel to what you have installed. Because your numpy isn’t the same as the one being pulled from PyPI for the build, you get an incompatibility.

If this isn’t viable (and it often isn’t), then using a non-isolated build to produce your wheels is best. Then you can install the wheels into your actual environment without needing to worry about isolation. It does require building the wheels in an identical environment, but this may well be easier than manipulating your existing environment to match whatever was used for the builds on PyPI.

Fundamentally, if your base environment (including all previously installed packages) does not match the one where the wheel was built, you may get these errors. The only true solution is to build all the packages yourself in the target environment (and use wheels to cache those builds for your own later use). It’s often good enough to get everything from PyPI, since it’s likely that builds for PyPI also used wheels from PyPI, so they’ll be consistent.

Anaconda and conda-forge’s primary advantage here is that they do build all the packages in a consistent environment, so as long as you get them all through their repository, the builds will be consistent. (Note that Anaconda’s packages may not be compatible with conda-forge, but the metadata is usually strong enough to sort that out.)

Isn’t the usual fix here to make oldest-supported-numpy the build dependency?

3 Likes

At least when it comes to NumPy version (as opposed to cross-building for another platform/environment), the standard, well-accepted and documented solution here is to simply correctly specify the minimum supported version of NumPy to build against in the build-system.requires.

@stefansjs see the appropriate section of the NumPy docs that describes more details on what (or the package author) can do here to prevent those sorts of problems—either manually pinning the lowest NumPy version you want to support, or doing so automatically with the oldest-supported-numpy metapackage.

EDIT: Per Steve Dower’s followup comment, sounds like this particular case might have more to do with the compilation environment than the NumPy version, as was originally speculated n the OP.

Not in this particular case, since the incompatibility is in the calculation of the native struct backing the array objects:

ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

This doesn’t (usually) relate to the version of numpy, but to the platform it was compiled on. So to get a compatible version, you need a consistent compilation environment (which more practically means you build it yourself and the later isolated build uses your own wheel, not the publicly released one).