Enforcing consistent metadata for packages

Example for numpy. Current situation:

  • the numpy sdist has zero runtime dependencies
  • all numpy wheels also have zero dependencies, and they vendor libopenblas.so|dll (and other things they need, like libgfortran.dll)

To improve binary sizes by unvendoring:

  • the sdist still has zero dependencies
  • all numpy wheels gain a dependency on scipy-openblas64
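
As a rough illustration of the unvendored state, here is a sketch that compares the declared dependencies of the two artifacts. The metadata snippets are made up and trimmed, the version number is a placeholder, and `requires_dist` is a helper name chosen for this example; only the `Requires-Dist` field and the standard-library `email` parser are real.

```python
from email import message_from_string

# Hypothetical, trimmed core-metadata documents (placeholder version).
SDIST_PKG_INFO = """\
Metadata-Version: 2.2
Name: numpy
Version: 0.0.0
"""

WHEEL_METADATA = """\
Metadata-Version: 2.2
Name: numpy
Version: 0.0.0
Requires-Dist: scipy-openblas64
"""


def requires_dist(metadata_text):
    """Return the sorted Requires-Dist entries of a core-metadata document."""
    msg = message_from_string(metadata_text)
    # get_all() returns None when the header is absent, hence the fallback.
    return sorted(msg.get_all("Requires-Dist") or [])


sdist_deps = requires_dist(SDIST_PKG_INFO)
wheel_deps = requires_dist(WHEEL_METADATA)
print("sdist:", sdist_deps)
print("wheel:", wheel_deps)
print("consistent:", sdist_deps == wheel_deps)
```

A check like this reports the two artifacts as inconsistent as soon as the wheel declares scipy-openblas64 and the sdist does not.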

That’s a mismatch between sdist and wheels (neither Metadata 2.2 nor your proposal here allows for this). I think your question is: why can’t the sdist gain the dependency too? There are multiple reasons why that is not possible:

  1. There is no sdist for openblas, so adding the dependency would break numpy installs on platforms not supported by wheels
  2. It wouldn’t make sense to create such an sdist for openblas, not only because it’s not a Python package but also because we build the wheel in very specific ways to make it work - no from-sdist build on a random end user’s machine is going to reproduce that.
  3. OpenBLAS is not a required dependency of NumPy, but only of the binary numpy wheels. We have users who want to build from source with MKL instead of OpenBLAS, and distro packagers all have different ways of building against some BLAS library too.

I think the solution here doesn’t have to be that radical. The problem here seems to be not the mixing of installing from source and from binaries, but only the assumption made by resolvers that sdist and wheels always have the same dependencies, which isn’t a great assumption to be making. If they instead assumed only that all wheels have the same dependencies, that would still not be perfect, but it would already be a lot better.
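
To make the difference between the two assumptions concrete, here is a minimal sketch. It is not how pip or any other resolver is implemented; `may_reuse_dependencies` and the filenames are hypothetical and only illustrate when dependency metadata read from one file of a release would be trusted for another file of the same release.

```python
def is_wheel(filename):
    """Treat anything ending in .whl as a wheel; everything else as an sdist."""
    return filename.endswith(".whl")


def may_reuse_dependencies(known_file, other_file, wheels_only=True):
    """Decide whether dependencies read from known_file apply to other_file.

    wheels_only=False models the assumption that every artifact of a release
    shares the same dependencies. wheels_only=True models the weaker
    assumption that only wheels agree with each other.
    """
    if wheels_only:
        return is_wheel(known_file) and is_wheel(other_file)
    return True


# Hypothetical filenames with a placeholder version.
linux_wheel = "numpy-0.0.0-cp312-cp312-manylinux_2_17_x86_64.whl"
windows_wheel = "numpy-0.0.0-cp312-cp312-win_amd64.whl"
sdist = "numpy-0.0.0.tar.gz"

print(may_reuse_dependencies(linux_wheel, windows_wheel))  # wheel to wheel
print(may_reuse_dependencies(linux_wheel, sdist))          # wheel to sdist
```

Under the weaker assumption the resolver would have to obtain the sdist's dependencies separately instead of inferring them from a wheel.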

Or, there could be some way for package authors to already declare the differences in the sdist. E.g., you may add to or overwrite metadata fields:

[project.pypi-binary-wheels]
dependencies = ['scipy-openblas64']

This would make sense conceptually, since binary wheels on PyPI are definitely not the same as just building the sdist from source (e.g., it’s not like pip runs auditwheel …).
