Drawing a line to the scope of Python packaging

People discussed external dependencies during the packaging minisummit at PyCon North America in May 2019. Moving notes by @btskinn and @crwilcox here for easier searchability.

@msarahan championed this discussion:

Expressing dependencies on externally provided software (things that pip does not/should not provide), and the metadata encompassing binary compatibility that these expressions may require.

[note from @tgamblin: "FWIW, Spack supports external dependencies, and has a fairly lightweight way for users to specify them (spec + path)

https://spack.readthedocs.io/en/latest/build_settings.html#external-packages

We do not auto-detect them (yet). We’ll likely tackle auto-detecting build dependencies (binaries) before we tackle libraries, where ABI issues come into play."]

What metadata can we add to increase understanding of compatibility?

a. Make metadata more direct
b. Check for libraries rather than just providing them
c. manylinux1 already does some things that move in this direction
d. We don’t really consider different workflows separately. We could improve the docs to guide users to other tools, rather than defaulting to one tool that isn’t the right fit.
e. Can we design standards/interchange formats for interoperating between the tools? Should PyPA provide a standard?
f. A key goal is to avoid lock-in to PyPA tools
g. Tools for people who don’t want to rely on an external package manager for provisioning Pythons

  • E.g., yum, apt, brew, conda

h. Need to ensure appropriate bounds are placed on pip’s scope

Draft PEP (@ncoghlan)

  • The challenge is expressing dependencies in a way that exposes the actual runtime requirements

    • Particular dependencies of interest are (1) commands on PATH and (2) dynamic libraries on, e.g., LD_LIBRARY_PATH
  • For practical usage, automatic generation is likely needed → usually not human-readable

  • Developers typically don’t know these dependencies explicitly.

  • auditwheel may be able to generate these? (an illustrative sketch of such metadata follows this list)
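
To make the discussion concrete, here is a purely illustrative sketch of what such auto-generated metadata might contain, written as a Python dict. The field names (`commands`, `shared_libraries`, `version_hint`, `abi_hint`) are invented for this example and are not part of any existing or proposed spec.

```python
# Hypothetical auto-generated external-dependency metadata.
# All field names are invented for illustration only.
external_requirements = {
    "commands": [
        # Executables that must be findable on PATH at runtime.
        {"name": "git", "version_hint": ">=2.0"},
    ],
    "shared_libraries": [
        # Dynamic libraries the loader must be able to resolve
        # (e.g., via LD_LIBRARY_PATH on Linux).
        {"soname": "libssl.so.1.1", "abi_hint": "openssl-1.1"},
    ],
}
```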

pip as a user of the metadata spec

  • Once the metadata spec is written, pip could likely check for these dependencies and fail if they’re not met (a minimal sketch of such a check follows this list)

  • It’s unlikely pip could be made to satisfy (search for and provide) any missing dependencies itself

  • Where do we get the mapping(s) from dependency names ←→ platform-specific package names? → Step 1: a spec language
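
As a minimal sketch of the kind of check pip could perform, the function below uses only the standard library (`shutil.which` for commands, `ctypes.util.find_library` for shared libraries) and consumes the hypothetical metadata layout sketched above; it is not an existing pip feature.

```python
import shutil
from ctypes.util import find_library

def check_external_requirements(requirements):
    """Return human-readable messages for unmet external dependencies.

    `requirements` follows the hypothetical dict layout sketched above.
    """
    problems = []
    for cmd in requirements.get("commands", []):
        if shutil.which(cmd["name"]) is None:
            problems.append(f"command not found on PATH: {cmd['name']}")
    for lib in requirements.get("shared_libraries", []):
        # find_library() wants the short name ("ssl" for libssl.so.1.1),
        # so strip the conventional "lib" prefix and ".so" suffix.
        short = lib["soname"]
        if short.startswith("lib"):
            short = short[3:]
        short = short.split(".so")[0]
        if find_library(short) is None:
            problems.append(f"shared library not found: {lib['soname']}")
    return problems
```

An installer could run such a check up front and fail with the collected messages, instead of letting the user hit a cryptic loader error at import time.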

How hard would it be to have conda-forge provide its metadata?

  • @scopatz: Not too hard; it could be done. We could make this simpler by providing the metadata for tools to consume. PyPI could provide such metadata, and conda already has something similar that might be repurposable (an illustrative mapping sketch follows)
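
For illustration, the name-mapping problem might look like the following. The entries are plausible examples, not data from PyPI, conda-forge, or any real index.

```python
# Illustrative only: maps a generic dependency name to the package that
# provides it under each provider. Real mappings would be published as
# data (e.g., by PyPI or conda-forge), not hard-coded like this.
PROVIDER_MAP = {
    "openssl": {
        "conda": "openssl",
        "apt": "libssl-dev",
        "yum": "openssl-devel",
        "brew": "openssl@1.1",
    },
}
```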

Are there other sources of inspiration/shared functionality?

  • Ansible
  • Chef (resources and providers)
  • End users can extend the list of providers if they are using a distribution or environment where the default providers do not match the running system

What about platforms other than Linux?

  • (Nick) Windows and macOS are likely to be somewhat easier: much stricter ABIs, plus bundling → less variation to cope with

Is there any way to shrink the scope of the problem?

  • Advanced developers can usually tell how to fix a dependency problem from a typical error message (‘command not found’ or ‘libxyz.so not found’).
  • The key thing is(?) to make the problem clearer to inexperienced users: capture the ‘exception’ and provide a better message (see the sketch after this list)
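
A minimal sketch of the "capture the exception and provide a better message" idea, assuming the dependency is an external command; the wrapper and its wording are invented for illustration.

```python
import shutil
import subprocess

def run_external_tool(cmd_name, *args):
    """Run an external command, but turn the raw 'command not found'
    failure into a message that names the dependency and a likely fix."""
    if shutil.which(cmd_name) is None:
        raise RuntimeError(
            f"'{cmd_name}' was not found on PATH. This package needs it at "
            f"runtime; install it with your system package manager "
            f"(e.g., apt, yum, brew, or conda) and try again."
        )
    return subprocess.run([cmd_name, *args], check=True)
```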

Should there be a focus on one type of dependency?

  • I.e., missing commands vs missing libraries
  • Probably not: solving one type seems likely to provide machinery that makes solving the other relatively easy

Actions:

  • Draft a PEP (or update existing) to define a spec
  • First crack at integration between PyPI/pip + conda-forge
  • Start a thread for this topic to form a working group?