any others?
Off the top of my head, `Pipfile`/`Pipfile.lock` is still a relatively common one. It's also important to note that `setup.cfg` uses a custom INI parser, and that there are multiple formats for deps in `pyproject.toml`: build deps under `[build-system]`, standardized PEP 621 runtime ("project") deps under `[project]`, and deps for many of the non-setuptools build backends (Flit, Poetry, PDM, etc.), each in their own tool-dependent format, under `[tool]`. Plus, there are "extras", which are expressed differently in each format, and legacy `setup_requires`/`tests_require` equivalents in some formats. In short… what a mess! Hopefully, someday they'll all use PEP 621 `[project]` source metadata for runtime deps and `[build-system]` for build deps, but that future is still years away.
However, pip seems to be able to parse all of these, and tries to download dependencies if they are missing. Is there any pip API to get this list of packages?
Disclaimer: I’m pretty far from a packaging expert unlike many users on here, so take this with a grain of salt, but the following is my high level understanding of the situation.
Historically, this worked because all packages used a `setup.py`, which pip would call as part of its build/install process, and then Setuptools (or the legacy distutils) would actually do the dirty work (or they used a `requirements.txt`, which had to be fed to pip manually with `-r`).
Nowadays, the way this works (for non-editable installs) is defined by PEP 517. Under the modern nomenclature, Setuptools is what we would call a build backend, which does the dirty work of taking the project source tree and transforming it into a distribution package that pip knows how to install: either a semi-standardized sdist, or ultimately a standardized wheel. Meanwhile, `pip` is a build frontend, which directly interacts with the user and calls the backend. Other package managers with their own tool-specific dep formats, including Poetry, pipenv, Flit, PDM, etc., are all build backends too, though some of them can act as frontends as well.
Essentially, the backend takes care of handling its own tool-specific (or generic) dependency format and transforming it into standardized metadata in the distribution packages following the core metadata standard, which pip can then consume to install its dependencies, and so forth in the same manner. But in order to use that, the package needs to be rendered to a source distribution (sdist), and from there a built distribution (wheel), which then contains the deps under `Requires-Dist` keys in the RFC 822(-ish) format `METADATA` file under the `.dist-info` directory in the wheel archive.
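To illustrate what that `METADATA` file looks like, here's a minimal sketch of parsing a trimmed-down (made-up) example with the stdlib `email.parser`, whose default `compat32` policy is what this format expects:

```python
from email.parser import Parser

# A trimmed-down, hypothetical wheel .dist-info/METADATA file
METADATA = """\
Metadata-Version: 2.1
Name: example
Version: 1.0
Requires-Dist: requests (>=2.28)
Requires-Dist: tomli ; python_version < "3.11"
"""

# Parser uses the legacy compat32 policy by default
msg = Parser().parsestr(METADATA)
requires = msg.get_all("Requires-Dist") or []
print(requires)
```

Each `Requires-Dist` value is a PEP 508 requirement string, possibly with an environment marker (like the `python_version` one above) that you'd evaluate for your target platform.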
So, to use the same method as pip and work with any modern Python package regardless of build system or format, what you could do is:

- Use `build`, `pip`, or another build frontend, or call the PEP 517 hooks directly, to build the project into a built ("binary") distribution package for your target platform
- Extract the built wheel (it's just a ZIP) and locate the `METADATA` file
- Parse the file with Python's `email.parser` in legacy `compat32` mode (or write/use a parser that emulates it) and extract the `Requires-Dist` keys
- Parse the requirements with `packaging`'s requirements parser
- Rinse and repeat for each package and sub-dependency.
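The middle steps above (extract the wheel, pull out `Requires-Dist`) can be sketched with just the stdlib, assuming you've already built the wheel with a frontend like `python -m build --wheel`. The function name here is my own invention:

```python
import zipfile
from email.parser import Parser

def wheel_requires(wheel):
    """Extract Requires-Dist entries from a built wheel (a plain ZIP).

    Accepts a path or file-like object, as zipfile does.
    """
    with zipfile.ZipFile(wheel) as zf:
        # Locate the METADATA file inside the .dist-info directory
        meta_name = next(
            name for name in zf.namelist()
            if name.endswith(".dist-info/METADATA")
        )
        metadata = zf.read(meta_name).decode()
    # email.parser's Parser defaults to the legacy compat32 policy
    msg = Parser().parsestr(metadata)
    return msg.get_all("Requires-Dist") or []
```

From there you'd feed each string to `packaging.requirements.Requirement` and recurse into each dep, which is the part that makes this genuinely non-trivial.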
As you can see, this is certainly non-trivial, and seems unlikely to fully meet your requirements above (no running Python). However, it is unfortunately the only reliable way to do so for arbitrary packages, using a variety of build tools and dep formats, particularly in the case of non-static metadata (setup.py) that can be arbitrary and must be executed to return an accurate result.
This is basically most of what Linux distros do in order to repackage Python projects into their own distro packages; conda has automated checks that I believe either do this or read the source, but canonical deps are specified in the conda recipe. `pip-tools` hooks `pip` at a low level to generate fully resolved standard `requirements.txt` files, but those contain all direct and indirect deps, resolved down to concrete versions, which is not likely what you need. While it is a lot of work, if it only needs to be done on CI services or at the packager's end, it is much less prohibitive than doing it for every package user.
The alternative, which might make more sense for a "best-effort" approach as a first guess or an additional check for human packagers, is reading as many source dependency specification formats as possible using the appropriate parser for each file, with `packaging.requirements` to parse the actual PEP 508 strings; you could also do basic static AST introspection, or even use regexes, on `setup.py` files to try to guess at the deps there. But you'd have to be able to tolerate a significant amount of error: a substantial (and, at least until PEP 621 adoption accelerates, perhaps growing) number of projects keep their deps in tool-specific formats, and a hopefully shrinking number use `setup.py` instead of static metadata.
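As a taste of what that static `setup.py` introspection could look like, here's a best-effort sketch using the stdlib `ast` module. The `setup.py` contents and the function name are made up for illustration; note it deliberately gives up when the deps aren't a static literal, which is exactly the case where only executing the file would give an accurate answer:

```python
import ast

# A hypothetical legacy setup.py whose deps we want to guess at statically
SETUP_PY = """
from setuptools import setup

setup(
    name="example",
    install_requires=["requests>=2.28", "packaging"],
    extras_require={"test": ["pytest"]},
)
"""

def guess_install_requires(source):
    """Best-effort: find a literal install_requires list in a setup() call.

    Returns None when the deps aren't a static literal (e.g. computed at
    runtime), which is precisely where static analysis can't help.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and getattr(node.func, "id", None) == "setup":
            for kw in node.keywords:
                if kw.arg == "install_requires":
                    try:
                        return ast.literal_eval(kw.value)
                    except ValueError:
                        return None  # not a static literal; give up
    return None

print(guess_install_requires(SETUP_PY))  # ['requests>=2.28', 'packaging']
```

The same approach extends to `extras_require`, but any `setup.py` that builds its dep list dynamically (reading files, branching on the platform, etc.) will defeat it, which is the tolerance-for-error caveat above.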
So, it really depends on your goals. However, perhaps others might have better ideas on an approach that requires less novel effort on your end by hooking the internals of existing tools. Ultimately, though, something is going to need to do each of those steps in order to get from a project source tree to reliable package requirements for an arbitrary Python project, at least until the world adopts PEP 621 (or at the very least static requirements metadata in a constrained number of formats).