Off the top of my head, `Pipfile.lock` is still a relatively common one, and it's important to note that `setup.cfg` uses a custom INI parser, and there are multiple formats for deps in `pyproject.toml`: build deps under `[build-system]`, standardized PEP 621 runtime ("project") deps under `[project]`, and deps for many of the non-setuptools build backends (Flit, Poetry, PDM, etc.), each in their own tool-dependent format, under `[tool]`. Plus, there are "extras", which are expressed differently in each format, and legacy setup-requires and test-requires equivalents in some formats. In short…what a mess! Hopefully, someday they'll all use PEP 621 project source metadata for runtime deps, and `[build-system]` for build deps, but that future is still years away.
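To make the split concrete, here's a minimal sketch (assuming Python 3.11+ for the stdlib `tomllib` module) that pulls out just the two standardized dep lists from a `pyproject.toml`; anything under `[tool]` would still need its own tool-specific parser:

```python
import tomllib  # stdlib in Python 3.11+; the tomli backport works on older versions

# Sketch: read only the *standardized* dependency fields from pyproject.toml.
# Tool-specific deps (e.g. [tool.poetry.dependencies]) are deliberately ignored,
# since each backend defines its own format for those.
with open("pyproject.toml", "rb") as f:
    data = tomllib.load(f)

build_deps = data.get("build-system", {}).get("requires", [])      # PEP 518
runtime_deps = data.get("project", {}).get("dependencies", [])     # PEP 621
extras = data.get("project", {}).get("optional-dependencies", {})  # PEP 621 "extras"

print("Build deps:", build_deps)
print("Runtime deps:", runtime_deps)
for extra, deps in extras.items():
    print(f"Extra [{extra}]:", deps)
```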
> However, pip seems to be able to parse all of these, and tries to download dependencies if they are missing. Is there any pip API to get this list of packages?
Disclaimer: I'm pretty far from being a packaging expert, unlike many users on here, so take this with a grain of salt, but the following is my high-level understanding of the situation.
Historically, this worked because all packages used a `setup.py`, which pip would call as part of its build/install process, and then Setuptools (or the legacy distutils) would actually do the dirty work (or they used a `requirements.txt`, which had to be manually fed to pip with `pip install -r`).
Nowadays, the way this works (for non-editable installs) is defined by PEP 517. Under the modern nomenclature, Setuptools is what we would call a build backend, which does the dirty work of taking the project source tree and transforming it into a distribution package that pip knows how to install, either a semi-standardized sdist, or ultimately a standardized wheel; while `pip` is a build frontend, which directly interacts with the user and calls the backend. Other packaging tools with their own tool-specific dep formats, including Poetry, Flit, and PDM, provide their own build backends, though some of them can act as frontends too (pipenv, by contrast, is purely a frontend).
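To illustrate the frontend/backend split, here's a toy sketch of a frontend calling the standardized PEP 517 `build_wheel` hook directly; real frontends run the backend in an isolated subprocess with `[build-system].requires` installed first, whereas this sketch assumes the backend is already importable and the current directory is the project root:

```python
import importlib
import tomllib  # stdlib in Python 3.11+
from pathlib import Path

# Toy PEP 517 "frontend": look up the backend declared in pyproject.toml and
# ask it to build a wheel. Real frontends (pip, build) do this in an isolated
# environment; this sketch skips that entirely.
with open("pyproject.toml", "rb") as f:
    build_system = tomllib.load(f)["build-system"]

backend_name = build_system["build-backend"]  # e.g. "setuptools.build_meta"
module_name, _, obj_name = backend_name.partition(":")
backend = importlib.import_module(module_name)
if obj_name:  # the hooks may live on an object inside the module
    backend = getattr(backend, obj_name)

Path("dist").mkdir(exist_ok=True)
# build_wheel is the standardized PEP 517 hook; it returns the wheel's filename.
print("Built:", backend.build_wheel("dist"))
```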
Essentially, the backend takes care of handling its own tool-specific (or generic) dependency format and transforming it into standardized metadata in the distribution packages, following the core metadata standard, which pip can then consume to install the package's dependencies, and so forth in the same manner. But in order to use that, the package needs to be rendered to a source distribution (sdist), and from there a built distribution (wheel), which then contains the deps under `Requires-Dist` keys in the RFC 822(-ish) format `METADATA` file under the `.dist-info` directory in the wheel archive.
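For a sense of what those `Requires-Dist` keys look like, and how the stdlib `email` parser reads them, here's a small sketch (the sample metadata below is made up for the example; a real `METADATA` file has many more fields):

```python
import email

# Made-up example of the core-metadata fields found in a wheel's
# *.dist-info/METADATA file.
SAMPLE_METADATA = """\
Metadata-Version: 2.1
Name: example-project
Version: 1.0.0
Requires-Dist: requests (>=2.25)
Requires-Dist: tomli ; python_version < "3.11"
Requires-Dist: pytest ; extra == "test"
"""

# email.message_from_string uses the legacy compat32 policy by default,
# which is what the RFC 822-ish metadata format expects.
msg = email.message_from_string(SAMPLE_METADATA)
for dep in msg.get_all("Requires-Dist", []):
    print(dep)
```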
So, to use the same method as pip and work with any modern Python package regardless of build system or format, what you could do is (see the sketch after this list):

- Use `pip` or another build frontend, or call the PEP 517 hooks directly, to build the project into a built ("binary") distribution package for your target platform
- Extract the built wheel (it's just a ZIP) and locate the `METADATA` file in the `.dist-info` directory
- Parse the file with Python's `email.parser` in legacy `compat32` mode (or write/use a parser that emulates it) and extract the `Requires-Dist` keys
- Parse the requirements with `packaging`'s `Requirement` format parser
- Rinse and repeat for each package and sub-dependency.
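Put together, a rough end-to-end sketch of those steps might look like the following; it shells out to `pip wheel` for the build step, assumes the `packaging` library is installed, and the `get_wheel_requirements` helper name is just made up for illustration:

```python
import subprocess
import sys
import zipfile
from email.parser import Parser
from pathlib import Path

from packaging.requirements import Requirement  # pip install packaging


def get_wheel_requirements(source_dir: str, out_dir: str = "wheels") -> list[Requirement]:
    """Best-effort sketch: build a wheel from a source tree and read its deps."""
    # Step 1: let a real build frontend (pip) drive the PEP 517 backend.
    subprocess.run(
        [sys.executable, "-m", "pip", "wheel", "--no-deps", "-w", out_dir, source_dir],
        check=True,
    )
    wheel_path = next(Path(out_dir).glob("*.whl"))  # assumes out_dir started empty

    # Step 2: a wheel is just a ZIP; find the .dist-info/METADATA inside it.
    with zipfile.ZipFile(wheel_path) as whl:
        metadata_name = next(
            name for name in whl.namelist() if name.endswith(".dist-info/METADATA")
        )
        metadata_text = whl.read(metadata_name).decode("utf-8")

    # Step 3: Parser() defaults to the legacy compat32 policy.
    msg = Parser().parsestr(metadata_text)

    # Step 4: parse each Requires-Dist value as a PEP 508 requirement.
    return [Requirement(dep) for dep in msg.get_all("Requires-Dist", [])]


# Step 5 ("rinse and repeat") would recurse into each requirement's own package.
for req in get_wheel_requirements("."):
    print(req.name, req.specifier, req.marker)
```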
As you can see, this is certainly non-trivial, and seems unlikely to fully meet your requirements above (no running Python). However, it is unfortunately the only reliable way to do so for arbitrary packages using a variety of build tools and dep formats, particularly in the case of non-static metadata (`setup.py`) that can be arbitrary code and must be executed to return an accurate result.
This is basically most of what Linux distros do in order to repackage Python projects into their own distro packages; conda has automated checks that, I believe, either do this or read the source, but canonical deps are specified in the conda recipe.
There are also tools that drive `pip` at a low level to generate fully resolved standard `requirements.txt`s, but these contain all direct and indirect deps and are resolved down to concrete versions, which is not likely what you need. While this is a lot of work, if it only needs to be done on CI services or at the packager's end, it is much less prohibitive than doing it for every package user.
The alternative, which might make more sense for a "best-effort" approach as a first guess or additional check for human packagers, is reading as many source dependency specification formats as possible, using the appropriate parser for each file and `packaging.requirements` to parse the actual PEP 508 strings; you could also do basic static AST introspection, or even use regexes, on `setup.py`s to try to guess at the deps there (a sketch follows below). But you'd have to be able to tolerate a significant amount of error: a substantial and, at least for a while until PEP 621 adoption accelerates, perhaps growing number of projects keep their deps in tool-specific formats, as well as a hopefully shrinking number that use `setup.py` instead of static metadata.
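For what that static-introspection fallback could look like, here's a minimal best-effort sketch (the `guess_setup_py_requires` helper is a made-up name) that walks a `setup.py` AST looking for a literal `install_requires` list; it will miss anything computed dynamically, which is exactly the kind of error you'd have to tolerate:

```python
import ast


def guess_setup_py_requires(path: str = "setup.py") -> list[str]:
    """Best-effort: find a literal install_requires=[...] without running setup.py."""
    with open(path, encoding="utf-8") as f:
        tree = ast.parse(f.read())

    for node in ast.walk(tree):
        # Look for any call with an install_requires keyword argument...
        if isinstance(node, ast.Call):
            for kw in node.keywords:
                if kw.arg == "install_requires":
                    try:
                        # ...and only accept it if it's a plain literal list.
                        return list(ast.literal_eval(kw.value))
                    except ValueError:
                        return []  # dynamically computed; give up
    return []


print(guess_setup_py_requires())
```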
So, it really depends on your goals. However, perhaps others might have better ideas on an approach that requires less novel effort on your end by hooking the internals of existing tools. Ultimately, though, something is going to need to do each of those steps in order to get from a project source tree to reliable package requirements for an arbitrary Python project, at least until the world adopts PEP 621 (or at the very least static requirements metadata in a constrained number of formats).