This question is not really specific to Python. I think a lot of waste happens when installing third-party packages. Say you want to use one class and a few methods from a particular package. `pip install` will fetch that package and, in turn, install all of its dependencies, so you end up with a huge lock file (whether `poetry.lock` or `uv.lock`) containing dependencies of dependencies of dependencies.
Some of these, sure, are ultimately used by the methods you call in your code base. But I would suggest that most of them are not. In one case I remember installing a package for one single method, and this package in turn installed `fastapi`, `greenlet`, etc., even though the method I actually used never touches these (not even in its transitive closure).
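To see how quickly this snowballs, here is a minimal sketch that walks the declared dependency closure of an installed distribution, using the standard library's `importlib.metadata` plus the `packaging` library. The distribution name `"requests"` at the bottom is just an example, and extras/environment markers are ignored for brevity:

```python
from importlib.metadata import PackageNotFoundError, requires
from packaging.requirements import Requirement

def transitive_deps(dist: str, seen: set[str] | None = None) -> set[str]:
    """Recursively collect the declared dependencies of an installed
    distribution. Extras and environment markers are ignored, so this
    slightly over-counts; it is only meant to show the size of the tree."""
    seen = set() if seen is None else seen
    try:
        reqs = requires(dist) or []
    except PackageNotFoundError:
        return seen  # not installed locally, so we cannot recurse further
    for req in reqs:
        name = Requirement(req).name
        if name not in seen:
            seen.add(name)
            transitive_deps(name, seen)
    return seen

# "requests" is just an example; substitute any installed distribution.
print(len(transitive_deps("requests")), "transitive dependencies declared")
```

Every one of those ends up pinned in the lock file, whether or not the single method you call ever imports it.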
I wonder whether these dependency trees can be significantly reduced, to make the "builds" (of Python virtual environments) much "thinner".
Here are some ideas:
1. Within a Python project, the interpreter (or a static analyzer) should be able to trace out, for every method, class, or constant, all the dependencies that are needed (see the sketch after this list).
2. As a consequence of 1, any package manager should know that, in order to install a particular bundle of methods/classes, it only has to install a particular (in general strict) subset of the dependencies listed in the `pyproject.toml` file.
3. When installing a package, it should be possible to choose which methods/classes one really needs, e.g.
```toml
[project]
name = "mypackage"
version = "1.0.0"
dependencies = [
    ...
    "otherpackage1>=3.0.1,exports=*",
    "otherpackage2>=3.0.1,exports=[SomeClass,SomeOtherClass,some_method1,some_method2]",
    ...
]
```
(I am of course not suggesting that that be the syntax.)
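For idea 1, here is a rough sketch of the kind of static analysis involved, using only the standard library's `ast`, `inspect`, and `textwrap` modules. The names `mymodule` and `my_func` in the usage comment are hypothetical placeholders, and a real tracer would have to follow calls transitively, resolve attribute chains, and cope with dynamic imports, none of which this toy handles:

```python
import ast
import inspect
import textwrap

def referenced_names(func):
    """Collect every bare name referenced in a function's body."""
    source = textwrap.dedent(inspect.getsource(func))
    tree = ast.parse(source)
    return {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}

def imports_used_by(module, func):
    """Which of the module's imports does `func` directly reference?
    Only a first approximation: it does not follow the call graph,
    resolve attribute access, or see imports done inside functions."""
    imported = {}  # local binding name -> module it came from
    for node in ast.walk(ast.parse(inspect.getsource(module))):
        if isinstance(node, ast.Import):
            for alias in node.names:
                imported[alias.asname or alias.name.split(".")[0]] = alias.name
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported[alias.asname or alias.name] = node.module or "."
    return {imported[name] for name in referenced_names(func) if name in imported}

# Hypothetical usage:
#   import mymodule
#   print(imports_used_by(mymodule, mymodule.my_func))
```

If a package shipped this kind of metadata for its public API, a resolver could in principle intersect it with the `exports=` list above and skip everything else.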
Now consider the transitive closure. If every `pyproject.toml` file is set up in this conservative way, we end up with a better diet of dependencies: the enormous tree of unnecessary waste is pruned down to just the sliver of things that a particular project/package actually needs and uses.