Clarification requested about the packaging "resource" system

I read that specifying capped versions for dependencies causes headaches, primarily because (as I’ve understood for a long time) Python’s packaging system is “flat” and not designed to accommodate parallel installs of multiple versions of the same library (so that dependencies need to be resolved to a single version that satisfies all its dependents).

But then it was brought to my attention that setuptools, via pkg_resources, and thus its successors importlib.resources and importlib.metadata[1], are intended to be able to work around this by using built eggs[2].

Is that, in fact, a designed purpose intended to be supported going forward with the new importlib contents?

If so: suppose I pip install foo and it requires bar >= 42.0, but I already have installed something else that requires bar < 42.0. Is there a way I can tell Pip to just grab the new bar version anyway, set it up as an egg that doesn’t conflict with the existing bar install, and somehow configure foo so that its imports use the egg?

Or is this only meant to support vendored dependencies? (I’m not sure why they would actually require any support, though.) Or, perhaps, is importlib.resources abandoning pkg_resources’s promise of “support for parallel installation of multiple versions”? It seems like the non-deprecated parts of importlib.resources only implement a limited API for reading “resource” files, which doesn’t seem like it’s intended for making it possible to import code. But then, it’s not clear to me from the pkg_resources documentation how “support for parallel installation of multiple versions” is supposed to work there, either.

Or… just what am I missing here?

[1] Incidentally, much thanks to the latter for presenting the terms “import package” and “distribution package”! That makes it a lot easier for me to explain those concepts to others when talking generally about import statements and installing libraries.

[2] I had previously only really thought about eggs in the context of egg links, as an implementation detail of pip install -e.

This aspect of eggs was an essentially failed experiment from a long time ago. The “multi version install” feature never really caught on, had (I believe) a number of bugs and unreliable behaviours, and is essentially no longer supported in mainstream packaging.

AFAIK pkg_resources was never intended as a mechanism to allow parallel installation of multiple versions of the same package. It is intended as a mechanism for accessing “resources” from Python packages, where “resources” are files installed alongside the Python modules. The standard library’s importlib.resources implements very similar functionality. importlib.metadata fetches metadata for installed packages by reading the metadata files installed alongside each package’s code. None of these facilities offers a way to import code.
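
To make the distinction concrete, here is a minimal sketch of what the two standard library modules are for, assuming Python 3.9 or later (for importlib.resources.files). The helper names installed_version and read_resource are made up for illustration; neither call selects which version of a package gets imported.

```python
import importlib.metadata
import importlib.resources


def installed_version(dist_name):
    """Return the version string of an installed distribution package, or None."""
    try:
        return importlib.metadata.version(dist_name)
    except importlib.metadata.PackageNotFoundError:
        return None


def read_resource(package, name):
    """Read a text data file that ships inside an import package."""
    return importlib.resources.files(package).joinpath(name).read_text(encoding="utf-8")


# Metadata lookup: reports whatever single version is installed, if any.
print(installed_version("pip"))
```

Both functions only read files that were installed next to the code; they do not affect what an import statement resolves to.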

Python eggs are a format for distributing Python packages, mostly superseded and made obsolete by the wheel distribution format. Neither is designed to allow parallel installations in the same Python environment.

The only way to install multiple versions of the same package is to use distinct Python environments, as implemented by venv or virtualenv. This isolates applications with conflicting dependencies from one another. However, this is not the same as allowing libraries with conflicting dependencies to be used in the same application.
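
As a rough sketch of that isolation, the standard library venv module can create one environment per application. The directory names below are hypothetical, and with_pip=False is used only to keep the example quick; in practice you would create each environment with pip and install that application's own version of the shared dependency into it.

```python
import tempfile
import venv
from pathlib import Path


def make_env(parent, name):
    """Create a separate virtual environment under parent and return its path."""
    env_dir = Path(parent) / name
    venv.create(env_dir, with_pip=False)
    return env_dir


with tempfile.TemporaryDirectory() as tmp:
    # One environment per application, each with its own site-packages.
    old_app = make_env(tmp, "app-needing-old-bar")
    new_app = make_env(tmp, "app-needing-new-bar")
    print((old_app / "pyvenv.cfg").exists(), (new_app / "pyvenv.cfg").exists())
```

Each environment has its own site-packages, so the two applications never see each other's copy of bar, but a single process still runs inside only one of them.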

Ah.

I would have said in that case that the pkg_resources documentation ought to reflect that properly; but I guess it’s too late to bother if all this stuff is on the way out anyway.

There still do seem to be use cases for resolving version conflicts this way, though. There just… isn’t a clear way to make that play nice with an import statement syntax that’s based on symbolic names rather than file paths (like in JavaScript).

I tried to keep my reply simple by leaving out the historical perspective. Once upon a time, there were eggs, and eggs could be installed but not activated. Multiple eggs for different versions of the same package could be installed in parallel. Then pkg_resources could be used to activate a specific version for a given Python process. Unless it was so messy that I decided to forget everything about it, it was not possible to activate two versions of the same package in the same process. However, this way of letting applications access the versions of their dependencies that they needed turned out not to be great. Virtual environments proved nicer to work with, at the expense of having multiple copies of common dependencies installed.

I think that the setuptools documentation for pkg_resources is quite clear in indicating that pkg_resources is deprecated and should not be used in new projects.

Having multiple versions of the same module available in the same process works only in a very limited number of cases, namely when the code paths that access the two versions are completely disjoint. Given the limited applicability and the limited benefits, I don’t think the considerable complication it would require is worth the effort.

With eggs your program could essentially do dependency resolution at import time. Instead of having multiple virtual environments, you could have a big directory full of eggs. With the buildout system the script to start your application might start with a long list of the eggs it wants to run with, then set up sys.path and continue. However you would never import two versions of the same egg in the same Python interpreter.
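
A rough sketch of that idea, not buildout's actual generated code: a start script puts the chosen per-version directories at the front of sys.path before importing anything else. The activate helper and the paths in the comments are hypothetical.

```python
import sys


def activate(paths):
    """Prepend the chosen per-version directories to sys.path, keeping their order."""
    for path in reversed(paths):
        if path not in sys.path:
            sys.path.insert(0, path)


# A start script might then begin like this (hypothetical directories):
# activate([
#     "/opt/eggs/bar-41.0.egg",
#     "/opt/eggs/foo-1.2.egg",
# ])
# import foo  # resolves bar from whichever directory was activated
```

Once bar has been imported, it is cached in sys.modules, so later imports in the same interpreter get that same module regardless of what is added to sys.path afterwards.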

More than ten years ago, virtual environments became much more popular. There, everything your application can import is grouped under a single directory.
