Virtual environments vs. *nix Python upgrades

Hello,
With a recent update to /usr/bin/python, all my virtual environments were broken. I have an easy solution to the problem – delete and re-create them – but that’s not very friendly to people encountering this for the first time.
Since this affects both virtualenv and venv, I’m posting here rather than opening issues.

Following PEP 394 the python command on my system was a series of symlinks:
python → python3 → python3.9
After an update (of the whole OS in my case), this changed to:
python → python3 → python3.10
I have both python3.9 and python3.10 available (both before and after), but I assume there are people who only want one version – the latest one that’s sufficiently stable.

When virtual environments (both venv and virtualenv) are created, they use symlinks to the Python they are created with. So after a python -m venv my_env, I end up with:
my_env/bin/python → /usr/bin/python → python3 → python3.10
which will break after python3 is updated to python3.11 – the library in my_env/lib/python3.10 won’t be usable.
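To see which hops a given link goes through, something like the following sketch can help. It is only an illustration: `symlink_chain` is a hypothetical helper name, not part of venv or virtualenv, and it assumes a POSIX system where symlinks are available.

```python
import os
from pathlib import Path


def symlink_chain(path):
    """Return every hop from `path` down to the final non-symlink target."""
    current = Path(path)
    chain = [current]
    seen = set()
    while current.is_symlink():
        if current in seen:
            raise OSError(f"symlink loop at {current}")
        seen.add(current)
        target = Path(os.readlink(current))
        # A relative target is interpreted relative to the link's directory.
        current = target if target.is_absolute() else current.parent / target
        chain.append(current)
    return chain
```

Calling `symlink_chain("my_env/bin/python")` on a layout like the one above would list each link in turn, which makes it easy to spot the unversioned hop that a later OS update may repoint.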

Looking at this, it would make sense to me that if sys.executable is a symlink, virtual environment tools would resolve it before pointing to it. But I probably don’t know the whole picture. Are there any downsides to that approach? Is there any discussion I should read up on?

(We briefly looked into how this would affect nested venvs with --system-site-packages, but that looks messy even now.)


And while I’m here: as a distro maintainer, I’d love to have a way to provide a system-specific message for venv/virtualenv activation scripts saying “this Python was uninstalled, here’s the command to bring it back”. Does that sound like a good idea?

cc @hroncok @bernatgabor @vsajip


Congrats on your election, BTW.

My knowledge pales in comparison to most here, and most of my experience with venv vs. conda envs and other tools (where the Python version is managed by the environment itself, not the system) is with CI, VM, server and remote client deployments (e.g. RPis), which are mostly deployed via an image rather than upgraded in this fashion, so I’m not sure how much I have to add.

But for what it’s worth, from a user perspective, the potential downside is that existing envs would stop working if that specific Python were not installed, which for lightweight pure-Python packages may be undesirable to some users. However, for anything that’s compiled against a specific version (and not the stable ABI), or that isn’t source-compatible with the new version, failing early with an informative message (as you suggest) is much preferable to packages randomly breaking because they don’t support the new version, possibly with confusing tracebacks that don’t directly state the problem and solution.

Since the version of Python is an important variable when installing packages into a venv, as a rule it wouldn’t seem to make sense to allow it to vary as opposed to locking the env to a specific version. Of course, with Conda or other package managers one can simply upgrade the version of Python in the env, and any necessary packages are updated too.

Perhaps it would be feasible to tell the new pip solver to re-solve for a new Python version and upgrade packages as necessary, and update the env symlink to the new Python version? So long as the user system-installed the Python versions they wanted, this could theoretically allow somewhat similar functionality without too many changes, without pip managing the Python versions directly, but I’m a bit out of my depth here as to the practical feasibility.

There’s some work going on which might solve this issue, in that it relates to improving symlink resolution in venvs. Does this look like it will sort out this problem?

Sadly, no. The PR there only affects Python running from a virtualenv. I’m looking at behavior during virtualenv creation, when the venv/bin/python symlink is made.

$ ls -l /usr/bin/python
… /usr/bin/python → /usr/bin/python3.11
$ python -m venv /tmp/my-venv
$ ls -l /tmp/my-venv/bin/python
… /tmp/my-venv/bin/python → /usr/bin/python

That last one should link to /usr/bin/python3.11 to avoid breaking when /usr/bin/python is updated to an incompatible version.


Unfortunately, venv gets pretty complicated here with the mix between symlinks and copies, and the fact that it doesn’t preserve all the information it needs in the pyvenv.cfg.

The issue Vinay linked is actually very relevant to this, and it’s worth contributing there, because the fault ultimately lies with how sys._base_executable is calculated and later used by venv.

The PR should actually fix your main issue (possibly with another tweak to venv itself) by ensuring that sys._base_executable is resolved from .../python to .../python3.11 at startup, and then that path will be used by venv (except maybe not on POSIX to avoid breaking people who assume its value, but if we break them deliberately then it’s fine :wink: ).

The challenge with pyvenv.cfg is that it doesn’t preserve the executable name, and so if that has to be resolved later and we can’t trust/use the symlink for whatever reason, we’re reduced to guessing.
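As a rough illustration of what is and isn’t recorded, here is a minimal sketch that reads pyvenv.cfg into a dict. The helper name `read_pyvenv_cfg` is made up for this example, and the sample keys mentioned below (`home`, `version`) are my understanding of what venv writes; treat both as assumptions rather than a description of venv’s internals.

```python
from pathlib import Path


def read_pyvenv_cfg(venv_dir):
    """Parse the simple `key = value` lines of <venv_dir>/pyvenv.cfg."""
    cfg = {}
    text = (Path(venv_dir) / "pyvenv.cfg").read_text(encoding="utf-8")
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines without an '=' separator
            cfg[key.strip().lower()] = value.strip()
    return cfg
```

With a file that only has entries such as `home = /usr/bin` and `version = 3.10.1`, the result names a directory and a version, but not which file in that directory was the interpreter, which is the guessing problem described above.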

I might have found a problem if we always resolve symbolic links when creating virtual environments. It might worsen the upgrade experience for some:

Consider a user who has Python installed into the /opt/Python3.10.1/ prefix. For convenience they also have a /usr/local/bin/python3.10 symbolic link to /opt/Python3.10.1/bin/python3.10.

They create a virtual environment with python3.10 -m venv myvenv and it resolves the symbolic link, so the ./myvenv/bin/python link now links to /opt/Python3.10.1/bin/python3.10.

Another day, they update their Python installation to 3.10.2 by installing Python to the /opt/Python3.10.2/ prefix, removing the /opt/Python3.10.1/ installation, and updating /usr/local/bin/python3.10 symbolic link to point to /opt/Python3.10.2/bin/python3.10. Their virtual environment is now broken :bomb:
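The scenario can be reproduced in a throwaway directory. This is only a simulation under stated assumptions: plain text files stand in for the interpreters, two bare symlinks stand in for myvenv/bin/python, and `simulate_prefix_upgrade` is a hypothetical name; no real venv is created.

```python
import shutil
import tempfile
from pathlib import Path


def simulate_prefix_upgrade():
    """Return (resolved_link_works, unresolved_link_works) after the upgrade."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        old = root / "opt" / "Python3.10.1" / "bin"
        new = root / "opt" / "Python3.10.2" / "bin"
        local_bin = root / "usr" / "local" / "bin"
        for directory in (old, new, local_bin):
            directory.mkdir(parents=True)
        (old / "python3.10").write_text("3.10.1")
        (new / "python3.10").write_text("3.10.2")

        convenience = local_bin / "python3.10"
        convenience.symlink_to(old / "python3.10")

        # Stand-ins for myvenv/bin/python: one created after resolving the
        # convenience link, one pointing at the convenience link itself.
        resolved_link = root / "venv_resolved_python"
        resolved_link.symlink_to(convenience.resolve())
        unresolved_link = root / "venv_unresolved_python"
        unresolved_link.symlink_to(convenience)

        # The upgrade: remove 3.10.1 and repoint the convenience link.
        shutil.rmtree(root / "opt" / "Python3.10.1")
        convenience.unlink()
        convenience.symlink_to(new / "python3.10")

        # Path.exists() follows symlinks, so a dangling link reports False.
        return resolved_link.exists(), unresolved_link.exists()
```

The link that was resolved at creation time ends up dangling, while the one that still goes through /usr/local/bin/python3.10 follows the upgrade.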


To solve this situation, we would need to work out when to resolve the symbolic links and when not to. And that smells like a heuristic :cry: The most trivial case of such a heuristic that would solve Fedora’s problem is to resolve the symbolic link iff sys.executable is a link to a file in the same directory. E.g.:

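import sys
from pathlib import Path
executable = Path(sys.executable)
# Resolve only when the link's final target lives in the same directory.
if executable.is_symlink() and executable.resolve().parent == executable.parent:
    executable = executable.resolve()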

Doing that would also keep the current “nested venv” behavior unchanged.
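Wrapped in a function, the same-directory heuristic can be tried against both layouts discussed so far. This is a sketch, not a proposed patch: `venv_link_target` is a hypothetical name, and comparing against the resolved parent directory is my own variation, added on the assumption that the directory itself may be reached through a symlink (for example /bin pointing to /usr/bin).

```python
from pathlib import Path


def venv_link_target(executable):
    """Resolve `executable` only if its final target is in the same directory."""
    executable = Path(executable)
    if executable.is_symlink():
        resolved = executable.resolve()
        # Compare against the resolved parent so that a symlinked directory
        # does not defeat the same-directory check.
        if resolved.parent == executable.parent.resolve():
            return resolved
    return executable
```

For /usr/bin/python → python3.10 this returns the versioned file, while a /usr/local/bin/python3.10 link into /opt/Python3.10.1/bin is returned untouched, so the convenience link keeps working across a prefix change.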

(Imagine another complaint here about how much I wish we could replace venvs with something more robust.)

It’s starting to look like resolving the symlink at all is a bad idea, or at least a very complicated one to explain.

Maybe we can change it to “venv remembers the exact command you used to launch it and uses that in future to find your original Python install”? So if you python3.10 -m venv ... then it’s going to resolve whatever python3.10 means next time you launch it. But if you ./myvenv/bin/python -m venv ... it’ll use that path (after making it absolute).
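A very rough sketch of what “remember the command and look it up again later” could mean, assuming the recorded value is either a bare command name or a path. `locate_base_python` and its parameters are hypothetical; nothing here reflects how venv currently stores or finds the base interpreter.

```python
import os
import shutil


def locate_base_python(recorded, search_path=None):
    """Find the base interpreter from the command recorded at creation time."""
    if os.sep not in recorded:
        # A bare name such as "python3.10" is looked up again on each call,
        # so it follows whatever that name currently means on PATH.
        return shutil.which(recorded, path=search_path)
    # Anything containing a path separator is treated as a fixed location.
    absolute = os.path.abspath(recorded)
    if os.path.isfile(absolute) and os.access(absolute, os.X_OK):
        return absolute
    return None
```

The choice stays with the user: creating the environment via a bare name opts into following upgrades, while creating it via an explicit path pins it to that file.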

Trying to balance predictability with being easy to explain, and letting the user make their own choices. Resolving symlinks based on a heuristic takes away the ability for the user to choose how it should work, barring extra command line options, which also generally make things worse.

Considering we are the distributors and we know what we want to achieve, would it make sense to advertise the “actual” path of our interpreter somehow?

It may make sense, I haven’t thought through all the implications.

To do it, I’d override sys._base_executable at some point. It’s just a regular attribute, so can be set directly in Python.
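Roughly, and only as a sketch: since the attribute is private and may not exist on every version or platform, the hypothetical helper below reads it defensively and returns the old value so the change can be undone. Where a distributor would call this (a site hook, a patched startup file) is left open.

```python
import os
import sys


def advertise_canonical_interpreter(canonical_path):
    """Point sys._base_executable at `canonical_path`; return the old value."""
    # _base_executable is a private attribute, so it may be missing.
    previous = getattr(sys, "_base_executable", None)
    sys._base_executable = os.fspath(canonical_path)
    return previous
```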

Though it might be better to just patch venv rather than try and make it infer the right things to do.

It sounds like there needs to be a way to specify what the canonical path to a specific interpreter is, allowing for some local behavior to be specified.

Currently, I think that’s “whatever the symlink chain resolves to”.

The only place that breaks down is when a user wants a non-canonical path (or a canonical path to a non-specific interpreter). But we can’t have both through the same mechanism.


That only works for venv, which is part of the standard library. We can patch our virtualenv as well, but users tend to pip upgrade it, so our patches would be lost. Since we seek a common way of dealing with this issue for both venv and virtualenv, @encukou opened this discussion here.
