Alternative approaches to distros providing a "system Python"

I guess an alternative (and better?) approach is to create a separate virtual environment for each system package, or even better, have all packages ship their own virtual environment. From my understanding, many Linux distributions maintain a so-called “system Python”, e.g. /usr/bin/python3, and all system packages share that particular Python distribution, which is clearly suboptimal. For example, the package update-manager depends on python3-yaml==5.3.1, but what if another system package depends on python3-yaml==6.0.0? You get a version conflict, and PEP 668 doesn’t help that.
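To make the conflict concrete, here is a minimal sketch of what goes wrong when every system package shares one environment. Only the update-manager pin comes from the example above; `some-other-tool` and the helper `find_pin_conflicts` are hypothetical names invented for illustration, and real distro dependencies are rarely exact pins like this.

```python
from collections import defaultdict


def find_pin_conflicts(requirements):
    """Return dependencies that are pinned to more than one version.

    `requirements` maps a package name to {dependency: pinned_version}.
    The result maps each conflicting dependency to {version: [packages]}.
    """
    pins = defaultdict(lambda: defaultdict(list))
    for package, deps in requirements.items():
        for dep, version in deps.items():
            pins[dep][version].append(package)
    # A shared environment can hold only one version of each dependency,
    # so more than one distinct pin means the set is unsatisfiable.
    return {dep: dict(versions) for dep, versions in pins.items() if len(versions) > 1}


# Hypothetical data: "some-other-tool" is made up for this example.
system_packages = {
    "update-manager": {"python3-yaml": "5.3.1"},
    "some-other-tool": {"python3-yaml": "6.0.0"},
}

for dep, versions in find_pin_conflicts(system_packages).items():
    print(f"{dep} is pinned to incompatible versions: {versions}")
```

With one environment per package, each pin would be resolved on its own and the check above would never fire.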

Essentially, PEP 668 says that “system Python” should not be touched by the user, but I argue that such a globally mutable “system Python” shouldn’t even exist.

Isn’t that the job of a distro maintainer, though? Resolving all dependencies through the distro’s package manager?

Yeah, that’s true. I feel like the purpose of PEP 668 is to make the package managers’ lives easier by giving them an exclusive Python environment. However, that only works under the assumption that they can reliably install multiple Python packages into a single shared Python distribution, which I highly doubt.

On the other hand, if a package manager does give each package an exclusive virtual environment, then why bother marking any Python environment “externally managed”? They are virtual and invisible anyway.

Well… that’s something they have a LOT of experience with. I’ve used Debian for years and haven’t had problems.

Sure! That would totally work. All you need is to install a completely new set of packages and a new Python interpreter; why not go the whole way and put it in a VM? But that’s not a package manager’s job, that’s a completely different concept. Containerizing is not what we’re talking about here.

Well… that’s something they have a LOT of experience with. I’ve used Debian for years and haven’t had problems.

Neither have I, but maybe we have no issue installing official Debian packages because packages that have complex Python dependencies didn’t make it into the official package index in the first place? I am just speculating, but I find myself using pipx much more often than apt, so I suspect there is some survivorship bias at play.

Moreover, I wonder how long they can keep up. With more and more system packages having a Python dependency and different (incompatible) versions being released, the situation will only get worse.

But that’s not a package manager’s job, that’s a completely different concept. Containerizing is not what we’re talking about here.

If your argument is “distro/package managers are doing their job well and we should leave them alone”, then why don’t we just do nothing? After all, it’s their job to keep the users from touching the system Python. What exactly is PEP 668 trying to achieve?

I agree, and this is similar to what I suggested somewhere in these various threads. Every use of Python should be via an environment manager. The “system Python” could be one such environment, or it could be several.

FWIW, that’s what the EXTERNALLY-MANAGED file allows distros to indicate and for installers to know that they shouldn’t mutate it.
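As a rough sketch of the check an installer can make: PEP 668 places the marker file in the interpreter's stdlib directory (as reported by `sysconfig`) and only applies it outside a virtual environment. The function names below are my own, and this is not pip's actual code.

```python
import os
import sys
import sysconfig


def externally_managed_marker():
    """Path where PEP 668 expects the marker file for this interpreter."""
    # get_path("stdlib") uses the default install scheme when none is given.
    stdlib_dir = sysconfig.get_path("stdlib")
    return os.path.join(stdlib_dir, "EXTERNALLY-MANAGED")


def in_virtual_environment():
    """True when running inside a venv-style virtual environment."""
    return sys.prefix != sys.base_prefix


def should_refuse_install():
    """An installer should refuse only outside a venv and with the marker present."""
    return not in_virtual_environment() and os.path.isfile(externally_managed_marker())


print("marker path    :", externally_managed_marker())
print("in a venv      :", in_virtual_environment())
print("refuse installs:", should_refuse_install())
```

A Python built from source normally ships no such file, so the check comes back false there and nothing changes for that interpreter.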

As for not making it easy/trivial for users to access, yes, we’re in agreement. It is, however, a nontrivial change, and there seemed to be limited appetite from distros for unmixing for-the-user and for-the-system Python installations.

Again, that’s not our problem; if the Debian folks think that it’s impossible to handle the mess of dependencies, they can make their own decisions. This is no different from anything else. But to be honest, it’s not actually THAT common to have truly complicated requirements; usually, it’s just that you pick some key requirement (say, “Python 3.9”) and then find the version of everything else that will work with that. Yes, that’ll sometimes mean you don’t have THE LATEST of everything, but that’s the price you pay for consistency and stability.

That is, in fact, precisely what PEP 668 is trying to achieve. We want to leave the distro’s package manager alone. We want to make sure that, if you have Python 3.9.2 installed from apt and Python 3.12.0 built from source, they have independent site-packages directories, so pip won’t touch the one for 3.9.2 and apt won’t touch the one for 3.12.0. The latter half is already handled (when I install into /usr/local/bin/python3.12 and make a /usr/local/lib/python3.12/site-packages, that’s not going to be mentioned anywhere by any apt-managed package; in fact, I can even have a Python 3.9 that apt won’t touch, independently of the 3.9 that it does work with), so all we need to do is stop pip from messing with the directory that apt is managing.

Current versions of pip do this by screaming if the directory is owned by root (or if pip is running as root, not sure which). That’s far from ideal (the /usr/local/lib/python3.12/site-packages directory is owned by root in the above scenario), but it’s a start. What we really need is a way for apt to say “Hey, I’m taking care of this one”, and then pip won’t touch it.

THAT is what we’re working towards here. From my understanding this should be something that no built-from-source Python will ever need to worry about, but which Debian’s shipped Python will take advantage of. If I use “sudo python3 -m pip install websockets”, and that’s pointing into my 3.12 installation (which it does for me), then that should be fine, no problems, now I have a global installation of the websockets package. But the exact same command on a vanilla Debian system should complain on the basis that it’s trying to install into Debian’s Python, and it would be better to use “sudo apt install python3-websockets”. Obviously we can’t expect pip to know the exact apt package name to use, but at least, if pip refuses to install into the same directory, worst case you can just clean up your personal pip directory and be back to a clean Debian install.

If I’ve misrepresented the situation, please correct me.

I see. We are just worrying about different things. You think the distro package maintainers can handle the Python dependencies well as long as users don’t mess with their “system Python”, whereas I am more concerned about an implosion where the Python dependencies of distro packages inevitably conflict with one another. I guess only time will tell, so let’s wait and see.

By the way, I think we can further avoid user errors by removing the executables pip or pip3 from PATH and encouraging the usage of python -m pip instead. See this article for a rationale. Is there an existing PEP for this that I can upvote?

The pip documentation has long recommended this way of calling pip; see the User Guide (pip documentation v23.3.1).

That’s good to know! However, apparently not a lot of people listen to the suggestion, which is understandable since pip just has fewer letters to type. I am talking about removing the pip executable entirely and forcing people to use the more sensible python -m pip alternative. Otherwise people will never change their dangerous habit.
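For what it's worth, the difference is easy to see from a script. A bare `pip` is whatever executable comes first on PATH, whereas `python -m pip` always belongs to the interpreter you ran. A minimal sketch, where `pip_command` is a helper name I made up:

```python
import shutil
import sys


def pip_command(*args):
    """Build a pip invocation tied to the interpreter running this script."""
    return [sys.executable, "-m", "pip", *args]


# The bare executable is looked up on PATH and may belong to a different
# Python installation, or may not exist at all (shutil.which returns None).
print("interpreter:", sys.executable)
print("pip on PATH:", shutil.which("pip"))
print("command    :", pip_command("install", "websockets"))
```

Passing the resulting list to `subprocess.run` would install into the environment of `sys.executable`, regardless of what PATH says.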

(Are we getting off-topic here, or is it on-topic since pip is part of the packaging tool chain?)

Yes; ISTM that this whole discussion is rather off-topic, as the topic here is the (already approved and implemented) PEP 668 EXTERNALLY-MANAGED file that controls pip’s behavior rather than how distros should structure their own system Python installs (which is orthogonal to pip’s behavior), nor how pip should be invoked (which is orthogonal to the EXTERNALLY-MANAGED flag). I will move this to a new thread in the Packaging section.

EDIT: Since this is really two separate topics that are themselves orthogonal (distro system Python structure and pip vs. python -m pip), maybe I should have split this into two different threads, one for each, but since the latter point has had its ground retreaded countless times already I was hesitant to proliferate yet another thread on that specific topic, considering we can always do so at any point later if that discussion continues to develop independently.

FWIW, if I understand you correctly, this is pretty close to what package managers like Nix/Guix and distros like NixOS/Guix System do (some of whose maintainers are active on here). You might want to look into those if you’re interested in that approach.

The way Nix works is that every package is installed under its own prefix - not just Python packages but every package - meaning that you can pick and choose which packages to expose in the environment. This is in contrast to, say, apt which installs everything under the same prefix; installing a package implies mutating the environment. The same is not true for Nix.

Normally, if you’d install two Python packages with Nix side by side, they’d still be able to see each other just as they would with apt (Nix pieces together a $PYTHONPATH). Nix is also able to build Python “applications”, exposing only the package’s console scripts in much the same way.
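To illustrate the idea of piecing together a $PYTHONPATH from per-package prefixes, here is a small sketch. It is not how Nix itself is implemented; the store paths, the `<hash>` placeholders, the Python version, and the `build_pythonpath` helper are all made up for the example.

```python
import os


def build_pythonpath(prefixes, python_version="3.11"):
    """Join the site-packages directory of each package prefix into one PYTHONPATH."""
    site_dirs = [
        os.path.join(prefix, "lib", f"python{python_version}", "site-packages")
        for prefix in prefixes
    ]
    return os.pathsep.join(site_dirs)


# Hypothetical prefixes: every package lives under its own directory,
# and only the ones listed here become visible to the interpreter.
selected = [
    "/nix/store/<hash>-python3.11-pyyaml",
    "/nix/store/<hash>-python3.11-requests",
]
print(build_pythonpath(selected))
```

Leaving a prefix out of the list hides that package from the environment without uninstalling anything, which is the contrast with apt drawn above.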

I think what you described is closer to the Docker approach, which is the exact opposite of what Nix does. My understanding is that containers like Docker ship a copy of the entire Linux userland, whereas Nix stores all dependencies in a global immutable directory, and references them from packages.

I actually believe the Nix approach is superior since it avoids unnecessary copies, but Nix has been around for 20 years and has never taken off, so I assume there must be a lot of technical complexity if we take that path. On the other hand, Docker is very well-received and Python users have no issue with creating a separate virtual environment for each project, so I guess that approach will work for distro packages as well.

The thing is, figuring out how to resolve conflicts between packages within their distribution is something that is wholly the responsibility of that distribution to solve, and is explicitly one of the tasks that they take on as system integrators to figure out how they are going to do it. Most of them have opted to have a single version of a dependency, and they will have a single set of packages that work together, opting to make patches or hold back new versions or what have you in cases where that isn’t possible [1].

If at some point they end up in a situation that they can’t resolve, then it’ll be up to them to decide what to do. Certainly using an approach like pipx is one of their available options, and this PEP doesn’t prevent that.

What this PEP does do, is provide a mechanism for the distribution and Python tooling to coordinate, when the distribution wants that to happen. Without the distribution doing something, this PEP does nothing, it exists purely for Python distributors to more explicitly coordinate with Python level tooling, and if they end up arriving at a solution that doesn’t need that… then they just don’t opt into it. If all of the downstream distributors end up doing that, then this PEP ends up being a funny quirk in history, but we have quite a few of them already.
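As a sketch of the distribution side of that coordination: PEP 668 specifies the marker as an INI file readable by `configparser`, with an `[externally-managed]` section whose `Error` key holds the message the installer shows. The sample text and the `read_error_message` helper below are hypothetical; each distro writes its own wording.

```python
import configparser

# Hypothetical file contents; a real distro ships its own message.
SAMPLE_MARKER = """\
[externally-managed]
Error=To install Python packages system-wide, try apt install python3-xyz.
 Alternatively, create a virtual environment with python3 -m venv.
"""


def read_error_message(text):
    """Return the distro-provided message, or None if the file is unusable."""
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read_string(text)
        return parser["externally-managed"]["Error"]
    except (configparser.Error, KeyError):
        # A missing or malformed section means there is no custom message to show.
        return None


print(read_error_message(SAMPLE_MARKER))
```

A distribution that never ships the file simply never triggers any of this, which matches the opt-in nature described above.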


[1] Sometimes they even vendor things.

Well distributions have been working fine like this for decades. It might be a somewhat long wait.

Note that nowadays we have cachix/nixpkgs-python on GitHub (“All Python versions, kept up-to-date on hourly basis using Nix”), which allows you to get pretty much any version of Python using Nix.