PEP 668: Marking Python base environments as "externally managed"

pradyunsg · October 13, 2022, 1:20am

Is the intended goal for Fedora to allow /usr/local installations with pip, without flags?

If so, are there any specific reasons for preferring this over the PEP’s recommendations to nudge users toward virtual environments by default?

h-vetinari · October 13, 2022, 2:30am

Normally I’d say “each environment needs its own /lib path”, but since this topic is specifically about the “base environment”, I think whatever makes up the base environment [1], including the /lib that goes along with it, needs to be externally managed.

[1] To me, this includes the system Python, but we may have different interpretations of “base environment”.

encukou · October 13, 2022, 9:31am

It might not be the best long-term solution, but it works now and changing it will break things for some users. I’d be happier with nudging users rather than breaking them.

The PEP 668 way covers use cases I can think of, like:

venvs with --system-site-packages should solve issues with system libraries that pip can’t install (e.g. dnf, selinux).
applications with pip-installed plugins should probably manage a virtualenv, or add its own path entry for the plugins
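
As a rough illustration of the first use case, here is a minimal sketch that creates a virtual environment which can still import system-provided packages. It uses the standard library `venv` module; the helper names are made up for this example, and `with_pip=False` is an assumption chosen only to keep the sketch independent of `ensurepip` (the command-line form `python3 -m venv --system-site-packages <dir>` installs pip by default).

```python
import venv
from pathlib import Path


def create_venv_with_system_packages(target: Path) -> Path:
    """Create a venv at `target` that also sees the system site-packages.

    Returns the path to the generated pyvenv.cfg file.
    """
    # with_pip=False is an assumption for this sketch, see the note above.
    builder = venv.EnvBuilder(system_site_packages=True, with_pip=False)
    builder.create(target)
    return target / "pyvenv.cfg"


def system_site_packages_enabled(cfg_path: Path) -> bool:
    """Report whether pyvenv.cfg asks for the system site-packages."""
    for line in cfg_path.read_text().splitlines():
        key, _, value = line.partition("=")
        if key.strip() == "include-system-site-packages":
            return value.strip().lower() == "true"
    return False
```

Packages installed with pip inside such an environment land in the environment's own site-packages, while modules that only the distro ships stay importable from the system location.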

I probably forgot something, but anyway: switching to the new way will take time & effort. If we add EXTERNALLY-MANAGED now and pip decides to honor it, things will break for users (both installing to /usr/local/ and with --user). I’m worried that many users will reach for the --break-system-packages hammer and leave that in their scripts forever.
Things would be easier if we could e.g. agree on some kind of INSTALLING-HERE-IS-DEPRECATED-BUT-SAFE marker file for individual path entries, and USER-HOME-INSTALLS-ARE-DEPRECATED-BUT-SAFE, at least for a few releases.

hroncok · October 13, 2022, 10:24am

The high-level goal in Fedora that I thought this PEP will help me achieve is:

local means /usr/local/lib.../python.../site-packages
system means /usr/lib.../python.../site-packages

`sudo pip install` will install to local

This is already the case in Fedora due to our patch to sysconfig.

`sudo pip install --upgrade` will install to local and only uninstall from there, never uninstall from system

This is already the case in Fedora due to our patches to sysconfig and pip.
Users who pip install --upgrade pip unfortunately lose this protection because they undo our pip patch [1].

`sudo pip install --prefix=/usr` will error with the message from `EXTERNALLY-MANAGED`

This is not the case for Fedora yet.

`sudo pip install --prefix=/usr --break-my-system` will let users install to system

This is currently moot for Fedora, due to the previous point.

Maybe I simply had bad expectations about this PEP. Sorry for not making this clearer before it was approved. I tried to stay on top of this and then I missed the train when it suddenly got moving.

[1] I’ve just noticed the KeyError comment in the patch is bogus, feel free to ignore it.

pradyunsg · October 13, 2022, 11:25pm

I’m gonna use this to sneak in a possibly-paraphrased quote that I really like (I first heard it from John Green):

Long term systemic problems require long term systemic solutions.

This is tackling a UX/correctness problem that has calcified over a long time. I promise I’m not trying to hurry this along in a disruptive manner.

FWIW, I think it’s sensible for redistributions to disable user installs with site.ENABLE_USER_SITE being modified (it supports a default of False, instead of the regular None). I don’t recall if we stated that in the PEP as a recommendation.
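
For readers unfamiliar with the flag, here is a small sketch of how its three states can be read at runtime. `site.ENABLE_USER_SITE` is the real attribute; the helper name and the wording of the returned strings are made up for this example and paraphrase the `site` module documentation.

```python
import site


def describe_user_site(flag=site.ENABLE_USER_SITE) -> str:
    """Translate the tri-state ENABLE_USER_SITE flag into a short description."""
    if flag is True:
        return "user site-packages is enabled and on sys.path"
    if flag is False:
        return "user site-packages was explicitly disabled"
    # None: disabled for security reasons or by an administrator.
    return "user site-packages is disabled (security check or administrator)"


print(describe_user_site())
```

A redistributor changing the default to False in its copy of `site.py` would make the second branch the normal case for the system Python.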

These together would basically mean that the system-provided Python can only be managed with a system package manager and all other install-via-pip use cases are delegated to virtual environments.

I guess one thing we could do to ease transitions is enable externally managed environments to present warnings and continue, instead of just erroring out, by supporting a Warning key in addition to Error. I’m unsure how useful that would be TBH, but doesn’t hurt to throw the idea out there.
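
To make the idea concrete, here is a hedged sketch of reading the marker file. PEP 668 specifies an INI-style file with an `[externally-managed]` section and an `Error` key; the `Warning` key below is purely the hypothetical from this post and is not part of the PEP or of pip. The function name and the sample text are made up for illustration.

```python
import configparser

# Sample marker content, invented for this example.
SAMPLE_MARKER = """\
[externally-managed]
Error=This environment is managed by the system package manager.
 Use a virtual environment instead.
"""


def read_marker_messages(text: str) -> dict:
    """Return the Error message and the (hypothetical) Warning message."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section("externally-managed"):
        return {"error": None, "warning": None}
    section = parser["externally-managed"]
    return {
        "error": section.get("Error"),
        # Hypothetical key: not defined by PEP 668.
        "warning": section.get("Warning"),
    }
```

With such a scheme, a tool could warn and continue when only `Warning` is present, and refuse when `Error` is present, though nothing like that is specified today.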

Right now, this PEP says that we will basically disallow installing in both local and system, if the marker file exists.

Broadly, this PEP isn’t trying to solve the /usr vs /usr/local problem (that’s a sysconfig design problem to be solved separately). It’s trying to remove any possibility for users to break their system-package-manager-managed environment by adding/modifying/removing arbitrary files using (a new-enough) pip.

Another way to think of this: this gives redistributors the ability to tell Python tooling “hey, don’t meddle with files in /usr based schemes” (which… as noted affects /usr/local).

That said, part of the motivation behind suggesting setting the schemes explicitly and separately (at least, as far as I remember) was to make it easier to support the sysconfig changes and to eventually have better support for the /usr vs /usr/local situation via a CLI argument in pip in the future. It could enable a hypothetical pip install --scheme=posix_local/posix_global to modify files there (which could imply ignoring externally-managed, or require an extra flag). I’m imagining that a future functional change to sysconfig would implement the posix_local as a default (or simplify implementing/patching it) and make some other changes to actually properly support this.
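
As background on what a "scheme" is, here is a minimal sketch that asks `sysconfig` where pure-Python packages would go for a given base directory. It uses only the upstream `posix_prefix` scheme; `posix_local` is a distro-specific name and is not assumed to exist. The helper name is made up, and distro-patched Pythons may resolve these paths differently from upstream.

```python
import sysconfig


def site_packages_for(base: str, scheme: str = "posix_prefix") -> str:
    """Expand the scheme's 'purelib' path template for the given base."""
    return sysconfig.get_path("purelib", scheme, vars={"base": base})


# Schemes known to this interpreter (distros may add their own).
print(sysconfig.get_scheme_names())
print(site_packages_for("/usr"))
print(site_packages_for("/usr/local"))
```

On an unpatched CPython the two printed paths differ only in their prefix, which is the /usr vs /usr/local distinction the thread keeps returning to.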

PS: The --prefix approach is a neat idea, however (without patches) regular/vanilla pip installs already use /usr as userbase and posix_prefix as the scheme.

brettcannon · October 14, 2022, 7:02pm

But there are enough core devs here to make changes happen. I think a clear goal for sysconfig along with expectations of how it is to be used would be good. Then we can make the changes upstream since it’s one of those things that has to ship in-box.

pf_moore · October 14, 2022, 9:23pm

Agreed. It should probably be a separate topic dedicated to collecting use cases for sysconfig, rather than derailing this thread, though.

pradyunsg · October 15, 2022, 9:19am

Adding a cross reference to the thread “Linux distro patches to `sysconfig` are changing `pip install --prefix` outside virtual environments”, since the distro needs around /usr vs /usr/local were discussed there too, and the point of that thread is to figure out and discuss a design for a solution to the problem.

stefanor · February 3, 2023, 2:58pm

Debian is getting ready to implement this, for Debian 12 (“bookworm”). This will carry into Ubuntu 23.04 (“lunar”) too.

python-pip version 23.0+dfsg-1 includes PEP668 support, upstream. This version warns users (who have apt-listchanges enabled) about the feature.

python3.11 version 3.11.2-1 or 3.11.1-3 (depending on timing) will declare itself to be EXTERNALLY-MANAGED. This version will carry a README explaining the situation.

We are already hearing from concerned users, whose workflows are going to get broken. I’m expecting some more of this. I wish it wasn’t right before our freeze in Debian, but that’s the timing that this worked out at. If necessary, we can roll back EXTERNALLY-MANAGED in our python3.11 for bookworm’s release, but I’d like to make this happen…

pradyunsg · February 3, 2023, 3:01pm

Excited to hear that Debian’s adopting this! ^>^

I think the Debian message could likely mention --system-site-packages flag (and equivalents)? That’ll likely resolve the main concern that the user had (sorry, didn’t click through for the replies that might’ve been sent already).

stefanor · February 3, 2023, 3:40pm

He just followed up himself to say it’s an option, but not a great option.

uranusjr · February 5, 2023, 6:08am

I’m likely missing context, not being a Debian user, but I would freak out when I read this, if I don’t know what’s actually going on:

Practically, this means that you can’t use pip to install packages outside a virtualenv, on a Debian system, any more.

This sounds like Debian is breaking me, and I will have no choice but to either use virtualenv, or switch system entirely. But in fact I still can install packages outside a virtualenv on a Debian system; I just can’t do that against the Python installation(s) Debian provides. I am not sure if (how) this subtle difference can be significant to certain people, but do wonder whether the message can be tweaked to be less absolute.

stefanor · February 5, 2023, 4:11pm

That’s a fair point, I’ll try to get that across.

actualben · February 6, 2023, 5:28pm

A number of my Alpine-edge based Linux container builds are already broken because of this PEP’s somewhat contradictory guidance regarding containers.

The impact of deciding that “if you want to use pip in a container you must use a venv now” adds ~15MB per container image with no additional functionality. The entire busybox image can fit in the venv overhead 4 times over. Here are the Dockerfiles I used and the resulting image sizes (arm64 architecture on 6-Feb-2023):

without venv - 84.2MB

FROM alpine:3
RUN apk add --no-cache python3 py3-pip;
WORKDIR /app
RUN set -eux; \
  pip install requests; \
  pip cache purge
ENTRYPOINT /bin/sh

with venv - 99MB

FROM alpine:3
RUN apk add --no-cache python3 py3-pip;
WORKDIR /app
RUN set -eux; \
  python3 -m venv venv; \
  . venv/bin/activate; \
  pip install requests; \
  pip cache purge
ENTRYPOINT /bin/sh

dstufft · February 6, 2023, 5:37pm

The question isn’t really about containers, it’s about who manages the Python that you’re installing into. So for instance, the official Python containers, which have a dedicated Python install that isn’t managed by the OS’s package manager, should not be marked as externally managed, but rather should be managed by pip etc.

I believe the idea is that there is going to be a flag you can pass to override the externally managed file marker, so you won’t be forced to use a virtual environment. However, system tools rely on system python with system libraries even inside of containers, so you very well may break your system if you’re installing things into your system python using pip.

actualben · February 7, 2023, 2:12pm

I understand where you’re coming from here, but I am talking about the specific case of python in containers made by distros like Alpine and Debian. I think we disagree on whether containers are “special”. Some of my concerns are echoed in the PEP:

A distro Python when used in a single-application container image (e.g., a Docker container). In this use case, the risk of breaking system software is lower, since generally only a single application runs in the container, and the impact is lower, since you can rebuild the container and you don’t have to struggle to recover a running machine. There are also a large number of existing Dockerfiles with an unqualified RUN pip install ... statement, etc., and it would be good not to break those. So, builders of base container images may want to ensure that the marker file is not present, even if the underlying OS ships one by default.

So in a way the PEP acknowledges the breakage it will cause in the container world but then goes on to recommend the breakage anyway, by saying “Keep the marker file in container images”. We’ve got ~9 years of people working this way generally without significant breakage and if things did break they could always have added venv on their own without outside steering.

I’m arguing that containers are an exceptional case. In the above example Dockerfiles (which aren’t too far from what people do in real images) I installed python immediately before first using pip - so no OS-level tools are there to break, except maybe pip itself. I’m generally not going to use python to shell-out to an os-owned python program.

Anyway if the idea is to always use venv in containers because there could be package conflicts then in addition to the venv overhead we’ll have the os-owned version of some packages in the container as well as the venv-owned version of those same packages – and that’s not a great outcome for a container image author. Wasted space and a larger number of packages that might get flagged in vulnerability scans are established as anti-practices in the container world.

I’m not shipping a stable OS that happens to have an app in it - I’m shipping a packaged app. I think it’s like a race car where you rip out most of the interior so that what remains is optimized for the car’s single function. Anyway I await the --break-system-packages flag making it into the distros that already have EXTERNALLY-MANAGED implemented. I wanted to give some feedback directly instead of having it arrive second hand.

BTW if I’m not mistaken I think this PEP breaks PEP 370 for the majority of Python users (who have installed Python via a package manager) but maybe I’ve misunderstood.

pf_moore · February 7, 2023, 2:59pm

Given that you’re configuring the container, and are freely using root permissions to do so, why not just remove the EXTERNALLY-MANAGED file yourself, before you start running pip? If you’re asserting control over the full system stack in the container, you’re entirely within your rights to do that.

kpfleming · February 7, 2023, 3:33pm

In addition, you could request that the creators of the base image that you are using do that when they create their image, so that consumers of that image don’t need to deal with this at all.

merwok · February 7, 2023, 3:38pm

The quoted passage recommends the opposite! It says that even if Debian is shipping the marker file, the debian docker image could remove it to avoid the issues.

dstufft · February 7, 2023, 5:27pm

I’ll point out that the PEP doesn’t actually require that EXTERNALLY-MANAGED be used in any specific case, it just defines what happens if that file exists. There is a section that is explicitly marked as non-normative where the PEP offers some recommendations on what the PEP authors think would be best practices in varying conditions, but distros are free to ignore those recommendations if it makes sense.

For the container use case, you’re using a container that doesn’t ship with Python, but that you’re ultimately installing it yourself through the package manager. There’s not a good way for the package to differentiate between your installation that is “safe”, and a standard installation that includes several tools that depend on system packages where it is unsafe.

You’re ripping out most of the interior sure, but you’re also “buying” (downloading) normal street car parts (the apt install python3 package) and expecting them to be satisfactory for your race car use case out of the box.

I think the recommendations in the PEP are still the right thing to do by default here, but if you’re willing to take the functionality of your container image into your own hands, there are options for you to do that:

Delete the EXTERNALLY-MANAGED file.
Use the --break-system-packages flag once it’s available.
Ask the OS distributors to provide a way to configure the python package to omit the EXTERNALLY-MANAGED file.
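
The first option can be sketched as a small build-time step. This is only an illustration under the assumption that you own the whole image and accept the risk of pip overwriting distro-managed files; the function name is made up, and by default it targets the `stdlib` directory reported by `sysconfig`.

```python
import sysconfig
from pathlib import Path

MARKER_NAME = "EXTERNALLY-MANAGED"


def remove_marker(stdlib_dir=None) -> bool:
    """Delete the marker file if present; return True when a file was removed."""
    if stdlib_dir is None:
        # Default location of the marker for the running interpreter.
        stdlib_dir = sysconfig.get_path("stdlib")
    marker = Path(stdlib_dir) / MARKER_NAME
    if marker.is_file():
        marker.unlink()
        return True
    return False
```

Calling `remove_marker()` as root during the image build, before any pip invocation, has the same effect as deleting the file by hand.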
