PEP 704 - Require virtual environments by default for package installers

Well, except it breaks Conda and other non-venv forms of environment isolation, because the --user directory if present gets inserted before sys.path in the Python module search order; and the workaround in CPython to bump the activated environment path above this is a venv-specific bespoke site hack and doesn’t work more generally. This has caused much grief for Conda users, as even reinstalling Conda (much less a specific environment) won’t remove these potentially conflicting packages; users have to specifically know where to go looking for them to remove them.
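To make the ordering problem concrete, here is a minimal sketch that reports where the per-user site directory sits in sys.path relative to the first other site-packages entry. It uses only the standard `site` and `sys` modules; the helper name `user_site_report` is made up for illustration, and the suffix match on "site-packages"/"dist-packages" is a heuristic, not something CPython defines.

```python
import site
import sys


def user_site_report():
    """Describe where the per-user site directory sits in sys.path (illustrative helper)."""
    user_site = site.getusersitepackages()
    report = {
        "user_site": user_site,
        # ENABLE_USER_SITE can be None or False when the user site is disabled.
        "enabled": bool(site.ENABLE_USER_SITE),
        "user_site_index": None,
        "first_site_packages_index": None,
    }
    for index, entry in enumerate(sys.path):
        entry = str(entry)
        if entry == user_site:
            if report["user_site_index"] is None:
                report["user_site_index"] = index
            continue
        # Heuristic: treat any other entry ending in these names as environment-level.
        if (
            entry.endswith(("site-packages", "dist-packages"))
            and report["first_site_packages_index"] is None
        ):
            report["first_site_packages_index"] = index
    return report
```

If both indexes are set and `user_site_index` is the smaller one, packages in the user directory shadow same-named packages in the environment, which is the conflict described above.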

For that and other reasons, pip and others have apparently started moving away from recommending --user. See the discussion in this thread, which also ended up on the topic of requiring an activated venv by default, and the resulting Conda incompatibility:

I agree with that as well. That’s maybe a less extreme version of what I’ve been calling a “manager-first” approach, i.e., that every Python should be “in an environment”, and the top-level installed thing should not be Python but an environment manager.

I don’t think that user-site fills this role, because user-site is not an independent environment.

In practice it might work most of the time, but the concepts of “user’s space” and “not the primary install” are orthogonal – someone could install all of Python into their home dir (conda certainly does), and on a multiuser system, you might not want all users to have to maintain the whole system.

And the “per-project-ish directory” is a horrible idea for many workflows :frowning:

I know, I didn’t implement it, but these seem like obvious things that the venv implementation could have just overridden. Maybe it still could, but we’re so wary of breaking things we won’t even touch that (I was probably the last person to touch path calculation, and that was a nearly-perfect reimplementation - and I know it was nearly perfect because the extreme edge cases that did change actually broke people and they came and complained).

pip has no choice but to tell users to avoid user site-packages. They can’t fix it. And the core team/steering council “doesn’t care” about packaging, so the only thing changes cause is complaints.

But my point is that the facilities exist, because they’ve been tried. Suggesting them again is fine, but it has to come with some recognition of past attempts and some justification for why it will work this time.

I’ll try to weigh in from the Spack perspective here.

I guess I should do this more, as I don’t get the impression that folks over here in Python packaging understand the package installation model in Spack/Nix/Guix/etc. particularly well, and that’s probably our fault. I don’t try to impose Spack opinions too much on Python packaging, because we’re really more like a distro than a Python tool, and many of the things we implement (like environments) are mostly distinct from core python. So it can seem futile to speak up here. I’ve been inspired by recent packaging discussions, though, as it seems like the community is starting to realize that scientific computing, HPC, and AI use cases (where we have to integrate with C, C++, Fortran, Lua, Julia, etc.) are not so niche after all. We really are attacking a more general problem (albeit with a smaller user base).

Also, people use Spack to develop Python code, so we share a lot of similarities with tools like pip as well as distros. It would be nice to provide a familiar user experience to Python people in Spack.

Using Python Packages

I think most folks in the community think there are about two main ways to install Python packages:

  • Installing in the python interpreter, and
  • Installing in a virtual environment.

I agree the second one’s way better – and people should isolate their python package installations (especially from the system one), but there are other frequently used mechanisms for isolating Python installations. In particular, many HPC sites, as well as Spack, nix, and guix will install every package into its own prefix, and the user can pick what to load using one of:

How Spack does it

In Spack, we also support isolated environments, but they’re implemented by symlinking (or hardlinking, or outright copying and relocating) --prefix installs into the environment’s prefix (i.e., into site-packages). We use the old virtualenv trick of copying the interpreter and os.py, but we might switch to the venv mechanism eventually. In all of these cases, you can have arbitrarily many Spack environments with arbitrary combinations of packages in them, and we always start with a --prefix install that Spack then links into some Spack env.

If a user just wants to try out different versions of some Python package, they might install a few and load one:

$ spack install py-black@22.1.0
$ spack install py-black@23.1.0
$ spack load py-black@22
$ which black
/Users/gamblin2/src/spack/opt/spack/darwin-monterey-m1/apple-clang-14.0.0/py-black-22.10.0-qtvxbdnup7cuwbfj3lxzn3btv4m2myjl/bin/black

Spack users are used to this; they get that they need to spack load things to use them, and it’s nice because they can have as many versions of any package they like installed at once.

We specifically don’t support installing things into any particular Python interpreter prefix – we don’t want that. If users want Python installed with a bunch of packages, they make a Spack environment, which might look like, e.g., this:

spack:
  specs:
  - python @3.9.15
  - py-torch @1.12.1 +cuda +cudnn +mpi
  - py-pygments
  - py-mpi4py
  - mpich

That is then concretized (resolved), we spit out a spack.lock with all the specific dependency configurations (you can use this to reproduce the build), and the whole env gets linked together in what we call a view – a single prefix. The user can activate/deactivate the env with:

$ spack env activate .
$ spack env deactivate

And they’ll get all of those packages installed into a prefix that they can load/unload on demand.

Installation model

Some points to note here are that Spack would never install “into” a virtualenv like pip does, and Spack sets up the build env independently (and reproducibly) for every package. We don’t want a stateful environment for package installations. We want every package to be isolated from every other package, which is IMO more aggressive than requiring a venv to do an installation. So the requirement for a venv for us is superfluous.

The way this gets implemented is as you might expect – we set up PATH, PYTHONPATH, etc. in the build environment and we currently run pip install with --prefix and a bunch of other args (see here for the rest of them).
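As a rough illustration of the --prefix model (not Spack's actual code, and deliberately leaving out the other arguments mentioned above), the sketch below builds a pip command targeting a standalone prefix and computes where pure-Python modules would be expected under that prefix. The function names are hypothetical; `sysconfig.get_path` with overridden `base`/`platbase` is standard library behaviour, though the directory pip actually picks can differ on distro-patched Pythons.

```python
import os
import sys
import sysconfig


def prefix_install_command(requirement, prefix):
    """Build (but do not run) a pip command that installs into a standalone prefix."""
    return [sys.executable, "-m", "pip", "install", "--prefix", str(prefix), requirement]


def prefix_site_packages(prefix):
    """Pure-Python library directory under `prefix` for the running interpreter."""
    prefix = str(prefix)
    return sysconfig.get_path("purelib", vars={"base": prefix, "platbase": prefix})


def pythonpath_for(prefixes):
    """Join the library directories of several prefixes into a PYTHONPATH value."""
    return os.pathsep.join(prefix_site_packages(p) for p in prefixes)
```

"Loading" a set of per-package prefixes then amounts to exporting the string returned by `pythonpath_for(...)` as PYTHONPATH (plus the matching bin directories on PATH), with no virtual environment involved.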

We also do things like rewriting shebangs for each script to point to the specific Python that the script was installed with. We generally do not use /usr/bin/env python3. If we end up copying or hardlinking an environment into place, we’ll relocate the shebangs to point to the environment prefix, not the canonical installation directory.
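A minimal sketch of that kind of shebang rewriting, assuming a simple text script whose first line mentions python. This is an illustration only; the function name is hypothetical and Spack's real implementation handles more cases.

```python
def rewrite_shebang(script_text, interpreter):
    """Return script_text with a Python shebang replaced by an absolute interpreter path.

    Scripts without a shebang, or with a non-Python shebang, are returned unchanged.
    """
    first_line, newline, rest = script_text.partition("\n")
    if first_line.startswith("#!") and "python" in first_line:
        return f"#!{interpreter}{newline}{rest}"
    return script_text
```

For example, `rewrite_shebang("#!/usr/bin/env python3\nprint('hi')\n", "/opt/env/bin/python3")` yields a script that always runs with `/opt/env/bin/python3`, regardless of what is first on PATH.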

Nix and Guix have similar installation models – there may be slight differences, and they generally only support using things within a user “profile” (which is kind of like an environment), not through the one-off load/unload mechanism Spack has.

Thoughts on this PEP

I had a strong negative reaction to this PEP when I first read it, but I read it over a few times more and tried to follow the discussion here, and in the end I don’t think it affects Spack too much.

So here are some thoughts and concerns specific to the PEP:

  1. The PEP doesn’t currently say anything about the many installation modes that tools like pip support. In particular, I think it should say something about --prefix installs, and I would love it if it specifically called those modes out as exempted from the venv recommendation. If I’m doing a --prefix install, I really don’t care about whether there is a venv loaded. I know what I’m doing. With this added, I don’t think the PEP will break us at all.

  2. The PEP concerns me a little bit in that I think it can be misread (as I initially misread it) as encouraging every Python package to assume a virtualenv. This fear is possibly unfounded, but if people read the PEP this way, it will be really problematic for our model.

    I am imagining a future where Python developers misinterpret this PEP to mean that installing every Python package in a virtual environment is somehow the “best practice”. Then I worry that packagers are going to start assuming a venv in their setup.py / whatever other build tool they’re using, and they’re going to tell us Spack people that our --prefix installs are nonstandard and stop supporting them. I don’t know how they would make their packages break without a virtualenv (maybe they would require that the .venv directory exists or something), but Hyrum’s law tells me we may start to see packages that simply don’t install outside of a venv.

    IMO that would be bad, and would stifle innovation by de-facto requiring projects like Spack and Conda to implement only the “standard” venv semantics, when we’re trying to do something more versatile. To be clear, I don’t think that’s the intent of the PEP, but I think it could be an effect of this PEP if we’re not careful how we couch it for packagers.

  3. (minor) This language is a bit confusing to me:

    This PEP recommends that package installers like pip require a virtual environment by default on Python 3.13+.

    What’s an “installer like pip”? Is Spack one? Kind of… if you asked me for “tools like pip” I’d probably name conda and spack. But neither of those is likely to implement this PEP. So how should I interpret the first line of the PEP there if I want to try to play nicer with the Python community?

That’s it. I hope this is helpful and I’m happy to answer more questions. I’ll try to be better about participating in the discussions here :slight_smile:.

Thanks for the well-written and detailed background! Your contributions to packaging discussions have certainly been very much appreciated :slightly_smiling_face:

+1, assuming the PEP goes through in the first place. Although, since as you mention you’re already passing a long list of flags, it would at most just mean adding one more.

My concerns regarding the PEP center on its impact on distributions like Conda, Spack, Nix, etc., and on the assumptions that packaging tools may make about virtual environments that would be incompatible with them. By contrast, while it’s not outright impossible, it’s hard to think of any plausible motive or means for non-packaging-related packages to break when not installed in a virtual environment. The exceptions are highly specialized corner cases that (already) have some very specific intentional reason to, and already-pathological cases that would have to rely on incredibly hacky and fragile logic in a dynamic build script (i.e. a setup.py). The latter would presumably be considered fundamentally broken and not officially supported by modern packaging tooling, not the converse.

And with…

  • The world (slowly but inexorably) moving away from setup.py to declarative config files (pyproject.toml and backend-specific alternatives like setup.cfg)
  • Package authors migrating away from Setuptools, the one mainstream backend that uses dynamic setup.py build scripts in the first place and toward other modern backends that only allow declarative metadata and prevent such hackiness in the first place,
  • Setuptools itself moving to generally discouraging setup.py in favor of setup.cfg/pyproject.toml for the large majority of projects that don’t actually need it, and breaking lots of old hacky distutils-based stuff used in those dynamic setup.pys on a seemingly regular basis now

…these sorts of issues are gradually going away and seem unlikely to get worse rather than better even if this PEP is adopted.

I (and others) would really like to see this (and other key distinctions, like whether a Conda environment counts as a “virtual environment”) explicitly defined in the PEP as well, as it seems rather underspecified to me at the moment. However, as I understand it, the intention here is Python-specific installers that work with a standard Python distribution, e.g. PDM, Hatch, Poetry, Flit, etc., rather than general-purpose, Python-independent package installers that install Python itself (like Spack, Conda, Nix, etc).

This seems like a pretty key issue to fix in the Python ecosystem. Packaging is an important part of Python for many users, and it seems like it would be good to have one group (Steering Council seems like the logical option) to be ultimately responsible for the direction of the language and packaging.

Just chiming in on the “detecting venvs” question:

  • “sys.real_prefix” isn’t in the venv docs because it’s a virtualenv-specific workaround for the lack of native venv support in older Python versions: Determine if Python is running inside virtualenv - Stack Overflow
  • PEP 405 chose “sys.base_prefix” as the preferred name, and virtualenv also switched over to that in version 20
  • conda envs don’t get picked up as Python envs because they don’t provide a pyvenv.cfg file with “home” set

Checking for base_prefix != prefix has been the recommended way of detecting virtual environments since Python 3.3, so if conda/spack/et al want Python level tools to behave as if they’re already inside a virtual environment, then the way to indicate that is to define a suitable pyvenv.cfg file and provide it by default. The division of packages between the two doesn’t need to be the same as what venv creates for this to work (I suspect for conda et al it would make the most sense for the nominal base directory to contain symlinks back to env-specific Python installation directories). It would take some fiddling when developing the PRs for affected tools to make it work, but once done, PEP 704 would have zero impact on users of environment managers that advertised themselves as already providing Python-compatible isolated environments.
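The check described here can be sketched in a few lines. `sys.prefix` and `sys.base_prefix` are standard attributes; the `read_pyvenv_cfg` helper is a hypothetical, simplified parser for the `key = value` lines of a pyvenv.cfg file and is not how CPython itself reads the file.

```python
import sys
from pathlib import Path


def in_virtual_environment():
    """True when the interpreter is running inside a PEP 405 virtual environment."""
    return sys.prefix != sys.base_prefix


def read_pyvenv_cfg(prefix):
    """Parse `key = value` lines from <prefix>/pyvenv.cfg; return {} if the file is absent."""
    cfg = Path(prefix) / "pyvenv.cfg"
    if not cfg.is_file():
        return {}
    settings = {}
    for line in cfg.read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip().lower()] = value.strip()
    return settings
```

An environment without a pyvenv.cfg carrying a “home” key (a conda env, for instance) makes `in_virtual_environment()` return False, which is exactly the case this part of the thread is about.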

Thanks – this is great. I really hope that pip and python itself will try to be a good citizen within arbitrary other package management systems – good to get the details of others so we don’t just make something conda-friendly.

Maybe that would take care of it – but it really feels like a kludge to me – can’t we have a way (environment variable, or …) to just say:

“I’m managing stuff here, get out of the way”

rather than faking a virtual environment?

sys.base_prefix != sys.prefix is a very specific check for a virtual environment, not a generic isolated environment. Setting that is technically incorrect, and is likely to run into a bunch of corner cases - in particular for build and environment related tools. @jezdez and I happened to have discussed this just last week, and decided it wasn’t a healthy idea for conda to do.

If there’s a need to identify isolated environments rather than virtual environments, then that should be a real concept in Python.

Also, quoting from my comment above that addressed this topic, it wouldn’t really make sense for Conda to set sys.prefix != sys.base_prefix, as that would be contrary to their currently specified, documented and understood meanings:

I agree, this is not the right thing for conda to do. (If only because getpath.py is holding together with really really cheap and nasty duct tape, and any attempt to induce it to do something specific is going to be disastrous.)

It is a real concept, it’s just so widely ignored due to how Linux distros lay out their Python installs that sys.prefix isn’t actually a good way to tell you where your isolated Python install is (contrast with Windows and Conda, where sys.prefix is entirely sufficient to tell you where your isolated Python is).

But this isn’t the axis that’s relevant here. The problem is the same as it’s always been - when a Python environment is externally managed then it shouldn’t be managed by tools that don’t know how to manage it.

Whether it’s “isolated” or not really just depends on whether you’re going to use it for multiple apps - it’s not a function of Python itself, but the user’s intent (or the *waves hands* administrator’s intent, e.g. if a Linux distro wants a Python environment just for their in-built apps, then they intend it to be isolated, and need to take their own steps to achieve that).

FYI this is off-topic for this PEP. If people want to discuss it then I am going to ask they start a new topic.

The Wanting a singular packaging tool/vision thread is literally on this topic, so it can simply move there (and anyone wanting to discuss it should review the ground we’ve covered so far :wink: )

Yes, that’s EXTERNALLY-MANAGED.

That isn’t what the conda devs want though (outside the base environment), since they want to allow pip et al to still be used to manage non-conda packages - the breakage in PEP 704 for conda is that pip would stop installing into conda environments (since they don’t register as virtual environments in the venv sense, and making them do so would require contorting conda’s Python installation in admittedly weird and wacky ways).

I’m honestly not sure what aspect of the problem space PEP 704 is designed to address given that PEP 668 has been accepted and the implementation is being rolled out:

  • environments that don’t want pip (et al) to be used at all can set EXTERNALLY-MANAGED
  • environments that do want pip (et al) to be used may want to restrict or influence what Python-specific tools do, but requiring the providers of those environments to create or emulate a virtual environment just to get back the functionality they already have today would be a pointless inconvenience rather than something that improved their level of influence or control

Rejecting or withdrawing the PEP on that basis would make more sense to me than trying to morph it into covering a different idea (and I say that as someone that has published PEPs where the idea that I ended up submitting was wildly different from what I proposed in the initial drafts. Sometimes it’s useful to go that route, but I don’t think this is one of those cases, since the presented idea is quite clear. I just don’t think it will help, and it will definitely hurt).

The rejection/withdrawal might provide inspiration for other ideas, but those can always be pursued separately.

My feeling is that the PEP covered two different purposes:

  1. Putting in writing somewhere the common convention that a project virtual environment should be called .venv and be located in the project directory.
  2. Checking whether pip’s plan to disable installing into the system environment by default had any major flaws.

For (1), I think it’s a worthy goal to try to formulate some “best practices” in this area, so that tools and IDEs can set comfortable defaults. I’m not sure there’s sufficient consensus for this to be a standard, though (PEP 704 doesn’t attempt to do that, to be clear) so I feel that a PEP is the wrong place for such a guideline (much like we wouldn’t try to make the “src layout” for projects a PEP).
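If tools did settle on that convention, discovery could be as simple as the sketch below. Walking up through parent directories and requiring a pyvenv.cfg inside the candidate are my assumptions for illustration, not something the PEP or any existing tool specifies; `find_project_venv` is a hypothetical name.

```python
from pathlib import Path


def find_project_venv(start, name=".venv"):
    """Walk up from `start` and return the first `name` directory containing pyvenv.cfg.

    Returns None when no such directory exists on the way to the filesystem root.
    """
    start = Path(start).resolve()
    for directory in (start, *start.parents):
        candidate = directory / name
        if (candidate / "pyvenv.cfg").is_file():
            return candidate
    return None
```

An IDE or launcher could call `find_project_venv(Path.cwd())` and fall back to asking the user when it returns None.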

For (2), I think it’s clear that pip doing this would cause problems, for conda at least. I’m not sure that the “pip design discussion via PEP” approach was ideal, and I’d be reluctant to use it in future, but it’s got the information that we need. At this point, I think the PEP can be withdrawn or rejected and the pip developers can take the information and decide how to proceed. To be clear, though, I do not think that rejection would imply “pip must not do this” - pip’s feature set is not controlled by PEPs in that way. But withdrawal is probably better to avoid any confusion.

Indeed. My plan is to withdraw the PEP at this point – I mostly need to find time to write up the withdrawal notice that I’d place on the PEP when I do so. :slight_smile:

Functionally, mostly. Though the wording of the error message (set by pip, not the one that can be overridden in the EXTERNALLY-MANAGED file) is a bit specific (I’ve put in a PR for that), and the flag name is even worse.

But it’s actually kinda the opposite of what I (not necessarily others in the conda community) want – EXTERNALLY-MANAGED means “disable pip altogether” – which is pretty much what we all do want for “base” environments, but for other conda environments, maybe we don’t want pip disabled, but want pip to behave differently – a “minimal pip”.

Yes, but I at least would like to see it used in a more controlled way – but that’s a conda problem :slight_smile:

Anyway, I think that the recent addition of the ability to override pip defaults may work for what I have in mind – experiments underway. And if so, then there’s nothing left to be done by the pip folks.

@pf_moore and @pradyunsg: Sounds great – thanks. I think this PEP did prompt some great discussion and better understanding among all.

If this is something people want, I’m obviously interested/motivated to help with this. I tried to get some consensus in Setting up some guidelines around discovering/finding/naming virtual environments - #57 by pradyunsg , but it didn’t get far as it was discussing too many use cases at once.

python -m venv currently prints an error:

venv: error: the following arguments are required: ENV_DIR

Could it instead default to behave like python -m venv .venv? (If that is / will be the recommendation.)

That would look less confusing and overwhelming to beginners, is easier to type, reduces the amount of details to lookup or remember, feels more “official”, and has no obvious downsides or impact on more advanced use cases or users that prefer a different name.
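A small wrapper shows what the proposed default could look like. This is a sketch of the suggestion, not a patch to the real `python -m venv` command line; the names `parse_env_dir` and `main` are hypothetical, and `with_pip=False` is used only to keep the example light (the real command installs pip by default).

```python
import argparse
import venv


def parse_env_dir(argv=None):
    """Return the target directory, defaulting to .venv when none is given."""
    parser = argparse.ArgumentParser(prog="venv-default-sketch")
    parser.add_argument(
        "env_dir",
        nargs="?",          # makes the positional argument optional
        default=".venv",    # the proposed default location
        help="directory to create the environment in (default: .venv)",
    )
    return parser.parse_args(argv).env_dir


def main(argv=None):
    """Create a virtual environment in the chosen (or default) directory."""
    env_dir = parse_env_dir(argv)
    venv.EnvBuilder(with_pip=False).create(env_dir)
    return env_dir
```

Calling `main([])` would create `./.venv`, while `main(["my-env"])` keeps today's explicit behaviour, so users who prefer another name are unaffected.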
