Slightly OT, but it would be great if PEP 582 could be brought up there too. PEP 704 seems to be getting most of the attention on the Conda front, and I’ve seen much less attention to the Conda implications of PEP 582, despite it potentially being at least as disruptive if not mitigated/prepared for, since it essentially forces the same pip/conda prefix split discussed here, in addition to several other potential impacts.
You should probably make sure @kushaldas is aware of any such discussion. I don’t get the impression he follows the broader packaging discussions, so keeping him in the loop explicitly (on the dedicated PEP 582 thread) is probably advisable.
I guess I was more thinking of ensuring the Conda folks were fully aware of the implications of PEP 582 as well, so they could reach out on the PEP 582 thread, as opposed to relying on Kushal to initiate that. But I’ll mention that over on that thread as well, thanks.
I don’t think PEP 582 has a similar impact at all. PEP 704 contains two topics:
1. A name for a default virtual env – chosen to be `.venv`. This is fairly similar to PEP 582’s `__pypackages__`.
2. A change to `pip` and other Python package installers to only install into virtual envs.
(2) has no analog in PEP 582 (unless I am completely misreading that PEP) and is a major backwards compatibility break. (2) does not work at all for Conda, so it needs a global opt-out (not per user, but something conda can set permanently). In this thread it has already been identified that (2) also doesn’t work in very common scenarios like CI config scripts and Docker use. It similarly won’t work for `pip` usage in Spack and other packaging environments, nor for packagers who use `pip` in their distro packaging workflows.
I don’t have a strong preference for (1) between what PEP 582 and PEP 704 propose, but (2) is major breakage for many scenarios for both end users and packagers, and therefore I think that it should simply be removed from PEP 704.
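To make the conda incompatibility concrete, here is a rough Python sketch (not pip’s actual implementation; all function names are hypothetical) of the checks involved: the usual “am I in a virtual env” test that installers rely on, which conda environments fail, and the extra recognition a conda-aware version of (2) would need:

```python
import os
import sys

def in_virtual_env() -> bool:
    """The check most installers use: inside a venv/virtualenv,
    sys.prefix differs from the interpreter's base prefix."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)

def in_conda_env() -> bool:
    """Conda envs are full installations (prefix == base prefix),
    so the venv check above fails for them; conda instead sets
    CONDA_PREFIX when an env is activated."""
    return bool(os.environ.get("CONDA_PREFIX"))

def install_allowed(require_env: bool) -> bool:
    # A venv-only rule, as in part (2) of PEP 704, would reject an
    # activated conda env unless conda envs are explicitly recognized.
    return (not require_env) or in_virtual_env() or in_conda_env()
```

Without the `in_conda_env()` branch (or a global opt-out that conda can set), the `require_env=True` case would refuse to install into any conda environment.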
If “virtual env” were loosened to something that a conda environment can satisfy, then it would actually help a lot IMO, because always having (some form of) an environment would be very beneficial.
To be honest, that’s how I initially interpreted the PEP as it grew out of the previous discussions. If it was only ever intended to cover the venv case, then I’d agree with Ralf that it’ll break many things.
Even cases that opt out of creating an environment would still have an environment – the base install.
This proposal (the original pip feature request, and the “part (2)” section of this PEP) resulted from real-world issues caused by people installing stuff into their system environment.
But it was originally proposed before PEP 668 (“externally managed” environments) was available. Maybe it’s not needed any more? People will still be able to install into their Windows system environment (because the Windows Python installers don’t set `EXTERNALLY-MANAGED`) and maybe also into conda base environments or spack environments, depending on what those distributors decide to do about PEP 668. But maybe that’s good enough?
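For reference, PEP 668’s mechanism is just a marker file named `EXTERNALLY-MANAGED` placed in the interpreter’s stdlib directory; if it parses as INI with an `[externally-managed]` section, installers display its `Error` text. A hypothetical sketch of what a distributor like conda could ship (the error wording is illustrative, and nothing is actually written to disk here):

```python
# Sketch of a PEP 668 marker file a distributor could place in the
# base env. The location is the stdlib directory per PEP 668; the
# message text below is purely illustrative.
import sysconfig
from pathlib import Path

marker = Path(sysconfig.get_path("stdlib")) / "EXTERNALLY-MANAGED"
content = """\
[externally-managed]
Error=This is the conda base environment; installing packages here with
 pip is not supported. Create and activate an environment first, e.g.
 'conda create -n myenv python' then 'conda activate myenv'.
"""
print(marker)   # where the marker would live for this interpreter
print(content)
```

Installers that implement PEP 668 refuse to install into an environment carrying this marker unless the user passes an explicit override.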
Pip can’t solve all of the problems of the world; maybe we need to accept that “people not using environments and isolation options appropriately” is just not pip’s issue? Maybe it’s up to the core Python team and the virtualenv developers to address the (perceived) usability issues with virtual environments? PEP 582 can be considered as part of that discussion, I guess…
Right, this part of this PEP merely proposes a standard name for an in-tree virtual env, which is essentially harmless from Conda’s perspective (it is item 2 that presents the major challenges and concerns with this PEP, as discussed here). But it’s not totally comparable to what PEP 582 is proposing: there, the default name is merely incidental to the behavior changes for installers and for Python itself, which is more analogous to the problems with the previous proposal to separate Conda- and pip-installed packages into separate prefixes. There’s some more detail in my reply to that thread (to avoid straying too far OT on this one):
For Conda, it would indeed seem to be good enough on that front – see conda/conda#12245, which @pradyunsg actually opened as a result of me suggesting the potential benefits for Conda if this PEP covered Conda envs too, i.e.
As Conda would be able to achieve the same effect with `EXTERNALLY-MANAGED`, just under its own control, this PEP isn’t needed for that (as Pradyun himself pointed out to me).
I understand that; the reason I’m interested in PEP 704 is to establish a global (i.e. not just conda or whichever tool) baseline, or at least a default, of installing packages into an environment.
Just having that aspect (even if the environment in question ends up using a different tool/implementation) would IMO be beneficial, because it would just remove one huge and unproductive detail to explain and document.
And if we’re thinking longer-term, then that would pave the way for convergence on several axes, not least by not having to constantly implement everything twice for the env and non-env cases, as well as the interaction of those two modes.
Even if it ends up being the system-level environment, after explicitly opting into manipulating that.
This is a very long thread, so I think everything I have to say has been said, but a couple general comments, just in case:
I agree about the “not just conda” part – Python / PyPA / pip should not make any assumptions about what environment / system it’s running within – any conda-specific solution is not helpful.
But I disagree about establishing a global recommendation / approach / standard – let the other package / environment systems make their own choices.
I wear (at least) two distinct hats:
Developer of tools based heavily on the scipy stack, for which I use conda pretty much exclusively. I do not like this PEP from that perspective – it’s just going to make things harder for me and my users (who are often Python newbies).
Instructor of Python to newbies that may go in many directions with Python – data science, sysadmin, web application development, what have you.
From that perspective I don’t like this PEP either – virtual environments of any sort are confusing and weird and absolutely not necessary for a lot of work with Python, certainly not early learning.
I tried introducing virtualenv in day one of an intro Python class – it did not go well.
maybe someone, some day, will come up with a smooth, easy, standard way to manage environments, but we’re not there yet.
The gap between what I was trying to say and what it seems you took away from it is that I think it’s possible to keep this completely behind the scenes from the POV of users who don’t opt into manipulating their environments, while providing (IMO) much-needed uniformity on the infrastructure side.
I.e., they get one by default – no need to do anything.
I’ve taught Python to similar groups but I had a different experience. I agree that thinking about virtual environments is confusing, but I found it fairly easy to get people up and running with conda just by saying “create an environment called [short name of the class you’re taking], use the Anaconda GUI, and just remember to always select that environment before you do anything”. That way they don’t really have to learn much and just think of it as “oh I created some kind of folder/project/label/category/whatever for this class”.
However, because of that, I also don’t see the need for this PEP, since I don’t think it will make that situation any easier than it already is.
Environment management is a hard concept to teach, no doubt about it, both theoretically and practically – especially to students for whom programming is just a tool rather than the primary focus. We definitely need better resources, better UX and better tooling to make that easier.
However, at least if you define “virtual environments” broadly enough to include conda environments, I’ve also had a different experience. For context, my own perspective is that of someone who’s taught, mentored, tutored and helped write course materials for students (mainly atmospheric/ocean/geosciences) in a university setting (and other contexts), and who has provided troubleshooting help for hundreds, maybe thousands, of Spyder users (primarily online), plus a number of more general beginners here and elsewhere. These are mostly folks who are not only students, but for whom programming is typically a means to an end rather than their primary focus, and who mostly use a Conda-based distribution and tools.
I’ve found emphasizing using environments (even just one non-`base` working environment to start) right from the beginning to be particularly critical for these users, as:
- There’s more that can potentially go wrong: PyPI, Conda-Forge, Anaconda/defaults and possibly other Conda channels are all in play, and can potentially conflict with and contaminate one another
- These users tend to not have the motivation or background to understand and avoid the sorts of mistakes that end up breaking their environments down the road, instead copying the first command they find online that claims to solve their problem, without being aware of the implications.
- When things do go wrong, they likewise tend to not have the skillset to solve the problem (and sometimes dig the hole deeper instead), have the least expert resources to turn to (subject-matter experts, not programming experts), and often waste many hours in frustration or give up on Python entirely
In this situation, I’ve found it very common that these types of users eventually end up mixing `defaults`, Conda-Forge and PyPI packages in their `base` env, and I’ve had to help countless users who’ve run into the resulting problems and frustration, which generally requires a full reinstall of their Conda distribution and packages (rather than just deleting and recreating one environment). If users were taught from the very start, before installing packages themselves, to always work in an environment, and their tools encouraged that and made it easy and cheap to do, this problem could be avoided.
All that said, as has already been covered at some length by myself and others, it doesn’t seem to me that this PEP is either necessary or sufficient to solve that problem, at least for those users. Conda adding an `EXTERNALLY-MANAGED` file to the `base` env would at least avoid the most serious problem (`pip install` in `base`) and have the same effect as this PEP would in the ideal case (tools detecting conda envs as “activated” along with venvs); to me, the UI/UX improvements alone aren’t enough to justify the potential pain to many use cases, and could mostly be done in tools right now without the PEP anyway. The one area where I think it does have value is standardizing in-tree venv naming and activation, which would be helpful for both users and tooling/IDEs, but I’d like to see it go further on that to really make a significant difference.
Anaconda Navigator does work for some, but in my own experience and helping users with it, it’s been rather slow, buggy, has a number of limitations, and of course only works with Conda packages and environments. Users also need to remember to open a separate application, Navigator, and always launch their Python shells and applications through it.
With the Spyder team (which originally prototyped what became the Anaconda Navigator application), the approach we’re working on to help address this is integrating our new experimental micromamba-based standalone installers for Spyder 6 with our prototype environment-manager pane, so that users can create and manage Python environments right within Spyder in a few clicks – without having to install and set up a separate Conda-based distribution, remember to activate the (right) environment every time, or open a separate application.
The idea is that when users first open Spyder, they’re already in a default working environment (separate from Spyder’s runtime env and the `base` env) created at install time, and they can create more and switch between them with a couple of clicks, optionally tied to a Spyder project (so that a new console in the selected env opens automatically when opening/switching to the project). They don’t even need to think much about env management: just open their project, install what they need, and they’re done – more or less a higher-level, managed version of the PEP 582 approach. It would also prevent or mitigate by default the unsafe practices that routinely get unwary users into trouble (mixing Conda-Forge/defaults, mixing PyPI and conda, etc.), while making it easy to use the safer options.
Of course, advanced users will still be able to use their own Python interpreter/environments (Conda, Venv, system, etc) exactly as they do now, but the idea is to make it easy for the average user to get started the right way with minimal effort or frustration learning the mechanics of creating and using an environment or dealing with the problems that result from not doing so.
That would be great – if you can come up with something robust, let’s have a proposal for that, first.
If I was teaching scientific programming (which is my day job, as it happens) I might do that too. Particularly if I could control all the tools.
This is getting OT, but my recommended best practice for conda is to NEVER use the base environment as the default – I have my shell set up to activate my “default” environment whenever it starts, so if I don’t need a custom environment, I can just fire away – and if I make a mess of it, I can blow it away and start again.
I would like to see conda move to something like this by default – it’s not great to have conda itself hosted by the base environment, which a user can make a mess of [*]. Honestly, re-installing miniconda isn’t that big a deal, but it would be better not to have to do that.
I don’t know if there is an equivalent option for pip / virtualenv – the thing with conda is that it manages everything, including Python itself – but if there were a way to do it with Python/pip, I’d love to see that proposal.
[*] maybe mamba and micromamba will solve this some day …
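The shell setup described above can be sketched as a one-line rc-file addition (a config fragment; the env name `work` is just an example, and the env must be created once beforehand):

```shell
# ~/.bashrc, after conda's own initialization block.
# Create the env once with:  conda create -n work python pip
# Then auto-activate it instead of base in every new shell:
conda activate work
```

If `work` gets messed up, it can be deleted (`conda env remove -n work`) and recreated without touching `base` or reinstalling the distribution.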
This is getting even more off-topic, but it’s been proposed (and I think there are some other issues on the tracker that were variations on the same thing).
Yeah, same. I really wish there was an easy built-in conda option to prevent modifying `base` completely (with a helpful error message), except via a bespoke command that updated conda itself (and/or `anaconda`, for the full-fat Anaconda distribution) – it would be so handy not only for myself, but more so for the students that might do so accidentally. I.e., the `conda` equivalent of this PEP. IMO, that’s a more complete solution than moving `conda` out of `base`, since otherwise there’s still no easy way to fix `base` short of a complete re-install, since it’s the root prefix. Of course, you could make `base` “just another env” and keep conda at the (immutable, except through special commands) root prefix, but that’s getting even more OT.
“That proposal” is called PEP 704, no? Or is there something I’m missing? Pip currently has such an option, `require-virtualenv`, which this PEP essentially proposes enabling by default, and which can be enabled via
python -m pip config set global.require-virtualenv True
But the issue with this, like with the PEP, is that it doesn’t detect conda envs, so it’s not useful for those of us who use them.
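For illustration, one way a tool could recognize a conda env (which the `require-virtualenv` check today does not) is the `conda-meta/` directory that every conda prefix contains. A hypothetical sketch, not any tool’s actual code:

```python
import os
import sys

def looks_like_conda_env(prefix: str = sys.prefix) -> bool:
    """Every conda environment (including base) contains a conda-meta/
    directory of package records; venvs and plain installs don't."""
    return os.path.isdir(os.path.join(prefix, "conda-meta"))

def environment_check_passes() -> bool:
    # What a conda-aware require-virtualenv could accept: a real venv
    # (prefix differs from base prefix), a conda prefix, or an
    # activated conda env signalled via CONDA_PREFIX.
    in_venv = sys.prefix != getattr(sys, "base_prefix", sys.prefix)
    return in_venv or looks_like_conda_env() or "CONDA_PREFIX" in os.environ
```

Either signal (the directory marker or the activation variable) would let the option be useful in conda-based setups as well.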
There has been some work to make pip run as a `.pyz` (a bundled app, essentially) rather than having to be installed into the current working environment. And you’ve been able to specify command line options to run it against a different environment for ages now. But it’s not the norm, and I’ve certainly never seen it in tutorials or lessons.
I added this to Visual Studio years ago and it worked great. The advantage there, of course, is that VS really does define a workflow, including its own project files. So once you’ve decided to go into a VS workflow, it’s real easy to support one (or more) virtual or conda environments as part of it.
“Standard” workflows today don’t have any such project file - they rely entirely on CWD and environment variables to work - and so it’s much harder to support anything like this. Moving to use well-known filesystem objects (files/directories/symlinks) is an improvement, but still a significant workflow change.
Maybe I have misunderstood, but I don’t think so.
What I’m suggesting, in a hand-wavy way, is that there IS a default, single global environment that gets used when folks type `pip install` at the command line, but that it not be the actual Python install. So you can make a mess of it and rebuild it without re-installing Python itself. Honestly, I don’t know if it’s as easy to “break” a Python install by pip installing/uninstalling packages as it is with conda, but it’s a bit hard to believe that things wouldn’t get quite messy after a while of adding/removing a whole pile of complex interdependencies…
Side Note: On another thread, folks are hashing out the idea of having pip manage shared libraries as well – if it goes there, this kind of environment “breaking” might become more common.
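The “disposable default env” idea above is already cheap to do with the stdlib `venv` module; a sketch, where the path and function name are hypothetical examples rather than any established convention:

```python
# Sketch: a disposable per-user "default" environment that can be
# deleted and rebuilt without touching the Python installation itself.
import shutil
import tempfile
import venv
from pathlib import Path

def rebuild_default_env(path: Path) -> Path:
    """Delete and recreate the env; far cheaper than reinstalling Python."""
    if path.exists():
        shutil.rmtree(path)
    # with_pip=True would also bootstrap pip into the new env.
    venv.create(path, with_pip=False)
    return path

# Demo against a throwaway location:
env = rebuild_default_env(Path(tempfile.mkdtemp()) / "default-env")
print(env / "pyvenv.cfg")  # marker file every venv carries
```

A real tool would of course pick a stable per-user path and point `pip install` at that env by default.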
More motivation to factor out the location code of the Python Launcher for Unix and make that library crate cross-platform – it already does exactly this search/selection for an environment.
The Python Launcher for Unix does this plus the `PATH` search. But with the library crate above, it wouldn’t be hard to build a custom solution that left out the `PATH` search.
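The environment search mentioned here can be sketched in a few lines: walk upward from a starting directory looking for a `.venv` (this is a simplification of what the Python Launcher for Unix does, not its actual code):

```python
# Simplified sketch of an in-tree environment search: walk from a
# starting directory upward looking for a .venv directory.
import tempfile
from pathlib import Path
from typing import Optional

def find_dot_venv(start: Path) -> Optional[Path]:
    for directory in [start, *start.parents]:
        candidate = directory / ".venv"
        # A usable venv is marked by its pyvenv.cfg file.
        if (candidate / "pyvenv.cfg").is_file():
            return candidate
    return None

# Demo against a throwaway tree: project/.venv is found from project/src/pkg.
project = Path(tempfile.mkdtemp())
(project / ".venv").mkdir()
(project / ".venv" / "pyvenv.cfg").write_text("home = /usr/bin\n")
nested = project / "src" / "pkg"
nested.mkdir(parents=True)
found = find_dot_venv(nested)
print(found)
```

Standardizing the `.venv` name (part (1) of PEP 704) is what makes this kind of search reliable across tools.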
That would be user site packages (which ought to be the default anywhere it can be controlled by the admin), to have a global (per-user) directory, or PEP 582 to have a default per-project-ish directory.
Incidentally, the default for a Windows Store install of Python is to use the user site packages, and if you mess things up you can use the standard app “Reset” button to clean it all up. So the general concept exists and is usable, it just isn’t popular.
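As a concrete pointer, the per-user site-packages location discussed here is discoverable from the stdlib (exact paths vary by platform and Python version):

```python
# Where "pip install --user" targets: the per-user site-packages
# directory. Deleting the user base resets this per-user environment
# without touching the Python installation itself.
import site

user_base = site.getuserbase()          # e.g. ~/.local on Linux
user_site = site.getusersitepackages()  # e.g. ~/.local/lib/pythonX.Y/site-packages
print(user_base)
print(user_site)
```

This is the "Reset"-able layer the Windows Store install exploits: the install itself stays pristine while user packages live in a separate, disposable directory.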