I completely believe you, but that is my whole point! There are a lot of valid ways to solve this particular problem, and the differences between them likely come down to subjective preference. If I wrote this PEP the message would look like:
Please create a venv! Run the command python -m venv ~/.virtualenvs/sys310 && . ~/.virtualenvs/sys310/bin/activate and then re-run the command
(with templating based on the Python version) and would expect to be argued with because that would be overly privileging my (possibly narrow) view of how this should be done.
The analogy that comes to mind is the, in principle, very helpful message GitHub gives the user when they try to push to a branch that has diverged from their local branch, which suggests pulling the remote branch and then pushing again. In many cases this is the right thing to do; however, on Matplotlib, when a feature branch has merge conflicts we prefer rebasing over merging upstream main into the feature branch (please accept this position, and that we do not squash-merge, as a given, to avoid also spawning a discussion of the "right" git workflow).
This means that the first time many contributors have to do a rebase, they (reasonably) follow the instructions git gave them and end up with a whole bunch of extra commits, and then we have to walk them through undoing it (we finally wrote this up in the docs).
If this PEP does go in, I fear the number of "yes, pip does say that, and in some cases it is right, but …" discussions that will have to be had.
I'm generally someone who avoids putting my virtual environments into my project directory, honestly because doing so produces a lot of bad behavior by default. Lots of tools recurse from the current directory, and when you stick your virtual environment in that directory, they recurse into it as well.
Now, almost all of those tools offer some mechanism to fix this, typically by ignoring that directory, but that ends up requiring me to configure each of those tools independently, and often per project, so when I switch to another project the bad behavior comes back.
That being said, presumably if we standardized on something in-tree, eventually tools would ignore those paths by default, and the biggest pain point goes away.
Though it would likely be ideal if we could pick something that supported multiple interpreters, because I suspect a non-trivial number of people have reason to have multiple environments per project, and .venv doesn't enable that.
I share some of Thomas' concerns (although for many projects of mine an in-tree virtualenv would work fine).
The solution I use these days is to have a .venv file that contains the path to a virtualenv that lives somewhere else. It works pretty well and is very flexible, e.g. for switching between multiple interpreters on the same project or re-using the same environment for multiple projects.
This is easy to build tooling on top of; e.g. I use a shell plugin that checks this file to automatically activate environments when I change cwd. And maybe Brett could teach this trick to VSCode.
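The lookup half of such tooling is tiny. Here is a minimal Python sketch, assuming the `.venv`-file convention described above (a plain text file whose single line is the environment's path); `resolve_venv` and the layout are my assumptions, not a standard:

```python
from pathlib import Path


def resolve_venv(project_dir):
    """Resolve a project's environment via a `.venv` file.

    Assumes `.venv` is a plain text file whose single line is the
    path of a virtualenv stored elsewhere (so one env can serve
    several projects, or a project can switch between envs by
    editing one line).  Returns the environment's path, or None
    if the project has no `.venv` file.
    """
    marker = Path(project_dir) / ".venv"
    if not marker.is_file():
        return None
    return Path(marker.read_text().strip()).expanduser()
```

A shell hook or editor integration would then activate (or point its interpreter at) whatever path this returns for the new working directory.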
Personally, I've been using .venv as a symlink to the actual virtual environment for some time now, to be able to switch between multiple venvs for a project. Some things expect .venv to be a folder rather than a file, and symlinks work great with that. I still put all the venvs in-tree within a .venvs folder (which has caused me problems a bit more often than .venv does, as not as many tools ignore it by default) because I like to keep everything in the project folder, but there's nothing that would prevent this from working with a global directory for all venvs. Sadly, symlinks aren't universally supported by all file systems, so it's not really a viable option as a standard. Still, using a .venv file would probably cause more problems with various tooling than a painless symlink.
I'm open to ideas for better conveying this in the PEP, but here's my take on this: an in-tree .venv is a reasonable default. We're not locking people out of their workflows by deciding on a default.
I want to go a bit further though: does anyone think it's not reasonable to say "An active virtual environment is required. If you don't have one, you can create it by …"?
While it clearly biases toward one approach (the one suggested), it certainly doesn't lock people out of other approaches for managing virtual environments. It also provides a clear "here's how to get to a free-of-paper-cuts setup" path.
Yes, I understand and appreciate that there are workflows where the proposed approach isn't sufficient. However, no one is talking about preventing people from creating virtual environments in other locations.
"We can't do something that works for every known workflow, therefore we shouldn't pick a default" is a bad approach.
My take on multiple interpreters is that you should be using an environment management tool at that point, like nox and friends, to do that work.
I'll update the PEP to cover Conda, multiple environments per project, and centralised storage of environments; but that's not gonna happen until tomorrow.
If tools like venv and virtualenv supported that usage, it would be a reasonable possibility (although it still doesn't solve the "back link" issue of identifying where the venv is being used). But manually creating a virtual environment in a shared location and then creating a symlink is a lot less convenient than python -m venv .venv. Particularly as I very rarely use symlinks, so I can never remember how to create them - for reference, it's New-Item -Type SymbolicLink .venv -Target C:\Some\Path\To\shared\venv (roughly ln -s <target> .venv on POSIX), where it appears that you can't use ~ in the target or some weird things go wrong…
If there's a standards-based recommendation for the name of the expected virtual environment (i.e. what this PEP is trying to establish) then yes, this is a reasonable message. But like most other people, I don't think it's up to a PEP to make that statement; it's for the maintainers of the individual tool(s) to choose how to let the user know.
Even without an established standard, tools are welcome to add this message now, and can suggest whatever environment name they like. A PEP is only needed for us all to agree on what name we want to assume in the absence of any other information.
It seems like this PEP is drifting towards just saying:
1. Installers SHOULD refuse to install into any environment that isn't a virtual environment without an explicit opt-in from the user (in the form of either an explicit install destination like --prefix or --target, or a specific --i-know-what-i-am-doing flag).
2. Tools wanting to determine a user's default/intended virtual environment SHOULD look for a virtual environment named .venv in the project root (or "by searching directories upwards from the current directory", or "alongside pyproject.toml", or whatever you want to say here).
3. Tools wanting to create a virtual environment on a user's behalf SHOULD give it (or recommend) the name .venv, in whatever directory matches the logic from (2) above, unless the user explicitly requests another name.
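For what it's worth, point (2) is simple enough to sketch. A hypothetical implementation of the "searching directories upwards" variant (using the presence of `pyvenv.cfg` as my assumption for "looks like a virtual environment"; the actual rules are exactly what the PEP would need to pin down):

```python
from pathlib import Path


def find_default_venv(start):
    """Search `start` and each of its parent directories for a
    `.venv` directory that looks like a virtual environment,
    i.e. one containing a `pyvenv.cfg` file.  Returns the first
    match, or None if the search reaches the filesystem root.
    """
    here = Path(start).resolve()
    for directory in [here, *here.parents]:
        candidate = directory / ".venv"
        if (candidate / "pyvenv.cfg").is_file():
            return candidate
    return None
```

This is also exactly where the problem in the next paragraph shows up: a user-chosen name is invisible to this logic unless the choice is recorded somewhere the search can find it.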
To be honest, (1) seems quite different from the other two points.
Also, (2) and (3) probably need some refining, as we may need something to address the possibility that the user explicitly requests a different name (as allowed by (3)) and then the logic in (2) gets hopelessly confused because it's not aware of that decision…
To put this another way:
I think requiring an active virtual environment is an independent point, and given that in reality pip is the only installer likely to be affected, I think it's something pip can do without a PEP.
I think virtual environment naming is worthy of an (interoperability) PEP if we want to standardise it, but it needs more substance, to cover how we track the user's choice of name if they override the standardised default. Given that venv is a stdlib module, this may even take such a PEP beyond packaging standards and into the area of a core standard (for example, if we want to add "owner" metadata to virtual environments).
I definitely agree with that. I'm mostly working on Linux nowadays, but I still have trouble remembering whether the first argument is the target or the link name. Since it's a workflow I haven't seen anywhere else, I personally just wrote a direnv layout script that prepares this for me (it creates a .venvs/3.x venv using a specified version, with a "$dirname-3.x" prompt, if it doesn't exist, and updates the .venv symlink appropriately), but having some support from venv/virtualenv would make this more manageable. Even then, I'm not convinced this would be a good solution, due to the problems with symlinks/junction links on Windows and with file systems that simply don't support symlinks at all.
I'm unsure what you mean by that. If a venv is put in a .venvs directory inside the project folder, it should be known where the venv is used; but if it's not (in case the symlink just points to a venv in some global venvs directory), the custom venv's prompt should still make it clear what the venv is used for (unless you're looking to figure this out programmatically). Did you mean something else by the "back link" issue?
Sorry, the conversation is split across two threads, I think. What I mean is that if I delete my project directory, there's nothing that lets me know that the environment (in a central location) is now "orphaned" and can be deleted.
Tools could exist that manage this (remove orphaned environments, for example) and disciplined use of tools/processes could avoid it (a delete-project script that tidies up referenced environments). But I'm not disciplined, and mistakes happen, so having the information recorded in the first place is important.
Yes, that is what bothers me with virtual environments that are not next to the project. How do you handle the orphaned environments? For example, Poetry does that, and last time I checked there was no clean way to handle this: no way to know if an environment is still in use or not. I do not know what a clean workflow would look like. How do you do it? Do you maybe have some text file or database of which environment corresponds to which project? I am honestly curious.
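The kind of record being asked for here is small. A sketch, assuming a made-up convention where each centrally stored environment writes its project's absolute path into a `project-path.txt` file at creation time (the file name and layout are entirely hypothetical, not anything Poetry or other tools actually do):

```python
from pathlib import Path


def find_orphaned_envs(envs_dir):
    """Split centrally stored environments into orphans and unknowns.

    An environment is an orphan if its recorded project directory no
    longer exists; it is "unknown" if it never recorded one at all.
    Assumes each env directory may contain a `project-path.txt`
    holding the absolute path of the project it belongs to.
    """
    orphans, unknown = [], []
    for env in sorted(Path(envs_dir).iterdir()):
        record = env / "project-path.txt"
        if not record.is_file():
            unknown.append(env)
        elif not Path(record.read_text().strip()).is_dir():
            orphans.append(env)
    return orphans, unknown
```

Without a record like this being written in the first place, the best any cleanup tool can do is guess.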
Yes, I think it's unreasonable to say that, at least so briefly. "Required" for what, exactly? I'm pretty sure a venv is not required simply for running Python; is a venv required for any use of pip? For any use of pip with these parameters? (Or without, same difference.)
I'm also not a fan of the statement that virtual environments are "essential", since, again, anything that doesn't require any third-party software shouldn't require a venv (unless I'm completely misunderstanding something here).
To be completely honest, if pip starts saying "there MUST be a venv active AT ALL TIMES", I'm just going to create a single venv in my home directory and activate it in my .bashrc to shut up the message. In effect, it would be exactly the same as the current form of user-level installation. What's the advantage? (The PEP as currently written hints at a theoretical way to opt out of this demand, so if that exists, I'd use it; but if it doesn't, a single global venv is basically the same thing anyway.)
Virtual environments are extremely helpful for applications that are going to get deployed. For everything else, why are they mandatory?
@Rosuav I'm not sure I follow what you're saying. The PEP's language is:
When a user runs an installer without an active virtual environment, the installer SHOULD print an error message and exit with a non-zero exit code.
Do you want "runs an installer" to be more specific?
If it isn't this, what language in the PEP isn't sufficiently clear?
Only if you do additional things, i.e. create a virtual environment with --system-site-packages.
That's not the default for virtual environments, however, and that isolation from the global/system environment is (partly) the whole point of virtual environments in this context.
Is pip an installer, or is pip install an installer? Can I pip search without a venv? (Probably, but I'm not entirely sure.) Can I pip freeze without a venv? (No idea.)
Would appreciate some clarity on this point too, then; what exactly ARE the differences between all the different ways of isolating? Clearly my mental model of user installations and virtual environments is wrong, given that I have generally thought that /usr/local/lib/python3.X/dist-packages is "stuff installed by your system package manager", ~/.local/lib/python3.X/site-packages is "stuff you installed globally with pip", and the currently-active venv is "stuff you installed for this app only, with pip". Where am I wrong here? Or is it that venvs are supposed to replace the second category?
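For what it's worth, the interpreter itself can report which of those three locations are in play. This snippet just inspects the running Python; the Debian-style `dist-packages` path is a distro patch, so the exact directories printed vary by platform:

```python
import site
import sys

# A venv sets sys.prefix to the environment, while sys.base_prefix
# still points at the interpreter the venv was created from.
in_venv = sys.prefix != sys.base_prefix

print("active venv :", sys.prefix if in_venv else "(none)")
print("user site   :", site.getusersitepackages())  # ~/.local/... (pip install --user)
print("system sites:", site.getsitepackages())      # dist-packages on Debian derivatives
```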
Some additional more content-relevant notes on the PEP, originally made on the review:
What counts as a "virtual environment"? Only an environment created with venv/virtualenv? What about Conda environments? Or PEP 582 environments? Or other types of isolated environments? IMO, the PEP should either explicitly define this, or link (:term:) to a precise and authoritative definition, e.g. in the PyPA glossary.
Also, what counts as "an installer"? Obviously, pip counts (as it's mentioned by name), and I'm assuming Hatch, PDM and Poetry count as well when used in that capacity, while apt, dnf, brew and choco don't, though that isn't explicitly stated.
If you limit it to "Python-specific" installers, what about tools like shiv, pex, etc.? And what about, of course, Conda? IMO, the PEP should provide a precise definition of that term and examples of what tools would and would not qualify (similar to PEP 668 for what qualifies as an externally-managed environment).
Further, what "Python version" is being referred to: the version of the environment being installed into? The version of the installer's own runtime? Something else? The PEP should be specific here.
Workflow tools (which manage virtual environments for the user, under the hood) should be unaffected, since they should already be using the virtual environment for running the installer.
They will be affected if they don't use venv/virtualenv virtual environments (e.g. Conda envs, I'd assume PDM PEP 582 envs, and possibly others), assuming tools don't detect them. Either way, it seems like that should be explicitly addressed in the PEP.
FWIW, this is also my experience; this or something somewhat similar is a pretty common workflow, and one I often practice and recommend to others. Additionally, another very common scenario I run into a lot, both inside and outside of the sciences, is multiple libraries and/or applications developed together that interact and must live in the same environment.
Yeah; that was my concern above as well. My impression is that, given how common this advice is on the interwebs and in printed materials, and despite the fact that it is usually (though not always) a bad idea, enough users are likely to do this that it would become a large-scale UX problem.
As helpfully advised by @pradyunsg, I am cross-posting my comment here, coming from the Apache Airflow context, where we've been discussing this for a long time and have been involved in many discussions about it:
Just a comment about that one. I really like that this is now opt-out rather than opt-in (and it's good we have an explicit opt-out as well, which is useful in many cases, for example in most container use cases). Those are specific use cases, and having an opt-out possible is more than enough for them, especially for legacy use cases.
Also, having a default convention for venv usage as part of this PEP is great. It will make a number of use cases simpler, with fewer decisions to make, and having implicit steps for activation of the ~/.venv environment is a good one too.
I've been reading through the discussion again prior to another round of updates. Other than requests for clarifying language ("what is an installer", "what is an environment", etc.), I'm noticing two things here:
Concerns that the UX of tooling isn't supposed to be a PEP topic.
Concerns that documenting a single-virtual-environment workflow as the default is problematic, because such workflows don't cover all use cases/workflows.
For the first… I guess I'm hitting a governance/process issue. I'd figured that we'd want this to be a widely discussed thing that benefits from going through the same framework as a PEP, so why not make it a PEP. And the counter-argument of "we don't do PEPs like that" is… frustrating but fair. I'm not sure what to do about this. I don't think that discussing this only on pip's issue tracker is the right way to go about it, because it affects not just pip but also how it interacts with multiple other things! I guess I'm hitting the wall of our process not fitting what I think we need here, and I'll take that discussion to a separate thread.
For the latter… I know and agree. Nothing is blocking you from having a multiple-virtual-environment workflow, or a workflow with centralised management of virtual environments; having a consistent default suggestion isn't going to block projects that need more complex workflows from continuing to use them.
Regarding Conda: if we draw the line as "Conda environments are basically system environments rather than Python environments, because Conda ships everything", the obvious corollary is that they shouldn't be treated differently and should require virtual environments. Now, it is well known and well documented that conda and pip interoperability isn't great, and that the two operate on different metadata models. This PEP would effectively enforce a clear split between managed-by-conda and managed-by-pip. I'll admit that, with this view, I'm suggesting that we break user workflows; that aspect of this PEP should be better clarified, and I'll do so. I do think that enforcing this clear separation between managed-by-pip and managed-by-conda packages will be a good thing.
It's a balancing act though, and if folks think that we should be doing something different, I'm all ears.
This PEP currently does not require any sort of automatic activation of environments. That will be a stumbling block for people who want a no-extra-steps workflow. However, it also means that we're replacing a subtle failure (something that would cause issues later) with an explicit error that also provides guidance on what to do. Subjectively, I think that's a better place to be, since consistent errors and clear guidance are better than inconsistent failure modes and guidance that is difficult to find and apply.
Agreed. No one is saying that you need to create a virtual environment to use Python. The PEP is saying you should be creating a virtual environment for installing and using third-party software, by default. There's an opt-out for workflows that need it.
Perfect, you're the exact sort of user persona that I want to have an opt-out for.
FWIW, this isn't limited to the sciences.
This is generally what gets recommended for reusable functionality vs business logic, for example.
I noticed that I didn't clarify when I responded to this earlier: the proposal is that in-tree virtualenvs are good enough to be a default suggestion, while being easy enough to discover and reason about. The difference is perhaps subtle but important. To be explicit: they're not universally the best!
I wasn't sure how to respond to this, or whether to let it slide unresponded to. I'm reading it as implying that this is what is happening here, and a warning to be careful not to do it; if so, IMO that is not correct. Avoiding that is literally why I wanted this to be something that's not just discussed on pip's issue tracker.
FWIW, I guess I should clarify that the things the PEP suggests aren't "things someone likes and wants to push on everyone". The whole point of this PEP is to change a workflow expectation: that pip itself can be used outside of virtual environments and will install to the user site or system site by default. If we don't want to change that workflow expectation, that's fine. I don't have a horse in this race; how a PEP like this changes things for experts who maintain Python or Python's packaging tooling isn't really something I want to optimise for. I'd much rather focus on the UX aspects here for the broader audience.
This is not a good idea. Virtualenvs on top of conda envs do not work well (worse than pip-installing into a conda env). It's also not necessary, and treating conda envs like system envs is conceptually not quite right. Conda envs are like system envs in terms of what they are able to contain, but much more like virtualenvs than real system envs in terms of their most important characteristics: they need activation, you can have multiple of them, they have their own lock/requirements files, and they're ephemeral (destroying and recreating them, rather than updating them, is the recommended practice).
If you want to include conda environments in your picture here, then you could:
rely on the externally-managed designator for installers (a good idea for the base env at least, and there's an active discussion on that),
treat non-base conda envs like virtualenvs rather than like system envs, either by special-casing conda envs or by generalizing whatever you do to "user-activated environments" (I'd quite like the latter),
or leave things as they are.
All those options are better than what you are suggesting here.
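On the "generalizing to user-activated environments" option: a best-effort classification is possible from inside the interpreter. This is a heuristic sketch, not a standard; the venv check reflects how `venv` actually behaves, while the `conda-meta` check is a common but informal way to recognize a conda env:

```python
import os
import sys


def classify_environment():
    """Best-effort guess at what kind of environment is running.

    Heuristics (assumptions, not a spec): a venv/virtualenv makes
    `sys.prefix` differ from `sys.base_prefix`; conda environments
    keep a `conda-meta` directory in their prefix.  Everything else
    is treated as a system environment.
    """
    if sys.prefix != sys.base_prefix:
        return "venv"
    if os.path.isdir(os.path.join(sys.prefix, "conda-meta")):
        return "conda"
    return "system"
```

An installer taking the "user-activated environments" route could treat both "venv" and "conda" results as acceptable install targets and refuse only on "system".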