Pre-PEPs to bring some ideas from PEP 582 back to life

I’ve drafted two PEPs that extract ideas I found in PEP 582, modified to apply to virtual environments. Links to the rendered versions are below.

Virtual environment locations in a project

This one proposes a directory similar to __pypackages__ as a standard location for storing virtual environments in a project root. Directories in it should be named like cpython-3.8-macosx_10.15_x86_64 so tools can reliably locate environments created by another tool.
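For the current interpreter, a name in roughly that shape can be derived from the standard library. This is only a sketch of the idea; the exact separator and normalization rules would be for the PEP to pin down:

```python
import sys
import sysconfig

# Sketch: build a directory name like "cpython-3.8-macosx_10_15_x86_64"
# for the running interpreter. The normalization (dots/dashes to
# underscores in the platform part) is an assumption, not the proposal's
# final wording.
name = "-".join([
    sys.implementation.name,                       # e.g. "cpython"
    f"{sys.version_info.major}.{sys.version_info.minor}",  # e.g. "3.8"
    sysconfig.get_platform().replace("-", "_").replace(".", "_"),
])
print(name)
```

A tool scanning the environments directory could then match on these name components without opening any files.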

Configure where an interpreter loads local packages

This is a write-up of my own idea for interpreter-independent virtual environments. It adds a PYTHONVENV environment variable to interpreter startup, so a site-packages prefix can be used without having a python command inside it. This can be used to loosen virtual environment requirements and allow more flexible workflows like:

python -m pip install --prefix ./env
PYTHONVENV=$PWD/env python  # Use packages installed in the prefix.

Tools that wrap the virtual environment workflow (e.g. tox, pipenv) can also use this to more easily set up a script to run inside an environment.
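PYTHONVENV is only proposed here, not implemented in any interpreter, but the intended startup behavior can be roughly emulated in user code. A minimal sketch, assuming the usual prefix layout (the helper name and layout details are mine, not the proposal’s):

```python
import os
import site
import sys

def load_prefix_packages(prefix):
    """Rough emulation of what a PYTHONVENV-aware startup might do:
    add the prefix's site-packages to sys.path. The directory layout
    below is an assumption based on today's install schemes."""
    ver = f"{sys.version_info.major}.{sys.version_info.minor}"
    if os.name == "nt":
        sitedir = os.path.join(prefix, "Lib", "site-packages")
    else:
        sitedir = os.path.join(prefix, "lib", f"python{ver}", "site-packages")
    site.addsitedir(sitedir)  # also processes .pth files, like real site dirs

prefix = os.environ.get("PYTHONVENV")
if prefix:
    load_prefix_packages(prefix)
```

The point of the proposal is that the interpreter itself would do this, so no python executable needs to live inside the prefix.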


I need sponsors to submit them to the PEPs repository as drafts. Each of them will get its own post under the PEPs category when they are merged, but feel free to discuss anything here before that happens as well.


May I propose instead to drop the platform-naming directory (implementation - version - platform)? Instead, we could encode this metadata in pyvenv.cfg as key-value pairs. virtualenv already does this, and it is a much more scalable approach.
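Discovery under this scheme would mean scanning candidate directories and reading their pyvenv.cfg files. A sketch of what that lookup could look like; the implementation and version key names here are illustrative, not a spec:

```python
from pathlib import Path

def read_pyvenv_cfg(env_dir):
    """Parse pyvenv.cfg's simple "key = value" lines into a dict."""
    cfg = {}
    for line in (env_dir / "pyvenv.cfg").read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep:
            cfg[key.strip()] = value.strip()
    return cfg

def find_env(root, implementation, version):
    """Return the first environment under `root` whose recorded metadata
    matches the requested interpreter, or None. Hypothetical key names."""
    for env in sorted(root.iterdir()):
        if not (env / "pyvenv.cfg").is_file():
            continue
        cfg = read_pyvenv_cfg(env)
        if (cfg.get("implementation") == implementation
                and cfg.get("version", "").startswith(version)):
            return env
    return None
```

Note this is exactly the trade-off debated below: each candidate costs a file read, where the directory-name scheme costs only a listing.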

I’d also like to see a flag indicating whether the creator allows other tools to modify the virtual environment. For example, while I like tox virtual environments to be discoverable, I’d rather not allow IDEs to alter them on their own (i.e. install new packages). If a tool other than tox can alter the contents of those virtual environments, tox has to do a full environment validation every time to ensure they are still in sync with the configuration.

Overall I’m +0.6 on this.

The posix and nt schemes create virtual environments with different internal structures, and even finding the correct python executable can be a bit of an adventure.

Tools such as tox and pipenv can already solve this by calling shutil.which before passing on the invocation; tox already does this. So I don’t think this is needed for tools; it is probably more useful for human users. Relying on environment variables, though, is something that may come naturally to advanced users but is hard to teach and use correctly. It is very brittle, because some script somewhere might have altered your environment variables without you seeing it.

python -m pip install --prefix ./env

I assume you are actually proposing that pip support such behavior. At the moment, for example, this invocation generates broken console scripts. I feel this proposal needs more explanation than it currently contains.


From what I can tell, both approaches are more scalable in certain aspects. Not including these in the directory name may make it too resource intensive for something like pythonloc to find a suitable environment (it will need to read many setup.cfg files).

Sounds reasonable.

Run shutil.which on where? :wink: nt and posix using different names for the script directory is an annoyance everyone can do without. Yes, it’s not a show-stopper. Most of the things virtual environments don’t do well are not show-stoppers on their own, and they get a poor reputation because these things add up.

That’s an implicit benefit of this proposal, actually. pip can fix this once and for all, by generating something like:

#!/bin/sh
export PYTHONVENV=$path_to_user_specified_prefix
${sys.base_executable} -c 'code that loads the entry point'

It only needs to read pyvenv.cfg. Why would it need to read setup.cfg? Reading a configuration file is trivial.

Use sysconfig.get_path('scripts') to get the executable folder, and then run shutil.which against that.
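Spelled out, the suggestion looks like this. Note it resolves the scripts directory of the interpreter the code is running under, which is the limitation raised in the reply below:

```python
import shutil
import sysconfig

# Locate the *current* interpreter's scripts directory, then search only
# that directory for an executable.
scripts_dir = sysconfig.get_path("scripts")
pip_exe = shutil.which("pip", path=scripts_dir)  # None if pip isn't installed there
print(scripts_dir, pip_exe)
```

From the outside, a tool has no interpreter to ask, so it cannot run this in the first place.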

One thing I haven’t seen any word on in proposal 2 is how it would behave when one points a python3.6 at a python3.9 environment. Arguably you’d only want to allow a single Python version per site-packages, and not mix and match them.

Sorry, it’s a typo; I meant to write pyvenv.cfg. Yeah, it’s trivial to read, but it’s still a file to read. It’s not free, or even cheap, when there are more than a few environments.

Run sysconfig.get_path('scripts') against what? To do that you’d need to know the path to the executable in the environment, which is exactly what the tool does not know in the first place. This is already known to be an unsolvable problem, and the best tools can do is guess.

The program fails. You can already break things in a lot of ways; the environment variable doesn’t really make this easier or more difficult.

This looks incredibly annoying if you want to use the environment outside the tool. I really don’t want to have to type __pyvenvs__\cpython-3.10-win_amd64\Scripts\python.exe -m pip to invoke pip from my shell…

What’s the rationale for needing multiple virtualenvs? I’ve always got on fine with just <project root>\.venv, and most of the tools I use have as well (or they create “private” virtualenvs, that I don’t need to interact with, like tox and nox for instance).


I believe part of the idea here is that we want tools (read: IDEs) to use the private virtualenvs created by those tools, rather than force users to create their own dev version. And tools want multiple environments for sanity. Say you’re still supporting python3.6 and python3.10; you might need two venvs in your project root to check both.

Note this would no longer be the case with my proposal above to move the version/platform into pyvenv.cfg of the virtual environment.

This is true, and I don’t have a good answer for it. Plain virtual environments are not going anywhere (not every Python program is a “project” after all), so you can still do python -m venv .venv and use that.

The main targets of this proposal are tools that do have a “project” concept to support. Those tools either already support running the code against multiple Python environments (e.g. tox and nox), or have users requesting this feature. It is not uncommon for projects to switch between Python versions during development to ensure the code works against all of them. Ideally the user shouldn’t need to directly call executables inside the environment, but should use the environments through the project management tools instead.

Even if the information is stored in pyvenv.cfg, we still need a naming scheme to distinguish (say) the test environment for CPython 3.6 from the test environment for PyPy 3.5, and all the possible disambiguation markers put together would basically bring us back to something not far from the current scheme. We could probably make it shorter, though, like how wheels shorten cpython-3.6 to cp36.

Another approach is to give up on platform disambiguation altogether and always call the test environment e.g. tests (continuing with the example above). If that environment is built against CPython 3.6 and the user runs pythonloc -e test-pypy, pythonloc would detect the test environment as incompatible and automatically rebuild it from scratch. That’s how Cargo does it, IIRC. But I’m not sure how tools like tox would like this approach.
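The compatibility check in that scheme is simple: compare the recorded interpreter metadata against the requested one and rebuild on mismatch. A sketch, using hypothetical pyvenv.cfg key names:

```python
def needs_rebuild(cfg, want_impl, want_version):
    """Cargo-style staleness check: `cfg` is the parsed pyvenv.cfg of the
    existing environment; rebuild if it was created for a different
    interpreter than the one now requested. Key names are illustrative."""
    return (cfg.get("implementation") != want_impl
            or not cfg.get("version", "").startswith(want_version))

# An env built for CPython 3.6, now requested for PyPy 3.5 -> rebuild:
print(needs_rebuild({"implementation": "cpython", "version": "3.6.9"}, "pypy", "3.5"))
```

The cost is that switching interpreters back and forth throws away the environment each time, which is exactly what tools like tox avoid by keeping one environment per target.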

This is what we do in Nixpkgs. We have the Python package sets, and we have applications (tools). Packages in the set are primarily libraries, but can also be used as applications.

The tricky part is not leaking variables specific to the environments. For example, to compose an environment we use NIX_PYTHONHOME, which works like PYTHONHOME and is unset in sitecustomize.py so it won’t leak. All packages, including the interpreter, are symlinked together. Applications, on the other hand, typically have a line of code injected into them that calls site.addsitedir. Other variables (e.g. PATH) can also be added to the wrappers or injected; however, these are not unset.
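The consume-then-unset trick described here might be sketched like this; the helper name and the site-packages layout inside the home are my guesses, not Nixpkgs code:

```python
import os
import site

def consume_pythonhome(var="NIX_PYTHONHOME"):
    """Read a PYTHONHOME-like variable at startup (e.g. from
    sitecustomize.py), register its packages, then unset it so child
    processes don't inherit it. Layout inside the home is assumed."""
    home = os.environ.pop(var, None)  # pop = read and unset in one step
    if home:
        site.addsitedir(os.path.join(home, "lib", "site-packages"))
    return home

consume_pythonhome()
```

Because the variable is popped before any user code or subprocess runs, tools launched from the environment see a clean process environment.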

Note we don’t use virtualenv or venv to build an environment, since we want to avoid duplicate packages. We also don’t use pyvenv.cfg, because it’s in the / folder and we don’t allow files there.