Setting up some guidelines around discovering/finding/naming virtual environments

I’m -1 because I think this entrenches dependence on an environment format that we know is confusing, wasteful, and fragile. Once we start promoting these kinds of guidelines, it becomes much harder to try something else.

If it’s not going to go into VCS, then it’s got no reason to be portable between tools. It’s a per-user setting at that point. Users who regularly switch between editors are a special breed.

If it’s not in VCS, it doesn’t help someone bootstrap. You can’t just clone a repo and “venv run” or whatever to get going.

Given that the presence of a special directory was deemed “too magical” for PEP 582, I fail to see why the presence of a special file isn’t just as magical.

Having a file point to a shell script (which shell[s]?) that modifies the environment (which process?) so that a PATH search will find a symlink/copy/redirector to a Python runtime is a very convoluted solution.

Standardising a setting for editors/tools just allows those tools to refuse to engage with alternatives that don’t/can’t use that standard. I’m sure most won’t want to do that, because it’s pretty user-hostile, but then the standardisation hasn’t really helped anyone.


For reference, when I was maintaining an editor, our approach was “provide the path to Python and we’ll run that, or if you want us to create an environment we’ll use venv”. We also detected requirements.txt and offered to create an environment using venv, and we would do a shallow directory search for Scripts\python.exe to find likely environments. The only issues our users ran into were installing binary packages (much less likely to work 5+ years ago) and referring to embedded copies of Python, which didn’t work well (IIRC, Maya was the worst). This file would not have helped.
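
For illustration, that kind of shallow search is only a few lines. A rough sketch, assuming the usual Scripts\python.exe and bin/python layouts and a one-level-deep search (not the exact logic the editor shipped):

from pathlib import Path

def find_likely_environments(project_dir):
    """Look one directory level deep for anything containing a Python interpreter."""
    found = []
    for child in Path(project_dir).iterdir():
        if not child.is_dir():
            continue
        # Windows venvs put the interpreter under Scripts\; POSIX venvs use bin/.
        if (child / "Scripts" / "python.exe").exists() or (child / "bin" / "python").exists():
            found.append(child)
    return found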


If we were to go anywhere here, I’d suggest we do a PEP 517 but for environment creation and package installation:

[dev-environment]
requires=["pip"]
env-backend="venv.HypotheticalPublicApiForThisInterface"
env-name=".venv"
install-backend="pip.HypotheticalPublicApiForThisInterface"
install="file:dev-requirements.txt"  # I forget if there's existing syntax for this

All it has to support is “create an env when none exists” and “install packages into an otherwise empty env”. The env backend API knows how to create an environment and can provide launch params (args/env/cwd/etc.). The install backend API knows how to resolve, download, extract, etc. They could be the same tool.
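
To make that interface concrete, here is a rough sketch of what such hooks might look like, using the stdlib venv module and pip purely as an example implementation. The hook names (create_environment, get_launch_params, install) are invented for illustration; the real API shape would be whatever a hypothetical PEP defines:

# Sketch of a possible env/install backend, built on stdlib venv + pip.
# All hook names here are invented; nothing below is an existing standard.
import subprocess
import sys
import venv
from pathlib import Path

def create_environment(project_root, env_name=".venv"):
    """Create an env when none exists; return its path."""
    env_path = Path(project_root) / env_name
    if not env_path.exists():
        venv.create(env_path, with_pip=True)
    return env_path

def get_launch_params(env_path):
    """Return the args/env/cwd a front end should use to launch Python in the env."""
    if sys.platform == "win32":
        python = Path(env_path) / "Scripts" / "python.exe"
    else:
        python = Path(env_path) / "bin" / "python"
    return {"args": [str(python)], "env": {}, "cwd": None}

def install(env_path, requirements_file):
    """Install packages into an otherwise empty env (here, via pip -r)."""
    python = get_launch_params(env_path)["args"][0]
    subprocess.run([python, "-m", "pip", "install", "-r", str(requirements_file)], check=True)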

Front ends can allow overriding the env-name and install parameters, but a default install should use them. And a user can use whatever tools they want and just ignore pyproject.toml completely - it’s just a central place to provide the preferences for the project, just like PEP 517.

Compared to the .venv file, this encourages innovation. Posy could support it. Conda could support it. Pip/venv could support it. Heroku/equivalent could support it. GitHub Actions could support it. Packages can be installed by linking, or by copying, or by installing import hooks, or whatever they like, because the activation arguments could turn python myproject.py into python -m envlauncher.run myproject.py, or (the equivalent of) subprocess.run("python myproject.py", env={"PYTHONPATH": ...}).

Compared to the .venv file, second-order tools (editors/IDEs) don’t need to know how to invoke the first-order tools (venv et al.), which means anyone can work in VS Code without Brett’s blessing :wink:

It still allows the env creator to put it wherever the user wants. One user may want to keep them inside working directories by default - another might want them stored in a specific central location. That remains a per-user configuration option, just like .venv would be, except it can be configured once for a particular env backend and then used automatically for any project using that backend.

But I think the strongest point is that we get to reuse our existing magic file, which is quickly becoming a dumping ground for everyone’s dev settings anyway. No matter what was intended, this is what it is, so we might as well embrace it and add sections that will be useful to our users.

12 Likes

+1 to the basic idea of a PEP 517 for environments; I believe that is the right way to go, given that there are multiple incompatible environment types and places to put them.

This will require some more thought. The pyproject.toml content you sketched out works only for single-user or small-team projects where everyone agrees to use the same thing. It doesn’t work for, say, larger open source projects. You can standardize the env name there, but you cannot standardize the environment type or install tool - the point is to let different contributors use their preferred tools (without hard-to-spell front-end overrides). So what you need is a way to add multiple configs. It’s exactly the opposite of a build backend, where it’s “one project, one build backend”.

In this case, you want the project to have (not valid syntax, just the logic):

env-name = 'mypkgname-env'  # .venv is a poor choice for anywhere outside the repo, it's going to clash
install = {
  "pypi": "file:dev-requirements.txt",
  "conda": "file:environment.yml"
  ... # can be extended if desired
}

You want the workflow tool to have:

env-backend="venv.HypotheticalPublicApiForThisInterface"
install-backend="pip.HypotheticalPublicApiForThisInterface"
...
# same mapping for other environment managers and installers

And you want the user to be able to override the defaults of the workflow tool:

env-manager = "venv"
env-scheme = "/path/to/all/my/venvs/"  # could default to `.venv` inside repo if not given, not so important
installer = "pip"
3 Likes

There’s no “hard-to-spell front-end override” - the commands to use a different tool are identical to what they would be today.

The “default” or “recommended” approach is simplified through a single tool, which I’m imagining to be something on the level of VS Code or PyCharm or a dedicated tool like build is for doing builds.

If a project desperately wants two different defaults, then I’m sure a backend will be developed that can provide both. That’s the point of this model.

I’m not sure I see the value in a “PEP 517, but for environments”?

This feels like a case of “when the only tool you have is a hammer (read: an interoperability standard), everything looks like a nail”.

2 Likes

This doesn’t add up. It’s individuals that have preferences here, not projects. What’s the point of having project 1 hard-code venv as its default, project 2 conda, and project 3 posy? Then one user has to have all these tools installed if they want to use each project’s default tool.

So that would let PyCharm make the choice on the user’s behalf? That seems like the wrong place to do that.

3 Likes

If it’s the wrong place, then PyCharm’s users will complain about it and PyCharm will fix it :slight_smile:

But you’re still assuming that projects want to list all their dependencies multiple times in multiple formats so that their contributors can choose the workflow that works for them, rather than the one that the project has agreed on and is using consistently. Maybe you’re lucky enough to work on such considerate projects, but I’ve never seen one - at best, they have a supported dev workflow, some hints for alternatives, and perhaps some release build processes. This helps formalise the supported dev workflow, without impacting anything else.

Dev tooling can offer all the approaches they like. But they should offer “use the project’s recommendation”, which would be whatever is listed in the pyproject.toml. There’s no way to override that; you simply use a different workflow and figure it out yourself.

The original proposal feels like “when the only tool you have is venv, everything is venv” :wink:

But the original proposal is also literally describing an interoperability standard. Using an existing standard in place of creating a new one is generally a good idea - I’m sure someone posted that XKCD recently.

4 Likes

Indeed. Short of the package name mapping that I’d like to have but which doesn’t exist, it’s either multiple formats, or not providing them at all and still having contributors use a mix of pip/conda/docker/whatever.

Here you go: numpy/numpy, scipy/scipy and pandas-dev/pandas all have environment.yml and *_requirements.txt files in the root of the repo. Sometimes a Dockerfile hidden away somewhere too, for good measure.

I’m fairly sure that if those projects were forced to choose one tool/workflow, that’d be the trigger for long and painful discussions. Which is unnecessary and not desirable.

I’m just going to say that I strongly disagree here. The correct players to choose a default are:

  1. A workflow tool or standard [1]
  2. Individual users

A project has dependencies, and a build setup to get itself installed. There is no need for it to concern itself with how its contributors are installing those dependencies.


  1. for a global, common default per workflow tool or across all workflow tools - it doesn’t matter too much what that default is imho ↩︎

5 Likes

Nobody gets forced into anything. They can keep documenting how to set up a dev environment as they do today, because no tools actually force a change upon them.

Now, users or first-time contributors may come and say that it would be easier for them if the project specified a default workflow, but even that doesn’t preclude the project from saying “here’s one way among three to set up your dev environment”.

(Worth noting that the original proposal in this thread doesn’t do anything for this scenario either. They’re both neutral on it, except that it’s very unclear how the .venv file knows to activate a Conda environment, whereas my proposal at least makes it a possibility.)

We have an entire survey of individual users saying that they’re not the correct player, and are insisting that we define the global workflow tool. A few hundred posts and multiple spinoff ideas later, that still isn’t going anywhere. So the next best thing is to define the interface between the two small groups everyone funnels through (i.e. the frontends and the projects who choose a backend).


But if we go any further off track, all these posts will just be pushed into a separate thread and the original proposal will appear unopposed. All I want to present is that the original proposal is at best not very useful, and at worst limits our ability to make improvements in the future, and those potential improvements exist but aren’t agreed upon yet. So I’m -1 on accepting the original proposal.

2 Likes

I’m just not sure what we’re even trying to solve with this?

Like, I understand the thing that the original proposal is trying to solve: multiple tools all want to know where a virtual environment is located, so we define a way to determine where that is.

But I don’t understand at all what the “PEP 517, but for environments” is trying to solve. Surely the environment tool is an end-user decision, not a project-level decision? Having end users mutate the pyproject.toml of an existing project with user-specific configuration feels really bad to me.

Is the concern just that VS Code (or some other tool) needs to write code that can create a virtual environment out of multiple technologies at once?

1 Like

That I agree with.

2 Likes

I am because no one really latched on to that idea.

Correct, or more to the point it’s up to tooling to decide.

I don’t view this as a “switching” issue, more of an “integration between tools” issue.

Who said anything about pointing at shell scripts when it came to the .venv file idea? And virtual environments don’t require PATH manipulation to work, so I don’t see what that has to do with this either.

And that’s what we already do in VS Code. But people also want support for their package management tooling, which has already decided where things like a virtual environment are going to be. And this tooling can be chosen by preference or by edict from their team. And people don’t love having to specify something that is already known somewhere else. And I hate having to write custom code for every package management tool just to find something they already created.

A potential issue with this is the boilerplate that projects would have people copy over to their pyproject.toml. The .venv solution at least leaves it to the tools that are already doing the work, which users have installed and are using.

What “potential improvements” are you thinking of here that we have not agreed to yet? Is this a reference to the other discussions going on about a unified packaging tool?

As for the usefulness, I personally have a use case today that has existed for years and is still a problem. If virtualenvwrapper and/or pyenv-virtualenv adopted the .venv file proposal it would be a big win. Add in tools like Poetry, Hatch, and PDM, and suddenly their workflows can participate with other tools. (I’m not worrying about conda, as people can specify conda environment names in an environment.yml file.)

One bit of clarification here: I am not explicitly seeking a PEP for any of this. Some offline feedback I have gotten about this has suggested some of you may be reading more into this than I am. Right now I am just looking for community buy-in so I can get support in the places where I have influence (i.e. VS Code and the Python Launcher for Unix) and then reach out to other tools to add equivalent support (e.g. virtualenvwrapper).

3 Likes

Personally, I’ve seen some tool in the past (I can’t remember the name; I think it was rather new at the time) that used specifically a .venv file. I didn’t like that this conflicts with the existing convention of using .venv as the venv’s directory, since then you can’t have both a .venv file pointing at the venv and a venv in a .venv directory. So I think it would be better if this kind of solution didn’t reuse that specific name and instead went with something that isn’t already used by something else.

That’s somewhat the point. You should only have the file exist because you have a virtual environment somewhere else. I would say that tools that ignore a preexisting .venv in any form probably have a bug or are missing some logic.

At least one tool that does this is the oh-my-zsh virtualenvwrapper plugin. The plugin expects an environment name (relative to $WORKON_HOME), an environment path, or an activate script path — and it is smart enough to detect and support a venv in .venv.


And in my opinion, this would be the way to go for this proposal. Tools which support this proposal should support all scenarios (in which .venv is a file, or a venv directory, or possibly even a symlink to a directory[1]). Regardless of how this part goes, I’m +0 on the proposal (as this is an improvement, but I still believe PEP 582 would be superior to any venv stuff).
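
For what it’s worth, handling all of those scenarios is a small amount of code. A rough sketch, assuming (purely for illustration) that a .venv file contains a single line that is either a path or a name under $WORKON_HOME; the actual file format is whatever the proposal settles on:

# Sketch of resolving ".venv" whether it is a directory, a symlink to one, or a file.
import os
from pathlib import Path

def resolve_venv(project_dir):
    marker = Path(project_dir) / ".venv"
    if marker.is_dir():  # covers both a real directory and a symlink to one
        return marker.resolve()
    if marker.is_file():
        value = marker.read_text(encoding="utf-8").strip()
        candidate = Path(value).expanduser()
        if candidate.is_absolute():
            return candidate
        workon_home = os.environ.get("WORKON_HOME")
        if workon_home:
            return Path(workon_home) / value
        return (Path(project_dir) / candidate).resolve()
    return None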

On the other hand, pyproject.toml is not the best place for ephemeral, machine-specific, and user-specific values: while forcing all developers in a company to use venv and a .venv directory is okay and a good thing to do, it won’t fly in open-source projects. If the env-name standard supports ~, then it might be usable for people who prefer a central venv directory in their home directory, as long as everyone agrees to use the same directory name. And open-source projects which do support the pyproject.toml environment specification standard will have to make sure the file isn’t overwritten by a random contributor who adjusted it to their preferences and ran an overzealous git add ..


  1. I would be fine with the symlink option not existing, especially if there are major tools that are confused by symlinks to a venv or other important blockers. Windows making symlink creation slightly harder is not an important blocker, since once you get past launching an administrative PowerShell and typing out the New-Item incantation, symlinks work really well. ↩︎

1 Like

PEP 582 won’t cover many of the use cases venv supports. So making the venv workflow better is very important regardless of PEP 582.

5 Likes

I’ll note that @Kwpolska’s advocacy for PEP 582 stems (at least in part) from use of PDM[1], which does not implement PEP 582 as written, even prior to the changes that the PEP’s author noted in the above-quoted comment.


  1. Based on How to improve Python packaging, or why fourteen tools are at least twelve too many | Chris Warrick, which also includes the quote: “I consider that the PyPA must be destroyed. The strategy discussion highlights the fact that they are unable to make Python packaging work the way the users expect.” ↩︎

2 Likes