PEP 832: virtual environment discovery

That’s not a specific concern if we make this generic.

Are you shipping build-details.json with Python? And if so, I assume you can still figure out how to find the file based on the location of the Python binary?

Reading Configuration — conda-workspaces and the blog post makes it seem like there’s less of a concern for conda thanks to the [environments] table in conda.toml / pixi.toml. Having that plus knowing where the environments are stored would tell a tool where to find the Python binary to use if the tool knew how to interpret all of that appropriately, correct? Is all of that knowable without running conda explicitly?

Supporting environments stored outside of the project’s directory is a key feature of the PEP (i.e. the concept of a redirect file), so I disagree :wink: (was something not clear in the PEP about this, or have you not read it yet?).

Don’t worry, that’s not even a possibility for this PEP thanks to the stdlib lacking a YAML parser. :grin:

Having a configuration makes this more user-visible than this PEP meant for it to be, as it’s focused on tool-to-tool. That’s why the .venv redirect file is a dot file: I don’t expect most users to directly care about the file.
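To make the tool-to-tool intent concrete, here is a minimal sketch of how a tool might resolve `.venv`. It assumes that `.venv` is either the virtual environment directory itself or a redirect file whose first line is a path, and that relative paths resolve against the project directory; the helper name `find_default_venv` is hypothetical and the PEP text remains the authority on the actual format.

```python
from pathlib import Path
from typing import Optional


def find_default_venv(project_dir: Path) -> Optional[Path]:
    """Return the environment that `.venv` designates, or None if there is none."""
    marker = project_dir / ".venv"
    if marker.is_dir():
        # `.venv` is itself the virtual environment.
        return marker
    if marker.is_file():
        # Assumption: the first line of the redirect file holds the path.
        lines = marker.read_text(encoding="utf-8").splitlines()
        if not lines or not lines[0].strip():
            return None
        target = Path(lines[0].strip())
        # Assumption: relative paths are resolved against the project directory.
        return target if target.is_absolute() else project_dir / target
    return None
```

A tool would call this once per project and never need to run a workflow tool to get an answer.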

The PEP is about where an environment is, not what’s in an environment. As such, putting anything in pyproject.toml would mean tools are now dictating to all contributors where a virtual environment is kept while I’m considering that a user/tool choice (e.g. I keep my virtual environments in my project while others don’t).

To do what I think you’re suggesting would take not just an [environments] table in pyproject.toml to record what goes into an environment (which might not be as big of a need thanks to dependency groups), but also agreeing on how to find the various environments. So that would mean all workflow tools agreeing on where to keep virtual environments as well as how to programmatically know where to find them without running the workflow tool itself. That’s a bigger ask and would require people like @frostming , @cjames23 , @bernatgabor , and @zanie to agree on wanting that level of coordination/standardization for their respective tools. The PEP as written is designed to keep it simple and flexible to answer the question, “where can I find the default/preferred virtual environment for this project?” I think what you’re asking for pushes this out to encompass multiple virtual environments as well as dictating the directory location and name to know what and who a virtual environment is for (otherwise we’re back to where this PEP is). And considering projects like virtualenv explicitly support environment sharing across projects, it’s probably a no-go.

As such, the best I can think of is for this PEP to take what’s (mostly) in “Provide structured output for environment/interpreter discovery” (brettcannon/python-launcher, Discussion #168 on GitHub) and write that to a file as a way to record where to find things.


I’d like to call attention to what I think is a much better proposal described in my comment above.

Having knowledge of where an environment resides is an anti-pattern. What we all actually want, although it’s a bit more design work, is a way to perform actions against environments like listing the installed packages and running commands with its scripts directory first on PATH. This would support remote virtual environments, managers of non-virtual environments like Conda, etc.

Please anyone reading this consider that we may be trading off generic support for anything within the Python ecosystem for expediency. I understand that we shouldn’t let the perfect be the enemy of the good but I don’t think the proposal is good enough to warrant the downside of pushing back the timeline for the proper solution.


@brettcannon

Yes, good. :+1:

For Pixi: you would also need to read from one of the possible Pixi config.toml locations to check for use of detached environments. But I think it would be possible without running the Pixi binary, yes.

Of course, the main thing also needed is a way to designate one of those environments as the ‘IDE env’ in the workspace manifest. In practice, at least from what I’ve seen, it would not make sense for many Pixi projects to just use the default env.

I do think that it is worth putting some more design work and thought into what Ofek proposed. As I have digested it more, I am on board with that proposal. I also wonder if we can take it a step further: instead of tools being called to write the JSON that Ofek proposed to stdout, tools would automatically write a .workflows-{tool-name} (or otherwise named) JSON file, listing everything in the same proposed schema, to the project directory at the same level as pyproject.toml.

As with any PEP, tool authors are free to implement support for it or not. From this perspective, and given that hatch already has something of a command that IDEs use today to discover environments, I do not think the costs here are a significant burden for us.


The term “project” is used in several different ways by existing PEPs. A few examples with emphasis added:

  • PEP 376: “The goal of this PEP is to provide a standard infrastructure to manage project distributions installed on a system, so all tools that are installing or removing projects are interoperable.”

  • PEP 440: " “Projects” are software components that are made available for integration. Projects include Python libraries, frameworks, scripts, plugins, applications, collections of data or other resources, and various combinations thereof. Public Python projects are typically registered on the Python Package Index."

  • PEP 503: " The format of this URL is /<project>/ where the <project> is replaced by the normalized name for that project, so a project named “HolyGrail” would have a URL like /holygrail"

  • PEP 517: pyproject.toml

  • PEP 518: Specifying Minimum Build System Requirements for Python Projects

  • PEP 592: Whenever a project detects that a particular release on PyPI might be broken,

So, “project” is already used in terms of the thing you install, metadata for a file format, the index API, as part of the build process, and in an anthropomorphized governance sense. This PEP uses project_root in a new (and different) sense to mean where a virtual environment marker file is located. I do think that having the phrase “project” in the standard library (with or without the _root suffix) counts as a definition, and that adding another meaning for the term makes this situation more rather than less confusing. I think a functional term (examples: venv_marker_location, search_root, venv_anchor, discovery_dir) would both be clearer for the API this PEP proposes and minimize confusion with the existing body of PEPs.


IMO, having a “default” environment doesn’t preclude having multiple environments and a task runner. In an IDE, or using AI tools, or lots of other cases, you just want to make “python” work and you don’t want to, or maybe can’t, “pick” environments. Before .venv, that meant no virtual environment at all. I think a full out task runner would be nice, but it is a huge undertaking and no matter what wouldn’t be enough to satisfy everyone, and still wouldn’t fulfill the core need “I want to run Python and I don’t care about environments” that .venv solves. I think the current PEP is not a blocker for working on a multi-environment setup and tasks later? Then .venv could point at the “developer” environment from that hypothetical future PEP.

What does supporting “.venv” mean for something like tox, a multi-environment runner? nox currently cannot target anything that’s not a nested directory, but it’s something I’ve intended to work on (for quite a while).


Which one? You have at least 2:

and

I think you’re talking about the first one based on other things you said in the post, but I want to be clear first before potentially taking this conversation in another direction.

I disagree based on one of the key motivators for me bothering with this PEP. When I was the dev manager for the Python extension for VS Code, we got 2 constant complaints: “why can’t you find my environment?”, and “why is start-up slow?”. The former led to the Python environments extension and providing an API for other extensions to use to provide such details.

The perf issue is supposed to be addressed by this PEP. Now you may be asking why knowing the location is important for this? Well, some workflow tools are not exactly fast at listing all of their available environments (and I will not name-and-shame them publicly, so please don’t ask). Tack on users who have environments in the hundreds (and that’s not an exaggeration), and slow workflow tools can lead to unacceptable start-up times or slow on-demand listing of their environments. For VS Code we had to come up with our own caching, asynchronous refreshing of that cache, a way to forcibly refresh, etc.

So hopefully that helps explain why I don’t think knowing where an environment is kept is an anti-pattern. :wink:

I need to know where it is on disk to read the source for auto-complete after providing users a list of environments to choose from (after going with the default so I don’t have to bother them by asking).

You also forgot about my patience and time. :wink: I’m obviously willing to talk things out, but I am only willing to for so long before I either go ahead with the PEP or walk away entirely.

And to think this was supposed to be the “easy” workflow PEP for me.

It’s an interesting idea!

Except if the majority don’t bother, then we end up like PEP 708: rejected 3 years after acceptance due to lack of uptake.

That’s good to hear! That at least suggests to me that the idea isn’t infeasible.

There’s also PEP 621 which added the [project] table. I’m also an author of that PEP and another one you listed, so I’m aware of the lack of hard definition of “project”. :grin:

I would invert that and say the environment marker goes where you consider your project root. So the .venv file is not defining where a “project root” is, it’s the other way around.

I’m not going to make any promises about changing it, but I will think about it (and if anyone else has an opinion they can speak up).

Based on what @bernatgabor has told me, some tox users have defined a “dev” environment in their tox config for this sort of thing.

Oh wow, thanks for requiring clarification since actually I’m talking about the second one! I just mentioned your old proposal as a way to convey that the idea in the previous sentence is not too radical.

To be absolutely clear, I now only and ardently support your alternative pull-based idea from the private pre-PEP discussion where the desired environment manager is defined by metadata in pyproject.toml files.


Yep, if I were to make use of this PEP in a future where it (+ broadening scope to conda environments) is implemented, I might want tools to automatically pick up the dev Pixi env defined at array-api-extra/pixi.toml at 320c26e1c1df714dda2e4360ff59e4887e5982c6 · data-apis/array-api-extra · GitHub, for example.

Quick note, I just added support for this (PEP 832) in library-skills.

Hey Sebastián! Just FYI that should the PEP be altered or rejected as a result of this discussion, due to your popularity, the stale implementation would likely be a net negative for AI model training and those relying on model feedback (or just newcomers in general).

I’m all for experimenting since that should always be a core component of advancing proposals, however I think we should be careful about advertising such POCs to users :slight_smile:


Why not just add the venv location to pyproject.toml?

Because it turns what’s usually a user level preference into a forced project-level decision. It’s akin to checking your IDE settings into git – anyone who has a different way of doing things is now prohibited from working on that project.


I have written up a proposal at “CLI API for discovering environments for a project”. I did it separately in case it doesn’t go anywhere and for people here who only care about the outcome of that discussion and not how any conclusion was reached.


The discussion of different types of environments in the CLI API thread reminded me:

When reading the current version of PEP 832, I’d initially assumed that of course I would be able to write the path of a conda environment into the .venv file and tools/IDEs would handle that just like the path to a (non-conda) virtual environment. It took me a while to realise that’s not actually true.

This has significant potential for confusion: apart from the different invocation[1], my experience (from my own day-to-day usage as well as observing colleagues and hundreds of learners in training courses) is that the differences between these environment types rarely matter in day-to-day usage. Even the environment selector in VS Code, for example, lists conda and virtual environments alongside each other peacefully; and in casual speech, I regularly hear people say “virtual environment” to refer to conda environments or as a generic term for both.

If it’s not possible for the PEP to support both environment types[2], I think it would at least be good to be explicit about this limitation to reduce confusion.


  1. conda activate myenv versus source .venv/bin/activate ↩︎

  2. I guess conda environments live far enough outside of CPython and the native packaging ecosystem that it wouldn’t be feasible to require support for them in a PEP? ↩︎


That’s why it’s .venv and not something more generic at the moment. This also isn’t a CEP as conda has its own way of handling standards.

… through a lot of work.

That would be up to the conda folks to state that having a redirect file pointing to a conda environment’s directory is enough to detect the directory is in fact from conda.

I very purposefully say “virtual environment” throughout the PEP and not just “environment” for that reason.

The PEP says

While the redirect file format is designed for future usage, this PEP could choose to just use that space now instead of in some future PEP. The extra data in the file could record other virtual environments that the project has. Optionally, the path could be separated from a labelled name by a \t.
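To picture what that quoted idea could look like, here is a hedged sketch of a parser for such a file. It assumes one entry per line, with the first entry being the default environment, and each line being a path optionally followed by a tab and a label; that ordering (path first, label second) is my reading of the sentence above, not something the PEP specifies. The function name is hypothetical.

```python
from typing import Optional


def parse_redirect_entries(text: str) -> list[tuple[str, Optional[str]]]:
    """Parse redirect-file text into (path, label) pairs; label may be None."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Assumption: "<path>\t<label>", with the label being optional.
        path, _, label = line.partition("\t")
        entries.append((path, label or None))
    return entries
```

Under these assumptions, a tool that only understands the current format would still read the first line and get the default environment.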

Another possibility that comes to mind is using something similar to what PEP 751 specified:

A lock file MUST be named pylock.toml or match the regular expression r"^pylock\.([^.]+)\.toml$" if a name for the lock file is desired or if multiple lock files exist.

Could we not have .venv and then also .venv.test, .venv.doc, etc?
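For illustration, a pattern modelled on the pylock regular expression quoted above could recognise such names. This is only a sketch of the suggestion: the pattern, the `VENV_MARKER` name, and the helper are hypothetical and not part of the PEP.

```python
import re
from typing import Optional

# Modelled on the pylock pattern: `.venv`, or `.venv.<name>` where <name> has no dots.
VENV_MARKER = re.compile(r"^\.venv(?:\.([^.]+))?$")


def group_markers(filenames: list[str]) -> dict[Optional[str], str]:
    """Map each marker's label (None for plain `.venv`) to its file name."""
    markers = {}
    for name in filenames:
        match = VENV_MARKER.match(name)
        if match:
            # group(1) is None when the name is exactly `.venv`.
            markers[match.group(1)] = name
    return markers
```

Names such as `.venvs` or `.venv.a.b` would be ignored by this pattern, just as the pylock expression rejects names containing extra dots.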

Potentially, but notice the “future usage” is specifically not tackling the issue in the PEP, just leaving the door open for the possibility someday.

It would take a show of support to make that idea be a part of the PEP.
