All three of those tutorials are relatively recent (with the oldest being 7 months old). All of them contain bad practices that can confuse newcomers and break their environments.
Bad tutorials aside, you can find tutorials recommending pip+venv, pipenv, poetry, pdm. Is this not confusing to newcomers? Would PEP 582 be special in that regard?
Nobody can control the tutorial-writers of the world, and they can write anything they want. That said, if I were writing tutorials, especially tutorials on packaging.python.org, I would consider PEP 582/__pypackages__ as the main way to get things done, and venvs as the more advanced route with specific use-cases. A tutorial or some university course teaching people how to work with requests can just say mkdir __pypackages__; pip install requests and be done with it, without having to explain the venv situation and without having issues with people forgetting to activate their venvs. This does not preclude a tutorial teaching venvs, listing the cases where they are more useful than PEP 582 (testing different extras/package configurations, for example), and listing their pros and cons compared to PEP 582.
I remain strongly against pip installing into __pypackages__ in pretty much any form that isn’t opt-in. I absolutely agree that changing behaviour based on whether the directory is present is a non-starter. I don’t know about anyone else, but I’d pretty routinely forget that I had a __pypackages__, and end up with stuff in weird places.
The PEP seems to suggest that if __pypackages__ is present, installers should install to it, even if being run by a Python version that doesn’t support __pypackages__. And it should create the directory if it’s not present, but in this case depending on the Python version (presumably “if the Python version supports PEP 582, then do, otherwise don’t”). That all seems very muddled, and I’m not at all sure I can work out what the UX would be like from that.
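Spelled out as code, one possible reading of that behaviour looks like the sketch below. The function name and the return labels are made up for illustration, and the PEP text itself may intend something different.

```python
from pathlib import Path


def choose_install_target(cwd: Path, python_supports_582: bool) -> str:
    """Hypothetical helper: one possible reading of the installer rules."""
    if (cwd / "__pypackages__").is_dir():
        # Directory already exists: install into it, whatever the Python version.
        return "existing __pypackages__"
    if python_supports_582:
        # Directory missing, interpreter supports PEP 582: create it, then install.
        return "create __pypackages__"
    # Directory missing and interpreter predates PEP 582: behave as today.
    return "regular site-packages"
```

Written this way, the outcome depends on both the state of the current directory and the interpreter version, neither of which is visible on the command line.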
And IMO creating a __pypackages__ if it doesn’t exist is very intrusive. Not every directory with a Python file in it is a Python project directory, and if I forget where I am, I might very well not want pip to install locally. I certainly don’t want to have to go around housekeeping old __pypackages__ directories from scratch directories, because I’ve forgotten what I put in them.
By the way, we should also be very cautious about making this discussion about pip. The PEP refers to “package management tools”. What constitutes a package management tool? Is conda one? If not, then we’ve deepened even further the split between conda-based systems and “standard” ones. And what about installer - must the command line invocation python -m installer <wheel_file> install to __pypackages__? If not, then what exempts installer and why can’t pip be exempt too?
Sigh. I’ve said this before, but nothing has changed. IMO the PEP is simply not explicit enough in its proposals, and not careful enough about describing the implications and consequences of what it’s describing. I really hope that it doesn’t get accepted in its present form (even though I’m not against continuing to explore options in this area - I equally don’t think that PEP 704 deserves to be the final say on this matter).
Must? Might? The PEP is woefully vague on what’s essential to the proposal and what’s optional/nice to have. ↩︎
I don’t have time for a hugely well written argument here, I apologise in advance!
I think one important question is “who are we helping?”. Advanced users don’t even think about virtual environments. In particular, users who are comfortable enough to write a Dockerfile are likely more comfortable with the principles of environment management.
I feel that our default case should be sane, safe, and obvious. To me, local state management in the form of __pypackages__ seems like a good solution here — it is easy to create a new env (just delete it), no environment management / tracking (no use of activate), etc. So, I’d see this as something to opt out of instead of opt-in.
That might mean always creating and using such a directory by default unless a flag is passed. I’m not sure what that means for e.g. system Python yet. I’m sure we can treat that separately, though.
That said, I often find myself writing the same direnv edit . to create and activate a transient, local environment. ↩︎
That’s assuming they get used when they’d be implemented. A small data point about much narrower scoped fat wheels: I’ve asked a number of projects to ship “universal2” wheels when we added support for universal2 wheels (x86_64 and arm64) to CPython. At the time at least some of them chose not to do that due to size concerns. That was just after M1 Macs were introduced; the situation may have changed. But fat wheels targeting all major platforms would make the size concern a lot larger.
I think PEP-704 and PEP-582 (with a little tweaking) can actually coexist, and deliver on the original intent of PEP-582.
PEP-704 standardizes the .venv virtual environment. It is not hard to imagine a flag or the default behavior of a packaging tool like pip to automatically create a virtual environment to install to if one is not available (pipx does this).
This is roughly equivalent to auto-installing to a folder named __pypackages__, it just happens to be a full virtual environment at a standardized path of .venv. The key thing is that it’s transparent to the user; they don’t have to know about it, or the tool can easily point them to it, which is beginner friendly.
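A minimal sketch of the "create the environment if it is missing" step, using the standard-library venv module. The helper name ensure_venv is hypothetical, and with_pip=False is chosen only to keep the sketch fast; a real installer would presumably want pip (or its own installation logic) available in the environment.

```python
import venv
from pathlib import Path


def ensure_venv(project_dir: Path, name: str = ".venv") -> Path:
    """Hypothetical helper: return project_dir/.venv, creating it if needed."""
    env_dir = project_dir / name
    # pyvenv.cfg is the marker file that the venv module writes into every
    # environment it creates, so its presence is used as the existence check.
    if not (env_dir / "pyvenv.cfg").is_file():
        venv.create(env_dir, with_pip=False)
    return env_dir
```

Calling it twice is harmless: the second call finds pyvenv.cfg and returns the existing directory unchanged.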
Assuming the above behavior is implemented, the next step would be to modify Python itself to automatically walk up the directory tree, looking for the .venv folder (instead of __pypackages__). (If a virtual environment were already active, it would use it without looking for the .venv directory.)
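The lookup described here could be sketched roughly as follows. This is an illustration only, not how CPython does or would implement it: the function names are hypothetical, and treating the VIRTUAL_ENV variable (set by the standard activate scripts) as the signal for "already active" is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_venv(start: Path, marker: str = ".venv") -> Optional[Path]:
    """Walk from start up to the filesystem root looking for a .venv folder."""
    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / marker
        # Only accept directories that look like real virtual environments.
        if (candidate / "pyvenv.cfg").is_file():
            return candidate
    return None


def select_environment(
    start: Path, environ: Mapping[str, str] = os.environ
) -> Optional[Path]:
    """Prefer an already-active environment, else search the directory tree."""
    active = environ.get("VIRTUAL_ENV")
    if active:
        return Path(active)
    return find_venv(start)
```

The nearest .venv wins, so a nested project would shadow one further up the tree.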
The subtle but important difference here is that the .venv folder is fully in line with all the existing tooling and ecosystem, whereas __pypackages__ is orthogonal and somewhat jarring.
If we go back to the three bullet points of this PEP’s motivation:
How virtual environments work is a lot of information for anyone new. It takes a lot of extra time and effort to explain them.
This is addressed by pip (or equivalent) automatically creating and using the .venv Virtual Environment (can be behind a new flag)
Different platforms and shell environments require different sets of commands to activate the virtual environments. Any workshop or teaching environment with people coming with different operating systems installed on their laptops create a lot of confusion among the participants.
Addressed by the Python interpreter itself looking for the .venv Virtual Environment instead of __pypackages__ (this PEP’s modifications).
Virtual environments need to be activated on each opened terminal. If someone creates/opens a new terminal, that by default does not get the same environment as in a previous terminal with virtual environment activated.
Also addressed by Python itself looking for the .venv Virtual Environment
As I think more about this, I’m of the opinion that the issues potentially solved by these two PEPs are a subset of the problem of not having the unified/recommended tool which that other thread is discussing.
Like, it is unequivocally true that if one uses Hatch or Poetry the problems expressed here go away, especially for beginners.
The one thing that I’ve not loved about tools that require e.g. hatch run, or poetry run (tool agnostic) is that they’re an extra layer. Whilst I’ve just made the argument that we should assume that advanced users will know what they’re doing (and therefore lean towards solutions that benefit non-advanced users), we can’t entirely discount the ergonomics argument. Having to invoke an entrypoint to launch a script interferes with lots of existing tools that run a Python binary directly.
My point is that if Python were smart enough to look at a local environment scheme, it would "just work"™ much more smoothly than the current support for PEP 582 in e.g. pdm, which is restricted to a special shell plugin to activate the environment, and the pdm binary itself.
That said, I agree that the problem is directly related. A large swathe of “which tool do I use” goes away if you don’t need the tool to do environment management for a basic project.
One problem that doesn’t (as far as I know) go away is the problem of someone writing a script that uses (say) requests, and then wanting to share it. Having a semi-magical place where requests got installed, which the user maybe doesn’t even know about, is highly non-obvious to a beginner. And IMO it’s too big of an ask to expect a relative beginner to know how to express “here’s my script, and this is what you need to do to run it…”. Yes, making a “project directory”, and building a lockfile, and then sharing the script and lockfile, makes the script shareable, but that’s not a beginner sort of workflow - at least, most of the people I taught Python to would have struggled with it. And you still need to explain to the recipient how to use the lockfile to set up the environment.
Accepted, PEP 582 and PEP 704 don’t address this either. But that’s sort of my point - we’re assuming a particular picture of a “beginner” which really doesn’t match my experience. In fact, in my experience, the people used to creating project directories are much more likely to be experts, building distributable libraries and applications.
My experience is that Python beginners come from backgrounds like shell scripting (sysadmin) or Excel hacking (data analysis). And both of those backgrounds are strongly based around sharing single-file “scripts”, not building “project directories”.
So what I’m saying is that while the solutions we’re discussing might help a certain type of beginner, let’s not make the mistake of thinking that we’re necessarily making Python more approachable for everyone. Quite the opposite, in fact - many people will be put off by being expected to create a project directory every time they want to do anything (I know I found it frustrating when trying to do Advent of Code in Rust, ending up with one directory per “day” when I just wanted one directory with a bunch of small programs in it).
That actually sounds like a large change to how virtual environments currently work, that would make their behaviour closer (if not equivalent) to what PEP 582 proposes for __pypackages__. Right now, what the python command means is solely dependent on $PATH, and the site-packages in use/sys.path is solely dependent on the executable that is run (that is found in $PATH or that the user executed explicitly). If I activate a venv, and then run /usr/bin/python explicitly, I am still running the system /usr/bin/python. This behaviour is needed in some cases, eg. for OS tools written in Python and installed into system site-packages.
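To see which interpreter and site-packages a given python invocation actually resolves to today, a small check like the one below can help. The helper name is hypothetical, and the sys.prefix comparison applies to environments created by the standard venv module.

```python
import sys
import sysconfig


def describe_interpreter() -> dict:
    """Report which executable is running and where it installs packages."""
    return {
        # The binary that was actually launched (via $PATH or an explicit path).
        "executable": sys.executable,
        # Inside a venv, sys.prefix points at the venv while sys.base_prefix
        # still points at the installation the venv was created from.
        "in_venv": sys.prefix != sys.base_prefix,
        # The site-packages directory for pure-Python packages.
        "site_packages": sysconfig.get_paths()["purelib"],
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Running this with /usr/bin/python while a venv is activated should report the system paths, which is the behaviour the paragraph above describes.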
If Python were to detect and use a .venv in the directory tree, then running /usr/bin/python would no longer guarantee the use of the system site-packages. This would require adding a switch for Python to ignore the .venv directory and use the system site-packages. This proposal would also mean there are now three ways of working with the project’s environment (cd into it and run python, activate, or run .venv/bin/python explicitly), which seems like a way to add more confusion, especially if some actions (system Python upgrade, project move) may break only some of the three ways to use the venv.
Assuming no sitecustomize.py or other shenanigans, just a plain venv working in the usual way. ↩︎
This switch is also necessary for PEP 582, if it will disable the system site-packages, and it definitely should do that. ↩︎
This is also orthogonal to level-of-experience in my view! It’s just simply more friction to share a script and say “here’s the deps you need”. A cross-platform solution is to share a pyproject.toml… which is definitely not quick.
@pf_moore forgive me if this is something I can easily find, but how did your explorations with pipx run integration go? I still share your view that this would be highly valuable, independently of this conversation.
If installing into global environments is disabled by default, and iff. we add an entrypoint like py to launch this “new” Python, then we make it opt-in, and protect users who mistakenly try to use the conventional Python binary to do bad things.
I think virtual-environments and __pypackages__ could still coexist, although at a highly superficial level, a virtualenv now looks like a relocatable __pypackages__, so perhaps there’s unification that could be done at the implementation level.
In any case, given how mature Python is at this point, we have to be pragmatic (I assert). Here I’ll talk about PEP 582, but in the general sense of a fixed location, automatically active environment.
Let’s run a hypothetical. We can likely agree that something like PEP 704/668 is a good change irrespective of PEP 582. What is the least-worst option, assuming we have PEP 704/668?
the status quo, whereby users need to learn about virtual-environments in order to install dependencies
Why they’re needed
How to activate them
How to deactivate them
Where to create them
a __pypackages__ directory
Why it’s needed
This is probably overly reductive given my preference, but I think that’s how I’d want to tackle this problem; we already have a bad status-quo for beginners (I think we all can agree that beginners get environment management wrong).
My background is in physics, and I can’t tell you how many researchers have catch-all (system! sometimes) environments with everything dropped in there, without a pyproject.toml in sight. I’m not suggesting we encourage this, but we can perhaps accept that we need to lower the bar to entry for good practices.
Having recently made precisely that decision I can say that I did so simply because it seemed that it would be more difficult to produce the universal2 wheels (I was explicitly advised that it would be more difficult). Since producing wheels at all was difficult I didn’t want to opt in to making it any more difficult for myself for something that seemed to offer almost no gain to users since pip would install from a valid wheel either way.
Absolutely they would. The manylinux wheels I’ve made are huge so bundling all of that into the OSX or Windows wheels would increase their size many times over.
This is what the Python Launcher for Unix does, so are you essentially saying you want the python binary to take on a similar role here? And I assume this would be some special mode just for python and not python3 or python3.11 since that gets a bit specific about what Python you want to use?
Nope, it doesn’t, but that’s not a solved problem anyway. At some point there is a “practicality beats purity” argument to be made about trying to shove every problem into one solution when they have different requirements.
And I’ve had somewhat of an opposite experience of people creating separate directories for organizational purposes or because they have been told to (e.g. students). Python’s user base is so huge and broad we are going to have an ample amount of people falling into any categorization we create here that fits “beginner”.
I don’t know if anyone is making that claim specifically, but there is a large number of beginners where something that makes virtual environment-like experiences easier is a net win for a good amount of people.
I don’t think any of these proposals require you to do that. If you want to have multiple files with differing requirements and put all of their disparate dependencies into a single environment then you can do so with either approach from what I can tell. Otherwise I feel like you’re advocating for pypa/pipx issue #913 (“Allow running scripts with dependencies using pipx”) without explicitly saying it and/or another PEP that says virtual environments should not be stored in the directory with the code, but elsewhere in the file system (which is a legitimate approach, but I haven’t read the PEP 704 topic yet to see if that’s being advocated for over there).
Given the number of invisible environment variables and scattered configuration files that change the behaviour of installation, the existence of a directory seems like an interesting place to draw the line.
(Your comments on behaviour getting weird when you mix up Python versions are fair, but that is also the nature of a transition period. In ten years’ time, nobody will ever think about that again.)
I will say that I don’t think a solution here has to be everything to every class of user.
What I do worry about is whether a solution is actually serving the people it purports to serve, in the way that it wants to serve them, whether there’s a better and/or more general solution for those users, and whether it introduces hidden failure cases or issues that aren’t being seen, plus how it impacts people who the solution isn’t attempting to serve.
I’m hesitant on the PEP 582 idea, largely because I’m not sure that I see a big win here that can’t be solved another way, that reuses the tools and concepts that already exist. The __pypackages__ directory, as implemented in PEP 582, doesn’t give you a good stepping stone into other tools or other workflows. If you outgrow it and need to use virtual environments, there’s not an easy path forward that isn’t “transition everything you use to use a different isolation mechanism” or “bifurcate isolation mechanisms between projects”.
It also complicates things for existing users, who now have yet another isolation mechanism that they need to handle, understand, etc.
If we instead focused on something that revolved around venv, then we have a better path between workflows, new users can have something that tries to paper over the venv, but as they learn and grow, they can start looking behind the curtain at the underlying venv, and even decide that maybe they would prefer something like virtualenvwrapper that keeps all the same concepts they’ve already learned, but moves them into another location.
(excuse my newly invented shell syntax - adapt for your own language)
The primary thing PEP 582 is intended to deal with is having to understand the shell. If we can paper over “full path to Python” and “environment variables”, I’ll be happy. Even better if “clone a repo, double-click the .py file and it has all its packages” also works (as it would under 582, though there are a couple more options with that one).
A user in an IDE shouldn’t have to drop out of it to run shell commands. A user cloning a repository shouldn’t have to look up the “create an env and install packages” steps for it. A user on PowerShell shouldn’t have to translate a .env list of environment variables. There’s a huge reliance on being a shell expert just to get basic Python stuff running, and anyone who has run a workshop at a PyCon has struggled to get all their attendees past this point (wonder why hosted Jupyter is so popular? No shell skills required. Don’t believe me? Go to PyCon and check out the tutorials.)
I’m far more concerned about users growing up from installing apps on their phone to writing Python scripts than I am about users growing up from “my packages go where I’m working” to “I designate a specific location for packages and tools that I’m currently working with”.
If you have an idea that handles this as a layer on top of the current tools without needing to install more shell integrations, please, go ahead and share it. If it looks like it works, I’ll back you all the way - I’m not wedded to this idea. But speculation that such an idea might exist isn’t the same thing.
Choosing a default requirements.txt file/equivalent is definitely not covered here, but it doesn’t need to be. ↩︎