I’m not sure exactly how I can help out with this, but I’d like to volunteer. I’ve got the time right now and can contribute anything from proofreading docs to testing proposed implementations to writing and/or reviewing code.
I’m a long-time Python developer and have made a few open source contributions here and there, but I’ve never been involved in something like this, so I’m not really sure how to jump in. Still, I’m enthusiastic about this proposal and would like to see it implemented.
I’m not sure how the SC could accept this when there’s no consensus from the packaging side about whether it’s even a good idea. But having the PEP hanging around forever in Proposed limbo is clearly causing a lot of confusion in the broader community, so it would be good to get some kind of final resolution, just to stop everyone from having to guess.
Personally I would mark it Rejected (with regret). The goal is great, but there are just too many conflicts between what people want from a low-level “run the interpreter on this precise file in this precise environment” tool and what they want from a high-level “do the right thing for my dev workflow” tool for a single CLI command to handle both. And since the python CLI is already committed to the former, there’s no way to provide a really good dev workflow in the same tool. A frontend approach like the one PDM takes is just better all around, and also much safer for evolution and consensus building in a complex area where there’s still active development happening. Once something is part of Python it’s essentially frozen and can’t easily be updated or fixed; frontend tools can be a lot more flexible.
@frostming before Kushal sends this PEP to the SC, do you have any input to provide? Did PDM 2 switch simply because PEP 582 never got accepted, and would you have happily kept using it if it had (that’s what the PDM 2 post seems to indirectly suggest)? Or was there another reason?
Following on from @njs’s comment, I think a node/npm or rust/cargo style setup might be the best resolution: when installing Python, the default package manager (let’s call it ppm) is also installed, but it’s a standalone tool with its own release cycle.
If this hypothetical ppm were part of pip, that might be “ideal” since pip is installed along with Python in most (all?) cases already and is the default package manager that most project docs suggest using.
But if ppm can’t be part of pip due to scope or other reasons, perhaps one of the existing package managers could be promoted to the default package manager and always be installed alongside Python.
Of course, like yarn in node land, one could always opt to use an alternative package manager, but this should be a deliberate choice and not something you have to think about unless it’s really necessary.
Selecting one of the existing tools as the default would be somewhat of a challenge, but I’m starting to think that’s what needs to happen here.
In this scenario, this PEP would still be useful in standardizing the layout of __pypackages__ and so forth so that package management tools have a baseline to build on.
FWIW, I’ve created PEP 704 as an alternative to this PEP, which I will withdraw should this PEP get accepted. It is intended to resolve the same concerns, albeit with a different approach: requiring virtual environments for pip install (and equivalents) and establishing a conventional name for virtual environments.
Personally, I’m ambivalent about which approach is picked between PEP 582 and that proposed PEP.
As much as I wanted to see 582 land, I tend to agree the ecosystem and tooling has evolved in a direction that makes it nearly impossible to fit this in without creating even more confusion.
I’ve read that venvs are best practice, but as a casual user of Python (mainly small scripts for business-related problems) I may not program for half a year, only to dig back in when something is easier done in Python than in Excel. I struggle enough just creating a solution with the std lib / learning how to use the libraries I pip install.
Having the default python search the script’s __pypackages__ for dependencies sounds appealing. Then at least it’ll sort of self-document working version numbers of dependencies, right?
I’m surprised by the pessimistic reading of this thread. I read all the comments, and while there are concerns (from a vocal few?), initial reactions are good and there isn’t a lot of outright hate.
Why not do a poll? Is it possible to do one with “select all the ones you agree with”?
Implement 582 as is
[EDIT] Implement as a module with a command-line switch
[EDIT] Implement, but without the Python version subfolder (more compatibility across computers for pure-Python-only dependencies)
Implement 582, assuming pip gets an option to install into __pypackages__ easily.
Implement 582, after consensus with package managers about the exact directory structure.
Implement 582, with a Python option to exclude site/user paths.
Reject 582.
(I may be missing concerns)
I still think there’s a potentially good option in a stdlib module that enables PEP 582 support, maybe also with a short command line switch which would be the equivalent of python -m pep582.
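To make that concrete, here’s a rough sketch of what such a hypothetical `pep582` module might look like. The module name, the “walk up from the current directory” lookup, and the runner behaviour are all my assumptions, not anything specified by the PEP; only the `__pypackages__/X.Y/lib` layout follows the PEP’s versioned scheme.

```python
# Hypothetical stdlib module "pep582" -- a sketch only, not a reference
# implementation. Run as:  python -m pep582 script.py
import os
import runpy
import sys


def find_pypackages(start):
    """Walk up from *start* looking for a __pypackages__ directory."""
    path = os.path.abspath(start)
    while True:
        candidate = os.path.join(path, "__pypackages__")
        if os.path.isdir(candidate):
            return candidate
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            return None
        path = parent


def enable(start=None):
    """Prepend the local __pypackages__ lib directory to sys.path."""
    pkg_dir = find_pypackages(start or os.getcwd())
    if pkg_dir is None:
        return False
    version = f"{sys.version_info[0]}.{sys.version_info[1]}"
    # Assumed layout: __pypackages__/3.X/lib (the PEP's versioned scheme).
    lib = os.path.join(pkg_dir, version, "lib")
    if os.path.isdir(lib):
        sys.path.insert(0, lib)
        return True
    return False


if __name__ == "__main__":
    enable()
    if len(sys.argv) > 1:
        # Behave like the plain interpreter, plus the extra path entry:
        # shift sys.argv and execute the target script as __main__.
        sys.argv = sys.argv[1:]
        runpy.run_path(sys.argv[0], run_name="__main__")
```

Invoked as `python -m pep582 script.py`, this would behave like the plain interpreter with one extra path entry, which is exactly why the opt-in question below matters.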
I think the ultimate question is whether PEP 582 still helps when it requires opt in? If users have to remember to type python -m pep582 or python -?, then is that any harder than .venv/bin/python?
As I understand it, the primary benefit of PEP 582 is that it just does the “right” thing for users, without them having to learn or remember the magic incantations to make it work, which you lose with something opt-in.
Overall, I’d also be concerned about how to manage the migration in PEP 582. I’m not particularly enthused about a workflow where the place you install to depends on whether or not a __pypackages__ directory exists. That feels like spooky-action-at-a-distance behavior. The other option is to (eventually) just make __pypackages__ the default no matter what.
I believe PEP 582 should be the default behaviour of Python, and part of the interpreter itself. The existence of __pypackages__ should be a way of opting into this mechanism, just like .git opts you into using Git features of IDEs. Virtual environments are confusing to users and have many weird edge cases not seen in __pypackages__ and node_modules.
PEP 582 does have some issues, the most notable one is the fact that it does not disable system site-packages (and it should).
If there is no PEP and PDM continues to implement it manually via sitecustomize.py, it will severely hinder the adoption of __pypackages__, especially by IDEs and other tools.
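For readers unfamiliar with the mechanism: `sitecustomize.py` is imported automatically by the `site` module at startup if it is found on `sys.path`, so a tool can use it to inject `__pypackages__` without touching the interpreter. A rough illustration of the idea (not PDM’s actual code; the layout details are assumptions):

```python
# sitecustomize.py -- illustrative sketch of the hook-based approach,
# not PDM's actual implementation.
import os
import sys


def _inject_pypackages():
    version = f"{sys.version_info[0]}.{sys.version_info[1]}"
    lib = os.path.join(os.getcwd(), "__pypackages__", version, "lib")
    if os.path.isdir(lib):
        # Put the project-local packages ahead of everything else.
        # Note: this does *not* remove the system site-packages entries,
        # which is exactly the gap mentioned above.
        sys.path.insert(0, lib)


_inject_pypackages()
```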
I’m on the fence about PEP 582, but I think that the existence of some magic directory toggling the behavior on or off is spooky action at a distance that will just confuse people. We should either decide that this is the path forward as the default and commit to it, or we should not do it at all.
I don’t think the .git comparison is a particularly good one, because .git doesn’t have a behavior other than to error if you run the command in a directory tree that doesn’t have a .git directory. So while the presence of a .git directory signifies that this is a Git repository, it doesn’t change the behavior of git from one non-error state to another.
Compare this to PEP 582, where the end result would be people accidentally installing things into their global environment when they didn’t mean to, causing more confusion about where exactly things are going to be installed.
I’m on the fence about PEP 582 because as a casual, only sometimes user of node.js and npm, I find node_modules behavior incredibly frustrating, and I don’t think I’ve worked on a project in Node where I wasn’t wishing for something akin to virtualenv by the time I was done. Obviously a lot of people use and like that particular feature of the node ecosystem, so it’s not clear to me whether I’m just used to my workflow or whether there’s some underlying problem with the node_modules approach.
I do know that one place this causes a lot of frustration for me personally is using it within a Docker container that you’re bind-mounting a host path into. Because node_modules, or in this case __pypackages__, lives in the same place the code does, it makes it difficult and annoying not to share the same set of installed packages on the host and inside the container [1]. This can work sort of OK in an ecosystem like Node, where a lot of projects may never have anything but a pure-JavaScript dependency tree, but it stops working the moment you have any sort of compiled artifact in your build, which, as it turns out, most Python projects end up having at least one of unless they’re very small or take great care not to.
Now an observant person would notice that this is essentially a problem of trying to reuse the __pypackages__ directory across multiple operating systems and/or computers, which is obviously a nonsense thing to do. But by colocating the installed environment with the project, you’re more or less making reuse of __pypackages__ between OSs/computers the default behavior any time you have a single project directory being shared among multiple of them, which can happen for a number of reasons such as:
Docker containers bind mounting (or even just copying the files in if __pypackages__ already exists on the host).
WSL2 mounts on Windows
Bind mounting file systems into virtual machines.
Dual booting computers with a shared data drive
Any tool like a NAS, Dropbox, etc that shares a directory between multiple computers.
If we were to go forward with something like PEP 582, I would want to at least see the PEP be updated to acknowledge these problems, and explicitly declare that it thinks the improvements to the new-user workflow are worth the cost of breaking all these other workflows (which may also impact new users, of course), or come up with some solution to that particular problem.
It’s not in the PEP (and I’m not planning to add it, but Kushal might), but I’d always kind of hoped this would help push us towards fat wheels, which just contain all the binaries needed for the range of versions/platforms they support.
Obviously there are a few cases where this would be prohibitive, because the binaries make up the majority of the package. But it seems in very many this isn’t the case, and everyone would be better off with a single wheel containing all the extension modules that just works on whatever version is being used, rather than having to duplicate all of the platform-independent files for the sake of one small platform-dependent one. (It might even encourage the “prohibitive” cases to restructure their native code to be more independent from the particular Python version, either through abi3 or their own dynamic library.)
Fat wheels narrow the problems down [1], assuming that implementing them means shipping all of the binary modules for the different environments and relying on the ABI tagging in Python extensions to handle binary compatibility… but I don’t think it solves the problem? A common situation might be a developer running on a macOS machine, with Docker running in a VM (because that’s how you run Docker on macOS), running a container that is running Alpine Linux.
We don’t support MUSL wheels at all currently, so if you did the pip install from inside the container, you’d get a version compiled inside the container that effectively only works inside that specific container. If you did it from macOS, you’d get ones that work for macOS (and maybe other OSs if we bundle Linux and Windows into an even fatter wheel) but that wouldn’t work for the Alpine container.
That also assumes, of course, that it’s just installing the same packages with different ABIs required for different OSs; but different OSs may also require entirely different dependencies, some of which are OS-specific and don’t even install on other OSs (e.g. pywin32).
Though there’s an unrelated problem of filename length already being an issue on PyPI, and as we try to make a wheel cover more use cases, we’ll be forced to add more tags to its filename, making it even longer. ↩︎
Yeah, I agree. But those are solvable if we decide they need to be solved, and right now they don’t because MUSL users can just install from source (or something).
Or alternatively, those issues need to be solved anyway because they exist regardless of how you create the environment. Given the number of venvs checked into GitHub, I’d expect that problem to only get incrementally worse, rather than suddenly becoming a new thing.[1] (Meanwhile, a pure-Python environment [without the version number directory people added since my original proposal] is totally fine to check into GitHub, and if that’s all your users need, then they can just clone and run. This is possible today, of course, as is making fat wheels, so the whole proposal is really to formalise it and make it natural rather than a frowned-upon hack.)
Fun fact: at work we added a scan for pip’s --extra-index-url command line/environment variable. The vast majority of uses were people who had checked in their environment and it was being found in pip’s own source code. ↩︎
I guess it depends on who “we” is. cibuildwheel can build musl Linux wheels, and cryptography ships them.
The NumPy wheel, when zipped, weighs about 15MB and has a ~~24MB~~ 6MB C-extension module in it. Bundling that C-extension module for 6 platforms (2 macOS, 2 Windows, 2 Linux (x86_64 and aarch64)) together would be quite large.
Edit: 6MB not 24MB. I was looking at a debug build
I think there could be a middle ground, achieved by accepting something that merges PEP 582 (with some improvements, perhaps) and PEP 704 (Require virtual environments by default for installers). Something like this (a rough, opinionated draft; a code sketch of the same rules follows the list):
If __pypackages__ is present and a venv/conda env is in use,[1] Python should raise a warning on startup and use only the venv/conda env,[2] and pip/package managers should refuse to install (with an escape hatch, i.e. an option to specify the destination).
If using a venv/conda env, Python should use packages from it and ignore system site-packages.
If __pypackages__ is present, Python should use packages from it and ignore system site-packages.
If neither venv/conda env nor __pypackages__ are present, Python should use system site-packages, and pip should refuse to install (unless the user explicitly accepts making a mess in the system site-packages) and tell the user how to fix this, with __pypackages__ being the recommended fix (considering its simplicity for beginners) and venvs also being mentioned as a valid and supported option.
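Expressed as pseudologic, the rules above might look roughly like the following. This is only a sketch of the draft; the function name, arguments, and return values are mine, not an API from PEP 582 or PEP 704.

```python
# Rough sketch of the decision rules in the draft above -- hypothetical
# helper, not anything specified by either PEP.
import warnings


def choose_environment(in_venv_or_conda, has_pypackages, explicit_destination=False):
    """Return (package_source, installer_allowed) per the draft rules."""
    if in_venv_or_conda and has_pypackages:
        # Rule 1: warn, prefer the explicit mechanism, and have installers
        # refuse unless the user names a destination (the escape hatch).
        warnings.warn(
            "__pypackages__ is present but a venv/conda env is active; "
            "using the venv/conda env."
        )
        return "venv", explicit_destination
    if in_venv_or_conda:
        # Rule 2: use the venv/conda env, ignore system site-packages.
        return "venv", True
    if has_pypackages:
        # Rule 3: use __pypackages__, ignore system site-packages.
        return "pypackages", True
    # Rule 4: bare system interpreter -- installers refuse by default and
    # suggest __pypackages__ (recommended) or a venv.
    return "system", explicit_destination
```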
Regarding the cases with sharing a folder, all those cases are equally broken/problematic if you’re using virtual environments stored in .venv in the project folder (as recommended by PEP 704). Even if you use something like .venv-windows and .venv-linux, the performance is terrible in the WSL2 case. Centrally-stored venvs would be immune from those problems, although note they were the minority in Brett’s recent Fediverse polls.
That said, I don’t think those cases should be the deciding factor, and they are quite easy to fix (e.g. by storing the code in the Linux filesystem on WSL2, by a .dockerignore for Docker builds, or by limiting your Docker bind mounts to only the source code directories relevant at runtime). And if some workflows are severely hindered by PEP 582, they can keep using virtual environments (nobody is proposing to get rid of them), but at the same time, many people, especially beginners, would benefit from a simplification of the environment management story.
Or alternatively refuse to run, although that seems too strong. Or alternatively __pypackages__ could win, but IMO an explicit mechanism should win over an implicit mechanism. ↩︎
Wouldn’t this create one more way people can do things and confuse even more newcomers? You’d have tutorials using virtual environments and also PEP-582 now…