I just saw this article, “How to improve Python packaging, or why fourteen tools are at least twelve too many” by Chris Warrick, and it seems like this PEP is getting more and more people interested. It’s a shame that even though people like me or Wyatt are willing to help promote it, we are just getting ignored and this PEP is just sitting here…
I am planning to send the PEP to SC this week and see what SC thinks.
I’m not sure how the SC could accept this when there’s no consensus from the packaging side about whether it’s even a good idea. But having the PEP hanging around forever in Proposed limbo is clearly causing a lot of confusion in the broader community, so it would be good to get some kind of final resolution, just to stop everyone from having to guess.
Personally I would mark it Rejected (with regret). The goal is great, but there are just too many conflicts between what people want from a low level “run the interpreter on this precise file in this precise environment” tool and what they want from a high level “do the right thing for my dev workflow” tool for a single cli command to handle both. And since the `python` cli is already committed to the former, there’s no way to provide a really good dev workflow in the same tool. A frontend approach like `pdm` takes is just better all around, and also much safer for evolution and consensus building in a complex area where there’s still active development happening. Once something is part of `python` it becomes frozen forever and can’t be updated or fixed; frontend tools can be a lot more flexible.
@frostming before Kushal sends this PEP to the SC, did you have any input to provide? Did PDM 2 switch simply because PEP 582 never got accepted, and you would have happily kept using it if it had (that’s what the PDM 2 post seems to indirectly suggest)? Or was there another reason?
Following from @njs’s comment, I think a `cargo` setup might be the best resolution: when installing Python, the default package manager (let’s call it `ppm`) is also installed, but it’s a standalone tool with its own release cycle.
If this hypothetical `ppm` were part of `pip`, that might be “ideal” since `pip` is installed along with Python in most (all?) cases already and is the default package manager that most project docs suggest using.
If `ppm` can’t be part of `pip` due to scope or other reasons, perhaps one of the existing package managers could be promoted to the default package manager and always be installed alongside Python.
Of course, like `node` land, one could always opt to use an alternative package manager, but this should be a deliberate choice and not something you have to think about unless it’s really necessary.
Selecting one of the existing tools as the default would be somewhat of a challenge, but I’m starting to think that’s what needs to happen here.
In this scenario, this PEP would still be useful in standardizing the layout of `__pypackages__` and so forth so that package management tools have a baseline to build on.
FWIW, I’ve created PEP 704 as an alternative to this PEP, which I will withdraw should this PEP get accepted. It is intended to resolve the same concerns, albeit with a different approach: requiring virtual environments for `pip install` (and equivalents) and establishing a conventional name for virtual environments.
Personally, I’m ambivalent on which approach is picked between PEP 582 or that proposed PEP.
As much as I wanted to see 582 land, I tend to agree the ecosystem and tooling have evolved in a direction that makes it nearly impossible to fit this in without creating even more confusion.
PEP-704 looks like the right move to me.
I prefer PEP 582 approach over a full venv.
I’ve read that venvs are best practice, but I’m a casual user of Python (mainly small scripts for business-related problems). I may not program for half a year, only to dig back in when something is easier done than in Excel. I struggle enough just creating a solution with the std lib / learning how to use the libraries I pip install.
Having the default python search the script’s `__pypackages__` for dependencies sounds appealing. Then at least it’ll sorta self-document working version numbers of dependencies, right?
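For readers who have not looked at the mechanism, here is a rough sketch of what “search the script’s `__pypackages__`” could look like if done by hand. It assumes the `__pypackages__/X.Y/lib` layout described in PEP 582; the function names `pypackages_lib_dir` and `enable_pypackages` are made up for illustration, and putting the directory at the very front of the path is a simplification rather than what the PEP specifies.

```python
import sys
from pathlib import Path


def pypackages_lib_dir(script_path, version_info=sys.version_info):
    """Return the __pypackages__/X.Y/lib directory next to a script.

    The X.Y/lib layout is taken from PEP 582; this helper itself is
    hypothetical and only builds the path, it does not create anything.
    """
    script_dir = Path(script_path).resolve().parent
    version = f"{version_info[0]}.{version_info[1]}"
    return script_dir / "__pypackages__" / version / "lib"


def enable_pypackages(script_path, path=None):
    """Prepend the script's __pypackages__ lib directory to a path list.

    Returns True if the directory exists and is now on the path,
    False if there is no such directory. Defaults to sys.path.
    """
    path = sys.path if path is None else path
    lib_dir = pypackages_lib_dir(script_path)
    if not lib_dir.is_dir():
        return False
    if str(lib_dir) not in path:
        # Simplification: front of the list, so it wins over everything else.
        path.insert(0, str(lib_dir))
    return True
```

A script could call `enable_pypackages(__file__)` before its third-party imports. The point of the PEP is that nobody should have to write this; the sketch only shows how little the lookup itself involves.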
I’m surprised by the pessimistic reading of this thread. I read all the comments, and while there are concerns (from a vocal few?), initial reactions are good and there isn’t a lot of outright hate.
Why not do a poll? Is it possible to do one with “select all the ones you agree with”?
- Implement 582 as is
- [EDIT] Implement as a module with a command line switch
- [EDIT] Implement, but without the Python version subfolder (more compatibility across computers with pure-Python-only dependencies)
- Implement 582, assuming pip could have an option to install into `__pypackages__` easily
- Implement 582, after consensus with package managers about the exact directory structure
- Implement 582, with a python option to exclude site/user paths
(I may be missing concerns)
I still think there’s a potentially good option in a stdlib module that enables PEP 582 support, maybe also with a short command line switch which would be the equivalent of `python -m pep582`.
I think the ultimate question is whether PEP 582 still helps when it requires opt in? If users have to remember to type `python -m pep582` or `python -?`, then is that any harder than
As I understand it, the primary benefit of PEP 582 is that it just does the “right” thing for users, without them having to learn or remember the magic incantations to make it work, and you lose that if you make it opt-in.
Overall, I’d also be concerned about how to manage the migration in PEP 582. I’m not particularly enthused about a workflow where the place you install to depends on whether or not a `__pypackages__` directory exists. That feels like spooky action at a distance behavior. The other option is to (eventually) just make `__pypackages__` the default no matter what.
I believe PEP 582 should be the default behaviour of Python, and part of the interpreter itself. The existence of `__pypackages__` should be a way of opting into this mechanism, just like `.git` opts you into using Git features of IDEs. Virtual environments are confusing to users and have many weird edge cases not seen in
PEP 582 does have some issues; the most notable one is that it does not disable system site-packages (and it should).
If there is no PEP and PDM continues to implement it manually via `sitecustomize.py`, it will severely hinder the adoption of `__pypackages__`, especially by IDEs and other tools.
Yes, this is the main reason. And that the ecosystem doesn’t have enough support, such as IDEs won’t run the hack in
I’m on the fence about PEP 582, but I think that the existence of some magic directory toggling the behavior on or off is spooky action at a distance that will just confuse people. We should either decide that this is the path forward as the default and commit to it, or we should not do it at all.
I don’t think the `.git` comparison is a particularly good one, because `.git` doesn’t have a behavior other than to error if you run the command outside of a directory tree that has a `.git` directory. So while the presence of a `.git` directory signifies this is a git repository, it doesn’t change the behavior of git from one non-error state to another.
Compare this to PEP 582, where the end result would be people accidentally installing things into their global environment when they didn’t mean to, causing more confusion about where exactly they’re going to install to or not.
I’m on the fence about PEP 582 because as a casual, only sometimes user of node.js and npm, I find `node_modules` behavior incredibly frustrating, and I don’t think I’ve worked on a project in Node where I wasn’t wishing for something akin to virtualenv by the time I was done. Obviously a lot of people use and like that particular feature of the node ecosystem, so it’s not clear to me whether I’m just used to my workflow or whether there’s some underlying problem with the
I do know that one place this causes a lot of frustration for me personally is using it within a Docker container that you’re bind mounting a host path to. Because the `node_modules`, or in this case
Now an observant person would notice that this is essentially a problem of trying to re-use the `__pypackages__` across multiple operating systems and/or computers, which is obviously a nonsense thing to do. But by colocating the installed environment with the project, you’re more or less making re-use of `__pypackages__` between OSs/computers the default behavior anytime you have a single project directory being shared among multiple OSs/computers, which can happen for a number of reasons such as:
- Docker containers bind mounting (or even just copying the files in if `__pypackages__` already exists on the host).
- WSL2 mounts on Windows
- Bind mounting file systems into virtual environments.
- Dual booting computers with a shared data drive
- Any tool like a NAS, Dropbox, etc that shares a directory between multiple computers.
If we were to go forward with something like PEP 582, I would want to at least see the PEP be updated to acknowledge these problems, and either explicitly declare that it thinks the improvements to the new user workflow are worth the cost of breaking all these other workflows (which may also impact new users of course) or come up with some solution to that particular problem.
See Warehouse, for instance, as an example of trying to keep `node_modules` from getting mixed up between the host and the container.
It’s not in the PEP (and I’m not planning to add it, but Kushal might), but I’d always kind of hoped this would help push us towards fat wheels, which just contain all the binaries needed for the range of versions/platforms they support.
Obviously there are a few cases where this would be prohibitive, because the binaries make up the majority of the package. But it seems in very many this isn’t the case, and everyone would be better off with a single wheel containing all the extension modules that just works on whatever version is being used, rather than having to duplicate all of the platform-independent files for the sake of one small platform-dependent one. (It might even encourage the “prohibitive” cases to restructure their native code to be more independent from the particular Python version, either through abi3 or their own dynamic library.)
But then, I’m a dreamer
Fat wheels narrow the problems down, assuming that the implementation of them means that all of the binary modules for different environments are bundled together and we rely on the ABI tagging in Python extensions to handle binary compatibility… but I don’t think it solves the problem? Like a common situation might be a developer running on a macOS machine, with Docker running in a VM (because that’s how you run Docker on macOS), running a container that is running Alpine Linux.
We don’t support MUSL wheels at all currently, so if you did the `pip install` from inside the container, you’d get a version compiled inside the container that effectively only works inside that specific container. If you did it from macOS, you’d get ones that work for macOS (and maybe other OSs if we bundle Linux and Windows into an even fatter wheel) but that wouldn’t work for the Alpine container.
That also assumes of course that it’s just installing the same packages, but with different ABIs required for different OSs, but different OSs may also just require different dependencies installed completely, some of which may be OS specific and don’t even install on other OSs (e.g. pywin32).
Though there’s an unrelated problem of filename length already being an issue on PyPI, and as we try to make a Wheel cover more use cases, we’ll be forced to add more tags to its filename, making it even longer.
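To make the filename concern concrete, here is a small sketch that splits a wheel filename into its tag components. It assumes the standard `name-version[-build]-python-abi-platform.whl` convention with `.`-separated compressed tag sets; `parse_wheel_filename` is a hypothetical helper written for this illustration (not the `packaging` library’s API), and the filename used is a made-up example rather than a real release.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into name, version, build and tag lists.

    Hypothetical helper for illustration; it does no validation beyond
    counting the dash-separated fields.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected number of fields in {filename!r}")
    # The last three fields are always python tag, abi tag, platform tag.
    python_tags, abi_tags, platform_tags = (p.split(".") for p in parts[-3:])
    return {
        "name": parts[0],
        "version": parts[1],
        "build": parts[2] if len(parts) == 6 else None,
        "python_tags": python_tags,
        "abi_tags": abi_tags,
        "platform_tags": platform_tags,
    }


# Made-up filename: every extra platform covered adds another dotted tag.
example = "example_pkg-1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
info = parse_wheel_filename(example)
print(len(example), info["platform_tags"])
```

Each additional platform a single wheel claims to support becomes one more entry in the platform tag set, which is where the length growth would come from.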
Yeah, I agree. But those are solvable if we decide they need to be solved, and right now they don’t because MUSL users can just install from source (or something).
Or alternatively, those issues need to be solved anyway because they exist regardless of how you create the environment. Given the number of venvs checked into GitHub, I’d expect that problem to only get incrementally worse, rather than suddenly becoming a new thing. (Meanwhile, a pure-Python environment [without the version number directory people added since my original proposal] is totally fine to check into GitHub, and if that’s all your users need, then they can just clone and run. This is possible today, of course, as is making fat wheels, so the whole proposal is really to formalise it and make it natural rather than a frowned-upon hack.)
Fun fact: at work we added a scan for pip’s `--extra-index-url` command line option/environment variable. The vast majority of hits were from people who had checked in their environment, where it was being found in pip’s own source code.
I guess it depends who “we” is. Cibuildwheel can build musl linux wheels, and cryptography ships them.
The NumPy wheel, when zipped, weighs about 15MB and has a 6MB C-extension module in it. Bundling that C-extension module for 6 platforms (2 macOS, 2 Windows, 2 Linux (x86_64 and aarch64)) together would be quite large.
Edit: 6MB, not 24MB. I was looking at a debug build.
I think there could be a middle ground, achieved by accepting something that merges PEP 582 (with some improvements perhaps) and PEP 704 (Require virtual environments by default for installers). Something like this (rough, opinionated draft):
- If `__pypackages__` is present and a venv/conda env is also in use, Python should raise a warning on startup, use only the venv/conda env, and pip/package managers should refuse to install (with an escape hatch, i.e. an option to specify the destination).
- If using a venv/conda env, Python should use packages from it and ignore system site-packages.
- If `__pypackages__` is present, Python should use packages from it and ignore system site-packages.
- If neither a venv/conda env nor `__pypackages__` is present, Python should use system site-packages, and pip should refuse to install (unless the user explicitly accepts making a mess in the system site-packages) and tell the user how to fix this, with `__pypackages__` being the recommended fix (considering its simplicity for beginners) and venvs also being mentioned as a valid and supported option.
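The four cases in the draft above can be written down as a small decision function, which makes it easier to check that they do not overlap. This is only a sketch: `Resolution` and `resolve` are hypothetical names, and the assumption that installers proceed normally in the two middle cases is mine, since the draft does not say so explicitly.

```python
from typing import NamedTuple


class Resolution(NamedTuple):
    packages_from: str       # where Python would import third-party packages from
    warn_on_startup: bool    # whether Python would warn when starting
    installer_refuses: bool  # whether pip and friends would refuse by default


def resolve(in_env: bool, has_pypackages: bool) -> Resolution:
    """Map the draft's four cases to an outcome.

    in_env: a venv or conda env is active.
    has_pypackages: a __pypackages__ directory is present.
    """
    if in_env and has_pypackages:
        # Explicit mechanism wins, but the conflict is flagged.
        return Resolution("venv/conda env", True, True)
    if in_env:
        # Assumption: installers behave as usual inside an env.
        return Resolution("venv/conda env", False, False)
    if has_pypackages:
        # Assumption: installers target __pypackages__ without complaint.
        return Resolution("__pypackages__", False, False)
    # No isolation at all: read from system site-packages, refuse to install.
    return Resolution("system site-packages", False, True)


for in_env in (True, False):
    for has_pypackages in (True, False):
        print(in_env, has_pypackages, resolve(in_env, has_pypackages))
```

In every case where the installer refuses, the draft still allows an explicit escape hatch, which the sketch leaves out.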
Regarding the cases with sharing a folder, all those cases are equally broken/problematic if you’re using virtual environments stored in `.venv` in the project folder (as recommended by PEP 704). Even if you use something like `.venv-linux`, the performance is terrible in the WSL2 case. Centrally-stored venvs would be immune from those problems, although note they were the minority in Brett’s recent Fediverse polls.
That said, I don’t think those cases should be the deciding factor, and they are quite easy to fix (e.g. by storing the code in the Linux filesystem on WSL2, by a `.dockerignore` for Docker builds, or by limiting your Docker bind mounts to only the source code directories relevant at runtime). And if some workflows are severely hindered by PEP 582, they can keep using virtual environments (nobody is proposing to get rid of them), but at the same time, many people, especially beginners, would benefit from a simplification to the environment management story.
Conda envs are taken into account too, in line with comments under the PEP 704 discussion.
Or alternatively refuse to run, although that seems too strong. Or alternatively `__pypackages__` could win, but IMO an explicit mechanism should win over an implicit mechanism.
Wouldn’t this create one more way people can do things and confuse even more newcomers? You’d have tutorials using virtual environments and also PEP-582 now…
The world has a lot of confusing or questionable tutorials.
- You can find tutorials teaching `easy_install`, last updated June 2022. Though for some reason, this one uses `pip` to install an ancient version of setuptools.
- Or a tutorial utilizing `sudo pip`, which claims to be from December 2022.
- Or a tutorial (though more of an advertisement) with `python setup.py install` from August 2022.
All three of those tutorials are relatively recent (with the oldest being 7 months old). All of them contain bad practices that can confuse newcomers and break their environments.
Bad tutorials aside, you can find tutorials recommending pip+venv, pipenv, poetry, pdm. Is this not confusing to newcomers? Would PEP 582 be special in that regard?
Nobody can control the tutorial-writers of the world, and they can write anything they want. That said, if I were writing tutorials, especially tutorials on packaging.python.org, I would consider PEP 582/`__pypackages__` as the main way to get things done, and venvs as the more advanced route with specific use-cases. A tutorial or some university course teaching people how to work with `requests` can just say `mkdir __pypackages__; pip install requests` and be done with it, without having to explain the venv situation and without having issues with people forgetting to activate their venvs. This does not preclude a tutorial teaching venvs, listing the cases where they are more useful than PEP 582 (testing different extras/package configurations, for example), and listing their pros and cons compared to PEP 582.