PEP 582 - Python local packages directory

CAM-Gerlach · February 20, 2023, 12:28am

Just FYI @kushaldas , there has been a lot of interest and active discussions happening in the Conda/scientific Python community about PEP 704 and related pip-conda interaction issues, and it would be good to ensure PEP 582 is brought up there too. I’ve seen a lot of attention paid to the potential implications of PEP 704 for Conda and how to handle them, but it seems many/most of the other scientific Python and Conda folks aren’t as aware of this PEP and its similar possible impact on the situation, so it should be part of that conversation as well.

Saphyel · February 20, 2023, 9:04pm

I think Conda already replied in this discussion saying they don’t care about this PEP, so we can ignore them.

PythonCHB · February 21, 2023, 6:02am

I’m not sure who/what “Conda” is that can reply – but as an active member of the conda community, I can say I care a little bit about this PEP – it won’t have much effect on conda itself, but it may provide yet another source of confusion for newbies.

CAM-Gerlach · February 22, 2023, 7:44am

A couple of the potential issues I can think of for Conda users include:

The presence of a __pypackages__ dir means that installers will install into it by default rather than into the active environment. This has the same issues as the parallel pip and conda prefixes proposed and discussed above, and it also means that whether pip installs into a Conda env or __pypackages__ depends on the user’s CWD, which is impossible for Conda to manage.
Python adds it to sys.path, and any packages there will shadow any pip- or conda-installed packages in the Conda env, potentially breaking the environment with no opportunity for Conda to detect or warn about it (as it can with pip-installed packages in its own site-packages).
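To make the shadowing concern concrete, here is a minimal sketch. It assumes a local directory that is simply placed ahead of the environment's site-packages on sys.path; the module name `demo_shadowed_pkg`, both temporary directories, and the helper functions are hypothetical, and the versioned subdirectory layout the PEP describes is left out for brevity.

```python
import importlib
import sys
import tempfile
from pathlib import Path
from typing import Sequence


def write_module(directory: Path, name: str, origin: str) -> None:
    """Write a tiny module whose ORIGIN constant records where it came from."""
    directory.mkdir(parents=True, exist_ok=True)
    (directory / f"{name}.py").write_text(f"ORIGIN = {origin!r}\n")


def import_origin(search_path: Sequence[str], name: str) -> str:
    """Import `name` with `search_path` prepended to sys.path; return its ORIGIN."""
    saved_path = list(sys.path)
    sys.modules.pop(name, None)  # forget any previously imported copy
    sys.path[:0] = list(search_path)
    try:
        importlib.invalidate_caches()
        return importlib.import_module(name).ORIGIN
    finally:
        # Restore the interpreter state so the demo has no lasting effect.
        sys.path[:] = saved_path
        sys.modules.pop(name, None)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-ins for the Conda env and the project-local directory.
    env_site = Path(tmp) / "env" / "site-packages"
    local_pkgs = Path(tmp) / "project" / "__pypackages__"
    write_module(env_site, "demo_shadowed_pkg", "environment")
    write_module(local_pkgs, "demo_shadowed_pkg", "__pypackages__")

    # Local directory listed first: its copy wins.
    shadowed = import_origin([str(local_pkgs), str(env_site)], "demo_shadowed_pkg")
    # Without the local directory, the environment's copy is found.
    unshadowed = import_origin([str(env_site)], "demo_shadowed_pkg")

print(shadowed, unshadowed)
```

Nothing in the environment's own metadata changes between the two imports, which is why an environment manager would have no obvious place to notice the difference.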

As far as I’m aware, that was just informed speculation by Conda-using community members that it would likely not be practical or desirable for Conda as an installer to implement this PEP, not that the PEP would have no effect on the UX of Conda users as a whole.

pf_moore · February 22, 2023, 7:57am

That is no longer something the PEP proposes, nor is it something that I as a pip maintainer would support. Having said that, training courses and tutorials based around __pypackages__ would result in people installing there without understanding the trade-offs, so your point still stands.

methane · February 22, 2023, 8:24am

I don’t think so. This PEP cannot replace virtual environments. It just adds another way to get isolation in the Python ecosystem. Since it cannot cover all venv use cases, people need to learn both PEP 582 and venv, and choose one.

And people still need to choose a tool from pip/pdm/poetry/pipenv/conda/etc…

This PEP will make some use cases simpler, but it won’t make the Python packaging ecosystem as a whole simpler.
I still think recommending tools that hide venv, like poetry, to beginners is better than PEP 582.

PythonCHB · February 23, 2023, 7:05am

I was wondering about that myself – there is no “conda” to have an opinion. Maybe the core developers of conda itself might present an opinion, but I don’t think they have done so.

As a member of the conda community, I don’t think the proposal will break anything in conda, and it could be used within a conda environment – but I DO think it would provide one more avenue of confusion – I don’t think using __pypackages__ alongside conda would ever be the “right” thing to do.

And as I understand it, the intended use cases are newbies, and tutorials, and putting a __pypackages__ dir in a repo, etc – which means people that do not yet understand the intricacies of pip, virtual environments, conda, etc will find themselves using it, and if they do want to work with conda, there will be confusion.

brettcannon · February 23, 2023, 11:36pm

Instead of speculating, we can just loop in conda developers like @jezdez @LtDan33 @dholth to provide any opinion the conda team may have.

BrenBarn · February 24, 2023, 3:59am

I agree. I think it’s one more avenue of confusion no matter how you slice it. I’m basically -1 on any PEP like this that wants to add one more wrinkle to the already heavily corrugated packaging landscape.

Saphyel · March 5, 2023, 8:48pm

@BrenBarn so your suggestion is let’s keep our heavily corrugated packaging landscape?

This PEP at least tries a good approach, one that is very popular in most other languages, so it’s not a random or risky approach, or complex for the end user.

BrenBarn · March 5, 2023, 10:35pm

Yes, if the alternative is to corrugate it even more. What I’m saying is that this change will complicate, not simplify, that landscape.

methane · March 7, 2023, 3:00am

No. It is different from other language tools.

https://nodejs.org/docs/v6.10.3/api/modules.html#modules_loading_from_node_modules_folders

Node.js searches for a node_modules folder in parent directories, so Node.js users can put their scripts in a project/bin directory.

This PEP doesn’t search for a __pypackages__ directory in parent directories. project/bin/mytool.py cannot use packages in project/__pypackages__. This is a huge restriction for many Python projects. Python users will need to learn both __pypackages__ and venv.
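The difference between the two lookup rules can be sketched as follows. This is only an illustration of the behaviour described above, not code from the PEP or from Node.js; both function names and the temporary project layout are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Optional

LOCAL_DIR = "__pypackages__"


def find_next_to_script(script: Path) -> Optional[Path]:
    """Look only in the script's own directory (the rule described for the PEP)."""
    candidate = script.resolve().parent / LOCAL_DIR
    return candidate if candidate.is_dir() else None


def find_in_parents(script: Path) -> Optional[Path]:
    """Walk up from the script's directory, Node.js style; return the first match."""
    start = script.resolve().parent
    for directory in (start, *start.parents):
        candidate = directory / LOCAL_DIR
        if candidate.is_dir():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    # Build project/__pypackages__ and project/bin/mytool.py in a scratch dir.
    project = Path(tmp).resolve() / "project"
    (project / LOCAL_DIR).mkdir(parents=True)
    script = project / "bin" / "mytool.py"
    script.parent.mkdir()
    script.write_text("print('hello')\n")

    next_to_script = find_next_to_script(script)
    via_parents = find_in_parents(script)
    found_expected = via_parents == project / LOCAL_DIR

print(next_to_script, found_expected)
```

With the script under project/bin, the first rule finds nothing, while the parent-walking rule reaches project/__pypackages__.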

takluyver · March 7, 2023, 12:09pm

The ‘relationship to virtual environments’ section in the PEP reads to me like it’s downplaying a major concern: that this would make the landscape of Python modules more confusing and error-prone, not less. Three bits in particular jump out at me:

__pypackages__ can be viewed as providing an isolation capability…

‘Can be viewed’ implies ‘some people would see it this way’, but the motivation section of the same PEP describes it as ‘a lightweight solution that gives isolation’.

it is no different in principle to the presence of the current directory on sys.path (or mechanisms like the PYTHONPATH environment variable). The only difference is in degree…

Those mechanisms already cause user confusion - checking what’s in the working directory and if PYTHONPATH has been set are common steps when helping confused users with import issues. And we probably avoid more problems with PYTHONPATH because we know about the potential problems and rarely recommend setting it.

So it being like those things, but used & recommended substantially more, sounds like a recipe for significant confusion.
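As a rough sketch of the checks mentioned above (what PYTHONPATH contains, whether the working directory is importable, and where a module would actually come from), something like the following could be used when helping someone debug an import. The helper names are hypothetical; only standard-library calls are used.

```python
import importlib.util
import os
import sys
from typing import List, Mapping, Optional


def pythonpath_entries(environ: Mapping[str, str] = os.environ) -> List[str]:
    """Split PYTHONPATH into its entries; an unset variable gives an empty list."""
    raw = environ.get("PYTHONPATH", "")
    return [entry for entry in raw.split(os.pathsep) if entry]


def cwd_on_sys_path() -> bool:
    """True if the working directory is on sys.path (an empty entry means 'cwd')."""
    cwd = os.path.realpath(os.getcwd())
    return any(os.path.realpath(entry or os.getcwd()) == cwd for entry in sys.path)


def module_origin(name: str) -> Optional[str]:
    """Report where a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


print("PYTHONPATH entries:", pythonpath_entries())
print("cwd importable:", cwd_on_sys_path())
print("json comes from:", module_origin("json"))
```

A __pypackages__ directory would add one more location to inspect in this kind of session.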

The PEP authors believe that developers using virtual environments should be experienced enough to understand the issue and anticipate and avoid any problems.

This seems extremely optimistic to me. The message that everyone should use virtualenvs has been spread far and wide for years, there are scripts and tools that create and use them, and an entire separate environment system (conda) which will be affected by the same concerns. I don’t think you can assume that only experts use environments.

BrenBarn · March 8, 2023, 6:57am

I agree. In fact, I think the better direction to go would be the opposite, and rather try to push for a vision where everyone, expert or not, is always using environments. Some of those environments may correspond to things like “an environment for system-level Python tools” or “my grab bag environment for fiddling with a motley collection of scripts”, but those would still be environments.

That may sound extreme, but I’d say it’s not impossible over time with the right framing. Even very novice users typically have some familiarity with the idea of a “project” or a “folder” or some type of organizational structure for things. The only thing that’s needed is to nudge people to think “Every time you run Python, it’s a particular Python that’s for a particular task or category, even if that category is some very generic ‘miscellaneous’ thing”.

I think part of the problem we have around packaging and environments is that in various situations Python, or Python users, or some other Python tool, tries to pretend there is no environment. But there’s always an environment. It’s just better to make those environments explicitly recognized as such, rather than have them exist implicitly.

brettcannon · March 8, 2023, 8:13pm

I agree, and this is why VS Code is actually working towards this. Our plan is to nudge folks as much as we can to create and use environments as much as possible without being annoying about it (and we are being environment-agnostic, so both virtual and conda environments). My hope is we can make this simple enough that it becomes second nature for folks to do this for every project/workspace they use Python in. And then with VS Code’s reach, my hope is this idea that you create an environment the instant you start a new project becomes a bit more normalized in the community and less of a scary thing.

Now if this PEP gets accepted we will probably lean into it, but even if it doesn’t we are still moving forward with normalizing environments for folks as much as we can.

BrenBarn · March 8, 2023, 11:41pm

That’s good to hear! For me a caveat is that I’m a bit leery of “create an environment when you create a project”, since I don’t see projects and environments as necessarily in one-to-one correspondence. But either way, if something like VS Code can get people more comfortable with the idea that “there is always an environment”, that would be great.

What I have more skepticism about, though, is whether the necessary degree of change is going to happen just by making it easier for people to use environments. What also needs to happen is it needs to become harder to not use environments — i.e., like I keep saying in all these threads, the mainstream ways that Python gets distributed to users (such as the Python.org versions) need to eventually shift wholesale to a manager-first approach.

brettcannon · March 9, 2023, 10:50pm

You’re definitely not the only one, but potentially in the minority based on an informal poll I did for a blog post on this subject. But the key thing here is what’s best for beginners, and I personally don’t think trying to do an N:M relationship of projects to environments will help. This is why we are keeping it simple and opinionated and just defaulting to sticking the environment in .venv locally.

That’s part of the work we are doing: it’s not just a Create Environment command, but also actively suggesting to users they probably forgot to create an environment when they don’t have one selected when performing actions where a virtual environment would probably be a good thing to have (e.g. running code since you probably have some dependencies you should install). We’re still working out where to insert such prompting, but this isn’t going to be passive since I know from experience people who aren’t embedded in Python development do not necessarily know environments even exist.
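For readers wondering how a tool could tell that no environment is in use, here is a minimal sketch. It is not how VS Code does it; it assumes the usual standard-library signal for virtual environments (sys.prefix differing from sys.base_prefix) and treats the CONDA_PREFIX variable set by conda activation as a heuristic. The function names and the hint text are hypothetical.

```python
import os
import sys
from typing import Mapping


def in_virtual_environment() -> bool:
    """True when the running interpreter belongs to a venv-style environment."""
    return sys.prefix != sys.base_prefix


def in_conda_environment(environ: Mapping[str, str] = os.environ) -> bool:
    """Heuristic: an activated conda environment sets CONDA_PREFIX."""
    return bool(environ.get("CONDA_PREFIX"))


def describe_environment(environ: Mapping[str, str] = os.environ) -> str:
    """Return a one-line hint a tool might show before running code."""
    if in_virtual_environment():
        return f"virtual environment: {sys.prefix}"
    if in_conda_environment(environ):
        return f"conda environment: {environ['CONDA_PREFIX']}"
    return "no environment selected; consider creating one (for example in .venv)"


print(describe_environment())
```

An editor has more information than this (it knows which interpreter the user picked), so the check above only approximates what a running script can see about itself.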

PythonCHB · March 11, 2023, 7:59am

I was curious about what community you sampled, and immediately noted that “When I talk about virtual environments, I am not talking about conda environments.”

and:

“This entire post is assuming that using virtual environments is something everyone should be doing, and so there will be no time spent justifying their use.”

This is starting off with a bias – it feels like what you learned is not how people use virtual environments, but how people that already have a particular work flow and use case use virtual environments – which may be appropriate for what you’re thinking about, but maybe not fully applicable to a more general discussion.

Then there is:
“When I did a poll via Mastodon to figure out why people used a central directory approach, the majority of people did it that way because their tool happened to work that way or it was just habit (53%). The next biggest group kept their environments in a central directory for environment reuse (24%).”

So that’s 24% – a minority for sure, but not a negligible one. And of the 53% maybe some of those folks would choose to use a central directory if they thought about it carefully. So who knows?

Anyway, the take-away for me is that not storing them locally is indeed an important use case.

I think the best thing for beginners is to have a single environment to work with, so not N:M, but 1:M. Most of the really important reasons to use virtual environments don’t apply to beginners (of course, there’s a lot of different kinds of “beginners” [*]).

What probably most of us DO agree with is that using the “system” Python as your default general purpose environment is probably not a good idea – particularly if you’re running an OS that uses the system Python for its own use. That being said, I’ve had exactly zero problems in my intro courses using no environments at all (at least for Windows and Mac users – Linux folks are going to start having issues now that EXTERNALLY-MANAGED is becoming a thing).
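For context on the EXTERNALLY-MANAGED remark: as I understand PEP 668, a distribution marks its Python by placing a file with that name in the standard-library directory, and installers are expected to honour it only outside a virtual environment. The sketch below checks for the marker under those assumptions; the function names are hypothetical and the temporary directory with its file contents is purely illustrative.

```python
import sys
import sysconfig
import tempfile
from pathlib import Path
from typing import Optional

MARKER_NAME = "EXTERNALLY-MANAGED"


def marker_path(stdlib_dir: Optional[Path] = None) -> Path:
    """Location of the marker; defaults to this interpreter's stdlib directory."""
    base = stdlib_dir if stdlib_dir is not None else Path(sysconfig.get_path("stdlib"))
    return base / MARKER_NAME


def is_externally_managed(
    stdlib_dir: Optional[Path] = None, in_venv: Optional[bool] = None
) -> bool:
    """True if the marker exists and we are not inside a virtual environment."""
    if in_venv is None:
        in_venv = sys.prefix != sys.base_prefix
    return not in_venv and marker_path(stdlib_dir).is_file()


with tempfile.TemporaryDirectory() as tmp:
    fake_stdlib = Path(tmp)  # illustrative stand-in for a distro's stdlib dir
    before = is_externally_managed(fake_stdlib, in_venv=False)
    marker_path(fake_stdlib).write_text(
        "[externally-managed]\nError=Use your distro's package manager.\n"
    )
    after = is_externally_managed(fake_stdlib, in_venv=False)
    inside_venv = is_externally_managed(fake_stdlib, in_venv=True)

print("this interpreter externally managed:", is_externally_managed())
```

The `inside_venv` case shows why creating any environment sidesteps the restriction: the marker is still on disk, but it no longer applies.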

What I would like to see is not the current “use the system python by default” or “you have to make and use a virtual environment” – what I think would serve folks best is if the out of the box Python install came with a separate-from-the-main-system default environment. Then a simple, nothing special pip install whatever would go in this environment (probably in the user’s local dir). Then folks wouldn’t have to learn all about environments when they don’t need to, but would also not tend to break their system using the default workflow.

This is how I set up my machines with conda – I create a default environment for my general work, and I have my shell set up to activate it by default – this may not work well with IDEs, etc, but it works great for me. If there could be a way to get that experience out of the box with standard Python and pip, I think that would go a long way.

As a rule, I think it’s best for it to be easy to do the right thing, and harder to do the wrong thing – so kinda yes. But please don’t make it harder to not use environments without also making it easier to use them.

Again: does “use environments” mean “use a separate environment for every project” or “don’t use the system Python”? Ideally, make it hard to use the system Python, but dead easy to use a single, per-user environment out of the box.

[*] Kinds of Beginners:
If you are teaching someone that’s already a developer how to create web apps in Python then yes, start with “create an environment for your project”.

If you are teaching folks that are brand new to Python, and maybe brand new to coding in general, then there is no reason to make virtual environments, and certainly not one per project – what is a “project” in that case? Small exercises for a class do not need the overhead of a virtual environment.

If you are teaching folks that want to write sysadmin scripts, or do a little data analysis or text file crunching, then virtualenvs are also not the best way to start.

BrenBarn · March 11, 2023, 8:39am

As @PythonCHB noted, the fact that you explicitly excluded conda there means we’re not really talking about the same thing. What I mean by “environment” is just “some way of isolating/separating different dependency trees”. It’s a concept, not tied to any particular tool.

Still, if the VS Code thing just gets people into the mindset of “hey everything I do is in some environment”, that’s still progress in my book. The hard part is getting people from thinking “what are environments? I thought I was just using Python” to “okay so everything I do with Python is in some kind of environment”. Once they’ve grasped that, it’s much easier for them to understand that there could be different tools to manage those environments. (They may still like or dislike some of those tools for various reasons, of course.)

I mean, I don’t use VS Code myself, but the way I think of it is even more extreme. It’s more like, you cannot run code at all without an environment, because without an environment, you don’t have Python. If someone’s in an IDE and tries to run code, they have to have some environment. The question is just what that environment is and how it is selected (e.g., are they prompted to create one when creating a project, are they choosing from a list of existing environments, etc.).

BrenBarn · March 11, 2023, 8:56am

What I mean by “harder” here is more like “it should be hard to get to a situation where you are able to run Python without some awareness of what environment you’re in”. If that environment is “the environment in which critical systemwide OS components run”, that’s fine, it’s just that the person using it should realize that is what it is.

Definitely not separate env per project, at least for me! But in my utopian future, there is no “system Python” as we know it now. The system Python is just another environment. It’s an environment that probably has some safeguards against accidentally messing it up (akin to what’s being discussed for protecting the conda base environment), but it’s not different in kind. There could even be multiple system environments for different system tools, allowing such tools to have distinct dependency trees.

I’d say there is still a reason to make and use environments in that case. In fact part of what you would teach them is just “When you use Python, you’re always using some particular Python setup, which we call an environment. That could represent a single project, or a ‘workspace’ used for many projects, or various things. For now, we’ll just create an environment for this class. . .” You don’t need to spend a lot of time on it, but just introduce the concept as an inseparable component of what you do whenever you use Python.

It does seem from this discussion that the thing that everyone agrees on is “using the system Python by default is bad”, and maybe even “using the system Python at all is generally bad” (because the only people for whom it’s clearly good are those writing system tools, who know what they’re doing well enough to ignore any such dictum). My own inclination is to try to abstract that a bit and say “it is bad to be doing work for a particular task in an environment that also controls more global important things that you do not want to mess up”. This includes the system Python, but also things like conda base. It could even include environments created by pipx containing apps you use globally.

One way or another though, we want the default tooling to firmly and unapologetically put people in a situation where they are not messing up an important/global environment with task-specific work. In that sense this PEP may be a positive thing, but I’m still against it because it encourages the “don’t-think-about-environments” mindset.