Conda Python, just like any Python, allows things to be added to sys.path, and it calculates the initial sys.path the same as standard Python (because it is standard Python, just with a different build process). So I don’t see how conda Python could not support this PEP.
Conda the package manager could choose not to provide a way to install into a __pypackages__ directory. So could pip. That’s not (an enforceable) part of the PEP, so that’s fine.
The implications of people putting stuff into __pypackages__ is something conda would have to deal with. Just like pip will. And just like both of us have to deal with people setting PYTHONPATH, or manipulating sys.path at runtime right now.
Someone sufficiently insane could try to write a packaging PEP to lay down rules on how installers should deal with PYTHONPATH or runtime sys.path manipulation, or indeed PEP 582, and get both pip and conda to buy into that standard. Good luck with that - I’ll get popcorn and watch the show
It is true that the core of the PEP is adding some entries to sys.path, but I think it’s a bit more subtle than just “it’s of benefit to some use cases, and does no harm if not used”.
On the benefit side, it’s typical for tools that implement a PEP 582-like workflow to not just check the current directory, but also recurse upwards. This isn’t a bad thing, because it lets things still work if a beginner is in a subdirectory of their project, and it matches the expectations that come from tools like npm, git, etc. I think it would be confusing to people if our implementation of that implicit local environment didn’t also carry that behavior, and I think the PEP blunts a lot of its potential benefit by limiting itself in this way.
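For concreteness, the upward-recursing lookup being described might look like this minimal sketch. The function name and the stop-at-`$HOME` rule are my own choices here, not anything the PEP (or npm) specifies:

```python
from pathlib import Path
from typing import Optional

def find_pypackages(start: Path) -> Optional[Path]:
    """Walk upward from `start` looking for a __pypackages__ directory.

    Stops at the user's home directory as a crude security boundary,
    so a stray __pypackages__ in /tmp or / is never picked up.
    """
    home = Path.home()
    for directory in (start, *start.parents):
        candidate = directory / "__pypackages__"
        if candidate.is_dir():
            return candidate
        if directory == home:  # never scan above $HOME
            return None
    return None
```

This is the behavior npm-style tools exhibit; the PEP as written checks only the starting directory and never loops.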
On the harm side, I think that’s underestimating things. Every path on sys.path comes with a cost; a lot of work has been done to reduce the impact on interpreter startup and the import system, but it’s still not zero. There’s another cost here too, in that these new directories are going to do the wrong thing sometimes, and now people who don’t even use or want the feature are going to have to be aware of it and know how to work around it, particularly if tools start to install there by default. This harm of course grows if the PEP doesn’t blunt its usefulness by limiting its ability to scan for a directory, but even in its current iteration it’s still there.
One of the things that I’ve noticed is that the discussion around this PEP is messy, because a lot of people when discussing the benefits either assume that it’s going to recurse into parent directories (because that’s what basically every other tool that implements something like this does) or they make claims that to realize the benefits they claim, rely on it.
If PEP 582 goes in without recursing, I suspect that we will immediately hear complaints that we behave differently than every other tool like this does, which is why I agree with @njs that it feels like a compromise that makes nobody happy.
From the beginning, this PEP only talked about scanning the current directory and nothing else (unless it is a script, in which case we check the script’s directory). The people asking for scanning of parent directories have a much more specialized use case, where they also take care of the security side (who is allowed to write to parent directories, etc.).
Scanning parent directories is also not supported by the current Python interpreters, even when using virtual environments.
Also, the folks intended to gain the most from the PEP are people new to Python, and they are not the ones talking in this thread; instead, much more experienced folks are asking for the exact corner cases of their needs.
Agreed. If that’s enough of a problem, the PEP will be rejected. @kushaldas said he didn’t want to add that (because of the security implications). Fair enough. For what it’s worth, my use cases for this feature revolve around bundling dependencies for scripts, and in that context, scanning parents isn’t needed or helpful. So I’m OK with this. People wanting a “virtualenv-lite” solution won’t like it. I’m not personally convinced that “virtualenv-lite” is an important use case. People wanting a simpler way of teaching Python (a separate case from “virtualenv-lite”) will have to judge for themselves; when I taught Python, I didn’t have this issue.
I wasn’t saying that there’s no harm, just that the harm can (mostly!) be assessed in the same way as any other proposal to add to sys.path. You enumerated some of those. I’ll add that like adding the CWD, this proposal adds a context-dependent value, which has additional risks. Such as shadowing the stdlib with a local file, or allowing users to forget they had a __pypackages__ in a particular location.
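The stdlib-shadowing risk is easy to demonstrate with today’s sys.path machinery; this sketch simulates what an implicit, context-dependent path entry can do (the shadow module’s content is contrived, of course):

```python
import sys
import tempfile
from pathlib import Path

# Simulate an implicit path entry that happens to contain a module
# named like a stdlib one -- e.g. a json.py sitting in a directory
# that gets prepended to sys.path.
shadow_dir = Path(tempfile.mkdtemp())
(shadow_dir / "json.py").write_text("dumps = lambda obj: 'shadowed!'\n")

sys.modules.pop("json", None)        # ensure a fresh import below
sys.path.insert(0, str(shadow_dir))  # what an implicit path entry does
import json                          # silently resolves to shadow_dir/json.py
assert json.dumps({}) == "shadowed!"

# Undo the damage for the rest of the process.
sys.path.remove(str(shadow_dir))
sys.modules.pop("json", None)
```

The same failure mode exists today with the CWD and PYTHONPATH; the question is how much an always-on `__pypackages__` entry widens the exposure.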
I actually agree with your estimate of the harm. I’m happy for the SC to judge harm vs benefit, as long as we’re giving a clear picture of the situation.
Yeah, this is the one that bugs me. I’m strongly against pip installing to __pypackages__ by default. Adding an option to do this in an opt-in fashion is one thing (there’s a bunch of questions to answer, but given that it’s essentially just a shortcut for --prefix __pypackages__ it’s hard to object too strongly) but I’m a hard “no” on changing the default, as I’ve already said.
I can’t insist on what other tools do, but I’ll go on record as saying that I’d strongly prefer them to make __pypackages__ an explicit opt-in. I am worried that they might not - the rush to implement draft versions of this PEP before it was approved suggests that people don’t think the same way I do on this.
Should the PEP take a stance on this? I’ve argued strongly in the past that it’s not the place of PEPs (especially core Python PEPs) to dictate how language features are used, and I’ll stick to that and say we can’t have anything enforceable. But should we have a non-normative “what should tools do” section? I’ve gone back and forth on this. On the one hand, it would clearly set expectations. On the other hand, it’s bound to annoy at least half of the audience whatever it says. Personally, I’d prefer the PEP to come out and say that defaulting to __pypackages__ is not recommended, but I’m also happy to have it just say nothing, as I suspect that whatever it says, some tools will ignore it and that will damage the credibility of the PEP as a whole.
Oh, boy, yes. This is why I was frustrated with the original PEP, because it left too much for people to make assumptions about. Hopefully the new version will be more explicit (I’ve seen it, and I think it is, but I don’t know what others might think).
If it goes in without recursion, then those complaints will be directed at the SC, who approved the PEP. If the SC comes back with a position on recursing, I’d hope we would follow their guidance. And I’ve only just thought of this, but maybe the PEP should add an “Open question” on the matter (or at least a “rejected ideas” item).
Having been in the position of having to approve controversial PEPs like this, I have to say I don’t envy the SC deciding on this, no matter what the PEP ends up proposing. But withdrawing the PEP doesn’t help, either - we’d just have it come up again in the future.
It’s really hard to use a small number of words to explain what I mean. In that context, “incompatible” meant that conda can’t install into __pypackages__, nor use pyproject.toml, etc.
Which is why I’ve mostly said this would “create confusion” for conda users.
Yes, a conda-managed Python could (and would) respect __pypackages__. But that’s exactly the problem. You could have any number of conda environments, and they would all use a __pypackages__ dir if it was there, and the packages in there would likely be incompatible with some of those conda environments. And the Python packages installed would have no knowledge of what might be in a future __pypackages__ dir, and if/when someone used pip to install stuff into __pypackages__, it would have no idea what conda environment it might be running in in the future.
Exactly – that’s the “confusion” I refer to. Which is why I like the idea of having this feature opt-in – then conda (or who knows what other system: spack? apt? yum?) could opt out, and their users would be a lot less likely to get confused.
I’m not saying that this PEP should not be implemented because conda or any other tool can’t use it – I’m just saying that it would be good for folks to consider the impacts on the users of other package/environment systems.
I think that the problem here is that the “every other tool” people are thinking of is project management tools. This isn’t a project management feature - the PEP never said it was. It’s just that people are trying to fit it into that niche. And it sort-of fits, so they see the rough edges as flaws with the proposal rather than a mismatch with their expectations.
It’s understandable that this has happened - the messaging hasn’t been particularly clear, and sub-discussions based on incorrect assumptions haven’t been shut down fast enough to stop those assumptions taking hold - but that doesn’t mean that the PEP is wrong to take that position.
Maybe there’s a different PEP needed, proposing a “project local” package directory. That would make different trade-offs and could well address different use cases (I feel like it wouldn’t be what I’d want for bundling script dependencies, but maybe I’m wrong). Then the two PEPs could be submitted, either as competing or complementary proposals.
But I don’t think a single solution will satisfy everyone. Maybe I’m wrong, though. If anyone wants to take that as a challenge to come up with a “unified” PEP, go for it!
Isn’t that the case right now with usersite, or the CWD being on sys.path, or PYTHONPATH, or siteconfig hacks, though? I’m doing my best to frame this PEP as “just another way things get added to sys.path” (and I think @kushaldas agrees with that idea) - so what’s so unusual about this PEP that you want this case to be treated differently (I’m not sure if it’s you that’s arguing for a separate launcher or if that’s someone else, but “differently” seems like it’s the main thing). Maybe you have a very different view to me of what the world will look like if this PEP gets accepted.
Anyway, I guess I don’t really have a strong view. I think I said previously (although I searched for 20 minutes and couldn’t find where) that I’m basically neutral on the PEP, as a sys.path change. Which is what I understand it to be. At a more detailed level, if the PEP gets accepted:
I will strongly oppose any suggestion that pip change its default install behaviour to install into __pypackages__. If I’m overridden, I will ensure that there’s an option to turn the behaviour off, in configuration.
I won’t object if people want to add an option to pip which explicitly installs into __pypackages__. I may even use it occasionally.
I will continue to use virtual environments rather than __pypackages__ in my own work.
I won’t use tools that use __pypackages__ rather than virtual environments, and I will prefer tools that make virtual environments the default and __pypackages__ an opt-in (or not supported at all), rather than the other way around.
I will use __pypackages__ for bundling script dependencies, to save a small amount of runtime sys.path hacking.
In that situation, I will be glad if my tools (e.g., VS Code) recognise my dependencies (for auto-completion, type checking, etc) as a consequence of me using a standard location rather than runtime path changes.
That’s basically the only ways this PEP will affect me if it’s accepted. Well, there’s two others, I guess:
I will get endlessly frustrated at people telling me I should be using __pypackages__ more, because it’s “better” in some unspecified way.
I will be mildly irritated in an “I told you so” sort of way with people who loudly complain that __pypackages__ isn’t any use to them because it doesn’t do things the PEP never promised it would do.
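For context, the “runtime sys.path hacking” I mean when bundling script dependencies is boilerplate along these lines (the `_vendor` directory name is just an example, not anything standard):

```python
import sys
from pathlib import Path

def add_vendor_dir(script_path: str) -> None:
    """Prepend the script-adjacent `_vendor` directory (a hypothetical
    name) to sys.path -- the manual version of what PEP 582 would do
    automatically for a script-adjacent __pypackages__ directory."""
    vendor = Path(script_path).resolve().parent / "_vendor"
    if vendor.is_dir() and str(vendor) not in sys.path:
        sys.path.insert(0, str(vendor))

# At the top of a bundled script today, before any third-party imports:
# add_vendor_dir(__file__)
```

With the PEP, the interpreter adds the equivalent entry itself, and tools like editors can find the dependencies without executing the script.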
PEP 582 does not solve any need I have, so I’m not asking for a corner case for my needs, because whether it lands or not or in what fashion basically doesn’t affect me. However, I think for people new to Python, not scanning parent directories is more confusing than scanning them. You can disagree with that opinion, that’s fine, I’m just stating that I think the vast majority of people for whom this is intended to serve, will be more confused by it not recursing.
Why I think that is for two reasons:
People new to programming in general, in my (I’ll admit, limited) experience, tend to struggle to internalize navigating a directory structure and knowing where they are in it. I think we’ll see a lot of people create a __pypackages__ in one directory, then cd into another directory, and get confused thinking their packages got uninstalled.
People not new to programming, but new to Python are going to be a lot more familiar with existing tools and languages like Node.js that implement the recursive behavior, and will naturally assume that Python’s version of this has similar behavior, and will get confused (or at least frustrated) when it does not.
An interesting thing here, I think, is that if I remember correctly (and I may be wrong), PDM, which attempted to implement PEP 582, actually implemented it recursing up into parent directories looking for the __pypackages__.
The Node.js interpreter itself is where the behavior of node_modules comes from, and that is, I think, where most people draw their assumptions about how this will work. In most of the conversations where I’ve seen an end user ask for something like this, they’ve even directly referenced node_modules.
Sure, I’m mostly commenting because I think those complaints will come, and I’d hope we can head them off, whether I’m the person getting complained at or not.
But a proposal to add scanning for __pypackages__ needs to discuss how Python will avoid issues like the git vulnerability CVE-2022-24765. And someone needs to write that proposal - @kushaldas has rather clearly said he doesn’t intend to do so. That’s what I meant by competing PEPs.
But PEP 582 should explicitly add directory scanning as a rejected idea, explaining that it was rejected because we didn’t have a good solution to the risk of this type of vulnerability. It doesn’t, yet, but I assume @kushaldas will add it.
One thing I will say is that @kushaldas has taught a lot of beginners over the years, and I believe he has used his own script implementing PEP 582 with them; from my understanding it went well (albeit with the rough edge of having to instruct folks to use the script instead of calling Python directly). So this isn’t coming from a theoretical place for him.
After some meditation, some thoughts. No particular time-order is implied.
I did a quick survey of the common languages other than Python that I use and asked “What do they do?”
Go: Scans upwards until it hits the root of a VCS repo, $HOME, or /, starting from the current directory, though this varies depending on whether GO111MODULE is set to on/off/auto.
npm/node: current directory that the interpreter was started from. Documentation.
.NETCore: dependencies are defined differently and arbitrarily, however either a project or solution is being looked at already, and all paths are relative to that file.
So, no major consensus, however, many people have likened the PEP to that of node_modules, and I think the comparison is apt.
I will wholly agree that there has been a certain degree of justification-wringing. I’ll raise my hand first to try and say that I’m probably guilty of it.
This was part of the reason why I suggested “stop at home/root” and idly suggested that it needed to be adjacent to a pyproject.toml.
I strongly agree that the behavior should be opt-in. I actually think pip should force the user to choose where it will be installed (--global, --user, --local/--project)
That’s your prerogative.
I don’t see why the two can’t coexist, or you make your own decision.
I was under the impression that’s an unstated 80% of the goal.
In thinking about this, I think I’ve a proposal, @kushaldas:
Make explicit the default behavior described in the PEP: __pymodules__ must be adjacent to the module that __main__ is in, or in the current directory, with preference to wherever __main__ resides. To further avoid any confusion, the module MUST be in a child path of the current directory if it is not already there (that is, the path that __pymodules__ exists in must be a prefix of the full path that __main__ resides in), and it MUST be owned by the same user/etc. as far as is plausible – working with Dev Containers in VS Code often leaves me with sticky files owned by other users because of the container uid:gid mangling.
Allowing for tools to override this behavior is essential: Having an environment variable, perhaps PYTHON_MODPATH, that allows a user to override this logic allows tools like Conda, pipenv, py, etc. to all make choices on their own. This also allows for things like PyTest to handle multiple versions of libraries, etc.
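To make the proposal above concrete, here is a rough sketch of that lookup logic. Note the heavy caveats: PYTHON_MODPATH is the hypothetical variable floated here, not an existing one; the prefix and ownership checks are my reading of the proposal; and the ownership check is POSIX-only:

```python
import os
from pathlib import Path
from typing import Optional

def resolve_pypackages(main_dir: Path) -> Optional[Path]:
    """Sketch of the proposed lookup.

    PYTHON_MODPATH (hypothetical) lets tools like conda, pipenv, or py
    override the logic entirely.  Otherwise we look adjacent to
    __main__, but only if __main__ is at or below the CWD, and only if
    the directory is owned by the current user (cf. git's fix for
    CVE-2022-24765).
    """
    override = os.environ.get("PYTHON_MODPATH")
    if override:
        return Path(override)

    cwd = Path.cwd()
    # __main__'s directory must be the CWD or a child path of it.
    if main_dir != cwd and cwd not in main_dir.parents:
        return None

    candidate = main_dir / "__pypackages__"
    if not candidate.is_dir():
        return None
    # Ownership check is POSIX-only; Windows would need something else.
    if candidate.stat().st_uid != os.getuid():
        return None
    return candidate
```

The override hook is what would let other environment managers opt their users out of the implicit behavior.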
I see this as related to other “packaging strategy” thread. Increasingly I am of the opinion that more and more PEPs handling this or that use case not only will not help the situation, but will actually make things worse, because they are just adding one more alternative to all the existing ways of handling packages, one more option people must consider when figuring out how they want to do things. I think it would be better for the Python world if no further changes were made to packaging/environments at all except as part of a single unified attempt to handle a large number of use cases and, in the process, clear the field as much as possible of competing existing solutions. Otherwise we are just digging deeper in the same hole we’re in (namely, that there are many conflicting and confusing things to consider when dealing with Python packages and environments).
Unless I’m understanding this incorrectly, this won’t work with any project layout (e.g. the recommended src layout) other than the discouraged “implicit”/“flat” one dumping the top-level import modules/packages directly in the project dir (unless you put the __pypackages__ dir inside the src dir or equivalent, which seems rather silly and is explicitly not in the CWD).
By the way, for people not monitoring the PEP repository, there is a new version of the PEP. It’s a pretty major rewrite, so please read it to see if it makes things clearer. Ideally, try to ensure further discussion is related to the new text, and not to the old version, or to things people have assumed earlier in this discussion - but we all know that’s not always as easy as we’d hope.
One way to make it more likely (that people will have read the rewrite before commenting) would be to start a new topic with “PEP 582 (take 2)” or so as the title and close/lock this thread – based on how PEP 594 was handled.
Fair point. Technically there is a __main__, but I’m being extremely pedantic.
Apologies, I had just gotten off a work call where the word “module” had to have been said over 100 times in under 10 minutes and my brain filled in a word.
On reading the new version:
In another example scenario, a trainer of a Python class can say “Today we are going to learn how to use Twisted! To start, please checkout our example project, go to that directory, and then run python3 -m pip install twisted.”
The PEP authors believe that developers using virtual environments should be experienced enough to understand the issue and anticipate and avoid any problems.
This absolutely needs to not change the default behavior of pip. It will absolutely wreak havoc on Docker containers (which often just git checkout [repo] && pip install -r repo/requirements.txt), and Jupyter notebooks used in ML/data science from places like Google depend on weird virtualenv setups (for example, DTensor’s example notebook straight up shells out to pip to install multiple packages, with the ipynb kernel in that case running somewhere like /).
A --here or --local option, or a pkg-install command needs to be how this is interacted with in pip.