PEP 582 - Python local packages directory

I’m not sure I understand that – but it sounds like a great collaboration.

To be clear, I’m not suggesting that this PEP would be incompatible with conda – you’d “simply” have to not use __pypackages__ with conda environments.

But I do think it could lead to a lot of confusion for folks, particularly newbies – if this is widely adopted, then projects will ship with a __pypackages__ dir, and the “quick start” docs will tell folks how to use it, etc. And conda users will get confused.

I used to see a lot of messages on the conda lists about folks getting tangled up using virtualenv with / inside conda environments – it CAN be done, but it creates a mess, and it is totally unnecessary. I haven’t seen much of that for a while, so maybe it’s sorted itself out.

And maybe this will too.

1 Like

I don’t understand – how could conda adopt PEP 621? Conda is NOT a Python package manager, it is a general-purpose package manager. pyproject.toml is just what it says it is – metadata about a Python project.

I don’t think so – in fact, a well-built conda package of a Python distribution is installed exactly how it would be by pip, because it was installed by pip during the conda build process. pip shouldn’t behave differently when it is called by conda-build – that’s kinda the point.

The way conda environments work is that everything they need is inside one directory. This PEP proposes to put Python packages in arbitrary places in the file system, which is completely incompatible with that model. [note: folks smarter than me might be able to figure out a way to make it work, I don’t know]

3 Likes

That’s a fair read. To be honest, 99% of my time trying to work with Conda has been swearing at it and wondering how people use it for anything useful, then throwing it away and using virtualenvs anyway. This has been my experience trying to use it, over and over again, for machine learning projects.

What would be the cost of treating Conda like a funny virtualenv?

To be precise, Conda is a package manager, and it makes no sense to compare it with virtualenv. If you mean Conda environment, it would kind of make sense to compare it to virtualenv for certain use cases (when you populate Python and Python packages into the Conda environment, granted that’s probably >90% of its use cases in practice).

That would have been me years ago with pip and any Python package that required compilation, before I found conda. I spent a LOT of time compiling packages for the Mac, and if Christoph Gohlke’s old Windows package repo hadn’t been there, we’d have been dead in the water on Windows. It was truly unusable – conda was a massive step in the right direction.

What with wheels and all (and note that manylinux was inspired by conda), things are MUCH better now, but conda still has a major edge if you step outside the just-Python world. And even today – try using the osgeo stack on anything other than Linux.

Once conda-forge gained momentum, it became remarkably easy to get stuff done with conda.

If you struggle, then either:

  • no one is providing the packages you need for conda :frowning:
    or
  • you’re trying to use it in a way that it wasn’t designed for.

I don’t do ML, but a heck of a lot of folks do use conda for it.

I don’t follow. conda environments are kinda/sorta like virtualenv – I honestly don’t know virtualenv well enough to be able to describe the differences – except that conda manages non-Python libraries, etc., as well.

1 Like

To be fair, the core of the PEP is nothing more than adding a couple of entries to sys.path on startup. That’s neither more nor less useful than the fact that the current directory, or python.zip, is added to sys.path on startup - it’s of benefit for certain expected use cases, and does no harm if not used.
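To make "adding a couple of entries to sys.path" concrete, the effect can be emulated by hand today. This is a hedged sketch, assuming the draft PEP's `__pypackages__/<X.Y>/lib` layout (the exact layout may differ in the final PEP); `pep582_candidate` is an illustrative helper name, not part of any implementation.

```python
# Sketch: roughly what PEP 582's startup addition amounts to, done manually.
# Assumes the draft layout __pypackages__/<major>.<minor>/lib.
import os
import sys

def pep582_candidate(base):
    """Return the path the interpreter would consider under base."""
    ver = f"{sys.version_info.major}.{sys.version_info.minor}"
    return os.path.join(base, "__pypackages__", ver, "lib")

candidate = pep582_candidate(os.getcwd())
if os.path.isdir(candidate) and candidate not in sys.path:
    sys.path.insert(0, candidate)  # prepend, like other implicit entries
```

If the directory doesn't exist, nothing happens at all, which is the "does no harm if not used" half of the argument above.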

The problem lies in the fact that the motivation for the change is around particular ways of using that addition. And there’s a lot of confusion and mistaken expectations about those changes to workflows - which isn’t helped by the many years this PEP has been around, and people’s speculation about what might happen.

Yes, the current version of the PEP[1] suggests changes to pip as well, but (1) these have significant issues around backward compatibility, and (2) a core PEP isn’t the right place to propose changes to pip’s UI anyway. We’ll thrash out how pip wants to respond to this PEP if it gets accepted on pip’s tracker, not here.

Maybe in the end, the PEP is something that doesn’t appeal to people. Maybe just adding entries to sys.path is a compromise that doesn’t do enough to be useful. That’s fine, in that case the PEP will be rejected.


  1. There’s a revision in the works. ↩︎

3 Likes

Conda Python, just like any Python, allows things to be added to sys.path, and it calculates the initial sys.path the same as standard Python (because it is standard Python, just with a different build process). So I don’t see how conda Python could not support this PEP.

Conda the package manager could choose not to provide a way to install into a __pypackages__ directory. So could pip. That’s not (an enforceable) part of the PEP, so that’s fine.

The implications of people putting stuff into __pypackages__ are something conda would have to deal with. Just like pip will. And just like both of us have to deal with people setting PYTHONPATH, or manipulating sys.path at runtime, right now.

Someone sufficiently insane could try to write a packaging PEP to lay down rules on how installers should deal with PYTHONPATH or runtime sys.path manipulation, or indeed PEP 582, and get both pip and conda to buy into that standard. Good luck with that - I’ll get popcorn and watch the show :slightly_smiling_face:

2 Likes

It is true that the core of the PEP is adding some entries to sys.path, but I think it’s a bit more subtle than just “it’s of benefit to some use cases, and does no harm if not used”.

On the benefit side, it’s typical for tools that implement a PEP 582-like workflow to not just check the current directory, but also recurse upwards. This isn’t a bad thing, because it lets things still work if a beginner is in a subdirectory of their project, and it matches expectations that come from tools like npm, git, etc. I think it would be confusing to people if our implementation of that implicit local environment didn’t also carry that behavior, and I think the PEP blunts a lot of the benefit that could be had by limiting itself in this way.
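The upward recursion being described (which the PEP does NOT adopt) is the same walk that npm and git do. A minimal sketch of that behavior, with `find_pypackages` as a purely hypothetical helper:

```python
# Sketch of the upward scan other tools (npm, git) perform.
# NOT what PEP 582 proposes; shown only to illustrate the discussion.
from pathlib import Path

def find_pypackages(start):
    """Walk from start toward the filesystem root and return the first
    __pypackages__ directory found, or None if there isn't one."""
    start = Path(start).resolve()
    for d in [start, *start.parents]:
        candidate = d / "__pypackages__"
        if candidate.is_dir():
            return candidate
    return None
```

The security concern raised later in the thread comes precisely from this walk: any writable ancestor directory (e.g. /tmp above a checkout) could inject packages.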

On the harm side, I think that’s underestimating things. Every path on sys.path comes with a cost; while a lot of work has been done to reduce the impact on interpreter startup and the import system, it’s still not zero. But there’s another cost here, in that these new directories are going to do the wrong thing sometimes, and now people who don’t even use or want the feature are going to have to be aware of it and know how to work around it, particularly if tools start to install there by default. This harm of course grows if the PEP doesn’t blunt its usefulness by limiting its ability to scan for a directory, but even in its current iteration it’s still there.

One of the things that I’ve noticed is that the discussion around this PEP is messy, because a lot of people discussing the benefits either assume that it’s going to recurse into parent directories (since that’s what basically every other tool that implements something like this does), or make claims whose benefits rely on it doing so.

If PEP 582 goes in without recursing, I suspect that we will immediately hear complaints that we behave differently than every other tool like this does, which is why I agree with @njs that it feels like a compromise that makes nobody happy.

2 Likes

From the beginning, this PEP has only talked about scanning the current directory and nothing else (unless a script is being run, in which case we check the script’s directory). The people asking for scanning of parent directories have a much more specialized use case, where they also take care of the security side (who is allowed to write to parent directories, etc.).

Scanning parent directories is also not supported in current Python interpreters, even when using virtual environments.

Also, the people intended to gain the most from this PEP are folks new to Python, and they are not the ones talking in this thread; instead, much more experienced folks are asking for the exact corner cases of their needs.

2 Likes

Agreed. If that’s enough of a problem, the PEP will be rejected. @kushaldas said he didn’t want to add that (because of the security implications). Fair enough. For what it’s worth, my use cases for this feature revolve around bundling dependencies for scripts, and in that context, scanning parents isn’t needed or helpful. So I’m OK with this. People wanting a “virtualenv-lite” solution won’t like it. I’m not personally convinced that “virtualenv-lite” is an important use case. People wanting a simpler way of teaching Python (a separate case from “virtualenv-lite”) will have to judge for themselves; when I taught Python, I didn’t have this issue.

I wasn’t saying that there’s no harm, just that the harm can (mostly!) be assessed in the same way as any other proposal to add to sys.path. You enumerated some of those. I’ll add that like adding the CWD, this proposal adds a context-dependent value, which has additional risks. Such as shadowing the stdlib with a local file, or allowing users to forget they had a __pypackages__ in a particular location.
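The stdlib-shadowing risk is easy to demonstrate with today's implicit entries, and an implicit __pypackages__ entry would behave the same way. A self-contained sketch (the temp directory and the choice of `secrets` as the shadowed module are purely illustrative):

```python
# Demonstration of the shadowing risk: any directory implicitly prepended
# to sys.path can shadow a stdlib module that hasn't been imported yet.
import os
import sys
import tempfile

shadow_dir = tempfile.mkdtemp()
with open(os.path.join(shadow_dir, "secrets.py"), "w") as f:
    f.write("token = 'not the stdlib!'\n")

# This insert is effectively what an implicit entry (CWD today,
# __pypackages__ under the PEP) does at startup.
sys.path.insert(0, shadow_dir)

import secrets  # resolves to the local file, not the stdlib module
```

A user who forgot a stale __pypackages__ was sitting in some directory would hit exactly this class of surprise.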

I actually agree with your estimate of the harm. I’m happy for the SC to judge harm vs benefit, as long as we’re giving a clear picture of the situation.

Yeah, this is the one that bugs me. I’m strongly against pip installing to __pypackages__ by default. Adding an option to do this in an opt-in fashion is one thing (there’s a bunch of questions to answer, but given that it’s essentially just a shortcut for --prefix __pypackages__ it’s hard to object too strongly) but I’m a hard “no” on changing the default, as I’ve already said.

I can’t insist on what other tools do, but I’ll go on record as saying that I’d strongly prefer them to make __pypackages__ an explicit opt-in. I am worried that they might not - the rush to implement draft versions of this PEP before it was approved suggests that people don’t think the same as me on this.

Should the PEP take a stance on this? I’ve argued strongly in the past that it’s not the place of PEPs (especially core Python PEPs) to dictate how language features are used, and I’ll stick to that and say we can’t have anything enforceable. But should we have a non-normative “what should tools do” section? I’ve gone back and forth on this. On the one hand, it would clearly set expectations. On the other hand, it’s bound to annoy at least half of the audience whatever it says. Personally, I’d prefer the PEP to come out and say that defaulting to __pypackages__ is not recommended, but I’m also happy to have it just say nothing, as I suspect that whatever it says, some tools will ignore it and that will damage the credibility of the PEP as a whole.

Oh, boy, yes. This is why I was frustrated with the original PEP, because it left too much for people to make assumptions about. Hopefully the new version will be more explicit (I’ve seen it, and I think it is, but I don’t know what others might think).

If it goes in without recursion, then those complaints will be directed at the SC, who approved the PEP. If the SC come back with a position on recursing, I’d hope we would follow their guidance. And I’ve only just thought of this, but maybe the PEP should add an “Open question” on the matter (or at least a “rejected ideas” item).

Having been in the position of having to approve controversial PEPs like this, I have to say I don’t envy the SC deciding on this, no matter what the PEP ends up proposing. But withdrawing the PEP doesn’t help, either - we’d just have it come up again in the future.

It’s really hard to use a small number of words to explain what I mean :frowning: In that context, “incompatible” meant that conda can’t install into __pypackages__, nor use pyproject.toml, etc.

Which is why I’ve mostly said this would “create confusion” for conda users.

Yes, a conda-managed Python could (and would) respect __pypackages__. But that’s exactly the problem. You could have any number of conda environments, and they would all use a __pypackages__ dir if it was there, and the packages in there would likely be incompatible with some of those conda environments. The Python packages installed would have no knowledge of what might be in a future __pypackages__ dir, and if/when someone used pip to install stuff into __pypackages__, it would have no idea what conda environment it might later be running in.

Exactly – that’s the “confusion” I refer to. Which is why I like the idea of having this feature opt-in – then conda (or who knows what other system: spack? apt? yum?) could opt out, and their users would be a lot less likely to get confused.

I’m not saying that this PEP should not be implemented because conda or any other tool can’t use it – I’m just saying that it would be good for folks to consider the impacts on the users of other package/environment systems.

I think that the problem here is that the “every other tool” people are thinking of is project management tools. This isn’t a project management feature - the PEP never said it was. It’s just that people are trying to fit it into that niche. And it sort-of fits, so they see the rough edges as flaws with the proposal rather than a mismatch with their expectations.

It’s understandable that this has happened - the messaging hasn’t been particularly clear, and sub-discussions based on incorrect assumptions haven’t been shut down fast enough to stop those assumptions taking hold - but that doesn’t mean that the PEP is wrong to take that position.

Maybe there’s a different PEP needed, proposing a “project local” package directory. That would make different trade-offs and could well address different use cases (I feel like it wouldn’t be what I’d want for bundling script dependencies, but maybe I’m wrong). Then the two PEPs could be submitted, either as competing or complementary proposals.

But I don’t think a single solution will satisfy everyone. Maybe I’m wrong, though. If anyone wants to take that as a challenge to come up with a “unified” PEP, go for it!

1 Like

Isn’t that the case right now with usersite, or the CWD being on sys.path, or PYTHONPATH, or sitecustomize hacks, though? I’m doing my best to frame this PEP as “just another way things get added to sys.path” (and I think @kushaldas agrees with that idea) - so what’s so unusual about this PEP that you want this case to be treated differently (I’m not sure if it’s you that’s arguing for a separate launcher or if that’s someone else, but “differently” seems like it’s the main thing)? Maybe you have a very different view to me of what the world will look like if this PEP gets accepted.

Anyway, I guess I don’t really have a strong view. I think I said previously (although I searched for 20 minutes and couldn’t find where :slightly_frowning_face:) I’m basically neutral on the PEP, as a sys.path change. Which is what I understand it to be. At a more detailed level, if the PEP gets accepted:

  • I will strongly oppose any suggestion that pip change its default install behaviour to install into __pypackages__. If I’m overridden, I will ensure that there’s an option to turn the behaviour off, in configuration.
  • I won’t object if people want to add an option to pip which explicitly installs into __pypackages__. I may even use it occasionally.
  • I will continue to use virtual environments rather than __pypackages__ in my own work.
  • I won’t use tools that use __pypackages__ rather than virtual environments, and I will prefer tools that make virtual environments the default and __pypackages__ an opt-in (or not supported at all), rather than the other way around.
  • I will use __pypackages__ for bundling script dependencies, to save a small amount of runtime sys.path hacking.
  • In that situation, I will be glad if my tools (e.g., VS Code) recognise my dependencies (for auto-completion, type checking, etc) as a consequence of me using a standard location rather than runtime path changes.
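The script-bundling use case in the list above is what today requires a small runtime preamble; under the PEP the interpreter would add the entry itself. A hedged sketch of that preamble, assuming the draft `__pypackages__/<X.Y>/lib` layout (after something like `python -m pip install --target __pypackages__/3.11/lib somepackage`, with the exact layout subject to the final PEP):

```python
# Preamble a bundled-dependency script needs today; PEP 582 would make
# this boilerplate unnecessary. Layout assumed per the draft PEP.
import os
import sys

# Script directory, falling back to the CWD for interactive use.
here = (os.path.dirname(os.path.abspath(__file__))
        if "__file__" in globals() else os.getcwd())
pkgs = os.path.join(here, "__pypackages__",
                    f"{sys.version_info.major}.{sys.version_info.minor}",
                    "lib")
if os.path.isdir(pkgs):
    sys.path.insert(0, pkgs)

# Dependencies installed into pkgs would now be importable here.
```

Because the location is standardised, editors and type checkers could also find the bundled packages without running the script, which is the tooling benefit mentioned above.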

Those are basically the only ways this PEP will affect me if it’s accepted. Well, there are two others, I guess:

  • I will get endlessly frustrated at people telling me I should be using __pypackages__ more, because it’s “better” in some unspecified way.
  • I will be mildly irritated in an “I told you so” sort of way with people who loudly complain that __pypackages__ isn’t any use to them because it doesn’t do things the PEP never promised it would do.

:slight_smile:

3 Likes

PEP 582 does not solve any need I have, so I’m not asking for a corner case for my needs; whether it lands or not, or in what fashion, basically doesn’t affect me. However, I think for people new to Python, not scanning parent directories is more confusing than scanning them. You can disagree with that opinion, that’s fine; I’m just stating that I think the vast majority of people this is intended to serve will be more confused by it not recursing.

I think that for two reasons:

  1. People new to programming in general, in my (I’ll admit, limited) experience, tend to struggle with internalizing how to navigate a directory structure, knowing where they are in that structure, etc. I think we’ll see a lot of people create a __pypackages__ in one directory, then cd down into another directory, and get confused thinking their packages got uninstalled.

  2. People not new to programming, but new to Python are going to be a lot more familiar with existing tools and languages like Node.js that implement the recursive behavior, and will naturally assume that Python’s version of this has similar behavior, and will get confused (or at least frustrated) when it does not.

An interesting thing here, I think, is that if I remember correctly (and I may be wrong), PDM, which attempted to implement PEP 582, actually implemented it recursing up into parent directories looking for __pypackages__.

The Node.js interpreter itself is where the behavior of node_modules comes from, which is, I think, where most people draw their assumptions about how this will work from. In most of the conversations where I’ve seen an end user ask for something like this, they’ve even directly referenced node_modules.

Sure, I’m mostly commenting because I think those complaints will come, and I’d hope we can head them off, whether I’m the person getting complained at or not :wink:

6 Likes

Ah, yes. I’m not that familiar with Javascript development, so I forget that node does this. I agree that having something that’s unlike node will be a stumbling block for newcomers (many of whom are new to Python, but not new to programming).

But a proposal to add scanning for __pypackages__ needs to discuss how Python will avoid issues like the git vulnerability CVE-2022-24765. And someone needs to write that proposal - @kushaldas has rather clearly said he doesn’t intend to do so. That’s what I meant by competing PEPs.

But PEP 582 should explicitly add directory scanning as a rejected idea, explaining that it was rejected because we didn’t have a good solution to the risk of this type of vulnerability. It doesn’t, yet, but I assume @kushaldas will add it.

2 Likes

One thing I will say is that @kushaldas has taught a lot of beginners over the years, and I believe he has used his own script implementing PEP 582 with them, and from my understanding it went well (albeit with the rough edge of having to instruct folks to use the script instead of calling Python directly). So this isn’t coming from a theoretical place for him.

3 Likes

After some meditation, some thoughts. No particular time-order is implied.

I did a quick survey of the common languages other than Python that I use and asked “What do they do?”

  • Go: Scans upwards until it hits the root of a VCS repo, $HOME, or /, starting from the current directory, but this varies depending on whether GO111MODULE is set to on/off/auto.
  • npm/node: current directory that the interpreter was started from. Documentation.
  • .NET Core: :person_shrugging: dependencies are defined differently and arbitrarily; however, either a project or solution is already being looked at, and all paths are relative to that file.

So, no major consensus, however, many people have likened the PEP to that of node_modules, and I think the comparison is apt.

I will wholly agree that there has been a certain degree of justification-wringing. I’ll raise my hand first to try and say that I’m probably guilty of it.

To be fair, many language tools are conflated with project management tools now, especially with Go shipping a single binary that handles package management, build, execution, linting, formatting, and fine wine evaluation. For a while, there was an individual adjacent to me who thought npm was the name of the runtime for JavaScript, because basically every Node project is interacted with through npm (npm serve, npm build, etc.). So, for some folks to begin asking for, dare I say it, “modern language” features from Python isn’t absurd, but it does put into perspective where certain shifting sands are.

This was part of the reason why I suggested “stop at home/root” and idly suggested that it needed to be adjacent to a pyproject.toml.

  1. I strongly agree that the behavior should be opt-in. I actually think pip should force the user to choose where a package will be installed (--global, --user, --local/--project).
  2. see 1.
  3. That’s your prerogative.
  4. I don’t see why the two can’t coexist, or you make your own decision.
  5. I was under the impression that’s an unstated 80% of the goal.
  6. Similarly.

In thinking about this, I think I’ve a proposal, @kushaldas:

Make explicit the default behavior described in the PEP: __pypackages__ must be adjacent to the module that __main__ is in, or in the current directory, with preference to wherever __main__ resides. To further avoid any confusion, the directory MUST be in a child path of the current directory if it is not already there (that is, the path that __pypackages__ exists in must be a prefix of the full path that __main__ resides in), and MUST, as far as is plausible, be owned by the same user – working with Dev Containers in VS Code often leaves me with sticky files owned by other users because of the container uid:gid mangling.

Allowing tools to override this behavior is essential: an environment variable, perhaaps named PYTHON_MODPATH, that lets a user override this logic would allow tools like Conda, pipenv, py, etc. to all make choices on their own. This would also allow things like pytest to handle multiple versions of libraries, etc.

Absolutely – but those all predate conda and virtualenv and …

PYTHONPATH has been “not recommended” ever since it became common to have more than one version of Python on a system (remember when installing an update to Python 2 broke Red Hat?).

This is something new, so should be subject to more scrutiny.

And I haven’t used a sys.path hack since I discovered setuptools develop mode – for me, making a package is the “right” way to make code available to Python.

Right – and as above, having many different ways to add things to sys.path is not great :frowning:

I see this as related to the other “packaging strategy” thread. Increasingly I am of the opinion that more and more PEPs handling this or that use case will not only not help the situation, but will actually make things worse, because they just add one more alternative to all the existing ways of handling packages – one more option people must consider when figuring out how they want to do things. I think it would be better for the Python world if no further changes were made to packaging/environments at all, except as part of a single unified attempt to handle a large number of use cases and, in the process, clear the field as much as possible of competing existing solutions. Otherwise we are just digging deeper into the same hole we’re in (namely, that there are many conflicting and confusing things to consider when dealing with Python packages and environments).

1 Like

Unless I’m understanding this incorrectly, this won’t work with any project layout other than the discouraged “implicit”/“flat” one, which dumps the top-level import modules/packages directly in the project dir. In particular, it won’t work with the recommended src layout (unless you put the __pypackages__ dir inside the src dir or equivalent, which seems rather silly and is explicitly not the CWD).