PEP 582 - Python local packages directory

Please, elaborate. I’m under the impression it already does when starting, based on my experiences using Python on Debian, Windows, and MacOS, all of which have different search paths that must be addressed early on.

Not totally. There’s a handful of ways to get to the interpreter that do much more work than what the standard executable does, such as the embedding of CPython in an application like Wireshark. However, these are approximately out of scope for PEP 582, from what I can tell.

Please elaborate: This adds a firmly opt-in pattern to searching for packages. Assuming that “it must be either next to the module that __main__ refers to or next to a pyproject.toml or version control boundary” means users have to jump through a very specific hoop before this behavior becomes useful.

Meanwhile, tools like git get shelled straight into people’s shell. The startup latency for a few recursive checks is meaningless.

Google (TensorFlow) and Amazon (SageMaker) both basically ignore Conda’s existence, mostly because of its tight connections to Conda, Inc.

This was specifically why I suggested "recurse up until you find a pyproject.toml or .(git|svn|etc) next to a __pypackages__, or reach the homedir/root."
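
To make that suggestion concrete, here is a minimal sketch of the upward search. It is not anything the PEP specifies: the function name find_pypackages, the PROJECT_MARKERS tuple, and the stop_at parameter are all hypothetical, and stopping at the home directory is just one reading of "homedir/root".

```python
from pathlib import Path
from typing import Optional

# Hypothetical marker names; the post only says pyproject.toml or .(git|svn|etc).
PROJECT_MARKERS = ("pyproject.toml", ".git", ".svn", ".hg")


def find_pypackages(start: Path, stop_at: Optional[Path] = None) -> Optional[Path]:
    """Walk upward from `start` and return the first __pypackages__ directory
    that sits next to a project marker, or None if none is found."""
    start = start.resolve()
    stop_at = (stop_at or Path.home()).resolve()
    for directory in (start, *start.parents):
        candidate = directory / "__pypackages__"
        has_marker = any((directory / m).exists() for m in PROJECT_MARKERS)
        if candidate.is_dir() and has_marker:
            return candidate
        if directory == stop_at:
            # Reached the boundary (home directory by default): give up.
            break
    # Also reached when the loop runs out of parents at the filesystem root.
    return None
```

A bare __pypackages__ with no marker beside it is deliberately skipped, which is the point of tying the lookup to a project boundary.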

There are cases where you’d have mixed users (e.g. symlinks in a handful of places, devcontainers, etc.)


Related to Paul’s security implications about subdirectory ownership
and surprise auto-running an exploit some other user stashed there,
what are the odds we’re going to see trojan Git repositories with an
included __pypackages__ somewhere in the worktree so that if a
victim runs Python while their current working directory is within
that checkout they unwittingly call the attacker’s payload? That
Python includes the empty string (current working directory) in
sys.path by default has been brought up by many before as a security
concern. Does this proposal make it worse?


There is a reason the PEP does not support any such parent directory scanning.


Exactly. However, not scanning parent directories isn’t exactly a great UX either. I mentioned this above, about 4 years ago. Wow, this discussion has been around for ages!


I can see a few ways to address this. Some that come to mind:

  • Ignore the current directory and only look at the path for __main__ unless a REPL is called.
  • Magic imports: if __main__ does an import from __future__ (from __future__ import pymodules), only then does the automatic seek behavior kick in.
  • Require -m explicitly to launch binaries from inside __pymodules__ or explicitly state that this is to be done by helper tools (pipx, etc).

I don’t think malicious imports are a huge issue. I do think that malicious binaries are.
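
As a rough illustration of the first option in the list above, the choice of base directory could look like the sketch below. The helper name pypackages_base and its parameters are hypothetical, and this only models the decision, not how the interpreter would actually wire it in.

```python
import os
from typing import Optional


def pypackages_base(script_path: Optional[str], cwd: str) -> str:
    """Return the directory whose __pypackages__ would be consulted.

    script_path is None when no script was given (the REPL case).
    """
    if script_path is None:
        # Interactive session: the current directory is the only anchor we have.
        return cwd
    # Script run: ignore the current directory and use the script's own location.
    full_path = os.path.normpath(os.path.join(cwd, script_path))
    return os.path.dirname(full_path)
```

Under this rule, running a script from inside someone else's checkout would not pick up a __pypackages__ from the current directory unless the script itself lives there.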


But that doesn’t work today either unless you do an environment activation or you installed into your global Python installation. Or are you assuming one of those things already occurred for your python command to work?

Sure, but so do more entries on sys.path, and I believe you pointed out that isn’t a concern for posy, so I don’t see why that’s a concern here.

True, but in the 19 months that the Python Launcher for Unix has existed, I have not had anyone complain about it doing that search for a .venv directory until 2 weeks ago, and that instance was just wanting a way to turn it off conditionally instead of not liking it. And that feature came from beta testers wanting the search, so I don’t think it would be wrong if it were done that way.

But regardless, …

So I think this is all a moot point. But I’m also a big supporter of using -m from the top of my workspace, so none of these concerns apply to my usual workflow anyway.


Because posy has the luxury of choosing a particular target audience and focusing on them – it doesn’t have to work for every python invocation, the way python does. (And also it’s way easier for something like posy to iterate on stuff as we learn what works, vs changes to python being mostly frozen forever once they’re released.)

But most people use the py launcher interactively, right? As far as I know, people aren’t writing scripts like #!/usr/bin/py, or invoking py as a worker subprocess inside some larger system, stuff like that? The point is that you have to evaluate this kind of clever DWIM feature against a specific context. If I’m using a high-level interactive tool, I love clever DWIM stuff. But the exact same feature might frustrate the heck out of me if it shows up in a lower-level tool where predictability is way more important.

I’m not saying PEP 582-like features are bad in general, just that for the python executable specifically, there’s a limit to how much clever DWIM stuff we can or want to do at that level.

See also the different CLIs exposed by npm vs node.

Yeah, I wasn’t trying to say anything controversial :-). Just explain the constraints that led to the PEP being the way it is. I’ve always been 100% excited about the goal of PEP 582; I’m just worried that there’s too big a gap between the PEP 582 in our imaginations and the actually-existing PEP 582.


Perhaps this PEP would be more successful if it appeased some concerns by proposing a new launcher (or adding functionality to the existing py launcher) which would be the only program that checks for the new __pypackages__ directory.

This would place a heavier burden on marketing, especially with the vast number of existing tutorials which recommend invoking Python with python[3], but would leave that command unaffected.

PS: sorry if this was suggested years ago, my memory’s not that good.

There was a suggestion at some point that proposed python -m pep582 (exact module name tbd), which is effectively the same. Someone also proposed adding a new command way back when the PEP was first written if I remember correctly, but that never caught on. (Although I’ll add that I actually like adding a new command best personally.)


+1 – this would remove a lot of the potential confusion when folks are using conda or another “environment” system.

I will say I’m looking at trying to add support for conda to participate in the Python Launcher for Unix, so the “confusion” worry for this scenario would only be conda → PEP 582 side and not universal for tool → conda side.

I think this is where I’d use from __future__ import pypackages and hint that in “some major future version” such behavior might become the default. Adding a flag to pip (--pkg perhaps) would also help direct users to the right place.

I half agree with @njs here: I think that, perhaps, the text of the PEP could be worked on to more accurately reflect the middle ground between what we’ve been talking about here.

One thing that I hear a lot of chatter about in this thread is virtualenvs and, beyond that, package tools like conda. It takes me back to a time when I first started paying attention to Python’s PEPs and noticed this one: PEP-370. More accurately, its author calling for it to be declared dead in favor of virtualenvs.

A response from one of the Fedora Python SIG folks makes for an interesting note:

While I consider venvs easy and cool, this just moves the barrier for the users a little bit higher. We (Fedora Python SIG) are fighting users that run sudo pip install all the time (because the Interwebz are full of such instructions). The users might be willing to listen to “please, don’t use pip with sudo, use --user instead”. However, if you tell them “learn how to use a venv”, they’ll just stick with sudo.

People are generally trying to do The Right Thing, and the higher the barrier we put in front of them to do the Right Thing, the fewer people will. Having taught Python to students for quite some time, I wouldn’t dare try and run an introductory course using as high level a tool as Virtualenv, but installing packages via the local repository in PEP-370 is just a little too broad for today’s development needs.

As it was described by a colleague of mine: “Virtualenvs are great for deployment, where they belong. For everything else, they’re crap.”

As for Conda: Conda has yet, from what I can tell, to adopt PEP-621. Open question then: Does this only apply to systems that know PEP-621? If so, how much should py/python[3] care about pyproject.toml? If Conda does everything inside its own specially-shaped environments, let’s step out of their way. I believe pip can determine if it’s under the mind-control of Conda, Pipenv, poetry, etc. In cases like this, make this PEP step aside unless the venv (via pyproject.toml) calls for it.
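
For what "determine if it's under the control of" an environment manager might look like, here is a heuristic sketch. It is not how pip does it, as far as I know. It assumes two things: that a venv or virtualenv shows up as sys.prefix differing from sys.base_prefix, and that a conda environment has a conda-meta directory under its prefix. The function name active_environment is hypothetical.

```python
import sys
from pathlib import Path


def active_environment(prefix: str = sys.prefix,
                       base_prefix: str = sys.base_prefix) -> str:
    """Classify the running interpreter's environment as 'conda', 'venv' or 'none'.

    Heuristic only: other tools (Pipenv, poetry) typically create ordinary
    virtual environments, so they would show up here as 'venv'.
    """
    if (Path(prefix) / "conda-meta").is_dir():
        # Assumption: conda keeps its package records in <prefix>/conda-meta.
        return "conda"
    if prefix != base_prefix:
        # venv/virtualenv point sys.prefix at the environment directory.
        return "venv"
    return "none"
```

A "step aside" rule could then be as simple as skipping the __pypackages__ lookup whenever this returns anything other than "none", though whether that is the right policy is exactly the open question.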


For whatever it’s worth, I largely agree with @njs here in that I think the current way PEP 582 attempts to work is a compromise that basically makes nobody actually happy in practice, for basically all of the reasons he’s outlined. The python command is the entrypoint to all of Python from the CLI; anything we add to it is going to be invoked for every invocation of Python [1] whether it cares about these new features or not.

The PEP knows this, and to try and mitigate this it makes a number of compromises that make the PEP harder and more complicated to use and understand, which ends up, IMO, making the PEP fall short of its goals while also making the python command (and thus ~every user of Python) pay the cost that comes from that PEP.

Something like the py launcher, or a flag to python, or a python -m pep582 module, making this opt-in or something that happens at a higher level, eliminates the need to worry about the impact on things that don’t need to use PEP 582 environments, which means that PEP 582 has more freedom to implement itself in an uncompromising way that will serve its intended users better.


  1. Other than outliers like embedding Python. ↩︎


One interesting thing about this is that, eventually we found out that PEP 370 doesn’t really solve the actual problem. People did switch to pip install --user, but that approach still induces enough bad things that are fundamentally identical to sudo pip install, only with a smaller blast radius (it only breaks your own environment, not everybody else’s on the same system), that eventually we introduced PEP 668 and effectively tell people to stop doing either.

So a lesson I would personally take from this experience is very different. I’d conclude that instead of introducing a “less wrong” solution, we should shoot for a solution that is “sufficiently correct”; instead of trying to maintain a separate, compromised solution, it may be more worthwhile to put effort in pushing the better solution to people.

I’m not saying your conclusion is wrong (especially considering the situation here is different from PEP 370), but it’s interesting to me how people can make very different conclusions out of the same thing.


I’m not sure I understand that – but it sounds like a great collaboration.

To be clear, I’m not suggesting that this PEP would be incompatible with conda – you’d “simply” have to not use __pypackages__ with conda environments.

But I do think it could lead to a lot of confusion for folks, particularly newbies – if this is widely adopted, then projects will ship with a __pypackages__ dir, and the “quick start” docs will tell folks how to use it, etc. And conda users will get confused.

I used to see a lot of messages on the conda lists about folks getting tangled up using virtualenv with / inside conda environments – it CAN be done, but it creates a mess, and it is totally unnecessary. I haven’t seen much of that for a while, so maybe it’s sorted itself out.

And maybe this will too.


I don’t understand – how could conda adopt PEP-621? It is NOT a Python package manager, it is a general purpose package manager. pyproject.toml is just what it says it is – metadata about a Python project.

I don’t think so – in fact, a well-built conda package of a Python distribution is installed exactly how it would be by pip. Because it was installed by pip, during the conda build process. pip shouldn’t behave differently if it is called by conda-build, that’s kinda the point.

The way conda environments work is that everything they need is inside one dir. This PEP proposes to put python packages in arbitrary places in the file system, it’s completely incompatible. [note: folks smarter than me might be able to figure out a way to make it work, I don’t know]


That’s a fair read. To be honest, 99% of my time trying to work with Conda has been swearing at it and wondering how people use it for anything useful, then throwing it away and using virtualenvs anyway. This has been my attempt to use it over and over again for machine learning projects.

What would be the cost of treating Conda like a funny virtualenv?

To be precise, Conda is a package manager, and it makes no sense to compare it with virtualenv. If you mean Conda environment, it would kind of make sense to compare it to virtualenv for certain use cases (when you populate Python and Python packages into the Conda environment, granted that’s probably >90% of its use cases in practice).

That would have been me years ago with pip and any python package that required compilation, before I found conda. I spent a LOT of time compiling packages for the Mac, and if Christoph Gohlke’s old Windows package repo wasn’t there, we’d have been dead in the water on Windows. It was truly unusable – conda was a massive step in the right direction.

What with wheels and all (and note that manylinux was inspired by conda) things are MUCH better now, but conda still has a major edge if you step out of the just-Python world. And even today – try using the osgeo stack on anything other than Linux.

Once conda-forge gained momentum, it’s been remarkably easy to get stuff done with conda.

If you struggle, then either:

  • no one is providing the packages you need for conda :frowning:
    or
  • you’re trying to use it in a way that it wasn’t designed for.

I don’t do ML, but a heck of a lot of folks do use conda for it.

I don’t follow. conda environments are kinda/sorta like virtualenv – I honestly don’t know virtualenv well enough to be able to describe the differences – except that conda manages non-python libraries, etc as well.


To be fair, the core of the PEP is nothing more than adding a couple of entries to sys.path on startup. That’s neither more nor less useful than the fact that the current directory, or python.zip, is added to sys.path on startup - it’s of benefit for certain expected use cases, and does no harm if not used.
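
For anyone who wants to see how small that core is, here is a rough sketch of the sys.path adjustment. It assumes the __pypackages__/X.Y/lib layout from drafts of the PEP, and the helper names pypackages_lib and add_local_packages are made up for illustration. Where exactly the entry lands in sys.path is also a simplification here; the real ordering is for the PEP to define.

```python
import sys
from pathlib import Path
from typing import List, Sequence


def pypackages_lib(base: Path, version_info: Sequence[int] = sys.version_info) -> Path:
    """Build <base>/__pypackages__/<major>.<minor>/lib (assumed layout)."""
    version = f"{version_info[0]}.{version_info[1]}"
    return base / "__pypackages__" / version / "lib"


def add_local_packages(base: Path, path: List[str]) -> bool:
    """Add the local packages directory to `path` if it exists.

    Returns True when an entry was added. Inserting at the front is a
    simplification for this sketch, not the ordering the PEP prescribes.
    """
    lib = pypackages_lib(base)
    if lib.is_dir() and str(lib) not in path:
        path.insert(0, str(lib))
        return True
    return False
```

Calling add_local_packages(Path.cwd(), sys.path) in a directory with no __pypackages__ changes nothing, which is the "does no harm if not used" property.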

The problem lies in the fact that the motivation for the change is around particular ways of using that addition. And there’s a lot of confusion and mistaken expectations about those changes to workflows - which isn’t helped by the many years this PEP has been around, and people’s speculation about what might happen.

Yes, the current version of the PEP[1] suggests changes to pip as well, but (1) these have significant issues around backward compatibility, and (2) a core PEP isn’t the right place to propose changes to pip’s UI anyway. We’ll thrash out how pip wants to respond to this PEP if it gets accepted on pip’s tracker, not here.

Maybe in the end, the PEP is something that doesn’t appeal to people. Maybe just adding entries to sys.path is a compromise that doesn’t do enough to be useful. That’s fine, in that case the PEP will be rejected.


  1. There’s a revision in the works. ↩︎
