PEP 582 - Python local packages directory

I agree 100%. @kushaldas I thought I’d asked explicitly for that example to be modified or omitted, because it perpetuates the (incorrect) impression that the PEP expects pip’s default to change.

If you want to say pip install --prefix __pypackages__ twisted (suggesting the approach that the PEP notes will mostly work right now), or pip install --pypackages twisted (with an explicit note in the example that this will not work unless pip adds a --pypackages option, which is not certain) then that’s fine. But I’d honestly prefer the example to just go, even though I know you want to focus on the training use case. You’re just not going to get pip install twisted ever doing what you want here, and the PEP needs to accept that.


Regarding the recursive search issue, I believe following Node’s behaviour will be a much better choice. No recursive search will reduce the value of PEP 582 even more and introduce even more confusion.

Two random remarks:

I would consider this CVE to be much of a nothingburger, bordering on “It rather involved being on the other side of this airtight hatchway”. To perform this attack, the attacker needs to be able to write to C:\.git, which requires administrative privileges by default, and if someone has administrative privileges, they could just as well put their malicious code in the user’s Startup folder or whatever.

The modern .NET (no-longer-Core) answer is:

  • At runtime, dependencies (.dll files) are typically looked up in the directory in which the .dll file requesting the dependency resides
    • Exceptions to the “typically” are largely undocumented, and not very important for most developers
  • At build time, the .csproj file specifies dependencies as package names or paths to other project files, and the build system gets the .dll files to put in the output folder from those places (you can also specify a path to a .dll if necessary)
    • Your code and all its (non-system) dependencies end up in one big directory
  • The dotnet CLI tool tries to find a .csproj file in the current working directory only
    • If you are somewhere else, you can specify a path to a .csproj

The single-sentence TL;DR answer is “you specify dependencies in your .csproj and dotnet run does the right thing”. A .NET project without a .csproj is pretty much impossible, so there is one obvious way to do it™.


:person_shrugging: “We don’t think the risk is worth worrying about, but yes we are aware of what git did” seems like a fair enough way of “discussing how Python will avoid issues like the git vulnerability” if that’s what the PEP authors feel :slightly_smiling_face:

Yep. And node projects have a package.json, don’t they? (I’m not that familiar with either system). So there’s a fairly well defined “project directory” for both. Python is used in many different ways, including “a bunch of scripts in a directory with other stuff in there too, so it’s not really a Python project as such”. Specifically, for example, we can’t assume everything has a pyproject.toml. That’s not to say we can’t work like other systems, just that we need to be careful that their solutions don’t lock us out of workflows that they don’t have.

But @kushaldas has said he’s not putting parent scanning into his PEP, so I guess this is a discussion for what an alternative PEP might look like (unless he can be persuaded to change his mind).

Even an empty pyproject.toml? Like, what if PEP 582 was only activated when such a file was present and that would unambiguously determine where __pypackages__ was put?
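To make the marker-file idea concrete, here is a minimal sketch. It assumes a hypothetical rule in which `__pypackages__` is honoured only when a `pyproject.toml` (possibly empty) sits in the same directory. The function name `pypackages_if_marked` is made up for illustration; nothing like it is specified in the PEP or implemented in CPython.

```python
from pathlib import Path
from typing import Optional


def pypackages_if_marked(directory: Path) -> Optional[Path]:
    """Return directory/__pypackages__ only if a pyproject.toml marker is present.

    Hypothetical rule, not part of PEP 582. No parent directories are searched.
    """
    marker = directory / "pyproject.toml"
    candidate = directory / "__pypackages__"
    if marker.is_file() and candidate.is_dir():
        return candidate
    return None
```

Under this rule the marker file, not the current working directory alone, decides whether the directory is used.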

npm uses package.json; node itself does not, AFAIK (though for various reasons, lots of libraries you can install via npm won’t work without their package.json). I believe the node import system doesn’t know anything at all about package.json.

Node makes use of package.json to understand imports/exports and several other key components of how JavaScript ES6 and beyond load packages (the CommonJS vs ECMAScript loader, for example); this is covered in the Node.js documentation. A lot of Node’s behaviour is tuned from package.json.

I could easily see an interim tool that wraps pip touching this, putting in a short placeholder with a link to relevant documentation.

Searching for __pypackages__ is a bit worse. Consider:

  • I clone a random github repo to look at the code
  • While poking around in the code, I need to pause to do some quick arithmetic, so I run python to use the repl as a calculator, with working directory inside this repo

Right now, that’s safe – you don’t need to think about where you are when you run python. With PEP 582 as written, it’s unsafe (will immediately perform arbitrary code execution controlled by whoever made the repo), but only if I’m in a directory containing a __pypackages__, so at least there’s a clearly visible marker of the unsafety. With directory searching, a random __pypackages__ arbitrarily far away from my current directory can also perform arbitrary code execution.
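The difference between the two lookup rules can be sketched as follows. This is only an illustration of the behaviours being compared, not code from the PEP or from CPython, and both function names are invented here.

```python
from pathlib import Path
from typing import Optional


def find_in_cwd_only(start: Path) -> Optional[Path]:
    """PEP-as-written style: look only in the starting directory."""
    candidate = start / "__pypackages__"
    return candidate if candidate.is_dir() else None


def find_with_parent_scan(start: Path) -> Optional[Path]:
    """Node-style: walk from the starting directory up to the filesystem root."""
    for directory in (start, *start.parents):
        candidate = directory / "__pypackages__"
        if candidate.is_dir():
            return candidate  # nearest match wins
    return None
```

With the second function, a `__pypackages__` several levels above the working directory is picked up even though nothing in the current directory hints at it.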

It’s not Heartbleed or anything, but it’s a legitimate concern, especially since we’d be changing the semantics of something that people might already be doing.


It’s safe until your quick calculations involve a call to math.sqrt() and you need to use import:

$ echo 'print("j00 pwn3d!!!1!11")'>math.py
$ python3
>>> import math

Just because a different programming language chose to do something not that great security wise, we don’t have to follow the same path. Things should be secure first.


If someone is in a position to drop a __pypackages__ in your parent directory, you’ve already lost the game security wise IMO.


True, but in a multi-user system, or a checked out source code/repository with dependencies, this can also happen. Plus there is also the startup time cost, which we are skipping here (by saying no support for scanning parent directories).


We have PYTHONSAFEPATH and the -P option since Python 3.11; how would PEP 582 interact with them? As described in gh-57684, I think it’s a real problem.

From the PEP

For example, __pypackages__ will be ignored if the -P option or the PYTHONSAFEPATH environment variable is set.
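As a rough sketch of what that rule means in practice, the check below reads `sys.flags.safe_path`, which is set when the interpreter is started with `-P` or with `PYTHONSAFEPATH` in the environment (Python 3.11 and later). The function name `pypackages_allowed` is an assumption for illustration; the PEP does not define such a helper.

```python
import sys


def pypackages_allowed() -> bool:
    """Return False when the interpreter runs in safe-path mode.

    sys.flags.safe_path exists from Python 3.11. On older versions the
    attribute is missing, so this falls back to False (no safe-path mode).
    """
    safe_path = getattr(sys.flags, "safe_path", False)
    return not safe_path
```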

If you haven’t yet, please read the updated PEP. It does try to cover a lot of this.


Are you suggesting that people check in __pypackages__?

Enforcement of where it can’t be is where I’d start: Can’t be in $HOME, /, /tmp, /[s]bin, /usr/[s]bin, etc.
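A deny-list of that kind might look like the sketch below. It is POSIX-only, covers just the directories named above (the “etc.” is left open), and the names `forbidden_dirs` and `may_host_pypackages` are hypothetical.

```python
from pathlib import Path
from typing import Set


def forbidden_dirs() -> Set[Path]:
    """Directories that must not directly contain __pypackages__ (illustrative list)."""
    candidates = [Path.home(), Path("/"), Path("/tmp"),
                  Path("/bin"), Path("/sbin"), Path("/usr/bin"), Path("/usr/sbin")]
    # Resolve symlinks so that e.g. /bin -> /usr/bin compares equal.
    return {path.resolve() for path in candidates}


def may_host_pypackages(directory: Path) -> bool:
    """Return True if directory itself is not on the deny-list."""
    return directory.resolve() not in forbidden_dirs()
```

Note that this only rejects exact matches: a project directory below `$HOME` or `/tmp` is still accepted.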


I have a light retraction-ish to make: I implied Node doesn’t recurse by “looking in the current directory”.

I actually just did a little reading/testing and Node does recurse to find node_modules, and npm will generate a loose package.json if you don’t have one already:

indrora@DESKTOP-HTA6J0U:~/src$ mkdir test
indrora@DESKTOP-HTA6J0U:~/src$ cd test
indrora@DESKTOP-HTA6J0U:~/src/test$ npm install colors

added 1 package in 96ms
indrora@DESKTOP-HTA6J0U:~/src/test$ mkdir foo/bar/baz/quux -p
indrora@DESKTOP-HTA6J0U:~/src/test$ cd foo/bar/baz/quux
indrora@DESKTOP-HTA6J0U:~/src/test/foo/bar/baz/quux$ npm root
/home/indrora/src/test/node_modules
indrora@DESKTOP-HTA6J0U:~/src/test/foo/bar/baz/quux$ node
Welcome to Node.js v19.6.0.
Type ".help" for more information.
> var colors = require("colors");
undefined
> console.log(colors.rainbow("Hello, friends!"));
Hello, friends!
undefined
>
indrora@DESKTOP-HTA6J0U:~/src/test/foo/bar/baz/quux$

Note that Node warns people to not check in node_modules (it’s often quite large, in the gigabytes) and it looks funny when you do so in git.


As for startup cost: any startup cost that this would incur is negligible to zero. More time, I guarantee, goes into logic for selecting the right locale than into checking for a few directories. Currently, at startup, python3.10 on my VM makes 146 calls to stat() before it shows the banner; 35 of those calls result in ENOENT.
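Anyone who wants to check the cost on their own machine could use something like the following sketch. It times one directory check per ancestor of a starting directory; the function name `time_parent_scan` is invented here, and the measurement reflects a warm filesystem cache rather than a cold interpreter start.

```python
import os
import time
from pathlib import Path


def time_parent_scan(start: Path, repeats: int = 1000) -> float:
    """Return the average seconds needed to check every ancestor for __pypackages__."""
    directories = [start, *start.parents]
    began = time.perf_counter()
    for _ in range(repeats):
        for directory in directories:
            os.path.isdir(directory / "__pypackages__")  # one stat() per directory
    return (time.perf_counter() - began) / repeats


if __name__ == "__main__":
    average = time_parent_scan(Path.cwd())
    print(f"average scan time: {average * 1e6:.1f} microseconds")
```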


I don’t know whether it is appropriate to post review comments here. The PEP gives the following code to retrieve the install scheme:

scheme = sysconfig.get_preferred_scheme("prefix")
purelib = sysconfig.get_path("purelib", scheme, vars={"base": "__pypackages__", "platbase": "__pypackages__"})
platlib = sysconfig.get_path("platlib", scheme, vars={"base": "__pypackages__", "platbase": "__pypackages__"})

But the preferred scheme on Windows is nt, so the packages will be installed under __pypackages__/Lib/site-packages. That path has no version component, which seems to cause package mixing between different Python versions.

cc @kushaldas
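The difference is easy to see by expanding the same call for two named schemes. The sketch below passes the scheme names `nt` and `posix_prefix` explicitly instead of calling `sysconfig.get_preferred_scheme`, so it can be run on any platform; the helper name `purelib_for` is made up for illustration.

```python
import sysconfig


def purelib_for(scheme: str) -> str:
    """Expand the purelib path of a scheme with __pypackages__ as the base."""
    variables = {"base": "__pypackages__", "platbase": "__pypackages__"}
    path = sysconfig.get_path("purelib", scheme, vars=variables)
    return path.replace("\\", "/")  # normalise separators for comparison


if __name__ == "__main__":
    for scheme in ("nt", "posix_prefix"):
        print(scheme, "->", purelib_for(scheme))
```

The `posix_prefix` result contains the running interpreter’s `major.minor` version, while the `nt` result does not, which is the mixing concern raised above.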

Yes, that is the standard behavior of current pip too, and CPython follows the same path. Instead of defining a new path, we are reusing what is already expected.

That is different. The current install scheme is associated with a specific Python interpreter, so there is no possibility of mixing packages from different Python versions. Think of __pypackages__ as a venv without an interpreter: installers can install packages for different Python versions into the same __pypackages__ directory. The user scheme is in a similar situation, which is why, even on Windows, its site-packages directories are isolated by the Python{py_version_nodot_plat} part.

First, remember that __pypackages__ isn’t “venv without an interpreter”. It’s just a sys.path entry. I get the analogy, but don’t push it too far. (Sorry, I know I’m starting to sound like a stuck record on this)

Second, there’s no suggestion in the PEP (or in my personal view of the PEP, for what that’s worth) that being able to install packages for multiple Python versions in the same __pypackages__ is a goal. The key use case is for beginners, who are extremely unlikely to have multiple Python versions in the first place. If it’s something you view as a key requirement for the PEP, then you need to argue for it as a feature in its own right, not query details of the PEP based on the assumption that it’s useful and “should work”.

As a practical issue with handling this differently, there’s no existing install scheme that has versioned directories on Windows, so we’d need a new scheme. That idea was mentioned in the “rejected ideas” section, so see there for the reasons for not going down that route.

Edit: Whoops, I’d forgotten that the user scheme on Windows is versioned. Sorry about that, but I don’t think the “user” scheme is in general an appropriate choice here (we’d be using it because it’s convenient, not because its intended use matches our needs).


I suspect that not having versioned directories on Windows will be problematic.

The paths on Windows don’t have versioning because the expectation is that the base path will be versioned, so the individual parts do not need to be. The opposite is true of both *nix and __pypackages__. People are going to get very weird, confusing errors if they have multiple versions of Python installed on their Windows machines and they’re using __pypackages__.

Rejecting a dedicated scheme feels shortsighted: it adds a long-term cost to avoid a short-term one.


Adding one is a valid choice to make. The downside is that until pip adds support for installing into that dedicated scheme, which won’t happen until some time after Python introduces that scheme[1], the PEP is largely unhelpful to its intended target audience. Whereas with the current approach, pip install --prefix is a good enough short term option.

It’s not me you need to persuade, of course, it’s @kushaldas. (And I guess as a co-author, you have some direct influence on the PEP :wink:)

Just to make my position here clear: I helped @kushaldas with the latest re-work of the PEP, ensuring that the lack of clarity I’d complained about was addressed and that the intent and scope of the PEP were clear. I did not, however, try to change his mind on the content of the PEP, nor does the resulting text necessarily reflect my views. Personally, I’m neutral on the PEP itself; in particular, I’m not sufficiently interested in it to try to get the details changed. The only things that mattered to me have been addressed, so I’m good: the PEP does not state that pip will change its default install location, and any tool changes are clearly described as what the PEP hopes will happen rather than as requirements.

I am trying to channel @kushaldas and explain what I believe his position is on people’s questions. But that’s mainly because I’m frustrated with the way this discussion struggled to keep focus in the past, and I’m hoping that by doing this, I’m helping people express their concerns in a way that’s actionable. But I’m not an author of the PEP (or even a sponsor), just an interested bystander.


  1. How long after largely depends on whether anyone steps up to do the work of implementing it. ↩︎
