PEP 582 - Python local packages directory

True, if you’re viewing it from a tool perspective that does make sense.

What would that look like?

:laughing: Ah, the internet.

The way I imagine this is a directory similar to pyproject-lock.d (say .pyproject-env.d). Each directory in it is a directory containing an environment, named something like

default-cp310-win_amd64
------- ----- ---------
  (1)    (2)     (3)
  1. Environment (dependency group) identifier
  2. Interpreter version identifier
  3. Interpreter architecture identifier

Say we have a tool called pyproject, we’d do

# Generate pyproject-lock.d/default.toml.
pyproject lock --group=default

# Syncs default.toml into .pyproject-env.d/default-cp310-win_amd64/.
pyproject sync --group=default --python=cp310 --arch=win-amd64

# Execute a command against a populated environment in default-cp310-win_amd64.
pyproject run --group=default --python=cp310 --arch=win-amd64 -- my-console-script
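
As a rough illustration of the naming idea above, the directory name could be composed and parsed like this. The helper names, the use of "-" as the separator, and the normalization of win-amd64 to win_amd64 are assumptions for the sketch, not behaviour of any existing tool.

```python
from typing import NamedTuple


class EnvName(NamedTuple):
    group: str   # (1) environment (dependency group) identifier
    python: str  # (2) interpreter version identifier, e.g. "cp310"
    arch: str    # (3) interpreter architecture identifier, e.g. "win_amd64"


def format_env_name(group: str, python: str, arch: str) -> str:
    # Assumption: "-" separates the three parts, so dashes inside the
    # architecture are normalized to "_" (the way wheel platform tags do).
    if "-" in python:
        raise ValueError("interpreter identifier must not contain '-'")
    return f"{group}-{python}-{arch.replace('-', '_')}"


def parse_env_name(name: str) -> EnvName:
    # Split from the right so that a group name may itself contain dashes.
    group, python, arch = name.rsplit("-", 2)
    return EnvName(group, python, arch)
```

Splitting from the right keeps the parse unambiguous as long as the interpreter and architecture identifiers never contain a dash, which is why the formatter normalizes them.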

There are some other UX details to figure out, e.g. whether the various options can be inferred or set with project-wide configs so we don’t need to type them in every time, whether we can just say --python=3.10 and --arch=64, and so on. But those can be explored by implementations once the basic structure is laid down.

There is a significant thing that needs clarification from the PEP: what is the path scheme for directories other than lib?

pythonloc and pipx seem to pick __pypackages__/X.Y/lib/bin, but pdm picks __pypackages__/X.Y/Scripts on Windows and __pypackages__/X.Y/bin on other platforms. This would be a main blocker for IDEs to add support for PEP 582.

IMHO the path should follow the scheme paths as defined by sysconfig rather than any (potentially) platform dependent value.

IMO, any proposal to add a new place where “things can be installed” should provide a sysconfig-compatible layout, by which I mean provide a clear explanation of how to calculate a path corresponding to every location returned by sysconfig.get_path_names() - ('stdlib', 'platstdlib', 'purelib', 'platlib', 'include', 'scripts', 'data') in Python 3.9.

It’s a shame that sysconfig doesn’t provide an extensibility mechanism to allow 3rd parties to register additional schemes - then we could just say that the proposal needs to define a sysconfig scheme and be done with it.

Having said that, this proposal, as it’s a PEP to change Python, could (and probably should) define a new sysconfig scheme (I’d suggest the name “local”, but feel free to bikeshed :wink:) and then tools can simply do sysconfig.get_paths("local") and know they will be able to treat the result just like any other install target.
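
To make the idea concrete, here is a minimal sketch of what a tool might compute today, given that sysconfig has no "local" scheme. The function name local_paths and the choice of lib and bin under __pypackages__/X.Y are assumptions for illustration; the PEP does not define them.

```python
import os
import sysconfig


def local_paths(project_dir, version):
    # Hypothetical layout: everything installable lives under
    # <project>/__pypackages__/<X.Y>/ (an assumption, not defined by the PEP).
    base = os.path.join(project_dir, "__pypackages__", version)
    # Start from the interpreter's own paths so stdlib, include, etc.
    # keep pointing at the running Python.
    paths = dict(sysconfig.get_paths())
    paths.update(
        purelib=os.path.join(base, "lib"),
        platlib=os.path.join(base, "lib"),
        scripts=os.path.join(base, "bin"),
        data=base,
    )
    return paths


paths = local_paths(os.getcwd(), "3.11")
# Every name a tool may ask for has an answer.
missing = set(sysconfig.get_path_names()) - set(paths)
```

With a real "local" scheme, all of this would collapse to a single sysconfig.get_paths call, which is the point being made above.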

But sysconfig IS platform-dependent itself. What scheme should be used? Or are we going to have a new scheme?

Since the PEP authors are no longer actively participating in this thread, I think it’d be worthwhile starting a new PEP that addresses the various issues we’ve accumulated. Actually, I feel there should be (at least) two PEPs, one to discuss the internal layout of each directory (whether to use a virtual environment, which scheme to use if not, etc.), and another to address the overall structure of the directory containing those environments (how we should hook into the interpreter to support loading modules from each environment, the name of each environment’s directory, the name of the containing directory, etc.).

Regarding the scheme specifically, we can have a new scheme, or use one of the existing schemes (either nt or posix_home seems good and simple enough). It’d work as long as (and only work if) the PEP (again, likely a new one instead of PEP 582) specifies one.
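
For reference, an existing scheme can already be pointed at an arbitrary directory by overriding the base and platbase variables. This is only a sketch of how a tool could reuse nt or posix_home; the helper name paths_under and the example directory are made up.

```python
import os
import sysconfig


def paths_under(base, scheme="posix_home"):
    # Overriding "base" and "platbase" relocates the install-target paths
    # (purelib, platlib, scripts, data) of an existing scheme; stdlib and
    # include still point at the interpreter installation.
    return sysconfig.get_paths(scheme, vars={"base": base, "platbase": base})


env_dir = os.path.join(os.getcwd(), "__pypackages__", "3.11")  # example only
for scheme in ("posix_home", "nt"):
    paths = paths_under(env_dir, scheme)
    print(scheme, paths["purelib"], paths["scripts"])
```

Both schemes give the same layout on every platform, which is what makes them candidates: posix_home puts scripts in bin, nt puts them in Scripts.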

@kushaldas should still be around.

The initial goal of the PEP is to help complete beginners. We discussed and decided (back then) to focus on the one group that needs the most help: people starting to learn Python. Anyone who needs a complex setup can easily continue using the existing virtual environments and related tools. The same goes for people who need multiple Python versions.

Thank you @brettcannon for the mention.

I am still looking forward to working on this, but I am in the middle of some life-changing incidents. So, I will be slow to reply.

The Scripts dir wasn’t omitted intentionally from pipx or pythonloc, I just don’t develop on Windows and didn’t think of it at the time.

Both of those tools also carry sufficient disclaimers that the implementation may change at any time, so I am completely fine with a slightly different or even radically different approach. I am in favor of fast iteration to learn and improve rather than locking into something early and preserving compatibility for it (i.e. JavaScript’s approach over C++'s).

My thought on the directory structure is it should match whatever is created for virtual environments. Namespacing by version seems fine, as long as whatever is in the version dir is basically a virtual environment. i.e. __pypackages__/X.Y/<venv>.

Hi there, as someone coming from C#/JS/TS land, this is very exciting.

Now, I just accept venvs because, hey, it’s Python, but what I’m wondering is: how likely is this PEP to get out of draft? Do we need a new one? What needs to happen for project-level deps to be a thing without the need for, frankly extraneous, shell manipulation? And can I help?

For info, you can use PDM. It has support for PEP 582:

Importing files based on the current working directory was the cause of CVE-2022-21699 in IPython.

What’s to stop an attacker dropping /tmp/__pypackages__/3.11/lib/sitecustomize.py or something else loaded on Python startup, and injecting code into other users’ Python interpreters run from /tmp?

I mean, arguably this vulnerability always exists as long as Python’s sys.path starts with '' for a REPL or with -m, but the PEP ought to discuss this.

Seems like it is discussed in the PEP:

Maybe related:

The only point that is discussed in the PEP is that /tmp/__pypackages__ will not be used if running a script from elsewhere.

But this implies it will be used if not running a script, and that’s exactly the usage that is described as vulnerable in the CVE.
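
The rule being discussed can be sketched as follows. This only mirrors the behaviour as described in this thread; the function name is made up, and the /X.Y/lib suffix is taken from the example path earlier in the thread.

```python
import os


def local_packages_dir(script_path, cwd, version):
    """Return the __pypackages__ lib directory that would be consulted.

    With a script, only the script's own directory is considered. Without
    one (for example in the REPL), the current working directory is used,
    which is the case the CVE comparison is about.
    """
    if script_path:
        root = os.path.dirname(os.path.abspath(script_path))
    else:
        root = cwd
    return os.path.join(root, "__pypackages__", version, "lib")


# Script elsewhere, started from /tmp: /tmp/__pypackages__ is ignored.
print(local_packages_dir("/home/me/proj/app.py", "/tmp", "3.11"))
# No script, started from /tmp: /tmp/__pypackages__ is picked up.
print(local_packages_dir(None, "/tmp", "3.11"))
```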

Well, I don’t think this PEP opens new vectors for this sort of thing. As you said, sys.path already makes overriding something like encodings to execute arbitrary code an issue if your file system access is already compromised. If it’s truly a concern then I suspect the PEP would mention it doesn’t make anything worse than it already is.

But maybe the PEP just needs an update to the latest PEP template which has a security section anyway.

I will try to update the PEP this weekend (or next) if I manage to get time from all doctor visits.

I agree, it’s within Python’s existing security model, though I think users still need to be cautioned about that model, as Glyph is doing in this blogpost.

I tried IPython 8.0.1 and I was surprised that it arranges for the current working directory to be put towards the end of sys.path instead of the start, so it is harder for the context of the current working directory to break/compromise installed packages. An encodings.py or a sitecustomize.py does not get executed on Python REPL start. An attacker could add a requests.py and hope that somebody runs it in a context that isn’t installed, though it might be better to write a StringIO.py and hope that somewhere in your import graph there’s still a leftover

try:
    from StringIO import StringIO
except ImportError:
    from io import StringIO
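
Here is a harmless way to see that fallback being hijacked on Python 3. The temporary directory stands in for an attacker-writable working directory, and the helper name is made up for the demonstration.

```python
import importlib
import os
import sys
import tempfile


def import_stringio_with_extra_path(extra_dir):
    # Append (not prepend) the directory, mimicking a working directory
    # that sits near the end of sys.path.
    sys.path.append(extra_dir)
    importlib.invalidate_caches()
    try:
        try:
            from StringIO import StringIO  # not in the Python 3 stdlib
        except ImportError:
            from io import StringIO
        return StringIO
    finally:
        # Undo the changes so the demonstration leaves no trace.
        sys.path.remove(extra_dir)
        sys.modules.pop("StringIO", None)


with tempfile.TemporaryDirectory() as tmp:
    # Harmless stand-in for an attacker-controlled file.
    with open(os.path.join(tmp, "StringIO.py"), "w") as f:
        f.write("StringIO = 'shadowed'\n")
    shadowed = import_stringio_with_extra_path(tmp)

print(shadowed)
```

Because nothing earlier on sys.path provides a StringIO module, the file at the very end of the path still wins, so position in sys.path does not help for names that are missing everywhere else.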

So it appears that IPython is trying to improve upon Python’s security model, and maybe the CVE is valid if that’s the case (but this seems like a lost cause; I would dispute the CVE).

Returning to PEP 582, at worst it adds more directories and paths to worry about when considering the security of an application. Maybe pip could audit this a bit, e.g. warn about __pypackages__ existing in a location where it is writing console_scripts/gui_scripts but which is not the directory in which it is writing package contents. I guess a logical extension would also be to warn about or refuse to install console_scripts/gui_scripts that end with .py.

On the security side, as long as the “Isolated Mode” behaviour of local packages support is sensible, and a local packages directory doesn’t affect scripts that live outside that directory and are run from a working directory that is also outside it when running in normal mode, then the ability for local packages to override system ones will be a feature rather than a bug.

The better argument for putting the script (or working) directory later in sys.path in general is a beginner-friendliness one: if you call your “experimenting with the socket module” script “socket.py”, your “import socket” line no longer does what you expect. Ditto for “experimenting with numpy”, etc - it’s natural to name experimentation scripts after the library you’re experimenting with, and the default path configuration means doing so not only doesn’t work, but fails in a cryptic fashion.

The “multiple platform specific local environments or only one generic one?” question is a thornier one, as the trade-off changes based on the use cases you’re attempting to support. A fully general solution would have three tiers of environment in the local packages directory:

  • a common platform and Python version independent environment with only pure Python and stable C ABI based extension modules in it
  • a Python version specific environment with pure Python and C extension modules in it (extension modules have their own mechanisms for parallel installation for more than one platform)
  • a platform compatibility tag specific environment that aligns with the way wheel distribution files get named and installed
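
One way to picture the three tiers is as a search order inside the local packages directory. The directory names "common" and the tag-named directory, as well as the most-specific-first ordering, are assumptions for this sketch; only the X.Y directory corresponds to what PEP 582 describes.

```python
import os


def tiered_search_path(local_dir, version, platform_tag):
    # Most specific first, so a platform-specific install would win over
    # a version-specific one, which would win over the common environment.
    return [
        os.path.join(local_dir, platform_tag),  # hypothetical, e.g. "cp311-win_amd64"
        os.path.join(local_dir, version),       # the "X.Y" tier from PEP 582
        os.path.join(local_dir, "common"),      # hypothetical directory name
    ]


for entry in tiered_search_path("__pypackages__", "3.11", "cp311-win_amd64"):
    print(entry)
```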

PEP 582 intentionally simplifies this topic by offering only the middle Python version specific tier, as it best reflects the way that system level and user level package installations work in the absence of virtual environments. What it doesn’t point out (and probably should make explicit) is that starting with only that version dependent but platform neutral environment doesn’t lock us in to only supporting that tier forever - the version independent environment and the platform specific environments could be proposed by later PEPs, citing the specific use cases that the simpler starting point didn’t address.