PEP 582 - Python local packages directory

@cs01 I think you’ll find this thread interesting: https://mail.python.org/archives/list/distutils-sig@python.org/message/YFJITQB37MZOPOFJJF3OAQOY4TOAFXYM/

It’s specifically trying to get that big-picture/top-down view you’re asking about, and indeed it concludes that PEP 582 is not the best approach for solving these problems, partly because of the entry points issue.

3 Likes

This thread also: Standalone app deployment story

And I’d like to repeat my opinion that pip (and the dependency management paradigm it serves) is fundamentally the wrong approach to deploy standalone applications like flake8, Black, etc. I am ambivalent on whether pypackages should support console scripts, but if it does, it should be because of other reasons, not to bolt standalone application distribution back onto it.

3 Likes

@uranusjr This is about use cases again :-). If your use case is “I want to install a standalone application that happens to be written in python”, then I agree that pip/pypackages/etc. are probably not the right approach. If your use case is “I want my project workflow tooling to let me install specific versions of applications into isolated, project-specific environments, and coordinate those applications with project collaborators”, then pip’s approach is fine, and people might reasonably expect pypackages to address that use case.

1 Like

Pip has nothing to do with the approach here - this is virtualenv’s influence.

The only thing required to make pypackages work “the same” here is to do a mkdir and pip install into that.

Not that I’d recommend doing that either, just as I’d hesitate to say “manually create a venv for each of your tools” (pipx is much closer to the best way here), but it doesn’t help to let invalid comparisons go unanswered.

I agree. But then I'd argue entry points cause more confusion than benefit in this case; python -m works much better since it eliminates a layer of indirection (PATH). I've certainly seen too many people complaining that their venv isn't activating correctly, only to find that their bashrc is misconfigured. pypackages would definitely help with this, but console scripts wouldn't.

I worry the next problem will come up: they may use the wrong Python.
Sadly, it's very common to have multiple Python installations: for example, the system Python, Homebrew, pyenv, and conda.

When activating a virtual environment, the Python that was used to create the venv is selected automatically.
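Roughly, assuming a Unix shell:

python3.9 -m venv .venv       # the venv records this interpreter in pyvenv.cfg
source .venv/bin/activate
python --version              # resolves to the Python 3.9 that created the venv

As far as I can tell, PEP 582 records nothing like this, so whichever python happens to be first on PATH gets used.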

How can people easily use the correct Python with PEP 582?

1 Like

I happen to be thinking about exactly the same topic today :stuck_out_tongue: This also goes back to Nathaniel’s fantastic elephant; these topics are somewhat separate, but in the end they need to work well together so we can provide a seamless experience.

It seems to me that we can learn things from recent work on Windows. I just posted in Ideas on a related topic: Provide a PEP 514-like registry for macOS

So…

mkdir foo
py -m pip install -t foo flake8
foo\scripts\flake8.exe

You’d expect that to work, and for the exe wrapper in foo\scripts to run the system Python (because pip hard-codes sys.executable in the wrapper shebang), somehow with the foo directory on sys.path so that the imports needed for flake8 would work?
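For reference, the wrapper pip generates for a console script on Unix looks roughly like this (the flake8 entry point path is from memory, so treat the details as illustrative):

#!/usr/local/bin/python3.8
# ^ pip hard-codes sys.executable from install time here
import sys
from flake8.main.cli import main   # the entry point flake8 declares in its metadata

if __name__ == '__main__':
    sys.exit(main())

On Windows the same interpreter path is embedded inside the generated .exe launcher rather than in a shebang line.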

There’s quite a lot of pieces here that don’t work like that right now, and I’m not 100% sure what you expect to change and how. Also, there’s no pypackages involved here.

I may well be missing something - quite possibly flake8 isn’t the sort of example you’re thinking of (I picked it because, to me, it’s an obvious example of “project workflow tooling” that @njs referred to). This discussion seems to me to be getting bogged down in generalities, and we’d be better talking about very specific and precise examples, so that people know what’s being discussed…

Like I said, neither approach is the right one for flake8 (I like using black as the example because it autocompletes on my phone :wink: ). “pipx run black” is far better, and if pipx is looking at a local config file to choose the exact version of black for the current project then I think that fits the need nearly perfectly.

My point is that the difference between venv and pypackages is in how they locate packages (environment variables and absolute symlinks vs. relative to the script and fully relocatable); pip doesn't actually do anything different between them. That's once it supports the latter, at least, which as you point out it currently does not - but that's been a regular point throughout this proposal: the proposal is about CPython and not pip, by request of Donald, so we have to assume that pip will figure out how to work with changes to CPython.

You should have used -t foo/__pypackages__ and "somehow" had a script wrapper that does "cd foo; python -m flake8" (magically in the original CWD, but without changing sys.path back). This is certainly a piece that isn't figured out yet, but it is not part of the change to CPython as such, and so it's not in the PEP.
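A minimal sketch of the kind of wrapper I mean (nothing generates this today, and the 3.8/lib layout inside __pypackages__ is just for illustration):

# flake8-wrapper.py - hypothetical launcher, not part of the PEP
import runpy
import sys
from pathlib import Path

tool_dir = Path(__file__).resolve().parent                       # e.g. foo/
sys.path.insert(0, str(tool_dir / "__pypackages__" / "3.8" / "lib"))
# The process CWD is untouched, so flake8 still lints the caller's project.
runpy.run_module("flake8", run_name="__main__")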

Adding standard script wrappers to CPython would take another PEP, but I’m not totally against it. Then I could go fix problems like embedding absolute paths with invalid quoting :wink:

These two comments are confusing to me, because they seem to contradict each other. Are you saying that __pypackages__ is or is not intended as a way to manage which version of flake8 is used for your project?

I completely understand the idea of having a script put its runtime dependencies in __pypackages__. That’s basically just a simpler and more flexible way of having a zipapp, and seems entirely reasonable to me.

But when the conversation moved on to “project workflow tooling”, it seemed to divert to discussing development-time dependencies, and I lost track of who’s saying that __pypackages__ is intended to handle that case, and who’s saying it’s not. Hence my plea for specific examples.

There are way too many ways of making “applications”, “tools”, or “command line utilities” (call them what you want) in Python already. The __pypackages__ proposal adds another one, but doesn’t really explain which of the existing approaches it’s intended to replace. Obligatory XKCD.

My point is that you may give any script its own dependencies in its own folder. So if you always run it as python C:\flake8\flake8.py then you can put its dependencies in C:\flake8\__pypackages__ and it’ll just pick them up.
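Concretely, something like this (treating the 3.8/lib layout and the particular dependencies as illustrative):

C:\flake8\
    flake8.py                  # invoked as: python C:\flake8\flake8.py
    __pypackages__\
        3.8\
            lib\
                flake8\
                pycodestyle\
                pyflakes\
                mccabe\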

Having tooling so that you don’t have to type the full path is an extra task that is outside the scope of the PEP.

2 Likes

OK, so this is nothing to do with console scripts. Good, I think I follow now.

2 Likes

From talking to Kushal, and from reading the “motivation” section of the PEP, my understanding is that development-time dependencies are the only motivation for __pypackages__. In particular, development dependencies for absolute beginners, where you can’t reasonably teach them about virtualenv until sometime after you’ve taught them about if and print.

So IIUC, the project workflow tooling case is 100% what the PEP authors are trying to solve.

2 Likes

OK, so I’m very confused. I guess I need to re-read the PEP and try to understand it better…

More likely we need to rewrite the PEP better :slight_smile: It’s very focused on mechanics right now and only tells half the story because only half the story exists in the tool (CPython) that the proposal applies to.

Reading it with the mindset of "if CPython would infer this search path from the startup file, what could other tools do to make users' lives easier" will yield the best results.

Thanks. With that suggestion (and a determination not to read anything into the proposal that's not there [1]) the PEP makes perfect sense to me.

The whole thing about having development tools in the __pypackages__ directory is, as you say, part of "the other half of the story", in the sense that it's perfectly possible to install a tool like black into __pypackages__, but how that tool then becomes available as something the user can invoke from the command line depends on how other tools handle this. python -m black would just work, as that only involves the core interpreter, but if people want a shell black command, that's when things like console entry points, and how they get built, become relevant.

One problem I foresee with python -m black is that its meaning will change if you change directory. That's a pain, as it means you need to cd back to the project root every time. So having better tool support would be useful (although even with tool support, I'm still a bit concerned that we might have just made python -m into something that we would rather not recommend…)
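To make the cd problem concrete (assuming, as the PEP implies, that -m picks up __pypackages__ from the current directory):

cd ~/myproject        # ~/myproject/__pypackages__ contains black
python -m black .     # works: black is found via __pypackages__
cd src
python -m black .     # fails: no __pypackages__ in this directory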

There’s also a fairly bad (but hard to prohibit) security concern that the PEP doesn’t address. Imagine a directory containing a hidden __pypackages__ folder containing a malicious pip package. We pretty strongly recommend python -m pip as the canonical way to invoke pip, so there’s a strong possibility that this would give an attacker a way to trick users into invoking malicious code. OK, it requires the attacker to have access to a directory that the user will run commands from, but even so it seems risky (consider putting something like this in tmp on a Unix box - at least Unix doesn’t have hidden directories like Windows does…)
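The attack I have in mind is roughly this (layout illustrative):

/tmp/innocent-looking-dir/
    __pypackages__/            # easy to overlook; can be marked hidden on Windows
        3.8/
            lib/
                pip/
                    __init__.py    # attacker's code, imported by "python -m pip ..."
                    __main__.py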

So IMO, the __pypackages__ proposal seems like a nice simple and understandable mechanism (subject to the security question I mentioned above). But to be actually useful will involve a whole other exercise in updating tooling and project workflows to take advantage of that mechanism. And that other exercise is likely to be by far the hardest part of the process…

[1] …and a fanatical devotion to the Pope

1 Like

This, and the rest of the example, is a very good point. I don’t think it came up in any of the discussions.

It’s not necessarily fatal to the proposal, as a “pip.py” in the current directory will have the same effect today, but perhaps we ought to think about -m some more? I like it, but its ability to search a lot of places to find code to run is a bit of a risk at times.

There are fundamental differences between .py files and .exe files (as I know you know). As a simple example, subprocess.run(['pip', 'install', 'foo']) will fail to find your pip.py. As another example, VS Code doesn't recognise .py files as OS commands (at least it didn't for things like pylint when I last tried…). This is probably fixable at the tool level, but that was sort of my point - it needs tool support, and it doesn't fix -m directly.

IMO, -m is an incredibly useful and effective feature. The problem is not so much with -m as it is with the whole question of “what is a Python application”. In practical terms, there are various ways of deploying Python code:

  1. Standalone scripts
  2. Libraries that expose command line interfaces
  3. Full applications

-m is perfect for (2), as are console scripts. But they get used for (3) as well, because there’s no particularly good solution for (3). And of course __pypackages__ is ideal for (1) (supplementing things like zipapps).

My point about python -m pip or python -m black wasn’t so much that -m is bad, as that pip and black should probably be distributed as type (3) applications, but the lack of a good lightweight solution for that makes it impractical. This brings us back to this thread, of course…

1 Like

I believe what Steve meant is that pip.py in cwd changes the behaviour of python -m pip, just like pypackages in cwd does. pip.exe should always do the right thing in this situation.
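In other words, something like this already works today, because -m puts the current directory at the front of sys.path:

echo 'print("hijacked!")' > pip.py
python -m pip install requests     # prints "hijacked!" instead of running pip
pip install requests               # the pip wrapper found on PATH is unaffected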

1 Like