What's the deal with the pipx shared venv, exactly?

I’ve been experimenting with Pipx for the last couple of weeks, trying to figure out how to unleash its full power. I’ve noticed that a shared venv is created that contains pip, setuptools and wheel. The per-project venvs then have .pth files that point at the shared venv’s site-packages, and are created --without-pip.
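For concreteness, here’s my understanding of the mechanism, as a runnable sketch: a .pth file is just a list of extra directories, and the site module appends each one to sys.path at interpreter startup. (The paths below are throwaway stand-ins, not pipx’s actual layout.)

```python
# Minimal demo of the .pth mechanism (paths are throwaway stand-ins,
# not pipx's real layout).
import os
import site
import sys
import tempfile

site_dir = tempfile.mkdtemp()              # plays the app venv's site-packages
shared = os.path.join(site_dir, "shared")  # plays the shared venv's site-packages
os.makedirs(shared)

# A .pth file is just a list of directories, one per line.
with open(os.path.join(site_dir, "pipx_shared.pth"), "w") as f:
    f.write(shared + "\n")

site.addsitedir(site_dir)  # what site.py does for each site dir at startup
print(shared in sys.path)  # True
```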

So far, so good - now I don’t have multiple copies of Pip scattered across my disk just so that Pip can know where to install things. But the details of the architecture have me scratching my head a bit:

  • Are the .pth files actually necessary, given the existence of --python and --target options for Pip? Pipx presumably always knows where Pip is located, and presumably will be running it in a separate process, and presumably can figure out which executable to use it with and what flags to pass. So I don’t understand why it’s important, or even helpful, to ensure that the pip package is on sys.path when the application runs.

    I’m especially concerned here that the shared environment might not have the same Python version as a per-project one, since there’s also a --python option for pipx install. For Pip it shouldn’t matter, but on principle this seems brittle since it’s looking for site-packages specifically intended for a different version of Python.

  • When do these setuptools and wheel installations actually get used? I thought that Pip defaults to isolated builds now, so that if Pipx tries to install from an sdist, Pip would be forced to install a fresh setuptools and wheel into an isolated environment first before installing the built wheel into the real virtual(?) environment.

  • Supposing I’m right that they’re specifically there in case I explicitly do something like pipx install pycowsay --pip-args "\"--no-build-isolation\"" [1]. What if I instead install something that uses a different backend? I don’t see a supported way to add other backends (or other packages) to the shared environment.

  • Is there anything else particularly clever or subtle about this setup that I might want to know about?

  1. I previously found in testing - while trying to avoid hitting the Internet for a simple install - that this weird quoting is necessary; one pair is stripped by the shell and the other is needed to make it a Pip arg instead of a Pipx arg. ↩︎
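To double-check the two layers of stripping described in the footnote: after the shell removes the outer quotes, pipx still receives one quoted token, and shlex-style splitting (which appears to be how pipx parses --pip-args, based on the behaviour I observed) removes the inner pair:

```python
# After the shell strips the outer quotes and the backslash escapes,
# pipx still sees one quoted token; shlex-style splitting then strips
# the inner pair and yields the bare pip argument.
import shlex

received_by_pipx = '"--no-build-isolation"'  # what survives the shell
print(shlex.split(received_by_pipx))         # ['--no-build-isolation']
```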

The shared lib predates pip’s --python flag. The --target flag can’t be used as it doesn’t properly support upgrades and uninstalls. Also, some projects (still) expect pip to be installed in every environment.
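As a sketch of what the --python route looks like: with that flag, a single shared pip can install into any target venv, with no .pth hack needed. The venv path here is illustrative, and this only prints the command rather than running it:

```python
# Sketch: pip's --python flag lets one shared pip install into an
# arbitrary target venv. The venv path is illustrative; we just build
# and print the command rather than running it.
import sys

venv_python = "/home/user/.local/pipx/venvs/pycowsay/bin/python"  # illustrative
cmd = [sys.executable, "-m", "pip", "--python", venv_python,
       "install", "pycowsay"]
print(" ".join(cmd))
```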

So it’s fixable, but no-one has got round to looking at doing so.

Same answer - never these days, it’s just out of date code.

No, just the historical context. I guess you could say it’s a clever/subtle solution to a problem that no longer exists :slightly_smiling_face:


I see.

(… Any ideas for forcing Pipx-driven Pip to use a cached backend, when it’s compatible with the package? This is mainly so I can speed up editable installs. With no Internet connection and no build isolation, a Setuptools-powered approach is taking over 3 seconds on my machine for a trivial project. With build isolation and a barely working Internet connection, it can hang pretty much indefinitely.)

Well actually somebody has! :wink:

I’m not sure I follow you. Yes, uv could be used in pipx as an alternative to pip (the performance of uv makes this attractive), but nowadays both pip and uv have a --python option[1] that would allow pipx to avoid the .pth file hack it uses at the moment regardless of which installer it uses.

  1. and indeed, pip got it before uv even existed… ↩︎


Actually, could you elaborate on this a bit? For example, any known workflow that would be broken? Is this a “we would need to do a deprecation cycle and teach users how to do it the new way” sort of thing, or… ?

The one I recall was IPython’s %pip magic, which runs subprocess.run([sys.executable, "-m", "pip", ...]) in the background. It doesn’t declare a dependency on pip, so it will fail in an environment where pip isn’t installed.
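A minimal sketch of that pattern, run against whatever interpreter executes it - the return code depends on whether pip is actually present in that environment:

```python
# The pattern %pip relies on: shell out to pip under the *current*
# interpreter. In a venv created --without-pip, this exits non-zero
# with "No module named pip".
import subprocess
import sys

result = subprocess.run(
    [sys.executable, "-m", "pip", "--version"],
    capture_output=True,
    text=True,
)
print(result.returncode)  # 0 only if pip exists in this environment
```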

Yes, it’s a “we need people to change their code/workflows to fix assumptions that may not be true in future” situation. But it’s not clear how we’d detect or deprecate this usage.

There was a Discourse thread about this at the time we introduced the zipapp distribution of pip.

I can’t really imagine detection. That seems like it would have required building something in to the recommended subprocess approach ahead of time.

As far as deprecation goes, I similarly can’t see a way for Pip itself to say anything about it and have it actually directed at the right people. So that just leaves the documentation, and praying that the right people read it.

Aside from that, there’s the reasoning that if people expect python -m pip in a new subprocess to work, then they’re expecting pip to exist in the current environment. Since Pip isn’t part of the standard library, these users therefore have pip as a dependency (even if they aren’t using it as a library), and should declare it. That would at least allow pip install theirapp, run from a different Pip, to notice that the destination environment lacks Pip and install it. (And when run from the environment’s own Pip, everything already works and there’s nothing more to do.)

That does defeat the purpose of the zipapp, but those developers (and their users) weren’t planning around the zipapp anyway. The situation hasn’t really gotten worse for them - there’s just a roadblock that prevents things from getting as much better for them as they ought to.
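To make that concrete, here’s a minimal sketch of the kind of up-front check such an app could do instead of assuming pip exists (the helper name is mine, not any established API):

```python
# Sketch: an app that intends to run "python -m pip" in a subprocess can
# at least verify pip is importable in the current environment first.
# (Function name is illustrative, not an established API.)
import importlib.util


def pip_available() -> bool:
    """Return True if pip is installed in the current environment."""
    return importlib.util.find_spec("pip") is not None


print(pip_available())
```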


Notable from that discussion:

So maybe what’s needed is a tool that actually does have that API - and is minimal and focused. Or maybe just the library for it, never mind making a(nother) competitor to Pip. (Or maybe in the far future, Pip could use something like that internally.)

Overwhelmingly, as far as I’m aware, people[1] do these tricks because they’re distributing an application with optional extras, and want the user to be able to obtain the extras at runtime (as opposed to only supporting theirapp[feature] at the initial installation, or expecting the user to understand and wrestle with Pip manually). One good example of this that I’ve run into is Manim.

So these are cases where sdist support is probably much less important on average: you wouldn’t be doing this in the first place if you had the kind of users who could deal with an sdist installation potentially failing[2]. Caching is probably also much less important: it’s relatively unlikely the user has the dependency in cache, and users who do would already have the skills to manage the environment manually. The main tasks are to resolve dependencies, grab wheels, figure out the right directory, and do the unpacking. Those could all be part of a developer toolchain. (And the last part is, to my understanding, already covered by installer.)

  1. That is, outside of cases like IPython/Jupyter/Spyder where the project is itself a Python development tool. ↩︎

  2. Although actually, the Manim stack will try to pull in some huge AI-related stuff depending on exactly what you’re trying to do. I’ve been trying to build a much more lightweight approach to video rendering. But that’s another story… ↩︎
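To sketch the runtime-extras pattern I mean: before offering to fetch anything, the app can check which optional modules are missing. The feature map and module names below are made up purely for illustration:

```python
# Sketch of the runtime-extras pattern: figure out which optional modules
# are missing before offering to fetch them. The feature map and module
# names are hypothetical, for illustration only.
import importlib.util

OPTIONAL_FEATURES = {"video": ["numpy", "PIL"]}  # hypothetical extras


def missing_for(feature: str) -> list[str]:
    """Return the optional modules for `feature` that aren't importable."""
    return [mod for mod in OPTIONAL_FEATURES[feature]
            if importlib.util.find_spec(mod) is None]


print(missing_for("video"))
```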


Side note…

There is pip-api. I have not used it myself. I do not know if it would help/work in the use case(s) discussed here. And I assume that all warnings against using pip from your own Python code likely still apply. Just thought I would mention it.

It’s neat that that exists, but I don’t think it’s what I’m looking for. It appears that fundamentally it’s just a wrapper for the subprocess logic, except that it uses os.environ.get("PIPAPI_PYTHON_LOCATION", sys.executable) as the Python to run (i.e. the default of using the current Python can be overridden).

Although that probably does still solve problems for a fair number of users.

I’m not actually sure I still feel that way, to be honest. IMO it’s legitimate for code that expects to be able to run python -m pip as a subprocess (using sys.executable as the Python interpreter) to declare pip as a runtime dependency. Pip does have an API - it’s the CLI. What I said was it has no Python API, i.e., there are no supported functions importable or callable from the user’s Python code. But that’s a pretty fine distinction, and I didn’t want to derail the discussion by expanding on that point.

So, to put all this in context, nowadays, pipx could happily add a dependency on pip and/or uv, then use the command line interface for whichever tool the user prefers, including the --python flag, to install the requested application in its private venv. And I’m pretty sure there’s work going on in that direction.