Where is my cached Pip wheel coming from?

kknechtel · January 24, 2024, 10:12am

I was trying to investigate some ways to automate venv creation and updating, when I noticed something odd on my system. If I create a new venv:

$ python -m venv example_venv
$ example_venv/bin/python -m pip install --upgrade pip
Collecting pip
  Using cached pip-23.3.2-py3-none-any.whl (2.1 MB)
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 20.0.2
    Uninstalling pip-20.0.2:
      Successfully uninstalled pip-20.0.2
Successfully installed pip-23.3.2

Wait, cached? But there doesn’t appear to be anything like that in the cache:

$ example_venv/bin/python -m pip cache list | grep pip

shows nothing. I know that my system Python uses a hacked ensurepip and that it’s installing wheels from /usr/share/python-wheels, which gets copied to share/python-wheels within the venv. But that only contains the original Pip wheel, not the one for the upgrade:

$ ls example_venv/share/python-wheels/ | grep pip
pip-20.0.2-py2.py3-none-any.whl
$ ls /usr/share/python-wheels | grep pip
pip-20.0.2-py2.py3-none-any.whl

Where/how else does Pip cache wheels? How am I avoiding a fresh download for each new venv?

barry-scott · January 24, 2024, 3:44pm

You could brute-force search for the .whl: find /usr /var /home -name 'pip*.whl'?

kknechtel · January 24, 2024, 9:32pm

I couldn’t find it anywhere that way (and a bunch of variations). The closest I found was wheels for other versions in other virtual environments (especially ones that have virtualenv), and the .dist-info folder for the installed upgraded Pip in the new virtual environment as well as in a built-from-source 3.11 (where I sudo upgraded it as part of my experiments - the system Python is 3.8, and I’m not touching it until it’s time to upgrade the OS entirely).

barry-scott · January 24, 2024, 9:35pm

Use strace to see what files pip accesses?

kknechtel · January 24, 2024, 10:06pm

It looks like it’s copying the wheel out of an extension-less file in the http sub-directory of the cache. That file appears to embed the wheel, prefixed with its length, along with some HTTP metadata.
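A wheel is an ordinary zip archive, so one rough way to recover it from a blob like that is to carve out the bytes between the first zip local-file header and the end-of-central-directory record. The following is only a sketch under that assumption: `carve_zip` is a made-up helper name, and it assumes the cache entry stores the wheel bytes contiguously and uncompressed, as described above.

```python
import io
import struct
import zipfile

LOCAL_HEADER = b"PK\x03\x04"  # signature that starts each zip local file header
END_RECORD = b"PK\x05\x06"    # signature of the end-of-central-directory record
END_RECORD_SIZE = 22          # fixed part of that record, before any comment


def carve_zip(blob: bytes) -> bytes:
    """Return the zip archive embedded in blob (hypothetical helper)."""
    start = blob.find(LOCAL_HEADER)
    end = blob.rfind(END_RECORD)
    if start < 0 or end < start or end + END_RECORD_SIZE > len(blob):
        raise ValueError("no embedded zip archive found")
    # The last two bytes of the fixed record hold the trailing comment length.
    (comment_len,) = struct.unpack_from("<H", blob, end + 20)
    archive = blob[start:end + END_RECORD_SIZE + comment_len]
    # Opening the result confirms that the central directory is readable.
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        zf.namelist()
    return archive
```

The returned bytes would need to be written to a file carrying the wheel's proper filename (for example the name shown in the "Using cached" line) before Pip will accept it as a wheel.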

The cache directory also contains an http-v2 directory, documented like this:

Changed in version 23.3: A new cache format is now used, stored in a directory called http-v2 (see below for this directory’s location). Previously this cache was stored in a directory called http in the main cache directory. If you have completely switched to newer versions of pip, you may wish to delete the old directory.

I assume that when I got the 23.3.2 wheel the first time, it wasn’t from an environment running 23.3 or later, so it was put in the old cache.

pip cache list, even on 23.3.2, apparently only shows wheels from the wheels subfolder of the cache (which are stored as actual .whl files), not from http or http-v2. On the other hand, it seems that the HTTP cache (either version; http-v2 apparently just separates out the HTTP metadata from the actual downloaded files) might contain any kind of file previously downloaded from PyPI, including sdists, READMEs, JSON-formatted listings of available versions, etc. It’s not clear how Pip knows which wheels are available in the cache, but obviously it does. (Based on the subfolder names, I assume this is using some kind of database library.)

My understanding from the documentation is that wheels only contains locally-built wheels (i.e., from sdists retrieved from PyPI), not wheels that were directly provided by PyPI; and those are only in http/http-v2 instead. But I would argue that pip cache list ought to list the downloaded wheels as well. After all, is it not the purpose of the command to indicate what is cached and can thus be installed without a download?
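Until pip cache list covers them, downloaded wheels can be looked for by hand. The sketch below walks the http and http-v2 subfolders of a cache directory (the path that pip cache dir prints) and reports every file in which Python's zipfile module can find a `.dist-info/WHEEL` member. `find_cached_wheels` is a hypothetical name, and the approach assumes zipfile tolerates the extra bytes surrounding the archive in the older format, which it generally does for zip data with prepended content but is not guaranteed for every cache entry.

```python
import os
import zipfile


def find_cached_wheels(cache_root):
    """Yield (path, dist_info_dir) for cache files that appear to hold a wheel."""
    for subdir in ("http", "http-v2"):
        top = os.path.join(cache_root, subdir)
        # os.walk yields nothing if the subfolder does not exist.
        for dirpath, _dirnames, filenames in os.walk(top):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    with zipfile.ZipFile(path) as zf:
                        members = zf.namelist()
                except (zipfile.BadZipFile, OSError, ValueError):
                    continue  # JSON listings, tarballs, metadata files, ...
                for member in members:
                    # Every wheel carries a <name>-<version>.dist-info/WHEEL file.
                    if member.endswith(".dist-info/WHEEL"):
                        yield path, member.split("/")[0]
                        break
```

Calling `list(find_cached_wheels(cache_dir))` gives pairs of cache file path and `.dist-info` directory name, which is enough to tell which project and version each blob belongs to.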

jeff5 · January 26, 2024, 7:43am

Shouldn’t you have to activate your venv before installing things?

I have noticed myself that installing things I used before into a new venv often mentions a cache. This seems very reasonable behaviour to me.

On my machine (Windows), it is C:\Users\Jeff\AppData\Local\pip\cache\wheels.

kknechtel · January 26, 2024, 7:50am

“Activating a venv” just manipulates some environment variables so that the prompt changes, PATH changes and a deactivate command exists. Installing to the venv can be done just as easily by running the venv’s Pip using the venv’s Python, which in turn can be done by explicitly specifying the path to that Python and using -m pip instead of running Pip directly. That’s how I’ve done it in the example.
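The same idea expressed from a script, as a small sketch: build the command from the venv's own interpreter path and `-m pip`, with no activation step. The helper names `venv_python` and `pip_command` are made up for illustration, and the standard venv layout (bin/ on POSIX, Scripts\ on Windows) is assumed.

```python
import os


def venv_python(venv_dir):
    """Path to the interpreter inside venv_dir (standard venv layout assumed)."""
    if os.name == "nt":
        return os.path.join(venv_dir, "Scripts", "python.exe")
    return os.path.join(venv_dir, "bin", "python")


def pip_command(venv_dir, *pip_args):
    """Build an argv list that runs the venv's Pip without activating anything."""
    return [venv_python(venv_dir), "-m", "pip", *pip_args]


# The upgrade from the first post, ready for subprocess.run(cmd, check=True).
cmd = pip_command("example_venv", "install", "--upgrade", "pip")
```

Because the interpreter path alone selects the environment, nothing in PATH or the shell prompt needs to change for the install to land in the venv.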

I think you’ve missed the point of the thread. The cache directory apparently contains wheels (or at least, binary blobs that include wheels) in other sub-folders, and Pip knows how to use those wheels for installation; but pip cache list only shows what’s stored in the wheels subdirectory (i.e., wheels that were built locally, not downloaded).

jeff5 · January 26, 2024, 10:08am

I know, but I would not feel safe in treating what I did next as representative of behaviour within the venv until these were set. It is, however, all magic to me.

Possibly. But you seemed to expect the cache only to be in a sub-directory of the venv, while I find it to be in a user-specific place outside the venv, and (in pip cache list) to contain things I installed in other versions of Python, and a long time ago. YMMV.

barry-scott · January 26, 2024, 10:18am

I never activate venvs and they always work without surprises.

The only reason to activate, as far as I know, is to be able to type “python” or “pip” and have the venv versions run.

fungi · January 26, 2024, 2:26pm

> I know, but I would not feel safe in treating what I did next as
> representative of behaviour within the venv until these were set.
> It is, however, all magic to me.

I too use venvs extensively but never “activate” them. Most
Python-based tools I use for day-to-day work (and there are dozens)
are pip-installed into individual venvs and then I maintain a
symlink farm to their entrypoints from a ~/bin directory that’s
added to my default $PATH at login. It works just fine, really.

matanox · April 13, 2024, 9:47pm

Same case here: I have a package getting installed from cache, but it is nowhere to be found through the pip cache commands. How can I locate it? I would like to keep and share a copy, as the version I have is no longer available on PyPI and I can’t upgrade my codebase for the newer versions yet.

When I pip install it in a new venv, it says “Using cached …” with the wheel name and all, and seamlessly installs it in the new venv, but I can’t trace where it is actually being cached. It’s not anywhere under the location pip cache info shows me, not even when searching from one level up from ~/.cache/pip/wheels, where it points me. I can’t find it with or without a whl extension; the package name is truly a phantom.

Alternatively, is there any command which extracts the wheel to somewhere without installing it?
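One candidate is pip download, which saves distributions into a directory instead of installing them and goes through the same HTTP cache as pip install. Whether it succeeds for a release that has since been removed from the index is not certain, so treat this as something to try rather than a guaranteed fix. In the sketch below, `download_command` and `fetch_wheel` are made-up helper names, and the requirement string and destination directory are placeholders.

```python
import subprocess
import sys


def download_command(requirement, dest_dir, python=sys.executable):
    """Build an argv list for `pip download` that skips dependencies."""
    return [python, "-m", "pip", "download", "--no-deps",
            "--dest", dest_dir, requirement]


def fetch_wheel(requirement, dest_dir, python=sys.executable):
    """Run the download; raises CalledProcessError if Pip cannot find the file."""
    subprocess.run(download_command(requirement, dest_dir, python), check=True)
```

For example, `fetch_wheel("somepackage==1.2.3", "saved_wheels", python="example_venv/bin/python")` would ask the venv's Pip to place the matching file in saved_wheels, where it can be copied and shared.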

Upgrading to the latest pip doesn’t change anything in the above, nor does activating and deactivating the venv, I mean at least if you’re not one of those epic heroes who never activate their environments.

matanox · April 19, 2024, 7:10am

Any advice for extracting a wheel from the new caching structure?