I’m using pip 22.2.2 on Python 3.10 with a local package mirror. However, while the local mirror is used as the index, downloads are still attempted from files.pythonhosted.org, PyPI’s global file host.
pip correctly uses the specified mirror as a package index:
Looking in indexes: http://pypi.repo.test.hhu.de/simple
1 location(s) to search for versions of pandas:
* http://pypi.repo.test.hhu.de/simple/pandas/
Fetching project page and analyzing links: http://pypi.repo.test.hhu.de/simple/pandas/
Getting page http://pypi.repo.test.hhu.de/simple/pandas/
Found index url http://pypi.repo.test.hhu.de/simple
Looking up "http://pypi.repo.test.hhu.de/simple/pandas/" in the cache
Request header has "max_age" as 0, cache bypassed
Starting new HTTP connection (1): pypi.repo.test.hhu.de:80
http://pypi.repo.test.hhu.de:80 "GET /simple/pandas/ HTTP/1.1" 304 0
Fetched page http://pypi.repo.test.hhu.de/simple/pandas/ as application/vnd.pypi.simple.v1+json
but afterwards, all reported links point to pythonhosted:
Found link https://files.pythonhosted.org/packages/b4/8e/057ebd80a3b6dcda154dd6878744fc5549832a484e72bc4189b8d782be75/pandas-0.1.tar.gz (from http://pypi.repo.test.hhu.de/simple/pandas/), version: 0.1
[...]
How is that possible? The package index http://pypi.repo.test.hhu.de/simple/pandas/ does not contain a single reference to pythonhosted, as confirmed by running this snippet:
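The snippet itself didn’t survive here, but the check boils down to something along these lines (a hypothetical reconstruction; the mirror URL is only reachable inside our network):

```python
from urllib.request import urlopen

# Internal mirror, not publicly reachable.
INDEX_URL = "http://pypi.repo.test.hhu.de/simple/pandas/"

def mentions_pythonhosted(page_text: str) -> bool:
    """True if the index page links out to files.pythonhosted.org."""
    return "pythonhosted" in page_text

# Inside the university network one would run:
#   body = urlopen(INDEX_URL).read().decode("utf-8")
#   print(mentions_pythonhosted(body))
```

For our mirror this check comes back negative, which is exactly why the pythonhosted links reported by pip were so puzzling.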
You’ll need to provide a publicly available index for people to reproduce your issue. pip has many levels of caching, and there are too many variables for anyone to provide a concrete explanation based only on the information you provided.
Hi, thanks a lot for the fast reply. Unfortunately, I can’t provide a publicly available index, as it is firewalled for local university use only. Fortunately, we were able to fix it internally.
Our pip mirror is a local nginx instance with URL rewriting enabled. Apparently, since PoC of PEP 691 · pypa/pip@6f167b5 · GitHub, pip requests the index with a different MIME type (the JSON simple API, application/vnd.pypi.simple.v1+json). However, nginx by default only applies URL rewriting (sub_filter) to text/html content. This made it difficult to debug: the responses looked perfectly fine when fetched via curl, presumably because curl doesn’t send pip’s Accept header and therefore received the rewritten text/html variant, but the JSON responses fetched by pip passed through unmodified. Adding sub_filter_types '*'; enables URL rewriting for all MIME types, and therefore fixed the issue. From the nginx manual:
Syntax: sub_filter_types mime-type …;
Default: sub_filter_types text/html;
Context: http, server, location
Enables string replacement in responses with the specified MIME types in addition to “text/html”. The special value “*” matches any MIME type (0.8.29).
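For anyone hitting the same problem, the fix amounts to a location block roughly like the following (a sketch, not our actual config; the upstream and hostnames are placeholders):

```nginx
location /simple/ {
    proxy_pass https://pypi.org/simple/;

    # Rewrite absolute file URLs so downloads go through the local mirror.
    sub_filter 'https://files.pythonhosted.org/' 'http://pypi.repo.test.hhu.de/';
    sub_filter_once off;

    # Crucial part: by default sub_filter only touches text/html, so pip's
    # application/vnd.pypi.simple.v1+json responses passed through unmodified.
    sub_filter_types '*';

    # sub_filter cannot rewrite compressed upstream responses.
    proxy_set_header Accept-Encoding "";
}
```

Note the Accept-Encoding line: sub_filter operates on the response body, so the upstream must be asked for uncompressed content for the rewriting to work at all.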
Additionally, the caching mechanisms of pip made this harder to track down, as we initially didn’t notice that the issue was already fixed. So if you have similar problems, try cleaning your cache with pip cache purge, or pip cache remove '*' (quote the pattern so your shell doesn’t expand it).
Still, thanks a lot for the fast response. Maybe this helps other people with a similar issue in the future.
This is indeed related, thanks for linking it! I didn’t find that post with my search keywords for this error, so it’s helpful to have it linked here now!
You also may want to keep an eye out for possible cases of this issue: it caught us by surprise doing similar URL rewriting with Apache, because the JSON responses are all on one line of text and can be many megabytes in length for some projects, so we needed custom tuning to continue being able to rewrite the file URLs in those situations.
Thanks a lot! I’ll keep an eye on that, especially if further issues arise. At the moment my tests were successful, so let’s hope it continues to work as-is.
If you still aren’t able to figure this out, consider running pip with -vvv and posting the output of that run. That’ll contain sufficient information to help point out what exactly is happening.
However, -vvv didn’t help much, as the output I posted above looked fine. We used tcpdump to inspect the requests sent by pip and noticed that the request headers differed from curl’s, and therefore the responses did as well. Afterwards we found the corresponding git change, and the helpful hints kindly provided in this thread. Thanks for all the support, and keep up the great work!
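For completeness, the header difference we saw on the wire can be illustrated like this (a sketch; pip’s real Accept header lists several content types, of which only the PEP 691 JSON one is shown here):

```python
from urllib.request import Request

URL = "http://pypi.repo.test.hhu.de/simple/pandas/"  # internal mirror

# A plain curl-style request sends no specific Accept header, so the server
# answers with text/html -- which nginx's sub_filter rewrites by default.
curl_like = Request(URL)

# Since PEP 691, pip asks for the JSON variant of the simple index -- which
# nginx left untouched until sub_filter_types was widened.
pip_like = Request(URL, headers={
    "Accept": "application/vnd.pypi.simple.v1+json",
})

print("curl-like Accept:", curl_like.get_header("Accept"))
print("pip-like  Accept:", pip_like.get_header("Accept"))
```

Comparing these two requests (e.g. in tcpdump) is what finally revealed why curl and pip saw different response bodies from the same URL.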