Pip search is still broken

nilamo · September 1, 2022, 8:49pm

…and the error says it’ll be deprecated in the future? That’s obviously bad, why is it just being turned off instead of fixed? The need to find packages whose exact name you don’t know won’t go away just because the ability to search for them has been taken away, lol.

Is there any discussion about this anywhere? Is this a problem I can help fix?
I find it extremely hard to believe that doing a full text search is an impossible task.

Furthermore, the error says to check https://status.python.org for more info… but pip and/or package searching is nowhere on the status page, and there’s no indication at all that anything is other than 100% operational. Which is also extremely bad.

ketozhang · September 1, 2022, 9:03pm

Turning the question around, what can’t you get by using an internet search engine that you can with pip search?

pf_moore · September 1, 2022, 9:11pm

The PyPI search API was disabled in late 2020. This article gives some background. I’m surprised you’ve only just noticed…

pip search was built on the PyPI XML-RPC search API, so it no longer works because that API has been disabled. We should probably just remove the pip search command, to save confusion.
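
For anyone curious what that XML-RPC call looked like, here is a minimal standard-library sketch. The query shape (`name`/`summary` combined with `"or"`) is an assumption based on how pip's search command is generally described, and the fault text is a placeholder. Nothing here contacts PyPI; it only encodes and decodes the messages locally.

```python
import xmlrpc.client

# Assumed request shape: match the query against name and summary, joined with "or".
query = "foobar"
request_xml = xmlrpc.client.dumps(
    ({"name": query, "summary": query}, "or"),
    methodname="search",
)

# Decode the request again to show what a server would have received.
params, method = xmlrpc.client.loads(request_xml)

# A disabled endpoint answers with an XML-RPC fault. The client library
# raises xmlrpc.client.Fault when it decodes such a response.
fault_xml = xmlrpc.client.dumps(
    xmlrpc.client.Fault(-32500, "search is disabled (placeholder text)")
)
try:
    xmlrpc.client.loads(fault_xml)
    fault_code = None
except xmlrpc.client.Fault as exc:
    fault_code = exc.faultCode
```

The fault code is what surfaces in pip's "XMLRPC request failed" error line.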

nilamo · September 1, 2022, 9:12pm

Because if I’m trying to install something, and apt search doesn’t have any results, the next step is naturally to check pip search.

But that doesn’t work, which is annoying.
So I check https://python.org, but there’s no search. There is at least a link to PyPI from python.org, but if you don’t know that pip is pulling from PyPI, that link is not helpful.

nilamo · September 1, 2022, 9:13pm

I didn’t just notice. I did however realize that it’s still not functional, and still hasn’t been fixed.

If the XML-RPC API is no longer available, why not… use… a different API…? Obviously searching packages is working directly from pypi.org, why can’t pip use the same functionality?

pf_moore · September 1, 2022, 9:27pm

There is no such API.

Because PyPI (Warehouse) doesn’t expose it for 3rd party use. Which is almost certainly because the last time they did, it got abused to the point where it was causing service issues.

I’ve updated the issue on the pip tracker here to propose that we finally deprecate and remove pip search, so that we no longer mislead people like yourself who think that it might work… (Please understand “why not just fix it” is not a helpful comment, and will not be received positively - we have considered the options and there really is no viable way to just make pip search start working again).

I appreciate your frustration here, but honestly, you should just go to the PyPI website and use the search box there. It may not be ideal for your workflow, but it works, which is more than can be said for blaming the pip or PyPI maintainers…

nilamo · September 1, 2022, 9:35pm

I didn’t realize I was giving the impression that I was blaming someone. I did, in fact, ask if there was some way I could help.

Is there a way to mark this thread as solved?

pf_moore · September 1, 2022, 9:43pm

Not a problem, it’s just that a lot of people over time have asked “why not just…” in one form or another, which gets a bit wearying. My comment was more in the form of a pre-emptive strike against anyone who reads this thread, then goes to the linked issue and starts complaining. That’s one reason I think we should probably just remove pip search at this point.

No need to mark the thread as solved, if we just leave it here it’ll be fine.

gst · September 2, 2022, 5:39pm

Why not just give pip search a better, “human friendly” error message (instead of “command unknown or soon deprecated”, or the current XMLRPC code -32500 error that’s given)?

$ pip search foobar
error: pip no longer has a search feature.
You can check/search for your foobar:
+ here
+ or eventually here
blabla

?

saaketp · September 2, 2022, 7:00pm

That’s what was suggested in the pip issue as well https://github.com/pypa/pip/issues/5216#issuecomment-1235329876
and there is a PR to rephrase it: “Rephrase the search XMLRPC disable error” by pradyunsg (Pull Request #12173, pypi/warehouse on GitHub).

mwichmann · September 2, 2022, 10:45pm

I realize I’m not adding anything helpful here, but it will absolutely be a surprise for newcomers with any kind of experience with other ecosystems that virtually all familiar “package managers” have some solution for searching via a command line interface, and Python does not. Consider apt/yum/dnf/zypper/pacman/(others-I-left-out) for Linux environments, choco search for Windows users interested in Chocolatey, plus gosearch, npm search, gem search, ppm search and…

TagWolf · October 15, 2022, 4:05am

Here are some viable solutions, from a cybersecurity engineer with decades of experience. One or more of the following could work:

Fastest Solution:

Add a limiter on the server side that only allows N searches (especially larger ones) within M minutes, and then locks out that specific IP for whatever time period is required to keep the servers healthy. This can be done quickly, with no client updates, using server-side solutions only, such as Apache modules and iptables firewall rules.
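
To make the idea concrete, here is a minimal in-memory sketch of a per-client sliding-window limiter. It is not how PyPI or any Apache module does it; the class name, its parameters, and the injectable clock are all hypothetical, and the lockout period mentioned above is left out to keep it short.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self._hits = defaultdict(deque)  # client id -> recent request times

    def allow(self, client_id):
        """Return True and record the request if the client is under its limit."""
        now = self.clock()
        hits = self._hits[client_id]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A server would call something like `limiter.allow(client_ip)` before running a search and reject the request when it returns False. A real deployment would also need shared state across server processes, which this sketch ignores.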

Longer Term Solutions:

Change the default behavior of pip to cache search results by default and only grab new package lists every 24 hours unless overridden
By default, limit searches to X rate unless a user has some other identifier (such as an api key / login) so abusive clients can be disabled
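
The client-side caching idea could look roughly like the sketch below. This is not pip code: `TTLCache`, `get_or_fetch`, and the `force` override are hypothetical names, and the cache lives in memory only, whereas a real client would persist it to disk.

```python
import time

ONE_DAY = 24 * 60 * 60  # seconds


class TTLCache:
    """Remember fetched values and reuse them until they are ttl_seconds old."""

    def __init__(self, ttl_seconds=ONE_DAY, clock=time.time):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested
        self._entries = {}  # key -> (time stored, value)

    def get_or_fetch(self, key, fetch, force=False):
        """Return the cached value for key, calling fetch(key) only when needed."""
        now = self.clock()
        entry = self._entries.get(key)
        if not force and entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]
        value = fetch(key)
        self._entries[key] = (now, value)
        return value
```

Here `fetch` stands in for whatever actually talks to the server, and `force=True` plays the role of the "unless overridden" switch.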

mwichmann · October 17, 2022, 6:00pm

Local caching is the approach of well-known Linux distribution package managers (dnf, apt, pacman, etc.). Don’t know if there are reasons why that is less viable in the Python world - possibly a lesser acceptance of great globs of data being stored locally?

fungi · October 17, 2022, 6:48pm

My Debian/Sid workstation has over 200MiB of LZ4-compressed package
metadata. The current package count in Sid is around 165K entries.

Depending on what you want to search on PyPI, at best its project
count is more than double that right now, and the count of
releases/files pip has to take into account is one or two orders of
magnitude higher.

Granted, the type and volume of metadata used by apt and pip are not
the same, but it should be fairly apparent that there is a
significant difference in package quantity and churn between a (very
large) Linux distro and PyPI.

mwichmann · October 17, 2022, 7:14pm

Perhaps an opt-in mechanism would be more appealing - for the Debian world analogy, think apt-file, which is not default, and has to have the database populated by request, but then does provide for a number of very useful queries.

domdfcoding · October 20, 2022, 2:18pm

The local caching approach was my motivation for domdfcoding/pypi_search (on GitHub): metadata for a client-side search of PyPI, updated every 30 minutes with new releases’ names and summaries. The tool is modelled after apt-cache, which only searches (the Debian equivalent of) that metadata. The total size is about 35MB, but since it’s a git repository you only have to download the delta each time you refresh your local version.
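
As a rough illustration of what a client-side search over names and summaries involves, here is a minimal sketch. It does not reflect the actual data format or code of pypi_search; the `search_index` function and the sample entries are made up for the example.

```python
def search_index(index, query):
    """Return (name, summary) pairs where every query word appears in name or summary."""
    words = query.lower().split()
    results = []
    for name, summary in index.items():
        haystack = f"{name} {summary}".lower()
        if all(word in haystack for word in words):
            results.append((name, summary))
    # Names containing the whole query sort first, then alphabetically.
    results.sort(
        key=lambda item: (query.lower() not in item[0].lower(), item[0].lower())
    )
    return results


# Made-up sample data standing in for a locally refreshed metadata file.
SAMPLE_INDEX = {
    "http-toolbox": "Helpers for building HTTP servers",
    "example-yaml": "YAML parser and emitter",
    "example-http-client": "A small HTTP client",
}

matches = search_index(SAMPLE_INDEX, "http")
```

A plain substring scan like this is linear in the number of projects, which is usually tolerable for a name-and-summary index held locally.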

gargolito · October 21, 2022, 1:56pm

I wrote this a while back when it was first turned off. I was just learning Python at the time so the code is fugly, but it works: gargolito/search-pypi on GitHub (search pypi.org).

itsdotscience · December 22, 2022, 8:33am

Or, while it’s not the greatest, this brings basic capability back:

pip install pip_search (note that _ )

then pip_search

and for some cheap integration for *sh

pip() {
    # Route "pip search ..." to pip_search; pass everything else to the real pip.
    if [ "$1" = "search" ]; then
        shift
        pip_search "$@"
    else
        command pip "$@"
    fi
}

and for the Windows folks (oh yeah, blast from the past here, hah… I had to look up the if syntax, it’s been… longer than I care to admit knowing this, lol)

doskey pip= IF $1 == search ( pip_search $2 $3 $4 $5 $6 $7 $8 $9 ) ELSE ( pip $* )
(quick edit to fix additional parameters if passed)

The only things above I can take credit for are knowing about “pip_search” and noticing the lack of a Windows equivalent of the alias for cmd.exe, hence that doskey bit.


eryksun · December 22, 2022, 12:01pm

Of course it’s not MS-DOS DOSKEY blasting in from the 1980s. It’s “doskey.exe” for the Windows NT console, blasting in from the 1990s. Each application that’s attached to a console session has a command history buffer, plus a set of input aliases that match at the beginning of an entered line of text. Attached applications (e.g. “cmd.exe” or “python.exe”) have nothing to do with this. For example, when an application calls ReadConsoleW(), the console host has already applied any matching alias such as “pip”.

“doskey.exe” provides command-line access to the command history and input alias functions in the console API. The API for input aliases is documented:

But most of the command history API is undocumented:

SetConsoleHistoryInfo
GetConsoleHistoryInfo
SetConsoleNumberOfCommandsW
GetConsoleCommandHistoryLengthW
GetConsoleCommandHistoryW
ExpungeConsoleCommandHistoryW

pradyunsg · December 27, 2022, 3:48pm

At this point, the message that pip search provides is:

❯ pip search random-text-here
ERROR: XMLRPC request failed [code: -32500]
RuntimeError: PyPI no longer supports 'pip search' (or XML-RPC search). Please use https://pypi.org/search (via a browser) instead. See https://warehouse.pypa.io/api-reference/xml-rpc.html#deprecated-methods for more information.

That should resolve most of the concerns raised in OP, and clearly explains what does not work.