Use cases for search functionality in PyPI

Hello all,

I’ve been following and working on the search portion of PyPI for some time and noticed it’s a part of Warehouse that generates some frustration among users.

Some time ago I opened a meta issue to try to establish some use cases and narrow down what it is that users expect from searching PyPI. I thought it would be good to open up this discussion to a wider audience.

Here are the use cases I distilled from the different issues in Warehouse:

  1. Project name searches: users that have a vague recollection of the name of a package or want to make sure of the spelling before installing. I believe this is the main use case for pip search (#5506).
  2. Solution searches: users that would like to know what the best package is for a particular task. This could be covered by a “popularity” metric; however, it’s hard to get right, because a lot of aggregated community wisdom is simply not reflected in project metadata, since names, classifiers and descriptions are sometimes lacking or misleading. (#3932, #3860)
  3. Meta searches: users that would like to explore the project ecosystem based on project metadata like interpreter versions, license, contributors, etc. (#727, #1971)
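
To make use cases 1 and 3 a bit more concrete, here is a minimal sketch of what they could look like over a tiny in-memory index. Everything in it is an assumption for illustration: `PROJECTS`, its metadata values, `search_names` and `filter_by_metadata` are hypothetical and are not how Warehouse implements search. The only parts taken from real sources are the name normalization rule from PEP 503 and the standard library's `difflib`.

```python
import difflib
import re

# Hypothetical index with illustrative metadata values (not real PyPI data).
PROJECTS = {
    "scikit-learn": {"license": "BSD-3-Clause"},
    "requests": {"license": "Apache-2.0"},
    "grpcio": {"license": "Apache-2.0"},
    "marshmallow": {"license": "MIT"},
}


def normalize(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def search_names(query, names, limit=5, cutoff=0.6):
    """Use case 1: substring matches first, then close (fuzzy) matches."""
    by_normalized = {normalize(n): n for n in names}
    q = normalize(query)
    hits = [orig for norm, orig in by_normalized.items() if q in norm]
    close = difflib.get_close_matches(q, list(by_normalized), n=limit, cutoff=cutoff)
    for norm in close:
        if by_normalized[norm] not in hits:
            hits.append(by_normalized[norm])
    return hits[:limit]


def filter_by_metadata(projects, **criteria):
    """Use case 3: keep projects whose metadata equals every given criterion."""
    return sorted(
        name
        for name, meta in projects.items()
        if all(meta.get(key) == value for key, value in criteria.items())
    )


if __name__ == "__main__":
    print(search_names("Scikit_Learn", PROJECTS))   # spelling check
    print(search_names("reqests", PROJECTS))        # vague recollection
    print(filter_by_metadata(PROJECTS, license="Apache-2.0"))
```

Use case 2 is deliberately left out of the sketch, since ranking by "best package" depends on signals that the metadata above does not carry.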

The goal would be to define these (or more) use cases and a set of requirements for each, so that we can start creating issues in Warehouse.

Thanks in advance.

Personally, I only ever really want project name searches, and typically I either know the full project name, and just want to go to the project page, or I know a partial name and want to confirm the full actual name.

However, it’s possible that I’m “too close to the problem”, and because I know the limitations of the current search feature I’ve simply never tried it for anything more complex 🙂 But I do tend to head straight to Google for broader searches, e.g. “python library for extracting images from pdf”, mainly because if there isn’t a library, it will often give you useful references anyway.

Functionality search: users that are looking for a specific, precise functionality. For example “parsing ISO datetimes” or “RTMP protocol decoder” or “SAT solver”.

I think it’s worth mentioning that a lot of users assume packages with names similar or equal to a particular technology or service are somehow reserved for the “best package” for it.

This, unfortunately, is not necessarily the case anymore. One very flagrant case is aws, while others are more subtle. Recently there was a PEP 541 case for grpc versus grpcio, which was resolved but unfortunately not updated.

User @MiloslavPojman replied on Twitter with:

  1. Looking for available names for a new library.
  2. Checking correct spelling (e.g. sklearn vs. scikit-learn)

Regarding use case 2, I really like the “ecosystem” section of the wikis in Marshmallow projects. It allows users to see a list of publicly available functionality, and developers can update the list themselves.

Unfortunately, I can’t think of a way to integrate this intuitively with PyPI’s search itself.