PEP 691: JSON-based Simple API for Python Package Indexes

brettcannon · June 15, 2022, 9:33pm

There are two potential use cases for all the data from this PEP. One is an installer that knows what package it wants and needs to find the appropriate file. The other is browsing functionality in, e.g., an editor that wants to provide an interface in front of the package server for users to peruse.

If all you care about is the former case, then you may want a list of normalized package names for the index to simply verify the package is there, and no URL since the PEP tells you how to construct it. That suggests (list, normalized, no).
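To make the installer case concrete, here is a minimal sketch of how a client could get by with only a list of normalized names. The normalization rule is the one from PEP 503; the helper names `project_url` and `is_on_index` and the sample index root are made up for illustration and are not part of any PEP.

```python
import re


def normalize(name: str) -> str:
    """Normalize a project name using the PEP 503 rule."""
    return re.sub(r"[-_.]+", "-", name).lower()


def project_url(index_root: str, name: str) -> str:
    """Hypothetical helper: build <root>/<normalized-name>/ for a project."""
    return f"{index_root.rstrip('/')}/{normalize(name)}/"


def is_on_index(name: str, normalized_names: set) -> bool:
    """Hypothetical helper: check a name against a (list, normalized, no) index."""
    return normalize(name) in normalized_names


if __name__ == "__main__":
    # Illustrative data only; a real client would fetch this from the index.
    index_names = {"oslo-concurrency", "django"}
    print(is_on_index("oslo.concurrency", index_names))
    print(project_url("https://pypi.org/simple/", "oslo.concurrency"))
```

Since the URL falls out of the normalized name, the index listing does not have to carry it.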

But if you’re providing a browser, you want that display name (which makes providing it a SHOULD and not a MUST sort of thing). Now, you still don’t need the URL since the PEP specifies how the URL to the package’s details must be constructed. So for me the question is whether the index should contain both the display name and the canonical name, and to me that comes down to what will burn less energy: the bandwidth to ship the canonical name over the internet on every fetch, or every browser calculating it on a case-by-case basis. I don’t have a good feel for this, since PyPI’s traffic is so massive that the bandwidth cost may actually outweigh the cost of calculating the canonical name whenever someone happens to pick a package name from some list of things. This suggests either (list, non-normalized, no) or (map, both, no).
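As a rough way to see the bandwidth side of that trade-off, the sketch below builds the three candidate shapes for a handful of names and compares their serialized sizes. It assumes the three-part shorthand means (container type, name form, whether a URL is included); the JSON shapes, the function name `candidate_payloads`, and the sample names are illustrative guesses, not what the PEP specifies.

```python
import json
import re


def normalize(name: str) -> str:
    """Normalize a project name using the PEP 503 rule."""
    return re.sub(r"[-_.]+", "-", name).lower()


def candidate_payloads(display_names):
    """Hypothetical helper: build the three index shapes discussed above."""
    return {
        # (list, normalized, no)
        "list_normalized": sorted(normalize(n) for n in display_names),
        # (list, non-normalized, no): clients normalize on demand
        "list_display": sorted(display_names),
        # (map, both, no): canonical name -> display name
        "map_both": {normalize(n): n for n in display_names},
    }


if __name__ == "__main__":
    # Illustrative sample; real numbers depend on the actual index contents.
    sample = ["oslo.concurrency", "Django", "zope.interface"]
    for label, payload in candidate_payloads(sample).items():
        print(label, len(json.dumps(payload)), "bytes")
```

The map carries every name twice, so it costs roughly double the bytes of either list; what it buys is that no client ever has to run the normalization itself.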

But I personally don’t love the discrepancy between display names and canonical names to begin with. As such, I would still be quite happy with (list, normalized, no) for the index to save on the bandwidth and processing since you can never rely on a display name for anything. Besides, people can figure out that oslo.concurrency and oslo-concurrency are the same thing pretty easily. But I also don’t know if other people agree with me on this.