Potential inconsistency w/ PEP 503 (Simple Repo API)

PEP 503 says, “The text of the anchor tag MUST be the normalized name of the project and the href attribute MUST link to the URL for that particular project.” But if you look at the simple index for PyPI, you will notice that e.g. the project 0.0.1 is listed as <a href="/simple/0-0-1/">0.0.1</a>. The href is normalized, but the text of the anchor is not (later on the PEP says the project name in the URL needs to be normalized).
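
For reference, the normalization PEP 503 describes can be sketched in a few lines of Python. The regex is the one given in the PEP; the function name `normalize` is only for illustration.

```python
import re


def normalize(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    # Collapse runs of "-", "_", and "." into a single "-", then lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()


print(normalize("0.0.1"))
```

Under this rule `0.0.1` becomes `0-0-1`, which matches the href PyPI serves but not the anchor text.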

So if I’m reading the PEP correctly, who is right: PyPI or the PEP? My assumption is it’s PyPI, as normalizing the project name in the anchor text would remove the ability to search on a project’s full name using only data from the simple repo API.

P.S. If the PEP is wrong, then when I submit a PR to fix it I will also clarify what the root URL is in relation to the base URL, as I personally found that a bit confusing.

I had a similar problem a while ago about a project page’s dist anchor tag text and value. @dstufft said at the time that the PEP was written to mimic pip’s behaviour, so I’d assume in this case whatever PyPI does is “correct” and the PEP should be updated to match that.

Note though the situation here is not as clear since pip does not use this page at all (so there’s not a client implementation to reference), and I’m not sure what the old pypi.python.org API did—and have no intention to dig into the code to find out.

OK, I’ll submit a PR to update the PEP then.

… for now. :wink:

But in all seriousness, isn’t this used by --index-url/--extra-index-url in pip? Are you just saying pip doesn’t use pypi.org/simple/ by default?

That’s correct, pretty much nothing uses /simple/ as far as I’m aware. It’s huge and also doesn’t get purged regularly so it’s usually out of date anyways.

I use /simple when I’m scraping PyPI for something and feeling lazy :slight_smile:

Wouldn’t be opposed to an even simpler /simple for that purpose, though. Like a flat list of available package names.
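
In the meantime, a flat list can be pulled out of the existing page with the standard library alone. This is a rough sketch, assuming the index is a page of plain `<a>` tags as in the example earlier in the thread; `AnchorTextCollector`, `project_names` and `SAMPLE` are made-up names, and the sample HTML stands in for a real download of /simple/.

```python
from html.parser import HTMLParser


class AnchorTextCollector(HTMLParser):
    """Collect the text of every <a> element in an HTML document."""

    def __init__(self):
        super().__init__()
        self._in_anchor = False
        self._parts = []
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_anchor = True
            self._parts = []

    def handle_data(self, data):
        # Only keep text that appears between <a> and </a>.
        if self._in_anchor:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_anchor:
            self.names.append("".join(self._parts).strip())
            self._in_anchor = False


def project_names(index_html):
    """Return the anchor texts of a simple-index page, in document order."""
    parser = AnchorTextCollector()
    parser.feed(index_html)
    parser.close()
    return parser.names


# Hypothetical stand-in for the body of the simple index.
SAMPLE = """<!DOCTYPE html>
<html><body>
<a href="/simple/0-0-1/">0.0.1</a>
<a href="/simple/requests/">requests</a>
</body></html>"""

print("\n".join(project_names(SAMPLE)))
```

Note that this returns the names as displayed in the anchor text, so whether they are normalized depends on what the index actually serves.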

I might be changing that. :slight_smile:

Exactly! Same for me and other people I know who have done analysis of PyPI, as I don’t think the JSON API has a way to get a list of all projects.

Or a more complex /simple.json with latest release version, python-requires and basic metadata (author, maintainers, dates). Updated once daily is plenty.

(I feel like most of the time I’m looking at uploaded files to see which platforms are supported, but that’s way too much information to dump into this file. Last release date lets me do my own caching though.)
