PEP 833: Freezing the HTML simple repository API

Pre-PEP thread: Pre-PEP: What would it look like to deprecate PEP 503?

PEP draft: PEP 833 – Freezing the HTML simple repository API | peps.python.org

Summary of rationale and motivation

The use of an HTML representation for Python package indices predates efforts to standardize Python packaging. Consequently, the HTML representation standardized with PEP 503 represents a formalization of existing practices (particularly those of PyPI), rather than a design.

The HTML representation serves the Python packaging ecosystem admirably, but is also subject to a handful of technical and social limitations (elaborated in the PEP) that result in it being (1) cumbersome to add features to in a backwards-compatible and performant manner, and (2) de facto frozen outside of PyPI itself.

Consequently, the PEP proposes “freezing” the HTML representation, i.e. explicitly discouraging the addition of new features to the HTML representation in future Simple Repository API PEPs. This does not deprecate the HTML representation, and the PEP does not discourage installers or indices from continuing to use it.

Summary of proposed changes

The HTML representation of the simple repository API is frozen for the purposes of Python packaging standards processes. Future Python packaging PEPs SHOULD NOT modify the HTML representation of the simple repository API, and MUST instead modify the JSON representation.


As always, I look forward to the community’s feedback on this proposal 🙂

CC @dstufft as sponsor/delegate

+1 from me!

-1 from me: I would like to see the optional upload-time field allowed in the HTML serialization (currently the specifications say it is JSON only).

The field is critical for lots of client use cases. HTML-only mirrors could then pick it up from PyPI, and other tools that generate HTML could choose to implement it, or not.
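
To make the client use case concrete, here is a minimal sketch of filtering files by upload-time as exposed in the JSON representation. The sample response is hand-written and trimmed (real responses also carry url, hashes, and so on), the project and file names are invented, and the helper names are hypothetical. Skipping files that lack the field is a policy choice for this sketch, not something the specification requires.

```python
from datetime import datetime, timezone

# Hand-written, trimmed sample shaped like a JSON project response.
# The project and file names are invented for illustration.
SAMPLE_RESPONSE = {
    "meta": {"api-version": "1.1"},
    "name": "example-project",
    "files": [
        {
            "filename": "example_project-1.0-py3-none-any.whl",
            "upload-time": "2023-01-15T12:00:00.000000Z",
        },
        {
            "filename": "example_project-2.0-py3-none-any.whl",
            "upload-time": "2024-06-01T08:30:00.000000Z",
        },
        {
            # An index that does not provide the optional field.
            "filename": "example_project-3.0-py3-none-any.whl",
        },
    ],
}


def parse_upload_time(value: str) -> datetime:
    # datetime.fromisoformat() only accepts a trailing "Z" on Python 3.11+,
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def files_uploaded_before(response: dict, cutoff: datetime) -> list[str]:
    """Return filenames whose upload-time is strictly before `cutoff`."""
    selected = []
    for file in response["files"]:
        raw = file.get("upload-time")
        if raw is None:
            # Assumption of this sketch: files without the field are skipped.
            continue
        if parse_upload_time(raw) < cutoff:
            selected.append(file["filename"])
    return selected


cutoff = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(files_uploaded_before(SAMPLE_RESPONSE, cutoff))
```

A client talking to an HTML-only index has no equivalent field to read today, which is the gap described above.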

After that I would be happy to see the HTML specification frozen, and even eventually discouraged.

I don’t feel strongly one way or another on the future of HTML or exposing upload-time in that HTML, but are you aware of any mirrors that actually work by mirroring the actual HTML that PyPI serves?

My expectation is that pretty much all mirrors are implemented by generating their own HTML, particularly since PyPI uses absolute URLs to the files, so if you want your mirror to mirror the files as well, you need to fix those URLs somehow. It’s probably easier to just generate the HTML than to try and take PyPI’s HTML and mutate it to point to your own URLs.

You’d also have to make sure that you’re mirroring the metadata files (which won’t happen with any sort of generic website mirroring tool), or else make sure that you’re dropping the data-core-metadata attribute.
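
As a rough illustration of why generating the page is the easier route, here is a minimal sketch of a mirror rendering its own project page from a list of file records. The record keys ("filename", "sha256", "core_metadata"), the function name, and the mirror.example URL layout are all assumptions made for this example. It emits data-core-metadata only when the mirror says it also hosts the metadata files.

```python
from html import escape
from urllib.parse import quote


def render_project_page(project, files, mirror_base, has_metadata=False):
    """Render a simple-API-style HTML page with links into the mirror.

    `files` is a list of dicts using hypothetical keys:
    "filename", "sha256" (optional), "core_metadata" (optional bool).
    """
    base = mirror_base.rstrip("/")
    lines = ["<!DOCTYPE html>", "<html>", "  <body>"]
    for file in files:
        filename = file["filename"]
        # Point at the mirror's own copy rather than the upstream URL.
        href = f"{base}/{quote(project)}/{quote(filename)}"
        sha256 = file.get("sha256")
        if sha256:
            href += f"#sha256={sha256}"
        attrs = [f'href="{escape(href, quote=True)}"']
        # Only advertise metadata if the mirror actually serves it.
        if has_metadata and file.get("core_metadata"):
            attrs.append('data-core-metadata="true"')
        lines.append(f"    <a {' '.join(attrs)}>{escape(filename)}</a><br>")
    lines += ["  </body>", "</html>"]
    return "\n".join(lines)


files = [
    {
        "filename": "example_project-1.0-py3-none-any.whl",
        "sha256": "0" * 64,  # placeholder digest, not a real hash
        "core_metadata": True,
    }
]
page = render_project_page("example-project", files, "https://mirror.example/files/")
print(page)
```

With has_metadata left at its default, the attribute is simply never written, so there is nothing to strip out afterwards.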

Regardless of whether anyone mirrors PyPI by mirroring the actual HTML, mirrors that generate their own HTML could choose to implement upload-time if it were added, so I don’t think it affects your position much one way or the other. I’m mostly just curious because it’s come up a couple of times and I haven’t actually seen anyone do that in a long time (though it was the case a decade ago).

I guess one question I’d have is whether we know of any non-PyPI indexes that have expressed a willingness to implement upload-time? One of the assertions that PEP 833 makes is that uptake of new features among non-PyPI indexes is limited or non-existent. If that’s true, are they going to implement upload-time, or is it going to be yet another feature that gets ignored? If it’s not true and there are non-PyPI indexes that are adopting new features regularly, then that is useful information to know for determining whether 833 is a good idea or not.

I mentioned it in the earlier thread, but to answer my own question a bit: I suspect there will be a larger motivation to implement upload-time, given there are user-facing features that require that data, whereas the other features are not required for any user-facing feature.

That being said, I have no idea if larger is still effectively zero or not!

I’d also say that adding upload-time to the HTML representation is such a trivially easy addition that, if someone felt motivated to write such a PEP, I’d personally have no problem with it. I expect it’d get approved unless there is a hidden contingent of people who refrained from posting in that other thread but are adamant that the HTML representation shouldn’t get even one more feature.

I also wouldn’t be opposed to a PEP that made a last change to the HTML representation for upload-time. But I agree with Donald’s analysis that doing so is unlikely to move the needle on adoption, given that third-party indices generally need to rebuild the representation anyway, and we haven’t seen any evidence of them doing that for other pieces of metadata.