PEP 658: Static Distribution Metadata in the Simple Repository API

Maybe we should include some of the content in WHEEL in the tag as well. For example:

<a href="...."
    data-dist-format="Wheel-Version: 1.0"
    data-dist-info-metadata="sha256:0123456789abcdef">
  x-42-py4-none-any.whl
</a>

The content of data-dist-format can only be Wheel-Version: 1.0 for now (same as the Wheel-Version line in the wheel’s WHEEL file), and we’ll designate a value for distribution formats that provide static metadata in the future.
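For anyone trying to picture the client side, here is a minimal sketch of reading the proposed attribute off a link using only the standard library. The `href` value, the class name `MetadataLinkParser`, and the helper `split_hash` are hypothetical, and the `sha256:` colon separator simply follows the example above rather than any final wording.

```python
from html.parser import HTMLParser


class MetadataLinkParser(HTMLParser):
    """Collect (href, data-dist-info-metadata) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # A missing attribute (None) means no metadata file is advertised for this link.
        self.links.append(
            (attr_map.get("href"), attr_map.get("data-dist-info-metadata"))
        )


def split_hash(value):
    """Split 'sha256:abc' into ('sha256', 'abc'); returns None if there is no hash."""
    if not value or ":" not in value:
        return None
    name, _, digest = value.partition(":")
    return name, digest


# Hypothetical page fragment modelled on the example in this post.
page = '''<a href="x-42-py4-none-any.whl"
    data-dist-info-metadata="sha256:0123456789abcdef">
  x-42-py4-none-any.whl
</a>'''

parser = MetadataLinkParser()
parser.feed(page)
for href, value in parser.links:
    print(href, split_hash(value))
```

A link without the attribute simply yields `None` for the hash, so a client can fall back to downloading the distribution itself.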

Any thoughts on this? I think I’ll add an attribute to indicate distribution format (maybe not the exact format above but something like wheel:1.0) to the PEP.

What’s the use case? As a general principle, I’m -1 on bloating APIs “just in case” something might be useful.

I know simple API pages mostly aren’t that big, but I checked out of curiosity, and there’s one (pyagrum-nightly) that’s 6 MB in size, which isn’t exactly trivial. It has 17296 links, all of which seem to be wheels, so we’d be adding quite a lot of extra data. Obviously that’s an extreme outlier, and we’d already be adding metadata links for every one of these, so maybe we really don’t care that much. But still, what’s the gain?

I think it’s for future compatibility, in case we later change the format of the metadata (not by adding fields etc., but e.g. by using JSON instead). This can be handled in the distribution by bumping the version in the WHEEL file, but can’t be handled with the current proposal.

With that said, it’s also OK to not have that field now, and if we ever need it, define the absence of data-dist-format as the initial format version. So say if we’re ever to have a wheel 2.0, the tag will need to say data-dist-format="wheel:2.0", but a lack of

I’m not convinced including the metadata version is helpful - what am I (as a resolver tool) supposed to do with it? Reject a package entirely? As soon as I decide to get the metadata, I’m going to find out the format/version, and I can’t think of anything useful to do any earlier.

This seems like an odd requirement:

The metadata served must be completely static, i.e. identical to the METADATA file in the .dist-info directory [dist-info] if the distribution is installed. The repository can provide this for any distributions, but it is expected they will only provide them for wheels [wheel] at the current time, since an sdist [sdist] does not yet have a way to promise the metadata will stay the same after it is built.

The METADATA file in a wheel is necessarily static, by the definition of the wheel format installation protocol (since anything installing wheels is supposed to just copy over the metadata). Is this intending to explicitly rule out serving PEP 643 metadata files? If so, why? I would expect that a PEP 643 metadata file for an sdist would, on average, be much more useful than nothing. Even metadata files with Dynamic dependencies can be useful for things like pre-warming a cache when traversing a dependency graph.

Can we simply say that the metadata file must contain the same metadata that the relevant file contains? Alternatively, we can say that sdists must be core metadata >= 2.2.
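To make the "core metadata >= 2.2" idea concrete, here is a rough sketch of how a tool might decide whether an sdist's dependency list can be trusted. It assumes the metadata is an RFC 822 style document, which the standard library's `email.parser` can read; the function names and the sample document are made up for illustration.

```python
from email.parser import HeaderParser


def dynamic_fields(metadata_text):
    """Return (metadata_version, lowercased Dynamic field names)."""
    msg = HeaderParser().parsestr(metadata_text)
    version = tuple(int(part) for part in msg["Metadata-Version"].split("."))
    dynamic = [name.lower() for name in msg.get_all("Dynamic", [])]
    return version, dynamic


def sdist_dependencies_are_static(metadata_text):
    """True only if the sdist metadata promises that Requires-Dist will not change."""
    version, dynamic = dynamic_fields(metadata_text)
    # Before metadata 2.2 there is no Dynamic field, so nothing is promised.
    if version < (2, 2):
        return False
    return "requires-dist" not in dynamic


# Hypothetical PKG-INFO content for an sdist with dynamic dependencies.
SAMPLE = """\
Metadata-Version: 2.2
Name: example
Version: 1.0
Dynamic: Requires-Dist
Requires-Python: >=3.8
"""

print(dynamic_fields(SAMPLE))
print(sdist_dependencies_are_static(SAMPLE))
```

Even when this returns `False`, the other fields (such as `Requires-Python` here) are still static and usable, which is the cache pre-warming case mentioned above.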

It’s not the metadata version, but the distribution format version. This determines how a tool can actually make sense of the bytes sent by the server that’s supposed to be the metadata file.

But if tool authors are having trouble understanding its use, that’s a strong sign it’s not only not (yet) needed, but also won’t be correctly used. Since that attribute can be retrospectively defined anyway (as mentioned above), I’ll leave it out :slightly_smiling_face:

I was actually trying to rule out PEP 621 so people don’t get the wrong idea and start exposing pyproject.toml (which is not distribution metadata, but many people including tool authors confuse them). You’re right, PEP 643 should be allowed. Any suggestions on how I can improve the wording to include the right things (and only them)?

How about:

The metadata served must be specified in the Core Metadata Specification format. Metadata must only be served for standards-compliant build artifacts that expose their metadata in a canonical location (i.e. PKG-INFO for sdists and {distribution}-{version}.dist-info/METADATA for wheels). The data served must be identical to the data found in the built artifact’s canonical location.
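As a rough illustration of the "identical to the data found in the canonical location" rule for wheels, an index (or a test suite) could compare the served bytes against the file inside the archive. This is only a sketch: the function names are hypothetical, the wheel is built in memory for the demo, and it assumes the distribution name and version are already in the form used for the `.dist-info` directory.

```python
import io
import zipfile


def wheel_metadata_path(distribution, version):
    """Canonical location of the metadata file inside a wheel."""
    return f"{distribution}-{version}.dist-info/METADATA"


def served_metadata_matches(wheel_bytes, distribution, version, served_bytes):
    """Check that the served metadata is byte-for-byte the wheel's METADATA."""
    with zipfile.ZipFile(io.BytesIO(wheel_bytes)) as wheel:
        in_wheel = wheel.read(wheel_metadata_path(distribution, version))
    return in_wheel == served_bytes


# Build a tiny stand-in wheel in memory (not a complete, installable wheel).
metadata = b"Metadata-Version: 2.1\nName: x\nVersion: 42\n"
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as wheel:
    wheel.writestr(wheel_metadata_path("x", "42"), metadata)
wheel_bytes = buffer.getvalue()

print(served_metadata_matches(wheel_bytes, "x", "42", metadata))
```

The same comparison would apply to an sdist by reading PKG-INFO from the tarball instead.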

Possibly you can track down canonical links for where it says where the canonical locations are. Possibly this is the one for sdists, though it also says pyproject.toml is required, which I didn’t think was the case, so I dunno.

That’s the correct link. It technically only covers new style sdists as defined in PEP 517. That’s because the older sadist format was never standardised and we didn’t attempt to retroactively standardise it. The metadata in older sdists is pretty much useless anyway (there’s another thread here about that but I’m on mobile right now so I can’t find the link).

Very good typo here :rofl:

I have submitted an edit to the Rationale for this. The change should be reflected in the rendered PEP by the time someone reads this, but in case it isn’t, here’s the PR: PEP 658: Rationale edits by uranusjr · Pull Request #1972 · python/peps · GitHub

It’s been more than two weeks since the previous discussion. If there are no more outstanding issues, may I ask for a pronouncement from the PEP delegate @dstufft?

I’ll review the discussion and re-read the PEP and get a pronouncement this weekend.

Friendly ping

Any update on this?

Any update on this? This PEP, if implemented, would be very helpful for Pyodide. Pyodide has to reinstall wheels much more often than native environments, and it also has more constraints on which versions of packages it can support (primarily due to dependencies on C extensions that require extra porting work), so it is important that we can do reasonably efficient dependency resolution. But because PyPI doesn’t put CORS headers on range requests (Allow cross origin range requests · Issue #11 · pypa/conveyor · GitHub), we don’t even have the LazyWheel option available to us.

Friendly ping @dstufft

Sorry, this slipped off my radar, will get it looked at by EOW.

I’m happy to accept this PEP! Congratulations :slight_smile:

I will mention there’s one bit of wording that you might want to clean up. It doesn’t really affect the contents of the PEP, but the Abstract still references data-dist-info-metadata as the pointer for the location of the metadata file. The rest of the PEP has it correct, so this is just a missed reference that should ideally be cleaned up. But either way, the PEP looks good to me.

:tada:

I’ve filed PEP 658: Mark as Accepted by pradyunsg · Pull Request #2049 · python/peps · GitHub to mark this PEP as accepted. ^>^
