As there is another metadata file format, metadata.json (PEP 426 -- Metadata for Python Software Packages 2.0 | Python.org), how would this PEP handle it?
PEP 426 has been withdrawn.
Tzu-ping's proposal is flexible. We can easily add additional files later, e.g. x-42-py4-none-any.whl.metadata now and x-42-py4-none-any.whl.metadata.json or x-42-py4-none-any.whl.metadata.yml later.
I think I meant to write "Since tools generally only need dependency information" (the "to" is redundant). Thanks for catching this!
Yes, that's the idea. (Donald came up with that.) I also included dist-info in the attribute name to avoid possible confusion in the future if we ever have another file named METADATA that's not in the .dist-info directory.
How about something like this
x-42-py4-none-any.whl
x-42-py4-none-any.dist-info/METADATA
x-42-py4-none-any.dist-info is a naming scheme not used anywhere else, so I wonder how people feel. Personally I like it though; I'll propose this instead if others are fine with it.
I imagine that this will only work for wheels, not sdists,
and cutting off the file extension will make it more ambiguous
without context.
Agreed. I suggest we keep it simple, and just say that the metadata for file xxxxx is at xxxxx.METADATA. The PEP is solely about exposing the metadata, so let's not over-generalise the solution, and by just appending a suffix to the filename we're sure that we can support metadata for any file, with the only limitation being that a file can only have one set of metadata (which is true by definition).
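For illustration, the "append a suffix" rule could be sketched like this. The `.METADATA` suffix is just the spelling floated in this post, not a settled name, and `metadata_url` and the example.org URL are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Suffix as floated in this thread; an assumption, not a settled name.
METADATA_SUFFIX = ".METADATA"


def metadata_url(file_url: str) -> str:
    """Return the URL where the metadata for file_url would be served."""
    parts = urlsplit(file_url)
    # Append the suffix to the path only, and drop any fragment
    # (for example a "#sha256=..." hash of the distribution file itself).
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path + METADATA_SUFFIX, parts.query, "")
    )


print(metadata_url("https://example.org/files/x-42-py4-none-any.whl#sha256=abc"))
print(metadata_url("https://example.org/files/x-42.tar.gz"))
```

Because nothing is cut off the filename, the same rule works for wheels and sdists alike.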
sdists have a different naming scheme than wheels so there should be no ambiguity.
Anyway, this was just a proposal, reacting to the doubts about the naming schemes expressed before. I've no strong opinion on this.
One additional small suggestion/thought that you should feel free to completely ignore:
It seems like there are two approaches to go down in terms of naming.
One is to try to match the name as closely as possible to the name inside of the archive file. This pushes the handling of naming collisions onto the standards that define the files, e.g. what does METADATA mean. Unfortunately that also means that we might have to tweak things if, say, we do a wheel 2.0 that made METADATA a JSON file.
The other one is to just define our own filenames, and not try to match the in-file naming scheme. If we go this route, it might be useful to include some extra information though. For instance, if we did foo.whl.metadata, and we upgraded to a new metadata version that wasn't compatible, what would we do? We could do .json or .yml if we used JSON or YAML, but what if we used the same format? Would it make sense to do something like foo.whl.metadata.v1 to denote it's v1 style metadata?
Alternatively we could just punt on it, call it foo.whl.metadata, and say we'll figure out the best name if we ever need a second name.
Maybe we should include some of the content in WHEEL in the tag as well. For example:
<a href="...."
data-dist-format="Wheel-Version: 1.0"
data-dist-info-metadata="sha256:0123456789abcdef">
x-42-py4-none-any.whl
</a>
The content of data-dist-format can only be Wheel-Version: 1.0 for now (same as the Wheel-Version line in the wheel's WHEEL file), and we'll designate a value for distribution formats that provide static metadata in the future.
Any thoughts on this? I think I'll add an attribute to indicate distribution format (maybe not the exact format above but something like wheel:1.0) to the PEP.
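To make the proposal concrete, here is a rough sketch of how a client might read these attributes off a simple API page using only the standard library. The attribute names and the `sha256:<digest>` value format are the ones floated in this post; `AnchorCollector`, `collect_anchors`, `split_hash`, and the example.org URL are made up for illustration.

```python
from html.parser import HTMLParser

# Sample page; the href is a hypothetical placeholder.
PAGE = """
<a href="https://example.org/x-42-py4-none-any.whl"
   data-dist-format="Wheel-Version: 1.0"
   data-dist-info-metadata="sha256:0123456789abcdef">
x-42-py4-none-any.whl
</a>
"""


class AnchorCollector(HTMLParser):
    """Collect (attributes, link text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._attrs = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._attrs = dict(attrs)
            self._text = []

    def handle_data(self, data):
        if self._attrs is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._attrs is not None:
            self.anchors.append((self._attrs, "".join(self._text).strip()))
            self._attrs = None


def collect_anchors(page):
    parser = AnchorCollector()
    parser.feed(page)
    parser.close()
    return parser.anchors


def split_hash(value):
    """Split "sha256:0123..." into (algorithm, hex digest)."""
    algorithm, _, digest = value.partition(":")
    return algorithm, digest


for attrs, filename in collect_anchors(PAGE):
    print(filename, attrs.get("data-dist-format"),
          split_hash(attrs["data-dist-info-metadata"]))
```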
What's the use case? As a general principle, I'm -1 on bloating APIs "just in case" something might be useful.
I know simple API pages mostly aren't that big, but I just checked out of curiosity, and there's one (pyagrum-nightly) that's 6M in size, which isn't exactly trivial. As that's 17296 links, all of which seem to be wheels, we'd be adding quite a lot of extra data. Obviously that's an extreme outlier, and we'd already be adding metadata links for every one of these, so it's already going to add a lot of extra content, so maybe we really don't care that much. But still, what's the gain?
I think it's for future compatibility, in case in the future we change the format of metadata (not adding fields etc., but e.g. use JSON instead). This can be handled in the distribution by bumping the version in the WHEEL file, but can't be handled with the current proposal.
With that said, it's also OK to not have that field now, and if we ever need it, define the absence of data-dist-format as the initial format version. So say if we're ever to have a wheel 2.0, the tag will need to say data-dist-format="wheel:2.0", but a lack of
I'm not convinced including the metadata version is helpful - what am I (as a resolver tool) supposed to do with it? Reject a package entirely? As soon as I decide to get the metadata, I'm going to find out the format/version, and I can't think of anything useful to do any earlier.
This seems like an odd requirement:
The metadata served must be completely static, i.e. identical to the METADATA file in the .dist-info directory [dist-info] if the distribution is installed. The repository can provide this for any distributions, but it is expected they will only provide them for wheels [wheel] at the current time, since an sdist [sdist] does not yet have a way to promise the metadata will stay the same after it is built.
The METADATA file in a wheel is necessarily static, by the definition of the wheel format installation protocol (since anything installing wheels is supposed to just copy over the metadata). Is this intending to explicitly rule out serving PEP 643 metadata files? If so, why? I would expect that a PEP 643 metadata file for an sdist would, on average, be much more useful than nothing. Even metadata files with Dynamic dependencies can be useful for things like pre-warming a cache when traversing a dependency graph.
Can we simply say that the metadata file must contain the same metadata that the relevant file contains? Alternatively, we can say that sdists must be core metadata >= 2.2.
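As a rough illustration of what a tool could still do with such a file, here is a sketch that parses core metadata with the standard library's email parser and reports whether the dependencies are marked Dynamic. `summarize_metadata` and the sample text are hypothetical; the only assumption about the format is that core metadata uses email-style headers and that Dynamic lists field names, as in PEP 643.

```python
from email.parser import HeaderParser

# Hypothetical PKG-INFO content for an sdist using core metadata 2.2.
SAMPLE = """\
Metadata-Version: 2.2
Name: x
Version: 42
Dynamic: Requires-Dist
"""


def summarize_metadata(text):
    """Pull out the fields a resolver would mostly care about."""
    msg = HeaderParser().parsestr(text)
    # Field names are compared case-insensitively.
    dynamic = {name.strip().lower() for name in msg.get_all("Dynamic", [])}
    return {
        "metadata_version": msg["Metadata-Version"],
        "requires_dist": msg.get_all("Requires-Dist", []),
        "dependencies_are_dynamic": "requires-dist" in dynamic,
    }


print(summarize_metadata(SAMPLE))
```

A resolver could treat `dependencies_are_dynamic` as "this list is a hint only" and still use the rest of the file, e.g. for cache warming.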
It's not the metadata version, but the distribution format version. This determines how a tool can actually make sense of the bytes sent by the server that's supposed to be the metadata file.
But if tool authors are having trouble understanding its use, that's a strong sign it's not only not (yet) needed, but also won't be correctly used. Since that attribute can be retrospectively defined anyway (as mentioned above), I'll leave it out.
I was actually trying to rule out PEP 621 so people don't get the wrong idea and start exposing pyproject.toml (which is not distribution metadata, but many people including tool authors confuse them). You're right, PEP 621 should be allowed. Any suggestions how I can improve the wording to include the right things (and only them)?
How about:
The metadata served must be specified in the Core Metadata Specification format. Metadata must only be served for standards-compliant build artifacts that expose their metadata in a canonical location (i.e. PKG-INFO for sdists and {distribution}-{version}.dist-info/METADATA for wheels). The data served must be identical to the data found in the built artifact's canonical location.
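A minimal sketch of what "identical to the data found in the canonical location" could mean for a wheel, using only `zipfile`. The helper names are made up, and the filename parsing assumes a well-formed wheel filename of the form {distribution}-{version}-...whl.

```python
import zipfile


def wheel_metadata_path(wheel_filename):
    """Return the canonical METADATA location inside a wheel."""
    if not wheel_filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {wheel_filename!r}")
    stem = wheel_filename[: -len(".whl")]
    # The first two dash-separated components are distribution and version.
    distribution, version = stem.split("-")[:2]
    return f"{distribution}-{version}.dist-info/METADATA"


def read_wheel_metadata(wheel_file, wheel_filename):
    """Read the raw METADATA bytes from a wheel (a path or file-like object)."""
    with zipfile.ZipFile(wheel_file) as zf:
        return zf.read(wheel_metadata_path(wheel_filename))


def served_metadata_matches(served, wheel_file, wheel_filename):
    """Check that the served bytes are byte-for-byte the wheel's METADATA."""
    return served == read_wheel_metadata(wheel_file, wheel_filename)
```

The comparison is deliberately on raw bytes, so the served file needs no interpretation beyond what the wheel itself already requires.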
Possibly you can track down canonical links for where it says where the canonical locations are. Possibly this is the one for sdists, though it also says pyproject.toml is required, which I didn't think was the case, so I dunno.
That's the correct link. It technically only covers new style sdists as defined in PEP 517. That's because the older sadist format was never standardised and we didn't attempt to retroactively standardise it. The metadata in older sdists is pretty much useless anyway (there's another thread here about that but I'm on mobile right now so I can't find the link).
Very good typo here
I have submitted an edit to the Rationale for this. The change should be reflected in the rendered PEP when someone reads this, but in case it's not, here's the PR: PEP 658: Rationale edits by uranusjr · Pull Request #1972 · python/peps · GitHub