Provide distribution metadata as 'data' attributes on links in simple index

Where does the data-requires-python information shown in the attributes of links to distributions on PyPI (PEP 503) come from? Does warehouse read it from the metadata inside the distribution file, or is it delivered by the client on upload? From a quick look, it seems to come from the client (here in twine).
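For reference, here is a minimal sketch of how a client could read that attribute from a project page using only the standard library. The sample page, its URLs, and the names `SimpleIndexParser` and `parse_links` are made up for illustration; this is not how pip does it.

```python
from html.parser import HTMLParser


class SimpleIndexParser(HTMLParser):
    """Collect the links of a PEP 503 project page with their data-requires-python."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current = None  # link currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attributes = dict(attrs)
            self._current = {
                "href": attributes.get("href"),
                # Attribute values arrive already unescaped (&gt; becomes >).
                "requires_python": attributes.get("data-requires-python"),
                "text": "",
            }

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self.links.append(self._current)
            self._current = None


def parse_links(html):
    """Return one dict per <a> element found in the page."""
    parser = SimpleIndexParser()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up page in the shape described by PEP 503 (not real PyPI output).
SAMPLE_PAGE = """
<!DOCTYPE html>
<html><body>
<a href="https://files.example.invalid/example-1.0-py3-none-any.whl#sha256=abc"
   data-requires-python="&gt;=3.6">example-1.0-py3-none-any.whl</a><br/>
<a href="https://files.example.invalid/example-0.9.tar.gz#sha256=def">example-0.9.tar.gz</a><br/>
</body></html>
"""

if __name__ == "__main__":
    for link in parse_links(SAMPLE_PAGE):
        print(link["text"], link["requires_python"])
```

A link without the attribute simply yields `None`, which a client would treat as "no Python version constraint declared".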

In a similar way, could all the metadata relevant to dependency resolution (name, version, dependency requirements, Python requirements, is there more?) be published in similar data attributes (maybe Base64 encoded, maybe all merged into a single attribute)? The goal would be that, in the best case, dependency resolution could be done entirely without downloading any distribution, only by parsing the HTML pages.
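To make the idea concrete, here is a rough sketch of the "everything merged into one Base64-encoded attribute" variant. The attribute name `data-dist-metadata`, the JSON layout, and the helper names are all hypothetical; nothing like this exists in PEP 503 or warehouse today.

```python
import base64
import json
from html import escape

# Hypothetical attribute name, invented for this sketch only.
HYPOTHETICAL_ATTR = "data-dist-metadata"


def encode_metadata(metadata):
    """Serialize metadata to compact JSON, then to URL-safe Base64 text."""
    raw = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    return base64.urlsafe_b64encode(raw.encode("utf-8")).decode("ascii")


def decode_metadata(value):
    """Inverse of encode_metadata: Base64 text back to a dict."""
    raw = base64.urlsafe_b64decode(value.encode("ascii"))
    return json.loads(raw.decode("utf-8"))


def render_link(href, filename, metadata):
    """Build an <a> element carrying the encoded metadata (index side)."""
    return '<a href="{}" {}="{}">{}</a>'.format(
        escape(href, quote=True),
        HYPOTHETICAL_ATTR,
        encode_metadata(metadata),
        escape(filename),
    )


# Illustrative values only; the keys are my guess at what a resolver needs.
EXAMPLE_METADATA = {
    "name": "example",
    "version": "1.0",
    "requires_python": ">=3.6",
    "requires_dist": ["requests>=2.0", 'colorama; sys_platform == "win32"'],
}

if __name__ == "__main__":
    print(render_link("example-1.0-py3-none-any.whl",
                      "example-1.0-py3-none-any.whl",
                      EXAMPLE_METADATA))
```

Base64 avoids HTML-escaping problems with markers and quotes in the requirement strings, at the cost of roughly a third more bytes per link than the raw JSON.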

For wheels, this metadata is (as far as I can tell) static. Work is currently being done to specify a way to declare this metadata in a static way (where possible) for source distributions as well.

Tools such as pip could cache this information (the HTML pages) and potentially go as far as doing most of the dependency resolution offline.

Does it seem feasible, reasonable, and meaningful?


You might be interested in https://github.com/pypa/warehouse/issues/8254


Thanks for the link! This is indeed interesting to me, since it would help reach the same goal: faster dependency resolution.

My suggestion here is obviously much more lightweight (hackish, less robust?), and I won’t try to defend it too hard (especially with regard to the whole TUF topic, which I have only read about vaguely), but I will still mention a couple of advantages:

  • 1 HTTP request per project should give all the info (at the cost of larger pages)
  • most of the work is already done in warehouse, in PEP 503, and in pip; it just needs to be extended from 1 attribute (data-requires-python) to a few more (name, version, dependencies, extras, platform)
  • no need to read the distributions server-side, since twine already seems to provide the info (is it reliable?), though I assume warehouse reads the distributions anyway to build the project pages

Is there a link to the discussion that led to the decision to add data-requires-python to PEP 503 (but not the other metadata)?
