ahem
I build ad hoc tools like this a lot. But if you want the really big potential consumer of this data for me, pip is probably the one to look at. What pip most needs from an sdist is name and version (which we get from the filename) and dependencies. I imagine a bunch of dependency data will be deferred to wheel build time, but we could get a lot of benefit from reliable dependency metadata in sdists.
When resolving an install request, pip downloads distributions from the package index to get dependency information. That's costly, and it's a major performance hit for pip's new resolver. For wheels, all we have to do is download and extract the metadata. For sdists, we have to download, unpack, set up an isolated build environment, and call the PEP 517 metadata hook. If we had reliable PEP 621 metadata in the sdist, we could check for non-dynamic dependency data, and if it's there, bypass all of that. (And to be clear, the build cost isn't something that's going to get paid anyway: it's quite possible we will discard an sdist because it doesn't lead to a valid result.)
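To make the "check for non-dynamic dependency data" step concrete, here's a rough sketch. This is not pip's code: `static_dependencies` and `load_pyproject` are names I've made up for illustration, and it assumes the `pyproject.toml` inside the sdist is reliable, which is exactly the guarantee being asked for. It also assumes Python 3.11+ for `tomllib`.

```python
from typing import Any, Dict, List, Optional


def static_dependencies(pyproject: Dict[str, Any]) -> Optional[List[str]]:
    """Return the dependency list if it is statically declared, else None.

    None means "unknown": the caller has to fall back to the expensive
    path (isolated build environment plus the PEP 517 metadata hook).
    """
    project = pyproject.get("project")
    if project is None:
        # No PEP 621 [project] table at all, so nothing static to read.
        return None
    if "dependencies" in project.get("dynamic", []):
        # The backend has opted in to calculating dependencies later.
        return None
    # Assumption: since `dynamic` is opt-in, a missing key that is not
    # listed there is read as "no dependencies".
    return list(project.get("dependencies", []))


def load_pyproject(path: str) -> Dict[str, Any]:
    """Parse a pyproject.toml that has been extracted from an sdist."""
    import tomllib  # standard library from Python 3.11 onwards

    with open(path, "rb") as f:
        return tomllib.load(f)


# Hypothetical parsed tables standing in for real projects.
static = {"project": {"name": "demo", "dependencies": ["requests>=2"]}}
deferred = {"project": {"name": "demo", "dynamic": ["dependencies"]}}

print(static_dependencies(static))
print(static_dependencies(deferred))
```

A resolver would only take the slow build path when the function returns `None`; any list, including an empty one, can be used directly.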
So I'm very much arguing from real-world requirements that consumers are important. And while I'd just as happily take a "standard sdist metadata" PEP, I'm frankly sick of getting bogged down in debate on that. The advantages of PEP 621 here are:
- It has a mechanism for saying "to be calculated later".
- It defaults to static data - `dynamic` is explicitly opt-in.
- There's no debate over the name of the flipping file, it's `pyproject.toml`.
But if backends know the data, and don't put it anywhere because we're waiting for the mythical "standard sdist format", I can't use it in pip or anywhere else unless the user migrates their project to PEP 621. And waiting for users to adopt the new standard is probably just as slow as waiting for sdist standardisation.
So yes, in its new form, PEP 621 offers a significant benefit for pip. Probably other consumers as well, but I can't speak for them. There's a cost for backends in that they have to update `pyproject.toml`, but honestly I don't think that's a big chore.
I apologise if I assumed incorrectly that people realised the above. I'm pretty certain pip's use case was presented during the initial PEP 621 discussions, but got dismissed for the same "we're not standardising sdists" reasons. That's one of the reasons I became less interested in PEP 621 - it was clearly only being targeted at being a common input format for backends. I was never involved as a backend developer, but as a consumer developer (IIRC, @pradyunsg and @uranusjr were in that category too¹). Fair enough, but we've now essentially failed to make it a common input format, at least to the extent that Poetry have said they are unlikely to adopt it for some time. So I was left wondering what's left. When I realised that we could possibly bring back the benefits for pip, I suggested that to @brettcannon and he was willing to give that a go.
But without being a cross-backend format, and without being usable for tools that want to introspect source distributions, I'm not sure there's enough left in PEP 621 to warrant standardising (as opposed to just the backends that want to have a common format getting together and agreeing one).
¹ And wasn't @di involved for the Warehouse side? Surely Warehouse would be another case that would benefit from being able to read `pyproject.toml` for metadata known to be fixed across all distribution files for a project?