We’ve always hit complicated debates when we try to standardise sdist metadata. I’m not 100% sure why, but I think some people get quite anxious about the possibility that backends could generate different metadata when building the wheel than they did when building the sdist. I honestly don’t know why that could happen, or why we can’t simply declare it to be something that backends are no longer allowed to do - but it does make standardising sdist metadata a potentially time-consuming process.
But I see no reason why we can’t standardise the filename - projects are already effectively required to freeze the name and version when building the sdist, so we’re not imposing anything new.
I wish we could just bless the current format as standard, but you make a good point that other sites like GitHub generate files with names in that format that aren’t sdists. Conversely, I’d somehow feel uncomfortable if sdists got a new extension. I know that’s silly, so I’m not going to argue too strongly, but how about this:
- sdist filenames MUST take the form NAME-VERSION.sdist.tar.gz. The “name” and “version” portions must be canonicalised the same way as wheels. Tools MUST assume that any file with extension .sdist.tar.gz is a sdist.
- The NAME and VERSION parts of the filename MUST match the distribution metadata - both the metadata in the sdist itself (when that gets standardised) and the metadata of any wheel built from that sdist. It is a backend error to create a wheel whose name and version don’t match the sdist filename.
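To make those two rules concrete, here is a rough sketch of how a tool might parse the proposed filename and check a built wheel against it. It assumes that "canonicalised the same way as wheels" means the escaping in the current binary distribution format spec (normalise the name, then replace "-" with "_"), and it compares versions as plain strings where a real tool would normalise them per PEP 440 first. The function names are made up for illustration.

```python
from __future__ import annotations

import re

# Extension proposed in this post; not an adopted standard.
SDIST_SUFFIX = ".sdist.tar.gz"


def escape_name(name: str) -> str:
    """Normalise a project name, then use "_" so "-" stays a field separator.

    Assumption: this mirrors how wheel filenames escape the name component.
    """
    return re.sub(r"[-_.]+", "-", name).lower().replace("-", "_")


def parse_sdist_filename(filename: str) -> tuple[str, str]:
    """Split NAME-VERSION.sdist.tar.gz into (name, version)."""
    if not filename.endswith(SDIST_SUFFIX):
        raise ValueError(f"not a proposed-format sdist filename: {filename!r}")
    stem = filename[: -len(SDIST_SUFFIX)]
    name, sep, version = stem.partition("-")
    # After canonicalisation neither part may contain "-", so exactly one
    # separator is expected.
    if not sep or not name or not version or "-" in version:
        raise ValueError(f"expected NAME-VERSION before the suffix: {filename!r}")
    return name, version


def wheel_matches_sdist(sdist_filename: str, wheel_name: str, wheel_version: str) -> bool:
    """Return True if a wheel's name and version agree with the sdist filename.

    Simplification: versions are compared as strings, not PEP 440 objects.
    """
    name, version = parse_sdist_filename(sdist_filename)
    return escape_name(wheel_name) == name and wheel_version == version
```

A backend could run a check like wheel_matches_sdist() just before writing the wheel and treat a mismatch as a hard error, which is all the second bullet asks for.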
Currently PEP 503 doesn’t make any statement about what “project files” an index can serve. I’m inclined to leave that unchanged, as it requires tools to make judgements about files without considering where they came from (which is overall a good thing). But I would make one exception to that, for compatibility purposes, and say that tools MAY assume that files named *.tar.gz and served from a PEP 503 index are sdists, and proceed as if they had been named *.sdist.tar.gz.
(It’s not inconceivable that some tool will choose to treat all .tar.gz files like this, but I’d view that as an implementation choice about how to treat non-standard files, rather than something the standard should take a view on).
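As a sketch of how small that compatibility exception is in practice, a tool's "is this an sdist?" check might look roughly like the following. The function name and the from_pep503_index flag are hypothetical; how a tool knows where a file came from is left open here.

```python
def is_sdist(filename: str, from_pep503_index: bool = False) -> bool:
    """Decide whether a file should be treated as an sdist under this proposal."""
    # MUST: anything with the proposed extension is an sdist, wherever it came from.
    if filename.endswith(".sdist.tar.gz"):
        return True
    # MAY: legacy *.tar.gz files are assumed to be sdists only when they were
    # served from a PEP 503 index.
    if from_pep503_index and filename.endswith(".tar.gz"):
        return True
    return False
```

A tool that wanted to treat every .tar.gz as an sdist could simply pass from_pep503_index=True unconditionally, but that would be its own choice rather than something the rule requires.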
Even if we do want to go further and standardise sdist metadata, I’d still advocate for the above as the specification of the sdist filename. It feels like the minimum change needed to give us reliable information.