PEP 751: one last time

:+1: So choose one, but not both in a single lock file.

So that puts uv and PDM (and @sethmlarson ) in favour of a disposable [tool] table and Poetry as neutral from an export format perspective. That seems like enough to go with the disposable approach!

@charliermarsh didn’t express an opinion, @radoering said they would put in [tool], so it’s kind of hazy as to whether to put this in or not (and @sethmlarson liked having it).

It first depends if we are even going to try and record them. :sweat_smile: But after that I think it’s a question of UX if you are handed a lock file and nothing else. Do people care about distinguishing between extras and dependency groups? Previous discussions suggested people did.

Correct.

Yes. With the index server’s URL you can get the project name from packages.name and tack that on. You can then make the request and then find the file you’re after.
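To make that concrete, here is a rough sketch of the lookup. It assumes the index follows the simple repository API (project names normalized as in PEP 503, file entries shaped like the PEP 691 JSON response); the function names are made up for illustration and are not from the PEP.

```python
import re
from typing import Optional


def normalize(name: str) -> str:
    """Normalize a project name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def project_url(index_url: str, name: str) -> str:
    """Tack the normalized project name onto the index server's URL."""
    return f"{index_url.rstrip('/')}/{normalize(name)}/"


def find_file(files: list, filename: str) -> Optional[dict]:
    """Return the entry for `filename` from an index response's file list.

    `files` is assumed to look like the "files" array of a PEP 691 JSON
    response: a list of dicts that each carry a "filename" key.
    """
    for entry in files:
        if entry["filename"] == filename:
            return entry
    return None
```

So an index URL of `https://pypi.org/simple` plus a `packages.name` of `Flask_SQLAlchemy` gives you `https://pypi.org/simple/flask-sqlalchemy/` to request, and you then pick the file you want out of the response.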

I personally don’t see why the cache would care. You got the file you wanted and I believe pip hasn’t cared about the hash once the file is on disk when the requirements file contains hashes. I don’t see why this would be different.

Because you connect via IP address and it changes every time? So you could never rely on anything being written down in the PEP for where to find something because your IP constantly changes?

I think it depends on how thorough you want to be. If you wrote down the hash(es) that some wheel file has so you can validate before using it from a cache then you could do that to check if things match. But I’m also fine to make it explicit in the PEP that all the security mechanisms are about file acquisition and once you have it on disk then it’s up to your tool to decide if its cached version of the wheel file meets your needs or not (e.g. you can bypass the cache or use it based on name alone).

The latter for me (and if you include file size then hash collision attacks go way down). I think I brought this up way back when PEP 751 was first proposed, but I personally view the lock file as saying what bits you want to get installed into an environment. I think getting pedantic to the point that it has to come from a specific URL instead of relying on hashes and file size to make sure you’re getting what you want isn’t beneficial for the user. If I’m remembering correctly I considered the URL a hint of where to look, but I think you didn’t like the URL being viewed as that and wanted it to be canonical.
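A minimal sketch of what I mean by checking the bits rather than the location. The parameter names are hypothetical (they are not lock file keys from the PEP), and it assumes hashes are recorded as hex digests keyed by a `hashlib` algorithm name.

```python
import hashlib


def content_matches(data: bytes, expected_size: int, expected_hashes: dict) -> bool:
    """Check downloaded bytes against a recorded size and hash(es).

    `expected_hashes` maps a hashlib algorithm name (e.g. "sha256") to a
    hex digest. Where the bytes came from is deliberately not an input.
    """
    # Cheap check first: a size mismatch means we can skip hashing entirely.
    if len(data) != expected_size:
        return False
    # Refuse to "verify" when nothing was recorded to verify against.
    if not expected_hashes:
        return False
    for algorithm, expected in expected_hashes.items():
        if hashlib.new(algorithm, data).hexdigest() != expected.lower():
            return False
    return True
```

The same check works whether the bytes came from PyPI, a mirror, or a local cache, which is the whole point of the “content” view.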

I’m old enough to remember download mirrors and BitTorrent (for anyone unfamiliar with mirrors, think of them like CDNs where you had to manually choose which location to download from). In both situations you weren’t downloading a specific file from a specific place on the internet, you were downloading specific bits from wherever you could get them the fastest and you could verify you got what you expected by checking the hash and file size (if you weren’t so lazy as to skip that step which I know I almost always did :sweat_smile:). In that case you only cared about the content.

Sure, but I think we need to be very clear about how far we would be taking this. If we say the lock file fundamentally cares about recording what file contents are expected when downloading a file, then the URL and index are more for auditing purposes as well as a quick way to know where to consider looking for a file. But we could also say installers could in fact have users provide alternative locations to look as well if they so desired (e.g. an internal mirror or CDN). Think of the weekend hobbyist wanting to just grab the files from PyPI and the corporate user who has an internal mirror of PyPI, both wanting to use the same lock file. Do we tell e.g. the corporate user to recreate the lock file because the URL would be different, or just say to check the hashes and file sizes to make sure you’re getting the same thing as the weekend hobbyist from the internal mirror?
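Here is a rough sketch of how an installer could treat the recorded URL as a hint and still accept a user-supplied mirror. Everything in it is an assumption for illustration: the function name, the idea of passing in a `fetch` callable, and the example URLs are made up, and nothing here is specified by the PEP.

```python
import hashlib
from typing import Callable, Iterable, Optional


def fetch_verified(
    candidates: Iterable[str],
    fetch: Callable[[str], bytes],
    expected_size: int,
    expected_sha256: str,
) -> Optional[bytes]:
    """Try each candidate URL in order and return the first matching content.

    `fetch` is whatever the tool uses to download a URL; it is expected to
    raise OSError on failure. A candidate only wins if both the size and the
    SHA-256 digest match what the lock file recorded.
    """
    for url in candidates:
        try:
            data = fetch(url)
        except OSError:
            continue  # location unavailable, try the next one
        if len(data) == expected_size and hashlib.sha256(data).hexdigest() == expected_sha256:
            return data
    return None


# Hypothetical usage with an in-memory stand-in for the network.
wheel = b"pretend these are wheel bytes"
store = {"https://files.example.org/pkg.whl": wheel}


def fake_fetch(url: str) -> bytes:
    if url not in store:
        raise OSError(f"cannot reach {url}")
    return store[url]


result = fetch_verified(
    ["https://mirror.internal.example/pkg.whl", "https://files.example.org/pkg.whl"],
    fake_fetch,
    len(wheel),
    hashlib.sha256(wheel).hexdigest(),
)
```

In this setup the corporate user lists their mirror first and the hobbyist just uses the recorded URL, and both end up with the same verified bytes from the same lock file.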

If we take a “fetch the exact same file” view where it’s all about the URL on the internet, then it has to be the exact URL or the index, but not both. And it also means users can’t use their own mirrors.

I’m personally for the “content” view over the “location” view. @charliermarsh @radoering @frostming do you have an opinion on this one? Would you ever support letting users download files from places other than what’s in a lock file as long as the hash and file size match?
