From a practical standpoint, it would be non-trivial to support Want-Digest in PyPI. File requests go straight from our CDN to S3 or GCS, so there is no web server that we control in the path of serving files.
Off the top of my head I can only think of two ways for us to implement this:
- Add a proxy server, changing our infrastructure from CDN -> S3/GCS to CDN -> Proxy -> S3/GCS, and implement support for Want-Digest in that proxy.
- I think both S3 and GCS support attaching static metadata to a file that gets surfaced as a response header, so we could maybe add one header per hash we support and implement Want-Digest in VCL… maybe? I’m not 100% sure it’s feasible.
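To make concrete what either option would have to do, here is a rough sketch of the negotiation in Python. It assumes the RFC 3230 style of header (a comma-separated list of algorithms with optional `q` values) and a base64-encoded digest in the response. The names `SUPPORTED`, `parse_want_digest`, and `digest_header` are made up for illustration, and none of this is how PyPI or any CDN actually behaves.

```python
import base64
import hashlib

# Hypothetical mapping from Want-Digest algorithm tokens to hashlib names.
SUPPORTED = {"sha-256": "sha256", "sha-512": "sha512"}


def parse_want_digest(header):
    """Return (algorithm, q) pairs from a Want-Digest value, best first."""
    prefs = []
    for item in header.split(","):
        token, _, params = item.strip().partition(";")
        token = token.strip().lower()
        params = params.strip().lower()
        q = 1.0  # assumed default when no q value is given
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed q value as "not wanted"
        if token:
            prefs.append((token, q))
    # sorted() is stable, so equal q values keep their header order.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def digest_header(want_digest, body):
    """Return a Digest header value for body, or None if nothing matches."""
    for algorithm, q in parse_want_digest(want_digest):
        if q > 0 and algorithm in SUPPORTED:
            raw = hashlib.new(SUPPORTED[algorithm], body).digest()
            return f"{algorithm}=" + base64.b64encode(raw).decode("ascii")
    return None
```

In the proxy option the digest could be computed (or cached) like this. In the metadata option the per-hash values would be precomputed at upload time, and the edge logic would only need the selection step.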
Outside of PyPI itself, bandersnatch mirrors don’t have any custom Python code running at serving time. bandersnatch is just a cronjob that drops some files on disk, and it expects you to have a web server configured to serve that directory. This would imply that your web server has to support Want-Digest, or that mirror operators now need a specialized server process. Are there any general-purpose web servers that do support Want-Digest?
A more foundational question is whether Want-Digest is the right tool for the job, particularly as it pertains to mirrors and general-purpose web servers (assuming there are any that support it). Right now the hashes in /simple/foo/ act as an integrity check against anything done to the file. That effectively treats the file storage as untrusted: disk corruption, S3/GCS modification, etc. that change the file hash will cause installers to fail, because unless the /simple/foo/ page is modified as well, the hashes won’t align. With a general-purpose file server, or the VCL option above, we start trusting the file storage, and modifications to it will be reflected naturally in the Want-Digest response. That isn’t wrong exactly, but it’s an open question whether it’s the behavior we would want.
Basically, I think supporting Want-Digest should probably go through the PEP process, both because of the open questions and because it modifies the public interface of something that was defined in the PEP process. Hopefully as part of that we would nail down the specifics and figure out whether it’s something we’d want to support.
For whatever it’s worth, my current gut instinct would be -0, but I’m happy to listen to arguments that it should happen.