Which cryptographic signing approach?

trishankatdatadog · September 4, 2019, 8:38pm

IMHO, the minimum security model proposed in PEP 458 should be doable within 3-6 months of full-time work by a small team of dedicated developers. This includes 1 month of reading about and experimenting with TUF. IIRC, Docker took around the same time to implement Docker Content Trust (server) and Notary (client), their version of TUF. The Datadog TUF and in-toto integration was more complicated because it offers much stronger security guarantees, and it took around 1-2x as long.

Developers do not sign anything here, so nothing is required on their part: PyPI itself would sign all packages. pip would be lightly modified to transparently verify packages using TUF. The download code should be no more complicated than this Datadog-specific downloader.
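To give a rough idea of what the client-side check boils down to, here is a minimal sketch, assuming the length and SHA-256 hash of the package have already been read from signed metadata that the client trusts. It is not the TUF reference implementation's API, and `verify_target` is a hypothetical name; the real client also verifies signatures, expiry and version numbers on the metadata itself before trusting these values.

```python
import hashlib
import hmac


def verify_target(data: bytes, expected_length: int, expected_sha256: str) -> bool:
    """Return True only if the downloaded bytes match the trusted length and hash.

    expected_length and expected_sha256 are assumed to come from metadata
    whose signatures have already been verified elsewhere.
    """
    # Reject anything that is not exactly the advertised size.
    if len(data) != expected_length:
        return False
    actual = hashlib.sha256(data).hexdigest()
    # Compare the hex digests in constant time.
    return hmac.compare_digest(actual, expected_sha256.lower())
```

An installer would call something like this after the download and before unpacking, and refuse to install the file if it returns False.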

IMO, the most complicated pieces would be, in decreasing order of difficulty:

1. Designing the initialization and rotation of the root TUF keys. This is a time-consuming, offline ceremony that involves carefully generating and backing up keys in a secure environment. An important consideration is making sure that it can survive changes of personnel. The good news is that large parts of it can be scripted / automated, even though it is run manually by humans. It should be repeated at a fixed interval (e.g., every year or every few years), and whenever there is a key compromise.
2. Writing the PyPI code to sign packages on demand using TUF. This code should include garbage collection of expired metadata.
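As an illustration of the garbage-collection part, here is a minimal sketch, assuming each metadata file has been loaded as a dict with the `signed.expires` timestamp that TUF metadata carries (UTC, in the form `2030-01-01T00:00:00Z`). The function names are hypothetical, and a real collector would also have to make sure it never removes metadata that clients still need in order to update.

```python
from datetime import datetime, timezone


def parse_expires(value: str) -> datetime:
    """Parse a TUF-style expiry timestamp such as '2030-01-01T00:00:00Z'."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def expired_metadata(metadata_by_name: dict, now: datetime) -> list:
    """Return the names of metadata files whose expiry is at or before `now`.

    These are only candidates for removal; deciding whether a file is still
    referenced by current metadata is left out of this sketch.
    """
    return sorted(
        name
        for name, metadata in metadata_by_name.items()
        if parse_expires(metadata["signed"]["expires"]) <= now
    )
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the check easy to test.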

The code above, which I have a lot of experience with, needs to be written at most once. The TUF reference implementation is already in Python, so a lot of code is already available. I expect that we would run into small bugs involving corner cases. For example, I expect that we will find unexpected bugs when writing many different versions of metadata for a large enough number of packages or at a fast enough rate, or run into network issues when downloading packages in production environments we had not anticipated. So, this will need lots of testing, and some fixing, which will naturally come from beta-testing. After this, it should largely be in maintenance mode, until such time as PEP 480 is ready to be implemented.

I am happy to discuss more details. This timeline should not be considered a strict one: I am deliberately using a conservative estimate to make sure that we can account for unexpected hiccups. I know TUF looks intimidating, but it really isn’t once the one-time setup work is done, and it offers strong security guarantees that other systems simply do not provide (e.g., transparent key rotation for uncompromised users after PyPI itself has been compromised but recovered from). One thing we should do is introduce security gradually, in pieces, not all at once. Again, the TUF team and I are happy to consult and volunteer as much as we can.
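For anyone wondering how key rotation can be transparent to clients, here is a minimal sketch of the acceptance rule for a new root: it must be the next version, and it must carry valid signatures from a threshold of the keys listed in the old root as well as a threshold of the keys listed in the new root. The dict layout and function names are simplifications I made up for illustration, and `valid_signers` is assumed to be the set of key IDs whose signatures have already been cryptographically verified.

```python
def meets_threshold(valid_signers: set, role_keyids: set, threshold: int) -> bool:
    """True if enough of the role's keys produced a valid signature."""
    return len(valid_signers & role_keyids) >= threshold


def accept_new_root(valid_signers: set, old_root: dict, new_root: dict) -> bool:
    """Decide whether a client holding old_root may move to new_root.

    Each root is a simplified dict with 'version', 'keyids' and 'threshold'.
    """
    # Roots are walked one version at a time, with no skipping.
    if new_root["version"] != old_root["version"] + 1:
        return False
    # The old keys must vouch for the new root...
    signed_by_old = meets_threshold(
        valid_signers, set(old_root["keyids"]), old_root["threshold"]
    )
    # ...and the new root must also be signed by its own keys.
    signed_by_new = meets_threshold(
        valid_signers, set(new_root["keyids"]), new_root["threshold"]
    )
    return signed_by_old and signed_by_new
```

Because the old keys have to endorse the new ones, a client that still trusts the previous root can follow the chain to the rotated keys without any manual step.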