I support back-signing the existing uploads. However, I would like to know whether there would be a way for users to differentiate a retroactive signature from a “regular” one, which would strike a balance between initial usability and transparency.
I really like this idea. I think it is very possible using the delegation mechanism within TUF: use a “backsigning key” for all the backsigned packages, and then move packages to a regular “signed” role once they are pushed after the flag date.
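A hedged sketch of what that delegation could look like, using plain dicts in the shape of TUF targets metadata (the role names, key IDs, and path patterns here are hypothetical, not part of any agreed design):

```python
def backsign_delegation(signed_keyids, backsign_keyids):
    """Return the 'roles' list of a TUF targets delegation: a regular
    'signed' role searched first, with a separate 'backsigned' role
    holding packages uploaded before the flag date."""
    def role(name, keyids):
        return {
            "name": name,
            "keyids": keyids,
            "threshold": 1,
            "terminating": False,
            "paths": ["*/*"],  # hypothetical path pattern
        }
    return [role("signed", signed_keyids), role("backsigned", backsign_keyids)]
```

Since TUF clients consult delegated roles in the listed order, “moving” a package to the regular role would amount to re-listing its target under “signed” when it is pushed after the flag date.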
I went ahead and closed the issue. If there are any objections I’m happy to reopen it.
I noticed that, over the course of the last several weeks, there’s been some back-and-forth about whether the PEP should include either or both of:
- a short review of how other package managers/ecosystems handle this threat
- a short list of other platforms/services that have integrated TUF successfully, and what benefits TUF has provided to them
In response to @tiran and to the conversation about mentioning Microsoft, @mnm678 updated the abstract to only mention “TUF has been used in production by a number of organizations, including use in Cloud Native Computing Foundation’s Notary service, which provides the infrastructure for container image signing in Docker Registry”.
Is this sufficient? Would people prefer more detail on adoptions? I think the current PEP draft does not include a short review of how other package managers/ecosystems handle this threat, or a link to such a review; is this something people would find helpful?
Of the points I raised, the following two remain outstanding with the addition of the text you quoted:
- What benefits were gained from using TUF in those organisations?
- How do other ecosystems handle this issue (in particular, what alternatives to TUF have been used)?
but I’m not particularly inclined to make an issue out of it. If I were going to push for anything further, it would be a (brief) summary of what alternatives to TUF exist. Personally, though, I’ve probably got the context I need from the discussions here, so adding that sort of information would be only marginally useful to me, and I don’t really want to continue the debate solely on behalf of hypothetical readers who don’t have a feel for what TUF is.
For the original asyncio PEP, there was some additional background material that would have been a distraction in the PEP itself, so I ended up posting it on my blog.
In this case, I think we could reference the Linux Foundation blog post from when Notary and TUF were accepted as CNCF projects for general background info: https://www.linuxfoundation.org/cloud-containers-virtualization/2017/10/cncf-host-two-security-projects-notary-tuf-specification/
I’m not sure where best to place this, so I’ll try here.
Today I was discussing PEP-458 and PEP-480 on #debian-python (OFTC IRC) with @dstufft. One concern I have as a Linux distribution packager of Python packages is that the current ability to work with GPG signatures via PyPI be preserved. While I know GPG signatures don’t solve all the problems these PEPs are attempting to address, they do (for us) solve an important set of problems.
Personally, I would find it helpful if these PEPs would make an explicit statement that replacing the current GPG support is a non-goal. Eventually, implementation of PEP 458 and PEP 480 might be a suitable replacement, but it would take some time from when these changes are fielded for us to assess them and adapt our tools to use them, so please, for the foreseeable future, leave PGP signatures alone.
Thanks for contributing to the discussion Scott.
Forgive my ignorance, but could you please elaborate on this? I would like PEP 458 and, more likely, PEP 480 to be a suitable replacement for GPG signatures in future but it’s not clear to me what problems GPG solves for distribution packagers.
(I tried to find some irc logs for #debian-python but came up short)
It’s probably worth splitting the question into 2 parts:
- Does the PEP propose dropping an existing feature (GPG support)? I’d assume no, as if it did, there would be a need for a deprecation plan, transition instructions, etc, to be documented. But I agree that explicitly stating that the PEP doesn’t propose to remove GPG support would be helpful.
- Do the features in the PEP offer a (better? equivalent?) replacement for GPG support? If so, then dropping GPG at a later date, via a deprecation process, would be viable. If not, then we are committed to supporting both unless we choose to simply remove GPG functionality (again via a deprecation process). This is more of an informational/context matter in terms of the PEP.
Disclaimer: I don’t use GPG support at all. Please take my comments above as purely about the PEP process, not about the actual functionality.
PEP 458 provides zero information about how the content PyPI is serving relates to the sdist the developer has on their system. A GPG signature from a known source (we typically use a TOFU, Trust On First Use, model) that verifies against a known key for that package gives us assurance that what was received from PyPI is unmodified from what the developer intended.
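The TOFU model mentioned above can be sketched in a few lines: the first key fingerprint seen for a package is pinned, and later downloads must present the same one. (Illustrative only; real distro tooling verifies the GPG signature itself, not just a fingerprint, and the class and names here are hypothetical.)

```python
class TofuStore:
    """Trust-On-First-Use pinning of per-package signing keys (sketch)."""

    def __init__(self):
        self._pins = {}  # package name -> pinned key fingerprint

    def check(self, package: str, fingerprint: str) -> bool:
        # First sighting pins the key; later sightings must match it.
        pinned = self._pins.setdefault(package, fingerprint)
        return pinned == fingerprint
```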
PEP 458 only attempts to cover the PyPI-to-end-user half of that chain. In theory, PEP 480 would complete the coverage from the developer to PyPI, but I think we’d want to understand the actual implementation before agreeing it was sufficient.
We use GPG-signed artifacts throughout the Debian infrastructure, so it’s something our tools are well equipped to handle. Switching to PEP 480 author-originated signatures is probably adequate from a security perspective (the devil’s always in the details, so we’ll wait and see), but it would definitely require PyPI-specific tooling changes that won’t appear overnight.
I have a vague recollection from earlier discussions about these PEPs that the claim was since “no one” used GPG signatures, they could just be dropped. It’d be good to explicitly document this as a non-goal.
@kitterma Thanks for raising this issue! Yes, I agree we should explicitly document that PEPs 458 and 480 will not prevent developers from continuing to upload detached GPG signature files alongside distributions. I added an issue for us to track this.
You are right that PEP 480 could be used to directly distribute and support developers using GPG keys instead of, say, Ed25519. In fact, GPG support was recently added to securesystemslib, the shared abstraction TUF uses for cryptographic algorithms, although the Python TUF implementation still needs some changes to use GPG.
@kitterma I can add text to the PEP that clarifies that this PEP does not affect existing GPG support. As discussed, GPG provides developer signatures, while PEP 458 signs metadata from the repository.
As for PEP 480, I think there is more discussion that could take place about the relationship between the PEP and existing GPG support. For example, we could use GPG signatures to sign TUF metadata from the developer (which includes a hash of the package), as @trishankatdatadog mentioned, which would supersede the functionality of a bare GPG signature. But this might be a discussion for another time in the context of PEP 480.
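To illustrate why signing TUF metadata supersedes a bare detached signature: the metadata records the hash and length of the package, so the developer’s signature over the metadata indirectly covers the package contents as well. A minimal sketch (the function name and entry shape are illustrative; the signing step itself is omitted):

```python
import hashlib

def target_entry(package_bytes: bytes) -> dict:
    """Build a TUF-style target-file entry recording the package's
    length and SHA-256 hash; signing the metadata that contains this
    entry then vouches for the package contents."""
    return {
        "length": len(package_bytes),
        "hashes": {"sha256": hashlib.sha256(package_bytes).hexdigest()},
    }
```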
I opened a pull request to address the GPG signing issue here.
Once the “GPG signature deprecation is a non-goal” PR is merged, I think all the open points of discussion and contention are resolved, and this is ready for @dstufft to (I hope) approve. I defer to @dstufft’s and @ncoghlan’s judgment, and of course I shall withdraw my claim if anyone shares new concerns or points to previously stated ones that haven’t yet been resolved!
To follow up here: as the contracted implementor for PEP 458 I don’t foresee any issues with this approach. That being said, it isn’t scoped in our SoW. My plan here is to expose the delegation mechanism so that follow-up work can be done by a maintainer or contributor (or on the contract itself, if we end up underburning).
Hi William. Probably off-topic for this thread, but I’d like to work on this. Does it need to be follow-up work (i.e. once your SoW is complete) or do you think there’s scope to collaborate whilst you’re implementing what has been agreed under your SoW?
Some comments - haven’t gotten a full read in yet.
The discussion about compression is SHOULD/MAY. Given the size of the user base and metadata dataset we’re talking about, that seems unwise. Should we not specify exactly what is going to be done?
The consistent view discussion talks about existing mirroring technologies like rsync, but Warehouse doesn’t offer those as an interface; I believe the only supported mirroring systems are bandersnatch and various proxy-based approaches, neither of which is discussed in detail. Proxy-based approaches may just work, but bandersnatch perhaps won’t unless it is modified to obtain bins before packages, and that work needs to be identified and called out.
Thanks for the feedback!
Compression is not included in the TUF reference implementation and has led to some issues with TUF in the past. This means that PyPI consumers would need a more custom implementation in order to support compression. This is not impossible, but would require more work when it is time to implement the client side of TUF. Therefore I added compression as an optional feature that can be used if the implementers think that the reduced bandwidth is worth the extra programming effort. If others here think that we need more detail about compression in the PEP, I would be happy to add that.
This would be a good addition to the PEP. As long as all files on PyPI at a given time are copied to the mirror, it should be able to serve consistent snapshots with no issues. Otherwise, mirrors should make sure to include all metadata from the most recent snapshot.
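The mirroring order implied above can be sketched as follows. With TUF consistent snapshots, metadata files are version-prefixed (e.g. `3.targets.json`), so a mirror can copy the snapshot’s metadata first and the packages second, and clients always see a complete, matching set. The input shape below is hypothetical, combining a snapshot-style `{role: version}` mapping with the target paths to copy:

```python
def mirror_order(snapshot: dict) -> list:
    """Return files in a safe copy order for a consistent-snapshot
    mirror: version-prefixed metadata first, then package files."""
    metadata = [f"{version}.{role}.json"
                for role, version in snapshot["meta"].items()]
    return metadata + snapshot["targets"]
```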