Create and distribute Software Bill-of-Materials (SBOM) for Python artifacts

I’m mainly asking whether they should use (and hopefully help improve) Seth’s tooling, or if that tooling is meant to be specific to python.org releases (like the current release tools repo).
I guess a third option is putting it on PyPI, like blurb. (And tell everyone to keep it updated…)


All of the tooling that actually creates the final SBOMs for each artifact would indeed live in the release-tools repository.

What would get backported is the Tools/build/generate_sbom.py script, which ensures that checked-in dependencies have their metadata updated when new versions are pulled in for each branch. I felt this made sense, since backports of upgraded dependencies can have their SBOM updates free-ride alongside them.

The generate_sbom.py script could be made somewhat reusable at some point if we really wanted to (although we already have one customization for pip which others wouldn’t find use for). In particular, I suspect the method we’re using to track checked-in dependencies may be popular with other projects. I plan to share this method within the OpenSSF Security Tooling WG to see what others think. The final scripts in release-tools that stitch everything together, however, are unlikely to be reusable by other projects.
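To illustrate the general idea of tracking checked-in dependencies, here is a minimal sketch that compares the SHA-256 checksums recorded in an SPDX JSON document against the files on disk. This is not the actual generate_sbom.py logic; the function names `sha256_of` and `stale_files` are hypothetical, and the only assumption about the document is that it follows the SPDX 2.x JSON layout (`files[].fileName`, `files[].checksums[]`).

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def stale_files(sbom: dict, root: Path) -> list[str]:
    """List SBOM file entries whose recorded SHA256 no longer matches disk.

    Hypothetical helper: a file that is missing on disk, or that has no
    SHA256 checksum recorded, is also reported as stale.
    """
    stale = []
    for entry in sbom.get("files", []):
        recorded = {
            c["algorithm"]: c["checksumValue"]
            for c in entry.get("checksums", [])
        }
        path = root / entry["fileName"]
        if not path.is_file() or recorded.get("SHA256") != sha256_of(path):
            stale.append(entry["fileName"])
    return stale
```

A check like this could run in CI, so that a pull request upgrading a checked-in dependency fails until the SBOM entries are regenerated.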


To share some results: I’ve created a draft of some code that takes the SBOM for embedded CPython dependencies and generates a completely usable SBOM. This SBOM is mostly feature-complete, both from a content POV and from an external requirements POV. The output meets the NTIA Minimum Elements for an SBOM, scores 9.6/10 on SBOM quality, and is capable of identifying vulnerabilities when used with SBOM scanner tools like Grype.
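For readers unfamiliar with the NTIA Minimum Elements, a rough sketch of checking them against an SPDX JSON document might look like the following. The mapping from NTIA elements to SPDX fields is my own assumption (in particular, treating any `externalRefs` entry as the "other unique identifier"), and `missing_minimum_elements` is a hypothetical name; this is not the tool that produced the score above.

```python
def missing_minimum_elements(sbom: dict) -> list[str]:
    """Report NTIA minimum-element data fields that appear to be missing.

    Assumed mapping onto SPDX 2.x JSON fields:
      author of SBOM data     -> creationInfo.creators
      timestamp               -> creationInfo.created
      dependency relationship -> relationships
      component name          -> packages[].name
      version                 -> packages[].versionInfo
      supplier name           -> packages[].supplier
      other unique identifier -> packages[].externalRefs (e.g. a purl)
    """
    problems = []
    info = sbom.get("creationInfo", {})
    if not info.get("creators"):
        problems.append("document: missing author of SBOM data")
    if not info.get("created"):
        problems.append("document: missing timestamp")
    if not sbom.get("relationships"):
        problems.append("document: missing dependency relationships")

    per_package = (
        ("name", "component name"),
        ("versionInfo", "version"),
        ("supplier", "supplier name"),
        ("externalRefs", "unique identifier"),
    )
    for pkg in sbom.get("packages", []):
        label = pkg.get("name", "<unnamed>")
        for field, description in per_package:
            if not pkg.get(field):
                problems.append(f"{label}: missing {description}")
    return problems
```

An empty return value only means the fields are present; it says nothing about whether their contents are accurate.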


I have examined Misc/sbom.spdx.json at v3.13.0a3 in the python/cpython repository on GitHub.

What worries me is that it claims the bundled pip wheel is licensed as: MIT.

In fact, due to the vendored packages in pip, the SPDX license expression is much more complex than that. For example, in Fedora, we believe the license for pip 23.3.1 is MIT AND Python-2.0.1 AND Apache-2.0 AND BSD-2-Clause AND BSD-3-Clause AND ISC AND LGPL-2.1-only AND MPL-2.0 AND (Apache-2.0 OR BSD-2-Clause)[1].

(I understand that perfect is the enemy of good, but perhaps when we try to enumerate everything, this is important.)


  1. Breakdown at Tree - rpms/python-pip - src.fedoraproject.org


This is a great point. Perhaps one thing we can do with the SBOM is enumerate all of pip’s vendored projects instead of creating a long license expression? I’ve created this GitHub issue with my proposal.

I don’t think we can do it instead of the long license (practically the only common requirement for each of these licenses is that you add its message to all your others), but if it should be done for the long license then it should be done in the SBOM too.

Ideally we’d just reference pip’s own SPDX document though, as we’re merely bundling it. But until that’s available (and generally acceptable - I’m not sure it is yet), replicating the information as best as we can is probably best.


So I’ve enumerated all of pip’s vendored projects, but I’m now wondering: if we decide to “AND” together the license IDs of pip’s entire dependency list, do we need to do the same for CPython’s license expression? That seems like a clunky direction, since we’re already enumerating all the applicable licenses in the SBOM.

I’ll ask some SBOM folks what they think about this too.
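As a rough illustration of what enumerating vendored projects can look like, the sketch below turns a list of `name==version` pins (the style pip uses in its `_vendor/vendor.txt` file) into SPDX-style package entries. The function name `vendored_packages` and the SPDXID naming scheme are assumptions for illustration, and this is not the code in the CPython repository.

```python
import re


def vendored_packages(vendor_txt: str) -> list[dict]:
    """Build SPDX-style package entries from 'name==version' pin lines.

    Hypothetical helper: comments (after '#'), blank lines, and lines
    without an exact '==' pin are ignored.
    """
    packages = []
    for raw in vendor_txt.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "==" not in line:
            continue
        name, version = (part.strip() for part in line.split("==", 1))
        # SPDX identifiers only allow letters, digits, '.' and '-'.
        safe_id = re.sub("[^A-Za-z0-9.-]", "-", name)
        # Package URLs for PyPI use lowercase names with '-' for '_'.
        purl_name = name.lower().replace("_", "-")
        packages.append(
            {
                "SPDXID": f"SPDXRef-PACKAGE-{safe_id}",
                "name": name,
                "versionInfo": version,
                "licenseConcluded": "NOASSERTION",
                "externalRefs": [
                    {
                        "referenceCategory": "PACKAGE_MANAGER",
                        "referenceType": "purl",
                        "referenceLocator": f"pkg:pypi/{purl_name}@{version}",
                    }
                ],
            }
        )
    return packages
```

Listing each vendored project as its own package keeps every license attached to the component it applies to, rather than folding them all into one long expression on the pip package.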


Created the PR for backporting all work done on the main branch to 3.12. Backporting beyond 3.12 looks like it would require a lot more work on the tooling side, to add support for enumerating setuptools and its multiple layers of vendored dependencies, and I’m not sure it’s worth it right now. The 3.12 SBOM happens to be identical to the 3.13 one.

Also wanted to note that we’re fairly confident we won’t break downstream distributors because we got feedback from Fedora for 3.13.0a3 (thanks @ksurma) and have resolved all the issues there.

I think this is the last work that’ll happen for now (pending finding issues) in the CPython repository; future work will focus on release tooling and documentation.


There’s been a lot of movement for CPython SBOMs this week:

  • The backport of SBOM tooling to 3.12 branch was accepted.
  • SBOM creation and uploading was added to the CPython release process.
  • The two above points mean that CPython 3.12.2 will be the first release to have SBOM documents.
  • I’ve published documentation for CPython’s Software Bill-of-Materials on python.org/download.
  • Changed all third-party package licenseConcluded fields to NOASSERTION to avoid folks using SBOMs for determining licensing compatibility. Our use-case right now for SBOMs is for vulnerability and supply chain management.
  • Received some more feedback from pip maintainers; we might be able to move the automation from the CPython source repository to the release process itself so pip maintainers don’t run into difficulties with backporting pip fixes to previous branches.
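The licenseConcluded change mentioned in the list above can be pictured with a small sketch like this one. It assumes an SPDX 2.x JSON document and that first-party packages can be recognised by name (here the hypothetical default `"CPython"`); the helper name `drop_concluded_licenses` is also made up for illustration.

```python
def drop_concluded_licenses(
    sbom: dict, first_party: frozenset[str] = frozenset({"CPython"})
) -> int:
    """Set licenseConcluded to NOASSERTION on every third-party package.

    Modifies the document in place and returns how many packages changed.
    Packages whose name is in `first_party` are left untouched.
    """
    changed = 0
    for pkg in sbom.get("packages", []):
        if pkg.get("name") in first_party:
            continue
        if pkg.get("licenseConcluded") != "NOASSERTION":
            pkg["licenseConcluded"] = "NOASSERTION"
            changed += 1
    return changed
```

NOASSERTION is a defined SPDX value meaning that no conclusion is being stated, which fits a document intended for vulnerability and supply chain management rather than license compliance.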

That’s all for now. Stay tuned for when CPython has its first end-to-end SBOM! :partying_face:


Python 3.12.2 is released and has SBOM documents available for Python-3.12.2.tar.xz and Python-3.12.2.tgz. Find them in their own column in the download artifacts table.
