Software Bill of Materials for next Python release

Considering the 2021 Executive Order on Improving the Nation’s Cybersecurity, software products and services are required to publish the minimum elements of a Software Bill of Materials (SBOM). Python should also work along these lines and integrate SBOM generation into its build pipeline. CycloneDX by OWASP provides excellent tools for constructing SBOMs for a wide range of software applications. I’m wondering whether the Python community is already working on SBOMs or is planning to do so in the future.

It isn’t clear to me if you wanted this for the CPython interpreter project itself (in which case Core Development is the right category) or if you were asking about this from a Python packaging point of view, in which case this could be moved over to the Packaging category.

That tools like the one you linked (CycloneDX/cyclonedx-python on GitHub, which creates CycloneDX SBOMs from Python projects and environments) exist is presumably useful for the latter Packaging category (I’ve never used it and am unlikely to ever have a need).

Based on your first federalregister SBOM definition link, it is doable for someone to figure out how to produce an artifact listing the third party tooling and linked third party software used in our CPython binary releases, in whatever format(s) are deemed desirable for that. I’m unaware of anyone expressing a desire to work on doing that.

We’re an open source project, so the data is already there in our source tree and build configuration and has been since the beginning, even if not in some random government-proposed canonical format. I’d expect someone wanting to work on this would probably be a motivated volunteer from a company that needs to consume and/or ship relevant SBOM artifacts themselves.

In what sense, by law?

Initially, I meant it for Python packaging, but since an SBOM is useful for any software project, why not for the CPython interpreter too? As for why an SBOM is useful or needed in the first place: it builds user trust and improves supply chain security by reducing the risk of a supply chain attack. There have been very high-profile security breaches recently. The risk is always there, but an SBOM can help identify and possibly mitigate vulnerabilities in time.
SBOMs will most likely be a crucial part of any software in the near future. Here is a synopsis on why organizations should use SBOMs. President Biden’s Executive Order.
All of this is optional at this point, but I believe it is not entirely useless. I’ve worked on generating an SBOM for an open-source JDK and was thinking, why not for Python (btw I’m currently learning). I’m not associated with any organization; however, out of my interest in SBOMs, I’m willing to volunteer to generate SBOMs for Python packaging, provided the community welcomes the effort. Thanks!

At this point it is required only for the software applications the government uses; however, it is most likely to be adopted by all organizations building software. Because, why not: better security and more trust in the supply chain. Cyber Security Executive order 2021

I believe some folks on the packaging side are already looking into SBOM support for the ecosystem.

The page you linked to as the executive order states that it is:

Notice, request for public comment.

I’ve not read it in detail, but this detail stuck out to me. It seems to be something written by a US government agency.

Beyond that, cyclonedx-python (linked above) does provide the relevant functionality: generating a CycloneDX SBOM from the most popular Python packaging dependency specification formats.
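
For anyone who wants a feel for what such a tool produces, here is a minimal sketch (not cyclonedx-python’s actual implementation) that lists the distributions installed in the current environment with the standard library’s importlib.metadata and wraps them in a CycloneDX-style JSON document. The helper names are made up for illustration, the spec version is an assumption, and a real SBOM would carry more detail (hashes, licenses, the dependency graph).

```python
import json
from importlib import metadata


def component_for(name, version):
    """Build one CycloneDX-style component entry (hypothetical helper)."""
    # Package URLs for PyPI use the lowercased name with "_" replaced by "-".
    purl_name = name.lower().replace("_", "-")
    return {
        "type": "library",
        "name": name,
        "version": version,
        "purl": f"pkg:pypi/{purl_name}@{version}",
    }


def environment_components():
    """Return one component entry per distribution installed in this environment."""
    components = []
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if not name or not version:
            continue  # skip installs with unusable metadata
        components.append(component_for(name, version))
    return sorted(components, key=lambda c: c["name"].lower())


def build_bom(components):
    """Wrap components in a minimal CycloneDX-style document.

    The specVersion value is an assumption; check the CycloneDX
    specification for the version you actually target.
    """
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.4",
        "version": 1,
        "components": components,
    }


if __name__ == "__main__":
    print(json.dumps(build_bom(environment_components()), indent=2))
```

This only describes what happens to be installed, so it is closer to the "environment" mode of such tools than to reading a requirements or lock file.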

I’m not sure if everyone here would agree (@brettcannon?) but for me personally, if you (@SehrishHussain) want to produce a SBOM for say the 3.11b4 release that would be an interesting experiment, we could then see whether your approach can be integrated into the release process. The release managers would have to be supportive before we can think of the latter, but I think a one-off demo would be helpful if you want to convince folks (much more helpful than insisting “everyone likes better security, right?” :-). Note that the release is a complicated production, requiring compiler toolchains on three different platforms.

Can you direct me to some, if you know of any? Or could you point me to where I should be looking?

Makes sense. Yeah, releasing comes at a later stage. If I’m to work on an SBOM, I will need someone from the community to help gather the metadata etc. If you know anyone working on this, or can direct me to the right place to discuss it with folks, that would be great!

PEP 639 already “defines a specification for … using SPDX identifiers …”

Apparently “SPDX, completely addresses the Executive Order 4(e) and 4(f) and 10(j) requirements for a Software Bill of Materials (SBOM)”.

Should SPDX be preferred? Are there any example SBOMs of known open source software in either format available?

What is the difference between an SBOM and a PEP 665 lock file?

Thanks.

@SehrishHussain I think this is the right place. Maybe you can just ask more pointed questions and we can take it from here? You ask for someone to “gather the metadata etc.” – I have no idea what that means (to me, “metadata” is about as descriptive as “things” :-).

I think a proof-of-concept is going to be necessary to evaluate how this may impact us and whether we need to own this or whether it is going to be something the community provides externally (i.e. you could argue the installers are the only things we really need this for, as those who compile and distribute CPython themselves will have their own SBOMs based on what they include). Since whether our code gets used by a government has never been a direct concern of ours – flashbacks to emails looking for export control numbers for the US government – it’s not something we have to take on, but I’m sure our users would appreciate it if we can help out somehow.

Packaging-related discussions occur in the Packaging category on Discussions on Python.org. Specifically around SBOMs, it’s all been in private conversations so far, so I’m not at liberty to share, but people were looking into it less than a month ago.

You can look at the Software Bill of Materials page from the National Telecommunications and Information Administration (NTIA) for the nitty-gritty, but the missing parts from the lock file format that PEP 665 proposed are:

  • Author name
  • Supplier name
  • Relationship

This also doesn’t cover vendored code in a project, which is an equally important aspect of SBOMs (on top of shipping roughly what your lock file is with your binary/product). This will probably require an update to wheels.
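
As a rough illustration of that gap, the sketch below models the NTIA minimum data fields as a dataclass and reports which ones a lock-file-style entry leaves empty. The field names are my own paraphrase of the NTIA list, the example entry is invented (it is not PEP 665 syntax), and treating the lock file's creation time as the SBOM timestamp is an assumption.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class MinimumElements:
    """Hypothetical record of the NTIA minimum data fields for one component."""

    sbom_author: Optional[str] = None
    supplier_name: Optional[str] = None
    component_name: Optional[str] = None
    component_version: Optional[str] = None
    unique_identifier: Optional[str] = None  # e.g. a file hash or package URL
    dependency_relationship: Optional[str] = None
    timestamp: Optional[str] = None


def missing_fields(record):
    """Return the names of the fields that are still unset, in declaration order."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


# Invented entry carrying roughly what a lock file records about a package.
locked_entry = MinimumElements(
    component_name="example-package",
    component_version="1.0.0",
    unique_identifier="sha256:<hash of the locked file>",
    timestamp="<lock file creation time>",  # assumed to be reusable here
)

print(missing_fields(locked_entry))
# ['sbom_author', 'supplier_name', 'dependency_relationship']
```

Vendored code would not show up in a record like this at all, which is why it needs separate handling.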

And for those of you still wondering what an SBOM is, the one-liner is basically a receipt of all the software used to make a product/binary. That way it’s easy to look at all your software to know if you have e.g. a vulnerable copy of log4j running somewhere. So in our case it would be stuff like what version of SQLite got compiled into the Windows binary. Other than that, it’s whether there’s an easy way for us to help out e.g. Linux distros who build and distribute CPython on their own to get an SBOM for what they built from the Makefile.
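
Some of that receipt can already be read back from a running interpreter. The sketch below is only an illustration of the idea, not a proposal for how the release process should do it: it asks a few standard library modules which version of their underlying C library they report. The function name and the choice of libraries are mine, and on distro builds the values describe system libraries rather than bundled copies.

```python
import importlib
import platform

# (module, attribute) pairs that expose the version of an underlying C library.
LIBRARY_VERSION_ATTRS = {
    "sqlite": ("sqlite3", "sqlite_version"),
    "openssl": ("ssl", "OPENSSL_VERSION"),
    "zlib": ("zlib", "ZLIB_VERSION"),
    "expat": ("pyexpat", "EXPAT_VERSION"),
}


def linked_library_versions():
    """Return {label: version string} for the libraries this build exposes."""
    versions = {}
    for label, (module_name, attr) in LIBRARY_VERSION_ATTRS.items():
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # this build was compiled without the module
        versions[label] = getattr(module, attr, None)
    return versions


if __name__ == "__main__":
    print(platform.python_implementation(), platform.python_version())
    for label, version in linked_library_versions().items():
        print(f"{label}: {version}")
```

A real SBOM would need this information at build time, from the build configuration, rather than by introspection afterwards.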

Linux distros generally

  • unbundle things like SQLite and always use system copies of libraries (with appropriate metadata), or
  • have a way to mark bundled libraries, making an SBOM relatively easy to generate (in a non-Python-specific way), or
  • don’t care much about this – and aren’t too interested in being suppliers for the US government.

CPython can certainly help, in ways like making it easy to use system libraries rather than bundled ones, but it has generally been very friendly with distro packaging.

Microsoft open sourced its Salus software bill of materials (SBOM) generation tool, which “uses the standard Software Package Data Exchange (SPDX) format.”

The OSS Review Toolkit (ORT) and syft support both SPDX and CycloneDX.

Salus, ORT and syft all claim to support Python.