SBOMs for Python packages project

Well, maybe I am misunderstanding the README you linked to, but it seems you are looking for .a files in binary wheels in your SELECT query? I would not expect binary wheels to ship .a files of dependencies. Is there any situation where that occurs?

Ahhh, you’re talking about the blog post. Yeah, to create the data table I added .a as a query candidate too, just to get a complete sample and to avoid making assumptions about what Python projects do. I wasn’t expecting wheels to contain them in abundance (and iirc there weren’t many .a files in Python packages).

The numpy wheels contain at least one .a file: numpy/_core/lib/libnpymath.a
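For anyone who wants to check a wheel themselves: a wheel is a zip archive, so a minimal sketch with the standard-library zipfile module is enough to list any .a members. The helper name find_static_archives and the tiny in-memory stand-in wheel are made up for illustration; only the libnpymath.a path comes from the post above.

```python
import io
import zipfile


def find_static_archives(wheel_file):
    """Return the sorted names of all .a members in a wheel (a zip archive).

    wheel_file may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(wheel_file) as wheel:
        return sorted(name for name in wheel.namelist() if name.endswith(".a"))


# Hypothetical stand-in for a real wheel, built in memory so the sketch runs
# without downloading anything. A real check would pass a .whl path instead.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as fake_wheel:
    fake_wheel.writestr("numpy/_core/lib/libnpymath.a", b"")
    fake_wheel.writestr("numpy/__init__.py", b"")

found = find_static_archives(buffer)
print(found)
```

Pointing the same helper at a downloaded .whl file should work the same way, since zipfile accepts paths as well as file objects.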

@sethmlarson another example with numpy, where bundled native code deps may not be up to date: ENH: Streamline and improve the origin and license documentation of third party bundled in wheels · Issue #27764 · numpy/numpy · GitHub

I’ve published the results of some validation work for how useful embedded SBOMs are to today’s tooling. Syft, a popular open source SCA tool, is able to use embedded SBOMs inside of wheels. I forked auditwheel and added some rudimentary SBOM record-keeping of bundled shared libraries and Syft was able to use the result end-to-end. Check out the full details, I’ve made all the work public:


I’ve been only lightly following this discussion since the projects I work in almost exclusively publish pure-Python packages with no separate C extensions, but as a security practitioner I’m still somewhat interested…

What would the upshot of SBOM inclusion be for the simple case of pure-Python packages? I assume there would at least be some default entry/entries for the project itself?

What does “SCA” mean?

Quoting from the second bullet in the first post:

software composition analysis (SCA) tools


For any project, I assume the default SBOM is empty, but if your Python project vendored anything, you’d enumerate it.


Basically this! If you aren’t vendoring any other projects/code then you wouldn’t need to do anything, SCA tools already handle your project correctly enough to be effective. If you are vendoring projects (even Python ones) you can list them in an SBOM or maybe the tool you’re using to vendor those projects will do that for you in the future.

The only use-case I could think of that you might want to use an SBOM even if you’re not vendoring projects is being able to describe your project using SBOM metadata that doesn’t have a Python package metadata equivalent, but I suspect this will be rare.
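To make "list them in an SBOM" a bit more concrete, here is a rough sketch of what enumerating vendored projects could look like as a CycloneDX-style JSON document. The field names (bomFormat, specVersion, version, components, purl) are from the CycloneDX JSON format; the helper build_vendored_sbom, the spec version chosen, and the vendored project name and version are placeholders for illustration, not something any tool in this thread produces.

```python
import json


def build_vendored_sbom(vendored):
    """Build a minimal CycloneDX-style dict from {name: version} of vendored PyPI projects.

    Names are assumed to be already normalized (lowercase, dashes).
    """
    components = [
        {
            "type": "library",
            "name": name,
            "version": version,
            # Package URL so SCA tools can match the component to advisories.
            "purl": f"pkg:pypi/{name}@{version}",
        }
        for name, version in sorted(vendored.items())
    ]
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.6",
        "version": 1,
        "components": components,
    }


# Hypothetical vendored dependency, for illustration only.
sbom = build_vendored_sbom({"example-vendored-lib": "1.2.3"})
print(json.dumps(sbom, indent=2))
```

A project that vendors nothing would end up with an empty components list, which matches the "default SBOM is empty" assumption above.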



Got it, so having an SBOM that just says the package of your project includes the files from your project would be redundant (and perhaps circular). Pure Python projects can presumably just ignore these SBOM discussions for the most part.

Oh, though is there potential benefit to including an empty SBOM, as an assertion that your package contains nothing besides the project’s own files?


One thing I notice is that the example with Pillow shows the various libraries as type “rpm”. Surely that’s wrong, as the issue is precisely that these are not versions that are managed as RPMs, but instead are vendored into the wheel and have a very different support model than normal RPMs?

Is that just a quirk of the syft tool that people using that tool won’t have a problem with, or is it something that needs further work to distinguish between “dependencies satisfied by linking to the system library” and “dependencies bundled with the software”?

That’s a result of the software identifier being used, which is a “package URL”. This is how a tool like Syft or Grype can make links between software in use and a vulnerability database. The SBOM included in the Pillow wheel had package URLs that look like this, because AlmaLinux uses RPMs:

pkg:rpm/almalinux/libXau@1.0.9-3.el8?distro=almalinux-8

You can see the full SBOM file that gets generated at this GitHub Gist: Example CycloneDX SBOM file generated by auditwheel · GitHub
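To show where the "rpm" type comes from, here is a simplified sketch that splits the package URL above into its parts. It is not a complete implementation of the package URL format (no percent-decoding of names, no validation), and parse_purl is a name made up for this example rather than an API from Syft or any purl library.

```python
from urllib.parse import parse_qsl


def parse_purl(purl):
    """Split a package URL into type, namespace, name, version and qualifiers.

    Simplified: assumes a well-formed 'pkg:type/[namespace/]name[@version][?qualifiers][#subpath]'.
    """
    scheme, _, rest = purl.partition(":")
    if scheme != "pkg":
        raise ValueError(f"not a package URL: {purl!r}")
    rest, _, subpath = rest.partition("#")
    rest, _, query = rest.partition("?")
    if "@" in rest:
        path, version = rest.rsplit("@", 1)
    else:
        path, version = rest, None
    segments = path.strip("/").split("/")
    return {
        "type": segments[0],
        # Everything between the type and the name is the namespace.
        "namespace": "/".join(segments[1:-1]) or None,
        "name": segments[-1],
        "version": version,
        "qualifiers": dict(parse_qsl(query)),
        "subpath": subpath or None,
    }


parts = parse_purl("pkg:rpm/almalinux/libXau@1.0.9-3.el8?distro=almalinux-8")
print(parts)
```

The type field only says which ecosystem's naming and versioning scheme identifies the component. It does not say the library was installed through the system package manager.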

So “something that people using the tool won’t have a problem with”. Cool. Sorry for the noise.

@pf_moore No worries! I am not an expert on the PEP delegate process for packaging PEPs but wanted to confirm you are okay with @brettcannon being PEP delegate. Brett volunteered to be delegate here: SBOMs for Python packages project - #13 by brettcannon

Yep, that’s fine with me. The actual process is documented here. Technically you need approval from “the other PyPA core reviewers, the lead PyPI maintainer and the default PEP-Delegate for package distribution metadata PEPs” (of which I’m the last of the 3), but we tend to be pretty informal about the details and as long as no-one objects and you have my OK, we should be fine.
