Thanks for the PEP!
Overall I think it seems fine, though I feel that the Core Metadata specification is quite over-specified (perhaps to the point where it can’t actually be used?):
- That file MUST be included in the distribution archive at the specified path relative to the root license directory.
This might mean “root SBOM directory”, which was defined inline a couple of lines earlier (which I missed the first time, so it might be worth a more explicit definition if you’re going to refer to it).
- That file MUST be installed with the project at that same relative path.
Perhaps rephrase this to “Installers must install this file …” to emphasize that all I need to do is include the file, and not figure out how to make someone else do their job properly. (I assume installer maintainers will recognise that this just means “keep doing what you’re already doing” and not “do work to make sure it extracts properly”. In other words, this is a no-op requirement for everyone involved.)
packaging tools MUST reproduce the directory structure under which the source files are located relative to the project root
This isn’t relevant to core metadata. What we need specified here is that the file path listed in METADATA includes the directory structure that appears in the distribution package. The tools that are going to reproduce that structure need this reminder in the later section about source metadata.
SBOM document contents MUST be UTF-8 encoded JSON
SBOM document contents MUST use an SBOM standard
Why? Seriously, at this point, why do we care? Is PyPI supposed to check that the SBOM referenced by METADATA is following a standard and reject the package from being uploaded? Given we aren’t defining the standard ourselves (or specifying it in METADATA), then it’s really up to the final consumer to figure out what the format is. From a core metadata point of view, we just need the info to tell a consumer where they should be looking. I would strike these two bullet points completely.
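To illustrate what "up to the final consumer" could look like, here is a minimal sketch of a consumer sniffing the format of an SBOM file it has been pointed at. It assumes only two JSON flavours: CycloneDX JSON (top-level `bomFormat` set to `CycloneDX`) and SPDX 2.x JSON (top-level `spdxVersion`). The function name `guess_sbom_format` is hypothetical, and anything else is reported as unknown rather than rejected.

```python
import json


def guess_sbom_format(raw: bytes) -> str:
    """Best-effort guess at the SBOM format; never raises on odd input."""
    try:
        doc = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        # Not UTF-8 JSON at all: the consumer decides what to do next.
        return "unknown"
    if not isinstance(doc, dict):
        return "unknown"
    # CycloneDX JSON documents carry a top-level "bomFormat" marker.
    if doc.get("bomFormat") == "CycloneDX":
        return "cyclonedx"
    # SPDX 2.x JSON documents carry a top-level "spdxVersion" string.
    if "spdxVersion" in doc:
        return "spdx"
    return "unknown"
```

The point of the sketch is that none of this needs to live in core metadata or on PyPI: the metadata only has to say where the file is.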
The “primary” component being described in included SBOM documents MUST be the Python package.
SBOM documents MUST include metadata for the timestamp when the SBOM document was created.
SBOM documents SHOULD include metadata describing the tool creating the SBOM document.
Again, this is important for being a well-behaved project with regards to someone consuming your SBOMs, but isn’t something we can specify in core metadata. Perhaps these should move to a new Background section covering how your SBOM is likely to be used and how you can play well?
PyPI SHOULD validate that all specified files are present in the distribution archives,
Okay, just got up to this point (skimmed over it on the first read, I guess). I don’t like this at all - let’s leave PyPI out of it, and let people handle SBOM formats themselves. There are endless numbers of tools that can do this same thing for those who care about it, and you can suggest that package builders/SBOM generators should do it, but I think making PyPI the enforcer is overbearing.
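For what it's worth, the presence check is small enough that a build tool or SBOM generator could do it on its own. A rough sketch, assuming the paths in the proposed Sbom-File field are relative to the directory containing METADATA (my suggestion further down, not necessarily what the PEP ends up with); the function name `missing_sbom_files` is made up for illustration:

```python
import posixpath
import zipfile
from email.parser import Parser


def missing_sbom_files(wheel_path):
    """Return the Sbom-File entries whose target is absent from the wheel."""
    with zipfile.ZipFile(wheel_path) as wheel:
        names = set(wheel.namelist())
        # Find the top-level *.dist-info/METADATA member
        # (raises StopIteration if the archive has none).
        metadata_name = next(
            n for n in names
            if n.endswith(".dist-info/METADATA") and n.count("/") == 1
        )
        raw = wheel.read(metadata_name).decode("utf-8")
    # Core metadata uses an email-header-like format, so the stdlib
    # parser can read repeated fields such as the proposed Sbom-File.
    metadata = Parser().parsestr(raw)
    base = posixpath.dirname(metadata_name)
    declared = metadata.get_all("Sbom-File") or []
    return [
        p for p in declared
        if posixpath.normpath(posixpath.join(base, p)) not in names
    ]
```

An empty list means every declared SBOM is in the archive, which is all a builder needs to know before uploading.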
I’m skipping over the source metadata section, because I neither use nor care about it. I’m sure it’s fine.
The sdist specification will be updated to reflect that if the Metadata-Version is 2.5 or greater, the sdist MUST contain any SBOM files specified by the Sbom-File field in the PKG-INFO at their respective paths relative to the sdist
The sdist specification doesn’t actually have anywhere for this to go. PKG-INFO for metadata version 2.2 or later just refers to the core metadata spec, and I believe the same is true for the wheel spec. In either case, defining the metadata as “relative to the metadata file (either PKG-INFO or METADATA depending on context)” should get you what you want without touching the sdist or wheel spec.
It’s also a bit awkwardly worded compared to the later ones - try something like “if the metadata version is 2.5 or later, any Sbom-File fields must only contain relative paths from the metadata file’s directory to an SBOM file included in the package”. This avoids the “must contain SBOM” wording, which is scary for people who don’t currently use SBOMs (at least until they finish reading and parse it all, but they’re already feeling scared).
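A minimal sketch of how a consumer might apply the "relative to the metadata file" rule, so the same logic covers both PKG-INFO in an sdist and METADATA in a wheel. The helper name `resolve_sbom_path` is hypothetical, and rejecting `..` segments is my own assumption about what "relative path to a file included in the package" should mean, not something the PEP states.

```python
from pathlib import PurePosixPath


def resolve_sbom_path(metadata_file, sbom_field):
    """Resolve an Sbom-File value against the metadata file's directory."""
    rel = PurePosixPath(sbom_field)
    # Assumption: absolute paths and parent-directory escapes are invalid.
    if rel.is_absolute() or ".." in rel.parts:
        raise ValueError(f"Sbom-File must be a contained relative path: {sbom_field!r}")
    return PurePosixPath(metadata_file).parent / rel
```

With this reading, `resolve_sbom_path("demo-1.0.dist-info/METADATA", "sboms/demo.cdx.json")` lands inside the `.dist-info` directory, and `resolve_sbom_path("demo-1.0/PKG-INFO", "bom.json")` lands next to PKG-INFO, with no sdist- or wheel-specific wording needed.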
the .dist-info directory MUST contain an sboms subdirectory, which MUST contain the files
Why? “Relative to the metadata file” is enough, and it keeps things simpler if we don’t have one case with an extra subdirectory (I’d expect in most cases everyone’s going to put them in a subdirectory in their sources anyway, so they’ll just end up nested deeper).
What is probably needed here is clarification that SBOM files should be included in RECORD.
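As a rough illustration of the RECORD point: RECORD is a CSV file whose first column is the installed path, so checking that the SBOM files are listed is a few lines. The function name `sboms_missing_from_record` and the example paths are hypothetical.

```python
import csv
import io


def sboms_missing_from_record(record_text, sbom_paths):
    """Return the SBOM paths that have no row in the RECORD contents."""
    # Each RECORD row is: path, hash, size. Only the path matters here.
    recorded = {row[0] for row in csv.reader(io.StringIO(record_text)) if row}
    return [p for p in sbom_paths if p not in recorded]
```

Here `sbom_paths` are expected in the same form RECORD uses (relative to the install root, for example `demo-1.0.dist-info/sboms/a.cdx.json`), so any Sbom-File value would first need resolving against the metadata directory.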
There are no backwards compatibility concerns for this PEP.
Well, you’re claiming some directory names. It probably doesn’t need to be stated, but the reason you’re increasing the metadata version is to avoid compatibility concerns - anyone currently using the Sbom-File field for something different will be affected, but they can choose whether to opt into the new meaning. (And if you want to argue that they shouldn’t be using it, then I’ll argue that means we don’t need to change the version. I don’t particularly care who wins that argument.)
How can a project specify an SBOM file that is conditional? Under what circumstances would an SBOM document be conditional?
I assume you mean in the source metadata? Probably at this stage you add text along the lines of “build tools may choose to use or ignore the sbom-files specification if requested by the user” and don’t worry about it.
We don’t have conditional core metadata. Once it’s in PKG-INFO or METADATA, it’s basically gotta stay there.
Another question that occurs to me, and that I don’t recall seeing an answer to in the PEP: are SBOMs allowed to differ between separate wheels for the same release (I sure hope so!), and are they allowed to change between what’s in the sdist and what goes into a wheel?