PEP 770: Improving measurability of Python packages with Software Bill-of-Materials

I don’t disagree with you; I also suggested metadata-files above as a potential name, but @rgommers suggested we might want to avoid the word metadata as it might limit us in the future. What you’re saying here is another way the name could limit future usage:

I’ve only been imagining the proposed reservation of .dist-info directories as places where these files might end up. If there’s a future where this feature is used for cases outside of this then yeah, having a different name makes sense to me.


Because it seems that the last open point is still the naming of the table (and I haven’t seen any +1s on comments), I’ll throw this poll together to see where folks are on the name. The last option is “postpone”, meaning I’ll refactor PEP 770 specifically to remove the creation of the new table (while maintaining the reservation of all .dist-info directories and the .dist-info/sboms directory).

  • additional-metadata
  • additional-files
  • dist-info.files
  • metadata-files
  • Something else (comment)
  • Postpone to a separate PEP

I’ll check back in a few days to see what the results are. If there’s still no clear winner then I’ll proceed down the “postpone” route to keep the ball moving on this PEP, because I think everything else in the PEP is good to go and is waiting for provisional status to begin merging pull requests to projects.

Please don’t use a catch-all name like additional-, it feels too vague and likely to become confusing in the future. Outside of that, I do not have a strong preference on the proposed names but picked one for the sake of the poll.

(Reading Ofek’s feedback above again, I agree that dist-info feels like a bit of an implementation detail for a user-facing feature like this. The fact that it adds another level of TOML nesting is less important to me. I agree with the other comments that something other than metadata- would be nice for future compatibility. Unfortunately it doesn’t seem like there’s a name proposal that captures the user-facing intent perfectly.)


So under “postpone”, how would users specify what the SBOM files are that are to be included in the distribution? Would that be left as purely backend-specific (i.e. under the [tool] namespace)?

SBOMs could be included either by build backends (either automatically for generated SBOMs or through some backend-specific tool setting) or they’ll be added by other tools like auditwheel. The only use-case that’s affected by postponing a pyproject.toml config option is the statically defined SBOMs use-case. I suspect many build backends won’t be jumping on a custom tool setting (I’d rather wait for the non-tool option to be defined) but a few build backends and repair tools will use .dist-info/sboms during provisional status.
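To make the `.dist-info/sboms` layout concrete, here is a minimal sketch of how a consuming tool might pick SBOM files out of a distribution’s file listing. The function name, the example project, and the file names (including `auditwheel.cdx.json`) are made up for illustration and are not taken from the PEP text.

```python
from pathlib import PurePosixPath


def find_sbom_paths(record_paths):
    """Return the paths that sit under a *.dist-info/sboms/ directory."""
    found = []
    for raw in record_paths:
        parts = PurePosixPath(raw).parts
        # Expect at least: <name>-<version>.dist-info / sboms / <file>
        if len(parts) >= 3 and parts[0].endswith(".dist-info") and parts[1] == "sboms":
            found.append(raw)
    return found


# Hypothetical file listing, as it might appear in a wheel's RECORD.
example_files = [
    "example/__init__.py",
    "example-1.0.dist-info/METADATA",
    "example-1.0.dist-info/RECORD",
    "example-1.0.dist-info/sboms/auditwheel.cdx.json",
    "example-1.0.dist-info/licenses/LICENSE.txt",
]
print(find_sbom_paths(example_files))
```

For an installed distribution, the listing could come from `importlib.metadata.files("example")`, converting each entry with `str()` (and handling the case where it returns `None`).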

If you or someone else thinks of a better name that captures all of the above feedback I would love to hear it, happy to run another poll or add it to the existing options. I haven’t been able to think of a name yet that doesn’t have some flaw mentioned in the thread.


It appears dist-info.files received 60% of the vote with only one call to postpone (@pradyunsg if you had specific comments on postponement, reach out or comment), so I’m going to move forward with the current text of the PEP. Thanks everyone for weighing in. I’m again asking for pronouncement of this PEP.


There’s a discussion going on here, around the license file metadata, which might need addressing for SBOMs as well.

Basically, in a monorepo or similar project layout, it’s quite possible for license files to be stored outside of the project source tree, in another part of the overall repository. In that case, the restriction on files having to be under the project root is problematic.

I don’t know if SBOM data could potentially have the same need to be located outside the project tree, but if so, the PEP probably needs updating to reflect whatever solution is identified for the license file case.
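To illustrate the restriction being discussed, here is a rough sketch of the kind of “must be under the project root” check a tool might apply to a declared file path. The function name and the monorepo paths are hypothetical, and real backends may implement this differently.

```python
from pathlib import Path


def is_under_project_root(project_root, candidate):
    """Return True if candidate resolves to a location inside project_root."""
    root = Path(project_root).resolve()
    # Relative candidates are interpreted relative to the project root;
    # an absolute candidate replaces the root when joined.
    target = (root / candidate).resolve()
    return root in target.parents


# Hypothetical monorepo: the package lives in /repo/packages/pkg,
# while a shared file sits at the repository top level.
print(is_under_project_root("/repo/packages/pkg", "sboms/bom.cdx.json"))
print(is_under_project_root("/repo/packages/pkg", "../../LICENSE"))
```

The second call is the monorepo case described above: the file exists in the repository but fails the check because it resolves outside the project directory.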


First impressions of the PEP: Very well written. Nice job!

In my org, we have an internal tool which produces a custom subdirectory in dist-info containing application specific lock-files and config. Included in that is also a definition of Java dependencies (which ultimately get used through JPype), which is aggregated post-install to determine the environment’s Java requirements, and then this is subsequently resolved. Clearly neither of these will ever be standardised, but I think it is reasonable to wish to bundle additional dist-level metadata into dist-info. I therefore propose either:

  • Reserve names on a per-case basis (it is the responsibility of a future PEP to determine the backwards compatibility of introducing a new name). The number of reserved names is expected to be very low, as I see it.
  • Allow names starting with _ to be permitted (taking inspiration from PEP-700).

To be clear, I am not referring to the keys allowed in pyproject.toml [dist-info.files], which I do believe should be strictly controlled. (actually, I don’t see why “Tools consuming this field SHOULD reject invalid values with an error.” isn’t a MUST)

Furthermore, there is an inconsistency in the document. There are two statements:

Build frontends and publishing tools MAY warn users if any .dist-info subdirectories aren’t in the registry.

Instead the recommendation is that build frontends MAY warn the user or raise an error in this scenario.

It is my belief that a warning is sufficient, thereby also facilitating future distribution metadata installation for older frontends.
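As a sketch of the warn-only behaviour, a tool could compare the subdirectories it finds in a `.dist-info` directory against the registry and warn about the rest. The registry contents below (`licenses` and `sboms`) are my assumption of what is registered, the `allow_private_prefix` flag models the underscore proposal above rather than anything in the PEP, and the function name is made up.

```python
import warnings

# Assumed registry contents: "licenses" and "sboms". Check the actual
# registry before relying on this set.
REGISTERED_SUBDIRS = {"licenses", "sboms"}


def check_dist_info_subdirs(subdir_names, allow_private_prefix=False):
    """Warn about unregistered .dist-info subdirectories and return them."""
    unknown = []
    for name in subdir_names:
        if name in REGISTERED_SUBDIRS:
            continue
        # Hypothetical private-use rule: names starting with "_" are allowed.
        if allow_private_prefix and name.startswith("_"):
            continue
        unknown.append(name)
    for name in unknown:
        warnings.warn(
            f".dist-info subdirectory {name!r} is not in the registry",
            UserWarning,
            stacklevel=2,
        )
    return unknown


print(check_dist_info_subdirs(["licenses", "sboms", "_java", "extras"],
                              allow_private_prefix=True))
```

Returning the unknown names instead of raising keeps the decision with the caller, so a stricter tool could still turn them into an error.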

Finally, I am wondering whether it would be worthwhile to also specify the PEP-517 stage at which the dist-info subdirectories are created, or is it abundantly clear that this is inevitably the prepare_metadata_for_build_wheel stage?


So it looks like the discussion is continuing on how best to handle files outside of the project root, and some other clarification work needs to be done, too. My time to work on this PEP is likely to be minimal until post-PyCon US (due to other time-sensitive projects coming up and vacation) and I’d like the portions of this PEP that seem to have solidified to be actionable (I’ve had draft pull requests ready for a bit).

I’m going to refactor the statically defined SBOM mechanism out of the PEP and will submit that as a separate PEP after PyCon US. Then we can completely handle all situations, including making a private-use reservation of subdirectories under .dist-info.

I’ve made the necessary changes to remove the [dist-info.files] table in this pull request, please take a look.


Apologies for the delay (lock files, illness, and a kid turning 1 take up a lot of time :sweat_smile:), but I’m happy to say that I accept PEP 770!

With the reduced scope to *.dist-info/sboms along with codifying the directories that are reserved in *.dist-info, I don’t think this PEP is controversial. As well, I think it will be useful very quickly for build back-ends to start including SBOMs based on what they use to build what’s contained in a wheel.

Thanks to @sethmlarson for all the work on this and everyone who gave constructive feedback!
