PEP 639, Round 2: Improving license clarity with better package metadata

I would very much like to start using License-File more thoroughly. If there is some controversy wrt the rest of the PEP, do you think we could at least extend the metadata with that attribute?

Sorry for the very long delay; I’ve been distracted with helping implement the new PEP rendering system and requested followup enhancements, PEP editor responsibilities, my research and other projects. I have made an initial round of updates to the headers, syntax and formatting of the PEP, and was preparing the second round of more substantive changes as requested here, but preparing for and attending PyCon has consumed virtually all my time in recent days.

Once PyCon’s over in a few days and I catch up on my large backlog from that, I’ll make finishing that a priority, and execute the final followup moving the appendices and other ancillary materials out of the PEP to shorten its length by a further 2/3rds and make it much easier to navigate. Then, I’ll make a new thread, Brett can review and it can be discussed in (hopefully) final form.

It seemed relatively uncontroversial on the previous thread and was actually implemented by Setuptools (albeit in a form that differs somewhat from what is currently specified in this PEP), but most of the recent discussion in this thread has centered around the License-File core metadata field and how it should be specified in pyproject.toml, so that is the primary part that appears to not be fully settled at the moment.


I’m biased since I’ve already implemented it, but I think the current design is good.

If tools are implementing something ahead of this PEP then it would be good to capture that fact, @CAM-Gerlach , to show that the tool authors have implicitly voted in favour of something.


Good point, will do. I do mention in the Appendix that Setuptools has also already implemented support for License-File as discussed in the PEP (with the exception of not flattening them and namespacing them in the licenses directory, which contributors/maintainers on those issues expressed interest in, but was not yet part of the PEP at the time). I’ll add a mention of both to the Rationale along with the other changes.


Any update on this?


Sorry for the long delay again; I had a terrific PyCon meeting everyone but was so exhausted and behind on my actual research work that I had to take a long break to focus on that, and was busy working on Python docs stuff and other things.

At long last, the next (and hopefully final) round of substantive PEP 639 updates is up for editorial review!

You can check out the preview here:

It implements all the changes discussed and agreed to here, including all those mentioned in my comment above, as well as a couple related refinements. I welcome your editorial feedback there, though if possible, please remember to try to keep substantive content-relevant discussion centralized over here.

Once that’s reviewed and merged, I’ll have a final followup PR that will shorten the total PEP length by around a further two thirds (on top of the significant earlier reductions) by moving the appendices, previously-normative “Mapping License Classifiers to SPDX Identifiers” section, and the full Rejected Ideas (except for a concise summary of the most important ones) to separate, linked ancillary documents, which should be quite straightforward since I’ve already laid the groundwork in python/peps#2531.

At that point, I’ll create and link a new Round 3 thread for what I hope will be a final round of discussion. Thanks again for bearing with us and being a key part of this lengthy but hopefully ultimately fruitful process!


Substantive content changes
  • Instead of adding a new top-level license-expression key for the license expression in the [project] table of the pyproject.toml source metadata, the PEP now specifies using the top-level string value of the license key for this purpose, which is what PEP 621 reserved it for. It updates the license-expression and license key specs, the examples, rejected ideas and other sections accordingly.
  • Since this makes it impossible to convert a legacy license key to the new License-Expression field at build time, and that’s not really advisable anyway, it drastically simplifies the normative Converting Legacy Metadata section to just a single normative statement, and updates/removes other mentions of it accordingly.
  • Likewise, it simplifies and refines the guidance in the Mapping classifiers to SPDX identifiers section to be more general and less focused on build time, and also allows tools to ignore redundant parent classifiers (Note: This section will be moved to an external appendix in the next PR, but I didn’t move it here for ease of review)
  • The license_files directory was renamed to licenses at the request of @brettcannon and to simplify things a touch
  • The specified handling of the license.file key was a bit confused by an (apparently quite common) misunderstanding about how the specified file is used (its text is injected directly under the License field in core metadata, rather than the file being included in distribution archives or its path being specified in metadata), due to it being rather underspecified in PEP 621; this revision corrects that.
  • Bump the core metadata version to reflect acceptance of PEP 685 as Metadata 2.3
  • Bump the SPDX license list version to be up to date
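
As a rough illustration of the first and fifth points, here is a minimal sketch of how a tool might tell the two forms of the license key apart. It works on already-parsed [project] tables (plain dicts, as a TOML parser would return them). The classify_license_key helper, the kind labels and the sample values are all hypothetical and not taken from the PEP text.

```python
from typing import Optional, Tuple


def classify_license_key(project: dict) -> Tuple[str, Optional[str]]:
    """Return (kind, value) for the `license` key of a parsed [project] table.

    Hypothetical helper for illustration only; not an API of any real tool.
    """
    value = project.get("license")
    if value is None:
        return ("missing", None)
    if isinstance(value, str):
        # Top-level string value: treated as the license expression.
        return ("expression", value)
    if isinstance(value, dict):
        # Legacy table form. For `file`, the file's text is injected under
        # the License core metadata field; it is not converted to an expression.
        if "file" in value:
            return ("legacy-file", value["file"])
        if "text" in value:
            return ("legacy-text", value["text"])
    return ("invalid", None)


# TOML equivalent:  license = "MIT OR Apache-2.0"
new_style = {"name": "example", "license": "MIT OR Apache-2.0"}

# TOML equivalent:  license = {file = "LICENSE.txt"}
legacy = {"name": "example", "license": {"file": "LICENSE.txt"}}

print(classify_license_key(new_style))
print(classify_license_key(legacy))
```

Note that the sketch never tries to turn the legacy table into an expression, in line with the simplified Converting Legacy Metadata section.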

Significant non-normative/editorial changes
  • Mention implementation of drafts of this PEP in Hatch and Setuptools, and also describe previous efforts in Setuptools and Wheel that this PEP builds on more clearly, as suggested on the discussion thread
  • Reference the canonical project [source] metadata PyPA spec instead of PEP 621, use clearer language surrounding that, and update a few references to other PEPs
  • Revise the Motivations and Rationale to be less duplicative/redundant, more balanced and focused more specifically on their respective areas (providing some background and describing the problem, and introducing and justifying the proposed solution, respectively)
  • Add a short blurb making explicit the use of RFC 2119 terminology (MUST, SHOULD, etc)
  • Update PEP headers
  • Various other minor revisions to correct typos and other issues, clarify and simplify the text, and improve source formatting.


Released: Release Hatchling v1.5.0 · pypa/hatch · GitHub

Thanks—just to note, though, while I hope this is the final draft (or very near it), it may not necessarily be if additional changes are requested by @brettcannon or the community before the PEP is approved.

Thank you, this version looks great. As far as I can tell, it addresses all the pain points with license files and vendored dependencies that we just ran into in dealing with multiple license files & vendored dependencies in meson-python and scipy.

It’s not 100% clear to me what you plan to move out, so let me just ask to not remove Appendix: License Expression Examples. It was very helpful in clarifying or confirming exactly what some sentences in the PEP meant. Examples like that are always easier to parse than paragraphs of text.


Fantastic! Really glad to hear that.

Sorry, I left some of the details on that point to the PR. I definitely agree that a well-chosen example is crucial to understanding even a carefully written specification or other reference material (as a side note, I also have a PR in work to add the per-key examples from PEP 621 to the “Declaring Project Metadata” spec, for the same reason). Of course, ideally I’d like to do my best to clear up ambiguity and lack of clarity in the spec text too, if you have any particular feedback on that.

I was thinking of potentially moving the fully worked, detailed examples, or at least the “advanced example”, to a directly-linked ancillary document to reduce the length of the PEP itself, and was hoping to be able to break it up into smaller self-contained examples to go along with each individual spec section. However, looking at it more carefully, the real value of the examples as I see them is in being fully-worked, complete and end-to-end, rather than each in isolation.

As such, what about leaving the basic example in the PEP, and moving the advanced and expression examples to a separate document directly linked from the top of the PEP examples section? Or is the advanced example necessary to clear up most of the uncertainties you found, and it’s not visible enough in a separate linked document? To make either option easier, I eliminated the mostly useless and no longer relevant Conversion example (since most of that spec was elided from the PEP itself), so the section is not quite as long and more focused on its more valuable portions.

Please keep the whole examples section, it’s always going to be more unambiguous than sentences. And it’s not that long. Imho examples like the ones you included are always in PEPs.

An example is “root license directory” in this sentence: "As specified by this PEP, its value is also that file’s path relative to the root license directory in both installed projects and the standardized distribution package types."

This could mean pkgname, a specified directory in metadata, pkgname-xxx.dist-info/, pkgname-xxx.dist-info/licenses/, or something else. You can go to terminology and that helps, but it’s still not 100% clear (e.g. is “project root directory” the one pyproject.toml is in, or the top-level installed directory).

Even if you tweak the text, such puzzling is unhelpful - the examples are essential.


Okay, sure, I’ll keep them all then. Would you suggest moving them back to a body section rather than an appendix? I was mostly concerned with making the PEP easier to navigate after a number of community members pointed out how long it was relative to other PEPs, which made it difficult to read and review, and I didn’t think too much of requiring one extra click to see the full examples, but they are more important than I perhaps gave them credit for, and they don’t make that big of a difference in the PEP’s length relative to other less important things.

One additional thing I can do (in the morning) now that we have full Sphinx support courtesy PEP 676 is make the current Terminology definition list a .. glossary::, and then link the :term:s wherever they are used, so people can jump quickly to the definition.

In this particular case, the answer is somewhat complex as it is the project root directory (pyproject.toml directory) for source trees and sdists, and .dist-info/licenses for wheels and installed projects. The root license directory definition does mention this, but I can also link the License Files in Project Formats section from the glossary entry, which is more explicit and detailed about this.
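
To make that concrete, here is a small sketch of the rule as I described it above: the project root for source trees and sdists, and .dist-info/licenses for wheels and installed projects. The license_file_location function, its format labels and the sample names are hypothetical, and it assumes the License-File value is kept as a relative path (not flattened) under the licenses directory.

```python
from pathlib import PurePosixPath


def license_file_location(license_file: str, fmt: str, dist_info: str = "") -> PurePosixPath:
    """Resolve a License-File value against the root license directory.

    Hypothetical helper; `fmt` labels are made up for this sketch.
    """
    if fmt in ("source-tree", "sdist"):
        # Root license directory is the project root (where pyproject.toml lives),
        # so the value is already the path relative to it.
        return PurePosixPath(license_file)
    if fmt in ("wheel", "installed"):
        # Root license directory is <name>-<version>.dist-info/licenses.
        return PurePosixPath(dist_info) / "licenses" / license_file
    raise ValueError(f"unknown format: {fmt!r}")


print(license_file_location("LICENSE.txt", "sdist"))
print(license_file_location("vendored/foo/LICENSE", "wheel", "pkgname-1.0.dist-info"))
```

The License-File value itself stays the same in every format; only the directory it is resolved against changes.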

I’d personally prefer to put them in the main text, yes. But either way works.

So should I wait for the 3rd round, or did you need my specific input on the current PR?

For what it’s worth, I think this PEP needs to be updated to change the Metadata-Version to 2.4. It currently states that it changes it to 2.3, but 2.3 was introduced by PEP 685.
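
For reference, a minimal sketch of what reading the affected fields could look like, assuming the version does end up as 2.4. Core metadata uses an email-header-like format, so the standard library email parser can read it. The sample project name, version, expression and file paths are made up for illustration.

```python
from email import message_from_string

# Hypothetical core metadata for illustration; values are not from a real project.
METADATA = """\
Metadata-Version: 2.4
Name: example
Version: 1.0
License-Expression: MIT OR Apache-2.0
License-File: LICENSE.txt
License-File: vendored/foo/LICENSE
"""

msg = message_from_string(METADATA)

metadata_version = msg["Metadata-Version"]
license_expression = msg["License-Expression"]
# License-File may appear multiple times, so collect every occurrence.
license_files = msg.get_all("License-File", [])

print(metadata_version, license_expression, license_files)
```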


Sorry, I was at SciPy last week, and then caught COVID there, so I’ve been knocked out for a bit and am now catching up with my backlog again…

I’ve updated the PR further with Jelle’s editorial fixes and feedback. I was going to make the Terminology section into a Sphinx glossary and link the terms on first/prominent use, but on second thought I’ll just address any further immediate feedback, push the changes and take care of that in a followup PR, so as to not delay and scope-creep this one further.

If others have any immediate editorial or implementation suggestions on the specific changes in the PR, please do give it a review; otherwise I’ll merge it in a few days’ time and proceed with the final followups (the glossary and pulling out/slimming down the appendices), after which I’ll create a new thread for a third and hopefully final round of substantive content discussion.

Up to you of course, but based on what you’ve said previously, the current plan and our general practice, if you do happen to have immediate editorial or implementation detail suggestions on the specific changes, you could make them as a PR review; otherwise you’re probably better off waiting a bit until it’s all done (and much shorter) and I create the new Round 3 thread.

I’m also looking forward to hearing if your colleagues had any useful feedback!

Yup, that’s one of the things the current update PR does:

:+1: If I have time I’ll read your PR or what gets merged, otherwise I can wait for round 3.

No specific feedback when I explained what in general was going to be included (SPDX and the license files). I’ll ask again when the shorter version is available.


Any update on this?

Sorry for the continued delay; I’ve been pretty busy with CPython docs stuff, the Artemis 1 moon launch and some other things. I’ll take a crack at doing the last few updates during the core dev sprint; @brettcannon is here as well and will be keeping an eye on it from his side.
