Hello,
A new draft of PEP 639 has been published.
It proposes the adoption of SPDX license expression syntax as a way to declare the licenses of Python packages.
With Core Metadata version 2.4 the following changes apply:
- a new field License-Expression must be present and must contain a valid SPDX expression
- a new field License-File may be present zero or more times, and each must contain one path to a license file declared by the user. All files, either matched by globs, or literal paths, must be included in the distribution
- license files are stored in the .dist-info/licenses/ subdirectory of the produced wheel.
License classifiers and the License field are deprecated.
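As a rough illustration of the fields listed above, here is a minimal sketch that renders the license-related lines of a Core Metadata 2.4 file. The helper name `render_license_metadata` and the sample values are hypothetical; a real build backend would also validate the expression and emit the remaining metadata fields.

```python
def render_license_metadata(expression, license_files):
    """Return the license-related Core Metadata lines as one string.

    Hypothetical helper, for illustration only.
    """
    lines = ["Metadata-Version: 2.4", f"License-Expression: {expression}"]
    # License-File may appear zero or more times, one path per entry.
    lines.extend(f"License-File: {path}" for path in license_files)
    return "\n".join(lines) + "\n"


metadata = render_license_metadata(
    "MIT OR Apache-2.0", ["LICENSE", "licenses/NOTICE.txt"]
)
print(metadata)
```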
From the user’s perspective:
- declaration of the license is performed via a top-level string value of the key license in the [project] table of pyproject.toml. It has to contain a valid SPDX expression and maps to the License-Expression field of the Core Metadata.
- specification of the license files in the distribution is done via a list under either license-files.globs or license-files.paths, which are mutually exclusive. If license-files is not present in the metadata, there’s a default value tools must assume (license-files.globs = ["LICEN[CS]E*", "COPYING*", "NOTICE*", "AUTHORS*"]). This maps to License-File entries in the Core Metadata.
The table values for the license key in the [project] table of pyproject.toml are deprecated. Reasoning is part of the PEP.
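To make the license-files rules more concrete, here is a small sketch of how a tool might interpret that key. The names `resolve_license_files` and `DEFAULT_GLOBS` are hypothetical, the `project` dict stands in for an already parsed [project] table, and matching is done against a plain list of file names rather than the real project tree, so this is an approximation of the behaviour described, not what any backend actually does.

```python
from fnmatch import fnmatchcase

# Default globs as quoted from the draft above.
DEFAULT_GLOBS = ["LICEN[CS]E*", "COPYING*", "NOTICE*", "AUTHORS*"]


def resolve_license_files(project, available_files):
    """Pick license files for a parsed [project] table (hypothetical helper).

    `project` is a dict representing the [project] table;
    `available_files` is a list of file names in the project root.
    """
    config = project.get("license-files", {})
    if "globs" in config and "paths" in config:
        raise ValueError(
            "license-files.globs and license-files.paths are mutually exclusive"
        )
    if "paths" in config:
        # Literal paths must all exist, since they must ship in the distribution.
        missing = [p for p in config["paths"] if p not in available_files]
        if missing:
            raise ValueError(f"declared license files not found: {missing}")
        return list(config["paths"])
    # Fall back to the default globs when none are declared.
    globs = config.get("globs", DEFAULT_GLOBS)
    return sorted(
        name
        for name in available_files
        if any(fnmatchcase(name, pattern) for pattern in globs)
    )


project = {"name": "example", "license": "MIT"}
files = ["LICENSE.txt", "README.md", "AUTHORS", "setup.cfg"]
print(resolve_license_files(project, files))
```

With no license-files key, the default globs pick up LICENSE.txt and AUTHORS and skip the other files.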
Please review the examples in the PEP to get a practical idea of the proposed changes.
There are no immediate hard breaks in backwards compatibility.
The PEP specifies that when distributions contain the new License-Expression field, PyPI must validate and reject uploads that don’t conform to the specification.
The PEP leaves tools a great margin of freedom regarding the advice they produce when they detect incorrect license expressions.
Also, the changes will require updates to a few additional specifications, listed in the PEP.
There are two open issues listed that may require further debate:
- Should the License field be back-filled, or mutually exclusive?
- Should custom license identifiers be allowed?
In the previous thread @pf_moore raised a concern about the recommended tool to parse and normalize SPDX expressions.
The PEP recommends the license-expression package on PyPI for that.
If it is deemed insufficient (as @ofek mentioned with regard to hatchling), that’s a valid concern.
Since SPDX has become a de facto standard for license declaration in recent years, I gather that could be a candidate for creating an official lightweight library that would only do parsing and normalising of SPDX expressions.
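To give a sense of how small the parsing part of such a library could be, here is a toy validator for a subset of the SPDX expression grammar (identifiers, the `+` suffix, WITH, AND, OR and parentheses, with uppercase operators only). The names `validate_expression`, `KNOWN_LICENSES` and `KNOWN_EXCEPTIONS` are hypothetical, the identifier sets are a tiny hand-picked sample rather than the SPDX license list, and normalisation and LicenseRef handling are left out entirely.

```python
import re

# Tiny illustrative samples; a real library would ship the full SPDX lists.
KNOWN_LICENSES = {"MIT", "Apache-2.0", "BSD-3-Clause", "GPL-2.0-only", "GPL-2.0-or-later"}
KNOWN_EXCEPTIONS = {"Classpath-exception-2.0"}

# A token is a parenthesis or an identifier with an optional trailing "+".
TOKEN = re.compile(r"\s*(\(|\)|[A-Za-z0-9.\-]+\+?)")


def tokenize(expression):
    tokens, pos = [], 0
    text = expression.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character near position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def validate_expression(expression):
    """Return True if valid in this toy grammar, else raise ValueError."""
    tokens = tokenize(expression)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def parse_or():  # lowest precedence
        parse_and()
        while peek() == "OR":
            advance()
            parse_and()

    def parse_and():
        parse_with()
        while peek() == "AND":
            advance()
            parse_with()

    def parse_with():
        if peek() == "(":
            advance()
            parse_or()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            advance()
            return
        parse_license()
        if peek() == "WITH":
            advance()
            if peek() not in KNOWN_EXCEPTIONS:
                raise ValueError(f"unknown exception: {peek()!r}")
            advance()

    def parse_license():
        token = peek()
        if token is None or token in {"AND", "OR", "WITH", ")"}:
            raise ValueError("expected a license identifier")
        name = token[:-1] if token.endswith("+") else token
        if name not in KNOWN_LICENSES:
            raise ValueError(f"unknown license identifier: {name!r}")
        advance()

    parse_or()
    if peek() is not None:
        raise ValueError(f"unexpected token: {peek()!r}")
    return True


print(validate_expression("MIT OR (Apache-2.0 AND BSD-3-Clause)"))
```

The recursive-descent layout mirrors the operator precedence of SPDX expressions (WITH binds tighter than AND, which binds tighter than OR), which is most of what a parsing-only library would need beyond the identifier data.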
I’m looking forward to your inputs.