Hello,
A new draft of PEP 639 has been published.
It proposes the adoption of SPDX license expression syntax as a way to declare the licenses of Python packages.
With Core Metadata version 2.4 the following changes apply:
- a new field License-Expression must be present and must contain a valid SPDX expression
- a new field License-File may be present zero or more times, and each must contain one path to a license file declared by the user. All files, whether matched by globs or listed as literal paths, must be included in the distribution
- license files are stored in the .dist-info/licenses/ subdirectory of the produced wheel.
License classifiers and the License field are deprecated.
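To make the metadata side concrete, here is a minimal sketch of what a build backend might emit for the two new fields, and where a license file would end up in the wheel. The helper names and the example-1.0.dist-info directory name are made up for illustration; they are not taken from the PEP or from any existing tool.

```python
import posixpath


def license_metadata_lines(expression, license_files):
    """Build the license-related Core Metadata 2.4 header lines."""
    lines = ["Metadata-Version: 2.4", f"License-Expression: {expression}"]
    # License-File may appear zero or more times, one path per entry.
    lines.extend(f"License-File: {path}" for path in license_files)
    return lines


def wheel_license_path(dist_info_dir, license_file):
    """Location of a license file inside the wheel archive."""
    return posixpath.join(dist_info_dir, "licenses", license_file)


files = ["LICENSE", "NOTICE.txt"]
print("\n".join(license_metadata_lines("MIT OR Apache-2.0", files)))
for name in files:
    # "example-1.0.dist-info" is a made-up directory name for illustration.
    print(wheel_license_path("example-1.0.dist-info", name))
```

The sketch does not validate the expression; it only shows the shape of the resulting fields.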
From the user’s perspective:
- declaration of the license is performed via a top-level string value of the license key in the [project] table of pyproject.toml. It has to contain a valid SPDX expression and maps to the License-Expression field of the Core Metadata.
- specification of the license files in the distribution is done via a list under either license-files.globs or license-files.paths, which are mutually exclusive. If license-files is not present in the metadata, there’s a default value tools must assume (license-files.globs = ["LICEN[CS]E*", "COPYING*", "NOTICE*", "AUTHORS*"]). This maps to License-File entries in the Core Metadata.
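As a rough illustration of the license-files rules above, the following sketch resolves license files for a project directory. It assumes the license-files table has already been read into a dict (or is None when the key is absent); the function name and the choice to raise on a missing literal path are my own assumptions, not something the PEP prescribes.

```python
import glob
import os
import tempfile

# Default the draft says tools must assume when license-files is absent.
DEFAULT_GLOBS = ["LICEN[CS]E*", "COPYING*", "NOTICE*", "AUTHORS*"]


def resolve_license_files(root, license_files=None):
    """Return sorted, project-relative paths of the license files."""
    if license_files is None:
        license_files = {"globs": DEFAULT_GLOBS}
    if "globs" in license_files and "paths" in license_files:
        raise ValueError("license-files.globs and license-files.paths are mutually exclusive")
    if "paths" in license_files:
        # Assumption: a literal path that does not exist is an error.
        for path in license_files["paths"]:
            if not os.path.isfile(os.path.join(root, path)):
                raise FileNotFoundError(path)
        return sorted(license_files["paths"])
    matched = set()
    for pattern in license_files.get("globs", []):
        # glob.escape keeps special characters in the root path literal.
        for hit in glob.glob(os.path.join(glob.escape(root), pattern)):
            if os.path.isfile(hit):
                matched.add(os.path.relpath(hit, root))
    return sorted(matched)


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as demo_root:
    for name in ("LICENSE", "LICENCE.txt", "NOTICE", "README.md"):
        open(os.path.join(demo_root, name), "w").close()
    found = resolve_license_files(demo_root)
print(found)
```

Each returned path would then become one License-File entry in the Core Metadata.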
The table values for the license key in the [project] table of pyproject.toml are deprecated. The reasoning is given in the PEP.
Please review the examples to get a practical idea of the proposed changes.
There are no immediate hard breaks in backwards compatibility.
The PEP specifies that when distributions contain the new License-Expression field, PyPI must validate it and reject uploads that don’t conform to the specification.
The PEP leaves tools a wide margin of freedom regarding the advice they give when an incorrect license expression is detected.
Also, the changes will require updates to a few additional specifications, which are listed in the PEP.
There are two open issues listed that may require further debate:
- Should the License field be back-filled, or mutually exclusive?
- Should custom license identifiers be allowed?
In the previous thread @pf_moore raised a concern about the recommended tool to parse and normalize SPDX expressions.
The PEP recommends license-expression (on PyPI) for that.
If it is deemed insufficient (as @ofek mentioned with regard to hatchling), that’s a valid concern.
Since SPDX has become a de facto standard for license declaration in recent years, I gather this could be a candidate for creating an official lightweight library that would only parse and normalize SPDX expressions.
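To give a feel for how small such a library could be, here is a rough sketch of a parser and normalizer. It is not the API of license-expression or of any existing package. The identifier tables are a tiny hypothetical subset of the SPDX lists, operators are accepted case-insensitively as a simplification, and custom identifiers are left out since that is still an open issue.

```python
import re

# Hypothetical subset; a real library would ship the full SPDX lists.
KNOWN_LICENSES = {
    "mit": "MIT",
    "apache-2.0": "Apache-2.0",
    "bsd-3-clause": "BSD-3-Clause",
    "gpl-2.0-only": "GPL-2.0-only",
}
KNOWN_EXCEPTIONS = {"classpath-exception-2.0": "Classpath-exception-2.0"}

TOKEN = re.compile(r"\s*(\(|\)|[A-Za-z0-9.\-]+\+?)")


def tokenize(text):
    text = text.strip()
    pos, tokens = 0, []
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"cannot tokenize {text[pos:]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def normalize(expression):
    """Validate an expression and return it in normalized form."""
    tokens = tokenize(expression)
    pos = 0

    def peek():
        return tokens[pos].upper() if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        pos += 1
        return tokens[pos - 1]

    def parse_or():  # lowest precedence
        parts = [parse_and()]
        while peek() == "OR":
            take()
            parts.append(parse_and())
        return " OR ".join(parts)

    def parse_and():
        parts = [parse_atom()]
        while peek() == "AND":
            take()
            parts.append(parse_atom())
        return " AND ".join(parts)

    def parse_atom():
        token = take()
        if token == "(":
            inner = parse_or()
            if take() != ")":
                raise ValueError("expected ')'")
            return f"({inner})"
        plus = token.endswith("+")
        key = token[:-1] if plus else token
        if key.lower() not in KNOWN_LICENSES:
            raise ValueError(f"unknown license identifier: {key!r}")
        result = KNOWN_LICENSES[key.lower()] + ("+" if plus else "")
        if peek() == "WITH":  # WITH binds to a single license only
            take()
            exception = take()
            if exception.lower() not in KNOWN_EXCEPTIONS:
                raise ValueError(f"unknown exception: {exception!r}")
            result += " WITH " + KNOWN_EXCEPTIONS[exception.lower()]
        return result

    result = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token: {tokens[pos]!r}")
    return result


print(normalize("mit or (apache-2.0 and bsd-3-clause)"))
```

Most of the weight of a real implementation would be in keeping the license and exception lists up to date rather than in the parsing itself.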
I’m looking forward to your input.