The API to access the data is not the problem. The problem is that PyPI does not currently record the information: no metadata is collected per distribution artifact. In the case of licensing information, making sense of metadata prior to version 2.4 is a lost cause, unless you want to try to interpret free-text licensing information in a gazillion different formats, falling back to ambiguously defined classifiers. Metadata 2.4 makes it much easier but still not trivial: the License-Expression metadata field is not mandatory. Licensing information can instead be expressed by linking to license text via License-File fields, or not be present at all.
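A minimal sketch of the fallback order described above, assuming the raw core metadata text (a METADATA or PKG-INFO file) is already at hand. The function name describe_license and the returned labels are made up for illustration; only the field names come from the core metadata specification.

```python
from email.parser import Parser


def describe_license(metadata_text: str) -> tuple[str, list[str]]:
    """Classify how a distribution's metadata expresses its license.

    Returns a (kind, values) pair; the kind labels are illustrative only.
    """
    msg = Parser().parsestr(metadata_text)

    # Metadata 2.4: an SPDX expression, when present, is the clearest signal.
    expression = msg.get("License-Expression")
    if expression:
        return "expression", [expression.strip()]

    # Otherwise the license may only be available as shipped files.
    files = msg.get_all("License-File") or []
    if files:
        return "files", files

    # Pre-2.4 metadata: free text and loosely defined classifiers.
    legacy = []
    if msg.get("License"):
        legacy.append(msg["License"])
    legacy += [
        c for c in msg.get_all("Classifier") or [] if c.startswith("License ::")
    ]
    if legacy:
        return "legacy", legacy

    return "missing", []
```

Even the "expression" case is only a hint about the files actually shipped, so a caller would still need to decide how much to trust each kind.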
Except for the one reply from Hynek, who hasn't reacted to your suggestion that his suffering should be endured for the collective good, the only replies are mine and yours, so I don't know what you are basing your conclusion on.
I was not the one to link time spent to suffering. I don't consider the time I spend working on open source projects as suffering. I was indeed very surprised when Hynek jumped from time spent to suffering, and that motivated my next message on the topic, where I lightly suggested that if time spent working on FOSS contributions is perceived as suffering, maybe it is time to revisit something.
I think that the accepted view that maintainers of open source projects have any responsibility toward the users of their code, to the extent that maintaining the project is more important than their well-being, is something that needs to change. Comments like yours, even if intended only as jokes, only reinforce this view.

This is well off-topic now, though. Please direct message me if you want to continue.
You are accusing me of mean or inconsiderate behavior on a public forum. I prefer to defend my behavior on the same forum.
The API to access the data is not the problem. The problem is that PyPI does not currently record the information: no metadata is collected per distribution artifact. In the case of licensing information, making sense of metadata prior to version 2.4 is a lost cause, unless you want to try to interpret free-text licensing information in a gazillion different formats, falling back to ambiguously defined classifiers. Metadata 2.4 makes it much easier but still not trivial: the License-Expression metadata field is not mandatory. Licensing information can instead be expressed by linking to license text via License-File fields, or not be present at all.
As has been pointed out repeatedly in prior discussions, license information reported for packages on PyPI is at best a strong hint as to the copyright license(s) covering downloads for that project. Anyone concerned about the actual licenses for all of the files contained within the downloads in each project's release needs to consult the files shipped in those projects, or their upstream developers' documentation. Recent metadata changes improve this, but do not address all possible complexities of applying copyright licenses in projects.

You are accusing me of mean or inconsiderate behavior on a public forum. I prefer to defend my behavior on the same forum.
I intended no such accusation, and I apologise that it was perceived that way. For anyone else reading this, I'll clearly state that I don't believe Daniele was being mean or inconsiderate.

I was not the one to link time spent to suffering. I don't consider the time I spend working on open source projects as suffering. I was indeed very surprised when Hynek jumped from time spent to suffering
That's fair. They aren't directly the same thing, though I do understand how they logically link together (primarily because the time spent in this case is time that didn't need to be spent except that we forced it to be, and that is what maintainers often refer to as "suffering").

I think that the accepted view that maintainers of open source projects have any responsibility toward the users of their code, to the extent that maintaining the project is more important than their well-being, is something that needs to change. Comments like yours, even if intended only as jokes, only reinforce this view.
Again, I apologise that my comments were seen that way. I certainly don't endorse suffering in any real sense. It's important to understand, though, that the term is commonly used in OSS maintainer circles to refer broadly to unnecessary demands being placed upon maintainers, and in that sense most maintainers do feel a responsibility to "suffer" for their projects and their users. We hope the other parts of the experience are rewarding enough to make up for it, and they often are, but we also help by not minimizing the additional work we cause when our own contributions don't go smoothly.[1]
[1] And for complete clarity, this is not directed at Daniele specifically, but at all of us on this forum.
@daniele @steve.dower you're derailing the discussion here. This is not a topic I can sensibly split, because there is no category that would fit your back and forth here. So let's leave it at that, otherwise I'll have to hide the off-topic messages that users keep flagging.

If you ask me, it was irresponsible for the Hatchling maintainers to start unconditionally emitting metadata version 2.4 while packaging tools are still catching up.
Nope, I was quite responsible in fact and waited to release the new version of Hatchling until others confirmed that they would not be broken: Hatchling v1.27.0 changelog dead link + no tag · Issue #1842 · pypa/hatch · GitHub

There is an open PR to add support for metadata version 2.4 to twine. From the looks of it, it should be merged soon. Switch to packaging for parsing metadata and support metadata 2.4 by dnicolodi · Pull Request #1180 · pypa/twine · GitHub
FTR, I was just checking the state of it and realized that it was merged 3 weeks ago: Switch to packaging for parsing metadata and support metadata 2.4 by dnicolodi · Pull Request #1180 · pypa/twine · GitHub. So it seems like the last bit everybody's waiting for is a new Twine release, unless I'm missing something.
… and Twine 6.1.0 was released yesterday.
And I updated pypi-publish about an hour ago: @webknjaz.me on Bluesky. So we're all set here, I think.
Thank you, @webknjaz!
With that, I just sent the final PR to the PEP: PEP 639: Mark Final by befeleme · Pull Request #4227 · python/peps · GitHub
Thanks all, I just made a release with these changes:
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -2,7 +2,7 @@
build-backend = "hatchling.build"
requires = [
"hatch-vcs",
- "hatchling",
+ "hatchling>=1.27",
]
[project]
...
license = "BSD-3-Clause"
+license-files = [ "LICENSE" ]
...
classifiers = [
- "License :: OSI Approved :: BSD License",
"Programming Language :: Python",
"Programming Language :: Python :: 3 :: Only",
"Programming Language :: Python :: 3.9",
And it shows up on PyPI with:
- License Expression: BSD-3-Clause
SPDX License Expression
This depended on (at least) these updates:
- packaging 24.2
- Hatchling 1.27
- Twine 6.1.0
- PyPI publish GitHub Action v1.12.4
- build-and-inspect-python-package v2.12.0
Thanks to everyone for making it happen, and especially @ksurma for all the PEP+spec work!
And pip 25.0, which was cut a few hours ago, supports displaying License-Expression in pip show and pip install --report:
$ pip install prettytable==3.13.0
$ pip show prettytable
Name: prettytable
Version: 3.13.0
Summary: A simple Python library for easily displaying tabular data in a visually appealing ASCII table format
Home-page: https://github.com/prettytable/prettytable
Author:
Author-email: Luke Maurits <luke@maurits.id.au>
License-Expression: BSD-3-Clause
Location: /home/ichard26/.local/lib/python3.12/site-packages
Requires: wcwidth
Required-by:
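The same field can also be read programmatically. A small sketch using the standard library's importlib.metadata, assuming the distribution is installed in the current environment; the helper name installed_license_expression is hypothetical.

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, metadata


def installed_license_expression(dist_name: str) -> str | None:
    """Return the License-Expression of an installed distribution, if any."""
    try:
        meta = metadata(dist_name)
    except PackageNotFoundError:
        # Nothing by that name is installed in this environment.
        return None
    # The field is optional, so this is None for metadata that lacks it.
    return meta.get("License-Expression")
```

For the prettytable release shown above, this would be expected to return the same BSD-3-Clause value that pip show prints, provided that version is installed.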
Could we please have a reference implementation in Python, documented somewhere, for the glob part that complies with the mandatory requirements of the PEP? (Maybe an attachment? Or something in the PyPA docs?)
I feel that we departed from the original intention of "let's document whatever stdlib's glob does, so that we can implement it in other languages" to something that requires a lot more validation, which is not implemented by the stdlib itself.
We received something similar to the following in a contribution to setuptools: Validate license-files glob patterns by cdce8p · Pull Request #4841 · pypa/setuptools · GitHub (thanks @cdce8p)[1]
import os
import re
from glob import glob


def find_pattern(pattern: str) -> list[str]:
    """
    >>> find_pattern("/LICENSE.MIT")
    Traceback (most recent call last):
    ...
    ValueError: Pattern '/LICENSE.MIT' should be relative...
    >>> find_pattern("../LICENSE.MIT")
    Traceback (most recent call last):
    ...
    ValueError: Pattern '../LICENSE.MIT' cannot contain '..'...
    >>> find_pattern("LICEN{CSE*")
    Traceback (most recent call last):
    ...
    ValueError: Pattern 'LICEN{CSE*' contains invalid characters...
    """
    # Parent-directory references are rejected outright.
    if ".." in pattern:
        raise ValueError(f"Pattern {pattern!r} cannot contain '..'")
    # Absolute paths (POSIX or Windows drive style) are rejected.
    if pattern.startswith((os.sep, "/")) or ":\\" in pattern:
        raise ValueError(
            f"Pattern {pattern!r} should be relative and must not start with '/'"
        )
    # Only alphanumerics, '_', '-', '.', '/', and the glob characters are allowed.
    if re.match(r'^[\w\-\.\/\*\?\[\]]+$', pattern) is None:
        raise ValueError(
            f"Pattern '{pattern}' contains invalid characters. "
            "https://packaging.python.org/en/latest/specifications/pyproject-toml/#license-files"
        )
    found = glob(pattern, recursive=True)
    # A pattern that matches nothing is treated as an error.
    if not found:
        raise ValueError(f"Pattern '{pattern}' did not match any files.")
    return found
Is it enough/complete/correct? (at first glance I would say yes by looking at the text of the PEP, but I would like a second opinion)
[1] The example code is a modification of the original contribution.
Sorry to revive this, but the wording in PEP 639 is breaking some projects which need to fetch the LICENSE file from a parent directory (because the Python bindings are part of a larger project).
It is not obvious how to work around this problem in the package configuration. Am I missing something?
See [Request for Reverting Intentional Breaking Change] New license file validation breaks projects with non-standard layout · Issue #4892 · pypa/setuptools · GitHub for the corresponding setuptools GH issue.
See also a similar issue in Flit. It seems that relative parent-directory paths in backends were previously causing unspecified & potentially broken behaviour, whereas Core Metadata 2.4 prohibits them. A project I maintain resolved this by adding a cp ../LICENSE LICENSE step just before building; this might be possible to do automatically in a build backend.
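The copy step mentioned above could look roughly like this in Python. It is only a sketch of a pre-build script, not something any build backend is known to provide; stage_license and its parameters are hypothetical names.

```python
import shutil
from pathlib import Path


def stage_license(project_dir: Path, source: Path, name: str = "LICENSE") -> Path:
    """Copy a license file from outside the project into the project root.

    Meant to run before the build, so that pyproject.toml can list the
    copied file with a plain relative path.
    """
    project_dir = project_dir.resolve()
    target = project_dir / name
    # copy2 also preserves timestamps, which keeps rebuilds less noisy.
    shutil.copy2(source, target)
    return target
```

A bindings directory inside a larger repository might call stage_license(Path("."), Path("../LICENSE")) and then declare license-files = [ "LICENSE" ] as in the diff earlier in the thread.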
Some questions, if we want to support license files held outside the project source tree:
- Is it OK to allow pyproject.toml to reference potentially anywhere in the filesystem when looking for license files? I can't think of a security risk here, but that might just mean I'd make a bad malware developer.
- How would this be handled when building a sdist? The license file would need to be copied into the sdist, and couldn't remain in the same location as in the source tree (as there is no parent directory in a sdist). That makes this comment in the spec inaccurate:
If the metadata version is 2.4 or greater, the source distribution MUST contain any license files specified by the License-File field in the PKG-INFO at their respective paths relative to the root directory of the sdist (containing the pyproject.toml and the PKG-INFO metadata).
Build backends couldn't copy the file to a new location and write an altered License-File value, as that would violate the guarantees given by the fact that the pyproject.toml field isn't dynamic - the metadata field License-File would no longer be the same as the pyproject.toml field. We could make an exception for this field, but that seems likely to be messy at best, and probably a source of bugs and inconsistencies between backends.
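To make the quoted sdist requirement concrete, here is a rough check that every License-File listed in PKG-INFO exists at that path under the sdist root. It assumes the usual layout with a single top-level directory holding PKG-INFO; missing_license_files is a hypothetical helper, not part of any packaging tool.

```python
import tarfile
from email.parser import Parser


def missing_license_files(sdist: tarfile.TarFile) -> list[str]:
    """List License-File entries from PKG-INFO that the sdist does not contain."""
    names = sdist.getnames()
    # Assumption: one top-level directory, e.g. "demo-1.0/".
    root = names[0].split("/", 1)[0]
    pkg_info = sdist.extractfile(f"{root}/PKG-INFO")
    if pkg_info is None:
        raise ValueError("PKG-INFO is not a regular file")
    msg = Parser().parsestr(pkg_info.read().decode("utf-8"))
    declared = msg.get_all("License-File") or []
    present = set(names)
    # Paths are interpreted relative to the sdist root directory.
    return [path for path in declared if f"{root}/{path}" not in present]
```

A path such as ../LICENSE can never pass this kind of check, since nothing in the archive lives above the root directory, which is the problem raised in the second question.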

Is it OK to allow pyproject.toml to reference potentially anywhere in the filesystem when looking for license files?
I think it is, but build backends should strongly consider being more safe, whatever that means to them. You generally can't build this kind of security into an interop specification - only restrictions.

Build backends couldn't copy the file to a new location and write an altered License-File value, as that would violate the guarantees given by the fact that the pyproject.toml field isn't dynamic
I mean, the build backend could require it to be marked dynamic if it's not already included in the sdist? That's easy enough for the developer to fix up at the same time as they're setting the path.
I thought "dynamic" didn't apply to source tree->sdist transformations? If all the published metadata matches, what exactly is dynamic about it?

I mean, the build backend could require it to be marked dynamic if it's not already included in the sdist? That's easy enough for the developer to fix up at the same time as they're setting the path.
You can't supply a value if you mark a field as dynamic. So you'd have to use a tool-specific field to specify a license file if you wanted to mark it as dynamic.

I thought "dynamic" didn't apply to source tree->sdist transformations? If all the published metadata matches, what exactly is dynamic about it?
It's a bit of a grey area, TBH. People expect that they can read pyproject.toml and if a field isn't marked as dynamic, they can treat the value as canonical. There was a lot of debate at the time of PEP 621 over whether people should be allowed to get metadata values from pyproject.toml without consulting the build backend. The dynamic field came out of that, and was intended to make it so that people could know when they needed to involve the backend.
In addition, it's technically the license that's the metadata, not the location of the license file. So this is all a bit secondary anyway.
(This is all something that might need considering in the context of the SBOM PEP as well - cc @sethmlarson).

You can't supply a value if you mark a field as dynamic. So you'd have to use a tool-specific field to specify a license file if you wanted to mark it as dynamic.
That's probably also fine for these cases. Anyone who's making a code layout work when it's more complex than any Python template out there is probably sympathetic to the idea that defaults can't work for everyone.
It's a bit of an unfortunate specification, though. There were arguments about whether absence should imply dynamic, which is about as extreme as presence implying static even when the (static) dynamic value explicitly says it's dynamic (when it would've been just as easy and safe to allow a default that may be overridden, at least once you exclude 3rd party metadata readers that don't follow the spec). Every time I read that PEP now I wish I'd spent more time digging into the specification at the time.

People expect that they can read pyproject.toml and if a field isn't marked as dynamic, they can treat the value as canonical. There was a lot of debate at the time of PEP 621 over whether people should be allowed to get metadata values from pyproject.toml without consulting the build backend
Yeah, well, they can. And if they ignore dynamic then they're misreading it, except for the line that says to ignore dynamic when it conflicts with the rest of the file.
Hopefully any serious tools trying to do something useful here aren't short-sighted enough to treat it as canonical when there are actual canonical metadata files available (in an sdist or wheel).
I don't mean to relitigate the design, especially since I purposefully opted out of the process at the time, but it seems we do need to properly define how to include files in packages in pyproject.toml, since people keep wanting to standardise it even when we explicitly said that this was the responsibility of the build backend.
Where I recall things ended up is that packaging metadata standards dictate aspects of metadata for packages, and source trees are not packages even if users may wish they were and tools sometimes try to pretend they are.
If an sdist doesn't say a field is dynamic then you should be able to automatically infer that value for corresponding wheels. If you're looking at a non-sdist source tree like a git repository, all bets are off. That's the domain of non-standardized tool UX which could even (granted this is pathological) completely ignore the included pyproject.toml and replace that with its own when creating an sdist.
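As a rough illustration of that inference for an sdist, the sketch below collects the PKG-INFO fields that are not listed under Dynamic and could therefore be carried over to wheels unchanged. It assumes the core metadata Dynamic field (metadata version 2.2 and later); reusable_fields is a hypothetical helper.

```python
from email.parser import Parser


def reusable_fields(pkg_info_text: str, wanted: list[str]) -> dict[str, list[str]]:
    """Return values of `wanted` fields that PKG-INFO does not mark Dynamic."""
    msg = Parser().parsestr(pkg_info_text)
    # Field names are compared case-insensitively.
    dynamic = {name.strip().lower() for name in msg.get_all("Dynamic") or []}
    return {
        field: msg.get_all(field) or []
        for field in wanted
        if field.lower() not in dynamic
    }
```

Nothing comparable can be done for a plain git checkout, since it has no PKG-INFO to read in the first place.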