PEP 639, Round 3: Improving license clarity with better package metadata

I’ve updated uv publish to set license and license-files in the form data: Upload: All metadata incl. PEP 639 by konstin · Pull Request #9442 · astral-sh/uv · GitHub. We’re publishing an upload test package to Test PyPI, so e.g. astral-test-token 0.1.1912 should now be a user of the new fields.

uv publish now sets the following multiple-use fields. I’m not sure about the pluralization rules, but these seem to work and match warehouse’s test_legacy.py:

  • classifiers
  • dynamic
  • license_file
  • obsoletes_dist
  • platform
  • project_urls
  • provides_dist
  • provides_extra
  • requires_dist
  • requires_external
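
As a rough illustration of what "multiple use" means for an upload, here is a minimal sketch that flattens a metadata mapping into form pairs, repeating the field name once per value. The helper `to_form_fields` and the single-use keys in the usage note (`name`, `version`, `license_expression`) are assumptions for illustration; this is not uv's or warehouse's actual code.

```python
# Field names copied from the list above; each is sent once per value.
MULTI_USE_FIELDS = {
    "classifiers", "dynamic", "license_file", "obsoletes_dist", "platform",
    "project_urls", "provides_dist", "provides_extra", "requires_dist",
    "requires_external",
}


def to_form_fields(metadata: dict) -> list[tuple[str, str]]:
    """Flatten metadata into (name, value) pairs (hypothetical helper)."""
    fields: list[tuple[str, str]] = []
    for name, value in metadata.items():
        if value is None:
            continue  # unset fields are simply not transmitted
        if name in MULTI_USE_FIELDS:
            # Accept a lone string as a one-element list.
            items = [value] if isinstance(value, str) else list(value)
            fields.extend((name, str(item)) for item in items)
        else:
            fields.append((name, str(value)))
    return fields
```

Calling `to_form_fields({"name": "demo", "license_expression": "MIT", "license_file": ["LICENSE", "NOTICE"]})` yields two separate `license_file` pairs, which is the shape a multipart form encoder would need.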

Public service announcement. twine 6.0 has just been released. While this release will not error on distributions using metadata 2.4, it does not send the metadata fields defined in PEP 639 to the package index. On PyPI, this results in the license information not being displayed on the package page.

If you care about correct licensing information being shown on package indexes, you either have to keep using metadata 2.3 or switch to something else to upload your distribution to the indexes. With the recent uv release 0.5.5, uv publish should fully support metadata 2.4.

Why aren’t the fields submitted? Hatch also supports publishing, like uv, but twine is more important because that is what the official GitHub Action uses.

I am not a twine maintainer. Fully supporting metadata version 2.4 was not deemed important enough to delay the release. AFAICT the focus of the release is to make twine compatible with the latest release of pkginfo, one of its dependencies, which previously had to be pinned to an earlier release, causing some trouble for distribution maintainers who need to integrate twine into their distribution. Full support for metadata version 2.4 will likely be in the next release.

Is this for a project with a dynamic version field that gets determined while building? Metadata 2.2+ disallows a “dynamic” Version field in source distributions, so setuptools caps the metadata version at 2.1 in that case (but still allows inclusion of fields from later metadata versions).

Scratch that, the sdist version info should be static regardless (since the version has to be specified in the filename). So setuptools not setting the metadata version correctly when adding the new fields sounds like an outright bug.

I’m not sure I understand your reply. I have the impression you are conflating the Version and Metadata-Version fields. The latter is what determines the valid metadata fields, but the former is the one recorded in the sdist filename. Regardless, setting the License-File metadata field with Metadata-Version < 2.4 is always invalid.
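
To make the rule concrete, here is a minimal sketch of a check for it, using only the standard library. The function name `pep639_violations` is hypothetical, and the check covers only the two PEP 639 fields mentioned in this thread; it is not how any existing tool validates metadata.

```python
from email.parser import HeaderParser

# The two fields introduced by PEP 639 (metadata version 2.4).
PEP_639_FIELDS = ("License-Expression", "License-File")


def pep639_violations(metadata_text: str) -> list[str]:
    """Return PEP 639 fields used with a Metadata-Version below 2.4."""
    msg = HeaderParser().parsestr(metadata_text)
    raw_version = msg.get("Metadata-Version", "0")
    version = tuple(int(part) for part in raw_version.strip().split("."))
    if version >= (2, 4):
        return []
    # get_all() returns None when the header is absent.
    return [name for name in PEP_639_FIELDS if msg.get_all(name)]
```

Note that the check reads Metadata-Version only; the Version field plays no role in deciding which fields are valid.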

I’ve reported it to setuptools: [BUG] `tool.setuptools.license-files` results in invalid metadata · Issue #4759 · pypa/setuptools · GitHub. However, because existing tools seem to accept this (and they cannot be made to validate more strictly until setuptools is fixed), it is considered a missing feature rather than a bug, so it will not be fixed until setuptools gains proper support for metadata version 2.4.

Do I read this correctly that by virtue of Hatchling now creating 2.4 metadata, Twine has started removing licensing information that it used to tolerate? Because I enthusiastically embraced the new standard in today’s attrs release and immediately have people complaining in my issue tracker.

Like… what is the correct way forward for us downstream? I’ve already spent way too much time and energy on three letters appearing in a web interface. :frowning:

There is an open PR to add support for metadata version 2.4 to twine. From the looks of it, it should be merged soon. Switch to packaging for parsing metadata and support metadata 2.4 by dnicolodi · Pull Request #1180 · pypa/twine · GitHub

If you can’t wait, I’d suggest using the SPDX license identifier together with the “old” syntax, i.e. project.license = {text = "..."}, until then.

No, this is not correct. twine does not know about the new metadata fields added in metadata version 2.4 and thus it does not transmit those with the upload to PyPI. Because PyPI trusts the metadata transmitted with the upload and does not extract it from the distribution files, this results in licensing information not being shown on PyPI. To say it in a different way: twine is not actively removing anything.

If you ask me, it was irresponsible for the Hatchling maintainers to start unconditionally emitting metadata version 2.4 while packaging tools are still catching up. AFAIK, the issue with wheel uploads was reported to them. I believe that, because the Hatch toolset has its own distribution upload tool, they decided that it is not reason enough to hold back updating the metadata version.

Are the complaints about licensing information not being displayed on the attrs PyPI page, or about something else? As this is mostly a cosmetic issue (the licensing information is still there, just not displayed in the web interface), I would tend to assign the issue to the low-priority wish list.

Assuming the issue is limited to the licensing information not showing on the PyPI page, and having it there now is important for you or your users, you should revert to using the pre-PEP 639 metadata fields. Otherwise, you should just wait until twine gains support for metadata 2.4 and do another release. The PR adding support was opened a while ago and it is moving forward. Before the last twine release, the upload would have simply failed, so the situation is not that bad.

If it makes you feel better, the time you spent is most likely a few orders of magnitude less than what the people involved invested to create the infrastructure to get those three letters there :slightly_smiling_face:

Right, for posterity and people landing here with the same question, here’s the metadata:

This works but shouldn’t (yesterday):

Metadata-Version: 2.3
Name: attrs
Version: 24.2.1.dev53
...
License: MIT

SPDX IDs as License strings were used by Hatch(ling) for years and tacitly supported by PyPI.

This doesn’t work but should (today):

Metadata-Version: 2.4
Name: attrs
Version: 24.3.0
...
License-Expression: MIT
License-File: LICENSE

ISTM that people are running automated tools that collect licensing information from PyPI packages, so my guess is some kind of compliance.

Despite everyone always giving their best, suffering in FOSS is not a competition. :wink:

I feel a need to apologize: when assessing the PEP’s impact, it never occurred to me that twine is more than just a messenger between a distribution and PyPI. This could have been accounted for and prevented, but instead we’ve got a bumpy integration. I surely hope that once support for metadata 2.4 is merged into twine, the packaging landscape will be mostly sorted out.

The above follows metadata standard version 2.3, thus it should work. Why do you think it should not? The problem with metadata 2.3 is that what goes in the License field is not very well defined. Therefore, some package authors decided to put a license description in it, and others the whole text of the license (the latter has become increasingly popular because PEP 621 allows project.license in pyproject.toml to be set from the content of a file). PyPI displays the License metadata field as the project license if there are no License :: classifiers defined (or something along these lines). PEP 639 was drafted with the intent to clarify the use of the License field. For backward compatibility, two new metadata fields have been added: License-Expression and License-File.

Scraping the web interface seems a very poor way of doing that. One reason is that for some packages (for example SciPy) the licenses of the sdist and of the wheels are different, but the PyPI package page is per version, not per distribution artifact. In this case I don’t know which license is shown in the web interface (the one for the sdist or the one for the last artifact uploaded for a given version both seem good choices). Retrieving the distribution metadata and parsing it seems a much better approach.

If your FOSS contribution causes you or others to suffer, I encourage you to review your choices and your priorities :slightly_smiling_face:
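
For what it is worth, reading the metadata out of a single artifact takes only a few lines. The following is a minimal sketch for wheels using only the standard library; the function name `wheel_license_info` and the shape of the returned dictionary are assumptions for illustration, and sdists (which keep their metadata in PKG-INFO) are not handled.

```python
import io
import zipfile
from email.parser import HeaderParser


def wheel_license_info(wheel) -> dict:
    """Collect license-related fields from a wheel (path or file object)."""
    with zipfile.ZipFile(wheel) as archive:
        # A wheel keeps its core metadata in <name>-<version>.dist-info/METADATA.
        candidates = [
            name for name in archive.namelist()
            if name.endswith(".dist-info/METADATA") and name.count("/") == 1
        ]
        if len(candidates) != 1:
            raise ValueError("expected exactly one .dist-info/METADATA file")
        text = archive.read(candidates[0]).decode("utf-8")

    msg = HeaderParser().parsestr(text)
    classifiers = msg.get_all("Classifier") or []
    return {
        "metadata_version": msg.get("Metadata-Version"),
        "license_expression": msg.get("License-Expression"),
        "license_files": msg.get_all("License-File") or [],
        "license": msg.get("License"),
        "license_classifiers": [
            c for c in classifiers if c.startswith("License ::")
        ],
    }
```

Because this works per artifact, running it over each file of a release would show whether the sdist and the wheels declare different licenses.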

Please don’t encourage this: if Hynek stops taking all the suffering for all the great projects he’s started, everyone else will suffer! :smiley:

OK, I kinda skipped a step here: Hatch accepted license = "MIT" in pyproject.toml, which others didn’t, and which apparently also wasn’t quite supported by the API according to https://github.com/python-attrs/attrs/pull/1337 and was returning null for license before. It’s a bit tough to untangle all these factors from a downstream perspective.

The web is just one view on the data: there’s a JSON API too, and that returns null for all license fields now: https://pypi.org/pypi/attrs/json
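
For people landing here who want to check what the JSON API reports, here is a minimal sketch that pulls the license-related entries out of an already-decoded response. The helper name `license_view` is hypothetical, and the `license_expression` key is an assumption about the response shape; `.get()` is used throughout so that absent keys simply come back as `None`.

```python
def license_view(payload: dict) -> dict:
    """Extract license-related entries from a decoded PyPI JSON response."""
    info = payload.get("info") or {}
    classifiers = info.get("classifiers") or []
    return {
        "license": info.get("license"),
        # Assumed key name; may be missing depending on the index.
        "license_expression": info.get("license_expression"),
        "license_classifiers": [
            c for c in classifiers if c.startswith("License ::")
        ],
    }
```

The payload itself can be obtained with `json.load(urllib.request.urlopen("https://pypi.org/pypi/attrs/json"))`. A result where every entry is `None` or empty corresponds to the "null for all license fields" situation described above.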

I know that this is meant as a joke, but it is very sad that there is room to think that contributing to open source should bring anything other than the joy of expressing creativity and connecting with a community.

Sure, but the JSON data has the same shortcomings as the data in the web interface. For example, take https://pypi.org/pypi/scipy/json and try to make sense of what license covers the distribution of SciPy Windows wheels.

I have been prototyping some more REST-ish APIs for PyPI and it’s something I’m quite keen on helping make happen.

Yeah, I agree, it’s kinda dark humour. But I did want to gently point out that we’re talking to a maintainer here, not a contributor, and we recognise that maintainers end up with a greater burden to make things work, as well as being more significantly impacted when other projects’ contributors make changes that have an indirect impact.

Here we have someone who mostly works outside of this forum coming in and telling us that people who largely work inside our forum made such a change. The amount of suffering is indeed not a competition, but as this is our residence, we shouldn’t even be trying to compare with our visitor.

(Okay, I know Hynek has the core developer badge and you don’t, so the analogy starts falling apart a bit there. But in this context, on behalf of non-PSF projects, he’s the visitor :wink: )

I tried, but I cannot understand what you are trying to say. Are you trying to say that me lightly suggesting that someone should take care of themselves if their open source contributions cause suffering is inappropriate, while you lightly suggesting that they should endure suffering in the name of the benefits that users get from their contribution is appropriate?

I don’t understand how you can assume that Hynek is a visitor here while I am a resident. I’m almost certain that Hynek has more posts on this platform than me. I just happen to be someone who has spent some time understanding the topic at hand and devoted some time to fixing things, and I thought that some explanations could be beneficial for the community.

Can you please point me to where I made such a comparison?

Based on the responses, I believe mine came across more lightly than yours. Though I may have the advantage of knowing Hynek personally.

In this thread, his comments relate entirely to his experience on projects that are not developed here, while yours are from the position of the decisions that were made here (if only because he was outside and you replied). With no additional context, it’s an entirely reasonable assumption.

Sure. I’ll even quote the bit where Hynek also pointed out that you made such a comparison:

This is well off-topic now, though. Please direct message me if you want to continue.