Expressing project vs. distribution licenses post-PEP 639

If the sdist has one license, but the build backend vendors in an extra DLL on some platforms, then the sdist license would need to be marked as dynamic to allow the wheel license on that platform to differ. The sdist license could still be specified in the sdist metadata; tools would just not be allowed to assume that wheels built from that sdist have the same license.

That’s pyproject.toml keys. Those are a whole different situation. To be honest, I’m not sure we want to try to standardise all of the possible options in pyproject.toml. After all, build backends will have to do something special when creating sdists and wheels in this situation, so why not just make the license key dynamic, and rely on backend configuration in the [tool] section to specify license data, just the same as you’d need to specify which files to vendor or exclude in the backend configuration?

IMO, having just the existing license key in pyproject.toml, which maps to License-Expression in the sdist/wheel, is perfectly sufficient[1]. If projects need license data that differs between the sdist and wheels built by the build backend, then they should use a backend that supports that and use backend-specific configuration to make it happen. Unless it’s a common real-world situation, I don’t think it’s important enough to standardise (and hence require all build backends to support).


  1. Ignoring for now the potential PEP for a similar pair for the project license ↩︎


I agree.

The problem is that as far as I can tell the only thing that can be acknowledged explicitly is that the core metadata in wheels/sdists is per-distribution while pyproject.toml provides no way to specify anything at that level. Hence I feel the only thing that can be said is “None of this works unless you assume all the metadata is going to be the same.” That means the current situation is among the things that don’t work, which it sounds like you agree with?

Again I’m not sure whether we actually disagree. The problem is that unless everyone has that shared understanding, the information that is in distributions (or in pyproject.toml) cannot be relied on to have any particular meaning. It seems to me that that is the situation we’re in.

I don’t understand what you mean by “none of this works”. What is it that doesn’t work?

There are as I see it two main consumers of the distributions that are uploaded to PyPI:

  • People who pip install foo and who will typically get the wheels that a project uploads.
  • People who download the sdists and build everything from source (this includes all repackagers: conda, distro, etc).

In either case you will get a distribution and it will have license metadata that describes its contents.

The case where building a wheel results in a wheel with a different license from the sdist is much less common than the case where the PyPI wheel license is different from the sdist license. This is because making wheels that are ABI-compatible for PyPI forces the kind of bundling that happens with auditwheel but there is no need for that ABI compatibility when building from source. The only examples I know where the built wheel would/could have a different license are like NumPy where the sdist vendors meson but it doesn’t get included in the wheel so:

  • sdist license = numpy license and meson license
  • built wheel = numpy license

Currently I think NumPy just uses the same LICENSE file for sdist and built wheel so the wheel still contains the meson license even though meson isn’t in the wheel (I think this is explained in the free text). If tooling and metadata could represent the license of the built wheel then I think they would use that.

I don’t know of any cases where the build backend vendors things in but there are many situations where it would be a very useful thing to do and it is possible that some projects might provide an option for this in future. I am also certain that there are examples of this that I just don’t know about though.

Note that what I am describing here is just what already happens. If the distributions have different licenses then pip install foo is going to select one of the distributions and it will have the license that it has and that will be recorded in the LICENSE file which will potentially be a concatenation of license files for things that are vendored into the sdist/wheel. The primary issues with this are that it is opaque to users who do not generally read the LICENSE file and also that lack of tooling support for license bundling means that even the LICENSE file may be incomplete for many projects.

PEP 639 offers the possibility to improve this because the combined license can be expressed as a machine readable SPDX expression which is also easier for a human to parse since it is shorter than the LICENSE files and could in principle be displayed on PyPI. The PEP also makes it possible to have multiple license files so they do not need to be concatenated. Each license file could instead be given a meaningful name e.g. LICENSE-apache2-vendored-meson. The PEP means that it is now possible for vendoring tools to support bundling the license files in a sensible way (I don’t think that anyone would want to build the concatenation approach into tooling as a feature).

The goal of the PEP was better license clarity and it seems clear that I am describing situations where that is exactly what is needed but somehow people seem to think that the License-Expression field should only be used in uninteresting cases.


Sorry, I had two grants to submit for Spyder and then PyCon to prepare for and attend, but wanted to take a break before the packaging summit to catch up with the feedback here.

Fully agreed with all of this. Going to bring this whole issue up at the packaging summit tomorrow (err, today) and assuming there aren’t strong objections can aim to make a PR during the sprints.

Makes sense; seems the most pragmatic way to handle this. Sounds like you’re planning to make a PR? Or would you like me to?

I don’t want to be too over-optimistic, but at least just considering the project license [1], given it is a single static value copied into the Core Metadata (with identical processing to the existing distribution license already implemented by essentially all backends), it doesn’t seem too unreasonable to expect a relatively streamlined timeline to discuss and implement this. At least, I’d expect it would likely happen before PyPI implements per-distribution metadata retrieval and a new UI to display that.

I’d prioritize writing a draft PEP at the sprints to at least get the discussion started if I wasn’t already committed to writing another more complex PEP with Donghee re: machine-readable deprecations and removals.

Hmm…I’ll have to think on that a bit.

FWIW, different licenses wouldn’t necessarily require doing anything special that isn’t already done by all backends by default per the standards: unlike sdists, wheels only include the source tree’s import packages (plus other explicitly specified content), so anything with a separate license outside that subdirectory will be distributed in the sdist but not the wheels (e.g. test code, data and assets, top-level repo config/content from templates or other sources, re-used CI actions and workflows, project-level logos, banners and other assets, build system content, etc). And the implementation complexity would likely be relatively minimal, since the relevant keys for a given artifact can just be concatenated together with e.g. `" AND ".join((license.project, license.distribution, license.sdist))` and that’s it. And given at least the comments here suggest it is a common-enough real world situation for some popular projects, it seems worth at least considering extending the current limited standard to remove these limitations.
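A slightly more careful version of that one-liner, as a rough sketch only: it skips keys that are unset, drops exact duplicates, and parenthesizes any part containing `OR` so that operator precedence in the combined expression is preserved. The function name and the idea of separate project/distribution/sdist values are assumptions taken from the proposal above, not an existing API.

```python
def combine_license_expressions(*parts):
    """Join license expressions with AND, skipping empty and duplicate parts.

    Parts containing OR are wrapped in parentheses, because AND binds more
    tightly than OR in SPDX license expressions.
    """
    unique = []
    for part in parts:
        if part and part not in unique:
            unique.append(part)
    if not unique:
        return None
    if len(unique) == 1:
        return unique[0]
    wrapped = [f"({p})" if " OR " in p.upper() else p for p in unique]
    return " AND ".join(wrapped)


# Hypothetical per-level values for one artifact (an sdist here).
print(combine_license_expressions("MIT", None, "Apache-2.0 OR BSD-3-Clause"))
```

This does no validation of the individual expressions; a real backend would presumably run them through its existing SPDX handling first.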

On the other hand, your proposed clarification re: auditwheel and similar tools does mitigate the most common (though not only) case where they differ, assuming such tools actually implement a way to read in and modify license expression data, which this would otherwise provide an escape hatch for. And if something substantially resembling the above comprehensive solution was not supported, this would at least be my preferred fallback to leave things open for (another) future PEP and/or backends to implement instead of some kind of half-measure.

I’ll present and gather feedback on this at the summit (since quite a few projects and stakeholders who might require this will be in the room) and also see how this would intersect with the partial-dynamic metadata PEP Henry is going to propose there, and re-evaluate after.


  1. Ignoring standardizing per-distribution licenses for a moment, per your proposal ↩︎


Yes, it would be good if the outcome from PEP 639 is that we do end up having tooling that corrects the license in the wheel bundling situation. The PEP provides all the machinery for this but it would be good if the language in the specs makes it clear that the expectation is for this to happen when such tools are used.

It possibly makes more sense to have a new tool for doing this rather than each of the bundlers (auditwheel, delvewheel, …) implementing this separately. Only the bundling tool knows what was bundled though so they would at least need to provide that information in a machine readable format.

Did anything ever come of this?

If no one beats me to the other requested changes (which, let’s be honest, people won’t now that I have shown a willingness to see the changes made), I plan to tackle the other requests piecemeal as I find time.


I also said

which I still need to do. I’m not sure whether it’s OK for me to simply take my own statement as approval for making this change myself (it seems a little too close to abusing my authority) so if anyone has any objection to this change, please speak up! Otherwise I’ll assume we have consensus that this is OK, and go ahead.


Should you also clarify how non-build backends should handle this? e.g. should they add a field to Dynamic if they change it? Or should readers ignore the [omissions from] Dynamic metadata entirely when they have a wheel in hand and only trust it when they have an sdist?


From the spec:

In any context other than a source distribution, Dynamic is for information only.

So yes, the only context in which the Dynamic metadata item is relevant is when you have a sdist. A tool that reads a wheel and modifies the metadata can do so without needing to modify the Dynamic field (although they could if they wanted to, of course).

The key point here is that you can be sure that when you see a sdist, you can know what the (non-Dynamic) metadata in the built wheel will be without building it. But you can’t use the Dynamic field to infer that metadata items will be the same in every wheel you see for that project/version - because the wheel might not have been created by building directly from the sdist.
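As a sketch of how a consumer might apply that rule to an sdist's PKG-INFO: a field can be relied on for wheels built from that sdist only if the metadata version is 2.2 or later and the field is not listed in Dynamic. The helper name `is_static_in_sdist` is made up for this example; the parsing uses the standard library email parser, which handles the header-style core metadata format.

```python
from email.parser import HeaderParser


def is_static_in_sdist(pkg_info_text, field):
    """Return True if `field` in this sdist metadata is guaranteed to match
    the wheel built directly from the sdist.

    Only meaningful for sdist metadata; in a wheel, Dynamic is informational.
    """
    msg = HeaderParser().parsestr(pkg_info_text)
    version = tuple(int(p) for p in msg["Metadata-Version"].split("."))
    if version < (2, 2):
        # Before metadata 2.2 there is no Dynamic field, so nothing can be
        # assumed to be static.
        return False
    dynamic = {name.strip().lower() for name in msg.get_all("Dynamic", [])}
    return field.lower() not in dynamic


PKG_INFO = (
    "Metadata-Version: 2.4\n"
    "Name: foo\n"
    "Version: 1.0\n"
    "Dynamic: License-Expression\n"
    "Dynamic: License-File\n"
    "\n"
)
print(is_static_in_sdist(PKG_INFO, "Version"))
print(is_static_in_sdist(PKG_INFO, "License-Expression"))
```

Note that even a True result says nothing about wheels on PyPI that were post-processed after the build, which is exactly the distinction being drawn here.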

I’ll try to make that clear in my PR, without making the whole thing so wordy that it harms the clarity. I’ll add you as a reviewer on the PR when I make it, so you can comment on whether I achieved that goal :slightly_smiling_face:

Edit: I’ve added the following statement in the PR:

It is advisable, but not required, that tools which modify wheel metadata add the modified fields to the generated wheel’s Dynamic field.

I think that’s the best we can do - we can’t require anything new in a clarification, so advice is all we can give.


PR: “Clarify that the Dynamic metadata field only applies when building from sdist” by pfmoore, Pull Request #1901 in pypa/packaging.python.org on GitHub.

@steve.dower it looks like I can’t add you as a reviewer. Maybe you have to be a project maintainer to be explicitly added?

I’m seeing this warning from setuptools now:

        ********************************************************************************
        Please use a simple string containing a SPDX expression for `project.license`. You can also use `project.license-files`. (Both options available on setuptools>=77.0.0).

        By 2026-Feb-18, you need to update your project and remove deprecated calls
        or your builds will no longer be supported.

        See https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#license for details.
        ********************************************************************************

In context this is a project whose own code is all licensed MIT but where the PyPI wheels use auditwheel to bundle some things that are LGPLv3 and some other licenses.

I assume that the right thing to do in this situation is:

  • Put project.license as MIT, include the MIT license file and have project.license-files refer to that file.
  • Post-process the PyPI wheels after auditwheel to update License-Expression with AND (...) and add additional License-File fields and bundle in any additional license files to .dist-info in the wheels.

The actual build backend is meson-python rather than setuptools and last time I tried it did not yet support the PEP 639 pyproject.toml fields but it looks like that has now been added. My reading though is that it only handles the first part and the auditwheel post-processing needs to be handled separately.

Does anyone know of any tooling that can handle the second part?

We need something like

$ bundle-licenses foo.whl \
    license1 licenses/license1-file \
    license2 licenses/license2-file

that would bundle extra license files into the wheel and update the License-Expression and License-File fields.
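For what it's worth, the metadata half of such a tool is small. Below is a rough sketch of just that step: given the text of a wheel's METADATA file, it extends License-Expression with AND and appends License-File entries. The function names are hypothetical and this is not an existing tool. A real implementation would also have to copy the files into the wheel's `.dist-info/licenses/` directory and regenerate RECORD, which is not shown.

```python
def _wrap(expression):
    # Parenthesize compound expressions so the added AND cannot change meaning.
    return f"({expression})" if " " in expression else expression


def add_bundled_licenses(metadata_text, extra_expression, extra_files):
    """Return METADATA text with the license fields extended.

    extra_expression: SPDX expression covering the bundled components.
    extra_files: license file paths to record as License-File entries.
    """
    headers, sep, body = metadata_text.partition("\n\n")
    out = []
    found = False
    for line in headers.split("\n"):
        name, _, value = line.partition(":")
        if name.lower() == "license-expression":
            combined = f"{_wrap(value.strip())} AND {_wrap(extra_expression)}"
            out.append(f"License-Expression: {combined}")
            found = True
        else:
            out.append(line)
    if not found:
        raise ValueError("METADATA has no License-Expression field to extend")
    out.extend(f"License-File: {path}" for path in extra_files)
    return "\n".join(out) + sep + body


METADATA = (
    "Metadata-Version: 2.4\n"
    "Name: foo\n"
    "Version: 1.0\n"
    "License-Expression: MIT\n"
    "License-File: LICENSE\n"
    "\n"
    "Long description\n"
)
print(add_bundled_licenses(METADATA, "LGPL-3.0-or-later", ["LICENSE-lgpl.txt"]))
```

The hard part is not this text edit but knowing what was bundled, which only the bundling tool itself can report.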

What does the very specific date 2026-Feb-18 in the warning refer to? Is PyPI or setuptools going to do something on that date?


This is very concerning to me. The core metadata spec states that the old License field is deprecated, but it does not give any indication that it will be removed - there’s no date for removal, no suggestion that removal is planned, nothing like that.

Furthermore, because metadata versions are linear (if someone wants to use features added in a metadata version later than 2.4, they can’t say “version 2.5, but without accepting the changes made in 2.4”) I don’t believe we can realistically remove legacy metadata without an extremely long deprecation period - and we’d need a standard to do so, it’s not something tools should be allowed to do unilaterally.

So IMO, if setuptools does start rejecting the old-style License metadata, they will be in violation of the standards. At best, setuptools is planning on adding tool-specific validation of a field where the standard does not specify any validation. And if that’s what they want to do, then they should be very clear in the message that this is a setuptools-specific limitation, and not standard-required (the linked packaging user guide section is for information, but does not have the force of a standard, and in this case IMO it implies a restriction that it has no right to make).


This forced migration to license expressions, both from this setuptools deprecation and, to a lesser extent, from the clause in the PEP that deliberately blocks package authors from keeping their license classifiers around for backwards compatibility, has caused so much disruption for me.

PyInstaller has been holding onto Python 3.8 support, mostly since it’s the last version to support Windows 7 and 8.0[1], but also to make it easier for people to target the ancient RHEL/SUSE/AIX platforms that they insist on running their servers on. setuptools 76.1.0 requires Python 3.9, but the new license field requires setuptools 77.0, and setuptools refuses to build if there is anything in the [project] table that it doesn’t recognise. So we’re actually incapable of using the new field with our supported range of Python versions, yet we’re also going to be forced to use it by setuptools.

I’m eying up switching to hatchling which at least lets you build with fields that it’s too old to know about. In that scenario, no license field is set and I know this will break some of our users in corporate setups that have scanners which block the developer from installing a package whose license can’t be automatically determined. I would like to be able to keep my old license classifiers alongside the new license expression as a backup for these users but I can’t because the PEP blocks it.

The only option I have left is to rewrite all of PyInstaller’s (very custom) build code for hatchling just so that I can safely ignore the shiny new license field and carry on using my suboptimal license classifiers[2] as before, praying that hatchling doesn’t do a setuptools and deprecate license classifiers too.


  1. which are still pretty popular ↩︎

  2. actually very suboptimal in PyInstaller’s case. Its license expression would be (GPL-2.0-or-later WITH Bootloader-exception) AND ((GPL-2.0-or-later WITH Bootloader-exception) OR MIT) AND Apache-2.0 AND MIT AND ZLIB ↩︎


If the license was marked as dynamic then I guess you could check the setuptools version in setup.py for this. I certainly would not want to mark the license as dynamic though so I can understand if that is something that you would not want to do either.

I can understand why the PEP says for tools to error when older license classifiers are combined with the newer license-expression metadata because there would otherwise be ambiguity around which is the source of truth. The point of the PEP was to remove that kind of ambiguity so it makes sense to have strict rules for the new metadata.

I think setuptools is being overly aggressive in deprecating the older format right now though. There should be some time in between making it possible to use the new approach and then (visibly) deprecating the older approach.


I am sympathetic to people wanting to encourage the new stuff and shake off the old stuff. I just can’t get over the irony that this particular push actually blocks the migration that it intends to encourage.

And I don’t see how instead saying consumers of licence metadata should ignore classifiers in favour of license expressions if both are given would have been any worse for anyone. Classifiers and the code to handle them aren’t going anywhere even if a subset of them should become obsolete.


This experience has really highlighted for me one factor that I don’t think was mentioned in the previous setuptools broke stuff conversation: If Python makes a breaking change without a wide enough window between the new way becoming available and the old way being removed for the new way to be universally adopted, then as long as it’s not a syntax change and the affected code is maintained enough to get a pull request in, the worst that can be required is a bunch of extra if sys.version_info >= (3, x): new way/else: old way conditions. It’s joyless but hypothetically always fixable. With the static pyproject.toml though, there’s no such thing as wrapping a field in a version constraint[1]. A single configuration must be unconditionally compatible with every generation of tooling that may come into contact with it or it’s unworkable. You could argue that that makes conservatism/backwards compatibility even more important to packaging than it is to core Python.
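For anyone who hasn't had to write one of these, the kind of version guard meant here looks roughly like the following. `str.removeprefix` (added in Python 3.9) is used purely as a stand-in for "the new way"; the wrapper name is arbitrary.

```python
import sys


def strip_prefix(text, prefix):
    """Remove `prefix` from `text` on any supported Python version."""
    if sys.version_info >= (3, 9):
        # New way: str.removeprefix exists from Python 3.9 onwards.
        return text.removeprefix(prefix)
    # Old way: manual slicing for older interpreters.
    if text.startswith(prefix):
        return text[len(prefix):]
    return text


print(strip_prefix("license-files", "license-"))
```

A static pyproject.toml has no equivalent of that `if`, which is the whole problem.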


  1. short of writing a plugin that hijacks the build backend in some, usually fragile, way ↩︎


Yeah, I think we got things wrong on compatibility with the license expression PEP. My impression is that people were trying a little too hard to force people to “do the right thing”[1].

I’m incredibly aware that the “linear” versioning we have in most packaging standards (using features from version x+1 unconditionally opts you into version x) and the purely declarative form of most of our standards makes backward compatibility really hard. We’ve got away with it for a long time now, just because most new features have been universally beneficial. But that’s passed I think - we’re seeing more features that are “useful if you want it”, and we’re going to have to work out how to deal with that[2]. It looks like this will be one for the packaging council to work out, rather than it being on me, luckily :wink:


  1. A common failing in license discussions, in my experience ↩︎

  2. The wheel-next proposals are the biggest example of this, but other, less significant cases have been cropping up as well ↩︎


Not to pile on, but I help maintain packaged tools that only recently dropped Python 3.5/3.6/3.7 support and will be supporting 3.8 for a while still. I’d love to move them to new-style license metadata, but can’t until we only support Python 3.9 and newer (for SetupTools 77). If a new SetupTools release stops supporting the older mechanisms, it will likely mean these projects will need to pin back to old SetupTools versions for a while (or drop their license declarations altogether, or switch to a different build backend)… which also implies a forced migration to pyproject.toml since that’s effectively the only way to signal SetupTools version caps.

It’s not the end of the world, but it’s certainly extra friction when you have hundreds or thousands of packages that need it done.
