PEP 639: Improving license clarity with better package metadata

I definitely think the PyPI issue was exacerbated by the particulars of PyPI’s ReST rendering, but I still think it applies here. I don’t think you can assume that a warning is going to be emitted or, if it is, that people are even going to read it.

PyPI itself doesn’t have a mechanism to report warnings back about an upload; it can either accept the upload or fail it.

That leaves only client-side warnings. However, even if people added those checks today, it would still be several years until they reached most people. I think distutils warned about missing fields (URL maybe? I forget which one) for something like a decade, and people were still shipping packages that had them missing and/or malformed. Part of that is because people generally just ignore warnings; another part is that, for setuptools at least, the output is super verbose and things get lost easily. See, for example, the output from building a very simple test package:

$ python setup.py bdist_wheel
/Users/dstufft/.pyenv/versions/3.7.3/lib/python3.7/site-packages/setuptools/dist.py:472: UserWarning: Normalizing '2019.05.07.01' to '2019.5.7.1'
  normalized_version,
running bdist_wheel
running build
installing to build/bdist.macosx-10.14-x86_64/wheel
running install
running install_egg_info
running egg_info
writing dstufft.testpkg.egg-info/PKG-INFO
writing dependency_links to dstufft.testpkg.egg-info/dependency_links.txt
writing top-level names to dstufft.testpkg.egg-info/top_level.txt
reading manifest file 'dstufft.testpkg.egg-info/SOURCES.txt'
writing manifest file 'dstufft.testpkg.egg-info/SOURCES.txt'
Copying dstufft.testpkg.egg-info to build/bdist.macosx-10.14-x86_64/wheel/dstufft.testpkg-2019.5.7.1-py3.7.egg-info
running install_scripts
creating build/bdist.macosx-10.14-x86_64/wheel/dstufft.testpkg-2019.5.7.1.dist-info/WHEEL
creating 'dist/dstufft.testpkg-2019.5.7.1-py3-none-any.whl' and adding 'build/bdist.macosx-10.14-x86_64/wheel' to it
adding 'dstufft.testpkg-2019.5.7.1.dist-info/METADATA'
adding 'dstufft.testpkg-2019.5.7.1.dist-info/WHEEL'
adding 'dstufft.testpkg-2019.5.7.1.dist-info/top_level.txt'
adding 'dstufft.testpkg-2019.5.7.1.dist-info/RECORD'
removing build/bdist.macosx-10.14-x86_64/wheel

Ironically, that output has a warning (the version normalization UserWarning near the top) and I didn’t even realize it until I was proofreading this post, because it just blended into the noise of producing a package. That’s possibly a setuptools problem, and we can decide that it’s not worth doing something different because of it, but I think it is worth at least thinking about the chances that someone is even going to see said warning.

I think that if we actually want to ensure that people are putting well-formed data in, then we need to validate that field in a way that will actually fail. Part of the transition can involve a period of time when we only warn, but ultimately I think well-formed data needs to be hard validated somewhere in the pipeline, and we can’t really do that with the PEP as it stands.

Personally, I’d lean towards adding a new field for the data itself, something like License-Expression or even SPDX-Expression. I’m not a huge fan of boolean fields, and I don’t see us ever having a second kind of license expression where we might want the ability to toggle between them like we do with long description. I think that gives us a very clean path forward:

  1. PyPI immediately starts validating this field, so there will never be a published package that has an invalid value here.
  2. Packaging tools start warning if this field is omitted, and can also check internally that its value is valid.

We probably could never make the field mandatory in the packaging spec itself (e.g. pip should be able to install a package without it, build tools should be able to build a package without it), but I could very easily see an argument that at some point PyPI should require it for uploads.
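
To make the split between the two roles concrete, here is a rough sketch of what I have in mind. Everything in it is an assumption for illustration: the function names are made up, `KNOWN_IDS` is a tiny hand-picked subset of SPDX identifiers, and the grammar only covers identifiers, parentheses, AND and OR (no WITH exceptions, no `+` handling). It is not how PyPI or any existing build tool works today.

```python
import re
import warnings

# ASSUMPTION: a tiny illustrative subset of SPDX license identifiers.
KNOWN_IDS = {"MIT", "Apache-2.0", "BSD-3-Clause", "GPL-3.0-only"}

_TOKEN = re.compile(r"\s*(\(|\)|[A-Za-z0-9.+-]+)")


def _tokenize(expr):
    """Split an expression into parentheses and word tokens."""
    expr = expr.strip()
    tokens, pos = [], 0
    while pos < len(expr):
        match = _TOKEN.match(expr, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def is_valid_expression(expr):
    """Check a simplified license expression (IDs, AND, OR, parentheses)."""
    try:
        tokens = _tokenize(expr)
    except ValueError:
        return False
    pos = 0

    def parse_expr():
        nonlocal pos
        if not parse_term():
            return False
        while pos < len(tokens) and tokens[pos] in ("AND", "OR"):
            pos += 1
            if not parse_term():
                return False
        return True

    def parse_term():
        nonlocal pos
        if pos >= len(tokens):
            return False
        if tokens[pos] == "(":
            pos += 1
            if not parse_expr():
                return False
            if pos >= len(tokens) or tokens[pos] != ")":
                return False
            pos += 1
            return True
        if tokens[pos] in KNOWN_IDS:
            pos += 1
            return True
        return False

    return parse_expr() and pos == len(tokens)


def index_accepts_upload(metadata):
    """Hypothetical index-side check: hard-reject invalid values.

    The field stays optional, so a missing value is accepted.
    """
    expr = metadata.get("License-Expression")
    return expr is None or is_valid_expression(expr)


def build_tool_check(metadata):
    """Hypothetical build-tool check: warn if missing, fail if invalid."""
    expr = metadata.get("License-Expression")
    if expr is None:
        warnings.warn("License-Expression is missing", UserWarning)
    elif not is_valid_expression(expr):
        raise ValueError(f"invalid License-Expression: {expr!r}")
```

With this, `index_accepts_upload({"License-Expression": "MIT License"})` is rejected outright rather than producing a warning somebody has to notice, while a package with no field at all still builds and uploads.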
