PEP 639, Round 3: Improving license clarity with better package metadata

Could we please have a reference implementation in Python documented somewhere for the glob part, one that complies with the mandatory requirements of the PEP? (Maybe as an attachment, or something in the PyPA docs?)

I feel that we have departed from the original intention of “let’s document whatever the stdlib’s glob does, so that we can implement it in other languages” and moved to something that requires a lot more validation, none of which is implemented by the stdlib itself.
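To illustrate what I mean, here is a minimal sketch (my own, not from the PEP or from any PR) that passes a few patterns straight to the stdlib `glob`. The three patterns are only examples I picked; as far as I can tell none of them raises, so all rejection logic has to live in the tool calling `glob`.

```python
from glob import glob

# Example patterns chosen for illustration: an absolute path, a parent
# directory reference, and a pattern with a stray '{'.
patterns = ["/LICENSE.MIT", "../LICENSE.MIT", "LICEN{CSE*"]

for pattern in patterns:
    # glob performs no validation of its own: it returns a (possibly empty)
    # list of matches and does not raise for any of these patterns.
    matches = glob(pattern, recursive=True)
    print(f"{pattern!r} -> {type(matches).__name__} with {len(matches)} match(es)")
```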

We received something similar to the following in a contribution to setuptools: “Validate license-files glob patterns” by cdce8p, Pull Request #4841, pypa/setuptools (thanks @cdce8p)[1]

import os
import re
from glob import glob


def find_pattern(pattern: str) -> list[str]:
    """
    >>> find_pattern("/LICENSE.MIT")
    Traceback (most recent call last):
    ...
    ValueError: Pattern '/LICENSE.MIT' should be relative...
    >>> find_pattern("../LICENSE.MIT")
    Traceback (most recent call last):
    ...
    ValueError: Pattern '../LICENSE.MIT' cannot contain '..'...
    >>> find_pattern("LICEN{CSE*")
    Traceback (most recent call last):
    ...
    ValueError: Pattern 'LICEN{CSE*' contains invalid characters...
    """
    if ".." in pattern:
        raise ValueError(f"Pattern {pattern!r} cannot contain '..'")
    if pattern.startswith((os.sep, "/")) or ":\\" in pattern:
        raise ValueError(
            f"Pattern {pattern!r} should be relative and must not start with '/'"
        )
    if re.match(r'^[\w\-\.\/\*\?\[\]]+$', pattern) is None:
        raise ValueError(
            f"Pattern '{pattern}' contains invalid characters. "
            "https://packaging.python.org/en/latest/specifications/pyproject-toml/#license-files"
        )
    found = glob(pattern, recursive=True)
    if not found:
        raise ValueError(f"Pattern '{pattern}' did not match any files.")
    return found

Is it enough/complete/correct? (At first glance I would say yes, judging by the text of the PEP, but I would like a second opinion.)
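To make the question more concrete, these are the two checks I am least sure about. The sketch below is not part of the PR: `has_parent_segment` is a hypothetical helper I made up for comparison, and the `re.ASCII` variant is just one possible reading of “alphanumeric”. Whether the PEP intends either behaviour is exactly what I would like confirmed.

```python
import re

# Same character class as in find_pattern above, with and without re.ASCII.
UNICODE_WORD = re.compile(r'^[\w\-\.\/\*\?\[\]]+$')
ASCII_WORD = re.compile(r'^[\w\-\.\/\*\?\[\]]+$', re.ASCII)


def has_parent_segment(pattern: str) -> bool:
    """Hypothetical helper: True only if '..' is a whole path segment."""
    return ".." in pattern.split("/")


# The substring check used in find_pattern also flags a file name that
# merely contains two consecutive dots; the segment check does not.
print(".." in "LICENSE..txt")                  # True
print(has_parent_segment("LICENSE..txt"))      # False
print(has_parent_segment("docs/../LICENSE"))   # True

# In Python 3, \w matches non-ASCII letters unless re.ASCII is set.
print(UNICODE_WORD.match("LICENÇA.txt") is not None)  # True
print(ASCII_WORD.match("LICENÇA.txt") is not None)    # False
```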


[1] The example code is a modification of the original contribution.