Could we please have a reference implementation in Python documented somewhere for the glob part, one that complies with the mandatory requirements of the PEP? (Maybe as an attachment, or something in the PyPA docs?)
I feel that we have departed from the original intention of “let’s document whatever the stdlib’s glob does, so that we can implement it in other languages” and moved to something that requires many more validations, which are not implemented by the stdlib itself.
We received something similar to the following in a contribution to setuptools, “Validate license-files glob patterns” (pypa/setuptools pull request #4841, thanks @cdce8p)[1]:
```python
import os
import re
from glob import glob


def find_pattern(pattern: str) -> list[str]:
    """
    >>> find_pattern("/LICENSE.MIT")
    Traceback (most recent call last):
    ...
    ValueError: Pattern '/LICENSE.MIT' should be relative...
    >>> find_pattern("../LICENSE.MIT")
    Traceback (most recent call last):
    ...
    ValueError: Pattern '../LICENSE.MIT' cannot contain '..'...
    >>> find_pattern("LICEN{CSE*")
    Traceback (most recent call last):
    ...
    ValueError: Pattern 'LICEN{CSE*' contains invalid characters...
    """
    # Reject parent directory references.
    if ".." in pattern:
        raise ValueError(f"Pattern {pattern!r} cannot contain '..'")
    # Reject absolute paths, including Windows drive prefixes such as "C:\".
    if pattern.startswith((os.sep, "/")) or ":\\" in pattern:
        raise ValueError(
            f"Pattern {pattern!r} should be relative and must not start with '/'"
        )
    # Allow only word characters, '-', '.', '/', '*', '?', '[' and ']'.
    if re.match(r'^[\w\-\.\/\*\?\[\]]+$', pattern) is None:
        raise ValueError(
            f"Pattern '{pattern}' contains invalid characters. "
            "https://packaging.python.org/en/latest/specifications/pyproject-toml/#license-files"
        )
    found = glob(pattern, recursive=True)
    # A pattern that matches nothing is treated as an error.
    if not found:
        raise ValueError(f"Pattern '{pattern}' did not match any files.")
    return found
```
Is it enough/complete/correct? (At first glance I would say yes, going by the text of the PEP, but I would like a second opinion.)
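One detail that may deserve the second opinion is the `..` test. As a substring check it also rejects names such as `LICENSE..txt`, where `..` is not a path component. If the requirement is read as forbidding `..` only as a parent directory indicator (an assumption on my part, not something I can confirm from the PEP text), a component-wise check could look like the sketch below. The helper name `uses_parent_directory` is hypothetical and not part of the contribution.

```python
def uses_parent_directory(pattern: str) -> bool:
    """Return True if any '/'-separated component of the pattern is exactly '..'.

    Hypothetical helper; assumes '/' is the only path delimiter in patterns.
    """
    return ".." in pattern.split("/")


for candidate in ("../LICENSE.MIT", "docs/../LICENSE", "LICENSE..txt"):
    print(candidate, uses_parent_directory(candidate))
```

Whether the stricter substring check or this narrower one is the intended behaviour is exactly the kind of thing a reference implementation would settle.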
[1] The example code is a modification of the original contribution.