Spaces not considered a valid, verbatim character for glob patterns

I opened “`license-files` doesn't like spaces in a file name” (Issue #793, pypa/flit) against Flit for not accepting a license file with a space in the name, and @takluyver pointed out that the glob patterns specification in the Python Packaging User Guide doesn’t actually allow for it. I agree with Thomas that it’s just an oversight, but I wanted to get buy-in that this can be fixed with a PR against the spec.

Apparently these patterns come from PEP 639 (“Improving License Clarity with Better Package Metadata”). Unfortunately, it doesn’t seem to include a rationale for such a narrow definition of glob.

Looking at the linked discussion, @konstin seems to have authored the latest version (“PEP 639, Round 3: Improving license clarity with better package metadata”, post #72 by konstin). You seem to have authored the earlier version, per “PEP 639: Incorporate the latest discussion feedback” by befeleme (Pull Request #3866, python/peps).

The earlier version specifically mentioned “portability”. On one hand, spaces are traditionally known to cause problems with badly written code. On the other, I don’t think that concern really applies here.

In the context of license files, I don’t think it was “just” an oversight. There was a fair bit of discussion about what glob syntax we should allow, and this is what we ended up with. On the other hand, I could accept the argument that it was a simple mistake[1], and should be fixed.

How many build backends would need amending if we made this change? I think the spec is restrictive enough that reusing a library function (like glob.glob) wouldn’t be compliant, so people might well have written their own.

I’ll also note that spaces aren’t that special. As the spec notes, { and } are also disallowed, so a file like License{1}.txt can’t be used directly. If we say “we should have allowed spaces”, do we open the door to other “we should have allowed” arguments?
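To make the restriction concrete, here is a rough sketch of the kind of hand-rolled check a backend might use. It is not taken from Flit, PDM, or any other backend; `is_valid_pattern` and `allow_space` are hypothetical names, and it assumes “alphanumeric” means ASCII letters and digits and that a leading `/` and `..` components are rejected, as I read the spec.

```python
import string

# Assumption: "alphanumeric" is read as ASCII letters and digits only.
# Together with "_", "-" and ".", these are the characters matched verbatim.
VERBATIM = set(string.ascii_letters + string.digits + "_-.")
# Glob metacharacters plus the forward-slash path separator.
SPECIAL = set("*?[]/")


def is_valid_pattern(pattern, allow_space=False):
    """Hypothetical check: True if every character of the pattern is allowed."""
    allowed = VERBATIM | SPECIAL
    if allow_space:
        # The amendment proposed in this thread: treat a space as verbatim.
        allowed = allowed | {" "}
    # Assumption: patterns are relative, so no leading slash and no "..".
    if not pattern or pattern.startswith("/"):
        return False
    if ".." in pattern.split("/"):
        return False
    return all(ch in allowed for ch in pattern)


print(is_valid_pattern("LICENSE*"))                            # True
print(is_valid_pattern("LICENSE file.txt"))                    # False
print(is_valid_pattern("LICENSE file.txt", allow_space=True))  # True
print(is_valid_pattern("License{1}.txt", allow_space=True))    # False
```

Under these assumptions, allowing spaces is a one-character change to the allowed set, while `{` and `}` stay rejected either way.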

Having said all of the above, @brettcannon was PEP delegate for PEP 639, so I’m willing to defer to his view on what we should do here.


  1. I suspect “alphanumeric” was actually just intended as the simplest way we could find to express “we want to prohibit any weird metacharacters that might be treated as special by some obscure glob library out there”…

I think your footnote is right that the brevity of “alphanumeric” was taken more literally than expected.

Flit would need amending, PDM would not; those are all the data points I have.

I don’t think so, since spaces in file names are a regular thing on Windows, while { is not on any OS I know of.

I say amend the spec to allow spaces. I’ll open a PR.

Thanks all!