Fnmatch documentation could be improved

MarkVY · September 12, 2022, 11:28pm

I originally had a StackOverflow question about this a few months ago:

https://stackoverflow.com/questions/70207809/why-does-fnmatch-translate-in-python-anchor-to-the-end-of-the-string-but-not-the

I had a nice discussion in the comments, and I was hoping to refer to it here, but it got deleted.
I’ll reproduce most of the relevant bits. Consider the following code snippet:

from fnmatch import translate
regex = translate('someFile')

We now have a string that we can pass to the re module and stuff. Intuitively, this should be a string that matches exactly one file name, but no:

someFile: matches (of course)
someFile2: does not match (fine)
2someFile: matches! (wait, what?)
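
A minimal sketch that reproduces these three results. The post does not say which re function was used, so this assumes re.search, which looks for a match starting at any position in the string:

```python
import re
from fnmatch import translate

regex = translate('someFile')

# Assumption: the regex is applied with re.search, which scans the whole
# string for a position where the pattern matches.
results = {
    name: re.search(regex, name) is not None
    for name in ('someFile', 'someFile2', '2someFile')
}

for name, matched in results.items():
    print(f"{name}: {'matches' if matched else 'does not match'}")
```

The exact text of the returned regex differs between Python versions, so the sketch only checks whether a match is found.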

It turns out the returned regex anchors at the end, but there’s no anchor at the beginning, which one would expect by symmetry. One of the interesting things about the discussion was the discovery that this misfeature goes back to at least 1997:

github.com

python/cpython/blob/9694fc/Lib/fnmatch.py

"""Filename matching with shell patterns.

fnmatch(FILENAME, PATTERN) matches according to the local convention.
fnmatchcase(FILENAME, PATTERN) always takes case in account.

The functions operate by translating the pattern into a regular
expression.  They cache the compiled regular expressions for speed.

The function translate(PATTERN) returns a regular expression
corresponding to PATTERN.  (It does not compile it.)
"""

import re

_cache = {}

def fnmatch(name, pat):
	"""Test whether FILENAME matches PATTERN.
	
	Patterns are Unix shell style:


Thus, changing it now might have some unfortunate implications for backwards compatibility, though in theory changing the behavior to better match the docs is supposed to be “always allowed”, at least under some philosophical interpretations of the semver school of thought.

Actually I think accepting this “breaking” change should seriously be considered, because it may well be that for every piece of code out there that will break once this bug is fixed, there are 10x more pieces of code that will be SAVED from breakage in the future, or perhaps even from breakage in the present which has yet to be properly diagnosed.

But even if it is too late to change it now, it is hopefully NOT too late for one or both of the following:

The online docs and/or docstring should clearly state that we’re doing suffix matching, rather than exact matching.
An alternative function should be provided that does what one would expect.
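
One possible shape for such an alternative, purely as a sketch: matches_exactly is a hypothetical name, not part of fnmatch. It assumes the goal is that the whole file name must match the pattern, and it gets there by applying the translated regex with re.fullmatch:

```python
import re
from fnmatch import translate


def matches_exactly(name, pattern):
    """Hypothetical helper: True only if all of `name` matches `pattern`."""
    # re.fullmatch requires the match to span the entire string, so it does
    # not matter that the translated regex has no anchor at the beginning.
    return re.fullmatch(translate(pattern), name) is not None


for candidate in ('someFile', 'someFile2', '2someFile'):
    print(candidate, matches_exactly(candidate, 'someFile'))
```

For simple yes/no checks, fnmatch.fnmatchcase already gives the same answers for these three names, so a helper like this would mainly matter to code that needs the regex itself.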

(Unrelated: perhaps the warning in the docstring about “there is no way to quote metacharacters” could be replaced by this sentence from the full docs “For a literal match, wrap the meta-characters in brackets.”)
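
A small sketch of what that sentence from the full docs means in practice, using fnmatchcase so the result does not depend on the platform’s case rules. The file names are made up for illustration:

```python
from fnmatch import fnmatchcase

# '*' is a wildcard, so any run of characters may sit between 'a' and 'b'.
wildcard = fnmatchcase('aXb', 'a*b')

# '[*]' is a character class containing only '*', so it matches a literal
# asterisk and nothing else.
literal_hit = fnmatchcase('a*b', 'a[*]b')
literal_miss = fnmatchcase('aXb', 'a[*]b')

print(wildcard, literal_hit, literal_miss)
```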

Thoughts?

barry-scott · September 13, 2022, 8:21am

I expect that you did not use re.match but instead used re.search.

>>> import re
>>> from fnmatch import translate
>>> regex = translate('someFile')
>>> print(re.search(regex, '2someFile'))
<re.Match object; span=(1, 9), match='someFile'>
>>> print(re.match(regex, '2someFile'))
None

Maybe you want the docs to suggest using re.match()?

MarkVY · September 13, 2022, 2:45pm

Yes, if docs mention match vs search and why it matters in this case, that would be good enough.

MarkVY · September 20, 2022, 6:42pm

Any advice how to get this to happen?

merwok · September 20, 2022, 6:59pm

This is the developer guide to contribute to Python: Helping with Documentation

If you’re willing to make a pull request with proposed changes, I will review it.

MarkVY · September 20, 2022, 7:12pm

Might be difficult. Github login is blocked at work.

merwok · September 20, 2022, 7:24pm

Could you do it fully separate from work time and hardware?
Then it would be your own contribution, needing only your personal contributor agreement.

MarkVY · September 20, 2022, 7:36pm

If I don’t forget or get distracted or something, yes.
