Fnmatch documentation could be improved

I originally had a StackOverflow question about this a few months ago:

https://stackoverflow.com/questions/70207809/why-does-fnmatch-translate-in-python-anchor-to-the-end-of-the-string-but-not-the

I had a nice discussion in the comments, and I was hoping to refer to it here, but it got deleted.
I’ll reproduce most of the relevant bits. Consider the following code snippet:

from fnmatch import translate
regex = translate('someFile')

We now have a string that we can pass to the re module and stuff. Intuitively, this should be a string that matches exactly one file name, but no:

  • someFile: matches (of course)
  • someFile2: does not match (fine)
  • 2someFile: matches! (wait, what?)

It turns out the returned regex anchors at the end, but there’s no anchor at the beginning, which one would expect by symmetry. One of the interesting things about the discussion was the discovery that this misfeature goes back to at least 1997:

Thus, changing it now might have some unfortunate implications for backwards compatibility, though in theory changing the behavior to better match the docs is supposed to be “always allowed”, at least under some philosophical interpretations of the semver school of thought.

Actually I think accepting this “breaking” change should seriously be considered, because it may well be that for every piece of code out there that will break once this bug is fixed, there are 10x more pieces of code that will be SAVED from breakage in the future, or perhaps even from breakage in the present which has yet to be properly diagnosed.

But even if it is too late to change it now, it is hopefully NOT too late for one or both of the following:

  • The online docs and/or docstring should clearly state that we’re doing suffix matching, rather than exact matching.
  • Provide an alternative function that does what one would expect.
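
To make the second option concrete, here is a minimal sketch of what such an alternative could look like. The name glob_fullmatch is hypothetical and not part of fnmatch; it only assumes that re.fullmatch is applied to the string returned by translate, so the pattern has to cover the whole name.

```python
import re
from fnmatch import translate


def glob_fullmatch(pattern, name):
    """Hypothetical helper: True only if `pattern` matches all of `name`."""
    # re.fullmatch requires the match to start at the first character and
    # end at the last one, so a missing start anchor in the translated
    # regex no longer matters.
    return re.fullmatch(translate(pattern), name) is not None


exact = glob_fullmatch('someFile', 'someFile')      # whole name matches
suffix = glob_fullmatch('someFile', '2someFile')    # extra leading character
prefix = glob_fullmatch('someFile', 'someFile2')    # extra trailing character
wildcard = glob_fullmatch('*.txt', 'notes.txt')     # ordinary glob still works
```

With this sketch only the first and last calls should report a match. Note that fnmatch.fnmatchcase already behaves this way for a single name, so a helper like this mainly matters when working with the translated regex directly.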

(Unrelated: perhaps the warning in the docstring about “there is no way to quote metacharacters” could be replaced by this sentence from the full docs “For a literal match, wrap the meta-characters in brackets.”)
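
For reference, the bracket trick from the full docs can be sketched like this. The file names are made up for illustration; fnmatchcase is used so the result does not depend on the platform's case rules.

```python
from fnmatch import fnmatchcase

# '[*]' is a character set containing only '*', so it matches a literal
# asterisk instead of acting as a wildcard.
literal_star = fnmatchcase('a*b', 'a[*]b')
not_a_wildcard = fnmatchcase('axb', 'a[*]b')

# The same idea works for '?'.
literal_qmark = fnmatchcase('what?', 'what[?]')
not_any_char = fnmatchcase('whatx', 'what[?]')
```

Here literal_star and literal_qmark should be True, while the other two should be False.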

Thoughts?

I expect that you did not use re.match but instead used re.search.

:>>> print(re.search(regex, '2someFile'))
<re.Match object; span=(1, 9), match='someFile'>
:>>> print(re.match(regex, '2someFile'))
None
:>>> print(re.search(regex, '2someFile'))
<re.Match object; span=(1, 9), match='someFile'>
:>>>

Maybe you want the docs to suggest using re.match()?

Yes, if docs mention match vs search and why it matters in this case, that would be good enough.
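
A small sketch of the difference, using the three names from the original post. It only assumes the standard re and fnmatch modules; the list names are made up for illustration.

```python
import re
from fnmatch import translate, fnmatchcase

regex = translate('someFile')
names = ['someFile', 'someFile2', '2someFile']

# re.search scans for a match starting anywhere in the string, so the
# end anchor alone turns this into a suffix test.
search_hits = [n for n in names if re.search(regex, n)]

# re.match only tries position 0, which supplies the missing start anchor.
match_hits = [n for n in names if re.match(regex, n)]

# fnmatchcase applies the pattern to the whole name for you.
fnmatch_hits = [n for n in names if fnmatchcase(n, 'someFile')]

print(search_hits)   # ['someFile', '2someFile']
print(match_hits)    # ['someFile']
print(fnmatch_hits)  # ['someFile']
```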

Any advice how to get this to happen?

This is the developer guide to contribute to Python: Helping with Documentation

If you’re willing to make a pull request with proposed changes, I will review it.

Might be difficult. Github login is blocked at work.

Could you do it fully separate from work time and hardware?
Then it would be your own contribution, needing only your personal contributor agreement.

If I don’t forget or get distracted or something, yes.