I originally asked a StackOverflow question about this a few months ago.
There was a nice discussion in the comments, and I was hoping to refer to it here, but it got deleted.
I’ll reproduce most of the relevant bits. Consider the following code snippet:
from fnmatch import translate
regex = translate('someFile')
We now have a regex string that we can pass to the re module, for example to re.search. Intuitively, this should be a pattern that matches exactly one file name, but no:
- someFile: matches (of course)
- someFile2: does not match (fine)
- 2someFile: matches! (wait, what?)
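The three cases above can be checked with a short script. This is a minimal sketch that assumes the pattern is handed to re.search; re.match only tries position 0, so it hides the missing leading anchor.

```python
import re
from fnmatch import translate

regex = translate('someFile')

# re.search scans for a match anywhere in the string, so only the
# trailing anchor produced by translate() constrains the result.
results = {
    name: re.search(regex, name) is not None
    for name in ('someFile', 'someFile2', '2someFile')
}
print(results)

# re.match anchors at the start on its own, so the same pattern
# rejects '2someFile' here.
match_result = re.match(regex, '2someFile') is not None
print(match_result)
```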
It turns out the returned regex anchors at the end, but there’s no anchor at the beginning, which one would expect by symmetry. One of the interesting things about the discussion was the discovery that this misfeature goes back to at least 1997.
Thus, changing it now might have some unfortunate implications for backwards compatibility, though in theory changing the behavior to better match the docs is supposed to be “always allowed”, at least under some philosophical interpretations of the semver school of thought.
Actually I think accepting this “breaking” change should seriously be considered, because it may well be that for every piece of code out there that will break once this bug is fixed, there are 10x more pieces of code that will be SAVED from breakage in the future, or perhaps even from breakage in the present which has yet to be properly diagnosed.
But even if it is too late to change it now, it is hopefully NOT too late for one or both of the following:
- The online docs and/or docstring should clearly state that we’re doing suffix matching, rather than exact matching.
- There should be an alternative function that does what one would expect.
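As a rough illustration of the second option, here is one way such an alternative could look. The name translate_exact is hypothetical and not part of fnmatch; the sketch simply prepends a start anchor to whatever translate returns.

```python
import re
from fnmatch import translate


def translate_exact(pattern):
    """Hypothetical helper: like fnmatch.translate, but anchored at both ends."""
    # \A matches only at the very start of the string; translate()
    # already anchors the end of the pattern.
    return r'\A' + translate(pattern)


exact = re.compile(translate_exact('someFile'))
exact_hit = exact.search('someFile') is not None
prefixed_hit = exact.search('2someFile') is not None
wildcard_hit = re.search(translate_exact('*.txt'), 'notes.txt') is not None
print(exact_hit, prefixed_hit, wildcard_hit)
```

Callers who control the matching call can get the same effect today by using re.fullmatch (or re.match) with the unmodified output of translate.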
(Unrelated: perhaps the warning in the docstring about “there is no way to quote metacharacters” could be replaced by this sentence from the full docs “For a literal match, wrap the meta-characters in brackets.”)
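For reference, the bracket trick from the full docs can be seen in a small example. It uses fnmatchcase so the result does not depend on the platform's case normalization.

```python
from fnmatch import fnmatchcase

# '[*]' is a character class containing only '*', so it matches a literal star.
literal_star = fnmatchcase('a*b', 'a[*]b')
# 'x' is not a literal '*', so the quoted pattern rejects it.
quoted_rejects = fnmatchcase('axb', 'a[*]b')
# An unquoted '*' is still a wildcard.
wildcard = fnmatchcase('axb', 'a*b')
print(literal_star, quoted_rejects, wildcard)
```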
Thoughts?