glob.glob('**/foo') should return "foo", changed my mind (SOLVED :-)

I believe there is a problem with how “glob.glob('**/foo')” behaves if “foo” exists in the current directory: I would expect “foo” to be included in the results, but currently (3.12.2) it is not.

However:

  • The glob documentation says: "the pattern “**” will match any files and zero or more directories". Emphasis mine. I don’t fully understand what the documentation is saying here, but zero directories in “**/foo” is “foo”.
  • “pathlib.Path('.').glob('**/foo')” DOES return “foo” in the results.
  • rsync also seems to return “foo”. The rsync documentation is clearer, saying “a ‘**’ matches zero or more characters, including slashes”.

Here is an example that shows what I mean:

#!/usr/bin/env python3

from pathlib import Path
import glob
import os

os.chdir('/etc')

print(list(glob.glob('**/services')))
assert 'services' not in glob.glob('**/services')

print(list(glob.glob('./**/services')))
assert 'services' not in glob.glob('./**/services')

print(list(Path('.').glob('**/services')))
assert 'services' in [x.name for x in Path('.').glob('**/services')]

I would expect all three prints to show basically the same thing (a list including the file “./services”). However, as the assertions show, the first two do not.

That said, this could be a controversial change, since anyone using “**/foo” may be relying on the current behavior, but I would argue that the documentation shows that use to be in error.

Alternatively, the documentation should be clarified to explain that “**” semantics don’t match rsync, in that “**/foo” does not find “foo” in the current directory. I’d also take a swing at clarifying the documentation, since I can’t really understand it.

Sean

The full line in the documentation is:

If recursive is true, the pattern “**” will match any files and zero or more directories, subdirectories and symbolic links to directories.

Emphasis mine.

You need to pass recursive=True when calling glob in order for ** to match zero or more directories. It’s False by default:

glob.glob(pathname, *, root_dir=None, dir_fd=None, recursive=False, include_hidden=False)

Man, rookie mistake on my part, you are absolutely right.

I got confused because “**/foo” was working in my typical use case: the file was just in some subdirectory, so it “felt” recursive when instead it was just “any” single directory, the same as “*/foo”.

Thanks!