Find directory without using os.walk or recursion

I have this directory tree:
(top)–>(c , cpp, python)
(top/c)–>(other_courses)
(top/c/other_courses)–>(cpp,python)
(top/cpp)–>(other_courses)
(top/cpp/other_corses)–>(c,python)
(top)–>(python)
I want to find all directories by the name python. I do not want to use os.walk() and I do not want to use recursion as in the code commented out below. Also I want to print Not found in case folder does not exist ( my code prints not found many times!).

import os 
def findFolder(rootDir,folder): 
   
    list_dirs = os.walk(rootDir) 
    for root, dirs, files in list_dirs:
        
        for d in dirs:
           p=os.path.join(root, d) 
           #print(p)
           if folder in p:
                   print(p )      
        
"""for lists in os.listdir(rootDir): 
        path = os.path.join(rootDir, lists) 
        if 'python' in path:
             f=True
             print (path) 
        if os.path.isdir(path): 
            findFolder(rootDir,folder)
    if not f:
       print('not')"""
findFolder('root','python')```

without using os.walk or recursion

Why?

Anything that can be done with recursion, you can equivalently do by keeping a list of the things you still haven’t checked. But, as Stefan asks: why? Recursion is the simplest, clearest, and most logical way to spell this. Is there a special restriction that makes recursion impractical?

I tried to do plain for loop without recursion but could not get it right.

I don’t see how that’s supposed to answer our questions.

Yes I agree with you about recursion. But what about if stack overflow occurs? Any way for me iterative is more intuitive.

That’s a valid concern, at least in theory. But do you have directory structures so deep that this becomes an issue? And what about os.walk?

The exercise I am trying to solve is before learning walk. So essentially what it is asking is create your own walk and use it

That’s a very reasonable concern, but do be aware that Python’s default recursion limit is one thousand. That’s a LOT of directory levels. If you really need more than that, it’s certainly possible to avoid recursion, but it’s far less clean and convenient.

Then I have some great news for you: today, you have stumbled upon the exercise that’s going to teach you how recursion works! :slight_smile: Seriously though, this is a really superb lesson in recursion, because directory trees are inherently recursive (a directory can contain other directories, and walking a small part of a directory tree is the same as walking a large part of one, you just need to do more/less of the same work) and a recursive solution to the problem will make the most sense.

We’re programmers. When we’re faced with a problem, we learn new tools and strategies, and keep right on going!

7 Likes

Well the basic deal looks like this:

 def flattened(thing):
     pending = [thing]
     while pending:
         obj = pending.pop()
         ... do whatever with obj ...
         for child in obj.its_children():
             if you_would_recurse_into(child):
                 pending.append(child)

So that you have a queue (pending) of things needing processing. Pull
the first thing off the queue, process it. If it has children you would
have processed by recursion, just appending them to the queue. Repeat
until the queue is empty.

For your walk() replacement, thing would be the top directory name.
The children can be obtained with os.listdir() and any children which
are directories can be appended to the queue.

Whether you pull from the start or end of the queue affects the order,
as does what order you append things to the queue. And in principle the
queue count be any container such as a set or heap as long as you can
pull a single item from it.

Cheers,
Cameron Simpson cs@cskk.id.au

Thanks Christ but wait. It seems we are stuck to recursion because os.walk is recursive. So just for the fun of it can we come up with pure iterative solution without ever using os module?

thanks Cameron that is exactly what I did. I modified source code of os.walk and made it leaner and used as shown below. can please take a look and make better and also print not found when folder does not exist?

import os
import os.path as path
def mywalk(top,folder, topdown=True):#, onerror=None, followlinks=False):
    
   
    islink, join, isdir = path.islink, path.join, path.isdir

    names=os.listdir(top)

    dirs, nondirs = [], []
    for name in names:
        if isdir(join(top, name)):
            dirs.append(name)
        else:
            nondirs.append(name)
    
    if topdown:
       yield top, dirs, nondirs
    for name in dirs:
        new_path = join(top, name)
        
        for x in mywalk(new_path,folder):#, topdown, onerror, followlinks):
                yield x
    #I NEED TO PRINT not found if folder not found at the end
    #I don't know  if folder not in dirs??
p = "./root"
rt=''
folder='python'
for root,d_names,f_names in mywalk(p,folder,topdown=False):
     rt=root
     if folder in rt:
        print('....',rt) 
        rt=''
    

        


thanks Cameron that is exactly what I did. I modified source code of
os.walk and made it leaner and used as shown below. can please take a
look and make better and also print not found when folder does not
exist?

import os
import os.path as path
def mywalk(top,folder, topdown=True):#, onerror=None, followlinks=False):
   islink, join, isdir = path.islink, path.join, path.isdir

You can do this in the import:

 from os.path import islink, join, isdir

Personally I usually write this as:

 from os.path import islink, joinpath, isdirpath

basicly because “join” is a very generic name.

   names=os.listdir(top)
   dirs, nondirs = [], []
   for name in names:
       if isdir(join(top, name)):
           dirs.append(name)
       else:
           nondirs.append(name)
   if topdown:
      yield top, dirs, nondirs
   for name in dirs:
       new_path = join(top, name)
       for x in mywalk(new_path,folder):#, topdown, onerror, followlinks):

This is a recursive call - you’re calling mywalk from within mywalk.
The iterative solution keeps a collection of outstanding directories to
walk and iterates over that.

Thanks Cameron what about adding “not found” to mywalk function? is there a way to do that?

You criteria for “not found” are not clear to me.

The general approach is to set a flag, something like this:

 found = False
 for item in items:
     if this_is_what_we_wanted(item):
         found = True
         # often you might only need to find the first matching item
         # so we might break out of the loop at this point;
         # if you need to do things to _all_ matching items,
         # do not break
         break
 if not found:
     print("no matching item found")