Find directory without using os.walk or recursion

kobosh · January 22, 2024, 6:54pm

I have this directory tree:
(top)–>(c , cpp, python)
(top/c)–>(other_courses)
(top/c/other_courses)–>(cpp,python)
(top/cpp)–>(other_courses)
(top/cpp/other_corses)–>(c,python)
(top)–>(python)
I want to find all directories by the name python. I do not want to use os.walk() and I do not want to use recursion as in the code commented out below. Also I want to print Not found in case folder does not exist ( my code prints not found many times!).

import os 
def findFolder(rootDir,folder): 
   
    list_dirs = os.walk(rootDir) 
    for root, dirs, files in list_dirs:
        
        for d in dirs:
           p=os.path.join(root, d) 
           #print(p)
           if folder in p:
                   print(p )      
        
"""for lists in os.listdir(rootDir): 
        path = os.path.join(rootDir, lists) 
        if 'python' in path:
             f=True
             print (path) 
        if os.path.isdir(path): 
            findFolder(rootDir,folder)
    if not f:
       print('not')"""
findFolder('root','python')```

Stefan2 · January 22, 2024, 6:56pm

without using os.walk or recursion

Why?

Rosuav · January 22, 2024, 7:11pm

Anything that can be done with recursion, you can equivalently do by keeping a list of the things you still haven’t checked. But, as Stefan asks: why? Recursion is the simplest, clearest, and most logical way to spell this. Is there a special restriction that makes recursion impractical?

kobosh · January 22, 2024, 8:09pm

I tried to do plain for loop without recursion but could not get it right.

Stefan2 · January 22, 2024, 8:58pm

I don’t see how that’s supposed to answer our questions.

kobosh · January 22, 2024, 9:21pm

Yes I agree with you about recursion. But what about if stack overflow occurs? Any way for me iterative is more intuitive.

Stefan2 · January 22, 2024, 9:28pm

That’s a valid concern, at least in theory. But do you have directory structures so deep that this becomes an issue? And what about os.walk?

kobosh · January 22, 2024, 9:47pm

The exercise I am trying to solve is before learning walk. So essentially what it is asking is create your own walk and use it

Rosuav · January 22, 2024, 10:12pm

That’s a very reasonable concern, but do be aware that Python’s default recursion limit is one thousand. That’s a LOT of directory levels. If you really need more than that, it’s certainly possible to avoid recursion, but it’s far less clean and convenient.

Then I have some great news for you: today, you have stumbled upon the exercise that’s going to teach you how recursion works! Seriously though, this is a really superb lesson in recursion, because directory trees are inherently recursive (a directory can contain other directories, and walking a small part of a directory tree is the same as walking a large part of one, you just need to do more/less of the same work) and a recursive solution to the problem will make the most sense.

We’re programmers. When we’re faced with a problem, we learn new tools and strategies, and keep right on going!

cameron · January 22, 2024, 10:19pm

Well the basic deal looks like this:

 def flattened(thing):
     pending = [thing]
     while pending:
         obj = pending.pop()
         ... do whatever with obj ...
         for child in obj.its_children():
             if you_would_recurse_into(child):
                 pending.append(child)

So that you have a queue (pending) of things needing processing. Pull
the first thing off the queue, process it. If it has children you would
have processed by recursion, just appending them to the queue. Repeat
until the queue is empty.

For your walk() replacement, thing would be the top directory name.
The children can be obtained with os.listdir() and any children which
are directories can be appended to the queue.

Whether you pull from the start or end of the queue affects the order,
as does what order you append things to the queue. And in principle the
queue count be any container such as a set or heap as long as you can
pull a single item from it.

Cheers,
Cameron Simpson cs@cskk.id.au

kobosh · January 23, 2024, 2:33pm

Thanks Christ but wait. It seems we are stuck to recursion because os.walk is recursive. So just for the fun of it can we come up with pure iterative solution without ever using os module?

kobosh · January 23, 2024, 3:17pm

thanks Cameron that is exactly what I did. I modified source code of os.walk and made it leaner and used as shown below. can please take a look and make better and also print not found when folder does not exist?

import os
import os.path as path
def mywalk(top,folder, topdown=True):#, onerror=None, followlinks=False):
    
   
    islink, join, isdir = path.islink, path.join, path.isdir

    names=os.listdir(top)

    dirs, nondirs = [], []
    for name in names:
        if isdir(join(top, name)):
            dirs.append(name)
        else:
            nondirs.append(name)
    
    if topdown:
       yield top, dirs, nondirs
    for name in dirs:
        new_path = join(top, name)
        
        for x in mywalk(new_path,folder):#, topdown, onerror, followlinks):
                yield x
    #I NEED TO PRINT not found if folder not found at the end
    #I don't know  if folder not in dirs??
p = "./root"
rt=''
folder='python'
for root,d_names,f_names in mywalk(p,folder,topdown=False):
     rt=root
     if folder in rt:
        print('....',rt) 
        rt=''

cameron · January 24, 2024, 12:08am

thanks Cameron that is exactly what I did. I modified source code of
os.walk and made it leaner and used as shown below. can please take a
look and make better and also print not found when folder does not
exist?
import os
import os.path as path
def mywalk(top,folder, topdown=True):#, onerror=None, followlinks=False):
   islink, join, isdir = path.islink, path.join, path.isdir

You can do this in the import:

 from os.path import islink, join, isdir

Personally I usually write this as:

 from os.path import islink, joinpath, isdirpath

basicly because “join” is a very generic name.

   names=os.listdir(top)
   dirs, nondirs = [], []
   for name in names:
       if isdir(join(top, name)):
           dirs.append(name)
       else:
           nondirs.append(name)
   if topdown:
      yield top, dirs, nondirs
   for name in dirs:
       new_path = join(top, name)
       for x in mywalk(new_path,folder):#, topdown, onerror, followlinks):

This is a recursive call - you’re calling mywalk from within mywalk.
The iterative solution keeps a collection of outstanding directories to
walk and iterates over that.

kobosh · January 24, 2024, 4:01pm

Thanks Cameron what about adding “not found” to mywalk function? is there a way to do that?

cameron · January 24, 2024, 10:19pm

You criteria for “not found” are not clear to me.

The general approach is to set a flag, something like this:

 found = False
 for item in items:
     if this_is_what_we_wanted(item):
         found = True
         # often you might only need to find the first matching item
         # so we might break out of the loop at this point;
         # if you need to do things to _all_ matching items,
         # do not break
         break
 if not found:
     print("no matching item found")

Topic		Replies	Views
Writing recursion functions Python Help	1	334	October 14, 2022
Check all folders if empty or not using python code Python Help help	4	1084	May 25, 2023
Missing variable assignment Python Help	11	368	December 29, 2023
Help understand code, beginner Python Help help	4	298	February 27, 2024
Adding the method find() to list Ideas	9	7839	May 8, 2020

Find directory without using os.walk or recursion

Related Topics