I have a python source directory containing compiled python3.10 along with the rest of the libraries needed to build python. I wrote two functions, one synchronous and the other asynchronous to measure how long it took for both to traverse the source directory and count the number of files therein basing on their file extensions.
The async function was magnitudes of times faster than the sync function, but I realized later that the traversal for the async function was incomplete. It only traversed the first directories and their children but not the children’s children and children’s children’s children and so forth. Here is the code I wrote for the async and sync functions, hopefully anyone here can help me diagnose the issue as I’m looking around as well.
synchronous function
from pathlib import Path
files = {}
def check (folder):
for file in folder.iterdir():
if file.is_dir():
print("checking: ",str(file)) #added this to track the folders being traversed
check(file)
else:
ext = file.suffix
if ext in files:
files[ext] += 1
else:
files[ext] = 1
check(Path(r"path-to-python-built-directory"))
asynchronous function
import asyncio
from pathlib import Path
files = {}
async def check (folder):
for file in folder.iterdir():
if file.is_dir():
print("checking: ",str(file)) #added this to track folders being traversed
asyncio.create_task(check(file))
else:
ext = file.suffix
if ext in files:
files[ext] += 1
else:
files[ext] = 1
async def main ():
await check(Path(r"path-to-python-built-directory"))
asyncio.run(main())