POST AT DISCUSS.PYTHON.ORG
I have downloaded many webpages as HTML and associated folder. The folder has the same name as HTML file, except it has an additional “_files” at the end. For example,
sample.html
sample_files
I’d like to move both of these into a folder named sample
I’d like to permanently associate these two objects with each other by placing them in the same folder. A new folder will be created for each matching pair, taking the same name as the html file without the extension.
HTML file and _files directory should be in the same folder. If one is found in a folder but not the other, then no new directory should be created.
I have a bash script to put each html file in its own directory, but not the associated folder, as follows . . .
find . -type f -name '*.rtf' -exec sh -c '
for f; do mkdir -p -- "${f%.*}" && mv -v -- "$f" "${f%.*}" ; done' _ {} +
I am learning python and find bash more difficult, I prefer a python solution (but either is acceptable).
I can os-walk through the top-level folder containing all the other directories and HTML files. But I get mixed up when it comes to placing these two objects in a parent folder. I seem to be operating on two levels at one time.
This is the code that didn’t work.
#!/usr/bin/env python
import os
import string
import shutil
from os.path import splitext
from pathlib import Path
##=## RUN FROM FOLDER WITH ITEMS TO BE COMPRESSED ##=##
for root, dirs, files in os.walk('/Volumes/HighSierra/Users/ericlindell/Documents/testTTS-combine/'):
# CHECK ALL FILES IN FOLDER FOR MP3 EXTENSION
for checkFileHTML in files:
# IF IT IS AN HTML FILE
if checkFileHTML.endswith('.HTML'):
HTMLFile = checkFileHTML
HTMLFileNoExt = checkFileHTML[:-5]
# ITERATE OVER FOLDERS
for checkDir in dirs:
# FIND DIR ENDING _files
if checkDir.endswith('_files'):
checkDirAbbrev = checkDir[:-6]
if checkDirAbbrev == HTMLFileNoExt:
os.makedirs(checkDirAbbrev)
# MOVE ZIP INTO NEWDIR
filePath = os.path.join(root, checkHTML)
shutil.move(filePath, HTMLFileNoExt)
# MOVE HTML FILE INTO NEWDIR
filePath = os.path.join(root, checkDirAbbrev)
shutil.move(filePath, checkDirAbbrev)
Any help much appreciated !!