My problem here is that there doesn’t seem to be any way to find out what has gone wrong. I’m relatively new to programming by the way.
So basically, I have made a simple script that recursively processes all the files under a directory. What I’ve included below is a simplified version that reproduces the same problem. I’m happy to include the full script if that would be better. When I run the code below on almost any directory, it works perfectly. However, the script was designed to work on one specific directory, which is full of files nested inside a maze of subdirectories. It has worked flawlessly on that directory until now and has processed thousands of files, but it seems to have encountered a specific file that causes a crash.
The details are as follows:
When I run my script, or when I run the snippet below, and import_path is set to this specific directory, the program crashes with nothing but the word ‘Killed’.
The problem directory has been working for a long time until now, which leads me to believe that the problem is caused by a specific file.
I can’t think of a way to get the path of that file in order to delete it.
I’m running these programs from the command line on my Debian 12 PC.
import os
slots = 1
import_path = '/media/user/60/to_import'
def process2(address):
	print(address)
def process(address):
	if address.endswith(('.jpg', '.png', '.jpeg')):
		process2(address)
		global slots
		slots = slots - 1
def list_files_scandir(path):
	with os.scandir(path) as entries:
		for entry in entries:
			if slots > 0:
				if entry.is_file():
					process(entry.path)
				elif entry.is_dir():
					list_files_scandir(entry.path)
			else:
				break
list_files_scandir(import_path)
I would be grateful for any insight.
Edit: I have fixed a typo, and included the full script below:
from PIL import Image
import os
import sqlite3
import hashlib
import shutil
max_files = 100
import_path = '/home/user/sort-qimgv/import'
import_files = os.listdir(import_path)
imported = 0
skipped = 0
db_path = "/home/user/sort-qimgv/database.db"
def hash(filename):
	with open(filename, 'rb', buffering=0) as f:
		return hashlib.file_digest(f, 'sha256').hexdigest()
def thumbnail(file):
	WIDTH = 1920
	HEIGHT = 1280
	img = Image.open(file)
	img.thumbnail((WIDTH, HEIGHT))
	img.save(file)
if len(import_files) == 0: 
	raise Exception("No files to import.")
else: 
	print("Files found.")
# How many slots are free?
conn = sqlite3.connect(db_path)
cur = conn.cursor()
cur.execute("SELECT * FROM `index` WHERE location=0")
rows = cur.fetchall()
slots = max_files - len(rows)
conn.close()
print(str(slots) + ' slots.')
def import_file(file):
	hash_first = hash(file)
	conn = sqlite3.connect(db_path)
	cur = conn.cursor()
	cur.execute("SELECT * FROM hash WHERE hash=?", (hash_first,))
	rows = cur.fetchall()
	conn.close()
	if len(rows) > 0:
		global skipped
		skipped = skipped + 1
		os.remove(file)
		return
	conn = sqlite3.connect(db_path)
	cur = conn.cursor()
	cur.execute("insert into hash (hash) values (?)", (hash_first,))
	conn.commit()
	conn.close()
	if os.path.getsize(file) > 1000000:
		thumbnail(file)
		hash_second = hash(file)
		conn = sqlite3.connect(db_path)
		cur = conn.cursor()
		cur.execute("insert into hash (hash) values (?)", (hash_second,))
		conn.commit()
		conn.close()
	# find a suitable id
	conn = sqlite3.connect(db_path)
	cur = conn.cursor()
	cur.execute("SELECT MIN(id) + 1 FROM `index` WHERE id + 1 NOT IN (SELECT id FROM `index`)")
	rows = cur.fetchall()
	conn.close()
	value = rows[0][0]
	if value is None:
		newid = 1
	else:
		newid = value
	conn = sqlite3.connect(db_path)
	cur = conn.cursor()
	cur.execute("insert into `index` values (?, 0)", (newid,))
	cur.execute("insert into old_name values (?, ?)", (newid, str(file)))
	conn.commit()
	conn.close()
	new_destination = "/home/user/sort-qimgv/sort/" + str(newid) + os.path.splitext(file)[1]
	shutil.move(file, new_destination)
	global imported, slots
	imported = imported + 1
	slots = slots - 1
def process(address):
	if address.endswith(('.jpg', '.png', '.jpeg')):
		print(address)
		import_file(address)
def list_files_scandir(path):
	with os.scandir(path) as entries:
		for entry in entries:
			if slots > 0:
				if entry.is_file():
					process(entry.path)
				elif entry.is_dir():
					list_files_scandir(entry.path)
			else:
				break
list_files_scandir(import_path)
print(str(imported) + ' imported, ' + str(skipped) + ' skipped.') 
Edit 2:
I’ve realised that a lot of what I put in the snippet was extraneous. This is actually all I need to reproduce the bug:
user@C1:~/sort-qimgv$ cat test2.py
import os
import_path = '/media/user/60/to_import'
def list_files_scandir(path):
	with os.scandir(path) as entries:
		for entry in entries:
			if entry.is_file():
				print(entry.path)
			elif entry.is_dir():
				list_files_scandir(entry.path)
list_files_scandir(import_path)
user@C1:~/sort-qimgv$ python3 test2.py
Killed
user@C1:~/sort-qimgv$
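For context, a bare ‘Killed’ is what the shell prints when a process is ended by SIGKILL, so Python never gets the chance to raise an exception or print a traceback. One way to confirm that from Python is to launch the script as a child process and inspect its return code. This is only a sketch: `describe_exit` and `run_and_report` are names made up for illustration, and it assumes a POSIX system, where `subprocess` reports death by a signal as a negative return code.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a readable description."""
    if returncode < 0:
        # On POSIX, a negative return code means the child was ended by
        # the signal with that number (for example -9 for SIGKILL).
        return "terminated by " + signal.Signals(-returncode).name
    return "exited with status " + str(returncode)


def run_and_report(script_path):
    """Run a Python script as a child process and describe how it ended."""
    # Use the same interpreter; the child's output passes straight through.
    result = subprocess.run([sys.executable, script_path])
    return describe_exit(result.returncode)


# Hypothetical usage with the script from this post:
# print(run_and_report("test2.py"))
```

This only tells me which signal ended the process, not who sent it or why. A SIGKILL often comes from the kernel when the machine runs out of memory, but that is an assumption I have not verified for this case.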
Edit 3:
Well, this is kind of embarrassing… What was reliably going wrong yesterday has miraculously fixed itself today, and I have done absolutely nothing except restart my computer.
So I’m totally nonplussed about what was going on, and am now unable to investigate further.
I think it must have had something to do with the way the drive was mounted, given the spontaneous recovery after a restart. The reason I ruled that out before was that all the other directories on that same drive didn’t have that problem. I also mounted it this time in exactly the same way I always do, using Thunar. Anyway, that’s the end. Sorry for the anticlimax. I’m happy to answer any questions, but obviously I can’t do any testing unless it happens again.
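If it does happen again, one thing I could try is a tracing version of the walker, since my original problem was not knowing which path the crash happened on. The following is only a sketch under some assumptions: `walk_with_trace`, `on_file` and `max_depth` are names invented for illustration, and it assumes the crash happens while scanning one particular directory. It prints every directory to stderr before opening it and flushes immediately, so the last line shown before ‘Killed’ should be the directory being scanned at that moment. Unlike my original code, it does not follow symlinks and it stops descending past a depth limit, in case the cause is some kind of loop in the directory tree.

```python
import os
import sys


def walk_with_trace(path, on_file, depth=0, max_depth=50):
    """Recursively visit files under path, announcing each directory first."""
    # Print before scanning and flush straight away, so this line is
    # already on screen if the process is killed during the scan.
    print(f"[dir depth={depth}] {path}", file=sys.stderr, flush=True)
    if depth > max_depth:
        print(f"[skip] too deep: {path}", file=sys.stderr, flush=True)
        return
    try:
        with os.scandir(path) as entries:
            for entry in entries:
                # follow_symlinks=False is a deliberate change from the
                # original snippet, to avoid walking into symlink loops.
                if entry.is_file(follow_symlinks=False):
                    on_file(entry.path)
                elif entry.is_dir(follow_symlinks=False):
                    walk_with_trace(entry.path, on_file, depth + 1, max_depth)
    except OSError as exc:
        # Report unreadable directories instead of stopping the whole walk.
        print(f"[error] {path}: {exc}", file=sys.stderr, flush=True)


# Hypothetical usage with the directory from this post:
# walk_with_trace('/media/user/60/to_import', print)
```

Passing `print` as `on_file` reproduces what test2.py does. Passing a function that does nothing would show only the directory trace, which might be easier to read on a large tree.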