This:
backups = sorted(
    [f for f in os.listdir(backup_dir) if f.startswith(os.path.splitext(filename)[0]) and f.endswith(os.path.splitext(filename)[1])],
    key=lambda f: os.path.getctime(os.path.join(backup_dir, f))
)
packs too much into one line.
It can be simplified to:
backups = [f for f in os.listdir(backup_dir) if f.startswith(os.path.splitext(filename)[0]) and f.endswith(os.path.splitext(filename)[1])]
backups = sorted(
    backups,
    key=lambda f: os.path.getctime(os.path.join(backup_dir, f))
)
and then to:
backups = [f for f in os.listdir(backup_dir) if f.startswith(os.path.splitext(filename)[0]) and f.endswith(os.path.splitext(filename)[1])]
backups.sort(key=lambda f: os.path.getctime(os.path.join(backup_dir, f)))
Unfortunately, the first line has a subtle problem.
Suppose you have two files, “foo.txt” and “foobar.txt”.
Backing them up could lead to backups called, say, “foo_20241223_171418.txt” and “foobar_20241223_172025.txt”.
Now look at what the retention code is doing: f.startswith(os.path.splitext(filename)[0]) and f.endswith(os.path.splitext(filename)[1]).
When you back up “foo.txt”, it asks whether “foo_20241223_171418.txt” starts with “foo”. It does.
But it also asks whether “foobar_20241223_172025.txt” starts with “foo”. It does.
So it treats “foobar_20241223_172025.txt” as though it’s a backup of “foo.txt”.
That’s not what you want.
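A quick check makes the false positive concrete. This is only a minimal sketch: it uses a hard-coded list of names in place of os.listdir(backup_dir), and the helper name matches_prefix_and_ext is hypothetical, wrapping the original condition so it can be called directly:

```python
import os

def matches_prefix_and_ext(f, filename):
    # Hypothetical wrapper around the original condition, for illustration only.
    base, ext = os.path.splitext(filename)
    return f.startswith(base) and f.endswith(ext)

# Stand-in for os.listdir(backup_dir); the names come from the example above.
names = ["foo_20241223_171418.txt", "foobar_20241223_172025.txt"]

matched = [f for f in names if matches_prefix_and_ext(f, "foo.txt")]
print(matched)  # both names are printed, including the backup of foobar.txt
```

Because both names pass the test, retention for “foo.txt” could end up deleting a backup of “foobar.txt”.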
You want it to check whether the name of the backup file is the name of the original file plus a timestamp plus the extension. It's best to put that test into a function to keep the code clear:
backups = [f for f in os.listdir(backup_dir) if is_backup_file(f, filename)]
backups.sort(key=lambda f: os.path.getctime(os.path.join(backup_dir, f)))
and define the functions:
import os
from datetime import datetime

def is_backup_file(backup_filename, orig_filename):
    backup_base, backup_ext = os.path.splitext(backup_filename)
    orig_base, orig_ext = os.path.splitext(orig_filename)
    return (
        backup_base.startswith(orig_base)
        and is_timestamp(backup_base[len(orig_base):])
        and backup_ext == orig_ext
    )

def is_timestamp(ts):
    # strptime raises ValueError if ts does not match the format.
    try:
        datetime.strptime(ts, "_%Y%m%d_%H%M%S")
    except ValueError:
        return False
    return True
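
To see that this test separates the two files, here is a small self-contained check. It repeats the two helpers so it runs on its own, and it assumes the same hard-coded names as the example above rather than a real backup directory:

```python
import os
from datetime import datetime

def is_timestamp(ts):
    # True only if ts parses as "_YYYYMMDD_HHMMSS".
    try:
        datetime.strptime(ts, "_%Y%m%d_%H%M%S")
    except ValueError:
        return False
    return True

def is_backup_file(backup_filename, orig_filename):
    backup_base, backup_ext = os.path.splitext(backup_filename)
    orig_base, orig_ext = os.path.splitext(orig_filename)
    return (
        backup_base.startswith(orig_base)
        and is_timestamp(backup_base[len(orig_base):])
        and backup_ext == orig_ext
    )

# Stand-in for os.listdir(backup_dir), using the names from the example above.
names = ["foo_20241223_171418.txt", "foobar_20241223_172025.txt"]

foo_backups = [f for f in names if is_backup_file(f, "foo.txt")]
foobar_backups = [f for f in names if is_backup_file(f, "foobar.txt")]
print(foo_backups)     # only the backup of foo.txt
print(foobar_backups)  # only the backup of foobar.txt
```

For “foobar_20241223_172025.txt” checked against “foo.txt”, the part after “foo” is “bar_20241223_172025”, which does not parse as a timestamp, so the file is rejected.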