I’ve gotten a fairly complicated error, but what I don’t understand is the lines it’s reporting as errors. The stack trace points at the if __name__ == "__main__": check, an empty line, opening a file, iterating over a list, assigning a variable, and sleeping (sleeping makes sense, but it’s the wrong sleep: logging indicates it was actually sleeping at a line 5 lines later). It looks like the line numbers Python is using are off, probably 5 lines earlier than they should be. By which I mean the source text shown does correspond to the line number indicated, but I don’t think that’s the line it should be displaying. What could be causing this? The code largely works fine; I just killed it with ^C.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/mnt/c/users/Max/Documents/python/Save_To_Wayback/savetowayback.py", line 585, in <module>
    if __name__ == "__main__":
    ^^^^^^
  File "/mnt/c/users/Max/Documents/python/Save_To_Wayback/savetowayback.py", line 574, in main
  File "/mnt/c/users/Max/Documents/python/Save_To_Wayback/savetowayback.py", line 556, in parse_args_and_save
    with open_utf8(NEW_URLS, "r") as file:
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/c/users/Max/Documents/python/Save_To_Wayback/savetowayback.py", line 508, in save_url_list
    for url in urls:
  File "/mnt/c/users/Max/Documents/python/Save_To_Wayback/savetowayback.py", line 485, in add_link
    last_url = None
    ^^^^^^^^^^^^^^
  File "/mnt/c/users/Max/Documents/python/Save_To_Wayback/savetowayback.py", line 476, in save_url
    time.sleep(BLOCKED_BY_ROBOTS_DELAY)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt
I’m not entirely sure which parts of my code to include, but here are a couple of potentially relevant parts.
The function where the error occurred:
def save_url(link: WebsiteLink) -> None:
    errors = 0
    while True:
        try:
            capture_with_logging(link)
            time.sleep(DEFAULT_DELAY)
            return
        except save.BlockedByRobots as exc:
            logger.error(f"Error {errors} Skipping blocked by robots: {link}, {exc}")
            # should not save in this case
            time.sleep(BLOCKED_BY_ROBOTS_DELAY)
            return
        except Exception as exc:
            errors += 1
            logger.warning(f"Error {errors}: {link}, {exc}")
            time.sleep(too_many_reqs_delay(errors))
The enclosing function one level up in the traceback:
def add_link(url: str) -> str | None:
    logger.debug(f"add_link {url}")
    last_url = None
    link = make_link(url.strip())
    while link.url:
        save_url(link)
        last_url = pick_url_to_save(link, url)
        link = link.attempt_get_next()
    return last_url
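To illustrate what I suspect is happening (I’m not certain this is the mechanism), here’s a small experiment with linecache, which I understand the traceback module uses to fetch source lines when it formats a report. The file name and contents below are made up for the demo; the point is that the line text is re-read from the file on disk, so editing the file while a process is running would shift which text appears next to each line number:

```python
import linecache
import os
import tempfile

# Write a small "script" to disk and let linecache cache a line from it.
path = os.path.join(tempfile.mkdtemp(), "demo.py")
with open(path, "w") as f:
    f.write("line one\nline two\nline three\n")

before = linecache.getline(path, 2)  # "line two\n"

# Edit the file on disk, inserting two lines at the top -- as if the
# source had been changed while the original process kept running.
with open(path, "w") as f:
    f.write("inserted A\ninserted B\nline one\nline two\nline three\n")

# traceback calls linecache.checkcache(), which drops stale cache entries
# whenever the file's size or mtime has changed.
linecache.checkcache(path)
after = linecache.getline(path, 2)  # now "inserted B\n": same number, shifted text

print(before.strip(), "->", after.strip())  # line two -> inserted B
```

If that’s what happened here, the line numbers themselves (from the bytecode compiled at startup) would still be correct for the old file, and only the displayed text would be stale.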