Oh! Okay. In that case, lemme introduce you to heaps!
A heap in Python is built on top of a list, but instead of sorting the list completely, you kinda “half sort” it according to some rules (the details can be found in the heapq module’s docs). Importantly, the element at the start of the list is the one with the lowest value, just like it would be if you sorted the list.
Instead of sorting node.others()
and then merging it with the remainder of the list, what you’d do is simply add the others into the heap. This can be done fairly efficiently.
And instead of iterating over the list in strict order, you repeatedly ask the heapq module to pop the first element off the list; it’ll then rebalance things so that popping the first element off is always reasonably fast.
Ultimately, what you get is a heap containing all the things you haven’t looked at yet. (There’s no record of all the ones you have looked at; if you want that, push them into a list as you process them.) Everything runs in O(log n) time, which is generally going to be fast enough for what you need.
The biggest advantage of this switch is that you no longer need to wonder whether the iteration and mutation will raise an exception, because the iteration logic basically looks like this:
while nodes_list:
node = heapq.heappop(nodes_list)
if node.adds_others:
for new_node in node.others():
heapq.heappush(nodes_list, new_node)
# process the node as normal
This is now completely normal and well-understood, and is most definitely safe. Your two (or three) concerns are assured:
-
It won’t raise an exception. Each step is individually safe (heappop will raise if the heap is empty, but it’s only called after the while
loop has checked that there’s still stuff to process), so you can be confident that this is safe.
-
It won’t end iterating before all nodes in the list have been considered. Safe because adding to a heap correctly means the loop will continue.
-
Including where the last node adds others. This is where the order of checks becomes important; the while loop only checks for content in the list after any potential nodes have been added to the heap. So the list might become empty between the heappop
and the subsequent heappush
calls, but that won’t end the loop.
A heap is a cool data structure that looks, to me, like someone went “wow, keeping a list sorted is just WAY too much trouble and I can’t be bothered”. It’s a lazy approach to sorting that just does the barest minimum to get by, and then lets tomorrow take care of itself. In real-world usage, it turns out that laziness is actually really practical!