Improve dqeue: add a method to remove matching elements

Hi there,

for deque, we have 2 ways to remove element as of now:

  1. Use remove method, for example
q = deque([1,2,3])
q.remove(2)

this is good to remove specific value

  1. Use index, e.g. del q[1], if q is a deque, however, delete by index is expensive for deque, since to locate n’th element is O(n) time

It will be nice to have an atomic way to remove elements that matching condition defined by a callback function, e.g.;

q = deque([1,2,3,4,5,6,7,8,9,10])
# this allow us to remove all even values
q.remove_for(lambda value: value%2==0)

I can probably do it with a utility function, but how do I lock the deque without introducing additional lock.

from collections import deque
from typing import Any, Callable

def remove_for(q:deque, should_remove: Callable[[Any],bool]) -> None:
    for _ in range(0, len(q)):
        value = q.pop()
        if not should_remove(value):
            q.appendleft(value)

q = deque([1,2,3,4,5,6,7,8,9,10])
remove_for(q, lambda value:value%2==0)

Thanks,
Stone

There is a more general way:

q = deque(value for value in q if value % 2)

It works not only with deque, but with any collections, including immutable ones.

Just a note: This creates a new deque object (the old one stays unmodified). It is not an atomic removal of elements.

It is more atomic than the other method. Every time you access deque q you see either the original deque, or the deque with all requested elements removed, not something with reordered and partially removed elements.

2 Likes

Thx. My question is, this is not atmoic, you might get deque mutated during iteration if someone calls q.extendleft(range(0, 100000)) during time you are still in the loop.

That can already happen with deque.remove:

from collections import deque

class C:
    def __eq__(*_):
        d.append(0)

d = deque(range(3))
d.remove(C())

Outcome (Try it online!):

IndexError: deque mutated during remove().

This is expected, you cannot mutate deque while you are iterating it, in your case you mutated it in eq during iteration. However, if you have 2 atomic mutation operation, each might mutate deque, you won’t get such behavior as long as each is atomic. This is something like below:

from collections import deque
from typing import Any, Callable

def remove_for(q:deque, should_remove: Callable[[Any],bool]) -> None:
    with q.get_lock(): # I know deque does not expose it's internal lock, assuming it has
        for _ in range(0, len(q)):
            value = q.pop()
            if not should_remove(value):
                q.appendleft(value)

q = deque([1,2,3,4,5,6,7,8,9,10])
remove_for(q, lambda value:value%2==0)

Thanks,
Stone

When a data structure performs poorly, that may be a clue that you are misusing the data structure.

Dequeues are designed for fast and efficient removal at the ends. If you are trying to remove many items from the middle, you are fighting the data structure and should consider using a regular list.

In Python, in general, the best way to remove many items from a sequence is usually to create a new sequence. I agree with Serhiy that something like this is the best solution:

from collections import deque
q = deque(values)
... # later
q = deque(value for value in q if value % 2)

If it is important to update the deque in place rather than replace it
with a new object, we can do so with a temporary list:

tmp = [value for value in q if value % 2]
q.clear()
q.extend(tmp)

If this is not sufficient, we can consider creating a deque.replace(items), or allow slicing:

q[:] = (value for value in q if value % 2)

We don’t have to allow slice assignment with arbitrary bounds, but it would be good to allow replacement of the entire deque in one atomic assignment.

4 Likes

Actually what I ask is very similar to std:erase_if (see std::erase, std::erase_if (std::deque) - cppreference.com) in c++.

q = deque(value for value in q if value % 2) uses too much memory, it uses O(N-M) extra memory given q has N elements and M elements match the “remove” criteria, I am expecting the operation to use O(1) extra memory.

Use O(N) time to remove matching items from a deque with N elements is reasonable.

– Stone

actually std:erase_if is not exactly like what I asked for, but std:remove_if is very similar to what I asked for.

We do not have such a method for lists. Deques are used rarely.