Inbuilt module implementing doubly linked list

Couldn’t this particular problem be solved efficiently by iterating over the instructions twice – once to figure out the final size of the array that needs to be allocated, and then once more to place the old and new data into the array? If the first step sorts the instructions by position (e.g. by using the heap module), then the second step will know where to place everything in the result array, I think.

(Unless the point of the exercise is to practice using linked lists. In that case, I would try implementing a linked list in Python and see if it’s fast enough.)

Edit: Never mind. This is clearly worse than O(n) because of the sorting.

Even with a doubly linked list (as used in collections.deque) this may not run in under 1 sec.
The bottleneck is the insert operation, which requires extra memory allocation for the inserted objects (plus pointers). In the worst case, if you have 100K insertions in a 100K deque, it just depends on your hardware how fast this is. It turns out that even on a fast laptop (Apple M2 Max) 100K insert ops (in a deque of 100K ints) take a little bit more than 1sec (1.7 sec). If you have a random mixture of the three query instructions, then it will take less than 1 sec in total, since the other operations are just following pointers.

Btw - the CPython source code contains some comments about implementation details for insert and rotate that you might find interesting:

1 Like

With a linked list implemented as a plain Python class (four lines of code), inserting 100000 items into the middle of a 100000 element list (resulting in 200000 elements) takes 0.025 seconds on my machine.

The problem with the deque approach is probably what @barry-scott pointed out, that you need to hang on to a pointer to the element where you want to insert new elements, rather than looking it up each time. But it doesn’t seem like this is possible with the deque. As the documentation says: “Indexed access is O(1) at both ends but slows to O(n) in the middle. For fast random access, use lists instead.”

In the OP’s particular case, it would be simple (with a custom linked list implementation) to keep a pointer to the current position in the list.

3 Likes

Did you do 100_000 separate insertions calls or just one insertion for 100_000 items?

100_000 separate insertion calls but only 1 call to find the insertion point (the middle node). The latter part, finding the insertion point, is what’s slow when using a deque.

In the OP’s case, there’s a “current classroom” where things will be inserted, so by keeping a reference to that list node, performance should be fine with a plain Python class.

But you were saying you were using a simple list? I’m getting this on a simple list and deque:

>>> from timeit import timeit
>>> lis = list(range(100_000))
>>> timeit("lis.insert(50_000, 1)", globals=globals(), number=100_000)
4.527243708998867
>>> timeit("lis[50_000]", globals=globals(), number=100_000)
0.00461541699769441
>>> from collections import deque
>>> d = deque(range(100_000))
>>> timeit("d.insert(50_000, 1)", globals=globals(), number=100_000)
1.673891165999521
>>> timeit("d[50_000]", globals=globals(), number=100_000)
0.13383754099777434

Oh, sorry, I have edited the post to hopefully make it more clear. What I meant is that I implemented my own linked list in the simplest possible way in a few lines of Python. (Just to check that performance is perfectly OK for the task with plain Python without doing anything fancy.)

Ah, ok! That makes sense. Yes, I agree. The deque code is slow because “insert” is implemented as a rotation, followed by append, followed by reverse rotation. So in that sense finding the insertion point is slow.
So, as to the original question “Why does Python not have a built-in linked list”, I guess the most appropriate answer would be: Python doesn’t need this, because it’s simple enough to do this in a few lines of Python code :slight_smile:

Use a deque and rotate it instead of moving, then every operation is O(1).

Or use two lists, one for the classrooms on your left, one for the classrooms on your right in reverse order. Again O(1) for each operation then.

1 Like

The deque version, simulating a moving “pointer” into the deque by instead rotating it and remembering the index where we are (= the rotation amount). With 10^5 initial elements and 10^5 queries it takes me ~20 ms.

from collections import deque

A = [1, 2, 3, 4, 5]
queries = [(2, 3), (3, 1, 1), (1, 2), (3, 5, 7)]

d = deque(A)
i = 0
for q in queries:
    match q:
        case 1, L:
            d.rotate(L)
            i -= L
        case 2, R:
            d.rotate(-R)
            i += R
        case 3, X, Y:
            d[1] = X
            d.insert(2, Y)
d.rotate(i)
print(*d)

Attempt This Online!

Alternatively, with a sentinel signifying the ends:

from collections import deque

A = [1, 2, 3, 4, 5]
queries = [(2, 3), (3, 1, 1), (1, 2), (3, 5, 7)]

d = deque(A)
d.append(None)
for q in queries:
    match q:
        case 1, L:
            d.rotate(L)
        case 2, R:
            d.rotate(-R)
        case 3, X, Y:
            d[1] = X
            d.insert(2, Y)
d.rotate(~d.index(None))
d.pop()
print(*d)

Attempt This Online!

Nice. That’s at least 100 times several times faster than my linked list implementation.

It does slow down the farther apart the insertions are, though, since the rotations are O(n). With some unlucky inputs it takes ~1 second on my machine.

Here is my benchmark:

def generate_queries(count, array_size):
    pos = 0
    for i in range(count // 2):
        if array_size - pos < pos:
            i = int(array_size * 0.25)
            yield 1, pos - i
        else:
            i = int(array_size * 0.75)
            yield 2, i - pos
        yield 3, 4, 5
        pos = i
        array_size += 1

A = list(range(1, 100_000 + 1))
queries = list(generate_queries(len(A), len(A)))

Edit: Whoops, I didn’t realize that the problem specifies 1 < L,R < 5. With that restriction, the O(n) rotation doesn’t matter. This also reduces the impact of the plain Python linked list, so the deque-based version is only 2–3x as fast.

If you need to do insertions at random positions, some sort of merge would most likely be the best. Regardless of the exact positions to be inserted into, you can do the merge in linear time.

Whoops, I just realized that the problem specified 1 < L,R < 5, so my unlucky input is not valid. :slight_smile:

The deque-based solution is the way to go, then.

1 Like

Are you sure you are comparing apples to pears? I thought you were dealing with 100K queries (as was I in the timings I posted earlier), while Stefan seems to only be using 5 queries on a deque of 5 items.

No, like I said:

1 Like

Benchmark:

100000 initial elements
33374 L-queries
33299 R-queries
33327 replace-queries

  8.4 ± 0.1 ms  baseline
 16.3 ± 0.1 ms  solve_deque_optimized
 20.1 ± 0.1 ms  solve_deque
 20.7 ± 0.2 ms  solve_deque_sentinel
 33.8 ± 0.4 ms  solve_stacks2
 34.5 ± 0.5 ms  solve_stacks
 97.5 ± 2.7 ms  solve_linked_list

Python: 3.11.4 (main, Sep  9 2023, 15:09:21) [GCC 13.2.1 20230801]

(baseline is just for curiosity, it just iterates and identifies the queries like I do in all my solvers. So for example my optimized deque solver takes ~7.9 ms to actually process the queries.)

Code

Attempt This Online!

from random import choices, randint, randrange
from time import perf_counter as time
from statistics import mean, stdev
from collections import deque
import sys

def baseline(A, queries):
  for q in queries:
    match q:
      case 1, L:
        pass
      case 2, R:
        pass
      case 3, X, Y:
        pass

def solve_list(A, queries):
  a = A[:]
  i = 0
  for q in queries:
    match q:
      case 1, L:
        i -= L
      case 2, R:
        i += R
      case 3, X, Y:
        a[i+1:i+2] = X, Y
  return a

def solve_stacks(A, queries):
  left = []
  right = A[::-1]
  for q in queries:
    match q:
      case 1, L:
        for _ in range(L):
          right.append(left.pop())
      case 2, R:
        for _ in range(R):
          left.append(right.pop())
      case 3, X, Y:
        right[-2:-1] = Y, X
  return left + right[::-1]

def solve_stacks2(A, queries):
  left = []
  right = A[::-1]
  for q in queries:
    match q:
      case 1, L:
        right += left[:~L:-1]
        del left[-L:]
      case 2, R:
        left += right[:~R:-1]
        del right[-R:]
      case 3, X, Y:
        right[-2:-1] = Y, X
  return left + right[::-1]

def solve_deque(A, queries):
  d = deque(A)
  i = 0
  for q in queries:
    match q:
      case 1, L:
        d.rotate(L)
        i -= L
      case 2, R:
        d.rotate(-R)
        i += R
      case 3, X, Y:
        d[1] = X
        d.insert(2, Y)
  d.rotate(i)
  return list(d)

def solve_deque_sentinel(A, queries):
  d = deque(A)
  d.append(None)
  for q in queries:
    match q:
      case 1, L:
        d.rotate(L)
      case 2, R:
        d.rotate(-R)
      case 3, X, Y:
        d[1] = X
        d.insert(2, Y)
  d.rotate(~d.index(None))
  d.pop()
  return list(d)

def solve_deque_optimized(A, queries):
  d = deque(A)
  rotate = d.rotate
  prepend = d.appendleft
  d.rotate(-1)
  i = 1
  for q in queries:
    match q:
      case 1, L:
        i -= L
        rotate(L)
      case 2, R:
        i += R
        rotate(-R)
      case 3, X, Y:
        d[0] = Y
        prepend(X)
  rotate(i)
  return list(d)

def solve_linked_list(A, queries):
  head = curr = [None] * 3
  for a in A:
    curr[2] = curr = [curr, a, None]
  curr[2] = [None] * 3
  curr = head[2][2]
  for q in queries:
    match q:
      case 1, L:
        for _ in range(L):
          curr = curr[0]
      case 2, R:
        for _ in range(R):
          curr = curr[2]
      case 3, x, y:
        curr[1] = x
        curr[2][0] = curr[2] = [curr, y, curr[2]]
  out = []
  curr = head[2]
  while curr[2]:
    out.append(curr[1])
    curr = curr[2]
  return out

funcs = [
  # solve_list,  # too slow
  solve_stacks,
  solve_stacks2,
  solve_deque,
  solve_deque_sentinel,
  solve_deque_optimized,
  solve_linked_list,
]

# Generate large test case
n = 10**5
A = choices(range(100), k=n)
i = 0
queries = []
while len(queries) < 10**5:
  instr = randint(1, 3)
  if instr == 1:
    if i > 1:
      L = randint(2, min(i, 4))
      queries.append((1, L))
      i -= L
  elif instr == 2:
    if i < n-2:
      R = randint(2, min(n-1 - i, 4))
      queries.append((2, R))
      i += R
  else:
    if i < n-1:
      queries.append((3, randrange(100), randrange(100)))
      n += 1
count = next(zip(*queries)).count
print(len(A), 'initial elements')
print(count(1), 'L-queries')
print(count(2), 'R-queries')
print(count(3), 'replace-queries')
print()

# Correctness
expect = None
for f in funcs:
  result = f(A, queries)
  if expect is None:
    expect = result
  else:
    assert result == expect

# Speed
funcs.append(baseline)
times = {f: [] for f in funcs}
def stats(f):
    ts = [t * 1e3 for t in sorted(times[f])[:5]]
    return f'{mean(ts):5.1f} ± {stdev(ts):3.1f} ms '
for _ in range(25):
    for f in funcs:
        t0 = time()
        f(A, queries)
        times[f].append(time() - t0)
for f in sorted(funcs, key=stats):
    print(stats(f), f.__name__)

print('\nPython:', sys.version)

Perhaps because it is trivially easy to add your own?