Inbuilt module implementing doubly linked list

Why doesn’t Python have an inbuilt module for linked lists? There must be one for it, just like a list in C++ STL.

It is rare to need the properties of double-link-lists in python code
i.e. insert and remove O(1) at any point in the list.

What is your use case?

I think that’s because built-in types like lists, sets, and dictionaries, as well as collections in the ‘collections’ module, cover many use cases efficiently compared to linked lists which are specialized data structures that are optimal for certain use cases.

I just wanted to insert elements in between the existing data in O(1).

How large is your data set? O(1) means it’ll theoretically be faster at some point, but unless your dataset is phenomenally huge, the extra overhead will outweigh the benefits.

I have two datasets with length of order 10^5 and I want to insert data from one into another at the designated indices.

Linked lists do not give you O(1) inserts in the middle because the linked list needs to be traversed to find the insertion point. Depending on what you need to do with the resulting list, there are more efficient ways of dealing with this problem. The first thing that comes to mind is to concatenate the lists and sort the resulting list appropriately to have the elements end up where you need them.

Those are not really large data sets. If you write the merge of the two sets as a list comprehension, you should end up with good enough performance.

Alternatively, you may want to check whether holding the data in DuckDB would work and you could have DuckDB take care of the merging for you. Since DuckDB does a lot of smart optimizations before actually going to work, this may be even more performant.

I would recommend a merge, which in Python can be done quite efficiently with concatenation and a properly-designed sort. But don’t worry too much about algorithmic complexity; 100,000 elements really isn’t that many for a modern computer. (Maybe different if you’re on a microcontroller, but then you’d be worried about total memory size.)

at the designated indices.

Was this not quite what you meant? If you know what the indices are in advance, it’s an ideal job for an array or even a regular list, using slices.

You are correct if you have scan for the insert point, but I have worked on algotithms where you already have a pointer to an element that in the middle.
But this is very specialist stuff, kernel code in the cases i vaguely recall.

Given N classrooms in a school and ith classroom has a capacity of A[i] students. Bob is a
builder and follows the instructions of Alice.
Alice gives Q instructions of the following types:

  • 1L: Move L classrooms left
  • 2 R: Move R classrooms right
  • 3 X Y: Remove the next classroom and add two new classrooms of capacity X and
    Y, respectively to the right of the current classroom. (After performing this operation
    classroom number changes accordingly)

Note. The queries are always valid.
Initially. Bob is in the 1st classroom. After performing all instructions of Alice, print the
capacity of all classrooms from 1 to total classrooms.

3 < N,Q < 10^5
1 < L,R < 5
1 < A[i], X, Y < 10^5

Eg:
N=5, Q=4
A = [1, 2, 3, 4, 5]
Queries:

  1. 2 3
  2. 3 11
  3. 1 2
  4. 3 5 7

How would you solve it with a time limit of 1 sec without a doubly linked list?

I’d look for an extension library providing an optimised doubly linked list.

That’s a cute problem, but niche (that is almost verbatim the same as examples on the “cheat on your homework” websites chegg and brainly that provide services for plagiarists). But it doesn’t merit a new core module.

How would you solve it with a time limit of 1 sec without a doubly linked list?

Actually scratch my previous reply, I’d use a deque. It’s not emphasised in its documentation (which focuses on fast prepends and appends) but it has an insert method:

If we look at the source code, the comments reveal it is actually implemented as a doubly linked-list.

/* Data for deque objects is stored in a doubly-linked list of fixed
 * length blocks.  
...

https://github.com/python/cpython/blob/main/Modules/_collectionsmodule.c#L56

or if I needed even more performance, I’d experiment with this third party library that already wraps bits of the C++ STL:

I’d write straightforward code using Python lists, then I’d benchmark it. I’d ask what the real-world constraints were leading to the 1-second time limit, and what options I had (for instance, could I buy a faster computer, or rewrite the code in Rust?)

Based on that, I’d look for existing extension modules on PyPI that give me the performance improvement I need, or write a C extension if I had to.

The one thing I certainly wouldn’t do is ask for a new stdlib module and stop working on the problem until Python 3.13 came out…

1 Like

I’ll be interested to see a real builder perform these “classroom insertions” in O(1).

This is an extremely contrived example that is presumably designed purely as a homework exercise in using doubly linked lists.

Agreed, and when an example is this obviously contrived, it’s best to craft your own DLL class than to try to get one in the standard library.

Yeah, thanks, everyone, for the discussion. I mostly work in C and C++ and recently switched to Python. While solving this problem, when I didn’t find any analogous data structure to list in C++ implementing linked lists I thought if C++ have a doubly linked list implemented in its STL, then why don’t Python have it?
But, as per the discussion, it seems like it is not a much-used Data structure and is very rare and problem specific so better to use a 3rd party module or write my own class for it.

then why don’t Python have it?

You’re making a false assertion in this question. Because you haven’t even read the replies from people trying to help you properly, let alone read the docs.

Python has a Deque, which is implemented as a doubly linked list. It’s in the collections module

2 Likes

You don’t need to write an extension module to create a data structure with the kind of performance your example requires. You can do that in pure Python. Most likely that is precisely the goal of an exercise like that.