Sorry if this has been brought up before; I looked through previous posts but didn’t see another post about this.
It would be nice to have a self-balancing binary search tree in the collections library, similar to std::set and std::multiset in C++.
Unlikely to happen. The popular Sorted Containers package, by Grant Jenks, proved to be a “category killer” for this kind of task:
It builds on a list of Python lists (no pointers), leveraging that modern hardware is extremely fast at moving contiguous chunks of memory around in L1 cache, provided it’s not “too much” at a time. A SortedList is maintained in sorted order, using binary search to find elements and insertion points. Rebalancing is both simple and rare, much like in a very fat and shallow RAM-based B-tree.
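A minimal sketch of the underlying idea, using only the stdlib’s bisect module: keep a plain list in sorted order and find positions by binary search. (SortedList layers the list-of-lists scheme on top of this so inserts stay cheap at scale; this sketch is the one-flat-list version.)

```python
import bisect

xs = []
for v in [5, 1, 4, 3]:
    bisect.insort(xs, v)     # binary search for position, then contiguous insert
print(xs)                    # [1, 3, 4, 5]

# Membership test via binary search: O(log n) comparisons.
i = bisect.bisect_left(xs, 4)
print(i < len(xs) and xs[i] == 4)   # True
```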
Give it a try! Most people who do are happy with it, and don’t want to write mounds of C code that’s less practical.
Could sortedcontainers be incorporated in the standard library?
“In theory”, sure, but that’s something to discuss with the package’s author and the Steering Council. There are tradeoffs of many kinds to consider.
I’m personally fine with it either way.
Adding things to the standard library increases the maintenance burden on the core team, and limits when updates can be made.
Is it a problem to install it from PyPI?
Maybe not a good enough reason to do so, but the reason I asked is that you can’t install modules in programming competitions like the ICPC.
Which is one of the tradeoffs to consider. There are a number of contexts in which a Python user can’t change anything about the installation a service supplies - so if a facility isn’t built in, it may as well not exist.
In the case of sortedcontainers, though, the source code is pure Python, so you could include it all in your submission to a competitive programming service. Indeed, there’s very little a competitive programmer won’t do to score high.
It’s more a puzzle for C extension modules, like regex, which is even more popular than sortedcontainers, and is pretty much a drop-in replacement for Python’s re module, but with many more advanced features and far faster speed in many cases of pathological regexps. Last I heard, its author didn’t want to be tied to Python’s process-heavy and relatively sluggish release cycle.
@tim.one In ICPC you don’t have Internet access and can’t bring electronics. Only printed stuff. So you’d have to copy all the code from paper by manually typing it. That’s no good, especially given the time pressure.
Thanks! I didn’t know that. Many years ago I had a lot of fun “competing” on SPOJ, which was all done via automated web submissions. I lost interest when it became apparent that a number of top scorers were just iterating, “cleverly” arranging to end a run with “time limit exceeded” or “non zero exit code” to game the system into revealing a bit more info about the (secret) inputs on each run.
They were quite happy to supply a Python with some external packages, though. At the time, the biggest help was installing Psyco (a pioneering attempt, abandoned in 2010, at creating a JIT for Python, which evolved into the far more practical PyPy - memory use under Psyco exploded, but PyPy often needs less RAM than CPython).
Anyway, same story in the end: there are always tradeoffs to consider. In this case, I doubt “make life easier for ICPC contestants” would carry much weight with the Steering Council. “Add a core type that guarantees good worst-case time behaviors for operations on lexicographically ordered sequences” might.
But that gets tricky too. sortedcontainers.SortedList isn’t enough on its own, and neither is a naïve binary search tree. Add a pile of strings of the form 'x' * N for increasing N, and they’ll still spend ever more massive amounts of time chewing over ever-longer matching prefixes.
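To make the prefix problem concrete: deciding that one such string is less than another requires scanning the entire shared prefix, so every comparison-based container (balanced BST or SortedList alike) pays this cost on each comparison.

```python
# Comparing keys with a long common prefix costs O(min(len(a), len(b))):
# the '<' below is only decided after scanning 1000 matching 'x's.
a, b = 'x' * 1000, 'x' * 1001
print(a < b)    # True - but the comparison itself is not O(1)
```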
So any attempt here needs to define “the problem” to be solved, not just “add some possibly related data structure”.
Not a problem at all.
But the project seems to be stale:
Last release on PyPI was in May 2021.
Quite a few open issues on GitHub.
No typing support.
Missing context. What, specifically, is not a problem?
Alas, yes, the project appears to have been abandoned a few years ago. No signs of life at all, just protracted silence. Still works fine for me, though.
Neither I nor a bot could find any hint of an explanation. I expect (but don’t know) Grant burned out on it and moved on.
It remains theoretically possible it could be folded into the core, but that’s less likely when the original author has lost interest. All of it would have to be driven by others, along with a realistic plan for eternal ongoing maintenance.
You have a few options:
A fourth option is to propose adding it to the standard library, but that doesn’t solve any of these problems; it just nominates a specific group of people to solve them.
Most of those seem like feature requests, so “quite a few issues” is a little misleading.
This was a reply to me asking if it was a problem to install it from PyPI.
Neither I nor a bot could find any hint of an explanation
Last commit was March 2, 2024. He started working at OpenAI that month.
You have a few options:
Missing from that list: “none of the above”.
The OP asked for a “self balancing binary search tree”, and sortedcontainers got dragged into it because it effectively killed off the C/C++ coded extensions that aimed at implementing self-balancing BSTs.
Of which there are many different flavors. The design question remains: should Python offer such a thing? Many other language environments do, and it’s at least curious that Python doesn’t.
If the answer is “yes”, then sortedcontainers is just one example of relevant prior art that could be followed. The design space is large, and a PEP would be needed to weigh the various tradeoffs (always good worst-case time? always good amortized time? faster average-case time at the expense of occasionally worse bad-case time? should it adapt to try to speed recent searches? and so on, weighing too against memory burden).
Its continuing popularity sure suggests that sortedcontainers found a sweet spot in this space.
Could sortedcontainers be incorporated in the standard library?
One alternative that I think might have more value is to focus on a non-sorted container with O(1) (or close to it) random access and sub-linear insert/delete (anywhere).
Once such a thing is available, making sorted containers is easy (in addition to many other cool things).
I have been searching for something that fits for quite a while now.
The sortedcontainers approach is good, but there are a few other good options to consider.
I think sortedcontainers hit the sweet spot for a PyPI package, but for the stdlib it might be a very different story.
There are a number of contexts in which a Python user can’t change anything about the installation a service supplies - so if a facility isn’t built in, it may as well not exist.
I have to interact with locked Python environments a lot, and it’s made me write most of my code to rely only on the stdlib.
There are ways to modify the shipped conda environment (this is for ESRI’s arcpy environment), but for non-technical users that process is a bit complicated and can’t be relied on.
I do still think that major justification is needed for any changes to the stdlib, and looking into common managed environments like that should be done first. Sometimes it’s easier to just have the environment maintainer include a PyPI package.
One alternative that I think might have more value is to focus on a non-sorted container with O(1) (or close to it) random access and sub-linear insert/delete (anywhere).
Once such a thing is available, making sorted containers is easy (in addition to many other cool things).
I have been searching for something that fits for quite a while now.
You can approach that using sortedcontainers.SortedKeyList now, by passing a constant function (like key=lambda x: 42). Then .add() is the same as .append() on a regular list (all incoming element keys compare equal, so “stability” counsels adding each new one to the end).
To maintain the “it’s sorted” invariant, they don’t support an .insert(index, elt) method, but it’s certainly capable of inserting anywhere in O(log n) time. And it already supports by-index del list[index] and .pop(index) methods, also O(log n).
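The constant-key trick can be tried directly (assuming the third-party sortedcontainers package is installed):

```python
from sortedcontainers import SortedKeyList

# All keys compare equal, so .add() places each new element at the end
# (equal keys bisect to the right), behaving like list.append - while
# keeping O(log n) by-index del and .pop.
xs = SortedKeyList(key=lambda x: 42)
for v in ['a', 'b', 'c', 'd']:
    xs.add(v)                # effectively an append
print(list(xs))              # ['a', 'b', 'c', 'd']
xs.pop(1)                    # by-index pop, O(log n)
print(list(xs))              # ['a', 'c', 'd']
del xs[0]                    # by-index delete, O(log n)
print(list(xs))              # ['c', 'd']
```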
Indexing is cheaper than you might think
While it’s O(log n), the relevant n is the number of leaf-level lists, not the number of elements. The former is about 500 times smaller than the latter. O() hides this distinction, but in real life it matters a lot.
Anyway, I think a simplification of SortedList (forget the “sorted” part) is already close to what you’d like. I don’t know what else you have in mind, but offhand I don’t see a good reason why, say, ropes or gap arrays would be expected to do better.
Absolutely the kind of thing that should be discussed, and I’m always keen to be surprised.
Anyway, I think a simplification of SortedList (forget the “sorted” part) is already close to what you’d like. I don’t know what else you have in mind, but offhand I don’t see a good reason why, say, ropes or gap arrays would be expected to do better.
I agree that the sortedcontainers approach is a solid candidate.
However, in my experience it works well as-is only at higher levels of things; none of my attempts to use it in hot paths were successful.
Benchmarks of my first iteration on improving upon sortedcontainers.SortedList (I am using this one now, but it is quite far from the final version that I would be content with):
macOS-11.7.10-x86_64-i386-64bit | CPython: 3.12.4
5 repeats, 10 times | 2025-04-15T14:04:17

Units: µs          10     100    1000    10000   100000
get_lst_cx          1       8      97    1,024    9,689
add_lst_cx          6      43     606   10,392  111,747
get_lst_sc          3      22     267    8,751  126,809
add_lst_sc          8      59     785   11,988  119,023
_cx - my 1st iteration
_sc - sortedcontainers
get is random access - simple improvement.
add is similar in speed - not much can be done to improve (at least in pure Python).
Note that this is about relative performance - the actual routines are loops of 1 operation.
Also, the way sortedcontainers approaches its tree is more complex than it could be.
I have only started iteration 2, which has a flat structure of hierarchical bins that link to each other via reference indices - much like Automaton states (this was way before I started working with Automaton, btw). And initial benchmarks were even better for random access and, more importantly, for insort.
So although the concept is the same, there are options to explore in how to implement it - and the results can depend quite a lot on this (both in performance and maintainability).
And the reason I put iteration 2 of the above on hold is that I found a new structure - the “Tiered Array”.
I don’t know if it would be better or not, but it is definitely worth exploring.
Its main advantage is the possibility of no copying - it has the quality of heapq in the sense that you can map it onto a list and do some work; once you are done, you can rotate the ring buffers back into place in O(n) (much like a final sort on a heapified list).
Its insert/delete cost is O(n^(1/k)). That sounds worse than O(log n), but in practice it is very capable. Theoretically, if you want O(log n) behavior and you know the size, you can solve log2(n) == n^(1/k) for k.
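The balancing equation above is easy to evaluate numerically; a back-of-envelope sketch (illustrative only):

```python
import math

# Solve n**(1/k) == log2(n) for k:  k = ln(n) / ln(log2(n)).
def k_for(n):
    return math.log(n) / math.log(math.log2(n))

for n in (10**4, 10**5, 10**6):
    k = k_for(n)
    # n**(1/k) comes out equal to log2(n) by construction
    print(n, round(k, 2), round(n ** (1 / k), 1))
```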
For example, for a list of size 100K, in-place (well, memory overhead is ~0.5% of n) binary insertion sort using a tiered vector (k=2) is only 2x slower than list.sort.
I don’t know how well it scales, though - k=3 and k=4 should retain reasonable performance (relative to alternatives). The total size of this would be ~1000 (1st level) * 100^3 (3 more levels) = 1e9. I am quite certain this would hold steady against alternatives up to this size, but I’m not sure about larger ones.
Lately I saw an attempt to give collections.deque O(1) random access, which didn’t go through - for a good reason, I suppose.
And given that deque exists, there is little to no chance that another deque will be implemented with O(1) random access. Why would we want to maintain 2 deques…?
And! A certain variation of the “Tiered Vector” can have O(1) append/appendleft, so it could potentially fill the gap as a deque with O(1) random access as well.
So something along the lines of a list with:
O(1) - O(log n) random access
O(log n) - O(n^(1/k)) insert / delete anywhere
O(1) insert / delete at ends
might be possible… Amortized…
This might sound ambitious, but personally, I would love to have something like this implemented and optimized in stdlib. One multipurpose container that fills many gaps well.
There are benefits in that.
Cython could optimize one container = performance gains in many places.

To cut guessing, I’m going under the assumption that this is an early version of the idea you have in mind: Tiered Vectors: Efficient Dynamic Arrays for Rank-Based Sequences
The scheme there is quite simple: a list L of deques, each of capacity B. All deques except the last are kept full (hold B elements). A conceptual index i >= 0 then resolves to the deque L[i // B], very fast if B is a power of 2. And then it’s almost done: the element at index i % B in that deque is “the answer”, also very fast if B is a power of 2.
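Index resolution in this base scheme can be sketched in a few lines (ignoring, for now, the circular-array rotation discussed below; B and locate are illustrative names):

```python
B = 1 << 10                       # block capacity, a power of 2 (1024)

def locate(i):
    """Map conceptual index i to (which block in L, offset inside that block)."""
    return i >> 10, i & (B - 1)   # i // B and i % B via cheap bit operations

print(locate(0))          # (0, 0)
print(locate(B))          # (1, 0)
print(locate(3 * B + 7))  # (3, 7)
```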
That obviously makes indexing constant-time, which is great - and without needing to build additional structures to speed indexing (as SortedList does).
The trick is speeding insert/delete. Toward that end, the deques must be implemented as circular arrays, so that pushing/popping on their ends can be done in O(1) time regardless of how large B is. This complicates the idea of what “an index” means within a deque, but not by much.
So, e.g., to delete the element at index i, all the deques at indices (in the list L) from i // B through the end must be touched. The first deque may have to move up to B-1 elements to “plug the hole”. Then for each deque after that, pop its leftmost element and push it as the new rightmost element in the deque before it in the list. Those are O(1) operations. So no worse than O(B) overall, assuming len(L) <= B too.
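That deletion cascade can be sketched with Python’s collections.deque standing in for the circular arrays (a toy with B=4; a real implementation would manage the ring buffers itself):

```python
from collections import deque

B = 4   # tiny block capacity, for illustration
blocks = [deque([0, 1, 2, 3]), deque([4, 5, 6, 7]), deque([8, 9])]

def delete(i):
    """Delete conceptual index i, keeping all blocks but the last full."""
    b, off = divmod(i, B)
    del blocks[b][off]            # up to B-1 element moves inside this block
    for nxt in range(b + 1, len(blocks)):
        # O(1) per boundary: shift one element left across each later block
        blocks[nxt - 1].append(blocks[nxt].popleft())
    if not blocks[-1]:            # drop the last block if it emptied out
        blocks.pop()

delete(5)   # removes the value 5
print([list(d) for d in blocks])   # [[0, 1, 2, 3], [4, 6, 7, 8], [9]]
```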
SortedList also has O(1) indexing, but only in special cases: if i >= 0 and i < len(first_sublist), or i < 0 and i >= -len(last_sublist). In particular, xs[0] and xs[-1] are always constant time. Very handy for double-ended priority queues - but those don’t come up often.
Note: “list of deques” above is conceptual. Offhand I don’t see a reason why it couldn’t be implemented as a single contiguous vector of L*B elements, with length-L lists of “head and tail” indices (into that vector) for the L conceptually distinct circular deques. Or even one list just giving “head” indices, plus a single int giving the number of slots currently occupied in the last deque (all the others are full - they hold B elements). Whatever - O(√n) additional storage is certainly acceptable.
Offhand I didn’t find anything about fancier elaborations of the base idea, but you (@dg-pb) seem to have fancier versions in mind. Please share relevant links?
Offhand I didn’t find anything about fancier elaborations of the base idea, but you (@dg-pb) seem to have fancier versions in mind. Please share relevant links?
Never mind. My bot pointed me to this:
Highly optimized implementation of tiered vectors, a data structure for maintaining a sequence of n elements supporting access in time O(1) and insertion and deletion in time O(n^ε) for ε > 0 while using o(n) extra space.
which already did a world of implementation experiments.