The heapq module contains some private max-heap variants of its heap functions: _heapify_max, _heappop_max, _heapreplace_max. This exist to support the higher-level functions like merge(). I’d like the _max variants to be made public (remove the underscore prefix), and documented. This will make it easy for users to create and manipulate max heaps as well.
The current public heapq functions only work with min-heaps. Using them to create a max-heap requires either storing inverted values (i.e. -x, which only works for numbers), or using some kind of wrapper class to invert comparisons.
Several such solutions are discussed in the 12 year old Stack Overflow question What do I use for a max-heap implementation in Python?. The second most popular solution is to use the private *_max variant functions, which I would like to make public.
(inspired by looking at the MinHeap problem on @trey 's Python Morsels)
@rhettinger What do you think? Looks like a sensible request. If there’s a good reason not to do this it would be nice to have it written up for posterity.
Two of those (non-)discussions were just dismissed on the basis that it has been rejected in the past. The first request was rejected on the basis that people haven’t needed it, but if you do, just negate the values.
If it is as simple as just negating your (numeric only) values, then why do the heapq have max-heap code for internal use? Why doesn’t it just negate the values?
The issue keeps coming up on the bug tracker, and on Stackoverflow:
Indeed. I note it was mentioned by both the reporter and the responding developer on subsequent reports that this was brought up many times in the past and rejected, but the rationale given on the very first issue for rejecting it was that there hadn’t been “significant demonstrated need”, to which the original user did reply but was never responded to.
Certainly, there may be a compelling reason for rejecting this, but if so it should at least be stated and publicly documented, given the amount of interest and requests.
We can do this. Much of the code is already there.
Mostly when this has come up before, it was usually a curiosity question, “I see a misheap, why isn’t there a maxheap?”. There wasn’t much in the way of actual needs other than being given this as a homework problem or coding challenge. That said, I’ve had some use cases and would like it if the functionality were exposed.
Instead of creating a max version, wouldn’t it make more sense to add a reverse= and key= optional arguments, like you have with sort? This would allow you to have heaps based on any ordering you want.
A good use case for a max heap is mentioned in heapq’s documentation itself:
As far as I know, in-place sorting with a max heap is the only way to sort an arbitrary list with O(n log n) time complexity and O(1) space complexity.
I recently posted a fairly lengthy answer to an old StackOverflow question with all the hoop-jumping workarounds only because the max heap functions aren’t exposed as public functions from heapq:
But on second thought, if in-place sorting is the only good use case for a max heap, then maybe instead of making max heap functions public we can make in-place sorting itself a public function in heapq.
Hey, I just want to point out that this seems like a big gap. “Just invert the values negatively” isn’t a great answer when you’re sorting complex objects with custom comparitors. What, I’m just supposed to invert my gt and lt methods in an expensive wrapper class? What if I’m trying to use the objects I put in the heap for other purposes? Other algorithms use heaps. What if I’m writing an A*-like heuristic function (actually, I have a lot of these, hundreds and hundreds of lines of code of tens of implementations of variations on these for game AI) and need to maximize a heuristic? I have to take my heuristic val and invert it before putting it in the heap, resulting in a ton of overly complex (and computationally wasteful) conversion code, just to get the value back out and invert it again before doing all the other value comparisons that need to be done. I’m surprised more people aren’t complaining, I guess not that many people write complex heuristic searches in python?
I got sick of the overly complex and obtuse negation in my value functions (over 2x the amount of code is necessary because of missing max-heap, I need a heuristic func that produces the actual values for other things to use, and then a separate heuristic func that chains the output from the value func and inverts each property individually for the heap-based search), and started switching to use heapq_max · PyPI instead for much, much cleaner code for my (many) use cases where the search heuristic and value heuristic are the same, but it is so old (or inefficiently compiled?) that the performance is about 100% worse than the min-heap in heapq. I’ve tried heap-class · PyPI as well but its performance is even worse than that, for both min and max heaps.
I’m using python because I dont want to write C (nor figure out how to get my C compiled as fast as heapq is, as well as figuring out all the interop stuff…)
Please, just give us high performance max-heap methods from heapq it seems so strange for such an important core data structure from core algorithms to be completely missing.