What Do I Use for a Max-Heap Implementation in Python

built-in max heap API in Python

In the past I have simply used sortedcontainers's SortedList for this, as:

> a = SortedList()
> a.add(3)
> a.add(2)
> a.add(1)
> a.pop()
3

It's not a heap, but it's fast and works directly as required.

If you absolutely need it to be a heap, you could make a general negation class to hold your items.

class Neg():
def __init__(self, x):
self.x = x

def __cmp__(self, other):
return -cmp(self.x, other.x)

def maxheappush(heap, item):
heapq.heappush(heap, Neg(item))

def maxheappop(heap):
return heapq.heappop(heap).x

But that will be using a little more memory.

how to get the max heap in python

The documentation says,

Our pop method returns the smallest item, not the largest (called a
“min heap” in textbooks; a “max heap” is more common in texts because
of its suitability for in-place sorting).

So you cannot get a max heap directly. However, one way to get it indirectly is to push the negative of the item on the heap, then take the negative again just after you pop the item. So instead of heappush(h, (200, 1)) you execute heappush(h, (-200, -1)). And to pop and print the max item, execute

negmaxitem = heappop(h)
maxitem = (-negmaxitem[0], -negmaxitem[1])
print(maxitem)

There are other ways to get the same effect, depending on what exactly you are storing in the heap.

Note that trying h[-1] in a min-heap does not work to find the max item--the heap definition does not guarantee the max item will end up at the end of the list. nlargest should work but has time complexity of O(log(n)) to just examine the largest item in a min-heap, which defeats the purpose of the heap. My way has time complexity O(1) in the negative-heap to examine the largest item.

MaxHeap Python implementation

First of all, you are right that this computation of left and right is not how it should be, but... it still works.

With this way of working we have a graph where the root's left child is actually a self-reference (a loop). Because the code never continues a __floatUp or __bubbleDown recursive call when the values of parent and child are equal, this loop in the graph will never be traversed.

So we have here a root node that has at the most only one real child (its right child). From that child onwards, things return to normal. Except for this strange root, all other nodes root a subtree that is a complete binary tree. This strange root makes the "heap" slightly less optimal, as often the tree will have one more level than absolutely necessary. But it will still give correct results.

Other remarks

Some other strange things in this code:

  • peek will raise an error when the heap is empty.
  • peek will give the wrong result when the root's value is a falsy value, like 0. Then it will return False. These two points should be fixed by applying the same len check as pop does.
  • In pop the final else block can never execute, as the value of len() can never be anything else than > 0 or == 0. This makes no sense.
  • pop will raise an error when the heap is empty, because then the middle case will execute, where the code tries to pop from an empty list

The video

You have not correctly represented what the video provides as code.

The video actual explains at 0:50 that the array should start at index 1, ignoring index 0.

At 3:45 and further, you can see that the presenter has indeed coded it that way. The code you present here is not the same as theirs. It seems you have tried to adapt it somehow, and thereby introduced more problems than the original code had.

Min/Max Heap implementation in Python

I've added __repr__ and __gt__ methods to your Comparator class, so the code now runs, and the Comparator instances display their val when printed.

The important thing is to get those comparison methods to do the comparisons correctly between two Comparator instances.

You'll notice that I've eliminated most of the methods from MaxHeap. They aren't needed because the methods inherited from MinHeap work ok. You may wish to restore this one to MaxHeap

def __getitem__(self, i):
return self.heap[i].val

depending on how you intend to use MaxHeap.

import heapq

class MinHeap:
def __init__(self):
self.heap = []

def push(self, item):
heapq.heappush(self.heap, item)

def pop(self):
return heapq.heappop(self.heap)

def peek(self):
return self.heap[0]

def __getitem__(self, item):
return self.heap[item]

def __len__(self):
return len(self.heap)

class MaxHeap(MinHeap):
def push(self, item):
heapq.heappush(self.heap, Comparator(item))

class Comparator:
def __init__(self, val):
self.val = val

def __lt__(self, other):
return self.val > other.val

def __eq__(self, other):
return self.val == other.val

def __repr__(self):
return repr(self.val)

if __name__ == '__main__':
max_heap = MaxHeap()
max_heap.push(12)
max_heap.push(3)
max_heap.push(17)

while True:
try:
print(max_heap.pop())
except IndexError:
# The heap's empty, bail out
break

output

17
12
3

It's probably a Good Idea to give Comparator the full set of rich comparison methods. They aren't needed to make the above code work, but they will make the Comparator instances more flexible. So in case you want them, here they are:

def __lt__(self, other):
return self.val > other.val

def __le__(self, other):
return self.val >= other.val

def __gt__(self, other):
return self.val < other.val

def __ge__(self, other):
return self.val <= other.val

def __eq__(self, other):
return self.val == other.val

def __ne__(self, other):
return self.val != other.val

MinHeap/MaxHeap implementation in Python

These classes are based on Python's heapq structure, which is built on a standard Python list. The smallest item in the minheap and the largest item in the maxheap is at index zero. So just return

self.heap[0]

If the heap is empty, that will cause an error, but that probably is what you want.

Heap Implementation in Python 3

What I feel by looking at your code and the input is that your current output is right and your expected output is wrong actually.

The thorough explanation is as such:

Initially, we push 0 so the heap is having 0 only at the top.

We push 1, heap becomes something like this:


  1.      0
    /
    1

similarly if we follow all steps, the heap in each step is going to be like this: (See below for each step)



  1.         0
    / \
    1 3

  2.         0
    / \
    1 3
    /
    17

  3.         0
    / \
    1 3
    / \
    17 21

  4.         0
    / \
    1 3
    / \ /
    17 21 36

  5.         0
    / \
    1 3
    / \ / \
    17 21 36 7

  6.         0
    / \
    1 3
    / \ / \
    17 21 36 7
    /
    25

  7.         0
    / \
    1 3
    / \ / \
    17 21 36 7
    / \
    25 100

  8.           0
    / \
    / \
    1 3
    / \ / \
    / \ / \
    17 21 36 7
    / \ /
    25 100 19

Now at this point, 19's parent is greater so, we swap it with 21.
Hence the correct heap would be something like this:

             0
/ \
/ \
1 3
/ \ / \
/ \ / \
17 19 36 7
/ \ /
25 100 21

Now if we you do the level order traversal you will get the correct output as your current output.

I think the expected output is correct for input "2" instead of "21" and may be this is where your confusion begins



Related Topics



Leave a reply



Submit