built-in max heap API in Python
In the past I have simply used sortedcontainers's SortedList
for this, as:
> a = SortedList()
> a.add(3)
> a.add(2)
> a.add(1)
> a.pop()
3
It's not a heap, but it's fast and works directly as required.
If you absolutely need it to be a heap, you could make a general negation class to hold your items.
class Neg():
def __init__(self, x):
self.x = x
def __cmp__(self, other):
return -cmp(self.x, other.x)
def maxheappush(heap, item):
heapq.heappush(heap, Neg(item))
def maxheappop(heap):
return heapq.heappop(heap).x
But that will be using a little more memory.
how to get the max heap in python
The documentation says,
Our pop method returns the smallest item, not the largest (called a
“min heap” in textbooks; a “max heap” is more common in texts because
of its suitability for in-place sorting).
So you cannot get a max heap directly. However, one way to get it indirectly is to push the negative of the item on the heap, then take the negative again just after you pop the item. So instead of heappush(h, (200, 1))
you execute heappush(h, (-200, -1))
. And to pop and print the max item, execute
negmaxitem = heappop(h)
maxitem = (-negmaxitem[0], -negmaxitem[1])
print(maxitem)
There are other ways to get the same effect, depending on what exactly you are storing in the heap.
Note that trying h[-1]
in a min-heap does not work to find the max item--the heap definition does not guarantee the max item will end up at the end of the list. nlargest
should work but has time complexity of O(log(n))
to just examine the largest item in a min-heap, which defeats the purpose of the heap. My way has time complexity O(1)
in the negative-heap to examine the largest item.
MaxHeap Python implementation
First of all, you are right that this computation of left
and right
is not how it should be, but... it still works.
With this way of working we have a graph where the root's left child is actually a self-reference (a loop). Because the code never continues a __floatUp
or __bubbleDown
recursive call when the values of parent and child are equal, this loop in the graph will never be traversed.
So we have here a root node that has at the most only one real child (its right child). From that child onwards, things return to normal. Except for this strange root, all other nodes root a subtree that is a complete binary tree. This strange root makes the "heap" slightly less optimal, as often the tree will have one more level than absolutely necessary. But it will still give correct results.
Other remarks
Some other strange things in this code:
peek
will raise an error when the heap is empty.peek
will give the wrong result when the root's value is a falsy value, like 0. Then it will returnFalse
. These two points should be fixed by applying the samelen
check aspop
does.- In
pop
the finalelse
block can never execute, as the value oflen()
can never be anything else than> 0
or== 0
. This makes no sense. pop
will raise an error when the heap is empty, because then the middle case will execute, where the code tries to pop from an empty list
The video
You have not correctly represented what the video provides as code.
The video actual explains at 0:50 that the array should start at index 1, ignoring index 0.
At 3:45 and further, you can see that the presenter has indeed coded it that way. The code you present here is not the same as theirs. It seems you have tried to adapt it somehow, and thereby introduced more problems than the original code had.
Min/Max Heap implementation in Python
I've added __repr__
and __gt__
methods to your Comparator
class, so the code now runs, and the Comparator
instances display their val
when printed.
The important thing is to get those comparison methods to do the comparisons correctly between two Comparator
instances.
You'll notice that I've eliminated most of the methods from MaxHeap
. They aren't needed because the methods inherited from MinHeap
work ok. You may wish to restore this one to MaxHeap
def __getitem__(self, i):
return self.heap[i].val
depending on how you intend to use MaxHeap
.
import heapq
class MinHeap:
def __init__(self):
self.heap = []
def push(self, item):
heapq.heappush(self.heap, item)
def pop(self):
return heapq.heappop(self.heap)
def peek(self):
return self.heap[0]
def __getitem__(self, item):
return self.heap[item]
def __len__(self):
return len(self.heap)
class MaxHeap(MinHeap):
def push(self, item):
heapq.heappush(self.heap, Comparator(item))
class Comparator:
def __init__(self, val):
self.val = val
def __lt__(self, other):
return self.val > other.val
def __eq__(self, other):
return self.val == other.val
def __repr__(self):
return repr(self.val)
if __name__ == '__main__':
max_heap = MaxHeap()
max_heap.push(12)
max_heap.push(3)
max_heap.push(17)
while True:
try:
print(max_heap.pop())
except IndexError:
# The heap's empty, bail out
break
output
17
12
3
It's probably a Good Idea to give Comparator
the full set of rich comparison methods. They aren't needed to make the above code work, but they will make the Comparator
instances more flexible. So in case you want them, here they are:
def __lt__(self, other):
return self.val > other.val
def __le__(self, other):
return self.val >= other.val
def __gt__(self, other):
return self.val < other.val
def __ge__(self, other):
return self.val <= other.val
def __eq__(self, other):
return self.val == other.val
def __ne__(self, other):
return self.val != other.val
MinHeap/MaxHeap implementation in Python
These classes are based on Python's heapq
structure, which is built on a standard Python list. The smallest item in the minheap and the largest item in the maxheap is at index zero. So just return
self.heap[0]
If the heap is empty, that will cause an error, but that probably is what you want.
Heap Implementation in Python 3
What I feel by looking at your code and the input is that your current output is right and your expected output is wrong actually.
The thorough explanation is as such:
Initially, we push 0 so the heap is having 0 only at the top.
We push 1, heap becomes something like this:
0
/
1
similarly if we follow all steps, the heap in each step is going to be like this: (See below for each step)
0
/ \
1 30
/ \
1 3
/
170
/ \
1 3
/ \
17 210
/ \
1 3
/ \ /
17 21 360
/ \
1 3
/ \ / \
17 21 36 70
/ \
1 3
/ \ / \
17 21 36 7
/
250
/ \
1 3
/ \ / \
17 21 36 7
/ \
25 1000
/ \
/ \
1 3
/ \ / \
/ \ / \
17 21 36 7
/ \ /
25 100 19
Now at this point, 19's parent is greater so, we swap it with 21.
Hence the correct heap would be something like this:
0
/ \
/ \
1 3
/ \ / \
/ \ / \
17 19 36 7
/ \ /
25 100 21
Now if we you do the level order traversal you will get the correct output as your current output.
I think the expected output is correct for input "2" instead of "21" and may be this is where your confusion begins
Related Topics
Set Markers for Individual Points on a Line in Matplotlib
Pandas Group by and Find First Non Null Value for All Columns
How to Use SQL Parameters with Python
Type Hint for a Function That Returns Only a Specific Set of Values
Python - Using Pandas Structures with Large CSV(Iterate and Chunksize)
Syntaxerror Inconsistency in Python
Subprocess Readline Hangs Waiting for Eof
Python's JSON Module, Converts Int Dictionary Keys to Strings
Why Sum on Lists Is (Sometimes) Faster Than Itertools.Chain
Pandas Fill Missing Values in Dataframe from Another Dataframe
Hiding a Password in a Python Script (Insecure Obfuscation Only)
Python Ctypes Issue on Different Oses
How to Convert List of Key-Value Tuples into Dictionary
How to Extract an Arbitrary Line of Values from a Numpy Array
Sqlalchemy: What's the Difference Between Flush() and Commit()