A heap is a specialized tree-based data structure which is essentially an almost complete tree that satisfies the heap property:
- in a max heap, for any given node
C
, ifP
is a parent node ofC
, then the key (the value) ofP
is greater than or equal to the key ofC
. - in a min heap, the key of
P
is less than or equal to the key ofC
.
The node at the “top” of the heap (with no parents) is called the root node.
Heaps are usually implemented with an array, as follows:
- Each element in the array represents a node of the heap, and
- The parent / child relationship is defined implicitly by the elements’ indices in the array.
Given a node at index i
,
- its children are at indices
2i + 1
and2i + 2
, and - its parent is at index
floor((i-1)/2)
Building heap
|
|
Time Complexity:
build_heap
takes O(n)
time complexity but heap_sort
takes O(nlog(n))
time complexity.
- We call
heapify
, which takeslog(n)
, forn
elements, but the time complexity of building heap comes out to beO(n)
. - The main idea is that in the
build_heap
algorithm the actualheapify
cost is notO(log(n))
for all elements.
Operations
The common operations involving heaps are:
Basic
find-max
(orfind-min
):
find a maximum item of a max-heap, or a minimum item of a min-heap, respectively (a.k.a. peek)insert
:
adding a new key to the heap (a.k.a., push)extract-max
(orextract-min
):
returns the node of maximum value from a max heap [or minimum value from a min heap] after removing it from the heap (a.k.a., pop)delete-max
(ordelete-min
):
removing the root node of a max heap (or min heap), respectivelyreplace
:
pop root and push a new key. More efficient than pop followed by push, since only need to balance once, not twice, and appropriate for fixed-size heaps.
Creation
create-heap
:
create an empty heapheapify
:
create a heap out of given array of elementsmerge
(union
):
joining two heaps to form a valid new heap containing all the elements of both, preserving the original heaps.meld
:
joining two heaps to form a valid new heap containing all the elements of both, destroying the original heaps.
Inspection
size
:
return the number of items in the heap.is-empty
:
return true if the heap is empty, false otherwise.
Internal
increase-key
ordecrease-key
:
updating a key within a max- or min-heap, respectivelydelete
:
delete an arbitrary node (followed by moving last node and sifting to maintain heap)sift-up
:
move a node up in the tree, as long as needed; used to restore heap condition after insertion. Called “sift” because node moves up the tree until it reaches the correct level, as in a sieve.sift-down
:
move a node down in the tree, similar to sift-up; used to restore heap condition after deletion or replacement.
Variants
Operation | find-max | delete-max | insert | increase-key | meld |
---|---|---|---|---|---|
Binary | Θ(1) | Θ(log n) | O(log n) | O(log n) | Θ(n) |
Binomial | Θ(1) | Θ(log n) | Θ(1) | Θ(log n) | O(log n) |
Fibonacci | Θ(1) | O(log n) | Θ(1) | Θ(1) | Θ(1) |
Applications
- Heapsort
- Priority queues can be efficiently implemented using Binary Heap.
- Heap Implemented priority queues are used in Graph algorithms like Prim’s Algorithm and Dijkstra’s algorithm.
- Binomial Heap and Fibonacci Heap are variations of Binary Heap. These variations perform union also in
O(logn)
time which is aO(n)
operation in Binary Heap. - Order Statistics: The Heap data structure can be used to efficiently find the kth smallest (or largest) element in an array. https://en.wikipedia.org/wiki/K-way_merge_algorithm
MinHeap Implementation in Python
|
|
Using Python’s heapq library
heapq
implements min_heap
, see code below:
|
|
Kth largest or smallest
Do not use this when doing heap problems this creates a min heap everytime
|
|
Implement max_heap
with heapq
This library supports min_heap
implementation, to make max_heap
multiply each element with -1
Push custom object with heapq
If the object implements comparison operators, you can push your custom objects into the heap.
In case you are not able to update/implement class definition, you can use tuple to push custom objects, wherein the first member of the tuple is the comparison key.
eg:heapq.heappush(heap, (l.val, l))
, herel
is a class object (lets say linked-list)The problem with putting
(l.val, l)
on the heap is that when two tuples’s first member have the same value, then a comparison will happen between the second member values, i.e. betweenl
class objects.
Either make sure your first member will never have a conflict (this shall throw runtime error) or you can introduce a third member (unique value) in those tuples, between the two members you already have.
eg:heapq.heappush(heap, (l.val, i, l))
But remember if the second member will have conflict this too shall throw runtime error.
Sources: