◇ dsa-prep/heaps.md
09 — Heaps & Priority Queues
1. Heap conceptually
~7 min read·updated 5/29/2026
09 — Heaps & Priority Queues
Neetcode Week 8a. A heap is a partially ordered tree that gives you the minimum (or maximum) in O(log n) time. The two big patterns: Top K and Two Heaps for the Median.
Table of Contents
- Heap conceptually
- Array-backed heap (the actual implementation)
- Operations and complexity
- Python
heapq - C++
priority_queue - Pattern A — Top K Elements
- Pattern B — Two Heaps for Streaming Median
- Pattern C — K-way Merge
- Pattern D — Scheduling / Greedy with priority
- Pitfalls
- Practice list
1. Heap conceptually
A min-heap is a binary tree where every parent is ≤ its children. The minimum is always at the root.
1 ← min, always at root
/ \
3 2
/ \ / \
5 4 6 7
Constraints:
- Heap order: parent ≤ children (min-heap) or parent ≥ children (max-heap).
- Shape: complete binary tree (every level filled left-to-right except possibly the last).
The shape constraint means we can store the heap in an array with no pointers.
A max-heap is the same but with parent ≥ children. Top is the maximum.
2. Array-backed heap
For node at index i in a 0-indexed array:
- left child:
2i + 1 - right child:
2i + 2 - parent:
(i - 1) // 2
Index: 0 1 2 3 4 5 6
┌────┬────┬────┬────┬────┬────┬────┐
Heap: │ 1 │ 3 │ 2 │ 5 │ 4 │ 6 │ 7 │
└────┴────┴────┴────┴────┴────┴────┘
Visualizes as:
1 (idx 0)
/ \
3 (1) 2 (2)
/ \ / \
5(3) 4(4) 6(5) 7(6)
Push (sift up)
- Append at the end of the array.
- While new element < its parent, swap with parent.
Pop (sift down)
- Swap root with last element. Pop last (the old root, returned).
- Starting at root, while it's > smaller child, swap with the smaller child.
Both operations are O(log n) because the tree has height log n.
3. Operations and complexity
| Op | Time | Description |
|---|---|---|
peek() | O(1) | Return root |
push(x) | O(log n) | Insert + sift up |
pop() | O(log n) | Remove root + sift down |
heapify(arr) | O(n) | Build heap from array |
replace(x) | O(log n) | Pop and push (a single sift-down) |
⚠️ Building a heap from
nelements is O(n), not O(n log n). The sift-down approach (Floyd's algorithm) only does total work proportional to n. Don't sayO(n log n)forheapifyin interviews.
| Op | Why O(n) for heapify? |
|---|---|
| Naive: push n times | Each push O(log n), total O(n log n). |
| Floyd: from index n/2-1 down to 0, sift down | Most nodes are near the bottom and only sift a few levels. Math gives sum ≤ 2n = O(n). |
4. Python heapq
Python's heap is a min-heap stored in a list. It mutates the list in place.
import heapq h = [] heapq.heappush(h, 3) heapq.heappush(h, 1) heapq.heappush(h, 5) heapq.heappop(h) # 1 h[0] # peek minimum (just index 0) # heapify in O(n) arr = [5, 1, 3, 8, 2] heapq.heapify(arr) # arr is now [1, 2, 3, 8, 5] — valid min-heap # heappushpop — push and pop in one go (faster than two ops) heapq.heappushpop(h, x) # heapreplace — pop then push (errors if heap empty) heapq.heapreplace(h, x) # nlargest / nsmallest — for static arrays, when k is small heapq.nlargest(3, [4, 1, 7, 3, 8, 2]) # [8, 7, 4] heapq.nsmallest(3, [4, 1, 7, 3, 8, 2]) # [1, 2, 3]
Max-heap in Python (the workaround)
# Negate values heapq.heappush(h, -x) top_max = -heapq.heappop(h)
Heap of tuples (multi-key sort)
heapq.heappush(h, (priority, secondary, item)) # ordered by priority, then secondary
Pitfall: if the items themselves aren't comparable, ties on priority will try to compare item and crash. Add an incrementing counter as a tiebreaker:
counter = 0 heapq.heappush(h, (priority, counter, item)) counter += 1
This pattern shows up constantly.
5. C++ priority_queue
C++'s default is a max-heap:
priority_queue<int> pq; pq.push(3); pq.push(1); pq.top(); // 3 (max) pq.pop();
For a min-heap:
priority_queue<int, vector<int>, greater<int>> pq;
For pairs / custom objects, provide a comparator:
auto cmp = [](const pair<int,int>& a, const pair<int,int>& b) { return a.first > b.first; // min-heap by first }; priority_queue<pair<int,int>, vector<pair<int,int>>, decltype(cmp)> pq(cmp);
Difference summary
Python heapq | C++ priority_queue | |
|---|---|---|
| Default | Min-heap | Max-heap |
| Negate trick for opposite | yes | no — use greater<T> |
| Heapify in place | heapq.heapify(list) | make_heap(v.begin(), v.end()) |
| Pop returns value | heappop(h) returns | pq.pop() is void; use pq.top() first |
6. Pattern A — Top K Elements
Trigger: "k largest / smallest", "k closest to origin", "top k frequent."
Approach 1 — Heap of size k
For "top K largest," maintain a min-heap of size k. The min of the heap is the k-th largest seen so far. When a new element is bigger, push it and pop the min.
import heapq def top_k_largest(nums, k): h = [] # min-heap for x in nums: if len(h) < k: heapq.heappush(h, x) elif x > h[0]: heapq.heapreplace(h, x) # pop and push in one return list(h)
Complexity: O(n log k). Space: O(k).
For "top K smallest," use a max-heap of size k (negate values in Python).
Approach 2 — Heapify and pop k times
def top_k_smallest(nums, k): heapq.heapify(nums) # O(n) return [heapq.heappop(nums) for _ in range(k)] # O(k log n)
Complexity: O(n + k log n). Better when k ≈ n (since heap-of-size-k = O(n log k) ≈ O(n log n)).
Approach 3 — Quickselect (average O(n))
Find the k-th element with quickselect (average O(n)) and then partition. Worth knowing for a "best you can do" answer; details in §7 of foundations. Often skipped in favor of heap solution because it's robust and easier to write.
Top K Frequent Elements
from collections import Counter import heapq def top_k_frequent(nums, k): count = Counter(nums) return heapq.nlargest(k, count.keys(), key=count.get)
Or bucket sort for O(n) — see 01-arrays-hashing.md.
K Closest Points to Origin
import heapq def k_closest(points, k): return heapq.nsmallest(k, points, key=lambda p: p[0]**2 + p[1]**2)
7. Pattern B — Two heaps
Trigger: "running median," "median of stream," "balance two halves."
Maintain two heaps:
small— max-heap of the smaller halflarge— min-heap of the larger half
Invariants:
- Every element in
small≤ every element inlarge. len(small) == len(large)orlen(small) == len(large) + 1.
Median is:
- If sizes equal: average of
small.top()andlarge.top(). - Else:
small.top().
Find Median from Data Stream (Hard)
import heapq class MedianFinder: def __init__(self): self.small = [] # max-heap (negate) self.large = [] # min-heap def addNum(self, num): # 1. Push to small (negate for max-heap) heapq.heappush(self.small, -num) # 2. Move max of small to large to maintain ordering heapq.heappush(self.large, -heapq.heappop(self.small)) # 3. Rebalance sizes if len(self.large) > len(self.small): heapq.heappush(self.small, -heapq.heappop(self.large)) def findMedian(self): if len(self.small) > len(self.large): return -self.small[0] return (-self.small[0] + self.large[0]) / 2
Steps 1–2: insert into the wrong heap to maintain ordering, then bubble across. This naturally enforces invariant 1.
Step 3: keep small ≥ large in size by 0 or 1.
Complexity: add = O(log n), median = O(1).
Why two heaps?
To get the median in O(1), you need to split the data at the middle. Heaps give you O(1) access to each side's extreme, and O(log n) inserts. Sorted data structures (BST, sorted list) give the same ops but two heaps are simpler.
8. Pattern C — K-way Merge
Trigger: "merge K sorted arrays / lists," "k smallest in matrix."
Push the head of each sorted source into a heap. Pop smallest, append, push next.
Merge K Sorted Lists (Hard) — see also 06-linked-list.md
import heapq def merge_k_lists(lists): h = [] for i, lst in enumerate(lists): if lst: heapq.heappush(h, (lst.val, i, lst)) # i for tiebreak dummy = ListNode() tail = dummy while h: val, i, node = heapq.heappop(h) tail.next = node tail = node if node.next: heapq.heappush(h, (node.next.val, i, node.next)) return dummy.next
Complexity: O(N log k) where N = total elements, k = number of lists.
Kth Smallest in Sorted Matrix (Medium)
import heapq def kth_smallest(matrix, k): n = len(matrix) # Push first column h = [(matrix[r][0], r, 0) for r in range(min(k, n))] heapq.heapify(h) for _ in range(k - 1): val, r, c = heapq.heappop(h) if c + 1 < n: heapq.heappush(h, (matrix[r][c + 1], r, c + 1)) return h[0][0]
Treat each row as a sorted source. K-way merge until we pop k times.
(Better solution exists with binary search on the answer — see 05-binary-search.md.)
9. Pattern D — Scheduling
Trigger: "schedule tasks with cooldown," "process intervals by priority."
Task Scheduler (Medium)
Tasks have types A-Z. Same type must wait
ncycles before reusing. Find minimum time.
from collections import Counter, deque import heapq def least_interval(tasks, n): count = Counter(tasks) h = [-c for c in count.values()] # max-heap by count heapq.heapify(h) q = deque() # (count_remaining, ready_time) time = 0 while h or q: time += 1 if h: cnt = heapq.heappop(h) + 1 # one fewer of this task if cnt < 0: # still has remaining q.append((cnt, time + n)) # ready at time + n if q and q[0][1] == time: heapq.heappush(h, q.popleft()[0]) return time
The heap holds tasks ready now; the deque holds tasks waiting for cooldown to expire.
Reorganize String (Medium) — same pattern with chars
Rearrange string so no two adjacent chars are equal.
def reorganize_string(s): count = Counter(s) h = [(-cnt, ch) for ch, cnt in count.items()] heapq.heapify(h) res = [] prev = (0, '') while h: cnt, ch = heapq.heappop(h) res.append(ch) if prev[0] < 0: heapq.heappush(h, prev) prev = (cnt + 1, ch) # one used; reschedule next round return ''.join(res) if len(res) == len(s) else ""
10. Pitfalls
Heap doesn't support efficient delete-by-value
Removing an arbitrary element from a heap is O(n). For "lazy deletion" pattern: keep a separate set of "deleted" items, and skip them when they bubble to the top.
deleted = set() while h and h[0] in deleted: heapq.heappop(h) return h[0]
Forgetting tiebreaker in tuples
heapq.heappush(h, (priority, my_object)) # crashes on tie if my_object isn't comparable
Insert a counter as second element.
Confusing min vs max
Python = min by default. C++ priority_queue = max by default. Always state which.
Negate-trick subtleties
For min-heap-via-negation in Python, remember to negate back when reading:
heapq.heappush(h, -x) top = -h[0] # ← negate back
Heap is not sorted
Iterating a heap doesn't yield sorted order. Heap order is partial — only root is guaranteed minimum/maximum.
Building a heap is O(n)
Don't say "n log n" for heapify. It's O(n) by Floyd's algorithm.
Updating priority in Python
heapq doesn't support decrease-key. Workaround: push the new (priority, item) and ignore stale popped items.
while h and h[0][1] != current_priority[h[0][2]]: heapq.heappop(h)
Dijkstra needs decrease-key conceptually but lazy works
For shortest-path-with-weights problems, the lazy approach (push duplicates, skip stale) is fine and is what most interview implementations look like. See 12-advanced-graphs.md.
11. Practice list
| # | Problem | Pattern | Difficulty |
|---|---|---|---|
| 703 | Kth Largest Element in Stream | Heap of size k | Easy |
| 1046 | Last Stone Weight | Max-heap | Easy |
| 973 | K Closest Points to Origin | Top K | Medium |
| 215 | Kth Largest Element in Array | Heap or QuickSelect | Medium |
| 621 | Task Scheduler | Heap + cooldown queue | Medium |
| 355 | Design Twitter | Heap-merge per user | Medium |
| 295 | Find Median from Data Stream | Two heaps | Hard |
| 23 | Merge K Sorted Lists | K-way merge | Hard |
| 378 | Kth Smallest in Sorted Matrix | K-way merge | Medium |
| 767 | Reorganize String | Heap scheduling | Medium |
| 347 | Top K Frequent Elements | Heap or bucket sort | Medium |
TL;DR ✅
- A heap = complete binary tree with parent ≤/≥ children. Stored in array (no pointers).
peekO(1),pushandpopO(log n),heapifyO(n).- Python
heapqis min-only; negate values for max. C++priority_queueis max by default; passgreater<>for min. - Top K = heap of size k. O(n log k).
- Two heaps = streaming median.
- K-way merge = heap of source heads.
- Lazy deletion for "decrease key" — push new, skip stale on pop.
- Add a tiebreaker when pushing tuples of incomparable objects.
Next: 10-backtracking.md.
External resources
- NeetCode Heap playlist: https://www.youtube.com/playlist?list=PLot-Xpze53lfRGq-aHftEbHJtoX0E5DCN
- Python heapq docs: https://docs.python.org/3/library/heapq.html
- VisuAlgo heap: https://visualgo.net/en/heap
// 1 view