Patterns & Tips
A comprehensive reference for heap and priority queue patterns. Use this as a cheat sheet during practice and a quick refresher before interviews.
Heap Pattern Cheat Sheet
| Pattern | Heap Type | Key Idea | Time |
|---|---|---|---|
| Top-K Largest | Min-heap of size K | Push all, pop when size > K. Root = Kth largest. | O(n log K) |
| Top-K Smallest | Max-heap of size K | Push negated, pop when size > K. | O(n log K) |
| K-Way Merge | Min-heap of K elements | Push head of each list, pop min, push next from same list. | O(N log K) |
| Two Heaps (Median) | Max-heap + Min-heap | Lower half in max-heap, upper in min-heap. Balance sizes. | O(log n) insert |
| Scheduling | Min-heap on end/time | Sort by start, heap tracks earliest available resource. | O(n log n) |
| Lazy Deletion | Min or Max-heap | Mark deleted, skip on pop. Used in sliding window problems. | Amortized O(log n) |
| Greedy + Heap | Varies | Sort one dimension, heap manages the other. (e.g., max performance) | O(n log n) |
Python heapq Tricks & Gotchas
1. Max-Heap via Negation
# Python only has min-heap. Negate for max-heap.
import heapq
max_heap = []
heapq.heappush(max_heap, -10)
heapq.heappush(max_heap, -20)
largest = -heapq.heappop(max_heap) # 20
2. Tuple Comparison with Tiebreaker
# WRONG: crashes if values are uncomparable
# heapq.heappush(heap, (priority, my_object))
# RIGHT: use a counter as tiebreaker
counter = 0
heapq.heappush(heap, (priority, counter, my_object))
counter += 1
3. Heapify vs Repeated Push
# SLOW: O(n log n) - pushing one at a time
heap = []
for x in data:
heapq.heappush(heap, x)
# FAST: O(n) - heapify a list in-place
heap = list(data)
heapq.heapify(heap)
4. heappushpop vs heapreplace
# heappushpop: push first, then pop (may pop what you just pushed)
result = heapq.heappushpop(heap, val)
# heapreplace: pop first, then push (never pops what you push)
result = heapq.heapreplace(heap, val)
# Both are faster than separate push + pop
5. nlargest/nsmallest - When to Use
import heapq
data = [3, 1, 4, 1, 5, 9, 2, 6, 5]
# Use nlargest/nsmallest when k is small relative to n
top3 = heapq.nlargest(3, data) # O(n log 3) = O(n)
bot3 = heapq.nsmallest(3, data) # O(n log 3) = O(n)
# When k ~ n, just sort
# When k = 1, use min()/max() instead
Decision Tree: When to Use a Heap
Step 1: Does the problem ask for K-th largest/smallest, top-K, or frequent-K?
→ Yes: Use Top-K pattern (min-heap of size K for largest, max-heap for smallest)
Step 2: Does it involve merging K sorted sequences?
→ Yes: Use K-way merge pattern
Step 3: Does it ask for a running median or balanced partition?
→ Yes: Use two-heap pattern (max-heap + min-heap)
Step 4: Does it involve scheduling, intervals, or resource allocation?
→ Yes: Use scheduling pattern (sort by start, heap by end/priority)
Step 5: Does it need repeated min/max queries with dynamic insertions?
→ Yes: Use a basic heap (consider lazy deletion if removals are needed)
Otherwise: Consider if sorting, binary search, or a different data structure is more appropriate.
Common Mistakes
Forgetting Negation
Using a min-heap when you need max. Always negate values for a max-heap in Python. Don't forget to negate back when reading!
No Tiebreaker
Pushing (priority, object) crashes when priorities are equal and objects aren't comparable. Always use (priority, counter, object).
Wrong Heap Size
For "K largest", use a min-heap of size K (not max-heap of size n). The root of the min-heap IS the Kth largest.
Modifying Heap Directly
Never modify heap elements in-place (e.g., heap[i] = new_val). This breaks the heap invariant. Use lazy deletion instead.
Heap vs Alternatives
| Approach | Insert | Get Min/Max | Delete Min/Max | Best For |
|---|---|---|---|---|
| Heap | O(log n) | O(1) | O(log n) | Priority queue, top-K, streaming |
| Sorted Array | O(n) | O(1) | O(1) | Static data, rare inserts |
| BST / SortedList | O(log n) | O(log n) | O(log n) | Arbitrary deletions, rank queries |
| Sorting | N/A | O(1) | N/A | One-time top-K on static data |
Frequently Asked Questions
Use a heap when: (1) data arrives as a stream and you can't sort upfront, (2) you only need top-K elements where K << n (heap is O(n log K) vs O(n log n) for sort), or (3) you need dynamic insertions after initial processing. Use sorting when you have all data upfront and need the full sorted order, or when K is close to n.
Python's heapq doesn't support arbitrary deletion. Use lazy deletion: mark elements as "deleted" in a hash map, and when they appear at the top during heappop, skip them. This gives amortized O(log n) per operation. For frequent arbitrary deletions, consider using sortedcontainers.SortedList which supports O(log n) deletion by value.
heappushpop(heap, val) pushes the value first, then pops the smallest. This means the pushed value might be immediately popped if it's the smallest. heapreplace(heap, val) pops first, then pushes. The popped value is always from the original heap. Both are more efficient than calling push and pop separately because they only sift once.
Beam search maintains the top-K most probable partial sequences during autoregressive decoding (translation, summarization, code generation). At each step, each of the K beams generates V candidates (V = vocabulary size), giving K*V total candidates. A min-heap of size K efficiently selects the K best from these K*V candidates in O(KV log K) time. Without a heap, you'd need to sort all K*V candidates at O(KV log KV).
queue.PriorityQueue is thread-safe but slower due to locking overhead. For coding interviews and single-threaded applications, always use heapq directly — it's faster and gives you more control. PriorityQueue is designed for producer-consumer patterns in multi-threaded programs, not for algorithmic problems.
Python's heapq doesn't support decrease-key natively. Two approaches: (1) Lazy insertion: push the updated (lower) key as a new entry and mark the old entry as invalid. When you pop an invalid entry, skip it. This is the standard approach for Dijkstra's algorithm in Python. (2) Use a different structure: if you need true decrease-key, consider a Fibonacci heap (not in Python stdlib) or indexed priority queue.
The top 10 most frequently asked: (1) Kth Largest Element, (2) Top K Frequent Elements, (3) Merge K Sorted Lists, (4) Find Median from Data Stream, (5) Meeting Rooms II, (6) Task Scheduler, (7) K Closest Points to Origin, (8) Reorganize String, (9) The Skyline Problem, (10) Ugly Number II. Master these and you'll handle most heap questions confidently.
Quick Reference Templates
Top-K Template
import heapq
def top_k_largest(nums, k):
"""Return k largest elements."""
heap = []
for num in nums:
heapq.heappush(heap, num)
if len(heap) > k:
heapq.heappop(heap)
return sorted(heap, reverse=True) # optional sort
K-Way Merge Template
import heapq
def k_way_merge(lists):
"""Merge k sorted lists into one sorted list."""
heap = []
for i, lst in enumerate(lists):
if lst:
heapq.heappush(heap, (lst[0], i, 0))
result = []
while heap:
val, list_i, elem_i = heapq.heappop(heap)
result.append(val)
if elem_i + 1 < len(lists[list_i]):
next_val = lists[list_i][elem_i + 1]
heapq.heappush(heap, (next_val, list_i, elem_i + 1))
return result
Two Heaps (Median) Template
import heapq
class RunningMedian:
def __init__(self):
self.lo = [] # max-heap (negated)
self.hi = [] # min-heap
def add(self, num):
heapq.heappush(self.lo, -num)
heapq.heappush(self.hi, -heapq.heappop(self.lo))
if len(self.hi) > len(self.lo):
heapq.heappush(self.lo, -heapq.heappop(self.hi))
def median(self):
if len(self.lo) > len(self.hi):
return -self.lo[0]
return (-self.lo[0] + self.hi[0]) / 2.0
Scheduling Template
import heapq
def min_resources(intervals):
"""Find minimum resources for overlapping intervals."""
intervals.sort()
heap = [] # end times of active intervals
for start, end in intervals:
if heap and heap[0] <= start:
heapq.heappop(heap)
heapq.heappush(heap, end)
return len(heap)
Lilly Tech Systems