◇ dsa-prep/advanced-graphs.md
12 — Advanced Graphs
1. When to use what (decision tree)
~7 min read·updated 5/29/2026
12 — Advanced Graphs
Neetcode Week 10. This is where graphs become weighted: shortest paths, MST, and the Union-Find data structure.
Table of Contents
- When to use what (decision tree)
- Dijkstra's algorithm
- Bellman-Ford (negative weights allowed)
- Bellman-Ford with k stops
- Union-Find (Disjoint Set Union)
- Minimum Spanning Tree (Kruskal & Prim)
- Floyd-Warshall (all-pairs shortest path)
- A* (mention only)
- Walked-through problems
- Pitfalls
- Practice list
1. When to use what
┌────────────────────────────────────────────────────────┐
│ │
│ Need shortest path? │
│ ├── Unweighted (or all weights = 1)? │
│ │ → BFS │
│ ├── Weights all ≥ 0? │
│ │ → Dijkstra (O((V+E) log V)) │
│ ├── Negative weights, no neg cycle? │
│ │ → Bellman-Ford (O(V·E)) │
│ ├── Need all-pairs shortest path? │
│ │ → Floyd-Warshall (O(V³)) │
│ └── Need shortest with at-most-k edges? │
│ → Bellman-Ford restricted to k iterations │
│ │
│ Need to merge groups / detect cycle in undirected? │
│ → Union-Find │
│ │
│ Need minimum spanning tree? │
│ → Kruskal (sort edges + Union-Find) │
│ → Prim (priority queue) │
│ │
└────────────────────────────────────────────────────────┘
2. Dijkstra's algorithm
Single-source shortest path, non-negative weights.
The idea
Maintain a min-heap of (distance_so_far, node). Pop the node with the smallest current distance. For each neighbor, see if going through this node gives a shorter path; if so, update and push.
import heapq from collections import defaultdict def dijkstra(graph, start): """graph: {u: [(v, weight), ...]}""" dist = {start: 0} pq = [(0, start)] while pq: d, u = heapq.heappop(pq) if d > dist.get(u, float('inf')): continue # stale entry for v, w in graph[u]: nd = d + w if nd < dist.get(v, float('inf')): dist[v] = nd heapq.heappush(pq, (nd, v)) return dist
Why does this work?
When we pop a node u, it means no future path can be shorter than what we already have. This is because:
- All weights are ≥ 0, so adding more edges can never decrease total cost.
- The heap pops in increasing order of
d.
So the popped distance is final.
Why "skip if stale" check?
We push duplicate entries when we update a shorter path, never removing the old. When the old (with larger distance) is popped, we skip it.
Complexity
| Op | Cost |
|---|---|
| Each push | O(log V) |
| Total pushes | O(E) (each edge can trigger one) |
| Total | O((V + E) log V) |
Network Delay Time (Medium)
n nodes, weighted directed edges
(u, v, w). Starting fromk, return the time for all to receive a signal, or -1 if impossible.
import heapq from collections import defaultdict def network_delay_time(times, n, k): graph = defaultdict(list) for u, v, w in times: graph[u].append((v, w)) dist = {} pq = [(0, k)] while pq: d, u = heapq.heappop(pq) if u in dist: continue # already finalized dist[u] = d for v, w in graph[u]: if v not in dist: heapq.heappush(pq, (d + w, v)) if len(dist) < n: return -1 return max(dist.values())
if u in dist: continue is the finalize on first pop version, which is more correct than "skip if stale" if we're careful.
3. Bellman-Ford
Single-source shortest path, allows negative weights, but not negative cycles reachable from source.
The idea
Relax every edge V-1 times. After V-1 iterations, the shortest path is guaranteed because any simple path has at most V-1 edges.
def bellman_ford(edges, n, src): dist = [float('inf')] * n dist[src] = 0 for _ in range(n - 1): updated = False for u, v, w in edges: if dist[u] + w < dist[v]: dist[v] = dist[u] + w updated = True if not updated: break # Optional: check for negative cycles for u, v, w in edges: if dist[u] + w < dist[v]: return None # negative cycle reachable return dist
Complexity: O(V · E).
When to use
- Negative weights allowed.
- Need to detect negative cycles.
- "Find shortest path with at most k edges" (next section).
4. Bellman-Ford with k stops
Cheapest Flights Within K Stops (Medium)
Shortest path from
srctodstusing at mostk+1edges (k stops in between).
A modified Bellman-Ford with exactly k+1 iterations:
def find_cheapest_price(n, flights, src, dst, k): dist = [float('inf')] * n dist[src] = 0 for _ in range(k + 1): new_dist = dist[:] # snapshot — important! for u, v, w in flights: if dist[u] + w < new_dist[v]: new_dist[v] = dist[u] + w dist = new_dist return dist[dst] if dist[dst] < float('inf') else -1
Why snapshot? We want each iteration to extend by exactly one more edge. If we mutate dist in place, an updated dist[v] could be used by another edge in the same iteration, effectively skipping ahead. Snapshotting prevents this.
5. Union-Find
Maintain a partition of items into disjoint sets. Two operations:
find(x)— return the representative of x's set.union(x, y)— merge the sets containing x and y.
Use cases
- Connected components (especially when adding edges incrementally).
- Cycle detection in undirected graph.
- Kruskal's MST.
- "Number of friend circles," "accounts merge."
- Many graph problems where you don't need DFS structure.
Implementation with path compression + union by rank
class UnionFind: def __init__(self, n): self.parent = list(range(n)) self.rank = [0] * n self.count = n # number of disjoint sets def find(self, x): # path compression if self.parent[x] != x: self.parent[x] = self.find(self.parent[x]) return self.parent[x] def union(self, x, y): rx, ry = self.find(x), self.find(y) if rx == ry: return False # already same set # union by rank: attach smaller under larger if self.rank[rx] < self.rank[ry]: rx, ry = ry, rx self.parent[ry] = rx if self.rank[rx] == self.rank[ry]: self.rank[rx] += 1 self.count -= 1 return True # actual merge
Complexity
With both path compression and union by rank, both find and union are nearly O(1) amortized — specifically O(α(n)) where α is the inverse Ackermann function (effectively ≤ 4 for any practical n).
For interview purposes, just say "amortized O(1) with path compression and union by rank."
Path compression intuition
Before: After find(d):
a a
↑ ↑↑↑↑
b bcde (all point directly to root)
↑
c
↑
d
We flatten the tree on every find call, making future calls instant.
Number of Connected Components (Medium)
def count_components(n, edges): uf = UnionFind(n) for u, v in edges: uf.union(u, v) return uf.count
Graph Valid Tree (Medium)
A tree has exactly
n - 1edges and is connected. Equivalently, n nodes with n-1 edges and no cycle.
def valid_tree(n, edges): if len(edges) != n - 1: return False uf = UnionFind(n) for u, v in edges: if not uf.union(u, v): return False # cycle detected return True
Redundant Connection (Medium)
An undirected graph started as a tree (n nodes, n-1 edges). One extra edge was added. Find the redundant edge — the one that creates a cycle. If multiple answers, return the one appearing last in input.
def find_redundant(edges): uf = UnionFind(len(edges) + 1) # 1-indexed for u, v in edges: if not uf.union(u, v): return [u, v]
The first union that fails is the one creating the cycle.
6. Minimum Spanning Tree
A spanning tree of a connected graph: a subset of edges connecting all V nodes with V-1 edges (no cycles). MST is one with minimum total weight.
Kruskal's Algorithm
- Sort edges by weight.
- For each edge in order: if its endpoints are in different sets (Union-Find), add to MST.
- Stop when V-1 edges added.
def kruskal(n, edges): edges.sort(key=lambda e: e[2]) # by weight uf = UnionFind(n) cost = 0 used = 0 for u, v, w in edges: if uf.union(u, v): cost += w used += 1 if used == n - 1: break return cost if used == n - 1 else -1 # disconnected
Complexity: O(E log E) for sort + O(E · α(V)) for unions.
Prim's Algorithm
- Pick any starting node.
- Maintain a min-heap of
(weight, node)for edges from MST to outside. - Pop smallest. If endpoint not in MST, add it; push its outgoing edges.
import heapq def prim(graph, n): """graph: {u: [(v, weight), ...]}""" visited = {0} pq = [(w, v) for v, w in graph[0]] heapq.heapify(pq) cost = 0 while pq and len(visited) < n: w, v = heapq.heappop(pq) if v in visited: continue visited.add(v) cost += w for u, ww in graph[v]: if u not in visited: heapq.heappush(pq, (ww, u)) return cost if len(visited) == n else -1
Complexity: O((V + E) log V).
Min Cost to Connect All Points (Medium) — Prim or Kruskal
Given 2D points, connect them all with min total Manhattan-distance edges.
Kruskal would generate O(n²) edges; for n up to 1000, that's 10⁶ — workable but borderline. Prim with implicit-graph: at each step, walk all unvisited nodes (O(n²) total) — also O(n²).
import heapq def min_cost_connect_points(points): n = len(points) visited = [False] * n pq = [(0, 0)] cost = 0 edges = 0 while edges < n: w, u = heapq.heappop(pq) if visited[u]: continue visited[u] = True cost += w edges += 1 for v in range(n): if not visited[v]: d = abs(points[u][0]-points[v][0]) + abs(points[u][1]-points[v][1]) heapq.heappush(pq, (d, v)) return cost
7. Floyd-Warshall
All-pairs shortest path, O(V³). Conceptually: for each intermediate
k, update every (i, j) usingi → k → jif shorter.
def floyd_warshall(graph, n): dist = [[float('inf')] * n for _ in range(n)] for i in range(n): dist[i][i] = 0 for u in graph: for v, w in graph[u]: dist[u][v] = w for k in range(n): for i in range(n): for j in range(n): if dist[i][k] + dist[k][j] < dist[i][j]: dist[i][j] = dist[i][k] + dist[k][j] return dist
Use when V ≤ ~500 and you need all-pairs distances. Rare in interviews; nice to recognize.
8. A* search
Heuristic-guided search. Like Dijkstra, but uses a heuristic estimate to prioritize moves toward the goal. Used in pathfinding (games, GPS).
Out of scope for most LeetCode problems. Mention it if asked about real-world pathfinding.
9. Walked-through problems
Swim in Rising Water (Hard)
2D grid where
grid[i][j]is height. Water rises one unit per minute. You can move to adjacent cells if water height ≥ both. Find min time to reach(n-1, n-1).
This is a Dijkstra where the "distance" of a path is the max cell height on it.
import heapq def swim_in_water(grid): n = len(grid) visited = set() pq = [(grid[0][0], 0, 0)] # (max_so_far, r, c) while pq: t, r, c = heapq.heappop(pq) if (r, c) in visited: continue visited.add((r, c)) if r == n - 1 and c == n - 1: return t for dr, dc in [(-1,0),(1,0),(0,-1),(0,1)]: nr, nc = r+dr, c+dc if 0 <= nr < n and 0 <= nc < n and (nr,nc) not in visited: heapq.heappush(pq, (max(t, grid[nr][nc]), nr, nc)) return -1
The cost of an edge (u, v) is max(cost_at_u, height[v]). This is "Dijkstra with a different relaxation rule" — the bottleneck shortest path.
Reconstruct Itinerary (Hard) — Eulerian path
Find a path that uses every edge exactly once, lexicographically smallest. Hierholzer's algorithm.
Alien Dictionary (Hard) — topological sort with derivation
Given words sorted in unknown alphabet, recover the alphabet's order. Compare adjacent words to derive edges, then topological sort.
10. Pitfalls
Dijkstra with negative weights
Doesn't work — Dijkstra "finalizes" a node on first pop, but a negative edge later could improve its distance. Use Bellman-Ford for negatives.
Bellman-Ford in-place vs snapshot
- In-place (mutate
distdirectly): chains updates within one iteration. Used for "shortest path no edge limit" — gives correct answer. - Snapshot (use
new_dist): each iteration extends by exactly one edge. Used for "shortest path with at most k edges."
Distinction matters for the k-stops problem.
Union-Find without compression
Without path compression, find becomes O(log n) (or even O(n) for adversarial inputs). With both path compression and union by rank, O(α(n)). Always use both.
Updating Union-Find recursively in Python
def find(self, x): if self.parent[x] != x: self.parent[x] = self.find(self.parent[x]) # recursive return self.parent[x]
This recurses up to the depth of the tree. For huge n, you may hit recursion limits. Iterative version:
def find(self, x): root = x while self.parent[root] != root: root = self.parent[root] while self.parent[x] != root: nxt = self.parent[x] self.parent[x] = root x = nxt return root
Forgetting to break on disconnected MST
If the input graph is disconnected, neither Kruskal nor Prim can build a spanning tree. Always check len(visited) == n (Prim) or used_edges == n - 1 (Kruskal) at the end.
Overflow in large weights (C++)
int can overflow when summing path costs. Use long long.
11. Practice list
| # | Problem | Pattern | Difficulty |
|---|---|---|---|
| 743 | Network Delay Time | Dijkstra | Medium |
| 778 | Swim in Rising Water | Modified Dijkstra | Hard |
| 787 | Cheapest Flights Within K Stops | Bellman-Ford k iters | Medium |
| 269 | Alien Dictionary | Topo sort | Hard |
| 332 | Reconstruct Itinerary | Hierholzer | Hard |
| 1584 | Min Cost to Connect All Points | Prim's | Medium |
| 684 | Redundant Connection | Union-Find | Medium |
| 261 | Graph Valid Tree | Union-Find | Medium |
| 323 | Number of Connected Components | Union-Find | Medium |
| 547 | Number of Provinces | Union-Find | Medium |
| 721 | Accounts Merge | Union-Find with hashmap | Medium |
| 1697 | Checking Existence of Edge Length Limited Paths | Offline + Union-Find | Hard |
TL;DR ✅
- BFS for unweighted shortest path. Dijkstra for non-negative weighted. Bellman-Ford for negatives or k-stop variants.
- Dijkstra: min-heap, lazy stale-skip, O((V+E) log V).
- Bellman-Ford: V-1 relaxation rounds. O(V·E). Snapshot for "at most k edges."
- Union-Find:
findwith path compression +unionby rank → effectively O(1). - Kruskal MST: sort edges + Union-Find. Prim MST: min-heap of frontier edges.
- Floyd-Warshall: O(V³) all-pairs. Use only when needed.
Next: 13-dp-1d.md.
External resources
- NeetCode Advanced Graphs playlist: https://www.youtube.com/playlist?list=PLot-Xpze53lcfZh2WgUMOl_uvGcZxfjyA
- VisuAlgo Dijkstra: https://visualgo.net/en/sssp
- VisuAlgo MST: https://visualgo.net/en/mst
- William Fiset Union-Find: https://www.youtube.com/watch?v=ibjEGG7ylHk
// 0 views