Graphs
BFS, DFS, islands, topological sort, cycle detection.
11 — Graphs
Neetcode Week 9. Graphs are the most-asked Hard topic at Google. Many problems disguise themselves: a 2D grid is a graph, a Sudoku board is a graph, dependencies are a graph. Once you spot it, you reach for BFS, DFS, or topological sort.
Table of Contents
- Graph terminology
- Representations
- BFS — breadth-first search
- DFS — depth-first search
- Pattern: number-of-islands family
- Pattern: multi-source BFS
- Pattern: cycle detection
- Topological sort (Kahn + DFS)
- Walked-through problems
- Pitfalls
- Practice list
1. Graph terminology
A graph G = (V, E):
V= set of vertices (nodes)E= set of edges (pairs connecting two vertices)
| Term | Meaning |
|---|---|
| Directed | Edge has direction (a→b ≠ b→a) |
| Undirected | Edges go both ways |
| Weighted | Each edge has a number (weight/cost) |
| Cyclic / acyclic | Has at least one cycle / no cycles |
| DAG | Directed Acyclic Graph |
| Connected (undirected) | A path exists between every pair |
| Strongly connected (directed) | Same, in directed sense |
| Tree | Connected, undirected, acyclic, V-1 edges |
| Forest | Disjoint trees |
| Adjacency | Two vertices are adjacent if there's an edge between them |
| Degree | Number of edges at a vertex (in-degree / out-degree for directed) |
Undirected graph: Directed graph:
A───B A───→B
│ │ │ │
│ │ ↓ ↓
C───D C───→D
2. Representations
Adjacency list (most common)
For each vertex, store a list of neighbors.
graph = { 'A': ['B', 'C'], 'B': ['A', 'D'], 'C': ['A', 'D'], 'D': ['B', 'C'], }
Or with defaultdict:
from collections import defaultdict graph = defaultdict(list) graph['A'].append('B') graph['B'].append('A') # for undirected, add both directions
| Op | Cost |
|---|---|
| Find neighbors of v | O(degree(v)) |
| Check if (u, v) is edge | O(degree(u)) |
| Total memory | O(V + E) |
Adjacency matrix
matrix[i][j] = 1 if edge from i to j, else 0. Memory O(V²) — only good for dense graphs.
n = 4 matrix = [[0]*n for _ in range(n)] matrix[0][1] = 1 # edge from 0 to 1
| Op | Cost |
|---|---|
| Check if (u, v) is edge | O(1) |
| Find neighbors | O(V) |
| Total memory | O(V²) |
Edge list
Just [(u, v, weight), ...]. Used for Bellman-Ford and Kruskal's MST.
Implicit graphs
Sometimes the graph isn't given — you compute neighbors on the fly:
- Grid: neighbors of
(r, c)are(r±1, c)and(r, c±1). - Word ladder: neighbors of "cat" are "bat", "car", etc. (one letter changed).
- Knight's tour: neighbors are the L-shaped moves.
Use the same BFS/DFS algorithms; just generate neighbors when needed.
3. BFS — breadth-first search
Visit all nodes at distance 1, then all at distance 2, and so on.
Use cases:
- Shortest path in unweighted graph (or where all edges cost 1).
- Level-order traversal of a tree.
- Finding any node satisfying a condition, via shortest steps.
Template
from collections import deque def bfs(start, graph): visited = {start} q = deque([start]) while q: node = q.popleft() # process node for nbr in graph[node]: if nbr not in visited: visited.add(nbr) q.append(nbr)
Critical: mark as visited when adding to the queue, not when popping. Otherwise you'll add the same node multiple times.
Tracking levels (distance)
def shortest_path_unweighted(start, target, graph): if start == target: return 0 visited = {start} q = deque([(start, 0)]) # (node, dist) while q: node, d = q.popleft() for nbr in graph[node]: if nbr == target: return d + 1 if nbr not in visited: visited.add(nbr) q.append((nbr, d + 1)) return -1 # unreachable
Or process layer-by-layer (snapshot trick from 07-trees.md).
Why BFS gives shortest path on unweighted
Each level is "1 more step from the start." The first time we encounter a node, it's via the shortest path. (False if edge weights vary — then use Dijkstra.)
4. DFS — depth-first search
Go as deep as possible, then backtrack.
Use cases:
- Connected components.
- Topological sort.
- Cycle detection.
- Any "explore all reachable" question where order doesn't matter.
Recursive
def dfs(node, graph, visited): if node in visited: return visited.add(node) # process node for nbr in graph[node]: dfs(nbr, graph, visited)
Iterative (explicit stack)
def dfs_iter(start, graph): visited = {start} stack = [start] while stack: node = stack.pop() # process node for nbr in graph[node]: if nbr not in visited: visited.add(nbr) stack.append(nbr)
Same logic as BFS but with stack instead of queue.
When to use BFS vs DFS
| Problem says... | Use |
|---|---|
| "Shortest" something in unweighted | BFS |
| "Reachability" / "connected components" | Either (DFS easier to write) |
| "All paths" | DFS (with backtracking) |
| "Cycle in directed graph" | DFS with white/gray/black |
| "Topological sort" | Either (Kahn = BFS, post-order DFS) |
5. Pattern: Number of islands family
A 2D grid is a graph where each cell is a vertex, neighbors are 4 adjacent cells.
Number of Islands (Medium)
Count connected components of '1' cells.
def num_islands(grid): rows, cols = len(grid), len(grid[0]) count = 0 def dfs(r, c): if r < 0 or r >= rows or c < 0 or c >= cols or grid[r][c] != '1': return grid[r][c] = '0' # mark visited dfs(r+1, c); dfs(r-1, c); dfs(r, c+1); dfs(r, c-1) for r in range(rows): for c in range(cols): if grid[r][c] == '1': dfs(r, c) count += 1 return count
Classic DFS over a grid. For each unvisited '1', flood the whole island with DFS, increment count.
Max Area of Island (Medium)
Same skeleton; DFS returns the size:
def max_area(grid): rows, cols = len(grid), len(grid[0]) def dfs(r, c): if r < 0 or r >= rows or c < 0 or c >= cols or grid[r][c] != 1: return 0 grid[r][c] = 0 return 1 + dfs(r+1, c) + dfs(r-1, c) + dfs(r, c+1) + dfs(r, c-1) return max((dfs(r, c) for r in range(rows) for c in range(cols)), default=0)
Surrounded Regions (Medium)
Capture all 'O' regions that are not connected to the border.
Reverse approach: from each border 'O', DFS to mark safe cells. After, all unmarked 'O's are surrounded.
def solve(board): if not board: return rows, cols = len(board), len(board[0]) def dfs(r, c): if r < 0 or r >= rows or c < 0 or c >= cols or board[r][c] != 'O': return board[r][c] = 'S' # mark safe dfs(r+1, c); dfs(r-1, c); dfs(r, c+1); dfs(r, c-1) for r in range(rows): dfs(r, 0); dfs(r, cols-1) for c in range(cols): dfs(0, c); dfs(rows-1, c) for r in range(rows): for c in range(cols): board[r][c] = 'X' if board[r][c] == 'O' else ('O' if board[r][c] == 'S' else 'X')
The "trick" is marking what to keep, then converting the rest. Often easier than tracking what to capture.
Pacific Atlantic Water Flow (Medium)
Two BFS/DFS from each ocean's borders. Cell that's reachable from both belongs in the answer. Same "reverse" pattern.
6. Pattern: multi-source BFS
Trigger: "earliest time something spreads," "minimum distance from any of these sources."
Push all sources into the queue at distance 0, then BFS as usual. The first time a cell is visited, that's the closest source.
Rotting Oranges (Medium)
Each rotten orange spreads rot to adjacent fresh oranges per minute. Minimum minutes until all rotten?
from collections import deque def oranges_rotting(grid): rows, cols = len(grid), len(grid[0]) q = deque() fresh = 0 for r in range(rows): for c in range(cols): if grid[r][c] == 2: q.append((r, c, 0)) elif grid[r][c] == 1: fresh += 1 last_time = 0 while q: r, c, t = q.popleft() last_time = t for dr, dc in [(-1,0),(1,0),(0,-1),(0,1)]: nr, nc = r+dr, c+dc if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 1: grid[nr][nc] = 2 fresh -= 1 q.append((nr, nc, t + 1)) return last_time if fresh == 0 else -1
Walls and Gates (Medium)
Same pattern: all gates pushed at distance 0. Each cell gets the distance to its closest gate.
7. Pattern: cycle detection
In an undirected graph (DFS)
def has_cycle_undirected(graph): visited = set() def dfs(node, parent): visited.add(node) for nbr in graph[node]: if nbr not in visited: if dfs(nbr, node): return True elif nbr != parent: # back edge to non-parent → cycle return True return False return any(dfs(v, None) for v in graph if v not in visited)
The "parent" parameter avoids treating the back edge through which we entered as a cycle.
In a directed graph (DFS with white/gray/black)
def has_cycle_directed(graph): WHITE, GRAY, BLACK = 0, 1, 2 color = {v: WHITE for v in graph} def dfs(v): color[v] = GRAY for u in graph[v]: if color[u] == GRAY: return True # back edge in current DFS path → cycle if color[u] == WHITE and dfs(u): return True color[v] = BLACK return False return any(dfs(v) for v in graph if color[v] == WHITE)
GRAY means "in the current DFS path." Hitting a GRAY node means we've looped back.
8. Topological sort
For DAGs only. Linear ordering where every edge
(u, v)hasubeforev.
Real-world: course prerequisites, build-system dependencies, package install order.
Approach 1: Kahn's algorithm (BFS)
- Compute in-degree for every vertex.
- Push all vertices with in-degree 0 into the queue.
- Pop one. Add to result. Decrement in-degrees of its neighbors. Push any neighbor whose in-degree drops to 0.
- Repeat.
If at the end, result has fewer than V vertices → there's a cycle.
from collections import defaultdict, deque def topo_sort_kahn(num_courses, prerequisites): graph = defaultdict(list) in_deg = [0] * num_courses for c, pre in prerequisites: graph[pre].append(c) in_deg[c] += 1 q = deque(v for v in range(num_courses) if in_deg[v] == 0) order = [] while q: v = q.popleft() order.append(v) for nbr in graph[v]: in_deg[nbr] -= 1 if in_deg[nbr] == 0: q.append(nbr) return order if len(order) == num_courses else [] # [] = cycle
Approach 2: DFS with finishing-time (post-order reversed)
def topo_sort_dfs(num_courses, prerequisites): graph = defaultdict(list) for c, pre in prerequisites: graph[pre].append(c) WHITE, GRAY, BLACK = 0, 1, 2 color = [WHITE] * num_courses order = [] def dfs(v): if color[v] == GRAY: return False if color[v] == BLACK: return True color[v] = GRAY for u in graph[v]: if not dfs(u): return False color[v] = BLACK order.append(v) return True for v in range(num_courses): if not dfs(v): return [] # cycle return order[::-1]
When a DFS finishes a node, all its dependents are already processed; recording in finish order then reversing gives a valid topo order.
Course Schedule (Medium) — just cycle check
def can_finish(num_courses, prerequisites): return len(topo_sort_kahn(num_courses, prerequisites)) == num_courses
Course Schedule II (Medium) — return the order
Use Kahn directly, return order if len(order) == num_courses else [].
9. Walked-through problems
Clone Graph (Medium)
Given an undirected graph, deep-copy it.
DFS with memoization (old node → new node).
def clone_graph(node): if not node: return None mapping = {} def dfs(n): if n in mapping: return mapping[n] copy = Node(n.val) mapping[n] = copy for nbr in n.neighbors: copy.neighbors.append(dfs(nbr)) return copy return dfs(node)
Word Ladder (Hard)
Transform
beginWordtoendWordone letter at a time. Each intermediate must be inwordList. Minimum number of transformations?
Treat words as nodes; two words are neighbors if they differ by one letter. BFS for shortest.
Trick to find neighbors efficiently: for each word, generate all "wildcards" like _at, c_t, ca_ and group words by these wildcards. This way you find one-edit neighbors in O(L) instead of O(N * L) per word.
from collections import defaultdict, deque def ladder_length(beginWord, endWord, wordList): word_set = set(wordList) if endWord not in word_set: return 0 L = len(beginWord) pattern_map = defaultdict(list) for w in word_set: for i in range(L): pattern = w[:i] + '*' + w[i+1:] pattern_map[pattern].append(w) visited = {beginWord} q = deque([(beginWord, 1)]) while q: word, level = q.popleft() for i in range(L): pattern = word[:i] + '*' + word[i+1:] for nbr in pattern_map[pattern]: if nbr == endWord: return level + 1 if nbr not in visited: visited.add(nbr) q.append((nbr, level + 1)) return 0
10. Pitfalls
Marking visited at pop time (BFS)
while q: n = q.popleft() if n in visited: continue # too late! visited.add(n) ...
Same node can be added to the queue multiple times before being popped → wasted work. Mark visited when adding to queue.
Forgetting bidirectional edges (undirected)
graph[u].append(v) # graph[v].append(u) ← MISSING for undirected
Off-by-one in distance
Starting node is at distance 0, not 1. Verify with a tiny test.
Confusing recursion stack and graph stack
In iterative DFS, use a stack of nodes. Don't stack functions/calls.
Trying topo sort on cyclic graph
Topo sort assumes DAG. If there's a cycle, Kahn produces fewer than V nodes (good signal).
Using set() for visited on giant graphs
For grids, [[False] * cols for _ in range(rows)] is faster than a set of tuples. Profile before changing.
Multi-source BFS edge case
If all cells are sources, distance is 0 everywhere — handle that.
Stack overflow on deep DFS
Python's default recursion limit is 1000. Use sys.setrecursionlimit(10**6) or convert to iterative.
11. Practice list
| # | Problem | Pattern | Difficulty |
|---|---|---|---|
| 200 | Number of Islands | DFS on grid | Medium |
| 133 | Clone Graph | DFS + memo | Medium |
| 695 | Max Area of Island | DFS returning size | Medium |
| 417 | Pacific Atlantic Water Flow | Reverse-BFS | Medium |
| 130 | Surrounded Regions | Border DFS | Medium |
| 994 | Rotting Oranges | Multi-source BFS | Medium |
| 286 | Walls and Gates | Multi-source BFS | Medium |
| 207 | Course Schedule | Cycle detection | Medium |
| 210 | Course Schedule II | Topo sort | Medium |
| 127 | Word Ladder | BFS on wildcard-graph | Hard |
| 261 | Graph Valid Tree | Cycle + connectivity | Medium |
| 323 | Connected Components in Undirected Graph | DFS / Union-Find | Medium |
TL;DR ✅
- Adjacency list is the default representation.
defaultdict(list). - BFS for shortest path in unweighted graphs and grids.
- DFS for reachability, connected components, cycle detection.
- Mark visited when adding to the queue/stack, not when processing.
- Multi-source BFS = push all sources at distance 0, then BFS.
- Topological sort: Kahn (in-degree + queue) or DFS post-order. DAG only.
- Implicit graphs (grids, word ladder) use the same algorithms — just generate neighbors on the fly.
Next: 12-advanced-graphs.md.
External resources
- NeetCode Graph playlist: https://www.youtube.com/playlist?list=PLot-Xpze53lcfZh2WgUMOl_uvGcZxfjyA
- VisuAlgo BFS/DFS: https://visualgo.net/en/dfsbfs
- VisuAlgo topological sort: https://visualgo.net/en/dfsbfs (toggle topology mode)
- William Fiset graph theory series: https://www.youtube.com/playlist?list=PLDV1Zeh2NRsDGO4--qE8yH72HFL1Km93P