Tries

Prefix tree for autocomplete, dictionary, word-search-on-grid problems.

tries.mdreadme

08 — Tries (Prefix Trees)

Neetcode Week 7b. Tries are specialized trees for prefix-based string operations. Once you've seen one trie problem, you'll spot the pattern instantly: "given many strings, support prefix queries efficiently."

Table of Contents

  1. What a trie is
  2. Why not just a hashmap?
  3. Trie node and operations
  4. Walked-through: Implement Trie
  5. Wildcard search (Add and Search Word)
  6. Combining trie with DFS — Word Search II
  7. Bitwise tries (Maximum XOR)
  8. Pitfalls
  9. Practice list

1. What a trie is

A trie stores a set of strings such that common prefixes share nodes.

Inserting "cat", "car", "cargo", "dog":

                  (root)
                 /      \
                c        d
                |        |
                a        o
               / \       |
              t*  r       g*
                  |\
                 _ g
                * |
                  o*
                  
* = end-of-word marker
OperationTrieHashmap of strings
InsertO(L)O(L) — for hash + comparison
Lookup exactO(L)O(L) avg
Lookup prefixO(L)O(N · L) — must scan all strings
Iterate all words with prefix PO(P + sum of lengths after P)impossible without scan

(L = length of word, N = number of words, P = prefix length)

The key advantage: prefix operations are cheap, which hashmap can't match.


2. Why not just a hashmap?

For exact-match queries (does "cat" exist?), hashmap is fine. But for:

  • "Does any stored word start with ca?"
  • "Find all words starting with ca."
  • "Find longest word in the dictionary that is a prefix of s."

A hashmap forces O(N · L) work because you'd have to scan every key. A trie does it in O(L + output size).

Real-world uses: autocomplete, spellcheck dictionaries, IP routing tables, search-engine indexing, contact lookup.


3. Trie node and operations

class TrieNode: def __init__(self): self.children = {} # char → TrieNode self.is_end = False # marks end of a stored word

For lowercase-only inputs, you can use [None] * 26 instead of a dict — slightly faster.

Insert

def insert(root, word): cur = root for c in word: if c not in cur.children: cur.children[c] = TrieNode() cur = cur.children[c] cur.is_end = True

Search (exact)

def search(root, word): cur = root for c in word: if c not in cur.children: return False cur = cur.children[c] return cur.is_end # crucial: must be marked as end

Starts-with (prefix)

def starts_with(root, prefix): cur = root for c in prefix: if c not in cur.children: return False cur = cur.children[c] return True

The only difference between search and starts_with is the final is_end check.


4. Walked-through: Implement Trie (Medium)

The full LeetCode #208 boils down to packaging the above:

class Trie: def __init__(self): self.root = TrieNode() def insert(self, word): cur = self.root for c in word: if c not in cur.children: cur.children[c] = TrieNode() cur = cur.children[c] cur.is_end = True def search(self, word): cur = self.root for c in word: if c not in cur.children: return False cur = cur.children[c] return cur.is_end def startsWith(self, prefix): cur = self.root for c in prefix: if c not in cur.children: return False cur = cur.children[c] return True

Memorize this skeleton. You'll re-use it as the foundation for harder trie problems.


5. Wildcard search (Add and Search Word — Medium)

Add words. Search supports . as a wildcard matching any single char.

The challenge: . means we don't know which child to descend — we might need to try all children. That's a DFS.

class WordDictionary: def __init__(self): self.root = TrieNode() def addWord(self, word): cur = self.root for c in word: if c not in cur.children: cur.children[c] = TrieNode() cur = cur.children[c] cur.is_end = True def search(self, word): def dfs(i, node): if i == len(word): return node.is_end c = word[i] if c == '.': for child in node.children.values(): if dfs(i + 1, child): return True return False else: if c not in node.children: return False return dfs(i + 1, node.children[c]) return dfs(0, self.root)

Worst-case complexity if every character is . and the trie has 26 branches at every level: O(26^L). In practice trie is sparse so it's fast.


6. Combining trie with DFS — Word Search II (Hard)

Given a board of letters and a list of words, find all words that exist on the board (form by adjacent letters).

Naïve: for each word, run a DFS from each cell. O(W · N · M · 4^L). Too slow.

Trie idea: insert all words into a trie. DFS from each cell along the trie — only recurse if the current letter is a child in the current trie node.

class TrieNode: def __init__(self): self.children = {} self.word = None # store full word at end-node for easy collection def find_words(board, words): # Build trie root = TrieNode() for w in words: cur = root for c in w: if c not in cur.children: cur.children[c] = TrieNode() cur = cur.children[c] cur.word = w rows, cols = len(board), len(board[0]) res = [] def dfs(r, c, node): ch = board[r][c] next_node = node.children.get(ch) if not next_node: return if next_node.word: res.append(next_node.word) next_node.word = None # avoid duplicates board[r][c] = '#' # mark visited for dr, dc in [(-1,0),(1,0),(0,-1),(0,1)]: nr, nc = r+dr, c+dc if 0 <= nr < rows and 0 <= nc < cols and board[nr][nc] != '#': dfs(nr, nc, next_node) board[r][c] = ch # restore # Pruning: if next_node has no more children, remove it if not next_node.children: node.children.pop(ch) for r in range(rows): for c in range(cols): dfs(r, c, root) return res

Two important optimizations:

  1. Set next_node.word = None after collecting — handles duplicate words and lets you avoid maintaining a separate "found" set.
  2. Trie pruning: when a leaf has no children left, remove it from its parent so future DFS doesn't explore dead ends.

This is one of the hardest "must-know" problems for Google.


7. Bitwise tries (Maximum XOR of Two Numbers in an Array)

A trie can store bit sequences of integers. For each integer x, insert its 32-bit binary representation along the trie.

To find max(x ^ y) for any pair, traverse the trie from MSB and at each bit greedily go to the opposite bit's child (to maximize XOR).

def find_max_xor(nums): HIGHEST = 31 root = {} def insert(n): cur = root for i in range(HIGHEST, -1, -1): b = (n >> i) & 1 if b not in cur: cur[b] = {} cur = cur[b] def query(n): cur = root ans = 0 for i in range(HIGHEST, -1, -1): b = (n >> i) & 1 opposite = 1 - b if opposite in cur: ans |= 1 << i cur = cur[opposite] else: cur = cur[b] return ans for n in nums: insert(n) return max(query(n) for n in nums)

Niche pattern but elegant. You won't be expected to know this on day 1 — file it under "year-2 polish."


8. Pitfalls

Forgetting is_end

A trie stores prefixes implicitly — every node represents some prefix. Without is_end, search("cat") returns true even if you only inserted "cathedral." Always mark and check.

Using a list of children when the alphabet is large

[None] * 26 is fine for a-z. For Unicode or a large alphabet, use a dict to save memory.

Not handling duplicate insertions

insert("cat"); insert("cat") — second insert is a no-op (good). But insert("car"); insert("car") for some problems should still register multiplicity (e.g., delete-after-find). Add a count field if needed.

Memory blow-up with naive trie of size

For 10⁵ words of length 10, a trie has up to 10⁶ nodes. Each Python dict node is ~250 bytes → 250 MB. Use slots, list-children, or compress into a radix tree if memory matters.

Trie pruning cost

Pruning during DFS in Word Search II must be done carefully — only prune when the node has no children AND word is None. Otherwise you could prune a stem of a still-needed word.

Forgetting to restore mutated state

In Word Search II, we set board[r][c] = '#' to avoid re-using a cell. Always restore on backtrack.


9. Practice list

#ProblemPatternDifficulty
208Implement TrieFoundationMedium
211Add and Search WordWildcard DFSMedium
212Word Search IITrie + DFSHard
14Longest Common PrefixTrie OR vertical scanEasy
648Replace WordsTrie longest prefixMedium
1268Search Suggestions SystemTrie + DFS sortedMedium
421Maximum XOR of Two NumbersBit trieMedium
642Design Search AutocompleteTrie + heapHard

TL;DR ✅

  • A trie is a tree where edges = characters, nodes share common prefixes.
  • Use a trie when you need prefix queries, autocomplete, or "any word starting with…" operations.
  • is_end is non-negotiable — distinguishes "stored word" from "intermediate prefix."
  • Wildcard search uses DFS over multiple children at the wildcard.
  • Word Search II = trie + DFS with pruning. Hard but a Google favorite.
  • Bitwise tries solve max-XOR-pair problems in O(N · 32).

Next: 09-heaps.md.

External resources

main
UTF-8·typescript