◇ dsa-prep/binary-search.md
05 — Binary Search
1. The core idea
~7 min read·updated 5/29/2026
05 — Binary Search
Neetcode Week 4. Binary search is conceptually simple — halve the search space — but getting boundary conditions right is the hardest skill in DSA. Master one template and stick with it.
Table of Contents
- The core idea
- The canonical template
- Two templates: lower_bound vs classic
- Python
bisectand C++lower_bound - Search on rotated sorted array
- Binary search on the answer (the secret superpower)
- Walked-through problems
- Pitfalls
- Practice list
1. The core idea
Given a sorted array (or any monotonic predicate over a range), find the answer in O(log n) by repeatedly halving the search space.
search for 7 in [1, 3, 4, 6, 8, 9, 12, 15]
↑ ↑
lo hi
mid = (0+7)/2 = 3 → arr[3]=6 < 7 → lo = 4
[1, 3, 4, 6, 8, 9, 12, 15]
↑ ↑
lo hi
mid = (4+7)/2 = 5 → arr[5]=9 > 7 → hi = 4
[1, 3, 4, 6, 8, 9, 12, 15]
↑ ↑
lo hi (=lo)
mid = 4 → arr[4]=8 > 7 → hi = 3
now lo=4 > hi=3 → not found, return -1
Each step halves the range. After log₂(n) iterations, the range collapses.
Required precondition
The data (or predicate) must be monotonic: as you move along the search space, you can definitively say "go left" or "go right" given any element. A sorted array is the canonical case, but the predicate can be more abstract — see §6.
2. The canonical template
There are several variants. Pick one and use it everywhere. Here's mine:
def binary_search(arr, target): lo, hi = 0, len(arr) - 1 # inclusive bounds while lo <= hi: # loop while range non-empty mid = lo + (hi - lo) // 2 # avoid overflow in C++ (Python is fine) if arr[mid] == target: return mid elif arr[mid] < target: lo = mid + 1 # +1 so we make progress else: hi = mid - 1 # -1 so we make progress return -1 # not found
Why these three rules matter
lo <= hiwithmid + 1andmid - 1updates → no infinite loop.- Always make progress by skipping
mid(since we just tested it). - Inclusive bounds mean the answer is always in
[lo, hi]while the loop runs.
What if I want the index of leftmost target?
Use the lower_bound template (next section).
3. Two templates
Template 1: classic find
Find any occurrence of target. As above.
Template 2: lower_bound (find leftmost position)
"Find the first index where arr[i] >= target." Useful when target may not be present, or for "insert position."
def lower_bound(arr, target): lo, hi = 0, len(arr) # NOTE: hi = len, exclusive while lo < hi: # NOTE: < not <= mid = lo + (hi - lo) // 2 if arr[mid] < target: lo = mid + 1 else: hi = mid # NOTE: hi = mid, not mid-1 return lo # lo == hi at end
Differences from Template 1:
hi = len(arr)(exclusive)while lo < hi(not<=)hi = midon the "go left" branch (notmid - 1)
Why? We're looking for the boundary, not for an exact match. The exclusive hi and < loop turn this into a range collapse: at termination, lo == hi and points to the answer.
When to use which
| Goal | Template |
|---|---|
| Is target present? Get its index. | Template 1 |
| Position to insert target keeping sorted | Template 2 (lower_bound) |
| First index satisfying P (true on right, false on left) | Template 2 |
| Last index where arr[i] = target | Template 2 with arr[mid] <= target |
Pro tip: master Template 2 (lower_bound). It generalizes to "find the first true in a monotonic predicate" — which is exactly the §6 pattern.
Generic predicate version
def first_true(lo, hi, P): """Smallest x in [lo, hi] with P(x) true. Assumes P is monotonic: F,F,F,T,T,T.""" while lo < hi: mid = lo + (hi - lo) // 2 if P(mid): hi = mid else: lo = mid + 1 return lo
This is your Swiss-Army-knife.
4. Python bisect and C++ lower_bound
Python
import bisect arr = [1, 3, 4, 4, 6, 8] bisect.bisect_left(arr, 4) # 2 — first index with arr[i] >= 4 bisect.bisect_right(arr, 4) # 4 — first index with arr[i] > 4 bisect.insort(arr, 5) # arr is now [1, 3, 4, 4, 5, 6, 8]
| Function | Meaning |
|---|---|
bisect_left(arr, x) | leftmost insertion point — same as our lower_bound |
bisect_right(arr, x) (also bisect) | rightmost insertion point — same as upper_bound |
bisect_right(arr, x) - bisect_left(arr, x) = count of x in arr (cute O(log n) trick).
C++
auto it = lower_bound(v.begin(), v.end(), x); // first >= x auto it = upper_bound(v.begin(), v.end(), x); // first > x int idx = it - v.begin(); bool found = binary_search(v.begin(), v.end(), x);
Use the standard library when possible. Hand-rolled binary search is bug-prone; use it only when you need a custom predicate.
5. Search on rotated sorted array
A rotated sorted array like [4,5,6,7,0,1,2] is piecewise sorted. We can still binary-search in O(log n), with a trick.
Find Minimum in Rotated Sorted Array
def find_min(nums): lo, hi = 0, len(nums) - 1 while lo < hi: mid = lo + (hi - lo) // 2 if nums[mid] > nums[hi]: # min is in right half lo = mid + 1 else: # min is at mid or in left half hi = mid return nums[lo]
Key insight: comparing nums[mid] to nums[hi] (right end) tells us which half is sorted, and the minimum is in the unsorted one (or at mid).
Search in Rotated Sorted Array
def search(nums, target): lo, hi = 0, len(nums) - 1 while lo <= hi: mid = lo + (hi - lo) // 2 if nums[mid] == target: return mid if nums[lo] <= nums[mid]: # left half sorted if nums[lo] <= target < nums[mid]: hi = mid - 1 # target in sorted left else: lo = mid + 1 else: # right half sorted if nums[mid] < target <= nums[hi]: lo = mid + 1 # target in sorted right else: hi = mid - 1 return -1
The pattern: at each step, at least one half is sorted. Check whether target lies in that sorted half. If yes, binary-search there; else, the other half.
6. Binary search on the answer
The most powerful BS variant. It transforms an optimization problem into a feasibility check.
Pattern
"Find the minimum X such that we can [achieve goal] with X" →
define feasible(x): bool
return whether x is sufficient
if feasible is monotonic (true for all x ≥ threshold),
binary-search for the threshold.
Koko Eating Bananas (Medium)
Piles of bananas. Koko eats
kper hour. Withhhours total, what's the minimumkto finish all piles?
Observation: if Koko can finish at speed k, she can also finish at any speed > k. So feasible(k) is monotonic (false for small k, true for large k).
Search range for k: 1 to max(piles) (no point eating faster than the biggest pile).
import math def min_eating_speed(piles, h): def hours_needed(k): return sum(math.ceil(p / k) for p in piles) lo, hi = 1, max(piles) while lo < hi: mid = lo + (hi - lo) // 2 if hours_needed(mid) <= h: # mid is feasible hi = mid else: lo = mid + 1 return lo
Complexity: O(n log M) where M = max(piles). The feasibility check is O(n); we run it O(log M) times.
Capacity to Ship Packages in D Days
Same flavor: minimum capacity such that we can ship within D days.
def ship_within_days(weights, D): def can_ship(cap): days, cur = 1, 0 for w in weights: if cur + w > cap: days += 1 cur = 0 cur += w return days <= D lo, hi = max(weights), sum(weights) # cap >= heaviest single package while lo < hi: mid = lo + (hi - lo) // 2 if can_ship(mid): hi = mid else: lo = mid + 1 return lo
Median of Two Sorted Arrays (Hard)
The famous one. I won't fully derive it here (it's tricky), but the idea is binary search on the partition point of one array, deriving the partition of the other. See: https://www.youtube.com/watch?v=q6IEA26hvXc
The takeaway: anything that can be phrased as "is X enough?" with monotonic answer is a candidate for BS-on-answer.
7. Walked-through problems
Search a 2D Matrix (Medium)
Sorted left-to-right, each row's first > previous row's last. Treat as flattened sorted array.
def search_matrix(matrix, target): rows, cols = len(matrix), len(matrix[0]) lo, hi = 0, rows * cols - 1 while lo <= hi: mid = lo + (hi - lo) // 2 r, c = divmod(mid, cols) # convert flat index to 2D if matrix[r][c] == target: return True elif matrix[r][c] < target: lo = mid + 1 else: hi = mid - 1 return False
Complexity: O(log(m·n)) = O(log m + log n).
Time Based Key-Value Store (Medium)
set(key, val, timestamp),get(key, timestamp)returns latest value for key withts ≤ timestamp.
class TimeMap: def __init__(self): self.store = {} # key → list of (ts, val) def set(self, key, value, timestamp): if key not in self.store: self.store[key] = [] self.store[key].append((timestamp, value)) # appended in order def get(self, key, timestamp): if key not in self.store: return "" arr = self.store[key] # Find rightmost ts <= timestamp lo, hi = 0, len(arr) - 1 res = "" while lo <= hi: mid = (lo + hi) // 2 if arr[mid][0] <= timestamp: res = arr[mid][1] lo = mid + 1 # try for larger ts else: hi = mid - 1 return res
Or use bisect.bisect_right on a parallel list of timestamps.
8. Pitfalls
Off-by-one in mid
In some templates mid = (lo + hi) // 2, in others lo + (hi - lo + 1) // 2 (round up to avoid infinite loop when shrinking with lo = mid instead of mid + 1). Stick with one template.
Infinite loop when lo = mid
If the loop sets lo = mid (not mid + 1) and mid keeps computing to the same value, you'll loop forever. Pair lo = mid with mid = lo + (hi - lo + 1) // 2 (round up).
Forgetting <= vs <
- Template 1 uses
lo <= hiwithmid ± 1updates. - Template 2 uses
lo < hiwithhi = mid(no -1).
Mixing them causes off-by-one bugs.
Wrong upper bound in BS-on-answer
The upper bound must be definitely feasible (so the answer exists in [lo, hi]). Common mistakes:
- For Koko: upper bound =
max(piles), notsum(piles)(any larger speed is wasteful). - For ship-packages: upper bound =
sum(weights)(one big day works), and lower bound =max(weights)(must fit heaviest package).
Not checking monotonicity of predicate
If feasible(k) is not monotonic, BS-on-answer is wrong. Always verify: "if k works, does k+1 also work?"
Integer overflow (C++)
(lo + hi) / 2 overflows when lo + hi > INT_MAX. Use lo + (hi - lo) / 2.
In Python, ints are arbitrary precision, so it doesn't matter — but using the safe form is good habit.
Confusing index vs value
Always be clear: are you binary-searching over array indices (Templates 1/2) or answer values (BS-on-answer)? They use the same algorithm but the meaning of lo, hi, mid differs.
9. Practice list
| # | Problem | Variant | Difficulty |
|---|---|---|---|
| 704 | Binary Search | Classic | Easy |
| 74 | Search a 2D Matrix | Index → 2D | Medium |
| 875 | Koko Eating Bananas | BS on answer | Medium |
| 153 | Find Minimum in Rotated Sorted Array | Rotated | Medium |
| 33 | Search in Rotated Sorted Array | Rotated | Medium |
| 981 | Time Based Key-Value Store | BS on history | Medium |
| 4 | Median of Two Sorted Arrays | BS on partition | Hard |
| 1011 | Capacity to Ship Within D Days | BS on answer | Medium |
| 410 | Split Array Largest Sum | BS on answer | Hard |
| 162 | Find Peak Element | BS on slope | Medium |
TL;DR ✅
- Master one template (Template 2 / lower_bound) and use it for everything.
while lo < hi: ... if P(mid): hi = mid else: lo = mid+1; return lois the universal predicate version.- Binary search on the answer: when the question is "min X to do Y," and
Yis monotonic inX, BS the answer with a feasibility check. - Python
bisect, C++lower_bound/upper_bound— use them for plain lookups. - Rotated arrays: at least one half is sorted; check which.
- Always avoid overflow with
mid = lo + (hi - lo) // 2.
Next: 06-linked-list.md.
External resources
- NeetCode Binary Search playlist: https://www.youtube.com/playlist?list=PLot-Xpze53lcFvQEGUJVfZpr5_zCM6yPo
- Binary search template (LeetCode discuss, the canonical reference): https://leetcode.com/discuss/study-guide/786126/Python-Powerful-Ultimate-Binary-Search-Template.-Solved-many-problems
- Errichto on BS: https://www.youtube.com/watch?v=GU7DpgHINWQ
// 0 views