◐ system-design/transactions.md
16. Transactions: Local, Distributed, 2PC, Saga, TCC
A transaction is a unit of work that should appear atomic. Local transactions are well-understood. Across services or shards, it gets dramatically harder.
~6 min read·updated 5/29/2026
16. Transactions: Local, Distributed, 2PC, Saga, TCC
A transaction is a unit of work that should appear atomic. Local transactions are well-understood. Across services or shards, it gets dramatically harder.
16.1 ACID recap
(See chapter 6.) Local DB transactions provide:
- Atomicity (all-or-nothing)
- Consistency (constraints preserved)
- Isolation (concurrency safe at chosen level)
- Durability (committed = persisted)
When you have one DB, you get this nearly for free. Use it.
16.2 The distributed transaction problem
Two operations on different systems must succeed or fail together:
- Move money between two account databases.
- Reserve a flight + a hotel + a car.
- Update an order in one DB and a fulfillment record in another.
You can't just BEGIN; ... COMMIT; — different systems, different transaction managers.
Naive "dual write": write to A, then write to B. If B fails, you have inconsistency. Retrying B might work; might cascade other failures. The two-generals problem (chapter 13) says no perfect solution exists.
16.3 Two-Phase Commit (2PC)
Classical algorithm for distributed atomic commit. Coordinator + N participants.
Phase 1: Prepare
- Coordinator sends
prepareto all participants. - Each participant: do all the work, write a "ready to commit" log entry, but don't actually commit. Reply yes (committed promise) or no (abort).
Phase 2: Commit / Abort
- If all yeses → coordinator sends
commit. Each participant commits. Replies done. - If any no → coordinator sends
abort. Each participant aborts.
Why 2PC is brittle
- Coordinator crash mid-phase-2: participants are stuck holding locks indefinitely waiting for resolution. Manual recovery needed.
- Synchronous and slow: every participant must respond, twice; the slowest sets latency.
- Locks held across the wire: contention amplifies under load.
- Probability of failure increases with N: P(all N respond OK) → 0 as N → ∞.
In practice 2PC is mostly used inside a single product (XA transactions across multiple DBs in one app). It is rarely used across services.
Three-Phase Commit (3PC)
Adds a "preCommit" phase to bound the locking window. Theoretically improves safety. In practice not widely deployed because of weaker assumptions about network behavior. Not a real solution.
16.4 Sagas
Garcia-Molina & Salem (1987). The dominant pattern for distributed transactions in microservices.
A saga is a sequence of local transactions, each in one service. If a step fails, run compensating transactions to undo prior steps.
Two coordination styles
Choreography: each service listens to events and decides what to do next.
- Service A → emits "OrderCreated" event
- Service B (Inventory) → reserves stock, emits "StockReserved"
- Service C (Payment) → charges card, emits "PaymentSucceeded"
- Service A → marks order complete
If a step fails (e.g., payment), it emits "PaymentFailed" and other services react with compensations ("ReleaseStock", "CancelOrder").
Pros: decoupled; no central coordinator. Cons: hard to reason about end-to-end flow; cyclic dependencies emerge; debugging is painful.
Orchestration: a central coordinator (the saga orchestrator) tells each service what to do.
- Orchestrator → "Reserve stock" → Inventory service
- → "Charge card" → Payment service
- If any step fails → Orchestrator runs compensations in reverse order.
Pros: single source of truth for the workflow; easier to debug. Cons: orchestrator becomes a critical service.
Tools: Temporal, AWS Step Functions, Camunda, Netflix Conductor.
Compensations are not rollbacks
Once a step has committed locally, you can't rollback. You compensate (e.g., refund the charge instead of "uncharging"). Some operations have no clean compensation (sent an email, fired a missile).
Saga isolation
Sagas are not isolated. While in-flight, other transactions can see partial state. Mitigations:
- Semantic locks: mark a record "in saga X" so others know.
- Commutative operations: order doesn't matter.
- Reorder: do reversible steps first, irreversible last.
16.5 TCC (Try-Confirm-Cancel)
Variant of saga where each service exposes:
try: reserve resources, don't finalize.confirm: actually commit (must succeed if try succeeded).cancel: release reservation.
Coordinator calls try on all; if all succeed, confirm all; else cancel all.
Stronger than basic saga (more like 2PC) at the cost of designing every service to support try/confirm/cancel. Used in payments, travel booking.
16.6 The Outbox Pattern
For "DB write + publish event" atomicity. Critical for any service that emits events.
Naive: write to DB; then publish to Kafka. If Kafka fails, you have inconsistency.
Outbox:
- Write the change AND an event row to an
outboxtable in the same local transaction. - A separate process polls/streams the outbox table → publishes to Kafka → marks event as sent.
CDC tool (Debezium) can stream the outbox table directly via WAL. Now you have at-least-once event delivery with strict consistency between data and events.
Companion: idempotent consumer dedups by event ID.
16.7 The Inbox Pattern
Mirror of outbox on the consumer side. Store incoming event IDs in an inbox table; refuse processing if seen. Atomic dedup.
Combined: outbox (producer) + inbox (consumer) gives effectively-once processing without exotic infrastructure.
16.8 Optimistic vs pessimistic concurrency
Pessimistic
Lock the resource before working. SELECT ... FOR UPDATE. Other transactions wait.
- Strong: contention serializes.
- Weak: deadlocks; throughput limit.
Optimistic
Don't lock. Read with a version number. On write: UPDATE ... WHERE version = X. If version changed, retry.
- Strong: no locks; high concurrency.
- Weak: contention causes retry storms (livelock).
Use optimistic for low-conflict workloads (web apps, most CRUD). Pessimistic for high-conflict (inventory decrement, balance updates).
16.9 Idempotency: the practical backbone
For any external API, payments, or queue consumer, make it idempotent.
- Client sends an
Idempotency-Keyper logical operation. - Server stores
(key, response)for ~24h. - Repeat requests return the cached response without re-doing the work.
- TTL after which keys can be reused for new ops.
Stripe's Idempotency-Key, AWS SQS's MessageDeduplicationId — same idea.
16.10 Outbox + saga = the modern microservices stack
Most production "distributed transactions" today are:
- Saga (choreography or orchestration) for the workflow.
- Outbox at every step, so each local DB write atomically produces events.
- Idempotency at every consumer.
- Compensations for partial-failure cleanup.
- Monitoring + alerting on stuck sagas.
This is how Uber, Netflix, Airbnb run.
16.11 Spanner / distributed SQL: real distributed transactions
Spanner does honest distributed transactions across shards using:
- Paxos per shard for replication.
- 2PC across shards for cross-shard transactions, with TrueTime providing global ordering.
Result: actual ACID transactions across petabytes globally. The cost: every cross-shard txn pays cross-region 2PC latency (10s-100s of ms).
CockroachDB, YugabyteDB, TiDB approximate this.
For most apps, this is overkill. But when you need it (financial systems, critical inventory), distributed SQL is now an option.
16.12 What to use when
| Situation | Pattern |
|---|---|
| Single DB | Local transaction |
| Multiple DBs in one service | XA / 2PC (rarely; one DB is better) |
| Across microservices | Saga + outbox + idempotency |
| Strong cross-service guarantee | Distributed SQL (Spanner, CockroachDB) |
| Money + reservation | TCC |
| Event-driven architecture | Outbox always |
16.13 Common bugs
- Dual writes without outbox → inconsistency on partial failure.
- Non-idempotent retries → duplicate side effects (double charge!).
- Compensation that doesn't compensate (the email already sent).
- Saga that never completes stuck in middle states; need timeout + alert.
- Read-modify-write race without optimistic locking → lost updates.
Key takeaways
- Local transactions are great. Use them.
- 2PC is brittle; rarely worth the cost.
- Saga (choreography or orchestration) + outbox + idempotency is the production pattern.
- TCC for stronger reservation semantics.
- Distributed SQL gives you real cross-shard ACID at a latency cost.
- Make every external/queue operation idempotent. No exceptions.
// 1 view