system-design/transactions.md

16. Transactions: Local, Distributed, 2PC, Saga, TCC

A transaction is a unit of work that should appear atomic. Local transactions are well-understood. Across services or shards, it gets dramatically harder.

~6 min read·updated 5/29/2026

16. Transactions: Local, Distributed, 2PC, Saga, TCC

A transaction is a unit of work that should appear atomic. Local transactions are well-understood. Across services or shards, it gets dramatically harder.

16.1 ACID recap

(See chapter 6.) Local DB transactions provide:

  • Atomicity (all-or-nothing)
  • Consistency (constraints preserved)
  • Isolation (concurrency safe at chosen level)
  • Durability (committed = persisted)

When you have one DB, you get this nearly for free. Use it.

16.2 The distributed transaction problem

Two operations on different systems must succeed or fail together:

  • Move money between two account databases.
  • Reserve a flight + a hotel + a car.
  • Update an order in one DB and a fulfillment record in another.

You can't just BEGIN; ... COMMIT; — different systems, different transaction managers.

Naive "dual write": write to A, then write to B. If B fails, you have inconsistency. Retrying B might work; might cascade other failures. The two-generals problem (chapter 13) says no perfect solution exists.

16.3 Two-Phase Commit (2PC)

Classical algorithm for distributed atomic commit. Coordinator + N participants.

Phase 1: Prepare

  • Coordinator sends prepare to all participants.
  • Each participant: do all the work, write a "ready to commit" log entry, but don't actually commit. Reply yes (committed promise) or no (abort).

Phase 2: Commit / Abort

  • If all yeses → coordinator sends commit. Each participant commits. Replies done.
  • If any no → coordinator sends abort. Each participant aborts.

Why 2PC is brittle

  • Coordinator crash mid-phase-2: participants are stuck holding locks indefinitely waiting for resolution. Manual recovery needed.
  • Synchronous and slow: every participant must respond, twice; the slowest sets latency.
  • Locks held across the wire: contention amplifies under load.
  • Probability of failure increases with N: P(all N respond OK) → 0 as N → ∞.

In practice 2PC is mostly used inside a single product (XA transactions across multiple DBs in one app). It is rarely used across services.

Three-Phase Commit (3PC)

Adds a "preCommit" phase to bound the locking window. Theoretically improves safety. In practice not widely deployed because of weaker assumptions about network behavior. Not a real solution.

16.4 Sagas

Garcia-Molina & Salem (1987). The dominant pattern for distributed transactions in microservices.

A saga is a sequence of local transactions, each in one service. If a step fails, run compensating transactions to undo prior steps.

Two coordination styles

Choreography: each service listens to events and decides what to do next.

  • Service A → emits "OrderCreated" event
  • Service B (Inventory) → reserves stock, emits "StockReserved"
  • Service C (Payment) → charges card, emits "PaymentSucceeded"
  • Service A → marks order complete

If a step fails (e.g., payment), it emits "PaymentFailed" and other services react with compensations ("ReleaseStock", "CancelOrder").

Pros: decoupled; no central coordinator. Cons: hard to reason about end-to-end flow; cyclic dependencies emerge; debugging is painful.

Orchestration: a central coordinator (the saga orchestrator) tells each service what to do.

  • Orchestrator → "Reserve stock" → Inventory service
  • → "Charge card" → Payment service
  • If any step fails → Orchestrator runs compensations in reverse order.

Pros: single source of truth for the workflow; easier to debug. Cons: orchestrator becomes a critical service.

Tools: Temporal, AWS Step Functions, Camunda, Netflix Conductor.

Compensations are not rollbacks

Once a step has committed locally, you can't rollback. You compensate (e.g., refund the charge instead of "uncharging"). Some operations have no clean compensation (sent an email, fired a missile).

Saga isolation

Sagas are not isolated. While in-flight, other transactions can see partial state. Mitigations:

  • Semantic locks: mark a record "in saga X" so others know.
  • Commutative operations: order doesn't matter.
  • Reorder: do reversible steps first, irreversible last.

16.5 TCC (Try-Confirm-Cancel)

Variant of saga where each service exposes:

  • try: reserve resources, don't finalize.
  • confirm: actually commit (must succeed if try succeeded).
  • cancel: release reservation.

Coordinator calls try on all; if all succeed, confirm all; else cancel all.

Stronger than basic saga (more like 2PC) at the cost of designing every service to support try/confirm/cancel. Used in payments, travel booking.

16.6 The Outbox Pattern

For "DB write + publish event" atomicity. Critical for any service that emits events.

Naive: write to DB; then publish to Kafka. If Kafka fails, you have inconsistency.

Outbox:

  • Write the change AND an event row to an outbox table in the same local transaction.
  • A separate process polls/streams the outbox table → publishes to Kafka → marks event as sent.

CDC tool (Debezium) can stream the outbox table directly via WAL. Now you have at-least-once event delivery with strict consistency between data and events.

Companion: idempotent consumer dedups by event ID.

16.7 The Inbox Pattern

Mirror of outbox on the consumer side. Store incoming event IDs in an inbox table; refuse processing if seen. Atomic dedup.

Combined: outbox (producer) + inbox (consumer) gives effectively-once processing without exotic infrastructure.

16.8 Optimistic vs pessimistic concurrency

Pessimistic

Lock the resource before working. SELECT ... FOR UPDATE. Other transactions wait.

  • Strong: contention serializes.
  • Weak: deadlocks; throughput limit.

Optimistic

Don't lock. Read with a version number. On write: UPDATE ... WHERE version = X. If version changed, retry.

  • Strong: no locks; high concurrency.
  • Weak: contention causes retry storms (livelock).

Use optimistic for low-conflict workloads (web apps, most CRUD). Pessimistic for high-conflict (inventory decrement, balance updates).

16.9 Idempotency: the practical backbone

For any external API, payments, or queue consumer, make it idempotent.

  • Client sends an Idempotency-Key per logical operation.
  • Server stores (key, response) for ~24h.
  • Repeat requests return the cached response without re-doing the work.
  • TTL after which keys can be reused for new ops.

Stripe's Idempotency-Key, AWS SQS's MessageDeduplicationId — same idea.

16.10 Outbox + saga = the modern microservices stack

Most production "distributed transactions" today are:

  1. Saga (choreography or orchestration) for the workflow.
  2. Outbox at every step, so each local DB write atomically produces events.
  3. Idempotency at every consumer.
  4. Compensations for partial-failure cleanup.
  5. Monitoring + alerting on stuck sagas.

This is how Uber, Netflix, Airbnb run.

16.11 Spanner / distributed SQL: real distributed transactions

Spanner does honest distributed transactions across shards using:

  • Paxos per shard for replication.
  • 2PC across shards for cross-shard transactions, with TrueTime providing global ordering.

Result: actual ACID transactions across petabytes globally. The cost: every cross-shard txn pays cross-region 2PC latency (10s-100s of ms).

CockroachDB, YugabyteDB, TiDB approximate this.

For most apps, this is overkill. But when you need it (financial systems, critical inventory), distributed SQL is now an option.

16.12 What to use when

SituationPattern
Single DBLocal transaction
Multiple DBs in one serviceXA / 2PC (rarely; one DB is better)
Across microservicesSaga + outbox + idempotency
Strong cross-service guaranteeDistributed SQL (Spanner, CockroachDB)
Money + reservationTCC
Event-driven architectureOutbox always

16.13 Common bugs

  • Dual writes without outbox → inconsistency on partial failure.
  • Non-idempotent retries → duplicate side effects (double charge!).
  • Compensation that doesn't compensate (the email already sent).
  • Saga that never completes stuck in middle states; need timeout + alert.
  • Read-modify-write race without optimistic locking → lost updates.

Key takeaways

  • Local transactions are great. Use them.
  • 2PC is brittle; rarely worth the cost.
  • Saga (choreography or orchestration) + outbox + idempotency is the production pattern.
  • TCC for stronger reservation semantics.
  • Distributed SQL gives you real cross-shard ACID at a latency cost.
  • Make every external/queue operation idempotent. No exceptions.

// 1 view

main
UTF-8·typescript