20. Microservices, Monoliths, Service Mesh

Microservices vs monolith is a religious war that should be a pragmatic choice. Both have their place. The right answer depends on team size, change rate, and operational maturity.

20.1 Definitions

Monolith

One codebase, one deployable, one process per replica. All features share a database (usually).

Modular monolith

One deployable, but internally organized into well-bounded modules with explicit interfaces. Sometimes a stepping stone to microservices; often the right end state.

Service-Oriented Architecture (SOA)

Multiple services, often coarse-grained, communicating via contracts (SOAP historically). Older sibling of microservices.

Microservices

Many small services, each owned by a small team, each independently deployable, each with its own data store. They communicate over the network (HTTP, gRPC, or a queue).

20.2 The case for monoliths (or modular monoliths)

Simpler operations: one deploy, one log stream, one DB to back up.
Lower latency: function calls, not network calls.
Easier transactions: one DB → ACID for free.
Easier debugging: one stack trace, no distributed trace.
Faster iteration when the team is small (< ~30 engineers).
Refactoring is local: rename a function across the codebase in one PR.

Famous examples: Stack Overflow ran for years on a small SQL Server + IIS monolith; Shopify, Basecamp, and GitHub started monolithic and stayed mostly so for a long time.

20.3 The case for microservices

Independent scaling: scale auth separately from payments.
Polyglot: each service can pick the right language.
Team independence: separate deploys, separate on-call, separate ownership.
Fault isolation: one bad service shouldn't crash the rest (in theory).
Tech evolution: rewrite or replace one service without touching others.

20.4 The cost of microservices

Distributed system problems: network latency, partial failure, eventual consistency, distributed transactions, debugging hell.
Operational complexity: dozens or hundreds of services, each with its own deploy pipeline, DB, runbook, dashboards.
Testing: integration tests cross many services; consumer-driven contract tests become necessary to catch breaking changes early.
Cross-service refactoring: a domain change might touch 5 services and 3 schemas; coordination required.
Latency: every internal RPC adds 1-10 ms; chains of 5 services = 5-50 ms baseline.
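
The latency arithmetic above can be sketched in a few lines. This is only an illustration: the function name chain_overhead_ms is hypothetical, the 1-10 ms per-hop range is the rough figure from the text, and the calls are assumed to be sequential (parallel fan-out would behave differently).

```python
from typing import Tuple


def chain_overhead_ms(
    hops: int, per_hop_ms: Tuple[float, float] = (1.0, 10.0)
) -> Tuple[float, float]:
    """Return (best, worst) added latency for `hops` sequential internal RPCs.

    `per_hop_ms` is the assumed (low, high) overhead of a single hop.
    """
    if hops < 0:
        raise ValueError("hops must be non-negative")
    low, high = per_hop_ms
    return hops * low, hops * high


if __name__ == "__main__":
    # A request that passes through 5 services one after another.
    print(chain_overhead_ms(5))
```

The point of the sketch is that overhead grows linearly with chain depth, before any service does real work.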

The Microservice Premium (Fowler): microservices carry a baseline operational cost that only pays off once the system is complex enough. As a rule of thumb in this chapter, below ~30 engineers the premium is rarely worth it.

20.5 When to split

Triggers for splitting a monolith:

Team > ~30 engineers and deploys are mutually blocking.
Subsystems have wildly different scale needs (e.g., a small core service + a 100x bigger search backend).
Subsystems have different reliability needs (payment must be 99.99%; rest can be 99.9%).
Subsystems are owned by different orgs.
Different release cadences (mobile API vs internal admin).

20.6 How to split

Bounded contexts (DDD)

Identify domain boundaries: payments, orders, inventory, search. Each becomes a candidate service. Don't split by technical layer (web tier, business tier, data tier); that produces a distributed monolith.

Database per service

A core principle. Each service owns its DB. No other service reads it directly.

Why: you can change the schema without coordinating; you can scale each DB independently; data lives with the service that owns it.

Cost: cross-service queries become RPCs; joins disappear; data must be duplicated and kept eventually consistent.

Anti-patterns when splitting

Distributed monolith: services that all change together. You got the cost without the benefit.
Shared DB across services: tight coupling, no independence. Just merge them back.
Sync RPC chains 5+ deep: latency catastrophe; one slow service = everything slow.
Chatty interfaces: 30 RPCs per user request. Combine.
Premature splitting: the right boundaries emerge from operating the system; carve them out as you learn.

20.7 Communication patterns

Synchronous (REST/gRPC)

Direct request-response. Easy to reason about. Tight coupling.

When: read paths, simple CRUD, where the caller needs an answer to proceed.

Asynchronous (queue / event)

Producer fires; consumer eventually processes.

When: writes that don't need an immediate response, fan-out, decoupling, batching.

Pub/sub

Service emits domain events; others subscribe.

When: multiple downstream interests, decoupling, integration without API contracts proliferating.

Saga

Workflow across services with compensations. (See chapter 16.)
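
As a minimal sketch of the compensation idea (not a substitute for a workflow engine), the function below runs steps in order and, if one fails, undoes the completed ones in reverse. The names run_saga and Step are hypothetical, and the sketch assumes compensations themselves do not fail, which a real system cannot assume.

```python
from typing import Callable, List, Tuple

# (name, action, compensation); all three are illustrative assumptions.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    log: List[str] = []
    done: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            # Undo only what actually completed, newest first.
            for prev_name, _, compensate in reversed(done):
                compensate()
                log.append(f"compensated:{prev_name}")
            return False, log
        done.append(step)
        log.append(f"done:{name}")
    return True, log
```

In practice each action and compensation would be an idempotent call to another service, and the saga state would be persisted so it can resume after a crash.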

Default mix

Read paths sync (RPC). Write paths often async with outbox + events. Critical workflows orchestrated via saga (Temporal, Step Functions).
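
The "outbox + events" part of that mix can be sketched with SQLite standing in for the service's own database. The table names, the order.placed topic, and the publish callback are all assumptions for illustration; a real relay would publish to a broker and run continuously.

```python
import json
import sqlite3
from typing import Callable


def init(conn: sqlite3.Connection) -> None:
    with conn:
        conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, total_cents INTEGER)")
        conn.execute(
            "CREATE TABLE outbox (seq INTEGER PRIMARY KEY AUTOINCREMENT, "
            "topic TEXT, payload TEXT, published INTEGER DEFAULT 0)"
        )


def place_order(conn: sqlite3.Connection, order_id: str, total_cents: int) -> None:
    # One local transaction: the order row and the event row commit together
    # or not at all, so no event is lost and none is emitted for a rolled-back write.
    with conn:
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, total_cents))
        payload = json.dumps({"order_id": order_id, "total_cents": total_cents})
        conn.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("order.placed", payload),
        )


def relay_once(conn: sqlite3.Connection, publish: Callable[[str, dict], None]) -> int:
    """Publish unpublished outbox rows in order; return how many were sent."""
    rows = conn.execute(
        "SELECT seq, topic, payload FROM outbox WHERE published = 0 ORDER BY seq"
    ).fetchall()
    for seq, topic, payload in rows:
        publish(topic, json.loads(payload))
        # Mark after publishing: a crash in between causes a re-send,
        # so delivery is at-least-once and consumers must be idempotent.
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE seq = ?", (seq,))
    return len(rows)
```

The design choice worth noting is that the service never writes to the database and the broker in the same step; the broker write is derived from committed rows.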

20.8 API design between services

(See chapter 21.) Contracts are the boundary; treat them like public APIs.

Versioning: never break consumers without notice.
Backwards compatibility: schema evolution rules (chapter 8).
Idempotency keys: assume retries.
Timeouts everywhere: never block forever.
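
To make "idempotency keys: assume retries" concrete, here is a minimal sketch of a handler wrapper that returns the stored result when the same key arrives again. The class name IdempotentHandler is hypothetical; the store is an in-memory dict and the code is not thread-safe, whereas a real service would persist keys (with an expiry) in its database.

```python
from typing import Callable, Dict


class IdempotentHandler:
    """Wrap a handler so a retried request with the same key runs only once."""

    def __init__(self, handler: Callable[[dict], dict]) -> None:
        self._handler = handler
        # Assumption: in-memory only; lost on restart.
        self._results: Dict[str, dict] = {}

    def handle(self, idempotency_key: str, request: dict) -> dict:
        if idempotency_key in self._results:
            # Retry of a request already processed: replay the stored response.
            return self._results[idempotency_key]
        result = self._handler(request)
        self._results[idempotency_key] = result
        return result
```

The caller generates the key once per logical operation and reuses it on every retry, so a timeout followed by a retry cannot charge a card twice.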

20.9 Service discovery

How do services find each other?

Client-side (Eureka, Consul)

Client queries discovery; picks an instance; calls directly.
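
A rough sketch of the client-side flow follows. StaticRegistry is a hypothetical stand-in for a real registry such as Eureka or Consul (it does not use their APIs), and round-robin is just one possible selection policy.

```python
from typing import Dict, List


class StaticRegistry:
    """Hypothetical registry: maps a service name to its known instances."""

    def __init__(self, instances: Dict[str, List[str]]) -> None:
        self._instances = instances

    def lookup(self, service: str) -> List[str]:
        return list(self._instances.get(service, []))


class RoundRobinClient:
    """Query the registry, then pick an instance in round-robin order."""

    def __init__(self, registry: StaticRegistry) -> None:
        self._registry = registry
        self._counters: Dict[str, int] = {}

    def pick(self, service: str) -> str:
        instances = self._registry.lookup(service)
        if not instances:
            raise LookupError(f"no instances registered for {service!r}")
        i = self._counters.get(service, 0)
        self._counters[service] = i + 1
        return instances[i % len(instances)]
```

The client then calls the returned address directly; health checking and caching of lookups are left out of this sketch.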

Server-side (DNS, K8s service)

Client calls a stable name; routing layer resolves to instance.

Service mesh (Envoy + control plane)

Sidecar proxy handles discovery transparently; app code talks to localhost.

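In Kubernetes: every Service has a stable DNS name (payments.default.svc.cluster.local) that by default resolves to a virtual cluster IP; kube-proxy rules (iptables or IPVS) forward connections for that IP to one of the backing pods. The client never sees individual pod addresses.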

20.10 Service mesh

A layer that handles cross-cutting concerns for service-to-service traffic.

Components

Data plane: sidecar proxy (Envoy, Linkerd-proxy) per service instance. Intercepts all traffic.
Control plane: configures the data plane (Istiod, Linkerd control).

What it gives you

mTLS between services (zero-trust networking).
L7 routing (canary, A/B, traffic shifting).
Retries, timeouts, circuit breakers, outlier detection.
Telemetry (metrics, distributed traces) with little application code; traces still require the app to propagate trace headers.
Rate limiting.
Policy (deny service-A → service-B).
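
A mesh implements circuit breaking inside the proxy, but the logic is easier to see in application code. The sketch below is an assumption-laden illustration, not how Envoy or Linkerd implement it: the class names, the threshold of 3 failures, and the 30-second reset are all hypothetical.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and the call is rejected immediately."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after_s:
                raise CircuitOpenError("circuit open; failing fast")
            # Reset window elapsed: half-open, let this one trial call through.
        try:
            result = fn()
        except Exception:
            self._failures += 1
            # Re-open on a failed trial, or open once the threshold is reached.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Failing fast while the circuit is open protects both the caller (no threads stuck waiting) and the struggling downstream service (no extra load).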

Cost

Operational complexity (sidecars, control plane, version upgrades).
Latency (each sidecar hop adds roughly 1 ms, and a request usually passes through two sidecars).
Resource usage (sidecar per pod = CPU/RAM).
Debugging: another layer to look at.

When it's worth it

Tens of services or more in production.
Strong security requirements (zero-trust).
Need for traffic shaping (canary across many services).

For 5-10 services, simple HTTP clients with retries + timeouts + good logging are usually enough.
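
For a small fleet, the retry part can be a short helper like the sketch below. The function name call_with_retries and its defaults are hypothetical; it assumes fn enforces its own timeout and that only transient errors (here ConnectionError and TimeoutError) are worth retrying.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay_s: float = 0.1,
    max_delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Call fn, retrying transient failures with capped exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            delay = min(max_delay_s, base_delay_s * 2 ** (attempt - 1))
            # Full jitter: random wait in [0, delay] so clients do not retry in lockstep.
            sleep(random.uniform(0, delay))
    raise AssertionError("unreachable")
```

Only retry operations that are safe to repeat (reads, or writes protected by an idempotency key), and keep the attempt count low so retries do not amplify an outage.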

20.11 API gateway

Single entry point for external clients. Common in microservices.

Responsibilities:

TLS termination
Auth (OAuth, JWT validation)
Rate limiting
Request routing to services
Request/response transformation
Aggregation of multiple service calls
Observability (logging, tracing)

Examples: Kong, Apigee, AWS API Gateway, Envoy Gateway, Tyk.

Risk: gateway becomes a god object. Keep logic minimal.

Backend-for-Frontend (BFF)

A gateway per client type (mobile-bff, web-bff). Lets each frontend get tailored APIs without overloading the general API.

20.12 Data ownership and eventual consistency

In microservices, you can't JOIN across service boundaries. Solutions:

Replicated read models: service A subscribes to service B's events, builds a local view.
API composition: in the read path, query both services and merge.
CQRS + event sourcing: write to one model; project to many read models.
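
API composition, the second option above, might look like the following sketch. The function name, the field names, and the two fetch callbacks are hypothetical; in a real system each callback would be an RPC to the owning service.

```python
from typing import Callable, Dict, List, Optional


def compose_order_view(
    order_ids: List[str],
    fetch_orders: Callable[[List[str]], List[dict]],
    fetch_customers: Callable[[List[str]], Dict[str, dict]],
) -> List[dict]:
    """Merge data from two services in the read path instead of a SQL JOIN."""
    orders = fetch_orders(order_ids)
    customer_ids = sorted({o["customer_id"] for o in orders})
    # One batched call for all customers, not one call per order.
    customers = fetch_customers(customer_ids)
    view = []
    for o in orders:
        customer: Optional[dict] = customers.get(o["customer_id"])
        view.append(
            {
                "order_id": o["id"],
                "total_cents": o["total_cents"],
                # The other service may not know this customer (yet); degrade gracefully.
                "customer_name": customer["name"] if customer else None,
            }
        )
    return view
```

Batching the second call is what keeps this from turning into the chatty-interface anti-pattern.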

Consistency is eventual. Design UX for it ("your order is being processed" while the saga finishes).
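
A replicated read model shows where the "eventual" comes from: the local view is only as fresh as the last event applied. In this sketch the class name, the event shape (type, customer_id, version, name), and the per-customer version number are all assumptions; the version check is one simple way to tolerate duplicate or out-of-order delivery.

```python
from typing import Dict, Optional


class CustomerNameView:
    """Local, read-only copy of customer names owned by another service."""

    def __init__(self) -> None:
        self._names: Dict[str, str] = {}
        self._versions: Dict[str, int] = {}

    def apply(self, event: dict) -> bool:
        """Apply one event; return False if it was stale or a duplicate."""
        cid, version = event["customer_id"], event["version"]
        if version <= self._versions.get(cid, 0):
            return False  # already seen this or a newer version; ignore
        if event["type"] == "customer.deleted":
            self._names.pop(cid, None)
        else:
            self._names[cid] = event["name"]
        self._versions[cid] = version
        return True

    def name_of(self, customer_id: str) -> Optional[str]:
        return self._names.get(customer_id)
```

Between the moment the owning service commits a change and the moment the event is applied here, name_of returns the old value, which is exactly the window the UX has to account for.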

20.13 Operational maturity required

Microservices demand:

Containerization (Docker).
Orchestration (Kubernetes, ECS, Nomad).
CI/CD per service.
Centralized logging, metrics, tracing (the three pillars; chapter 26).
Service catalog / inventory.
Incident response across teams.
SLOs and error budgets per service.

Without these, microservices will pull a small team underwater fast.

20.14 Practical migration strategy

Strangler fig: incrementally extract services from a monolith.

Identify a bounded context.
Build the new service alongside the monolith.
Route traffic for that context to the new service via a façade.
Once stable, delete monolith code.
Repeat.
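
The façade in the steps above is essentially a routing table that grows one entry per extracted context. The sketch below assumes routing by URL path prefix, which is only one option; the class name and backend names are hypothetical.

```python
from typing import Dict


class StranglerFacade:
    """Route extracted path prefixes to new services; everything else to the monolith."""

    def __init__(self, legacy_backend: str) -> None:
        self._legacy = legacy_backend
        self._extracted: Dict[str, str] = {}

    def extract(self, path_prefix: str, backend: str) -> None:
        """Record that requests under path_prefix now belong to backend."""
        self._extracted[path_prefix] = backend

    def route(self, path: str) -> str:
        # Match whole path segments only, so "/payments" does not capture "/paymentsx".
        matches = [
            p
            for p in self._extracted
            if path == p or path.startswith(p.rstrip("/") + "/")
        ]
        if not matches:
            return self._legacy
        # Longest prefix wins when several extracted prefixes match.
        return self._extracted[max(matches, key=len)]
```

Rolling back an extraction is then a one-line change (remove the prefix), which is what makes the incremental approach low-risk compared to a rewrite.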

Don't rewrite the whole thing in one shot. Big bang rewrites are graveyards.

20.15 Google's approach

Google internally uses many small services with an extremely sophisticated ecosystem:

Borg for orchestration (Kubernetes is its open-source descendant).
Stubby for RPC (gRPC is its open-source successor).
Protocol Buffers for serialization.
Chubby (lock service) for coordination.
Spanner / Bigtable for storage.
Monorepo so cross-service changes are atomic at the source level (in one change).
Strong observability and SRE practices.

The monorepo with atomic cross-service changes addresses the biggest microservices pain point (coordinated changes). Most companies don't have this.

Key takeaways

Monoliths are great until they aren't. Modular monoliths buy a long runway.
Microservices win at scale (people and load) at high operational cost.
Database per service is foundational; cross-service joins are forbidden.
Service mesh handles cross-cutting concerns at scale; overkill for small fleets.
Strangler fig over big bang rewrites.