◐ system-design/stream-processing.md
31. Stream Processing: Kafka Streams, Flink, Exactly-Once
Stream processing operates on unbounded data — events arriving continuously. Done right, you replace nightly batch jobs with seconds-fresh insights and react to events as they happen.
~6 min read·updated 5/29/2026
31. Stream Processing: Kafka Streams, Flink, Exactly-Once
Stream processing operates on unbounded data — events arriving continuously. Done right, you replace nightly batch jobs with seconds-fresh insights and react to events as they happen.
31.1 Why streaming
- Lower latency: react in seconds, not hours.
- Continuous insight: dashboards live; alerts immediate.
- Decouples producers and consumers: services emit events; many consumers process.
- Elastic processing: workers scale with load.
Use cases: fraud detection, real-time analytics, dashboards, ML feature stores, alerts, ETL pipelines.
31.2 Streams vs batch
- Batch: bounded data; result is "complete." Triggered by time.
- Stream: unbounded; result is "the answer so far." Triggered by data.
Batch is a special case of stream (a finite stream). Apache Beam models both with one API.
31.3 The streaming stack
Source: an event log
- Kafka: dominant.
- Pulsar: Kafka competitor; tiered storage; better multi-tenancy.
- Kinesis: AWS managed.
- Pub/Sub: GCP managed.
Processor: a stream framework
- Kafka Streams: library; runs in your JVM app; stateful via embedded RocksDB.
- Apache Flink: dedicated cluster; rich windowing, exactly-once, true streaming. Industry leader for stateful streaming.
- Spark Structured Streaming: micro-batch model; mature; integrates with Spark batch.
- Apache Storm, Samza: older; mostly displaced.
- ksqlDB: SQL on Kafka Streams.
- Materialize, Decodable, Estuary: SQL streaming as a service.
Sink
Where results land: another Kafka topic, a DB, a search index, a dashboard, a data warehouse.
31.4 Time in streams: event time vs processing time
Critical distinction.
- Event time: when the event actually happened (in the data's payload).
- Processing time: when the stream processor received it.
In ideal world, equal. In reality, events arrive late, out of order, in bursts. Real-time analytics must handle both.
Watermarks
A watermark is a heuristic: "all events with event time ≤ T have arrived." Allows the system to close windows and emit results.
Generated from the data (e.g., max event time seen − allowed lateness). Late events (after watermark) are either dropped, sent to a side output, or trigger window updates.
Late data strategies
- Drop.
- Side output for separate handling.
- Update windows (allowing fixes; expensive).
31.5 Windowing
Aggregations need bounds. Common windows:
Tumbling
Fixed-size, non-overlapping. "Hourly" bucket.
Hopping (sliding)
Fixed-size, overlapping. "5-minute window every 1 minute" → 5 results per minute boundary.
Session
Variable-size, separated by gaps of inactivity. "User session" = activity grouped with < 30 min gap.
Global / fixed-key
No window; aggregate forever per key.
Custom
Application-defined boundaries.
31.6 State in streams
Most useful streams are stateful: aggregates, joins, sessions.
Where state lives
- In memory (small): fast; lost on crash.
- Embedded persistent KV (RocksDB): used by Flink, Kafka Streams. Backed up via checkpoints.
- External KV (Redis, DynamoDB): visible across processes; latency cost.
Checkpointing
Flink takes consistent snapshots of state across all operators (Chandy-Lamport algorithm). On failure, restart from the last checkpoint and replay events from Kafka offset → exactly-once.
31.7 Joins in streams
Stream-stream join
Two streams, joined within a window. Order events arrive in. Used for: clickstream + purchase, ad-impression + click.
Stream-table join
Stream joined against a slowly-changing table (often broadcast or lookup). Used for enrichment: events × user dimension.
Table-table join (continuous)
Both sides continuous; result kept up to date.
Joins require state proportional to the join window. Tune carefully.
31.8 Exactly-once processing
Achieved by:
- Idempotent / transactional sink: writes are deduplicated, or use a transaction.
- Checkpointed state: on failure, restart from last consistent point.
- Replayable source: Kafka holds events for the retention window.
- Coordinated checkpoint commits: state checkpoint + source offset commit are atomic.
Flink + Kafka achieves this end-to-end with two-phase commits.
Note: "exactly-once" is for the stream pipeline. Side effects to external systems (sending an email, hitting a third-party API) still need idempotency keys.
31.9 Backpressure
Producers may outpace consumers. Strategies:
- Pull-based (Kafka): consumer fetches at its own pace; broker buffers.
- Push-based with credits (gRPC, reactive streams): consumer signals capacity.
- Drop on overload: shed events.
- Auto-scale consumers: add workers based on lag.
Watch consumer lag. Alert when it grows beyond a threshold.
31.10 Stateful operators in detail
KStream / KTable / GlobalKTable (Kafka Streams)
- KStream: an event stream.
- KTable: a changelog stream interpreted as a table (latest value per key).
- GlobalKTable: a KTable replicated to all workers (for broadcast joins).
Powerful, but a learning curve. The "duality" of streams and tables is foundational.
Flink's data model
DataStream + Table API. SQL-on-streams.
31.11 Common patterns
Real-time dashboards
Events → aggregate by minute → push to dashboard via WebSocket.
Fraud detection
Card swipes → stateful pattern detection (e.g., > $X within Y seconds in different countries) → alert.
Feature store updates
Real-time signals (click rates, user-recent-actions) computed via Flink → feature store → ML serving.
Stream-to-warehouse
Kafka events → micro-batches → Iceberg / BigQuery for analytics.
Change data capture (CDC)
DB → CDC tool (Debezium) → Kafka → downstream consumers (search index, data lake, microservices).
Outbox + CDC (recap)
Service writes outbox → Debezium reads WAL → Kafka. The reliable pattern for transactional event publishing.
31.12 Lambda and Kappa architectures
Lambda
Two pipelines:
- Batch layer: accurate, slow.
- Speed (stream) layer: fast, approximate.
- Serving layer: merges both.
Built before streaming was reliable enough alone. Now considered legacy due to maintenance burden of two pipelines.
Kappa
Single stream pipeline. Replay history when you need a recompute (Kafka retention or archived to object store).
Most modern systems are Kappa-shaped: one stream pipeline, with batch jobs only when necessary.
31.13 Materialized views
Stream processing → continuously updated materialized view in a DB / cache / index. Queries hit the materialized form.
Examples:
- Materialize, RisingWave: SQL streaming DBs that maintain materialized views in real time.
- Druid, Pinot: real-time OLAP DBs that ingest streams and serve sub-second queries.
- Custom: Flink job → Postgres view.
This pattern is increasingly the default for "real-time analytics" and dashboards.
31.14 Operational notes
- Schema evolution: streams persist; old code reads new data, new code reads old. Use schema registry.
- Reprocessing: Kafka allows resetting consumer offsets to replay history. Plan for it.
- Hot keys: skewed key distribution → one worker overloaded. Salt or repartition.
- State size: monitor; checkpoints get slow at large state.
- Downstream backpressure: handle gracefully; never drop silently without metrics.
31.15 What an interviewer wants
- Streams vs batch trade-off.
- Event time vs processing time + watermarks.
- Windowing types (tumbling, hopping, session).
- Exactly-once semantics: how Flink + Kafka achieve it.
- Choose Flink for stateful streaming; Kafka Streams for in-app embedded.
31.16 Google's contributions
- MillWheel (2013 paper): low-latency stream processing.
- Dataflow (2015 paper): unified batch + stream model. Open-sourced as Apache Beam.
- The "Streaming Systems" book (Tyler Akidau et al.) — the definitive treatment.
Beam's model is the most coherent treatment of "what does a streaming computation mean": triggers, accumulation modes, panes, watermarks. Worth knowing.
Key takeaways
- Stream processing operates on unbounded data; replaces or complements batch.
- Event time + watermarks are essential for correctness with late/out-of-order data.
- Windowing: tumbling, hopping, session.
- Exactly-once via Flink + Kafka transactions is achievable end-to-end (within the pipeline).
- Kappa architecture (one stream pipeline) is the modern default; Lambda is legacy.
- Materialized views over streams are the new "real-time analytics" default.
// 0 views