
3. Networking: TCP, HTTP, DNS, WebSockets

Distributed systems are conversations across networks. You must know what each layer guarantees and what it doesn't.

~7 min read·updated 5/29/2026

3.1 The OSI / TCP-IP stack

| Layer | Examples | What it does |
|---|---|---|
| L7 Application | HTTP, gRPC, DNS, SMTP | App-level protocols |
| L6 Presentation | TLS, JSON, Protobuf | Encoding, encryption |
| L5 Session | (rarely a separate thing in practice) | Conversation state |
| L4 Transport | TCP, UDP, QUIC | End-to-end delivery semantics |
| L3 Network | IP, ICMP | Routing across networks |
| L2 Data Link | Ethernet, Wi-Fi | Local network frames |
| L1 Physical | Copper, fiber, radio | Bits on wire |

In design interviews you mostly live at L4 (TCP vs UDP) and L7 (HTTP, gRPC, WebSockets). Load balancers are categorized the same way: L4 (transport-aware) vs L7 (HTTP-aware).

3.2 TCP

Transmission Control Protocol. Reliable, ordered, connection-oriented byte stream. The bedrock of HTTP, SMTP, SSH, MySQL wire protocol, etc.

Guarantees:

  • Bytes delivered in order
  • Bytes delivered exactly once (no duplicates surfaced to app)
  • Lost bytes are retransmitted
  • Flow control: receiver can slow down sender
  • Congestion control: TCP backs off when network is loaded

The 3-way handshake

Client                Server
  | -------- SYN ------> |
  | <----- SYN-ACK ----- |
  | -------- ACK ------> |
  (connection established)

This costs 1 RTT before any data can flow. On a cross-continent link (~150ms RTT), that is a large cost for every new connection. TCP Fast Open and connection reuse (HTTP keep-alive, HTTP/2) mitigate it.

TCP gotchas

  • Head-of-line blocking: TCP delivers in order. If packet 5 is lost, packets 6-10 wait in the kernel until 5 is retransmitted, even if your app doesn't care about ordering. HTTP/2 multiplexes streams over one TCP connection, so a single lost packet stalls all streams. HTTP/3 (over QUIC/UDP) fixes this.
  • Connection state lives in the kernel. Each connection costs memory. The C10K problem (10K concurrent connections) is solved; C10M (10M) typically requires kernel-bypass networking (DPDK) or batched, low-syscall I/O (io_uring).
  • TIME_WAIT. After a close, the side that closed first keeps the connection in TIME_WAIT (~60s on Linux), holding its port number. High-churn outbound clients can exhaust ephemeral ports.
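
To see why TIME_WAIT matters, a back-of-the-envelope calculation helps. The sketch below assumes the Linux default ephemeral port range (32768-60999, an assumption not stated above) and the ~60s hold time from the text. It estimates the ceiling on new outbound connections per second to a single destination IP and port.

```python
# Rough ceiling on new outbound connections per second to ONE destination
# (same destination IP and port), when the client is the side that closes.

# Assumption: Linux default ephemeral port range (net.ipv4.ip_local_port_range).
EPHEMERAL_PORT_LOW = 32768
EPHEMERAL_PORT_HIGH = 60999
# From the text: closed connections linger roughly 60 s in TIME_WAIT.
TIME_WAIT_SECONDS = 60


def max_new_connections_per_second(low: int, high: int, time_wait_s: float) -> float:
    """Each closed connection pins one local port for time_wait_s seconds,
    so the steady-state rate is bounded by usable ports / hold time."""
    usable_ports = high - low + 1
    return usable_ports / time_wait_s


ceiling = max_new_connections_per_second(
    EPHEMERAL_PORT_LOW, EPHEMERAL_PORT_HIGH, TIME_WAIT_SECONDS
)
print(f"about {ceiling:.0f} new connections/s per destination")
```

Under these assumptions the ceiling is a few hundred connections per second, which is why high-churn clients use connection pooling instead of opening a fresh connection per request.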

3.3 UDP

User Datagram Protocol. Unreliable, unordered, connectionless datagram. Send a packet, hope it arrives.

Why use it?

  • Lower latency (no handshake, no ACKs blocking on retransmission)
  • Acceptable for: DNS queries (retry at app layer), VoIP (drop a sample, keep going), video conferencing, real-time games, QUIC (TCP-replacement)

UDP is a building block. Reliability is added at the app layer if needed.
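
A minimal sketch of app-layer reliability over UDP, assuming a loopback echo server that deliberately drops its first datagram to simulate loss. The helper names (`lossy_echo_server`, `request_with_retry`) and the timeout values are illustrative, not from any library.

```python
import socket
import threading


def lossy_echo_server(sock: socket.socket, drop_first: int) -> None:
    """Echo datagrams back, silently dropping the first `drop_first` to simulate loss."""
    seen = 0
    while True:
        data, addr = sock.recvfrom(2048)
        if data == b"stop":
            return
        seen += 1
        if seen > drop_first:
            sock.sendto(data, addr)


def request_with_retry(server_addr, payload: bytes,
                       timeout_s: float = 0.2, attempts: int = 4) -> bytes:
    """Send one datagram and wait for a reply; resend when the wait times out."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.settimeout(timeout_s)
        for _attempt in range(attempts):
            client.sendto(payload, server_addr)
            try:
                reply, _addr = client.recvfrom(2048)
                return reply
            except socket.timeout:
                continue  # UDP gives no delivery signal, so the app retries
    raise TimeoutError("no reply after retries")


server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
server_addr = server.getsockname()
thread = threading.Thread(target=lossy_echo_server, args=(server, 1), daemon=True)
thread.start()

reply = request_with_retry(server_addr, b"ping")
print(reply)

# Shut the demo server down cleanly.
with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as stopper:
    stopper.sendto(b"stop", server_addr)
thread.join(timeout=1)
server.close()
```

The first datagram is lost, the client times out, and the second attempt succeeds. A real protocol would also need request IDs so a late reply to an earlier attempt is not mistaken for the current one.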

3.4 HTTP

Application protocol on top of TCP (HTTP/1.1, HTTP/2) or QUIC (HTTP/3).

HTTP/1.0 → HTTP/1.1

  • HTTP/1.0: one request per TCP connection. Slow.
  • HTTP/1.1: persistent connections (keep-alive) and pipelining (rarely used because of head-of-line blocking). Browsers typically open up to six parallel connections per origin.

HTTP/2

  • Binary framing (no more parsing text headers)
  • Multiplexing: many parallel streams over one TCP connection. Reduces connection overhead, but TCP head-of-line blocking still bites when a packet is lost.
  • Header compression (HPACK): repeated headers cost ~0 bytes after the first request.
  • Server push (mostly removed; browsers deprecated it).

HTTP/3 (QUIC)

  • Built on UDP, not TCP.
  • No TCP head-of-line blocking: each stream is independent.
  • 0-RTT resumption: returning clients can send data in the very first packet.
  • Built-in TLS 1.3: handshake combined with transport.
  • Adopted by Google, Meta, Cloudflare; ~30%+ of web traffic by 2024.

HTTP methods

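| Method | Idempotent | Safe | Body |
|---|---|---|---|
| GET | Yes | Yes | No |
| HEAD | Yes | Yes | No |
| POST | No | No | Yes |
| PUT | Yes | No | Yes |
| DELETE | Yes | No | No |
| PATCH | No | No | Yes |
| OPTIONS | Yes | Yes | No |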

Idempotent = same request multiple times has the same effect as once. Critical for retries: only retry idempotent methods automatically. POST is not idempotent unless you add an idempotency key (e.g., Stripe's Idempotency-Key header).
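
One way an idempotency key can make POST safe to retry, sketched with a hypothetical in-memory `PaymentService`. The class, its fields, and the `ch_` ID format are invented for illustration; a real service would persist keys in a shared store with an expiry.

```python
from dataclasses import dataclass, field


@dataclass
class PaymentService:
    # Hypothetical in-memory store: idempotency key -> response already returned.
    _responses: dict = field(default_factory=dict)
    # Side effects actually performed, as (charge_id, amount_cents).
    charges: list = field(default_factory=list)

    def create_charge(self, idempotency_key: str, amount_cents: int) -> dict:
        """Handle POST /charges. A repeated key replays the stored response
        instead of charging again."""
        if idempotency_key in self._responses:
            return self._responses[idempotency_key]
        charge_id = f"ch_{len(self.charges) + 1}"
        self.charges.append((charge_id, amount_cents))
        response = {"id": charge_id, "amount_cents": amount_cents}
        self._responses[idempotency_key] = response
        return response


service = PaymentService()
first = service.create_charge("key-1", 500)
retry = service.create_charge("key-1", 500)   # client retried after a timeout
other = service.create_charge("key-2", 500)   # a genuinely new request
print(first == retry, len(service.charges))
```

The retry returns the original response and performs no second charge; only a new key creates a new charge.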

Status codes you must know

  • 2xx success: 200 OK, 201 Created, 202 Accepted (async), 204 No Content
  • 3xx redirect: 301 Moved Permanently, 302 Found, 304 Not Modified (caching)
  • 4xx client error: 400 Bad Request, 401 Unauthorized (no auth), 403 Forbidden (auth but no permission), 404 Not Found, 409 Conflict, 410 Gone, 422 Unprocessable Entity, 429 Too Many Requests
  • 5xx server error: 500 Internal, 502 Bad Gateway, 503 Service Unavailable (load shedding), 504 Gateway Timeout

Use 429 with a Retry-After header for rate limiting and 503 with Retry-After for overload. The header tells well-behaved clients how long to back off.
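
A client-side retry policy that combines the method table and the status codes might look like the sketch below. The set of retryable statuses and the backoff parameters are assumptions chosen for illustration, and only the delay-in-seconds form of Retry-After is handled (the header may also carry an HTTP date).

```python
import random
from typing import Optional

IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS"}
# Assumption: statuses treated as transient by this sketch.
RETRYABLE_STATUSES = {429, 502, 503, 504}


def should_retry(method: str, status: int, has_idempotency_key: bool = False) -> bool:
    """Retry only when repeating the request cannot duplicate a side effect."""
    safe_to_repeat = method.upper() in IDEMPOTENT_METHODS or has_idempotency_key
    return safe_to_repeat and status in RETRYABLE_STATUSES


def retry_delay_seconds(attempt: int, retry_after: Optional[str] = None,
                        base: float = 0.5, cap: float = 30.0) -> float:
    """Prefer the server's Retry-After (seconds form); otherwise use
    exponential backoff with full jitter."""
    if retry_after is not None and retry_after.strip().isdigit():
        return float(retry_after.strip())
    return random.uniform(0, min(cap, base * 2 ** attempt))


print(should_retry("GET", 503))                             # idempotent method, transient status
print(should_retry("POST", 503))                            # not idempotent, no key
print(should_retry("POST", 503, has_idempotency_key=True))  # key makes the retry safe
print(retry_delay_seconds(0, retry_after="7"))
```

Jitter spreads retries out so that many clients backing off from the same overload do not return in lockstep.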

Caching headers

  • Cache-Control: max-age=3600, public — cacheable for 1 hour
  • Cache-Control: no-store — don't cache anywhere
  • Cache-Control: private — browser only, not CDN
  • ETag: "abc123" + If-None-Match: "abc123" → 304 if unchanged. Saves bandwidth.
  • Last-Modified + If-Modified-Since → similar, time-based.
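
The ETag flow can be sketched server-side as follows. `make_etag` and `conditional_get` are hypothetical helpers; the sketch derives the tag from a content hash and does an exact string match, ignoring weak validators, `*`, and comma-separated lists in If-None-Match.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a validator from the content; any stable fingerprint works."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(body: bytes, request_headers: dict) -> tuple:
    """Return (status, response_headers, payload) for a GET of `body`."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=3600, public"}
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, b""  # client copy is current: send no body
    return 200, headers, body


# First fetch: full response plus a validator.
status1, headers1, payload1 = conditional_get(b"hello", {})
# Revalidation with the stored ETag: 304 and an empty body.
status2, _, payload2 = conditional_get(b"hello", {"If-None-Match": headers1["ETag"]})
# Content changed on the server: the old ETag no longer matches.
status3, _, payload3 = conditional_get(b"hello v2", {"If-None-Match": headers1["ETag"]})
print(status1, status2, status3)
```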

3.5 HTTPS / TLS

TLS = Transport Layer Security. The thing the S adds in HTTPS. Provides:

  1. Confidentiality: encrypted in transit.
  2. Integrity: tamper-detected.
  3. Authentication: you're talking to the real server (via X.509 certificates).

TLS handshake (TLS 1.2 vs 1.3)

  • TLS 1.2: 2 RTTs.
  • TLS 1.3: 1 RTT (or 0 RTT for resumed sessions). All weak ciphers removed. Mandatory forward secrecy.
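
To make the handshake cost concrete, the sketch below adds up setup round trips before the first request byte can leave, using the ~150ms cross-continent RTT from section 3.2. The round-trip counts come from the text (TCP: 1, TLS 1.2: 2, TLS 1.3: 1, QUIC: combined transport and TLS); real connections vary with session resumption and packet loss.

```python
RTT_MS = 150  # cross-continent figure used in section 3.2

# Round trips spent on connection setup before the first request byte is sent.
SETUP_ROUND_TRIPS = {
    "TCP + TLS 1.2": 1 + 2,
    "TCP + TLS 1.3": 1 + 1,
    "QUIC (HTTP/3), new connection": 1,
    "QUIC (HTTP/3), 0-RTT resumption": 0,
}


def setup_latency_ms(round_trips: int, rtt_ms: float) -> float:
    """Setup cost is just round trips times the path RTT."""
    return round_trips * rtt_ms


for name, round_trips in SETUP_ROUND_TRIPS.items():
    print(f"{name}: {setup_latency_ms(round_trips, RTT_MS):.0f} ms")
```

At this RTT, moving from TCP + TLS 1.2 to TLS 1.3 saves one round trip (150ms) per new connection, and connection reuse avoids the cost entirely.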

Certificates

A TLS cert binds a domain to a public key, signed by a CA (Certificate Authority). Browsers ship a list of trusted CAs. Let's Encrypt made free certs ubiquitous.

For internal services: mTLS (mutual TLS). Both client and server present certs. SPIFFE/SPIRE provides workload identity in service meshes.

3.6 DNS

Domain Name System. Translates google.com → 142.250.190.46.

Hierarchy

  1. Root servers (.)
  2. TLD servers (.com, .org, country codes)
  3. Authoritative servers for the zone (google.com.)
  4. Recursive resolvers (your ISP, 8.8.8.8, 1.1.1.1) — they ask 1, 2, 3 on your behalf and cache.

Record types

  • A: IPv4 address
  • AAAA: IPv6 address
  • CNAME: alias (www → example.com)
  • MX: mail server
  • TXT: arbitrary text (used for SPF, DKIM, domain verification)
  • NS: nameserver delegation
  • SOA: start of authority (zone metadata)

TTL trade-off

  • Long TTL → more cache hits, faster lookups, slower failover (clients use stale records).
  • Short TTL → fast failover, more DNS load.

For DR and rapid failover, set TTL to 60s for the records that need to fail over. Pay the lookup cost.
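
The failover cost of a TTL is easiest to see in a toy cache. The sketch below is a hypothetical `TtlDnsCache` with a fake zone and a manual clock (all names and addresses are illustrative). It shows a client continuing to use the old address until the TTL expires.

```python
import time
from typing import Callable, Dict, List, Tuple

# A lookup maps a name to (addresses, ttl_seconds).
Lookup = Callable[[str], Tuple[List[str], float]]


class TtlDnsCache:
    """Caches answers until their TTL expires, as a resolver or client cache would."""

    def __init__(self, lookup: Lookup, clock: Callable[[], float] = time.monotonic) -> None:
        self._lookup = lookup
        self._clock = clock
        self._entries: Dict[str, Tuple[float, List[str]]] = {}  # name -> (expiry, addresses)

    def resolve(self, name: str) -> List[str]:
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: served from cache even if the zone changed
        addresses, ttl = self._lookup(name)
        self._entries[name] = (now + ttl, addresses)
        return addresses


# Demo with a fake authoritative zone and a manual clock (both hypothetical).
zone = {"api.example.com": (["192.0.2.10"], 60.0)}
fake_now = [0.0]
cache = TtlDnsCache(lambda name: zone[name], clock=lambda: fake_now[0])

before = cache.resolve("api.example.com")
zone["api.example.com"] = (["192.0.2.20"], 60.0)  # failover: record now points elsewhere
fake_now[0] = 30.0
during = cache.resolve("api.example.com")  # TTL not expired: still the old address
fake_now[0] = 61.0
after = cache.resolve("api.example.com")   # TTL expired: picks up the new address
print(before, during, after)
```

With a 60s TTL the worst-case staleness is about a minute per cache layer; with a one-hour TTL the same failover could take an hour to reach some clients.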

DNS-based load balancing

Return multiple A records (round-robin) or use geo-DNS (return the IP closest to the client). Limited because clients cache. Used in CDN front doors (Akamai, CloudFront).

3.7 Real-time / push: WebSockets, SSE, polling

How does the server push to the client?

| Technique | Direction | Complexity | When to use |
|---|---|---|---|
| Polling | client pulls every X sec | trivial | infrequent updates, prototypes |
| Long polling | client opens, server holds, replies on event | moderate | moderate update rate, no WS infra |
| Server-Sent Events (SSE) | server → client only, over HTTP | low | dashboards, notifications, one-way streams |
| WebSockets | full-duplex | moderate | chat, collab editing, gaming |
| gRPC streaming | uni or bi-directional | moderate (HTTP/2) | service-to-service |
| WebRTC | peer-to-peer over UDP | high | video/voice, low latency |

WebSocket essentials

  • Starts as HTTP, upgrades via Upgrade: websocket.
  • Persistent TCP connection, full-duplex.
  • Each connection costs memory on the server (tens of KB). 1M concurrent WS connections need careful tuning (kernel params, file descriptors). Discord has reported serving millions of concurrent users over WebSockets.
  • Stateful: load balancer must route the same client to the same node, or have a shared pub/sub layer for fan-out.
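
The upgrade step can be sketched from the server's side. RFC 6455 has the server prove it understood the request by hashing the client's Sec-WebSocket-Key with a fixed GUID; the header name and GUID come from that RFC, not from the text above, and `upgrade_response` is a hypothetical helper that only builds the response head.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Compute Sec-WebSocket-Accept: base64(SHA-1(key + GUID))."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def upgrade_response(request_headers: dict) -> str:
    """Build the 101 response head for a valid upgrade request."""
    if request_headers.get("Upgrade", "").lower() != "websocket":
        raise ValueError("not a WebSocket upgrade request")
    accept = websocket_accept(request_headers["Sec-WebSocket-Key"])
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {accept}\r\n"
        "\r\n"
    )


# Sample key from RFC 6455, section 1.3.
response = upgrade_response({
    "Upgrade": "websocket",
    "Connection": "Upgrade",
    "Sec-WebSocket-Key": "dGhlIHNhbXBsZSBub25jZQ==",
})
print(response)
```

After this response, the same TCP connection stops speaking HTTP and carries WebSocket frames in both directions.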

Push at massive scale (chat/feeds)

  • Persistent gateway tier (WS endpoints).
  • A pub/sub backbone (Redis pub/sub, Kafka, Pulsar) so any backend can push to any user without knowing which gateway holds the connection.
  • "Connection routing" map: user_id → gateway_id, kept in a fast KV store.
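
The three pieces above fit together roughly as in this in-memory sketch. `PushBackbone` is a hypothetical stand-in: the dict plays the role of the KV store, and the per-gateway lists stand in for pub/sub channels that a real system would back with Redis, Kafka, or similar.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class PushBackbone:
    """Toy model: routing map (user_id -> gateway_id) plus one channel per gateway."""

    def __init__(self) -> None:
        self.routing: Dict[str, str] = {}  # stands in for the fast KV store
        self.gateway_channels: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def register(self, user_id: str, gateway_id: str) -> None:
        """Called by a gateway when a user's WebSocket connects."""
        self.routing[user_id] = gateway_id

    def unregister(self, user_id: str) -> None:
        """Called by a gateway when the connection closes."""
        self.routing.pop(user_id, None)

    def push(self, user_id: str, message: str) -> bool:
        """Called by any backend. Returns False if the user is offline."""
        gateway_id = self.routing.get(user_id)
        if gateway_id is None:
            return False
        # Publish to the channel owned by the gateway holding the connection.
        self.gateway_channels[gateway_id].append((user_id, message))
        return True


backbone = PushBackbone()
backbone.register("alice", "gw-1")
backbone.register("bob", "gw-2")

delivered = backbone.push("alice", "hello")    # routed to gw-1
offline = backbone.push("carol", "anyone?")    # no connection registered
backbone.unregister("bob")
after_disconnect = backbone.push("bob", "hi")  # bob's entry was removed
print(delivered, offline, after_disconnect)
```

The backend never needs to know which gateway holds a connection; it only consults the routing map and publishes.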

3.8 Network failure modes you must design for

  • Packets lost or delayed indefinitely. Always set timeouts.
  • Asymmetric partitions. A → B works; B → A doesn't.
  • Slow but not dead. "Gray failures." Health checks pass, real requests time out. Hardest to detect. Tail latency monitoring helps.
  • Clock skew. Don't trust client time. Even servers drift; NTP typically keeps them within a few milliseconds of each other, not perfectly in sync. (See chapter 14.)
  • DNS poisoning / hijacking. Use DNSSEC where it matters; use TLS to authenticate the destination.
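
"Always set timeouts" in practice: the sketch below simulates a slow-but-not-dead peer with a loopback listener that completes the TCP handshake but never replies. `fetch_with_timeout` is a hypothetical helper and the 0.2s budget is arbitrary.

```python
import socket
from typing import Optional

# A listener that never calls accept() or replies. The kernel still completes
# the TCP handshake from its backlog, so connect() succeeds but no data arrives.
silent_server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
silent_server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
silent_server.listen(1)


def fetch_with_timeout(addr, timeout_s: float) -> Optional[bytes]:
    """Send a request and wait at most timeout_s for connect and for each read."""
    try:
        with socket.create_connection(addr, timeout=timeout_s) as conn:
            conn.sendall(b"ping")
            return conn.recv(1024)
    except socket.timeout:
        return None  # caller can now retry, fail over, or shed the request


result = fetch_with_timeout(silent_server.getsockname(), timeout_s=0.2)
silent_server.close()
print(result)
```

Without the timeout, the `recv` call would block indefinitely and tie up the calling thread, which is how one gray-failing dependency turns into a stalled caller.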

Key takeaways

  • TCP is reliable, in-order, but pays an RTT to handshake. UDP is fire-and-forget; HTTP/3 builds reliable streams over it without TCP's head-of-line.
  • HTTP methods have semantics — only retry idempotent ones. Use idempotency keys for POST.
  • TLS 1.3 is 1 RTT (or 0); always use it.
  • DNS TTL is a knob: short for failover, long for performance.
  • Real-time push: pick polling → SSE → WebSocket as needs grow.
