7. NoSQL Deep Dive: KV, Document, Wide-Column, Graph

"NoSQL" is a marketing umbrella for "not relational." The differences within NoSQL are larger than the differences from SQL. You must reason about each model on its own terms.

7.1 Why NoSQL emerged

By 2008-2010, three pressures hit:

Scale: Google, Amazon, Facebook had data sets and write rates no single relational box could handle.
Schema flexibility: rapid product iteration, semi-structured data (logs, events, user-generated content).
Operational fit: many web workloads only need key-based access; the cost of full SQL features wasn't paying for itself.

Influential papers:

Bigtable (Google, 2006) → wide-column.
Dynamo (Amazon, 2007) → leaderless replication, eventual consistency, consistent hashing.
MapReduce + GFS (Google) → batch + distributed FS.

Today the lines blur: distributed SQL (Spanner, CockroachDB, Vitess) and document Postgres extensions narrow the gap. But the four NoSQL families remain conceptually distinct.

7.2 Key-Value stores

Simplest interface: get(k), put(k, v), sometimes delete, cas (compare-and-set).

Use cases

Cache (Redis, Memcached)
Session store (Redis)
Lookup (DynamoDB by partition key)
Counters / rate-limiting (Redis INCR, sorted sets)
Pub/sub (Redis, NATS)

Redis specifics

Not just KV. Has rich data types:

Strings (with INCR/DECR)
Lists (push/pop both ends; queues)
Sets (membership, intersection)
Hashes (small object store; Redis stores small hashes specially as a list to save memory)
Sorted sets (ZSET): score-ordered, supports range, used for leaderboards, top-N
Streams (Redis 5+, append-only log; consumer groups like Kafka-lite)
Bitmaps and HyperLogLog (probabilistic, see chapter 17)
Geospatial (geohash-backed sorted sets)

Persistence options:

RDB: periodic snapshot (fork → write whole memory). Crash loses recent writes.
AOF: append-only file of every command. Replayed on startup. appendfsync everysec (default) or always (per-write fsync, slower).

Threading: single-threaded for command execution → simple semantics, no locks. Backed by event loop. Redis 6+ optionally multi-threads I/O.

Cluster: hash slots (16384), partitioned across nodes. Client routes; nodes redirect on slot moves (MOVED/ASK).

DynamoDB specifics

Item model: PK = partition key (+ optional sort key) → row.
Provisioned or on-demand throughput.
Single-digit ms reads (consistent or eventually consistent).
Global Secondary Indexes (GSI) and Local Secondary Indexes (LSI). GSIs are eventually consistent.
Streams (CDC): table → Lambda or Kinesis.
DAX: managed write-through cache.

Pricing model forces the design: hot partitions get throttled. Distribute writes across many partition keys.

7.3 Document stores

Each item is a structured document (JSON). Indexes on fields. Often schema-on-read.

MongoDB specifics

BSON storage; documents up to 16 MB.
Replica sets (1 primary + N secondaries) with automatic failover.
Sharding via mongos router; chunk-based, with balancer.
Multi-document transactions since 4.0 (replica set) and 4.2 (sharded). Costly; designed for the rare case, not the default.
Aggregation pipeline for ETL-like queries.
Has had a checkered history with default consistency settings (w:1 write concerns by default in old versions made it lose data on failover). Modern defaults are safer.

When document is right

Variable schema across items
Self-contained "object per row" — order with line items, blog post with comments
No (or rare) cross-document joins required

When document is wrong

Many-to-many relationships (you'll either denormalize and over-fetch or app-side join)
Transactional consistency across documents (works but slow)
Analytics workloads (use a column store)

7.4 Wide-column stores (column families)

Mental model: a sparse, distributed, multi-dimensional sorted map.

(row_key, column_family:column, timestamp) → value

Rows are sorted by row_key. Designed for writes at scale (LSM internals) and range scans on row keys.

Bigtable

Google's original (2006). Powers Search, Maps, Gmail, Analytics, etc.

Row key is the only "index". You design the row key to encode access patterns.
Single-row transactions only.
Tablet servers manage ranges of rows; Master coordinates assignment; Chubby provides locking.
Underlying storage: GFS / Colossus.
SSTable + memtable + WAL (the LSM pattern).

Cassandra

Open-source descendant blending Bigtable's data model with Dynamo's leaderless replication.

Tunable consistency: per-query R, W, N (replicas needed for read, write, total). For strong: R + W > N.
No master: gossip protocol; any node can serve any request, coordinates writes via consistent hashing.
Partition key + clustering key: data within a partition is sorted by clustering key.
CQL: SQL-like query language. Don't be fooled — no joins, no subqueries, no GROUP BY across partitions.
Anti-entropy: read repair, hinted handoff, Merkle-tree based repair process.

Modeling rule: one table per query. Denormalize aggressively.

HBase

Open-source Bigtable clone, on top of HDFS. Strong row consistency. Region servers; ZooKeeper for coordination.

When wide-column shines

Time-series / event data (key = (device_id, ts), range scan on time)
Hot writes (LSM throughput)
Sparse columns (millions of possible columns per row, but each row has few)
Multi-PB scale

When it hurts

Ad-hoc queries (you'd need a new table)
Transactions across rows
Joins of any kind

7.5 Graph databases

Native graph storage and traversal.

Property graphs (Neo4j, Neptune, JanusGraph)

Nodes and edges, both with properties.
Cypher query language: MATCH (a:User)-[:FRIEND]->(b:User) RETURN a, b
Optimized index-free adjacency: each node has direct pointers to its edges → constant-time hop, regardless of graph size.

Triple stores (RDF: subject, predicate, object)

SPARQL query language.
Used in semantic web, knowledge graphs.

When a graph DB beats a relational DB

Variable-depth traversals: shortest path, k-hop neighbors
Frequent recursive queries (Postgres CTEs work but get slow)
Pattern matching on relationships

When NOT to use one

Most "social" features (timeline, follow counts) are bounded-depth and fit relational fine.
The data model is more important than the storage. Unless you genuinely live in graphland, the operational cost of another system isn't worth it.

7.6 The Dynamo paper (and its descendants)

Amazon Dynamo (2007 SOSP paper) introduced ideas now ubiquitous:

Consistent hashing for partitioning (chapter 17).
Sloppy quorum + hinted handoff for write availability when replicas are down.
Vector clocks for conflict detection (chapter 14).
Read repair + anti-entropy (Merkle trees) for eventual consistency.
No master: any node can coordinate.

Direct descendants: Cassandra, Riak, Voldemort, ScyllaDB. Modern DynamoDB (the AWS service) is not the same; it uses single-leader replication per partition with multi-AZ failover.

7.7 BASE vs ACID

The marketing counterpart to ACID:

Basically Available
Soft state
Eventual consistency

In practice, "BASE" is just shorthand for "we picked AP over CP, and we'll converge eventually." Don't quote it as a principle; reason about your actual consistency needs and the actual guarantees your store provides.

7.8 Choosing a NoSQL store

Workload	Pick
Cache, session, ephemeral counters	Redis
Massive scale (PB), wide-column, time-series	Bigtable / Cassandra
Document model, simple horizontal scale	MongoDB / Firestore
Single-digit ms KV at AWS	DynamoDB
Graph, deep traversals	Neo4j
Search-as-a-feature	Elasticsearch

Default to Postgres; reach for these when the workload truly needs them.

7.9 What Google interviewers love to hear

You know Bigtable's data model, can explain why row key design matters, and how it relates to Cassandra/HBase.
You know Spanner's contribution: it brought relational schemas + ACID transactions + horizontal scale together via TrueTime + Paxos. Before Spanner, "NoSQL or scale" was a forced choice.
You can articulate trade-offs: why use a column-family store over Postgres for clickstream events, but Postgres for orders.

Key takeaways

"NoSQL" is four families: KV, document, wide-column, graph. Each solves a different problem.
Cassandra / Bigtable / HBase = wide-column for write-heavy, scaled-out, denormalized workloads.
Redis is a Swiss army knife: KV, queue, pub/sub, geo, leaderboard.
Distributed SQL (Spanner) erodes the "scale → must give up SQL" narrative.