Notion — one decision, ten years of consequences.
Notion’s entire data model rests on one architectural decision: everything is a block, and a workspace's content is a tree of blocks. That decision drives every subsequent product superpower and every subsequent engineering pain.
The architecture — from public signals
The five things you can see from outside.
Everything is a block
Page titles, paragraphs, toggles, embedded databases, columns: all are blocks with an ID, a type, a parent, a list of children, and permissions. The renderer walks the tree recursively. This one model is the source of the product's composability and of its main read-path cost, since rendering a page means a recursive walk over every block beneath it.
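To make the shape concrete, here is a minimal sketch of a block store and a recursive renderer. The `Block` fields mirror the list above; the class, the `render` function, and the sample data are illustrative assumptions, not Notion's schema or code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Block:
    """Hypothetical block record; field names are assumptions for illustration."""
    id: str
    type: str
    text: str = ""
    parent: Optional[str] = None
    children: List[str] = field(default_factory=list)


def render(store: Dict[str, Block], block_id: str, depth: int = 0) -> List[str]:
    """Walk the tree depth-first and return one indented line per block."""
    block = store[block_id]
    lines = [f"{'  ' * depth}[{block.type}] {block.text}"]
    for child_id in block.children:
        # Each child is itself a block, so the same function handles every level.
        lines.extend(render(store, child_id, depth + 1))
    return lines


store = {
    "p1": Block("p1", "page", "Roadmap", children=["t1", "d1"]),
    "t1": Block("t1", "toggle", "Q3 goals", parent="p1", children=["x1"]),
    "x1": Block("x1", "paragraph", "Ship sync v2", parent="t1"),
    "d1": Block("d1", "database", "Tasks", parent="p1"),
}

print("\n".join(render(store, "p1")))
```

Note that a page with N descendant blocks costs N lookups to render in this sketch, which is the read-time price of the model.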
Per-block ACLs
Sharing applies to individual blocks within pages. Permission resolution happens block-by-block at read. Combined with recursive rendering, this is computationally non-trivial; Notion's engineering blog acknowledges the caching investment.
Postgres re-shard was the inflection
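From a single Postgres monolith to 480 logical shards across 32 physical databases, then to 96 physical databases. Notion's write-ups say the team evaluated packaged options such as Citus and Vitess and chose to build application-level sharding instead. The block model's flexibility came at a database-scaling cost. The blog series on the re-shard is one of the most-cited engineering write-ups in product tenancy.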
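A minimal sketch of why a 32-to-96 move does not have to rehash every row: route by a fixed pool of logical shards, and only change the logical-to-physical mapping. Everything here is an assumption for illustration: the workspace ID as key, the CRC32 hash, the contiguous range mapping, and the pool of 480 logical shards (picked because it divides evenly by both 32 and 96).

```python
import uuid
import zlib

LOGICAL_SHARDS = 480  # assumption: divisible by both 32 and 96


def logical_shard(workspace_id: str) -> int:
    """Stable hash of the workspace ID onto a logical shard (hash choice is illustrative)."""
    return zlib.crc32(uuid.UUID(workspace_id).bytes) % LOGICAL_SHARDS


def physical_host(shard: int, hosts: int) -> int:
    """Assign contiguous ranges of logical shards to each physical host."""
    return shard // (LOGICAL_SHARDS // hosts)


ws = "12345678-1234-5678-1234-567812345678"  # hypothetical workspace ID
shard = logical_shard(ws)
print(shard, physical_host(shard, 32), physical_host(shard, 96))
```

Under this mapping each of the 32 old hosts splits into exactly three of the 96 new ones, so a workspace never changes logical shard and data only moves from one old host to its three successors.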
AI as service tier, on existing data
Notion AI uses RAG over your workspace blocks. The retrieval substrate is the same block store; embeddings are computed at write time, retrieved at query time. No parallel data store for AI.
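The write-time / query-time split can be sketched as follows. The `embed` function is a toy hashed bag-of-words stand-in for a real embedding model, and storing the vector on the block record itself is an assumption made to match the "no parallel store" claim. None of the names come from Notion.

```python
import math
import zlib
from typing import Dict, List, Tuple

DIM = 64  # toy vector size


def embed(text: str) -> List[float]:
    """Toy stand-in for an embedding model: hashed bag of words, L2-normalised."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        vec[zlib.crc32(word.encode()) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


blocks: Dict[str, dict] = {}


def write_block(block_id: str, text: str) -> None:
    """Write path: the embedding is computed here and stored with the block."""
    blocks[block_id] = {"text": text, "embedding": embed(text)}


def retrieve(query: str, k: int = 2) -> List[Tuple[str, float]]:
    """Query path: embed the query once, rank blocks by cosine similarity."""
    q = embed(query)
    scored = [
        (bid, sum(a * b for a, b in zip(q, rec["embedding"])))
        for bid, rec in blocks.items()
    ]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:k]


write_block("b1", "postgres sharding plan")
write_block("b2", "holiday rota")
write_block("b3", "design review notes")
print(retrieve("postgres sharding plan", k=1))
```

Here every write re-embeds synchronously, which is the simplest possible policy. A real system would more likely batch or debounce re-embedding after edits.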
Single-codebase clients
A React app rendered in Electron (desktop) and in a React Native WebView (mobile). The block tree and sync logic are shared. The choice gave Notion product velocity at the cost of native feel, the inverse of Linear's choice.
What to steal. What to avoid copying.
Steal — the patterns that compound
- Block model for composable content — if your product needs “everything is composable” (page builders, dashboard builders, editors with embedded elements). Pay the read-time complexity for write-time flexibility.
- Workspace as tenant + plan-for-shard from day one — same lesson as Linear, larger scale. Even with discipline, the re-shard took months of senior engineering.
- AI on top of existing data, not parallel to it — RAG over your own data via embeddings at write time. Do not build a parallel data store; the sync cost is brutal.
- PgBouncer + read replicas before sharding — sharding is irreversible; pooling is fixable. Defer sharding until metrics demand it.
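The last pattern in the list, pooling plus read replicas, mostly comes down to a routing decision in the application. A minimal sketch, assuming writes and in-transaction reads go to the primary and other reads rotate across replicas; the `Router` class and the hostnames are hypothetical, and 6432 is simply PgBouncer's default listen port.

```python
import itertools
from typing import List


class Router:
    """Hypothetical read/write splitter in front of pooled connections."""

    def __init__(self, primary: str, replicas: List[str]):
        self.primary = primary
        self._replicas = itertools.cycle(replicas) if replicas else None

    def target(self, sql: str, in_transaction: bool = False) -> str:
        # Naive check: treats any statement starting with SELECT as a read.
        is_read = sql.lstrip().lower().startswith("select")
        if is_read and not in_transaction and self._replicas is not None:
            return next(self._replicas)
        return self.primary


router = Router(
    primary="pgbouncer-primary.internal:6432",
    replicas=["pgbouncer-replica-1.internal:6432", "pgbouncer-replica-2.internal:6432"],
)
print(router.target("SELECT * FROM block WHERE id = 'p1'"))
print(router.target("UPDATE block SET text = 'x' WHERE id = 'p1'"))
```

The sketch ignores replica lag and statements such as `SELECT ... FOR UPDATE`, both of which need to stay on the primary in practice. The point is that this layer can be changed or removed later, which is much harder once data has been split across shards.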
Avoid copying — unless you're them
- The block model if your content is structurally fixed — forms, e-commerce listings, structured records. You'll pay the complexity without realising the flexibility benefit.
- Single-codebase clients if performance is critical — mobile-first or performance-sensitive apps need separate near-native clients. Linear's choice, not Notion's.
- Block-level ACLs if page-level is sufficient — per-block permissions are powerful but few users use them, and they're expensive to resolve.
What this teardown can't tell you
- Exact sharding key + rebalancing strategy today.
- Embedding refresh cadence and cost (edits → re-embed when?).
- Notion AI's caching + model-mix economics.
- Hot-workspace (one workspace too big for its shard) handling.
Notion is the canonical example of a single architectural decision (the block model) driving every subsequent engineering pain and every subsequent product superpower — instructive both for what to copy when your product needs structural flexibility, and for what to recognise as the cost of that flexibility.
Methodology & sources
Public signals only. Company engineering blog posts, conference talks, job ads, public GitHub, podcasts, product behaviour from the outside. No NDA-covered information; no private conversations. The architecture inferred is my analysis — not endorsed by or affiliated with Notion.
Primary sources:
- Notion engineering blog · notion.so/blog — “The great re-shard”, “Herding elephants”, “Sharding with Citus”
- Notion public job listings (database · platform · AI infra)
- Talks by Ivan Zhao + Notion engineers at Postgres conferences, QCon
- Notion public API · developers.notion.com
- Observation of block-tree via public API, page-load behaviour, AI feature behaviour.