Architecture teardown

Vercel — the edge platform that earned its primitive.

Vercel’s entire product velocity hinges on a single architectural primitive: every deploy is immutable, addressable, and atomic. Preview URLs, atomic rollback, time-travel debugging — all derive from this one decision.

Product: Vercel

Signal density: Very dense (docs · blog · changelog · Next.js Conf talks)

Stack (inferred): Anycast edge (Fastly + own) · CDN · Fluid Compute · Routing Middleware · AI Gateway · Stripe metering

Pattern: Immutable artefacts · unified runtime · framework-agnostic middleware · Active CPU pricing

The architecture — from public signals

[Diagram legend: core component (the architectural bet) · standard service · async / inferred dataflow]

The five things you can see from outside.

Immutable atomic deploys

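Push → build → artefact → permanent URL that never changes. A production alias points to exactly one artefact at a time, so rollback is a pointer update, comparable to a DNS change: repoint the alias at an earlier artefact and nothing is rebuilt. Preview URLs fall out for free, because every artefact already has its own address.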
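
A minimal sketch of that primitive, to make the "rollback is a pointer update" claim concrete. Everything here is hypothetical: `DeployRegistry`, the `dpl-…` ids and the `deploys.example.test` host are illustration only and say nothing about how Vercel implements it internally.

```typescript
// Hypothetical model of immutable, addressable deploys. Not Vercel's internals.
interface Artefact {
  readonly id: string;
  readonly url: string; // permanent: assigned once, never reassigned
  readonly files: Readonly<Record<string, string>>;
}

class DeployRegistry {
  private artefacts: Record<string, Artefact> = {};
  private aliases: Record<string, string> = {}; // alias -> artefact id

  // Publishing never overwrites: an id that already exists is returned untouched.
  publish(id: string, files: Record<string, string>): Artefact {
    const existing = this.artefacts[id];
    if (existing) return existing;
    const artefact: Artefact = Object.freeze({
      id,
      url: `https://${id}.deploys.example.test`,
      files: Object.freeze({ ...files }),
    });
    this.artefacts[id] = artefact;
    return artefact;
  }

  // Promotion and rollback are the same operation: one pointer assignment.
  point(alias: string, id: string): void {
    if (!this.artefacts[id]) throw new Error(`unknown artefact: ${id}`);
    this.aliases[alias] = id;
  }

  resolve(alias: string): Artefact {
    const id = this.aliases[alias];
    const artefact = id === undefined ? undefined : this.artefacts[id];
    if (!artefact) throw new Error(`unknown alias: ${alias}`);
    return artefact;
  }
}

const registry = new DeployRegistry();
const v1 = registry.publish("dpl-a1", { "index.html": "v1" });
const v2 = registry.publish("dpl-b2", { "index.html": "v2" });
registry.point("production", v1.id);
registry.point("production", v2.id); // promote v2
registry.point("production", v1.id); // roll back: no rebuild, v2 stays addressable
```

In this model a preview URL is simply `v2.url`: it exists whether or not any alias ever points at `v2`.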

Fluid Compute unified the runtime

The pre-2024 split (Edge Functions vs Serverless Functions) was confusing: teams had to pick fast or pick compatible. Fluid Compute gives full Node.js compatibility plus low cold-start overhead, because instances are reused across concurrent requests. Middleware now runs on Fluid Compute under the hood.
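
To see why reusing instances across concurrent requests matters, here is a rough capacity sketch. It assumes a simple model in which each instance can hold up to `perInstance` in-flight requests; that limit, the sample timings and the function names are mine, not Vercel's.

```typescript
// Hypothetical capacity model: how many instances does a burst need?
interface Invocation {
  startMs: number;
  endMs: number;
}

// Highest number of requests in flight at the same moment.
function peakConcurrency(invocations: Invocation[]): number {
  const events: Array<[number, number]> = [];
  for (const inv of invocations) {
    events.push([inv.startMs, 1]);
    events.push([inv.endMs, -1]);
  }
  // Sort by time; at the same instant, process ends (-1) before starts (+1).
  events.sort((a, b) => a[0] - b[0] || a[1] - b[1]);
  let current = 0;
  let peak = 0;
  for (const ev of events) {
    current += ev[1];
    peak = Math.max(peak, current);
  }
  return peak;
}

// Lower bound on instances if each one holds up to `perInstance` requests.
function instancesNeeded(invocations: Invocation[], perInstance: number): number {
  return Math.ceil(peakConcurrency(invocations) / perInstance);
}

const burst: Invocation[] = [
  { startMs: 0, endMs: 100 },
  { startMs: 10, endMs: 110 },
  { startMs: 20, endMs: 120 },
  { startMs: 130, endMs: 200 },
];

const oneRequestPerInstance = instancesNeeded(burst, 1); // 3 instances, so up to 3 cold starts
const sharedInstance = instancesNeeded(burst, 4); // 1 instance, at most 1 cold start
```

The point of the sketch is the ratio, not the numbers: fewer instances for the same burst means fewer cold starts to pay for.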

Routing Middleware generalised

What was “Next.js middleware” became framework-agnostic Vercel Routing Middleware. Runs for every request before framework code. Usable by SvelteKit, Nuxt, Astro, raw Node.
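
The shape of a framework-agnostic middleware layer can be sketched in a few lines. This is not the Vercel Routing Middleware API; `withMiddleware`, `requireAuth` and `betaRewrite` are hypothetical names, and the handlers are synchronous for brevity.

```typescript
// Hypothetical middleware chain that runs before any framework code.
interface IncomingRequest {
  path: string;
  headers: Record<string, string>;
}

interface OutgoingResponse {
  statusCode: number;
  body: string;
}

// A middleware either answers the request itself or passes on a (possibly rewritten) request.
type Middleware = (req: IncomingRequest) => OutgoingResponse | IncomingRequest;
type AppHandler = (req: IncomingRequest) => OutgoingResponse;

function withMiddleware(chain: Middleware[], app: AppHandler): AppHandler {
  return (req) => {
    let current = req;
    for (const mw of chain) {
      const result = mw(current);
      if ("statusCode" in result) return result; // short-circuit: the app never runs
      current = result;
    }
    return app(current);
  };
}

const requireAuth: Middleware = (req) =>
  req.headers["authorization"] ? req : { statusCode: 401, body: "unauthorized" };

const betaRewrite: Middleware = (req) =>
  req.headers["x-beta"] === "1" ? { ...req, path: "/beta" + req.path } : req;

// The "framework" here is a stand-in: any handler with this signature works.
const echoApp: AppHandler = (req) => ({ statusCode: 200, body: req.path });
const handler = withMiddleware([requireAuth, betaRewrite], echoApp);
```

Because the chain only depends on the request and response shapes, the same `requireAuth` works in front of any application, which is the property the generalisation buys.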

Active CPU pricing

Pay for CPU time when computing, not wall-clock when waiting on I/O. Rewards efficient code; reflects the Fluid Compute reality that instances stay warm across requests.
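
A quick worked comparison of the two meters. The request profile is invented (20 ms of CPU, 480 ms waiting on an upstream call), and the sketch deliberately ignores any other billing dimensions and any actual prices.

```typescript
// Hypothetical workload: compare what a wall-clock meter and an active-CPU meter would count.
interface RequestProfile {
  cpuMs: number; // time spent actually computing
  ioWaitMs: number; // time spent waiting on a database, an LLM, an upstream API
}

function billedWallClockMs(requests: RequestProfile[]): number {
  let total = 0;
  for (const r of requests) total += r.cpuMs + r.ioWaitMs;
  return total;
}

function billedActiveCpuMs(requests: RequestProfile[]): number {
  let total = 0;
  for (const r of requests) total += r.cpuMs;
  return total;
}

const proxyCalls: RequestProfile[] = [
  { cpuMs: 20, ioWaitMs: 480 },
  { cpuMs: 20, ioWaitMs: 480 },
  { cpuMs: 20, ioWaitMs: 480 },
];

const wallClock = billedWallClockMs(proxyCalls); // 1500
const activeCpu = billedActiveCpuMs(proxyCalls); // 60
```

For an I/O-heavy profile like this one the metered time differs by a factor of 25; for a CPU-bound profile the two meters converge.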

AI Gateway as a platform tier

Since 2025: a unified API across Anthropic · OpenAI · Google · OSS models, with observability, fallbacks and zero data retention. A textbook case of the build-or-buy question for an AI gateway being answered with “buy from your platform”.
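
For readers who have not built one, the fallback part of a gateway reduces to something like the sketch below. It is not the AI Gateway API: `Provider`, `completeWithFallback` and the stub providers are hypothetical, and the calls are synchronous only to keep the example short (real provider calls would be async).

```typescript
// Hypothetical fallback routing across model providers.
interface Provider {
  id: string;
  complete: (prompt: string) => string; // throws when the provider is unavailable
}

interface GatewayResult {
  providerId: string;
  text: string;
  attempts: string[]; // providers tried, in order; useful for observability
}

function completeWithFallback(providers: Provider[], prompt: string): GatewayResult {
  const attempts: string[] = [];
  let lastError: unknown = new Error("no providers configured");
  for (const p of providers) {
    attempts.push(p.id);
    try {
      return { providerId: p.id, text: p.complete(prompt), attempts };
    } catch (err) {
      lastError = err; // remember the failure and try the next provider
    }
  }
  throw lastError;
}

// Stub providers for illustration only.
const flakyPrimary: Provider = {
  id: "primary",
  complete: () => {
    throw new Error("rate limited");
  },
};
const steadySecondary: Provider = {
  id: "secondary",
  complete: (prompt) => `echo: ${prompt}`,
};

const answer = completeWithFallback([flakyPrimary, steadySecondary], "hello");
```

The loop is trivial; the operational work around it (keys, quotas, logging, retention policy) is what makes "buy" attractive.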

What to steal. What to avoid copying.

Read these together — the same pattern can be right for one team and wrong for another.

Steal — the patterns that compound

Immutable addressable artefacts as the deploy primitive — whether your platform is K8s, Cloud Run, or bare metal. Preview environments and atomic rollback come for free.
Unified runtime over “pick fast or pick compatible” — the Edge-vs-Serverless split was confusing. Consolidate, even at the cost of 6 months of platform work.
Routing middleware as a generic layer — auth, A/B routing, feature flags live there, not inside each application.
Active CPU as the meter — the cost-attribution conversation gets honest.
Configuration as typed code (vercel.ts) — logic in code; pure data in tiny YAML.
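
As an illustration of the last point, here is what "logic in code" can look like on any platform. This does not reproduce the vercel.ts schema; the types, region names and `buildConfig` function are all hypothetical.

```typescript
// Hypothetical typed platform config. Not the vercel.ts schema.
type Region = "us-east" | "eu-west";
type Environment = "preview" | "production";

interface RouteRule {
  source: string;
  destination: string;
}

interface PlatformConfig {
  regions: Region[];
  maxDurationSeconds: number;
  rewrites: RouteRule[];
}

// Logic that YAML would need templating for is an ordinary conditional here,
// and a typo in a region name fails at compile time instead of at deploy time.
function buildConfig(env: Environment): PlatformConfig {
  const prodRegions: Region[] = ["us-east", "eu-west"];
  const previewRegions: Region[] = ["us-east"];
  return {
    regions: env === "production" ? prodRegions : previewRegions,
    maxDurationSeconds: 30,
    rewrites: [{ source: "/docs/:slug", destination: "/kb/:slug" }],
  };
}

const productionConfig = buildConfig("production");
const previewConfig = buildConfig("preview");
```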

Avoid copying — unless you're them

Edge-first if your workload is database-bound — the function runs near the user but waits on a database on another continent, and every query pays that round trip. The net effect is higher latency, not lower.
Vendor lock-in via Vercel-specific primitives (KV, Blob, Cron, Queues) — great DX, deepens lock-in. Standard primitives (Postgres, S3, SQS-equivalent) port.
Serverless mental model for long-running workloads — training jobs, batch processing, websocket-heavy services need container platforms.
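
The edge-first trap in the first item is easy to check with back-of-envelope arithmetic. The round-trip times below are invented for illustration, and the model assumes the queries run sequentially.

```typescript
// Hypothetical round-trip times; the shape of the result matters, not the numbers.
interface Placement {
  userToFunctionMs: number; // round trip between user and function
  functionToDbMs: number; // round trip between function and database
}

// Total network latency for one request that issues `dbQueries` sequential queries.
function requestLatencyMs(placement: Placement, dbQueries: number): number {
  return placement.userToFunctionMs + dbQueries * placement.functionToDbMs;
}

const edgeNearUser: Placement = { userToFunctionMs: 10, functionToDbMs: 150 };
const regionNearDb: Placement = { userToFunctionMs: 150, functionToDbMs: 2 };

const edgeThreeQueries = requestLatencyMs(edgeNearUser, 3); // 10 + 3 * 150 = 460
const regionThreeQueries = requestLatencyMs(regionNearDb, 3); // 150 + 3 * 2 = 156
const edgeNoQueries = requestLatencyMs(edgeNearUser, 0); // 10
const regionNoQueries = requestLatencyMs(regionNearDb, 0); // 150
```

With zero queries the edge wins comfortably; from the first cross-continent query onwards the advantage is gone, and it gets worse with every additional query.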

Thesis

Vercel earned its position by being opinionated about the deploy primitive (immutable · addressable · atomic) and patient about the runtime question (Edge and Serverless co-existed for years before Fluid Compute consolidated them). The right model for internal platform teams to imitate is the same combination: be opinionated about the unbreakable parts; patient about the parts that need to evolve.

Methodology & sources

Public signals only: company engineering blog posts, conference talks, job ads, public GitHub, podcasts, and product behaviour observed from the outside. No NDA-covered information; no private conversations. The inferred architecture is my own analysis; it is neither endorsed by nor affiliated with Vercel.

Primary sources:

Vercel engineering blog · vercel.com/blog
Vercel documentation · vercel.com/docs (Fluid Compute · AI Gateway · Routing Middleware)
Vercel changelog · vercel.com/changelog
Next.js Conf 2023 / 2024 / 2025 talks
Vercel public job listings (platform · infrastructure)
Direct observation of build duration, cold-start, atomic rollback timing.

