Architecture teardown

Vercel — the edge platform that earned its primitive.

Vercel’s entire product velocity hinges on a single architectural primitive: every deploy is immutable, addressable, and atomic. Preview URLs, atomic rollback, time-travel debugging — all derive from this one decision.

Product: Vercel
Signal density: Very dense (docs · blog · changelog · Next.js Conf talks)
Stack (inferred): Anycast edge (Fastly + own) · CDN · Fluid Compute · Routing Middleware · AI Gateway · Stripe metering
Pattern: Immutable artefacts · unified runtime · framework-agnostic middleware · Active CPU pricing

The architecture — from public signals

[Diagram: Vercel, inferred edge platform]

Build plane: Git push → webhook (GitHub · GitLab · Bitbucket) · Build executor (ephemeral container, per deploy) · Artefact registry (immutable)
Edge plane (global): Anycast (Fastly + own; TLS · WAF) · CDN cache (static · ISR; per-deploy immutable URLs) · Fluid Compute (Node · Python · Bun; warm reuse) · Routing Middleware (framework-agnostic) · AI Gateway (provider abstraction; zero retention)
Control plane: Dashboard + CLI (project · team · domains) · Deployment API (CI/CD · vercel.ts config) · Observability + logs (per-project drains) · DNS + auto TLS (custom domains) · Stripe metering (Active CPU · bandwidth · calls)
Legend: core component (the architectural bet) · standard service · async / inferred dataflow · arrow label: deploy

The five things you can see from outside.

01

Immutable atomic deploys

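Each push triggers a build that produces an artefact with a permanent URL that never changes. A production alias points to exactly one artefact at a time, so rollback is an atomic pointer update (much like repointing a DNS record) rather than a rebuild. Preview URLs fall out for free, because every artefact is already addressable.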
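
To make the primitive concrete, here is a minimal in-memory sketch of an immutable artefact registry with an alias pointer. Everything in it (DeployRegistry, publish, promote, rollback, the deploys.example hostname) is a hypothetical illustration of the pattern, not Vercel's API or internals.

```typescript
// Hypothetical model of immutable deploys plus alias pointers.
// None of these names come from Vercel; they only illustrate the pattern.
type DeployId = string;

interface Artefact {
  readonly id: DeployId;
  readonly files: Readonly<Record<string, string>>;
}

class DeployRegistry {
  private readonly artefacts = new Map<DeployId, Artefact>();
  // Per-alias promotion history, newest last.
  private readonly aliases = new Map<string, DeployId[]>();

  // Publishing never overwrites: an existing id is rejected.
  publish(id: DeployId, files: Record<string, string>): string {
    if (this.artefacts.has(id)) {
      throw new Error(`deploy ${id} already exists`);
    }
    this.artefacts.set(id, { id, files: Object.freeze({ ...files }) });
    // Permanent per-deploy URL; this is what a preview link would be.
    return `https://${id}.deploys.example`;
  }

  // Promotion is a single pointer update; nothing is rebuilt.
  promote(alias: string, id: DeployId): void {
    if (!this.artefacts.has(id)) {
      throw new Error(`unknown deploy ${id}`);
    }
    const history = this.aliases.get(alias) ?? [];
    this.aliases.set(alias, [...history, id]);
  }

  // Rollback repoints the alias at the previously promoted artefact.
  rollback(alias: string): DeployId {
    const history = this.aliases.get(alias) ?? [];
    const previous = history[history.length - 2];
    if (previous === undefined) {
      throw new Error(`nothing to roll back to for ${alias}`);
    }
    this.aliases.set(alias, history.slice(0, -1));
    return previous;
  }

  resolve(alias: string): DeployId | undefined {
    const history = this.aliases.get(alias);
    return history?.[history.length - 1];
  }
}
```

Publishing "a1" and "b2", promoting both to a "prod" alias, then calling rollback("prod") leaves the alias on "a1" while both per-deploy URLs stay valid. Rollback touches only the pointer, which is why it can be atomic.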

02

Fluid Compute unified the runtime

Before Fluid Compute, the split between Edge Functions and Serverless Functions was confusing: pick fast or pick compatible. Fluid Compute offers full Node compatibility with fewer cold starts, because warm instances are reused across concurrent requests. Middleware now runs on Fluid Compute under the hood.
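
A toy model shows why reusing an instance across concurrent requests cuts cold starts. The routing rule and the per-instance concurrency limit are assumptions made for illustration; the real reuse heuristics are not public.

```typescript
// Hypothetical scheduler model. The routing rule and concurrency limit
// are assumptions, not Fluid Compute internals.
interface Instance {
  id: number;
  inFlight: number;
}

class InstancePool {
  private readonly instances: Instance[] = [];
  private readonly maxConcurrency: number;
  coldStarts = 0;

  constructor(maxConcurrency: number) {
    this.maxConcurrency = maxConcurrency;
  }

  // Route to a warm instance with spare capacity, else boot a new one.
  acquire(): Instance {
    let inst = this.instances.find((i) => i.inFlight < this.maxConcurrency);
    if (inst === undefined) {
      inst = { id: this.instances.length, inFlight: 0 };
      this.instances.push(inst);
      this.coldStarts += 1;
    }
    inst.inFlight += 1;
    return inst;
  }

  release(inst: Instance): void {
    inst.inFlight -= 1;
  }
}

// Count cold starts for a burst of requests that all overlap in time.
function coldStartsForBurst(burst: number, maxConcurrency: number): number {
  const pool = new InstancePool(maxConcurrency);
  const held = Array.from({ length: burst }, () => pool.acquire());
  held.forEach((inst) => pool.release(inst));
  return pool.coldStarts;
}
```

With one request per instance, a burst of 10 overlapping requests needs 10 cold starts in this model; with a limit of 4 per instance it needs 3.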

03

Routing Middleware generalised

What was “Next.js middleware” became framework-agnostic Vercel Routing Middleware. Runs for every request before framework code. Usable by SvelteKit, Nuxt, Astro, raw Node.
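
The "before framework code" idea can be sketched as a chain of functions wrapped around an arbitrary handler. The Req, Res, Middleware, and withMiddleware names are hypothetical and deliberately minimal; this is not the Vercel Routing Middleware API.

```typescript
// Hypothetical types; this is not the Vercel Routing Middleware API.
interface Req {
  path: string;
  headers: Record<string, string>;
}

interface Res {
  status: number;
  body: string;
}

type Handler = (req: Req) => Res;
// A middleware either short-circuits with a response or passes a
// (possibly rewritten) request to the next step.
type Middleware = (req: Req) => Res | Req;

function isRes(x: Res | Req): x is Res {
  return "status" in x;
}

function withMiddleware(chain: Middleware[], framework: Handler): Handler {
  return (req) => {
    let current = req;
    for (const mw of chain) {
      const out = mw(current);
      if (isRes(out)) return out; // short-circuit before framework code
      current = out;
    }
    return framework(current);
  };
}

// Auth check that lives outside the application.
const requireAuth: Middleware = (req) =>
  req.path.startsWith("/admin") && !req.headers["authorization"]
    ? { status: 401, body: "unauthorised" }
    : req;

// A/B style rewrite driven by a request header.
const betaRewrite: Middleware = (req) =>
  req.headers["x-beta"] === "1" ? { ...req, path: `/beta${req.path}` } : req;

// Stand-in for any framework (SvelteKit, Nuxt, Astro, raw Node).
const app: Handler = (req) => ({ status: 200, body: `rendered ${req.path}` });

const handle = withMiddleware([requireAuth, betaRewrite], app);
```

Because the chain only depends on the request and response shapes, the same auth and rewrite rules apply whichever framework sits behind it.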

04

Active CPU pricing

Pay for CPU time when computing, not wall-clock when waiting on I/O. Rewards efficient code; reflects the Fluid Compute reality that instances stay warm across requests.
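
A small worked calculation shows how far the two meters can diverge for an I/O-bound workload. The workload figures are invented for illustration, and the sketch ignores any memory, invocation, or bandwidth charges; it is not Vercel's price list.

```typescript
// Illustrative figures only; not measurements and not Vercel's rates.
interface Workload {
  requests: number;
  wallMsPerRequest: number; // total time the request is open
  cpuMsPerRequest: number;  // time actually spent computing
}

const MS_PER_HOUR = 3_600_000;

// Hours billed if the meter is wall-clock duration.
function wallClockHours(w: Workload): number {
  return (w.requests * w.wallMsPerRequest) / MS_PER_HOUR;
}

// Hours billed if the meter is active CPU time.
function activeCpuHours(w: Workload): number {
  return (w.requests * w.cpuMsPerRequest) / MS_PER_HOUR;
}

// A handler that mostly waits on a database or an upstream API.
const ioBound: Workload = {
  requests: 1_000_000,
  wallMsPerRequest: 500,
  cpuMsPerRequest: 20,
};

// How many times larger the wall-clock bill would be at the same hourly rate.
const ratio = wallClockHours(ioBound) / activeCpuHours(ioBound);
```

For this assumed workload the wall-clock meter bills about 139 hours against about 5.6 hours of active CPU, a factor of 25 at the same hourly rate. The gap shrinks towards 1 as a handler becomes CPU-bound.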

05

AI Gateway as a platform tier

Since 2025: a unified API across Anthropic · OpenAI · Google · OSS models, with observability, fallbacks, and zero data retention. A textbook case of answering the AI gateway question with “buy from your platform”.
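
The fallback behaviour a gateway provides can be sketched as trying providers in order behind one interface. Provider, GatewayResult, and completeWithFallback are hypothetical names, and the two stub providers stand in for real SDK calls; this is not the AI Gateway API.

```typescript
// Hypothetical provider abstraction; not the Vercel AI Gateway API.
interface Provider {
  name: string;
  complete(prompt: string): Promise<string>;
}

interface GatewayResult {
  provider: string;   // which provider answered
  text: string;
  attempts: string[]; // provider names tried, in order (no prompt content kept)
}

// Try each provider in order and return the first successful completion.
async function completeWithFallback(
  providers: Provider[],
  prompt: string,
): Promise<GatewayResult> {
  const attempts: string[] = [];
  for (const provider of providers) {
    attempts.push(provider.name);
    try {
      const text = await provider.complete(prompt);
      return { provider: provider.name, text, attempts };
    } catch {
      // Provider failed; fall through to the next one.
    }
  }
  throw new Error(`all providers failed: ${attempts.join(", ")}`);
}

// Stub providers for illustration: one that is down, one that responds.
const flaky: Provider = {
  name: "primary",
  complete: async () => {
    throw new Error("503 from primary");
  },
};

const healthy: Provider = {
  name: "secondary",
  complete: async (prompt) => `echo: ${prompt}`,
};
```

Calling completeWithFallback([flaky, healthy], "hi") resolves with the secondary provider's answer and records both names in attempts. The caller never sees which SDK was behind the response, which is the point of buying this layer rather than building it per application.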

What to steal. What to avoid copying.

Read these together — the same pattern can be right for one team and wrong for another.

Steal — the patterns that compound

  • Immutable addressable artefacts as the deploy primitive — whether your platform is K8s, Cloud Run, or bare metal. Preview environments and atomic rollback come for free.
  • Unified runtime over “pick fast or pick compatible” — the Edge-vs-Serverless split was confusing. Consolidate, even at the cost of 6 months of platform work.
  • Routing middleware as a generic layer — auth, A/B routing, feature flags live there, not inside each application.
  • Active CPU as the meter — the cost-attribution conversation gets honest.
  • Configuration as typed code (vercel.ts) — logic in code; pure data in tiny YAML.

Avoid copying — unless you're them

  • Edge-first if your workload is database-bound — the function runs near the user but waits on a DB on a different continent. The net effect is higher latency, not lower.
  • Vendor lock-in via Vercel-specific primitives (KV, Blob, Cron, Queues) — great DX, deepens lock-in. Standard primitives (Postgres, S3, SQS-equivalent) port.
  • Serverless mental model for long-running workloads — training jobs, batch processing, websocket-heavy services need container platforms.
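
The edge-first warning in the first bullet above comes down to simple arithmetic. The sketch below assumes one user round trip plus one database round trip per sequential query; all round-trip times are invented for illustration, not measurements.

```typescript
// All round-trip times are illustrative assumptions, not measurements.
interface Placement {
  userToFunctionMs: number; // round trip between user and function
  functionToDbMs: number;   // round trip between function and database
}

// One user round trip plus one DB round trip per sequential query.
function requestLatencyMs(p: Placement, dbQueries: number): number {
  return p.userToFunctionMs + dbQueries * p.functionToDbMs;
}

// Function at the edge, database on another continent.
const edgeNearUser: Placement = { userToFunctionMs: 10, functionToDbMs: 150 };

// Function in the database's region, user far away.
const regionNearDb: Placement = { userToFunctionMs: 150, functionToDbMs: 2 };

const edgeWithThreeQueries = requestLatencyMs(edgeNearUser, 3);   // 10 + 3 * 150
const regionWithThreeQueries = requestLatencyMs(regionNearDb, 3); // 150 + 3 * 2
```

Under these assumptions a request that makes three sequential queries takes 460 ms from the edge and 156 ms from the database's region. The edge placement only wins when the request needs no database round trips at all.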

What this teardown can't tell you

  • Exact own-infra vs Fastly vs other split at the edge.
  • Build-cache architecture (cache key, hit rate, storage).
  • AI Gateway provider mix, semantic-cache rate, zero-retention economics.
  • Cold-start instance-reuse heuristics (Fluid Compute internals).

Thesis

Vercel earned its position by being opinionated about the deploy primitive (immutable · addressable · atomic) and patient about the runtime question (Edge and Serverless co-existed for years before Fluid Compute consolidated them). The right model for internal platform teams to imitate is the same combination: be opinionated about the unbreakable parts; patient about the parts that need to evolve.

Methodology & sources

Public signals only. Company engineering blog posts, conference talks, job ads, public GitHub, podcasts, product behaviour from the outside. No NDA-covered information; no private conversations. The architecture inferred is my analysis — not endorsed by or affiliated with Vercel.

Primary sources:

  • Vercel engineering blog · vercel.com/blog
  • Vercel documentation · vercel.com/docs (Fluid Compute · AI Gateway · Routing Middleware)
  • Vercel changelog · vercel.com/changelog
  • Next.js Conf 2023 / 2024 / 2025 talks
  • Vercel public job listings (platform · infrastructure)
  • Direct observation of build duration, cold-start, atomic rollback timing.
