GenAI Cost-Per-Outcome Calculator

Tune model choice, tokens per call, cache hit rate, monthly volume and success rate, and watch cost-per-resolved-task and annualised spend update live. Vendor pricing is as of 2026-05, sourced from the model providers’ published pricing pages (Anthropic, OpenAI, Bedrock, Vertex). The approach is aligned to the FinOps for AI cost-per-outcome framing.

Inputs

Model: claude-sonnet-4-6
Prices are per 1M tokens (input / output) and reflect public list pricing as of 2026-05.
Input tokens per call: 2,400
Includes the system prompt, conversation history and retrieved RAG context. Typical agent call: 8k-15k.
Output tokens per call: 350
Typical chat: 200-500. Structured extraction: 50-200. Long generation: 1,500+.
Prompt-cache hit rate: 60%
Anthropic, Bedrock and OpenAI all bill cached input at ~10% of the standard input price. RAG with a shared context tends to give a high hit rate.
Calls per month: 500,000
Success rate (eval-graded): 78%
Percentage of calls that produce an acceptable outcome (per your eval set). A lower rate means more retries, which inflate cost-per-outcome.

How the calculator works

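For every call: cost = input×(1−hit)×in_price + input×hit×cached_price + output×out_price, where each price is per token (the per-1M list price divided by 1,000,000). Cost-per-resolved-task divides that per-call cost by your eval-graded success rate: a call that doesn’t resolve still costs full tokens, and the retry costs them again. Monthly = per-call × volume; annualised = monthly × 12.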
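
The formula above can be sketched in Python as follows. This is a minimal illustration, not the calculator's actual code: the class and function names are invented for this example, and the prices used (3.00 and 15.00 USD per 1M tokens) are placeholder values rather than quoted list prices. The cached price is assumed to be 10% of the input price, following the approximation stated on this page.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Scenario:
    """Hypothetical container for the calculator inputs."""
    input_tokens: int         # input tokens per call
    output_tokens: int        # output tokens per call
    cache_hit_rate: float     # share of input tokens served from cache, 0..1
    calls_per_month: int
    success_rate: float       # eval-graded share of calls that resolve, 0..1
    in_price_per_m: float     # USD per 1M input tokens (placeholder value)
    out_price_per_m: float    # USD per 1M output tokens (placeholder value)
    cached_ratio: float = 0.10  # assumption: cached input costs ~10% of input


def cost_per_call(s: Scenario) -> float:
    """input*(1-hit)*in_price + input*hit*cached_price + output*out_price."""
    in_price = s.in_price_per_m / TOKENS_PER_MILLION
    out_price = s.out_price_per_m / TOKENS_PER_MILLION
    cached_price = in_price * s.cached_ratio
    uncached_cost = s.input_tokens * (1 - s.cache_hit_rate) * in_price
    cached_cost = s.input_tokens * s.cache_hit_rate * cached_price
    output_cost = s.output_tokens * out_price
    return uncached_cost + cached_cost + output_cost


def cost_per_resolved_task(s: Scenario) -> float:
    """Per-call cost divided by the success rate, so failed calls are paid for."""
    if not 0 < s.success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_call(s) / s.success_rate


def monthly_spend(s: Scenario) -> float:
    return cost_per_call(s) * s.calls_per_month


def annualised_spend(s: Scenario) -> float:
    return monthly_spend(s) * 12


if __name__ == "__main__":
    # Input values from this page; the two prices are placeholders.
    example = Scenario(
        input_tokens=2_400,
        output_tokens=350,
        cache_hit_rate=0.60,
        calls_per_month=500_000,
        success_rate=0.78,
        in_price_per_m=3.00,
        out_price_per_m=15.00,
    )
    print(f"per call:          ${cost_per_call(example):.6f}")
    print(f"per resolved task: ${cost_per_resolved_task(example):.6f}")
    print(f"monthly:           ${monthly_spend(example):,.2f}")
    print(f"annualised:        ${annualised_spend(example):,.2f}")
```

With those placeholder prices, the per-call cost comes to about $0.0086 and the cost per resolved task to about $0.0110, since a 78% success rate inflates each resolved outcome by a factor of 1/0.78. Substituting real list or contract prices changes the figures but not the shape of the calculation.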

Cached input is billed at ~10% of standard input across Anthropic, Bedrock, OpenAI and Vertex prompt caching. A high cache-hit rate (common in RAG with shared system+corpus context) cuts cost sharply: at a 60% hit rate, the input portion of the bill falls to roughly 46% of its uncached level. This is the largest free lever most teams haven’t pulled.
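
To get a feel for how much the cache-hit rate matters, the short Python sketch below computes what fraction of the uncached input bill remains at a few hit rates. It assumes the ~10% cached-input ratio quoted above, and the function name is hypothetical. It covers input tokens only: output tokens are unaffected by caching, and any cache-write surcharge a provider may apply is ignored, as it is in the formula on this page.

```python
def input_cost_multiplier(hit_rate: float, cached_ratio: float = 0.10) -> float:
    """Fraction of the uncached input bill still paid at a given cache-hit rate.

    cached_ratio is an assumption: cached input priced at ~10% of standard input.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Uncached share pays full price; cached share pays the discounted price.
    return (1.0 - hit_rate) + hit_rate * cached_ratio


for hit in (0.0, 0.3, 0.6, 0.9):
    remaining = input_cost_multiplier(hit)
    print(f"hit rate {hit:.0%}: input spend at {remaining:.0%} of uncached")
```

Under these assumptions the input bill drops to 73%, 46% and 19% of its uncached level at hit rates of 30%, 60% and 90% respectively.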

Pricing is approximate and rounded, and reflects public list prices as of 2026-05. Enterprise contracts, batch APIs, regional pricing and reserved-capacity discounts shift these materially. Use this for shape, not for procurement.
