GenAI Cost‑Per‑Outcome Calculator.
Tune model choice, tokens per call, cache hit rate, monthly volume and success rate — see cost-per-resolved-task and annualised spend update live. Vendor pricing as of 2026-05, sourced from the model providers’ published pages (Anthropic, OpenAI, Bedrock, Vertex). Aligned to the FinOps for AI cost-per-outcome framing.
Inputs
How the calculator works
For every call: cost = input×(1−hit)×in_price + input×hit×cached_price + output×out_price.
Cost-per-resolved-task divides by your eval-graded success rate — a call that doesn’t resolve still costs full tokens, plus the retry. Monthly = per-call × volume.
Cached input is billed at ~10% of standard input across Anthropic, Bedrock, OpenAI and Vertex prompt caching. High cache-hit rate (common in RAG with shared system+corpus context) drops cost dramatically — this is the largest free lever most teams haven’t pulled.
Pricing is approximate, rounded, and reflects public list-price as of 2026-05. Enterprise contracts, batch APIs, regional pricing and reserved-capacity discounts shift these materially. Use this for shape, not for procurement.