0/2 prices verified

Cerebras API Pricing — April 2026

Cerebras delivers the fastest token generation in the industry using wafer-scale silicon. It is the best fit for real-time applications where output speed is the primary constraint.

Model Pricing

Model	Input ($/M tokens)	Output ($/M tokens)	Context	Use Cases	Status
Llama 3.3 70B	$0.85	$1.20	8K	General-purpose production workloads	Unverified
Llama 3.1 8B	$0.10	$0.10	8K	General-purpose production workloads	Unverified
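
Because input and output tokens are billed at separate per-million rates, the cost of a workload is a weighted sum of the two token counts. The sketch below shows that arithmetic using the rates from the table above, which are listed as unverified, so treat the results as estimates. The `PRICES` mapping and the `estimate_cost` helper are hypothetical names used for illustration and are not part of any Cerebras SDK.

```python
from decimal import Decimal

# USD per 1M tokens as (input_rate, output_rate), copied from the table above.
# These rates are marked unverified; confirm them before relying on the output.
PRICES = {
    "Llama 3.3 70B": (Decimal("0.85"), Decimal("1.20")),
    "Llama 3.1 8B": (Decimal("0.10"), Decimal("0.10")),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of the given token counts for one model."""
    input_rate, output_rate = PRICES[model]
    million = Decimal(1_000_000)
    # Each rate applies per million tokens, so scale the weighted sum down.
    return (input_tokens * input_rate + output_tokens * output_rate) / million
```

For example, `estimate_cost("Llama 3.3 70B", 2_000, 500)` works out to $0.0023 at the listed rates: $0.0017 for input plus $0.0006 for output. `Decimal` is used instead of floats so that small per-request amounts add up without rounding drift.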

How Cerebras compares to other providers

OpenAI Pricing
Anthropic Pricing
Google Pricing
Mistral Pricing
Meta (via Together.ai) Pricing
Cohere Pricing
xAI Pricing
DeepSeek Pricing
Together.ai Pricing
Fireworks.ai Pricing
Groq Pricing
Amazon Bedrock Pricing
NVIDIA NIM Pricing

Running Cerebras in production? Clawback audits your spend and shows you exactly where you can cut costs without sacrificing quality.
