# Fireworks.ai API Pricing — April 2026
Fireworks.ai focuses on low-latency open-model hosting, offering fast inference for Llama and other open-weight models at competitive token prices.
## Model Pricing
| Model | Input ($/M tokens) | Output ($/M tokens) | Context | Use Cases |
|---|---|---|---|---|
| Llama 3.3 70B Instruct | $0.90 | $0.90 | 128K | General-purpose production workloads |
| DeepSeek R1 (unverified) | $3.00 | $8.00 | 128K | Complex reasoning, math, code |
| Qwen 2.5 72B Instruct (unverified) | $0.90 | $0.90 | 32K | General-purpose production workloads |
| Llama 3.1 8B Instruct | $0.10 | $0.10 | 128K | General-purpose production workloads |
| Mixtral 8x22B Instruct (unverified) | $0.90 | $0.90 | 65K | General-purpose production workloads |
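Per-request cost follows directly from the table: multiply input and output token counts by the corresponding per-million-token price. A minimal sketch of that arithmetic, using prices from the table above (the dictionary keys are illustrative labels, not necessarily Fireworks API model identifiers):

```python
# Per-million-token prices (USD) from the table above.
# Keys are illustrative labels, not official Fireworks model IDs.
PRICES = {
    "llama-3.3-70b-instruct": (0.90, 0.90),  # (input $/M, output $/M)
    "deepseek-r1": (3.00, 8.00),
    "llama-3.1-8b-instruct": (0.10, 0.10),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request: tokens x price / 1M."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Example: a 2,000-token prompt with an 800-token completion on DeepSeek R1
cost = request_cost("deepseek-r1", 2_000, 800)
# 2,000 * $3.00/M = $0.0060 in; 800 * $8.00/M = $0.0064 out; total $0.0124
print(f"${cost:.4f}")
```

Note the asymmetry for reasoning models: DeepSeek R1's output tokens cost more than 2.5x its input tokens, so long chains of thought dominate the bill.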
## How Fireworks.ai compares to other providers
- OpenAI Pricing
- Anthropic Pricing
- Google Pricing
- Mistral Pricing
- Meta (via Together.ai) Pricing
- Cohere Pricing
- xAI Pricing
- DeepSeek Pricing
- Together.ai Pricing
- Groq Pricing
- Amazon Bedrock Pricing
- Cerebras Pricing
- NVIDIA NIM Pricing
Running Fireworks.ai in production? Clawback audits your spend and shows you exactly where you can cut costs without sacrificing quality.
Audit your Fireworks.ai spend →