Last verified: April 13, 2026 (2 of 5 prices verified)

Fireworks.ai API Pricing — April 2026

Fireworks.ai focuses on low-latency open-model hosting, offering fast inference for Llama and other open-weight models at competitive token prices.

Model Pricing

Model                                Input ($/M tokens)  Output ($/M tokens)  Context  Use Cases
Llama 3.3 70B Instruct               $0.90               $0.90                128K     General-purpose production workloads
DeepSeek R1 (unverified)             $3.00               $8.00                128K     Complex reasoning, math, code
Qwen 2.5 72B Instruct (unverified)   $0.90               $0.90                32K      General-purpose production workloads
Llama 3.1 8B Instruct                $0.10               $0.10                128K     General-purpose production workloads
Mixtral 8x22B Instruct (unverified)  $0.90               $0.90                65K      General-purpose production workloads
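
To turn the per-million-token rates above into a bill estimate, multiply each token count by its rate and divide by one million. The Python sketch below shows that arithmetic. The dictionary keys are display names copied from the table, not official API model identifiers, and the function name estimate_cost is a hypothetical helper, not part of any Fireworks.ai SDK. Three of the five rates are marked unverified, so treat the results as rough estimates.

```python
# Rates in USD per million tokens, copied from the pricing table.
# Keys are display names for illustration, not official model IDs.
PRICES_PER_MILLION = {
    "Llama 3.3 70B Instruct": (0.90, 0.90),
    "DeepSeek R1": (3.00, 8.00),              # unverified
    "Qwen 2.5 72B Instruct": (0.90, 0.90),    # unverified
    "Llama 3.1 8B Instruct": (0.10, 0.10),
    "Mixtral 8x22B Instruct": (0.90, 0.90),   # unverified
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one workload (hypothetical helper)."""
    input_rate, output_rate = PRICES_PER_MILLION[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


if __name__ == "__main__":
    # Example workload: 2M input tokens and 0.5M output tokens.
    for name in PRICES_PER_MILLION:
        print(f"{name}: ${estimate_cost(name, 2_000_000, 500_000):.2f}")
```

For the example workload of 2M input and 0.5M output tokens, DeepSeek R1 comes to 2 × $3.00 + 0.5 × $8.00 = $10.00, while Llama 3.1 8B Instruct comes to $0.25 at the listed rates.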

Running Fireworks.ai in production? Clawback audits your spend and shows you exactly where you can cut costs without sacrificing quality.