Last verified: April 13, 2026 (4 of 7 prices verified)

Together.ai API Pricing — April 2026

Together.ai provides serverless inference for major open-weight models. Pricing is per million tokens, with separate input and output rates, and is aimed at making large-scale open-weight deployment economically viable.

Model Pricing

| Model | Input ($/M tokens) | Output ($/M tokens) | Context | Use Cases |
| --- | --- | --- | --- | --- |
| Llama 3.3 70B Instruct Turbo | $0.88 | $0.88 | 128K | General-purpose production workloads |
| Mixtral 8x7B Instruct | $0.60 | $0.60 | 32K | General-purpose production workloads |
| Qwen 2.5 72B Instruct Turbo (unverified) | $1.20 | $1.20 | 32K | General-purpose production workloads |
| Llama 3.1 8B Instruct Turbo | $0.18 | $0.18 | 128K | General-purpose production workloads |
| Llama 3.1 405B Instruct Turbo | $3.50 | $3.50 | 128K | General-purpose production workloads |
| Gemma 2 27B (unverified) | $0.80 | $0.80 | 8K | General-purpose production workloads |
| DeepSeek R1 (Together) (unverified) | $3.00 | $7.00 | 64K | Complex reasoning, math, code |
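
To turn the per-million-token rates above into a spend estimate, multiply each token count by its rate and divide by one million. The following is a minimal sketch, not an official calculator: the `PRICES` dictionary and the `estimate_cost` helper are hypothetical names introduced here, the rates are copied from the table above (including the three rows marked unverified), and the token volumes in the usage example are made up for illustration.

```python
# Rates in USD per million tokens as (input, output), copied from the table above.
# Rows marked "unverified" in the table are included as-is and may be out of date.
PRICES = {
    "Llama 3.3 70B Instruct Turbo": (0.88, 0.88),
    "Mixtral 8x7B Instruct": (0.60, 0.60),
    "Qwen 2.5 72B Instruct Turbo": (1.20, 1.20),   # unverified
    "Llama 3.1 8B Instruct Turbo": (0.18, 0.18),
    "Llama 3.1 405B Instruct Turbo": (3.50, 3.50),
    "Gemma 2 27B": (0.80, 0.80),                   # unverified
    "DeepSeek R1 (Together)": (3.00, 7.00),        # unverified
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for the given token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_rate, output_rate = PRICES[model]  # raises KeyError for unknown models
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Hypothetical workload: 2M input tokens and 1M output tokens.
for name in ("Llama 3.1 8B Instruct Turbo", "DeepSeek R1 (Together)"):
    print(f"{name}: ${estimate_cost(name, 2_000_000, 1_000_000):.2f}")
```

Keeping input and output rates separate matters only for models that price them differently. In this table that is DeepSeek R1, where output tokens cost more than twice as much as input tokens, so output-heavy reasoning workloads will cost noticeably more than the input rate alone suggests.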

