Together.ai API Pricing — April 2026
Together.ai provides serverless inference for major open-weight models, with pricing that makes large-scale open-source deployment economically viable.
Model Pricing
| Model | Input ($/M tokens) | Output ($/M tokens) | Context | Use Cases |
|---|---|---|---|---|
| Llama 3.3 70B Instruct Turbo | $0.88 | $0.88 | 128K | General-purpose production workloads |
| Mixtral 8x7B Instruct | $0.60 | $0.60 | 32K | General-purpose production workloads |
| Qwen 2.5 72B Instruct Turbo unverified | $1.20 | $1.20 | 32K | General-purpose production workloads |
| Llama 3.1 8B Instruct Turbo | $0.18 | $0.18 | 128K | General-purpose production workloads |
| Llama 3.1 405B Instruct Turbo | $3.50 | $3.50 | 128K | General-purpose production workloads |
| Gemma 2 27B unverified | $0.80 | $0.80 | 8K | General-purpose production workloads |
| DeepSeek R1 (Together) unverified | $3.00 | $7.00 | 64K | Complex reasoning, math, code |
How Together.ai compares to other providers
OpenAI PricingAnthropic PricingGoogle PricingMistral PricingMeta (via Together.ai) PricingCohere PricingxAI PricingDeepSeek PricingFireworks.ai PricingGroq PricingAmazon Bedrock PricingCerebras PricingNVIDIA NIM Pricing
Running Together.ai in production? Clawback audits your spend and shows you exactly where you can cut costs without sacrificing quality.
Audit your Together.ai spend →