Llama 3.1 405B Instruct Turbo Pricing — $3.50/M input, $3.50/M output
Together.ai / April 2026
Input: $3.50/M tokens
Output: $3.50/M tokens
Context Window: 128K tokens
Largest open-weight Llama model
Typical use cases
General-purpose inference, chat, extraction, structured output
Estimated monthly cost at scale
Assumes a 30-day month and a 50/50 input/output token split at the stated daily volume.
| Daily Volume | Monthly Tokens | Estimated Monthly Cost |
|---|---|---|
| 1M tokens/day | 30M tokens | $105.00 |
| 5M tokens/day | 150M tokens | $525.00 |
| 10M tokens/day | 300M tokens | $1,050.00 |
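The figures in the table follow from simple arithmetic. Here is a minimal sketch, assuming a 30-day month and the per-million-token prices listed above. The function name `monthly_cost` and its parameters are illustrative only and are not part of any Together.ai API.

```python
def monthly_cost(daily_tokens, input_price_per_m=3.50, output_price_per_m=3.50,
                 input_share=0.5, days=30):
    """Estimate monthly spend in USD for a given daily token volume.

    Prices are USD per million tokens; input_share is the fraction of
    tokens billed as input (0.5 matches the 50/50 assumption above).
    """
    monthly_tokens = daily_tokens * days
    input_tokens = monthly_tokens * input_share
    output_tokens = monthly_tokens - input_tokens
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


for daily in (1_000_000, 5_000_000, 10_000_000):
    print(f"{daily:,} tokens/day -> ${monthly_cost(daily):,.2f}")
```

Because input and output are priced identically for this model, changing `input_share` does not change the total. The split only matters for models with asymmetric pricing.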
Compared with other Together.ai models
| Model | Input ($/M) | Output ($/M) | Context |
|---|---|---|---|
| Llama 3.3 70B Instruct Turbo | $0.88 | $0.88 | 128K |
| Mixtral 8x7B Instruct | $0.60 | $0.60 | 32K |
| Qwen 2.5 72B Instruct Turbo | $1.20 | $1.20 | 32K |
| Llama 3.1 8B Instruct Turbo | $0.18 | $0.18 | 128K |
| Gemma 2 27B | $0.80 | $0.80 | 8K |
| DeepSeek R1 (Together) | $3.00 | $7.00 | 64K |
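To put the per-token prices on the same footing as the monthly estimates, the sketch below applies the same assumptions (30-day month, 50/50 input/output split) to every model in the table. The prices are copied from this page. The names `PRICES`, `monthly_cost`, and `compare` are hypothetical helpers, not a Together.ai interface.

```python
# USD per million tokens as (input, output), copied from the tables on this page.
PRICES = {
    "Llama 3.1 405B Instruct Turbo": (3.50, 3.50),
    "Llama 3.3 70B Instruct Turbo": (0.88, 0.88),
    "Mixtral 8x7B Instruct": (0.60, 0.60),
    "Qwen 2.5 72B Instruct Turbo": (1.20, 1.20),
    "Llama 3.1 8B Instruct Turbo": (0.18, 0.18),
    "Gemma 2 27B": (0.80, 0.80),
    "DeepSeek R1 (Together)": (3.00, 7.00),
}


def monthly_cost(daily_tokens, input_price, output_price, input_share=0.5, days=30):
    """Monthly spend in USD, assuming a fixed input/output token split."""
    monthly_tokens = daily_tokens * days
    input_tokens = monthly_tokens * input_share
    output_tokens = monthly_tokens - input_tokens
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def compare(daily_tokens, baseline="Llama 3.1 405B Instruct Turbo"):
    """Return (model, monthly cost, cost relative to baseline), cheapest first."""
    base = monthly_cost(daily_tokens, *PRICES[baseline])
    rows = []
    for model, (input_price, output_price) in PRICES.items():
        cost = monthly_cost(daily_tokens, input_price, output_price)
        rows.append((model, cost, cost / base))
    return sorted(rows, key=lambda row: row[1])


for model, cost, ratio in compare(10_000_000):
    print(f"{model:32s} ${cost:>9,.2f}  ({ratio:.0%} of baseline)")
```

Under these assumptions, at 10M tokens/day DeepSeek R1 works out to $1,500.00 per month, which is more than the $1,050.00 for Llama 3.1 405B Instruct Turbo despite its lower input price, because its output tokens cost $7.00/M. Price alone says nothing about output quality, so a comparison like this is only a starting point.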
Not sure if Llama 3.1 405B Instruct Turbo is the right fit for your workload? Clawback tests cheaper alternatives against your actual prompts and tells you exactly where you're overpaying.