0/3 prices verified

NVIDIA NIM API Pricing — April 2026

NVIDIA NIM provides optimized inference for major open-weight models running on NVIDIA infrastructure, with consistent latency and enterprise SLA options.

Model Pricing

Model                 Status       Input ($/M tokens)   Output ($/M tokens)   Context (tokens)   Use Cases
Llama 3.1 70B         unverified   $0.35                $0.40                 128K               General-purpose production workloads
Llama 3.1 8B          unverified   $0.05                $0.05                 128K               General-purpose production workloads
Mistral 7B Instruct   unverified   $0.04                $0.04                 32K                General-purpose production workloads
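
Because prices are quoted per million tokens, a request's cost is (input tokens × input price + output tokens × output price) / 1,000,000. The following is a minimal sketch of that arithmetic in Python, using the prices listed above, which this page marks as unverified. The names PRICES and estimate_cost are hypothetical helpers for illustration and are not part of any NVIDIA API; the token volumes in the example are made up.

```python
# Prices in USD per million tokens, copied from the table above.
# The page marks all three as unverified, so treat results as rough estimates.
PRICES = {
    "Llama 3.1 70B": (0.35, 0.40),
    "Llama 3.1 8B": (0.05, 0.05),
    "Mistral 7B Instruct": (0.04, 0.04),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Return the estimated cost in USD for the given token counts."""
    input_price, output_price = PRICES[model]
    total = input_tokens * input_price + output_tokens * output_price
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical monthly volume: 10M input tokens and 2M output tokens.
    for model in PRICES:
        cost = estimate_cost(model, 10_000_000, 2_000_000)
        print(f"{model}: ${cost:.2f}")
```

Keeping input and output prices separate matters mainly for Llama 3.1 70B, the only listed model whose output price differs from its input price.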

How NVIDIA NIM compares to other providers

Running NVIDIA NIM in production? Clawback audits your spend and shows you exactly where you can cut costs without sacrificing quality.
