The cheapest LLM APIs, right now.
Verified token pricing from every major provider — sorted, filtered, and visualized. No fabricated numbers: every price links back to its official source.
| Model | Provider | Input $/Mtok | Output $/Mtok | Blended | Relative cost | Context | Notes |
|---|---|---|---|---|---|---|---|
| GLM-4.7-Flash | Zhipu | $0.000 | $0.000 | $0.000 | | 128K | Open |
| Rerank 3.5 | Cohere | $0.020 | $0.020 | $0.020 | | — | |
| text-embedding-3-small | OpenAI | $0.020 | $0.020 | $0.020 | | 8K | |
| rerank-2.5-lite | Voyage AI | $0.020 | $0.020 | $0.020 | | — | |
| voyage-4-lite | Voyage AI | $0.020 | $0.020 | $0.020 | | — | |
| text-embedding-004 | Google | $0.025 | $0.025 | $0.025 | | 2K | |
| GLM-OCR | Zhipu | $0.030 | $0.030 | $0.030 | | 128K | Open |
| rerank-2.5 | Voyage AI | $0.050 | $0.050 | $0.050 | | — | |
| voyage-4 | Voyage AI | $0.060 | $0.060 | $0.060 | | — | |
| Granite 4.0 Micro | IBM | $0.017 | $0.112 | $0.065 | | 128K | Open |
| Llama 3.1 8B | Meta | $0.050 | $0.080 | $0.065 | | 128K | Open |
| Baichuan M2-32B | Baichuan | $0.070 | $0.070 | $0.070 | | 33K | Open |
| LFM2 24B A2B | Together | $0.030 | $0.120 | $0.075 | | 128K | Open |
| Nova Micro | Amazon | $0.035 | $0.140 | $0.088 | | 128K | |
| Ministral 3 3B | Mistral | $0.100 | $0.100 | $0.100 | | 128K | Open |
| Pixtral 12B | Mistral | $0.100 | $0.100 | $0.100 | | 128K | Open |
| Nemotron 70B Instruct | NVIDIA | $0.100 | $0.100 | $0.100 | | 128K | Open |
| Reka Edge | Reka | $0.100 | $0.100 | $0.100 | | 66K | |
| GLM-4-32B-0414 | Zhipu | $0.100 | $0.100 | $0.100 | | 128K | Open |
| Granite Embedding 278M Multilingual | IBM | $0.106 | $0.106 | $0.106 | | — | Open |
| Embed 4 | Cohere | $0.120 | $0.120 | $0.120 | | — | |
| voyage-4-large | Voyage AI | $0.120 | $0.120 | $0.120 | | — | |
| voyage-multimodal-3.5 | Voyage AI | $0.120 | $0.120 | $0.120 | | — | |
| Qwen-Turbo | Alibaba | $0.050 | $0.200 | $0.125 | | 1M | |
| text-embedding-3-large | OpenAI | $0.130 | $0.130 | $0.130 | | 8K | |
| Mistral Small 3.2 24B | Mistral | $0.080 | $0.200 | $0.140 | | 128K | Open |
| Nova Lite | Amazon | $0.060 | $0.240 | $0.150 | | 300K | |
| Granite 4 H Small | IBM | $0.060 | $0.250 | $0.155 | | 128K | Open |
| voyage-code-3 | Voyage AI | $0.180 | $0.180 | $0.180 | | 32K | |
| voyage-context-3 | Voyage AI | $0.180 | $0.180 | $0.180 | | 32K | |
| GPT OSS 20B | Fireworks | $0.070 (cached $0.035) | $0.300 | $0.185 | | 128K | Open |
| Gemini 2.5 Flash | Google | $0.075 | $0.300 | $0.188 | | 1M | |
| Voxtral Small 24B | Mistral | $0.100 | $0.300 | $0.200 | | 128K | Open |
| DeepSeek V4 Flash | DeepSeek | $0.140 (cached $0.003) | $0.280 | $0.210 | | 1M | |
| DeepSeek V4 Flash | Fireworks | $0.140 (cached $0.028) | $0.280 | $0.210 | | 1M | |
| Llama 4 Scout | Meta | $0.110 | $0.340 | $0.225 | | 10M | Open |
| GLM-4.7-FlashX | Zhipu | $0.070 (cached $0.010) | $0.400 | $0.235 | | 128K | Open |
| Gemini 2.5 Flash-Lite | Google | $0.100 | $0.400 | $0.250 | | 1M | |
| GPT-4.1 nano | OpenAI | $0.100 (cached $0.050) | $0.400 | $0.250 | | 1M | |
| Qwen-Flash | Alibaba | $0.115 | $0.460 | $0.288 | | 1M | |
| Jamba Mini | AI21 Labs | $0.200 | $0.400 | $0.300 | | 256K | Open |
| Grok 4.1 Fast | xAI | $0.200 | $0.500 | $0.350 | | 2M | |
| Command R 08-2024 | Cohere | $0.150 | $0.600 | $0.375 | | 128K | |
| GPT OSS 120B | Fireworks | $0.150 (cached $0.015) | $0.600 | $0.375 | | 128K | Open |
| Granite 4 H Medium | IBM | $0.150 | $0.600 | $0.375 | | 128K | Open |
| Mistral Small 4 | Mistral | $0.150 | $0.600 | $0.375 | | 128K | |
| gpt-oss-120B | Together | $0.150 | $0.600 | $0.375 | | 128K | Open |
| Llama 3.1 8B Instant | Groq | $0.050 | $1.00 | $0.525 | | 128K | 840 TPS, Open |
| GPT-OSS 20B | Groq | $0.075 | $1.00 | $0.537 | | 128K | 1000 TPS, Open |
| Mixtral 8x7B Instruct | Mistral | $0.540 | $0.540 | $0.540 | | 32K | Open |
| Llama 4 Scout | Groq | $0.110 | $1.00 | $0.555 | | 10M | 594 TPS, Open |
| Gemma-4-31B-it-Pearl | Together | $0.280 | $0.860 | $0.570 | | 128K | Open |
| GPT-OSS 120B | Groq | $0.150 | $1.00 | $0.575 | | 128K | 500 TPS, Open |
| Codestral | Mistral | $0.300 | $0.900 | $0.600 | | 256K | |
| Codestral 2508 | Mistral | $0.300 | $0.900 | $0.600 | | 256K | |
| GLM-4.6V | Zhipu | $0.300 (cached $0.050) | $0.900 | $0.600 | | 128K | Open |
| Qwen3 32B | Groq | $0.290 | $1.00 | $0.645 | | 128K | 662 TPS, Open |
| GLM-4.5-Air | Zhipu | $0.200 (cached $0.030) | $1.10 | $0.650 | | 128K | Open |
| DeepSeek V4 Pro | DeepSeek | $0.435 (cached $0.004) | $0.870 | $0.652 | | 1M | |
| Llama 3.3 70B | Meta | $0.590 | $0.790 | $0.690 | | 128K | Open |
| MiniMax 2.5 | Fireworks | $0.300 (cached $0.030) | $1.20 | $0.750 | | 128K | Open |
| MiniMax 2.7 | Fireworks | $0.300 (cached $0.060) | $1.20 | $0.750 | | 128K | Open |
| MiniMax M3 | Fireworks | $0.300 (cached $0.060) | $1.20 | $0.750 | | 1M | Open |
| Granite 4 H Large | IBM | $0.300 | $1.20 | $0.750 | | 128K | Open |
| MiniMax-M2 | MiniMax | $0.300 (cached $0.030) | $1.20 | $0.750 | | 205K | Open |
| MiniMax-M2.1 | MiniMax | $0.300 (cached $0.030) | $1.20 | $0.750 | | 205K | Open |
| MiniMax-M2.5 | MiniMax | $0.300 (cached $0.030) | $1.20 | $0.750 | | 205K | Open |
| MiniMax-M2.7 | MiniMax | $0.300 (cached $0.060) | $1.20 | $0.750 | | 205K | Open |
| MiniMax-M3 | MiniMax | $0.300 (cached $0.060) | $1.20 | $0.750 | | 1M | Open |
| MiniMax M2.5 | Together | $0.300 (cached $0.060) | $1.20 | $0.750 | | 128K | Open |
| MiniMax M3 | Together | $0.300 (cached $0.060) | $1.20 | $0.750 | | 1M | Open |
| Llama 3.3 70B Versatile | Groq | $0.590 | $1.00 | $0.795 | | 128K | 394 TPS, Open |
| Qwen-Plus | Alibaba | $0.400 | $1.20 | $0.800 | | 131K | |
| Qwen 3.6 27B | Groq | $0.600 | $1.00 | $0.800 | | 128K | 500 TPS, Open |
| Qwen3.7-Plus | Together | $0.320 | $1.28 | $0.800 | | 128K | Open |
| Gemini 3.1 Flash-Lite | Google | $0.250 | $1.50 | $0.875 | | 1M | |
| Command R 03-2024 | Cohere | $0.500 | $1.50 | $1.00 | | 128K | |
| Qwen 3.7 Plus | Fireworks | $0.400 (cached $0.080) | $1.60 | $1.00 | | 128K | Open |
| Mistral Large 3 | Mistral | $0.500 | $1.50 | $1.00 | | 128K | |
| GPT-4.1 mini | OpenAI | $0.400 (cached $0.200) | $1.60 | $1.00 | | 1M | |
| Sonar | Perplexity | $1.00 | $1.00 | $1.00 | | 200K | |
| Llama 3.3 70B | Together | $1.04 | $1.04 | $1.04 | | 128K | Open |
| Devstral 2 2512 | Mistral | $0.400 | $2.00 | $1.20 | | 256K | Open |
| Mistral Medium 3 | Mistral | $0.400 | $2.00 | $1.20 | | 128K | |
| Gemini 3.1 Flash | Google | $0.300 | $2.50 | $1.40 | | 1M | |
| Reka Flash | Reka | $0.800 | $2.00 | $1.40 | | 128K | |
| GLM-4.5 | Zhipu | $0.600 (cached $0.110) | $2.20 | $1.40 | | 128K | Open |
| GLM-4.6 | Zhipu | $0.600 (cached $0.110) | $2.20 | $1.40 | | 128K | Open |
| GLM-4.7 | Zhipu | $0.600 (cached $0.110) | $2.20 | $1.40 | | 128K | Open |
| Command A | Cohere | $1.00 | $2.00 | $1.50 | | 256K | |
| NVIDIA Nemotron 3 Ultra | Fireworks | $0.600 (cached $0.120) | $2.40 | $1.50 | | 128K | Open |
| Grok Build 0.1 | xAI | $1.00 | $2.00 | $1.50 | | 256K | |
| QwQ-Plus | Alibaba | $0.800 | $2.40 | $1.60 | | 131K | |
| Qwen 3.6 Plus | Fireworks | $0.500 (cached $0.100) | $3.00 | $1.75 | | 128K | Open |
| Kimi K2.5 | Fireworks | $0.600 (cached $0.100) | $3.00 | $1.80 | | 256K | Open |
| Kimi K2.5 | Moonshot | $0.600 (cached $0.100) | $3.00 | $1.80 | | 262K | Open |
| Grok 4.3 | xAI | $1.25 | $2.50 | $1.88 | | 1M | |
| Nova Pro | Amazon | $0.800 | $3.20 | $2.00 | | 300K | |
| Llama Nemotron Ultra 253B | NVIDIA | $0.600 | $3.60 | $2.10 | | 128K | Open |
| Nemotron 3 Ultra | NVIDIA | $0.600 (cached $0.120) | $3.60 | $2.10 | | 128K | Open |
| NVIDIA Nemotron 3 Ultra | Together | $0.600 (cached $0.200) | $3.60 | $2.10 | | 128K | Open |
| Qwen3.5-397B-A17B | Together | $0.600 (cached $0.350) | $3.60 | $2.10 | | 128K | Open |
| GLM-5 | Zhipu | $1.00 (cached $0.200) | $3.20 | $2.10 | | 128K | Open |
| Kimi K2.6 | Fireworks | $0.950 (cached $0.160) | $4.00 | $2.48 | | 256K | Open |
| Kimi K2.7 Code | Fireworks | $0.950 (cached $0.190) | $4.00 | $2.48 | | 256K | Open |
| Kimi K2.6 | Moonshot | $0.950 (cached $0.160) | $4.00 | $2.48 | | 262K | Open |
| Kimi K2.7 Code | Moonshot | $0.950 (cached $0.190) | $4.00 | $2.48 | | 262K | Open |
| Kimi K2.7 Code | Together | $0.950 (cached $0.190) | $4.00 | $2.48 | | 256K | Open |
| Qwen3.7-Max | Together | $1.25 (cached $0.130) | $3.75 | $2.50 | | 128K | Open |
| GLM-5-Turbo | Zhipu | $1.20 (cached $0.240) | $4.00 | $2.60 | | 128K | Open |
| GLM-5V-Turbo | Zhipu | $1.20 (cached $0.240) | $4.00 | $2.60 | | 128K | Open |
| DeepSeek V4 Pro | Fireworks | $1.74 (cached $0.145) | $3.48 | $2.61 | | 1M | |
| DeepSeek V4 Pro | Together | $1.74 (cached $0.200) | $3.48 | $2.61 | | 1M | |
| GPT-5.4 mini | OpenAI | $0.750 (cached $0.075) | $4.50 | $2.63 | | 1M | |
| o3-mini | OpenAI | $1.10 (cached $0.550) | $4.40 | $2.75 | | 200K | |
| o4-mini | OpenAI | $1.10 (cached $0.550) | $4.40 | $2.75 | | 200K | |
| GLM 5.1 | Fireworks | $1.40 (cached $0.260) | $4.40 | $2.90 | | 128K | Open |
| GLM 5.2 | Fireworks | $1.40 (cached $0.260) | $4.40 | $2.90 | | 128K | Open |
| GLM-5.2 | Together | $1.40 (cached $0.260) | $4.40 | $2.90 | | 128K | Open |
| GLM-5.1 | Zhipu | $1.40 (cached $0.260) | $4.40 | $2.90 | | 128K | Open |
| GLM-5.2 | Zhipu | $1.40 (cached $0.260) | $4.40 | $2.90 | | 128K | Open |
| Qwen3-Max | Alibaba | $1.20 | $4.80 | $3.00 | | 262K | |
| Claude Haiku 4.5 | Anthropic | $1.00 (cached $0.100) | $5.00 | $3.00 | | 200K | |
| Magistral Medium | Mistral | $2.00 | $5.00 | $3.50 | | 128K | |
| Mixtral 8x22B Instruct | Mistral | $2.00 | $6.00 | $4.00 | | 64K | Open |
| Pixtral Large 2411 | Mistral | $2.00 | $6.00 | $4.00 | | 128K | Open |
| Reka Core | Reka | $2.00 | $6.00 | $4.00 | | 128K | |
| Grok 4.20 | xAI | $2.00 | $6.00 | $4.00 | | 256K | |
| Mistral Medium 3.5 | Mistral | $1.50 | $7.50 | $4.50 | | 128K | |
| Kimi K2.7 Code HighSpeed | Moonshot | $1.90 (cached $0.380) | $8.00 | $4.95 | | 262K | Open |
| Jamba Large | AI21 Labs | $2.00 | $8.00 | $5.00 | | 256K | Open |
| GPT-4.1 | OpenAI | $2.00 (cached $0.500) | $8.00 | $5.00 | | 1M | |
| Sonar Deep Research | Perplexity | $2.00 | $8.00 | $5.00 | | 200K | |
| Sonar Reasoning Pro | Perplexity | $2.00 | $8.00 | $5.00 | | 200K | |
| Gemini 3.5 Flash | Google | $1.50 | $9.00 | $5.25 | | 1M | |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 | $5.63 | | 2M | |
| Yi Large | 01.AI | $3.00 | $9.00 | $6.00 | | 32K | |
| Command R+ 08-2024 | Cohere | $2.50 | $10.00 | $6.25 | | 128K | |
| Gemini 3.1 Pro | Google | $2.00 | $12.00 | $7.00 | | 2M | |
| Nova Premier | Amazon | $2.50 | $12.50 | $7.50 | | 1M | |
| GPT-5.4 | OpenAI | $2.50 (cached $0.250) | $15.00 | $8.75 | | 1M | |
| Claude Sonnet 4.6 | Anthropic | $3.00 (cached $0.300) | $15.00 | $9.00 | | 200K | |
| Sonar Pro | Perplexity | $3.00 | $15.00 | $9.00 | | 200K | |
| Grok 4 | xAI | $3.00 | $15.00 | $9.00 | | 256K | |
| GPT-Realtime-2 | OpenAI | $4.00 (cached $0.400) | $24.00 | $14.00 | | 128K | |
| Claude Opus 4.5 | Anthropic | $5.00 (cached $0.500) | $25.00 | $15.00 | | 200K | |
| Claude Opus 4.6 | Anthropic | $5.00 (cached $0.500) | $25.00 | $15.00 | | 200K | |
| Claude Opus 4.7 | Anthropic | $5.00 (cached $0.500) | $25.00 | $15.00 | | 200K | |
| Claude Opus 4.8 | Anthropic | $5.00 (cached $0.500) | $25.00 | $15.00 | | 200K | |
| Claude Mythos 5 | Anthropic | $10.00 | $20.00 | $15.00 | | 200K | |
| GPT-5.5 | OpenAI | $5.00 (cached $0.500) | $30.00 | $17.50 | | 270K | |
| GPT-Image-2 | OpenAI | $8.00 (cached $2.00) | $30.00 | $19.00 | | 128K | |
| Claude Fable 5 | Anthropic | $10.00 (cached $1.00) | $50.00 | $30.00 | | 200K | |
Understanding the table
Blended cost is the average of input and output price per 1M tokens — a quick way to compare models when your usage is a mix of both. The colored bar shows each model's blended cost relative to the most expensive model in the table.
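As a rough illustration of the two definitions above, the sketch below computes a blended cost and its share of the most expensive blended cost. The function names are hypothetical and this is not the site's own code; the three sample rows are copied from the table.

```python
def blended_cost(input_per_mtok: float, output_per_mtok: float) -> float:
    """Simple average of input and output price, in $ per 1M tokens."""
    return (input_per_mtok + output_per_mtok) / 2


def relative_cost(blended: float, max_blended: float) -> float:
    """Blended cost as a fraction of the most expensive blended cost."""
    if max_blended <= 0:
        return 0.0
    return blended / max_blended


# (input, output) prices in $/Mtok, copied from the table.
prices = {
    "Claude Haiku 4.5": (1.00, 5.00),
    "GPT-4.1": (2.00, 8.00),
    "Claude Fable 5": (10.00, 50.00),  # highest blended cost in the table
}

blended = {name: blended_cost(i, o) for name, (i, o) in prices.items()}
top = max(blended.values())

for name, cost in blended.items():
    share = relative_cost(cost, top)
    print(f"{name}: ${cost:.2f} blended, {share:.0%} of the maximum")
```

Because the blend weights input and output equally, it can mislead for workloads that are heavily skewed toward one side, such as long prompts with short answers.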
Green = cheap (<$1/Mtok blended) · Gold = mid ($1–$15) · Red = expensive (>$15, typically frontier models). Cached input prices (where available) are shown alongside the input price; for the models listed here, caching cuts the input price by roughly 40–99%.
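The color tiers and the cache discount can be expressed as two small functions. This is a sketch under the thresholds stated above (treating exactly $1 and exactly $15 as gold); the function names are hypothetical, and the sample prices come from the table.

```python
def tier(blended: float) -> str:
    """Map a blended $/Mtok cost to the color tier described above."""
    if blended < 1.0:
        return "green"
    if blended <= 15.0:
        return "gold"
    return "red"


def cache_saving(input_price: float, cached_price: float) -> float:
    """Fraction saved on input tokens that are served from cache."""
    if input_price <= 0:
        raise ValueError("input_price must be positive")
    return 1 - cached_price / input_price


# Blended costs copied from the table.
print(tier(0.875))   # Gemini 3.1 Flash-Lite
print(tier(15.00))   # Claude Opus 4.8
print(tier(17.50))   # GPT-5.5

# Input vs. cached input prices copied from the table.
print(f"GPT-4.1 nano: {cache_saving(0.100, 0.050):.1%} saved on cached input")
print(f"DeepSeek V4 Flash: {cache_saving(0.140, 0.003):.1%} saved on cached input")
```

The saving applies only to the cached share of input tokens, so the effect on a whole bill depends on the cache hit rate and on how much of the spend is output.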
Hosting providers like Groq and Together list the same open-weight models at their own prices, so you can compare, for example, Llama 3.3 70B on Groq ($0.59/$0.79) vs. Together ($1.04/$1.04). The TPS badge shows Groq's inference throughput in tokens per second.
Use the Cost Calculator to estimate your monthly spending, or the Compare page to view any models side by side.
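For a back-of-the-envelope version of such an estimate, the sketch below multiplies monthly token volumes by per-million-token prices. It is not the Cost Calculator's actual implementation; the function name and the workload figures (200M input tokens, 50M output tokens) are assumptions, and the two price pairs are the Llama 3.3 70B figures quoted in the hosting comparison above.

```python
def monthly_cost(
    input_tokens: int,
    output_tokens: int,
    input_per_mtok: float,
    output_per_mtok: float,
) -> float:
    """Estimated spend in dollars for one month of traffic."""
    return (
        input_tokens / 1_000_000 * input_per_mtok
        + output_tokens / 1_000_000 * output_per_mtok
    )


# Hypothetical workload: 200M input tokens and 50M output tokens per month.
workload = {"input_tokens": 200_000_000, "output_tokens": 50_000_000}

# (input, output) $/Mtok as quoted for Llama 3.3 70B on each host.
hosts = {
    "Groq": (0.59, 0.79),
    "Together": (1.04, 1.04),
}

for host, (inp, out) in hosts.items():
    cost = monthly_cost(workload["input_tokens"], workload["output_tokens"], inp, out)
    print(f"{host}: ${cost:,.2f} per month")
```

Weighting by real token volumes gives a more faithful comparison than the blended figure whenever input and output usage are unequal.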
Best LLM for your use case
Best LLM API for Coding
Compare LLM API pricing and capabilities for coding tasks. Find the best model for code generation, …
Best LLM API for Chatbots
Find the best LLM API for chatbots and conversational AI. Compare pricing for models optimized for d…
Cheapest LLM APIs by Token Cost
The cheapest LLM APIs ranked by token cost. Find budget-friendly models under $1/Mtok for high-volum…
Best LLM APIs for Long Context Windows
LLM APIs with the largest context windows. Compare models that support 100K+ tokens for document ana…
Best Multimodal LLM APIs
Compare multimodal LLM APIs that accept text, images, video, and audio. Find the best vision-capable…
Best LLM APIs for Reasoning Tasks
LLM APIs specialized for reasoning and complex problem solving. Compare pricing for models optimized…
Free pricing API
Get all 156 models and 24 providers as structured JSON. No API key, no rate limit, CORS-enabled.
curl https://modelpricewatch.com/api/v1/models.json | jq '.data[] | select(.category=="flagship")'
API Documentation →
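The same filter can be written in Python. This sketch assumes only what the jq expression implies: a top-level `data` array whose entries carry a `category` field. The `name` field and the "budget" category in the sample payload are hypothetical, so check the API documentation for the real schema.

```python
import json
import urllib.request

API_URL = "https://modelpricewatch.com/api/v1/models.json"


def fetch_models(url: str = API_URL) -> dict:
    """Download and parse the pricing JSON (requires network access)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def select_category(payload: dict, category: str) -> list[dict]:
    """Equivalent of jq '.data[] | select(.category=="...")'."""
    return [m for m in payload.get("data", []) if m.get("category") == category]


# Offline demonstration with a made-up payload of the assumed shape.
sample = json.loads(
    '{"data": ['
    '{"name": "example-a", "category": "flagship"},'
    '{"name": "example-b", "category": "budget"}'
    "]}"
)
print(select_category(sample, "flagship"))

# With network access: select_category(fetch_models(), "flagship")
```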