Cheapest LLM APIs by Token Cost
The cheapest LLM APIs, ranked by input price per million tokens (Mtok). Find budget-friendly models priced under $1/Mtok for high-volume production workloads.
Full ranking — top 15 models
| # | Model | Provider | Input $/Mtok | Output $/Mtok | Blended $/Mtok | Context |
|---|---|---|---|---|---|---|
| 1 | Granite 4.0 Micro | IBM | $0.017 | $0.112 | $0.065 | 128K |
| 2 | LFM2 24B A2B | Together | $0.030 | $0.120 | $0.075 | 128K |
| 3 | GLM-OCR | Zhipu | $0.030 | $0.030 | $0.030 | 128K |
| 4 | Nova Micro | Amazon | $0.035 | $0.140 | $0.088 | 128K |
| 5 | Qwen-Turbo | Alibaba | $0.050 | $0.200 | $0.125 | 1M |
| 6 | Nova Lite | Amazon | $0.060 | $0.240 | $0.150 | 300K |
| 7 | Granite 4 H Small | IBM | $0.060 | $0.250 | $0.155 | 128K |
| 8 | Baichuan M2-32B | Baichuan | $0.070 | $0.070 | $0.070 | 33K |
| 9 | GPT OSS 20B | Fireworks | $0.070 | $0.300 | $0.185 | 128K |
| 10 | GLM-4.7-FlashX | Zhipu | $0.070 | $0.400 | $0.235 | 128K |
| 11 | Gemini 2.5 Flash | Google | $0.075 | $0.300 | $0.188 | 1M |
| 12 | Mistral Small 3.2 24B | Mistral | $0.080 | $0.200 | $0.140 | 128K |
| 13 | Gemini 2.5 Flash-Lite | Google | $0.100 | $0.400 | $0.250 | 1M |
| 14 | Ministral 3 3B | Mistral | $0.100 | $0.100 | $0.100 | 128K |
| 15 | Voxtral Small 24B | Mistral | $0.100 | $0.300 | $0.200 | 128K |
How models are selected
The ranking covers budget-category models, sorted by input price per million tokens from lowest to highest.
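The selection rule can be sketched in a few lines of Python. This is only an illustration, assuming that the $1/Mtok budget threshold mentioned at the top of the page applies to the input price; the `Model` type, the `BUDGET_CEILING` constant, and the `rank_budget_models` function are hypothetical names, and the sample rows are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    input_per_mtok: float   # USD per million input tokens
    output_per_mtok: float  # USD per million output tokens


# Assumption: "budget" means an input price under $1 per Mtok.
BUDGET_CEILING = 1.00


def rank_budget_models(models):
    """Keep models under the ceiling, cheapest input price first."""
    budget = [m for m in models if m.input_per_mtok < BUDGET_CEILING]
    # sorted() is stable, so models with equal input prices keep their
    # original relative order.
    return sorted(budget, key=lambda m: m.input_per_mtok)


sample = [
    Model("Nova Micro", 0.035, 0.140),
    Model("Granite 4.0 Micro", 0.017, 0.112),
    Model("Qwen-Turbo", 0.050, 0.200),
]
ranked = rank_budget_models(sample)
for position, model in enumerate(ranked, start=1):
    print(position, model.name, model.input_per_mtok)
```

Because the sort key is the input price alone, a model with a low input price and a comparatively high output price can rank above one that is cheaper overall for output-heavy workloads.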
Prices are per million tokens (Mtok), sourced directly from official provider pricing pages and verified by our automated scraper pipeline, which runs three times daily. "Blended cost" is the arithmetic mean of the input and output prices, a quick proxy for workloads that split tokens roughly 50/50 between input and output.
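As a rough sketch of how the blended figure is derived, and of how it compares with a workload that is not split 50/50, the Python below computes both the simple average and a token-weighted total. The function names and the example token volumes are illustrative assumptions; the prices are the Granite 4.0 Micro row from the table.

```python
def blended_cost(input_per_mtok: float, output_per_mtok: float) -> float:
    """Simple average of input and output price, in USD per Mtok."""
    return (input_per_mtok + output_per_mtok) / 2


def workload_cost(
    input_per_mtok: float,
    output_per_mtok: float,
    input_tokens: int,
    output_tokens: int,
) -> float:
    """Total USD cost for the given input and output token counts."""
    total = input_tokens * input_per_mtok + output_tokens * output_per_mtok
    return total / 1_000_000


# Granite 4.0 Micro: $0.017 input, $0.112 output per Mtok.
# The average is about 0.0645, which the table shows as $0.065.
granite_blended = blended_cost(0.017, 0.112)

# Hypothetical workload: 900M input tokens and 100M output tokens.
granite_total = workload_cost(0.017, 0.112, 900_000_000, 100_000_000)

print(f"blended: ${granite_blended:.4f}/Mtok")
print(f"workload total: ${granite_total:.2f}")
```

For this assumed 90/10 input-heavy split, the total is about $26.50 for 1,000 Mtok, or roughly $0.0265 per Mtok, well below the blended figure. The blended column is therefore best read as a comparison aid, not a cost estimate for any particular traffic mix.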