Cheapest: GLM-4.7-Flash at $0.000/Mtok input · 153 models tracked · Updated Jun 24, 2026

Open Weight Models

77 open-weight models hosted across inference providers. Compare hosting prices: the model weights are free, so you pay only for compute.

Open-weight models: 77
Cheapest input: $0.000/Mtok
Fastest inference: 1000 TPS
Largest context: 10M
77 open-weight models · sorted by input price (cheapest first)
Model | Hosted by | Parameters | Input $/Mtok | Output $/Mtok | Context | Notes
GLM-4.7-Flash | Zhipu | - | $0.000 | $0.000 | 128K | -
Granite 4.0 Micro | IBM | - | $0.017 | $0.112 | 128K | -
LFM2 24B A2B | Together | 24B (A2B) | $0.030 | $0.120 | 128K | -
GLM-OCR | Zhipu | - | $0.030 | $0.030 | 128K | -
Llama 3.1 8B Instant | Groq | 8B | $0.050 | $1.00 | 128K | 840 TPS
Llama 3.1 8B | Meta | 8B | $0.050 | $0.080 | 128K | -
Granite 4 H Small | IBM | - | $0.060 | $0.250 | 128K | -
Baichuan M2-32B | Baichuan | 32B | $0.070 | $0.070 | 33K | -
GPT OSS 20B | Fireworks | 20B | $0.070 | $0.300 | 128K | cached $0.035
GLM-4.7-FlashX | Zhipu | - | $0.070 | $0.400 | 128K | cached $0.010
GPT-OSS 20B | Groq | 20B | $0.075 | $1.00 | 128K | 1000 TPS
Mistral Small 3.2 24B | Mistral | 24B | $0.080 | $0.200 | 128K | -
Ministral 3 3B | Mistral | 3B | $0.100 | $0.100 | 128K | -
Pixtral 12B | Mistral | 12B | $0.100 | $0.100 | 128K | -
Voxtral Small 24B | Mistral | 24B | $0.100 | $0.300 | 128K | -
Nemotron 70B Instruct | NVIDIA | 70B | $0.100 | $0.100 | 128K | -
GLM-4-32B-0414 | Zhipu | 32B | $0.100 | $0.100 | 128K | -
Granite Embedding 278M Multilingual | IBM | 278M | $0.106 | $0.106 | - | -
Llama 4 Scout | Groq | 17B (16 experts) | $0.110 | $1.00 | 10M | 594 TPS
Llama 4 Scout | Meta | 17B (16 experts) | $0.110 | $0.340 | 10M | -
GPT OSS 120B | Fireworks | 120B | $0.150 | $0.600 | 128K | cached $0.015
GPT-OSS 120B | Groq | 120B | $0.150 | $1.00 | 128K | 500 TPS
Granite 4 H Medium | IBM | - | $0.150 | $0.600 | 128K | -
gpt-oss-120B | Together | 120B | $0.150 | $0.600 | 128K | -
Jamba Mini | AI21 Labs | - | $0.200 | $0.400 | 256K | -
GLM-4.5-Air | Zhipu | - | $0.200 | $1.10 | 128K | cached $0.030
Gemma-4-31B-it-Pearl | Together | 31B | $0.280 | $0.860 | 128K | -
Qwen3 32B | Groq | 32B | $0.290 | $1.00 | 128K | 662 TPS
MiniMax 2.5 | Fireworks | - | $0.300 | $1.20 | 128K | cached $0.030
MiniMax 2.7 | Fireworks | - | $0.300 | $1.20 | 128K | cached $0.060
MiniMax M3 | Fireworks | - | $0.300 | $1.20 | 1M | cached $0.060
Granite 4 H Large | IBM | - | $0.300 | $1.20 | 128K | -
MiniMax-M2 | MiniMax | - | $0.300 | $1.20 | 205K | cached $0.030
MiniMax-M2.1 | MiniMax | - | $0.300 | $1.20 | 205K | cached $0.030
MiniMax-M2.5 | MiniMax | - | $0.300 | $1.20 | 205K | cached $0.030
MiniMax-M2.7 | MiniMax | - | $0.300 | $1.20 | 205K | cached $0.060
MiniMax-M3 | MiniMax | - | $0.300 | $1.20 | 1M | cached $0.060
MiniMax M2.5 | Together | - | $0.300 | $1.20 | 128K | cached $0.060
MiniMax M3 | Together | - | $0.300 | $1.20 | 1M | cached $0.060
GLM-4.6V | Zhipu | - | $0.300 | $0.900 | 128K | cached $0.050
Qwen3.7-Plus | Together | - | $0.320 | $1.28 | 128K | -
Qwen 3.7 Plus | Fireworks | - | $0.400 | $1.60 | 128K | cached $0.080
Devstral 2 2512 | Mistral | - | $0.400 | $2.00 | 256K | -
Qwen 3.6 Plus | Fireworks | - | $0.500 | $3.00 | 128K | cached $0.100
Mixtral 8x7B Instruct | Mistral | 46.7B (8x7B MoE) | $0.540 | $0.540 | 32K | -
Llama 3.3 70B Versatile | Groq | 70B | $0.590 | $1.00 | 128K | 394 TPS
Llama 3.3 70B | Meta | 70B | $0.590 | $0.790 | 128K | -
Kimi K2.5 | Fireworks | - | $0.600 | $3.00 | 256K | cached $0.100
NVIDIA Nemotron 3 Ultra | Fireworks | - | $0.600 | $2.40 | 128K | cached $0.120
Qwen 3.6 27B | Groq | 27B | $0.600 | $1.00 | 128K | 500 TPS
Kimi K2.5 | Moonshot | - | $0.600 | $3.00 | 262K | cached $0.100
Llama Nemotron Ultra 253B | NVIDIA | 253B | $0.600 | $3.60 | 128K | -
Nemotron 3 Ultra | NVIDIA | - | $0.600 | $3.60 | 128K | cached $0.120
NVIDIA Nemotron 3 Ultra | Together | - | $0.600 | $3.60 | 128K | cached $0.200
Qwen3.5-397B-A17B | Together | 397B (A17B) | $0.600 | $3.60 | 128K | cached $0.350
GLM-4.5 | Zhipu | - | $0.600 | $2.20 | 128K | cached $0.110
GLM-4.6 | Zhipu | - | $0.600 | $2.20 | 128K | cached $0.110
GLM-4.7 | Zhipu | - | $0.600 | $2.20 | 128K | cached $0.110
Kimi K2.6 | Fireworks | - | $0.950 | $4.00 | 256K | cached $0.160
Kimi K2.7 Code | Fireworks | - | $0.950 | $4.00 | 256K | cached $0.190
Kimi K2.6 | Moonshot | - | $0.950 | $4.00 | 262K | cached $0.160
Kimi K2.7 Code | Moonshot | - | $0.950 | $4.00 | 262K | cached $0.190
Kimi K2.7 Code | Together | - | $0.950 | $4.00 | 256K | cached $0.190
GLM-5 | Zhipu | - | $1.00 | $3.20 | 128K | cached $0.200
Llama 3.3 70B | Together | 70B | $1.04 | $1.04 | 128K | -
GLM-5-Turbo | Zhipu | - | $1.20 | $4.00 | 128K | cached $0.240
GLM-5V-Turbo | Zhipu | - | $1.20 | $4.00 | 128K | cached $0.240
Qwen3.7-Max | Together | - | $1.25 | $3.75 | 128K | cached $0.130
GLM 5.1 | Fireworks | - | $1.40 | $4.40 | 128K | cached $0.260
GLM 5.2 | Fireworks | - | $1.40 | $4.40 | 128K | cached $0.260
GLM-5.2 | Together | - | $1.40 | $4.40 | 128K | cached $0.260
GLM-5.1 | Zhipu | - | $1.40 | $4.40 | 128K | cached $0.260
GLM-5.2 | Zhipu | - | $1.40 | $4.40 | 128K | cached $0.260
Kimi K2.7 Code HighSpeed | Moonshot | - | $1.90 | $8.00 | 262K | cached $0.380
Jamba Large | AI21 Labs | - | $2.00 | $8.00 | 256K | -
Mixtral 8x22B Instruct | Mistral | 141B (8x22B MoE) | $2.00 | $6.00 | 64K | -
Pixtral Large 2411 | Mistral | 124B | $2.00 | $6.00 | 128K | -

Open weight models have freely available model weights — anyone can download and run them. You pay only for the compute to serve inference. Hosting providers like Groq (custom LPU chips, fastest inference), Together AI (200+ models, competitive pricing), and Fireworks offer per-token pricing without lock-in.
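
To make per-token pricing concrete, here is a minimal Python sketch of how a bill could be estimated from per-Mtok rates. The function name `request_cost` and the token counts are hypothetical, and it assumes that a "cached $X" note in the table means a discounted per-Mtok rate for cached input tokens; providers may define and bill caching differently. The rates are the GPT OSS 20B on Fireworks row.

```python
def request_cost(input_tokens, output_tokens, input_price, output_price,
                 cached_tokens=0, cached_price=None):
    """Estimate cost in USD. All prices are USD per million tokens (Mtok).

    Assumption: cached input tokens are billed at cached_price instead of
    input_price. If no cached price is given, they cost the full input rate.
    """
    if cached_price is None:
        cached_price = input_price
    fresh_tokens = input_tokens - cached_tokens  # input tokens billed at full rate
    total = (fresh_tokens * input_price
             + cached_tokens * cached_price
             + output_tokens * output_price)
    return total / 1_000_000


# Illustrative workload: 2M input tokens (half of them cached), 0.5M output tokens.
# Rates from the table row: $0.070 input, $0.300 output, cached $0.035.
cost = request_cost(2_000_000, 500_000, 0.070, 0.300,
                    cached_tokens=1_000_000, cached_price=0.035)
print(f"${cost:.3f}")
```

Under these assumptions the workload costs about $0.255, of which $0.150 is output tokens, so the output rate matters even when input volume is larger.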

The same model can have very different prices depending on the host. For example, Llama 3.3 70B input costs $0.59/Mtok on Groq but $1.04/Mtok on Together. Use the Compare page to view the same model side by side across providers.
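
A host comparison like this can also be done by hand. The sketch below ranks hosts for one model by the total cost of a given workload; the helper names (`workload_cost`, `rank_hosts`) and the workload size are hypothetical, and the rates are the three Nemotron 3 Ultra rows from the table (Fireworks, NVIDIA, Together). It ignores cached-input discounts, speed, and context limits.

```python
# (input $/Mtok, output $/Mtok) per host, copied from the Nemotron 3 Ultra rows.
NEMOTRON_3_ULTRA = {
    "Fireworks": (0.600, 2.40),
    "NVIDIA": (0.600, 3.60),
    "Together": (0.600, 3.60),
}


def workload_cost(input_mtok, output_mtok, prices):
    """Cost in USD for a workload measured in millions of tokens."""
    input_price, output_price = prices
    return input_mtok * input_price + output_mtok * output_price


def rank_hosts(input_mtok, output_mtok, price_table):
    """Return (host, cost) pairs sorted from cheapest to most expensive."""
    costs = [(host, workload_cost(input_mtok, output_mtok, prices))
             for host, prices in price_table.items()]
    return sorted(costs, key=lambda pair: pair[1])


# Illustrative workload: 10 Mtok of input and 2 Mtok of output.
ranked = rank_hosts(10, 2, NEMOTRON_3_ULTRA)
for host, cost in ranked:
    print(f"{host}: ${cost:.2f}")
```

All three hosts list the same input price here, so the ranking is decided entirely by the output rate. Sorting by input price alone, as the table does, would not show that difference.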