Pricing

Per-token model pricing

Prepaid credit is consumed per token at the rates below. We pass on our volume discounts, so you pay 20% less than each provider's list price.

Model	Provider	Context	Input per 1M tokens (list price)	Output per 1M tokens (list price)	Cache read / write
Claude Sonnet 4	Anthropic	200K	$2.40 ($3.00)	$12.00 ($15.00)	R $0.24 / W $3.00
GPT-4.1	OpenAI	1M	$1.60 ($2.00)	$6.40 ($8.00)	$0.40
Gemini 2.5 Pro	Google	2M	$1.00 ($1.25)	$4.00 ($5.00)	—
DeepSeek V3	DeepSeek	128K	$0.22 ($0.27)	$0.88 ($1.10)	R $0.06 / W $0.22
Llama 3.3 70B	Meta (self-hosted)	128K	$0.47 ($0.59)	$0.63 ($0.79)	—
Qwen 2.5 Coder	Alibaba	128K	$0.40 ($0.50)	$1.20 ($1.50)	$0.12

Input

Tokens sent to the model — your prompt, code context, and instructions.

Output

Tokens generated by the model — code, explanations, and diffs.

Cache read

Tokens served from previously cached context. Cache reads are billed well below the input rate: for Claude Sonnet 4, $0.24 versus $2.40 per 1M tokens.

Cache write

Tokens stored in the cache the first time a context is sent. Some providers charge a separate write rate; others charge one unified rate for reads and writes.
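As a rough illustration of how the four token categories combine into one charge, the sketch below estimates the cost of a single request at the Claude Sonnet 4 rates from the table. The helper `estimate_cost` and the token counts are hypothetical, and the sketch assumes each category is billed independently at its own per-1M rate; actual billing may differ, for example in how cached tokens are counted against input.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)

# Claude Sonnet 4 rates from the table, in USD per 1M tokens.
CLAUDE_SONNET_4 = {
    "input": Decimal("2.40"),
    "output": Decimal("12.00"),
    "cache_read": Decimal("0.24"),
    "cache_write": Decimal("3.00"),
}


def estimate_cost(rates, input_tokens=0, output_tokens=0,
                  cache_read_tokens=0, cache_write_tokens=0):
    """Return the estimated USD charge for one request (hypothetical helper)."""
    usage = {
        "input": input_tokens,
        "output": output_tokens,
        "cache_read": cache_read_tokens,
        "cache_write": cache_write_tokens,
    }
    # Assumption: each category is billed independently at its own rate.
    return sum(
        (rates[kind] * tokens / TOKENS_PER_MILLION for kind, tokens in usage.items()),
        Decimal(0),
    )


# Example usage with made-up token counts.
cost = estimate_cost(
    CLAUDE_SONNET_4,
    input_tokens=50_000,
    output_tokens=2_000,
    cache_read_tokens=200_000,
)
print(f"${cost:.4f}")  # $0.1920
```

In this example the 200,000 cached tokens cost $0.048; sending the same tokens as fresh input at $2.40 per 1M would cost $0.48.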

Prices are per 1 million tokens (1M = 1,000,000). Rates are rounded to the nearest cent. Credits never expire. BuildStax rates are 20% below provider list price.
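The relationship between list price and the discounted rate can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the discount is applied to the list price first and the result is then rounded to the nearest cent with ties rounding half up (the page does not specify a tie rule); `discounted_rate` is a hypothetical helper name.

```python
from decimal import Decimal, ROUND_HALF_UP

DISCOUNT = Decimal("0.20")  # 20% below provider list price
CENT = Decimal("0.01")


def discounted_rate(list_price):
    """Apply the 20% discount, then round to the nearest cent."""
    # Assumption: ties round half up; the page only says "nearest cent".
    discounted = Decimal(list_price) * (1 - DISCOUNT)
    return discounted.quantize(CENT, rounding=ROUND_HALF_UP)


# List prices per 1M input tokens, taken from the table.
for model, list_price in [
    ("Claude Sonnet 4", "3.00"),
    ("DeepSeek V3", "0.27"),
    ("Llama 3.3 70B", "0.59"),
]:
    print(model, discounted_rate(list_price))
```

For DeepSeek V3, for instance, 0.27 × 0.80 = 0.216, which rounds to the $0.22 shown in the table.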