Qwen3-Coder-30B-A3B-Instruct is a 30.5B-parameter Mixture-of-Experts (MoE) model with 128 experts (8 active per forward pass), designed for advanced code generation, repository-scale understanding, and agentic tool use. Built on the Qwen3 architecture, it supports a native context length of 256K tokens (extendable to 1M with YaRN) and performs strongly on tasks involving function calling, browser use, and structured code completion. The model is optimized for instruction following without a "thinking" mode and integrates well with OpenAI-compatible tool-use formats.
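To illustrate the OpenAI-compatible tool-use format mentioned above, here is a minimal sketch that builds a chat-completion request payload with one function tool. The tool name (`run_tests`), its schema, and the exact model ID string are assumptions for illustration; substitute your provider's values before sending the request.

```python
import json

def build_tool_request(user_prompt: str) -> dict:
    """Build an OpenAI-compatible chat request with one function tool."""
    tools = [{
        "type": "function",
        "function": {
            "name": "run_tests",  # hypothetical tool name, for illustration
            "description": "Run the project's test suite and return results.",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {"type": "string", "description": "Test directory"},
                },
                "required": ["path"],
            },
        },
    }]
    return {
        "model": "Qwen3-Coder-30B-A3B-Instruct",  # assumed model ID
        "messages": [{"role": "user", "content": user_prompt}],
        "tools": tools,
        "tool_choice": "auto",  # let the model decide when to call the tool
    }

payload = build_tool_request("Fix the failing tests under tests/")
print(json.dumps(payload, indent=2))
```

The payload is what you would POST to a provider's `/v1/chat/completions` endpoint; the model then either answers directly or returns a `tool_calls` entry with arguments matching the schema.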
Quality Index: 20.0 (186th of 442, Top 43%)
Coding Index: 19.4 (152nd of 352, Top 43%)
Math Index: 29.0 (187th of 268, Top 70%)
Price (blended): $0.90 per 1M tokens (473rd cheapest, 190% above median, Top 70%)
Speed: 26 tok/s (Top 60%)
TTFT: 1.44 s
Context Window: 160K tokens (144th largest, Top 41%)
Input: $0.45 per 1M tokens
Output: $2.25 per 1M tokens
Blended: $0.90 per 1M tokens
Cheaper than 30% of models; the median price is $0.31 per 1M tokens.
Estimated cost: $0.90 daily, $27.00 monthly.
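The blended price above can be reproduced from the listed input and output prices. A minimal sketch, assuming the common 3:1 input-to-output token ratio (the page does not state which blend ratio it uses):

```python
INPUT_PRICE = 0.45    # $ per 1M input tokens, from the listing
OUTPUT_PRICE = 2.25   # $ per 1M output tokens, from the listing

def blended_price(input_ratio: float = 3.0, output_ratio: float = 1.0) -> float:
    """Weighted average price per 1M tokens for a given input:output mix."""
    total = input_ratio + output_ratio
    return (INPUT_PRICE * input_ratio + OUTPUT_PRICE * output_ratio) / total

print(f"${blended_price():.2f} per 1M tokens")  # → $0.90, matching the listing
```

The 3:1 assumption fits exactly here: (3 × $0.45 + 1 × $2.25) / 4 = $0.90.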
Throughput: 26 tokens/sec (faster than 40% of models)
Time to first token (TTFT): 1.44 seconds (faster than 18% of models)
Market median throughput: 46 tok/s (this model is 42% slower)
Median TTFT: 0.42 s (this model is 243% slower)
Throughput per dollar: 29 tok/s per $/1M
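The throughput-per-dollar figure follows directly from the listed numbers. A quick check:

```python
throughput = 26.0   # tokens/sec, from the listing
blended = 0.90      # $ per 1M tokens, from the listing

# Throughput normalized by blended price: tok/s per $/1M.
tok_per_s_per_dollar = throughput / blended
print(round(tok_per_s_per_dollar))  # → 29, matching the listed value
```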
Context Window: 160K tokens (larger than 59% of models); extendable to 1.0M with YaRN
Max Output: 33K tokens (about 20% of the context window)
Estimated local hardware: 24-48 GB VRAM (e.g., A6000 / M3 Ultra)
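Given the listed 160K context window and 33K max output, the input budget for a request can be sketched as below. Token counts are approximate and the real limits depend on the serving provider; the function name is illustrative.

```python
CONTEXT_WINDOW = 160_000  # tokens, from the listing
MAX_OUTPUT = 33_000       # tokens, from the listing

def max_input_tokens(reserved_output: int = MAX_OUTPUT) -> int:
    """Largest prompt that still leaves room for the completion in the window."""
    if reserved_output > CONTEXT_WINDOW:
        raise ValueError("reserved output exceeds the context window")
    return CONTEXT_WINDOW - reserved_output

print(max_input_tokens())  # → 127000
```

In practice you would reserve only the output length you actually request, not the full 33K maximum, to leave more room for the prompt.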