The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that combines a linear attention mechanism with a sparse mixture-of-experts (MoE) design, improving inference efficiency. In overall performance it is second only to Qwen3.5-397B-A17B; its text capabilities significantly outperform Qwen3-235B-2507, and its visual capabilities surpass Qwen3-VL-235B.
Quality Index: 41.6 (36th of 442, Top 8%)
Coding Index: 34.7 (50th of 352, Top 14%)
Price (blended): $1.10 per 1M tokens (492nd cheapest, 255% above median, Top 73%)
Speed: 156 tok/s (Top 12%)
TTFT: 0.98 s
Context Window: 262K tokens (61st largest, Top 25%)
Pricing (per 1M tokens):
  Input: $0.40
  Output: $3.20
  Blended: $1.10
Cheaper than 27% of models; the median price is $0.31/1M tokens.
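The blended figure is consistent with a 3:1 input:output token weighting, a common convention for blended pricing; the exact weighting used here is an assumption. A minimal sketch:

```python
def blended_price(input_price, output_price, input_weight=3, output_weight=1):
    """Weighted average of input/output prices per 1M tokens.

    The 3:1 input:output weighting is an assumption; blended-price
    conventions vary between benchmark sites.
    """
    total = input_weight + output_weight
    return (input_price * input_weight + output_price * output_weight) / total

# (3 * 0.40 + 1 * 3.20) / 4 = 4.40 / 4 = 1.10
print(blended_price(0.40, 3.20))  # 1.1
```

With these weights the quoted $0.40 input and $3.20 output prices reproduce the $1.10 blended price exactly.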
Estimated cost: $1.10 daily / $33.00 monthly.
Throughput: 156 tokens/sec (faster than 88% of models)
TTFT: 0.98 seconds (faster than 30% of models)
Total response time: 13.80 seconds (faster than 19% of models)
Speed Comparison:
  Market median throughput: 46 tok/s (this model is 242% faster)
  Median TTFT: 0.42 s (this model is 134% slower)
  Throughput/Dollar: 142 tok/s per $/1M
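The throughput-per-dollar figure follows directly from the numbers above, as a quick sketch shows (the inputs are this model's reported throughput and blended price):

```python
# Throughput per dollar: output speed divided by blended price.
throughput_tok_s = 156      # tokens/sec, as reported above
blended_price_usd = 1.10    # $ per 1M tokens, as reported above

tok_s_per_dollar = throughput_tok_s / blended_price_usd
print(round(tok_s_per_dollar))  # 142
```

The percentage comparisons against the market medians are computed the same way, though they depend on the unrounded median values, so recomputing them from the rounded figures shown here can be off by a few points.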
Context Window: 262K tokens (larger than 75% of models)
Max Output: 66K tokens (25% of context)
Deployment: Multi-GPU (8x A100 / H100)
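The 8-GPU requirement is consistent with a back-of-envelope weight-memory estimate for a 122B-parameter model. The bytes-per-parameter figures below are standard dtype sizes; KV cache, activations, and runtime overhead are excluded, so this is a rough sketch, not a vendor-published requirement:

```python
# Rough weight-memory estimate for a 122B-parameter model.
# Only the weights are counted; KV cache and activations need extra headroom.
params_billions = 122  # from the model name
bytes_per_param = {"bf16": 2, "fp8": 1, "int4": 0.5}

for dtype, nbytes in bytes_per_param.items():
    print(f"{dtype}: ~{params_billions * nbytes:.0f} GB of weights")
```

An 8x A100/H100 80 GB node provides 640 GB of HBM, which fits the ~244 GB of BF16 weights with room left for KV cache and activations, whereas a single 80 GB GPU cannot hold the full model even at 4-bit precision plus overhead.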