Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series, optimized for fast, stable responses without “thinking” traces. It targets complex tasks across reasoning, code generation, knowledge QA, and multilingual use, while remaining robust on alignment and formatting. Compared with prior Qwen3 instruct variants, it focuses on higher throughput and stability on ultra-long inputs and multi-turn dialogues, which makes it well suited to RAG, tool use, and agentic workflows that need consistent final answers rather than visible chain-of-thought. The model uses scaling-efficient training and decoding to improve parameter efficiency and inference speed. On a broad set of public benchmarks it matches or approaches larger Qwen3 systems in several categories and outperforms earlier mid-sized baselines. It is best used as a general assistant, code helper, and long-context task solver in production settings where deterministic, instruction-following outputs are preferred.
Quality Index
20.1
185th of 442
Top 42%
Code Index
15.3
191st of 352
Top 54%
Math Index
66.3
103rd of 268
Top 39%
Price/1M
$0.88
465th cheapest
182% above the median
Top 69%
Speed
172 tok/s
Top 10%
TTFT
0.97s
Context Window
262K
61st largest
Top 25%
Input
$0.50
per 1M tokens
Output
$2.00
per 1M tokens
Blended
$0.88
per 1M tokens
Cheaper than 31% of models. The median price is $0.31/1M tokens.
Daily
$0.88
Monthly
$26.25
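The blended price and the cost estimates above can be reproduced with a little arithmetic. The sketch below assumes a 3:1 input-to-output token weighting and a workload of 1M tokens per day over a 30-day month. Neither assumption is stated on the page, but together they reproduce the listed $0.88 (rounded from $0.875) and $26.25 figures. The function names are illustrative.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens (the 3:1 weighting is an assumption)."""
    total = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total


def monthly_cost(price_per_million: float, tokens_per_day: float = 1_000_000,
                 days: int = 30) -> float:
    """Estimated monthly spend for an assumed constant daily token volume."""
    return price_per_million * (tokens_per_day / 1_000_000) * days


blended = blended_price(0.50, 2.00)  # listed input and output prices
print(f"blended: ${blended:.3f} per 1M tokens")
print(f"monthly: ${monthly_cost(blended):.2f}")
```

Workloads with a different input-to-output mix can pass their own weights. An output-heavy workload will land closer to the $2.00 output price.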
172
tokens/sec
Faster than 90% of models
0.97
seconds
Faster than 31% of models
0.97
seconds
Faster than 39% of models
Market Median
46 tok/s
277% faster
Median TTFT
0.42s
133% slower
Throughput per Dollar
197
tok/s per $/1M
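The throughput-per-dollar figure appears to be output speed divided by the blended price. This is a minimal sketch assuming that definition. It uses the unrounded blended price of $0.875 per 1M tokens, which itself comes from an assumed 3:1 input-to-output weighting, and the function name is illustrative.

```python
def throughput_per_dollar(tokens_per_sec: float, price_per_million: float) -> float:
    """Output speed (tok/s) divided by the blended price in $ per 1M tokens."""
    if price_per_million <= 0:
        raise ValueError("price_per_million must be positive")
    return tokens_per_sec / price_per_million


value = throughput_per_dollar(172, 0.875)
print(round(value))  # 197, matching the listed figure
```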
Speed Comparison
Context Window
262K
tokens
Larger than 75% of models
885.2K
950
Multi-GPU
8x A100 / H100