Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by Meta, activating 17 billion parameters out of a total of 109B. It accepts native multimodal input (text and image) and produces text and code across 12 supported languages. Designed for assistant-style interaction and visual reasoning, Scout has 16 experts, routing each token to a single expert alongside a shared expert, and features a context length of 10 million tokens, with a training corpus of ~40 trillion tokens. Built for high efficiency and local or commercial deployment, Llama 4 Scout incorporates early fusion for seamless modality integration. It is instruction-tuned for multilingual chat, captioning, and image-understanding tasks. Released under the Llama 4 Community License, it was trained on data up to August 2024 and launched publicly on April 5, 2025.
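The active/total parameter split follows from the MoE design: all 16 experts' weights count toward the 109B total, but each token only executes the shared expert plus its one routed expert, giving ~17B active parameters. Below is a minimal sketch of that top-1 routing pattern in PyTorch; the dimensions and routing details are illustrative stand-ins, not Scout's actual architecture.

```python
import torch
import torch.nn as nn

class MoELayer(nn.Module):
    """Illustrative top-1 MoE layer with a shared expert (not Scout's real config)."""

    def __init__(self, d_model: int = 512, d_ff: int = 2048, n_experts: int = 16):
        super().__init__()

        def make_ffn() -> nn.Sequential:
            return nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))

        self.router = nn.Linear(d_model, n_experts)  # scores each token against each expert
        self.experts = nn.ModuleList(make_ffn() for _ in range(n_experts))
        self.shared = make_ffn()  # shared expert: runs for every token

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (n_tokens, d_model)
        weight, idx = self.router(x).softmax(dim=-1).max(dim=-1)  # top-1 routed expert
        out = self.shared(x)
        for e, expert in enumerate(self.experts):
            mask = idx == e  # tokens routed to expert e; the other 15 experts stay idle
            if mask.any():
                out[mask] = out[mask] + weight[mask].unsqueeze(1) * expert(x[mask])
        return out
```

Every expert FFN exists in memory (total parameters), but per token only two FFN paths actually run (active parameters), which is what makes MoE inference cheaper than a dense model of the same total size.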
Quality Index: 13.5 (297th of 442; top 68%)
Coding Index: 6.7 (296th of 352; top 85%)
Math Index: 14.0 (220th of 268; top 83%)
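The "Top N%" labels follow directly from the ranks. Assuming the percentage is rank divided by number of models ranked, rounded up (an inference from the numbers, not a documented formula), all three figures reproduce:

```python
import math

def top_percent(rank: int, total: int) -> int:
    """Percentile bucket implied by a leaderboard rank (assumed formula: ceil)."""
    return math.ceil(100 * rank / total)

print(top_percent(297, 442))  # 68 -> "Top 68%" (Quality Index)
print(top_percent(296, 352))  # 85 -> "Top 85%" (Coding Index)
print(top_percent(220, 268))  # 83 -> "Top 83%" (Math Index)
```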
Price: $0.29 per 1M tokens, blended (328th cheapest; 6% below median; top 48%)
Speed: 129 tok/s (top 20%)
TTFT: 0.45 s
Context window: 328K tokens (58th largest; top 16%)
Input: $0.17 per 1M tokens
Output: $0.66 per 1M tokens
Blended: $0.29 per 1M tokens (cheaper than 52% of models; median price is $0.31/1M tokens)
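The blended figure is consistent with a 3:1 input:output token weighting; that ratio is an assumption (a common convention on pricing pages), not something stated here:

```python
# Blended-price check. The 3:1 input:output token ratio is an assumption
# (a common convention); the two prices are taken from the lines above.
input_price = 0.17   # $ per 1M input tokens
output_price = 0.66  # $ per 1M output tokens

blended = (3 * input_price + 1 * output_price) / 4
print(f"${blended:.4f} per 1M tokens")  # $0.2925 -> rounds to the listed $0.29
```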
Daily cost: $0.29
Monthly cost: $8.76
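The monthly figure is roughly daily × 30; assuming an unrounded daily cost near $0.292 (inferred, not shown on the page), the numbers line up exactly:

```python
# Hypothetical consistency check: monthly ≈ daily × 30.
# The $0.292 unrounded daily cost is inferred; the page displays $0.29.
daily = 0.292
print(f"${daily * 30:.2f} per month")  # $8.76, matching the listed figure
```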
Throughput: 129 tokens/sec (faster than 80% of models)
TTFT: 0.45 seconds (faster than roughly half of models)
Market median speed: 46 tok/s (this model is 183% faster)
Median TTFT: 0.42 s (this model is 7% slower)
Throughput per dollar: 442 tok/s per $/1M

[Chart: speed comparison]
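Both comparison figures can be sanity-checked from values already on this page; the page's 183% and 442 presumably come from unrounded internal values, so the rounded inputs land slightly lower:

```python
# Consistency checks using the rounded figures shown on this page.
speed, median_speed = 129, 46  # tok/s
blended = 0.2925               # $ per 1M tokens (3:1 blend computed above)

print(f"{(speed - median_speed) / median_speed:.0%} faster")  # 180% (page: 183%)
print(f"{speed / blended:.0f} tok/s per $/1M")                # 441 (page: 442)
```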
Context window: 328K tokens (larger than 84% of models); the model natively supports up to 10M tokens (see overview), so the figure listed here presumably reflects the serving configuration measured.
Max output: 16K tokens (~5% of the 328K context)