gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture with 3.6B active parameters per forward pass, optimized for lower-latency inference and deployability on consumer or single-GPU hardware. The model is trained in OpenAI’s Harmony response format and supports reasoning level configuration, fine-tuning, and agentic capabilities including function calling, tool use, and structured outputs.
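Since the model expects prompts in the Harmony format, a minimal sketch of that layout may help; this reproduces the role/channel/special-token conventions from the openai-harmony documentation and should be checked against the official spec before use:

```
<|start|>system<|message|>You are a helpful assistant.
Reasoning: high<|end|>
<|start|>user<|message|>What is 2 + 2?<|end|>
<|start|>assistant<|channel|>analysis<|message|>Trivial arithmetic.<|end|>
<|start|>assistant<|channel|>final<|message|>4<|return|>
```

Each turn is wrapped in `<|start|>…<|end|>` tokens; assistant output is routed through channels (`analysis` for chain-of-thought, `final` for the user-visible answer), and the reasoning level is set in the system message.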
Quality Index: 24.5 (144th of 442, Top 33%)
Coding Index: 18.5 (158th of 352, Top 45%)
Math Index: 89.3 (29th of 268, Top 12%)
Price/1M: $0.09 (222nd cheapest, 70% below median, Top 33%)
Speed: 304 tok/s (Top 2%)
TTFT: 0.45s
Context Window: 131K (145th largest, Top 63%)
Pricing
Input: $0.06 per 1M tokens
Output: $0.20 per 1M tokens
Blended: $0.09 per 1M tokens
Cheaper than 67% of models; median price is $0.31/1M tokens.
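The blended figure is consistent with the common 3:1 input:output token weighting used for blended-price listings; a minimal sketch (the 3:1 mix is an assumption, not stated on the page):

```python
# Blended price per 1M tokens, assuming a 3:1 input:output token mix
# (a common convention for blended-price listings; the weighting is
# an assumption, not confirmed by the page).
input_price = 0.06   # $ per 1M input tokens
output_price = 0.20  # $ per 1M output tokens

blended = (3 * input_price + 1 * output_price) / 4
print(f"${blended:.3f} per 1M tokens")  # $0.095, matching the listed $0.09 after rounding
```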
Estimated Cost
Daily: $0.09
Monthly: $2.82
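The daily and monthly figures are consistent with roughly 1M blended tokens per day priced at an unrounded blended rate near $0.094/1M; both quantities are assumptions chosen to reproduce the listed numbers:

```python
# Rough reconstruction of the daily/monthly cost figures. Assumptions:
# ~1M blended tokens per day, and an unrounded blended rate of about
# $0.094 per 1M tokens (neither is stated on the page).
blended_rate = 0.094   # $ per 1M tokens (assumption)
daily_tokens_m = 1.0   # millions of blended tokens per day (assumption)

daily = blended_rate * daily_tokens_m
monthly = daily * 30

print(round(daily, 2), round(monthly, 2))  # 0.09 2.82
```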
Speed
Throughput: 304 tokens/sec (faster than 98% of models)
Time to first token: 0.45 seconds (faster than 48% of models)
Response time: 7.02 seconds (faster than 23% of models)
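The 7.02 s figure is consistent with a simple latency model, TTFT plus output tokens divided by throughput, for a roughly 2,000-token response; the workload length is an assumption chosen to match, not stated by the page:

```python
# End-to-end latency model: TTFT + output_tokens / throughput.
# The ~2,000-token output length is an assumption that reproduces the
# listed 7.02 s; the page does not state the benchmark workload.
ttft = 0.45          # seconds to first token
throughput = 304.0   # tokens per second
output_tokens = 2000

total = ttft + output_tokens / throughput
print(f"{total:.2f} s")  # 7.03 s, within rounding of the listed 7.02 s
```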
Speed Comparison
Market Median: 46 tok/s (this model is 567% faster)
Median TTFT: 0.42s (this model is 6% slower)
Throughput/Dollar: 3238 tok/s per $/1M
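The throughput-per-dollar ratio is simply throughput divided by the blended price; the listed 3238 appears to use an unrounded blended price near $0.094 rather than the displayed $0.09 (the unrounded rate is an assumption):

```python
# Throughput per dollar: tokens/sec divided by blended $ per 1M tokens.
# Using the displayed $0.09 would give ~3378, so the listed 3238 was
# likely computed from an unrounded blended price (assumed ~$0.094).
throughput = 304.0
blended = 0.094  # unrounded blended price per 1M tokens (assumption)

print(round(throughput / blended))  # 3234, close to the listed 3238
```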
Context Window: 131K tokens (larger than 37% of models)
Max Output: 131K tokens (100% of context)
7.1M
4.5K
Memory: 24-48 GB (A6000 / M3 Ultra)
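The 24-48 GB hardware band is consistent with a weights-only estimate at 8- to 16-bit precision for 21B parameters; a rough sketch (the precision choices are assumptions, and real deployments also need KV-cache and activation headroom):

```python
# Back-of-envelope weight-memory estimate for a 21B-parameter model at
# common precisions. KV cache and activation overhead are not included;
# these are rough figures, not vendor-published requirements.
PARAMS = 21e9

estimates = {}
for name, bytes_per_param in [("bf16", 2.0), ("int8", 1.0), ("mxfp4", 0.5)]:
    estimates[name] = PARAMS * bytes_per_param / 1e9  # gigabytes
    print(f"{name}: ~{estimates[name]:.1f} GB")
```

At bf16 the weights alone are ~42 GB (the top of the listed band), while 4-bit quantization (the MXFP4 format gpt-oss ships in) brings them near ~10.5 GB, which is why single-GPU deployment is feasible.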