DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v3.1) that maintains the model's original capabilities while addressing user-reported issues with language consistency and agent behaviour, further improving performance in coding and search agents. It is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes. It extends the DeepSeek-V3 base with a two-phase long-context training process, reaching up to 128K tokens, and uses FP8 microscaling for efficient inference. Reasoning behaviour can be controlled with the `reasoning` `enabled` boolean; [learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config). The model improves tool use, code generation, and reasoning efficiency, achieving performance comparable to DeepSeek-R1 on difficult benchmarks while responding more quickly. It supports structured tool calling, code agents, and search agents, making it suitable for research, coding, and agentic workflows.
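As a sketch of how the `reasoning` `enabled` boolean might be passed, the request body below follows the OpenRouter chat completions format; the model slug `deepseek/deepseek-v3.1-terminus` and the endpoint shown in the comment are assumptions, not confirmed by this page.

```python
import json

# Hypothetical request body for an OpenRouter-style chat completions call.
# The model slug "deepseek/deepseek-v3.1-terminus" is an assumption.
payload = {
    "model": "deepseek/deepseek-v3.1-terminus",
    "messages": [
        {"role": "user", "content": "Summarize FP8 microscaling in two sentences."}
    ],
    # Toggle thinking mode on or off via the `reasoning` `enabled` boolean.
    "reasoning": {"enabled": True},
}

body = json.dumps(payload)
# Send with any HTTP client, e.g.:
#   POST https://openrouter.ai/api/v1/chat/completions
#   Authorization: Bearer <OPENROUTER_API_KEY>
```

Setting `"enabled": False` instead requests the non-thinking mode of the same hybrid model.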
| Metric | Value | Rank | Percentile |
|---|---|---|---|
| Quality Index | 28.5 | 110th of 442 | Top 25% |
| Coding Index | 31.9 | 65th of 352 | Top 19% |
| Math Index | 53.7 | 132nd of 268 | Top 50% |
| Price/1M | $0.63 | 409th cheapest (102% above median) | Top 60% |
| Speed | 0 tok/s | | |
| TTFT | 0.00s | | |
| Context Window | 164K | 135th largest | Top 41% |
Pricing (per 1M tokens):

| Type | Price |
|---|---|
| Input | $0.34 |
| Output | $1.50 |
| Blended | $0.63 |

Cheaper than 40% of models. Median price is $0.31/1M tokens.
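The blended figure is consistent with a 3:1 input-to-output token weighting of the two listed rates; the quick check below makes that explicit (the 3:1 mix is an assumption, not stated on this page).

```python
input_price = 0.34   # $ per 1M input tokens
output_price = 1.50  # $ per 1M output tokens

# Assumed 3:1 input:output token mix behind the blended rate.
blended = (3 * input_price + 1 * output_price) / 4
print(f"${blended:.2f}")  # -> $0.63
```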
Estimated cost:

| Period | Cost |
|---|---|
| Daily | $0.63 |
| Monthly | $18.78 |
Speed:

| Metric | Value | Comparison |
|---|---|---|
| Throughput | 0 tok/s | Faster than 0% of models; market median 46 tok/s (100% slower) |
| TTFT | 0.00s | Faster than 61% of models; median TTFT 0.42s (100% faster) |
| Throughput/Dollar | 0 tok/s per $/1M | |
Context:

| Metric | Value | Comparison |
|---|---|---|
| Context Window | 164K tokens | Larger than 59% of models |
| Max Output | 66K tokens | 40% of context |
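One practical consequence of these two limits: a request's prompt plus its requested output must fit in the 164K context, and the output alone is capped at 66K. A minimal budget check under those numbers (the helper name and the exact token counts used in the examples are hypothetical):

```python
CONTEXT_WINDOW = 164_000  # total context, tokens (~164K)
MAX_OUTPUT = 66_000       # max output tokens (~66K, ~40% of context)

def fits_budget(prompt_tokens: int, requested_output: int) -> bool:
    """Return True if the request fits both the context and output caps."""
    return (requested_output <= MAX_OUTPUT
            and prompt_tokens + requested_output <= CONTEXT_WINDOW)

print(fits_budget(100_000, 60_000))  # 160K total fits -> True
print(fits_budget(120_000, 60_000))  # 180K exceeds context -> False
```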