DeepSeek V3.2 正式版震撼发布：推理能力比肩GPT-5，Agent能力全面进化

两个月前，我们发布了实验性的 DeepSeek-V3.2-ExpAn experimental version of the DeepSeek language model released for testing and feedback.，收到了众多用户的积极反馈和对比测试结果。令人振奋的是，V3.2-Exp 在所有测试场景中均未表现出比 V3.1-Terminus 更差的性能，这充分验证了 DSAA sparse attention mechanism used in DeepSeek models to improve efficiency. 稀疏注意力机制的有效性。感谢广大用户一直以来的支持与反馈，你们的参与为我们的持续创新注入了强大动力。

今天，我们正式发布两个重磅模型：DeepSeek-V3.2The underlying advanced architecture powering DeepSeek's current core language models. 和 DeepSeek-V3.2-SpecialeAn enhanced version of DeepSeek-V3.2 focused on extreme reasoning capabilities and long-form thinking.。官方网页端、移动应用和 API 均已更新为正式版 DeepSeek-V3.2The underlying advanced architecture powering DeepSeek's current core language models.，欢迎体验。Speciale 版本目前以临时 API 服务形式开放，供社区进行评测与研究。

新模型的技术报告已同步发布：查看完整技术报告

🚀 推理能力全球领先

DeepSeek-V3.2The underlying advanced architecture powering DeepSeek's current core language models. 旨在平衡推理能力与输出长度，特别适合日常使用场景，包括智能问答和通用 Agent 任务。在公开的推理基准测试中，DeepSeek-V3.2The underlying advanced architecture powering DeepSeek's current core language models. 的表现已达到 GPT-5A competing AI model series compared to Gemini 3.0, noted for differences in hallucination control. 水平，仅略低于 Gemini-3.0-ProA proprietary language model developed by Google used as a performance benchmark.。与 Kimi-K2-ThinkingA language model known for long output generation, used for comparison. 相比，V3.2 的输出长度大幅减少，显著降低了计算开销和用户等待时间。

DeepSeek-V3.2-SpecialeAn enhanced version of DeepSeek-V3.2 focused on extreme reasoning capabilities and long-form thinking. 则将开源模型的推理能力推向极致，探索模型性能的边界。作为 DeepSeek-V3.2The underlying advanced architecture powering DeepSeek's current core language models. 的长思考增强版，它融合了 DeepSeek-Math-V2A mathematical theorem proving model whose capabilities are integrated into DeepSeek-V3.2-Speciale. 的定理证明能力。该模型具备卓越的指令跟随、严谨的数学证明与逻辑验证能力，在主流推理基准测试上的性能表现可媲美 Gemini-3.0-ProA proprietary language model developed by Google used as a performance benchmark.。

更令人瞩目的是，V3.2-Speciale 模型在多项国际顶级竞赛中斩获金牌：

IMO 2025The International Mathematical Olympiad competition held in 2025.（国际数学奥林匹克）
CMO 2025The Chinese Mathematical Olympiad competition held in 2025.（中国数学奥林匹克）
ICPC World Finals 2025The International Collegiate Programming Contest World Finals held in 2025.（国际大学生程序设计竞赛全球总决赛）
IOI 2025The International Olympiad in Informatics competition held in 2025.（国际信息学奥林匹克）

其中，ICPC 与 IOI 的成绩分别达到了人类选手第二名与第十名的水平，展现了强大的推理与编程能力。

温馨提示：在高度复杂任务上，Speciale 模型表现大幅优于标准版本，但消耗的 Tokens 也显著更多，成本更高。目前，DeepSeek-V3.2-SpecialeAn enhanced version of DeepSeek-V3.2 focused on extreme reasoning capabilities and long-form thinking. 仅供研究使用，不支持工具调用，暂未针对日常对话与写作任务进行专项优化。

🤖 思考融入工具调用：Agent能力全面升级

不同于过往版本在思考模式下无法调用工具的局限，DeepSeek-V3.2The underlying advanced architecture powering DeepSeek's current core language models. 是我们推出的首个将思考融入工具使用的模型，同时支持思考模式与非思考模式的工具调用。我们创新性地提出了一种大规模 Agent 训练数据合成方法，构造了大量「难解答，易验证」的强化学习任务（1800+ 环境，85,000+ 复杂指令），大幅提高了模型的泛化能力。

如表2所示，DeepSeek-V3.2The underlying advanced architecture powering DeepSeek's current core language models. 模型在智能体评测中达到了当前开源模型的最高水平，大幅缩小了开源模型与闭源模型的差距。值得强调的是，V3.2 并没有针对这些测试集的工具进行特殊训练，因此我们相信，V3.2 在真实应用场景中能够展现出更强的泛化性能。

📦 开源模型下载

DeepSeek-V3.2The underlying advanced architecture powering DeepSeek's current core language models.

HuggingFace: https://huggingface.co/deepseek-ai/DeepSeek-V3.2
ModelScope: https://modelscope.cn/models/deepseek-ai/DeepSeek-V3.2

DeepSeek-V3.2-SpecialeAn enhanced version of DeepSeek-V3.2 focused on extreme reasoning capabilities and long-form thinking.

HuggingFace: https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale
ModelScope: https://modelscope.cn/models/deepseek-ai/DeepSeek-V3.2-Speciale

🌐 网页端、APP 与 API 更新

DeepSeek-V3.2The underlying advanced architecture powering DeepSeek's current core language models. 是我们当前正式提供服务的模型，官网网页、APP、API 模型均已由 DeepSeek-V3.2-ExpAn experimental version of the DeepSeek language model released for testing and feedback. 升级为正式版 DeepSeek-V3.2The underlying advanced architecture powering DeepSeek's current core language models.，使用方式保持不变。

同时，为了方便社区评测与研究，我们非正式部署了 DeepSeek-V3.2-SpecialeAn enhanced version of DeepSeek-V3.2 focused on extreme reasoning capabilities and long-form thinking. 的 API 服务。API 用户可以通过设置 base_url="https://api.deepseek.com/v3.2_speciale_expires_on_20251215" 访问该模型。该模型 API 价格不变，仅支持思考模式下的对话功能，不支持工具调用等功能，最大输出长度默认为 128K，服务支持时间截止至北京时间 2025年12月15日23:59。

🔧 思考模式下的工具调用

本次 API 更新支持了 DeepSeek-V3.2The underlying advanced architecture powering DeepSeek's current core language models. 思考模式下的工具调用能力。在思考模式下，模型能够经过多轮的思考 + 工具调用，最终给出更详尽准确的回答。

在回答问题的过程中，模型会进行多次思考与工具调用。用户需要回传思维链内容（reasoning_content）给 API，以便模型继续思考。当开始新的用户问题时，需要删除之前的思维链，保留其他内容发送给 API。

更详细的使用方法请参考 API 文档。

DeepSeek-V3.2The underlying advanced architecture powering DeepSeek's current core language models. 的思考模式还增加了对 Claude CodeAn AI coding assistant developed by Anthropic that provides code completion and explanations. 的支持，用户可以通过将模型名改为 deepseek-reasoner，或在 Claude CodeAn AI coding assistant developed by Anthropic that provides code completion and explanations. CLI 中按 Tab 键开启思考模式。但需要注意的是，思考模式未充分适配 ClineA component that uses non-standard tool calling, not fully compatible with DeepSeek-V3.2 thinking mode.、RooCodeA component that uses non-standard tool calling, not fully compatible with DeepSeek-V3.2 thinking mode. 等使用非标准工具调用的组件，建议用户在使用此类组件时继续使用非思考模式。

Data Analysis

特性	DeepSeek-V3.2The underlying advanced architecture powering DeepSeek's current core language models.	DeepSeek-V3.2-SpecialeAn enhanced version of DeepSeek-V3.2 focused on extreme reasoning capabilities and long-form thinking.
定位与目标	平衡推理能力与输出长度，适合日常智能问答和通用Agent任务。	探索性能边界，是V3.2的长思考增强版，专注于极致推理能力。
关键能力	1. 推理能力达到GPT-5A competing AI model series compared to Gemini 3.0, noted for differences in hallucination control.水平。 2. 首个将思考融入工具调用的DeepSeek模型。 3. 支持思考与非思考模式的工具调用。	1. 推理能力媲美Gemini-3.0-ProA proprietary language model developed by Google used as a performance benchmark.。 2. 融合DeepSeek-Math-V2A mathematical theorem proving model whose capabilities are integrated into DeepSeek-V3.2-Speciale.的定理证明能力。 3. 卓越的指令跟随、数学证明与逻辑验证。
性能亮点	在公开推理基准测试中略低于Gemini-3.0-ProA proprietary language model developed by Google used as a performance benchmark.；输出长度比Kimi-K2-ThinkingA language model known for long output generation, used for comparison.大幅减少，计算开销更低。	在IMO、CMO、ICPC、IOI等2025年顶级竞赛中斩获金牌，ICPC与IOI成绩达到人类选手第二名与第十名水平。
输出长度/成本	输出长度经过优化，旨在降低计算开销和用户等待时间。	在高度复杂任务上消耗Tokens显著更多，成本更高。
工具调用支持	支持工具调用（思考模式与非思考模式）。	不支持工具调用。
适用场景	日常使用、智能问答、通用Agent任务。	复杂推理、数学证明、逻辑验证、竞赛级问题研究。
可用性与部署	正式版，已更新至官方网页端、移动应用和API。	以临时API服务形式开放，供社区评测与研究，服务截止至2025年12月15日。
开源下载	已在HuggingFace和ModelScope平台发布。	已在HuggingFace和ModelScope平台发布。

Source/Note: 根据DeepSeek官方发布文本（2025年）整理，包含模型对比、性能描述及可用性信息。