How Is DeepSeek Disrupting the AI Industry? Open Source Large Models Challenge U.S. Giants in 2026

Introduction

In the world of AI, there has been a prevailing notion that developing leading-edge large language models requires significant technical and financial resources. That's one of the main reasons why the U.S. government pledged to support the $500 billion Stargate Project announced by President Donald Trump.

But Chinese AI development firm DeepSeek has disrupted that notion. On Jan. 20, 2025, DeepSeek released its R1 LLM at a fraction of the cost that other vendors incurred in their own developments. DeepSeek is also providing its R1 models under an open source license, enabling free use.

Within days of its release, the DeepSeek AI assistant -- a mobile app that provides a chatbot interface for DeepSeek-R1 -- hit the top of Apple's App Store chart, outranking OpenAI's ChatGPT mobile app. The meteoric rise of DeepSeek in terms of usage and popularity triggered a stock market sell-off on Jan. 27, 2025, as investors cast doubt on the value of large AI vendors based in the U.S., including Nvidia. Microsoft, Meta Platforms, Oracle, Broadcom and other tech giants also saw significant drops as investors reassessed AI valuations.

What is DeepSeek?

DeepSeek is an AI development firm based in Hangzhou, China. The company was founded in May 2023 by Liang Wenfeng, a graduate of Zhejiang University. Liang also co-founded High-Flyer, a China-based quantitative hedge fund that owns DeepSeek. Currently, DeepSeek operates as an independent AI research lab under the umbrella of High-Flyer. Neither the full amount of DeepSeek's funding nor its valuation has been publicly disclosed.

DeepSeek focuses on developing open source LLMs. The company released its first model in November 2023 and has since iterated multiple times on its core LLM and built out several variations. However, it wasn't until January 2025, after the release of its R1 reasoning model, that the company became globally famous.

The company provides multiple services for its models, including a web interface, mobile application and API access.

OpenAI vs. DeepSeek: A Comparative Analysis

DeepSeek represents the latest challenge to OpenAI, which established itself as an industry leader with the debut of ChatGPT in 2022. OpenAI has helped push the generative AI industry forward with its GPT family of models, as well as its o-series reasoning models, which include o1, o3 and o4-mini.

While both companies develop generative AI LLMs, they take different approaches.

Key Differences

Aspect	OpenAI	DeepSeek
Founding year	2015	2023
Headquarters	San Francisco, Calif.	Hangzhou, China
Development focus	Broad AI capabilities	Efficient, open source models
Key models	GPT-4o, o1, GPT-5	DeepSeek-V3, DeepSeek-R1, DeepSeek-V3.1
Specialized models	DALL-E (image generation), Whisper (speech recognition)	DeepSeek Coder (coding), Janus Pro (vision model)
API pricing (per million tokens)	GPT-5: $1.25 (input), $10 (output)	DeepSeek-V3.1: $0.56 (input), $1.68 (output)
Open source policy	Limited	Mostly open source
Training approach	Supervised and instruction-based fine-tuning	Reinforcement learning
Development cost	Hundreds of millions of dollars for o1 (estimated)	Less than $6 million for DeepSeek-R1, according to the company
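
The API pricing row above can be turned into a per-request cost estimate. The following is a minimal sketch that uses only the per-million-token prices listed in the table; the function name and the example workload (2,000 input tokens, 1,000 output tokens) are assumptions chosen for illustration, and real bills may differ because of caching discounts or price changes.

```python
# Prices in USD per million tokens, copied from the comparison table above.
PRICES = {
    "GPT-5": {"input": 1.25, "output": 10.00},
    "DeepSeek-V3.1": {"input": 0.56, "output": 1.68},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed per-million-token rates."""
    rates = PRICES[model]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2,000 prompt tokens and 1,000 completion tokens.
    for name in PRICES:
        print(f"{name}: ${request_cost(name, 2_000, 1_000):.4f}")
```

For this hypothetical workload, the listed rates work out to about $0.0125 per request for GPT-5 and about $0.0028 for DeepSeek-V3.1, a difference of roughly 4.5 times.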

Training Innovations in DeepSeek

DeepSeek trains its R1 models differently than OpenAI trains its models. The training took less time, used fewer AI accelerators and cost less. DeepSeek's aim is to achieve artificial general intelligence, and the company's advancements in reasoning capabilities represent significant progress in AI development.

In a research paper, DeepSeek outlines the multiple innovations it developed as part of the R1 model, including the following:

Reinforcement learning. DeepSeek used a large-scale reinforcement learning approach focused on reasoning tasks.

Reward engineering. Researchers developed a rule-based reward system for the model that outperforms neural reward models that are more commonly used. Reward engineering is the process of designing the incentive system that guides an AI model's learning during training.

Distillation. Using efficient knowledge transfer techniques, DeepSeek researchers successfully compressed capabilities into models as small as 1.5 billion parameters.

Emergent behavior network. DeepSeek's emergent behavior innovation is the discovery that complex reasoning patterns can develop naturally through reinforcement learning without explicitly programming them.

Hybrid architecture training. DeepSeek V3.1 introduced an evolved training methodology, where a single model is trained to support both fast inference and deep reasoning modes. This hybrid approach required developing new chat templates and tokenization strategies, including the introduction of specific thinking tokens (<think> and </think>) that control the model's reasoning behavior.
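
To make the reward engineering and thinking-token items above more concrete, here is a minimal sketch of what a rule-based reward could look like. It assumes a completion of the form `<think>reasoning</think>answer` and combines a format check with an exact-match accuracy check. The function names, the exact-match comparison and the `format_weight` value are hypothetical choices for illustration, not DeepSeek's actual reward code.

```python
import re

# A completion is expected to be one <think>...</think> block followed by an answer.
THINK_PATTERN = re.compile(r"^<think>(.*?)</think>(.*)$", re.DOTALL)


def format_reward(completion: str) -> float:
    """Return 1.0 if non-empty reasoning sits inside <think> tags before a non-empty answer."""
    match = THINK_PATTERN.match(completion.strip())
    if match is None:
        return 0.0
    reasoning, answer = match.groups()
    return 1.0 if reasoning.strip() and answer.strip() else 0.0


def accuracy_reward(completion: str, reference: str) -> float:
    """Return 1.0 if the text after </think> exactly equals the reference answer."""
    match = THINK_PATTERN.match(completion.strip())
    if match is None:
        return 0.0
    return 1.0 if match.group(2).strip() == reference.strip() else 0.0


def total_reward(completion: str, reference: str, format_weight: float = 0.5) -> float:
    """Combine both rules; format_weight is an assumed value, not a published one."""
    return accuracy_reward(completion, reference) + format_weight * format_reward(completion)


if __name__ == "__main__":
    print(total_reward("<think>2 + 2 = 4</think>4", "4"))
    print(total_reward("<think>a guess</think>5", "4"))
    print(total_reward("4", "4"))
```

Because both checks are deterministic rules rather than a learned neural reward model, they are cheap to evaluate and cannot be gamed by producing text that merely looks plausible to a learned scorer. The trade-off is that rules like these only work for tasks whose answers can be verified mechanically.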

DeepSeek Large Language Models: Evolution Timeline

Since the company was created in 2023, DeepSeek has released a series of generative AI models. With each new generation, the company has worked to advance both the capabilities and performance of its models:

DeepSeek Coder. Released in November 2023, this is the company's first open source model designed specifically for coding-related tasks.

DeepSeek LLM. Released in December 2023, this is the first version of the company's general-purpose model.

DeepSeek-V2. Released in May 2024, this is the second version of the company's LLM, focusing on strong performance and lower training costs.

DeepSeek-Coder-V2. Released in July 2024, this is a 236 billion-parameter model offering a context window of 128,000 tokens, designed for complex coding challenges.

DeepSeek-V3. Released in December 2024, DeepSeek-V3 uses a mixture-of-experts architecture, capable of handling a range of tasks. The model has 671 billion parameters with a context length of 128,000.

DeepSeek-R1. Released in January 2025, this model is based on DeepSeek-V3 and focuses on advanced reasoning tasks. It competes directly with OpenAI's o1 model in performance while maintaining a significantly lower cost structure. Like DeepSeek-V3, the model has 671 billion parameters with a context length of 128,000.

DeepSeek-R1-0528. Released in May 2025, the R1-0528 model is an updated version of the original R1 model. The model now supports system prompts, JSON output and function calling, making it more suitable for agentic AI use cases. DeepSeek also claims it is more accurate, with lower hallucination rates than the prior release. R1-0528 also benefits from greater reasoning depth, averaging 23,000 tokens per question vs. 12,000 in the previous version.

DeepSeek-R1-0528-Qwen3-8B. A smaller, distilled version based on Alibaba's Qwen3 model that is intended for systems with limited computational resources. According to DeepSeek, this 8 billion parameter model matches the performance of the larger Qwen3-235B model.

Janus-Pro-7B. Released in January 2025, Janus-Pro-7B is a vision model that can understand and generate images.

DeepSeek-V3.1. Released in August 2025 as a hybrid model with dual-mode functionality, DeepSeek-V3.1 supports both thinking mode and non-thinking mode within a single model. The model keeps the 671 billion parameters of DeepSeek-V3, was extended through continued pretraining on roughly 840 billion additional tokens and supports a 128K context length. The model also supports enhanced tool calling and agent capabilities through post-training optimization.
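
The timeline above describes DeepSeek-V3 as a mixture-of-experts model. As a rough illustration of the general idea, the sketch below routes an input to only the top-k experts by gate score and combines their outputs. It is a generic toy example: the expert functions, gate scores and `k=2` are hypothetical, and it does not reproduce DeepSeek's actual router.

```python
import math


def softmax(scores):
    """Convert raw gate scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k):
    """Pick the k highest-scoring experts and renormalize their gate weights."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}


def moe_forward(x, experts, gate_scores, k=2):
    """Return the gate-weighted sum of the outputs of only the selected experts."""
    weights = route_top_k(gate_scores, k)
    return sum(w * experts[i](x) for i, w in weights.items())


if __name__ == "__main__":
    # Hypothetical toy experts: each one simply scales its input.
    experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
    gate_scores = [0.1, 2.0, 0.3, 1.5]
    print(route_top_k(gate_scores, k=2))
    print(moe_forward(1.0, experts, gate_scores, k=2))
```

Only the selected experts run for a given input, which is why a mixture-of-experts model can have a very large total parameter count while doing far less computation per token than a dense model of the same size.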

Why DeepSeek is Raising Alarms in the U.S.

The DeepSeek-R1 release generated considerable hype, but it also raised alarms in the U.S. and triggered a sell-off in tech stocks. On Monday, Jan. 27, 2025, the Nasdaq Composite dropped by 3.4% at market opening, with Nvidia declining by 17% and losing approximately $600 billion in market capitalization.

DeepSeek is raising alarms in the U.S. for several reasons, including the following:

Cost disruption. DeepSeek claims to have developed its R1 model for less than $6 million. The low-cost development threatens the business model of U.S. tech companies that have invested billions in AI. DeepSeek is also cheaper for users than OpenAI.

Technical achievement despite restrictions. The U.S. restricts the export of its highest-performance AI accelerators and GPUs to China. Despite that, DeepSeek has demonstrated that leading-edge AI development is possible without access to the most advanced U.S. technology.

Business model threat. In contrast with OpenAI's proprietary technology, DeepSeek's models are open source and free to use, challenging the revenue model of U.S. companies that charge monthly fees for AI services.

Geopolitical concerns. Being based in China, DeepSeek challenges U.S. technological dominance in AI. Tech investor Marc Andreessen called it AI's "Sputnik moment," comparing it to the Soviet Union's space race breakthrough in the 1950s.

Global Regulatory Challenges and Bans

Countries and organizations around the world have already banned DeepSeek, citing ethics, privacy and security issues within the company. Because all user data is stored in China, the biggest concern is the potential for a data leak to the Chinese government. The LLM was also trained with a Chinese worldview -- a potential problem due to the country's authoritarian government.

Places where DeepSeek is banned include the following:

Australian government agencies.

India's central government.

Italy.

NASA.

South Korea's industry ministry.

Taiwan government agencies.

Texas state government.

U.S. Congress.

U.S. Navy.

The Pentagon.

There is also the potential for a broad ban in the European Union (EU). Multiple EU countries, including Italy and Germany, have already taken action. Italy banned DeepSeek in January 2025 based on data privacy concerns. Germany followed suit in June 2025, when the country's data protection commissioner, Meike Kamp, formally asked Apple and Google to consider blocking the DeepSeek app from their German app stores. The concern is that DeepSeek is transferring user data to China without adequate protection under EU data protection rules. DeepSeek potentially infringes on multiple EU regulations, including the General Data Protection Regulation and the EU AI Act.

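Frequently Asked Questions (FAQ)

What Kind of Company is DeepSeek?

DeepSeek is an AI development company headquartered in Hangzhou, China. Founded in May 2023, it focuses on developing low-cost, open source large language models and operates under the quantitative hedge fund High-Flyer.

Why is DeepSeek Able to Challenge OpenAI?

DeepSeek significantly lowered development costs through innovative training methods and offers R1 and other models as open source. After its release in 2025, its app quickly overtook ChatGPT to top the app store chart, prompting investors to reassess the value of U.S. AI giants.

What International Challenges Does DeepSeek Face?

DeepSeek faces bans in several countries over data privacy issues, and its rise has raised geopolitical concerns. The U.S. and other countries are also raising regulatory challenges over the spread of its technology and its competitive position in the market.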