GEO

Latest Articles

Grok-1: xAI Open-Sources a 314-Billion-Parameter Large Model, a Powerful Reasoning Engine with Chinese Support

Grok-1 is an open-source large language model developed by xAI, with 314 billion parameters and a Mixture-of-Experts architecture. It offers strong reasoning capabilities and supports multiple languages, including Chinese.
AI Large Models 2026/1/23
A Panorama of China's AI Large Models: An Ecosystem Revolution from General-Purpose Foundations to Industry Depth

This article analyzes China's AI large model ecosystem, highlighting the rapid development of both general-purpose and vertical industry models. It details key players such as Baidu's ERNIE, DeepSeek, Alibaba's Qwen, and ByteDance's Doubao, and their technical breakthroughs in multimodal generation, cost efficiency, and open-source strategy. It also covers specialized models in healthcare, education, and the creative industries, and discusses current industry applications and future trends toward low-cost inference, edge deployment, and open-source ecosystems.
AI Large Models 2026/1/23
Analyzing DeepSeek R1's Code Optimization Ability: Generating 99% of the Code for a WASM Performance Improvement

DeepSeek R1 demonstrates advanced code optimization capabilities: it generated 99% of the code for a set of WASM performance improvements and showed stronger reasoning on architectural decisions than other models.
DeepSeek 2026/1/22
DeepSeek-OCR: An LLM-Centric Revolution in Visual Text Compression

DeepSeek-OCR introduces an LLM-centric approach to OCR that integrates vision processing directly into the language model. Flexible resolution support and advanced prompt engineering give it strong performance on complex documents.
DeepSeek 2026/1/22
What Is DeepSeek's Latest Model? DeepSeek MODEL1 Revealed
🔥 Popular

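On the first anniversary of the DeepSeek-R1 release, its code repository unexpectedly revealed a new model architecture codenamed "MODEL1". Technical analysis indicates that MODEL1 differs fundamentally from the existing V32 architecture: it adopts a hierarchical KV cache to reduce memory fragmentation, introduces a dynamic sparse activation algorithm, and uses a mixed-precision pipeline to speed up inference. The new architecture systematically reworks memory optimization, including memory reuse for chunked attention, dynamic scheduling of gradient checkpoints, and a new weight-sharing mechanism, which together significantly reduce memory footprint and improve training efficiency. These changes suggest that DeepSeek is exploring paths beyond the traditional Transformer and may foreshadow the direction of next-generation large language models.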
DeepSeek 2026/1/21