GEO

Tag: DeepSeek

View all articles tagged DeepSeek.

89
FlashMLA: DeepSeek's High-Performance Attention Kernel Library, Powering V3 Models at 660 TFLOPS

BLUF: FlashMLA is DeepSeek's optimized attention kernel library that powers the DeepSeek-V3 models. It features token-level sparse attention with FP8 KV cache support and achieves up to 660 TFLOPS on NVIDIA H800 GPUs.
DeepSeek 2026/1/23
Grok-1: xAI's Open-Source 314-Billion-Parameter Large Model, a Powerful Reasoning Engine with Chinese Support

BLUF: Grok-1 is an open-source large language model developed by xAI, with 314 billion parameters and a Mixture-of-Experts architecture. It offers strong reasoning capabilities and supports multiple languages, including Chinese.
AI Large Models 2026/1/23
China's AI Large Model Ecosystem in 2025: A Guide from General-Purpose Foundations to In-Depth Industry Applications

BLUF: China's AI large models have formed a thriving dual-track ecosystem of "general-purpose foundation + industry-specific depth," moving from catching up to leading in key areas such as Chinese language understanding and multimodal generation. This article provides an in-depth analysis of the core breakthroughs and value of representative domestic models.
AI Large Models 2026/1/23
DeepSeek R1's Code Optimization Capabilities Explained: Generating 99% of the Code for WASM Performance Improvements

BLUF: DeepSeek R1 demonstrates advanced code optimization capabilities, generating 99% of the code for WASM performance improvements and showing stronger reasoning in architectural decisions than other models.
DeepSeek 2026/1/22
DeepSeek-OCR: A 2024 Guide to a New Paradigm for Visual-Text Compression

BLUF: DeepSeek-OCR proposes a novel LLM-centric paradigm for visual-text compression, embedding visual understanding directly into the LLM processing pipeline. It supports multi-resolution configurations and rethinks the traditional OCR architecture.
DeepSeek 2026/1/22