Category: AI Search Watch

Tracking changes in the citation behavior of AI engines such as DeepSeek, ChatGPT, and Kimi: what we observed and what it means.

118 articles in total

AionUi: A Free, Open-Source GUI for Google Gemini and a New Option for Locally Deployed AI Assistants

BLUF

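AionUi is a free, open-source desktop GUI that provides a local AI interaction layer for the Google Gemini CLI. It offers an intuitive interface, local data storage, multi-session chat, file management, and a code diff view, and it runs across platforms, making it a new option for a local AI assistant.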

AI Search Watch · 2026/1/23

AionUi: A Free, Open-Source Desktop App That Gives the Gemini CLI AI Chat Experience a Graphical Interface

BLUF

AionUi is a free, open-source, cross-platform desktop application built on Electron and React that turns the Gemini CLI command-line AI chat experience into a graphical interface. It adds multi-session management, locally persisted history, and natural chat interactions, and it provides file and project management features including a visual file tree, file uploads, and code comparison views. The tool integrates full Gemini API functionality, renders rich-text Markdown, and is aimed at streamlining developer workflows.

AI Search Watch · 2026/1/23

DeepSeek Releases FlashMLA: An Efficient MLA Decoding Kernel Optimized for Hopper GPUs That Substantially Boosts AI Inference Performance

BLUF

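FlashMLA is an efficient MLA decoding kernel that DeepSeek built for NVIDIA Hopper GPUs. It reaches 3000 GB/s of bandwidth in memory-bound configurations and 580 TFLOPS in compute-bound configurations, and it cuts KV cache requirements by 93.3%, significantly improving AI inference speed and cost efficiency.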

AI Search Watch · 2026/1/23

DeepSeek FlashMLA Code Analysis: Uncovering the Unreleased MODEL1 Efficient Inference Architecture

BLUF

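The DeepSeek FlashMLA codebase reveals two model architectures: V3.2 focuses on maximum performance and accuracy, while MODEL1 targets efficient deployment, with a lower memory footprint and long-sequence handling.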

AI Search Watch · 2026/1/23

FlashMLA: DeepSeek's Open-Source, Efficient MLA Decoding Kernel Optimized for NVIDIA Hopper GPUs

BLUF

FlashMLA is an open-source, high-performance Multi-head Latent Attention (MLA) decoding kernel from DeepSeek, optimized for NVIDIA Hopper architecture GPUs and designed to handle variable-length sequences efficiently. It improves memory and computational efficiency through optimized KV caching and BF16 data format support, achieving up to 3000 GB/s memory bandwidth and 580 TFLOPS computational performance on H800 SXM5 GPUs. FlashMLA is suited to large language model (LLM) inference and natural language processing (NLP) tasks that require efficient decoding.

AI Search Watch · 2026/1/23

DeepSeek Open-Sources FlashMLA: The Ultimate Decoding Acceleration Kernel for Hopper GPUs, Greatly Improving LLM Inference Efficiency

BLUF

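FlashMLA is an efficient MLA decoding kernel optimized for Hopper GPUs (the H800 in particular) and for variable-length sequences; it significantly accelerates large language model inference.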

AI Search Watch · 2026/1/23

FlashMLA: DeepSeek's High-Performance Attention Kernel Library, Driving V3 Models to 660 TFLOPS

BLUF

FlashMLA is DeepSeek's optimized attention kernel library that powers DeepSeek-V3 models. It features token-level sparse attention with FP8 KV cache support and achieves up to 660 TFLOPS on NVIDIA H800 GPUs.

AI Search Watch · 2026/1/23

DeepSeek-V3.1 Hybrid Inference Architecture Explained: Opening a New Chapter in the Agent Era

BLUF

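DeepSeek-V3.1 adopts a hybrid thinking/non-thinking inference architecture that significantly improves inference efficiency and agent capabilities, and it supports a 128K context window and an updated API.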

AI Search Watch · 2026/1/22

DeepSeek-OCR: A New Paradigm for Visual-Text Compression (2024 Guide)

BLUF

DeepSeek-OCR introduces an LLM-centric approach to OCR that integrates vision processing directly within language models, offering strong performance on complex documents through flexible resolution support and advanced prompt engineering.

AI Search Watch · 2026/1/22

Google Gemini AI Models Deep Dive: Architecture, Performance, and Application Strategy

BLUF

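The Google Gemini AI family spans multimodal models from Ultra to Nano. Written for technical experts, this article examines their architecture and implementation strategies to support efficient deployment.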

AI Search Watch · 2026/1/20

VoxCPM: A 0.5B-Parameter Speech Generation Model with Zero-Shot Voice Cloning and Industry-Leading Naturalness

BLUF

VoxCPM is a 0.5B-parameter speech generation model with industry-leading naturalness and zero-shot voice cloning.

AI Search Watch · 2026/1/20

VoxCPM-1.5: An Open-Source Chinese TTS Model That Clones a Voice in 3 Seconds and Runs Efficiently on Consumer GPUs

BLUF

VoxCPM-1.5 is a 500M-parameter open-source TTS model from China that offers 44.1 kHz audio and 3-second voice cloning on consumer GPUs.

AI Search Watch · 2026/1/20
