Latest Articles

20 articles in total

AirLLM: Running a 70-Billion-Parameter Model on a 4GB GPU Without Quantization

AirLLM is a lightweight inference framework for large language models that lets 70B-parameter models run on a single 4GB GPU without quantization, distillation, or pruning.

LLMS · 2026/1/24

How Do Large Language Models Learn to Reason? Exploring Logical Thinking and Knowledge Application in LLMs

This article explores the reasoning capabilities of large language models (LLMs): how they process information, how they make logical deductions, and how those capabilities apply in technical domains.

LLMS · 2026/1/24

Gemini AI: A Breakthrough Language Model Powering Intelligent Experiences Worldwide

Google's Gemini is a large language model (LLM) that excels at natural language processing tasks such as text generation, translation, and dialogue. Although direct access is restricted in China, users can reach it through domestic platforms that integrate the Gemini API for stable, localized AI capabilities.

Gemini · 2026/1/24

DeepSeek V4 Preview: Code Commits Reveal Architectural Changes and a Leap in Coding Ability for the Next-Generation AI Model

DeepSeek is reportedly developing a new flagship AI model, DeepSeek V4, with enhanced coding capabilities, set to launch around Chinese New Year in mid-February. Recent GitHub code updates reveal a new model identifier, "MODEL1", with distinct technical features including its KV cache layout, sparsity handling, and FP8 decoding support, which suggests design work targeted at memory and computational efficiency. The model may also incorporate recent research on optimized residual connections and biologically inspired AI memory modules.

DeepSeek · 2026/1/24

Grok-1: xAI Open-Sources a 314-Billion-Parameter Model, with a Full Look at Reasoning, Coding, and Multilingual Performance

Grok-1 is an open-source large language model developed by xAI, with 314 billion parameters and a Mixture-of-Experts architecture. It performs strongly on reasoning, coding, and multilingual tasks and is freely available for research and commercial use.

AI Large Models · 2026/1/23

DeepSeek R1's Code Optimization Capabilities: Generating 99% of the Code for a WASM Performance Improvement

DeepSeek R1 demonstrates advanced code optimization capabilities, generating 99% of the code for a WASM performance improvement and showing stronger reasoning about architectural decisions than other models.

DeepSeek · 2026/1/22

DeepSeek-R1 Reasoning Model Released: Performance Comparable to OpenAI-o1, Open-Sourced to Support AI Research

No summary available.

DeepSeek · 2026/1/22

Microsoft Agent Framework in Depth: Simplifying AI Agent Development and Orchestration

Microsoft Agent Framework simplifies AI agent development by reducing orchestration complexity and supporting multi-agent workflows through familiar .NET patterns, enabling production-ready deployment with minimal code.

AI Large Models · 2026/1/21

DeepSeek Breakthrough: How Pure Reinforcement Learning Achieves Advanced AI Reasoning

DeepSeek demonstrates that pure reinforcement learning can develop advanced AI reasoning without human demonstrations, achieving strong performance in mathematics, coding, and STEM through emergent self-reflection and verification patterns.

DeepSeek · 2026/1/21
