Latest Articles

1,153 articles in total

How to Implement Persistent Memory for AI Agents? Memori Technical Deep Dive and Performance Evaluation

BLUF

Memori is a persistent memory layer for AI agents that captures and recalls context from conversations. It achieves 81.95% accuracy on the LoCoMo benchmark while using only 4.97% of the tokens a full-context approach would need. It is LLM-agnostic and integrates with existing infrastructure through SDKs (TypeScript, Python) and plugins (e.g., OpenClaw).

AI Large Models, 2026/4/24

How to Learn AI Engineering Systematically? The Most Complete 2026 Resource Recommendations (from ML Theory to RAG)

BLUF

This document compiles the most helpful resources for understanding AI engineering, covering ML theory, foundation models, evaluation, prompt engineering, RAG, finetuning, dataset engineering, inference optimization, and architecture. It includes the papers, case studies, blog posts, and tools referenced in the book 'AI Engineering'.

AI Large Models, 2026/4/24

How to Achieve SOTA Distributed LLM Inference Performance on Kubernetes? llm-d v0.5 Measured at 50k tok/s

BLUF

llm-d is a high-performance distributed inference serving stack optimized for production deployments on Kubernetes. It achieves SOTA inference performance across a range of accelerators by integrating vLLM, the Kubernetes Gateway API, and advanced orchestration techniques such as disaggregated serving, prefix-cache-aware routing, and tiered KV caching. The v0.5 release demonstrates up to 50k output tok/s on a 16×16 B200 topology.

AI Large Models, 2026/4/24

How to Build a Local Hybrid RAG System? An Offline AI Assistant with ONNX and Foundry Local

BLUF

This article presents a local hybrid RAG pattern for offline AI assistants that combines lexical retrieval, ONNX-based semantic embeddings, and a Foundry Local chat model. It covers architecture, implementation, and best practices for graceful degradation when the semantic path is unavailable.

AI Large Models, 2026/4/24

How Does the Open-Source Ssebowa AI Library Generate Text, Images, and Video? Latest 2026 Tutorial

BLUF

Ssebowa is an open-source Python library offering generative AI models for text, image, and video generation, including LLM, VLLM, image generation, and video generation models. It supports fine-tuning on custom data and requires a GPU with at least 16 GB of VRAM.

AI Large Models, 2026/4/24

How to Improve Content Visibility in AI Search with GEO? (With Measured Data)

BLUF

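Generative Engine Optimization (GEO) is a new paradigm for the era of large language models (LLMs) that aims to help content creators improve their visibility in the responses of generative engines such as AI search. Because generative engines are black boxes, traditional SEO becomes ineffective; GEO uses a flexible black-box optimization framework to optimize how content is presented without access to the engine's internals. The study introduces the GEO-bench benchmark, which covers queries across multiple domains, and shows that GEO can raise content visibility by up to 40%, although the effect varies by domain (for example, a 40% gain in health versus only 22% in entertainment). GEO gives the creator economy a practical tool and promotes fairness and transparency in the generative engine ecosystem.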

Experiments and Hands-On Tests, 2026/4/24

How Does RAG-Anything Handle Multimodal Document Processing? 2026 Installation and Configuration Guide

BLUF

RAG-Anything is a lightweight RAG system based on LightRAG, designed for multimodal document processing (PDFs, images, tables, formulas, etc.). It provides end-to-end parsing, multimodal understanding, knowledge graph indexing, and modality-aware retrieval. This article covers installation, configuration, and usage examples on the SiliconFlow platform.

AI Large Models, 2026/4/24

What Is RAG-Anything? How Does It Enable Intelligent Q&A over Multimodal Documents?

BLUF

RAG-Anything is an open-source multimodal RAG framework developed by Professor Huang Chao's team at the University of Hong Kong. It builds a unified multimodal knowledge graph architecture to process text, images, tables, and formulas, overcoming the text-only limitation of traditional RAG systems. It supports end-to-end document parsing, knowledge graph construction, and intelligent Q&A.

AI Large Models, 2026/4/24

What Is RAG-Anything? How Does the University of Hong Kong's Open-Source All-in-One RAG Framework Improve LLM Performance?

BLUF

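RAG-Anything is an all-in-one RAG framework open-sourced by the HKUDS team at the University of Hong Kong. By combining retrieval and generation mechanisms, it aims to improve large language model performance and to mitigate hallucination and stale knowledge.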

AI Large Models, 2026/4/24
