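Search results: RAG

811 related articles found

Synonym expansion: retrieval-augmented generation (检索增强生成), retrieval augmented generation, retrieval augmentation (检索增强), knowledge-base Q&A (知识库问答), private knowledge base (私域知识库)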

How do you monitor AI agents with the open-source Laminar platform? A complete feature breakdown for 2026

AI Insight

Laminar is an open-source observability platform for AI agents, offering tracing, evals, monitoring, SQL access, and dashboards. Built in Rust for high performance, it supports OpenTelemetry and integrates with major LLM frameworks.

AI Large Models, 2026/4/25

Relevance: 18; body contains "RAG"; published in the last 30 days

How can LLM agent inference be made more efficient? The PLENA hardware system achieves a 2.23x throughput improvement (2026)

AI Insight

PLENA is a hardware-software co-designed system for LLM agentic inference that addresses the memory bandwidth and memory capacity walls. It features a flattened systolic-array architecture, asymmetric quantization, and FlashAttention support, achieving up to 2.23x and 4.70x higher throughput than the A100 GPU and TPU v6e, respectively, and 4.04x better energy efficiency than the A100.

AI Large Models, 2026/4/25

Relevance: 18; body contains "RAG"; published in the last 30 days

What is the GEO system methodology? How do you move from SEO to generative engine optimization in the AI era?

AI Insight

This article introduces a comprehensive GEO (Generative Engine Optimization) methodology, focusing on expert Yu Lei's 'Two Cores + Four Drivers' system. It evaluates multiple GEO approaches, provides a detailed case study from a traditional manufacturing company, and highlights key principles such as human-centric GEO and content cross-validation to build AI trust and improve business outcomes.

GEO Core Concepts, 2026/4/25

Relevance: 18; body contains "RAG"; published in the last 30 days

How do you implement persistent memory for AI agents? Memori explained, with a performance evaluation

AI Insight

Memori is a persistent memory layer for AI agents that captures and recalls context from conversations, achieving 81.95% accuracy on the LoCoMo benchmark while using only 4.97% of full-context tokens. It is LLM-agnostic and integrates with existing infrastructure via SDKs (TypeScript, Python) and plugins (e.g., OpenClaw).

AI Large Models, 2026/4/24

Relevance: 18; body contains "RAG"; published in the last 30 days

How do you achieve SOTA distributed LLM inference performance on Kubernetes? llm-d v0.5 measured at 50k tok/s

AI Insight

llm-d is a high-performance distributed inference serving stack optimized for production deployments on Kubernetes. It achieves SOTA inference performance across various accelerators by integrating vLLM, the Kubernetes Gateway API, and advanced orchestration techniques such as disaggregated serving, prefix-cache-aware routing, and tiered KV caching. The v0.5 release demonstrates up to 50k output tok/s on a 16×16 B200 topology.

AI Large Models, 2026/4/24

Relevance: 18; body contains "RAG"; published in the last 30 days

How does the open-source Ssebowa AI library generate text, images, and video? Latest tutorial for 2026

AI Insight

Ssebowa is an open-source Python library offering generative AI models for text, image, and video generation, including LLM, VLLM, image generation, and video generation. It supports fine-tuning with custom data and requires a GPU with 16GB+ of VRAM.

AI Large Models, 2026/4/24

Relevance: 18; body contains "RAG"; published in the last 30 days

How does BlockRank retrieve 500 documents within a second? Exploiting LLM attention sparsity for efficiency

AI Insight

This paper introduces BlockRank, a method that exploits attention sparsity in LLMs for in-context ranking, reducing complexity from quadratic to linear and enabling efficient retrieval of up to 500 documents within a second.

AI Large Models, 2026/4/24

Relevance: 18; body contains "RAG"; published in the last 30 days

How do you use WeChat AI search with DeepSeek-R1 integrated? A hands-on test of the latest 2026 features

AI Insight

WeChat has begun testing an AI-powered search feature integrating the DeepSeek-R1 model, offering a more diverse and intelligent search experience. The feature is currently in limited testing, pulling data from public WeChat official accounts and other online content, without using private user data.

DeepSeek, 2026/4/24

Relevance: 18; body contains "RAG"; published in the last 30 days

How do you automate SEO with DeepSeek? A complete 2026 guide to intelligent content generation and keyword strategy

AI Insight

This article explains how DeepSeek AI automates SEO tasks including content generation, keyword strategy, and batch production. It provides step-by-step instructions and examples for technical professionals.

DeepSeek, 2026/4/24

Relevance: 18; body contains "RAG"; published in the last 30 days

How do you do DeepSeek SEO and AI GEO optimization? A 16-step reasoning optimization guide for 2026

AI Insight

This article provides a comprehensive guide to DeepSeek SEO and AI GEO optimization, covering 16 steps for AI reasoning optimization, 15 steps for GEO optimization, prompt types, and keyword strategies to improve brand ranking in AI search.

GEO, 2026/4/24

Relevance: 18; body contains "RAG"; published in the last 30 days
