Tag: DeepSeek

View all articles tagged DeepSeek.

136 articles in total

What are the shortcomings of Karpathy's LLM Wiki pattern at scale, and how can they be fixed?

BLUF

This article analyzes three structural limitations of Andrej Karpathy's LLM Wiki pattern that emerge at scale and proposes a practical fix for each: typed relationships in wikilinks, automated relationship discovery with AI agents, and a persistent knowledge graph backend for cross-platform access.

AI Search Watch | 2026/4/14
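
The first fix named in the summary above, typed relationships in wikilinks, could look roughly like the sketch below. The summary does not show the article's actual syntax, so the `[[relation::Target]]` form, the `Link` type, and the `extract_links` helper are all assumptions made for illustration.

```python
import re
from typing import NamedTuple, Optional


class Link(NamedTuple):
    relation: Optional[str]  # None for a plain, untyped wikilink
    target: str


# Hypothetical syntax (an assumption, not taken from the article):
# [[relation::Target]] for typed links, [[Target]] for plain ones.
WIKILINK = re.compile(r"\[\[(?:([^\[\]:|]+)::)?([^\[\]|]+)\]\]")


def extract_links(text: str) -> list[Link]:
    """Return every wikilink in text, with its relation type if one is given."""
    return [
        Link(rel.strip() if rel else None, target.strip())
        for rel, target in WIKILINK.findall(text)
    ]
```

Calling `extract_links("See [[depends_on::Tokenizer]] and [[Attention]].")` yields one link with the relation `depends_on` and one untyped link, which is the distinction a knowledge graph backend would need to store edges with labels.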

How does the TSCE framework reduce LLM hallucinations and improve answer fidelity? (With GPT-4/Llama-3 test results)

BLUF

TSCE (Two-Step Contextual Enrichment) is a mechanistic framework that reduces LLM hallucinations and improves answer fidelity in two steps: it first generates an Embedding Space Control Prompt (ESCP) to compress the semantic space, then performs a focused generation. Validated on GPT-3.5/4 and Llama-3 8B, it achieves improvements of up to 30 percentage points without extra training.

AI Large Models | 2026/4/13

How do you align large language models with RLHF? A hands-on test of the latest 2026 project template

BLUF

The LLM Alignment Project Template is a full-stack solution for aligning large language models with human values using reinforcement learning from human feedback (RLHF), covering training, deployment, and monitoring.

AI Search Watch | 2026/4/12

How do large language models "think" and "reason"? From basic principles to frontier research

BLUF

No summary available...

AI Large Models | 2026/4/11

How did the VAC Memory System reach 80.1% accuracy on the LoCoMo 2025 benchmark?

BLUF

VAC Memory System is an open-source conversational memory framework for LLM agents. It achieved 80.1% accuracy on the LoCoMo 2025 benchmark with a hybrid retrieval architecture that combines MCA gating, FAISS semantic search, BM25 lexical search, and cross-encoder reranking.

AI Large Models | 2026/4/10
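
The summary above does not say how VAC merges its semantic (FAISS) and lexical (BM25) result lists before reranking. As a minimal sketch of one common option, the code below uses reciprocal rank fusion, which is a swapped-in technique and not necessarily what VAC does. The function name, the `k = 60` constant, and the document ids are illustrative assumptions; MCA gating and the cross-encoder step are left out.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) per list it appears in; k dampens
    the influence of top ranks (60 is a conventional default, assumed here).
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a semantic and a lexical retriever.
semantic_hits = ["m1", "m2", "m3"]
lexical_hits = ["m2", "m4", "m1"]
fused = reciprocal_rank_fusion([semantic_hits, lexical_hits])
```

Documents that both retrievers return rise to the top of `fused`, and that shortlist is what a cross-encoder reranker would then rescore.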

Zep or MemGPT: which is the better memory-layer service for enterprise AI agents? (With a 2026 benchmark comparison)

BLUF

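Zep has launched a new memory-layer service for AI agents that outperforms MemGPT on a deep memory retrieval benchmark and, powered by its Graphiti knowledge graph engine, performs strongly on enterprise temporal reasoning tasks.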

AI Large Models | 2026/4/10

How does the KTransformers framework optimize LLM inference and fine-tuning through CPU-GPU heterogeneous computing?

BLUF

KTransformers is a flexible framework for optimizing large language model inference and fine-tuning through CPU-GPU heterogeneous computing. It has two core modules: kt-kernel for high-performance inference and kt-sft for efficient fine-tuning.

AI Search Watch | 2026/4/10

How can you use an LLM to automatically extract structured data from web pages? A hands-on look at the Trawl tool

BLUF

trawl is an LLM-powered web scraping tool that extracts structured data from websites using semantic queries instead of CSS selectors. It automatically adapts to site redesigns and, once the initial extraction strategy has been derived, runs efficiently in Go.

AI Large Models | 2026/4/10

Llama-GitHub or traditional GitHub search: which is better suited to AI coding assistants?

BLUF

Llama-github is an Agentic RAG tool that retrieves relevant code snippets, issues, and repository information from GitHub to help LLM chatbots, AI agents, and Auto-dev agents solve complex coding tasks.

AI Large Models | 2026/4/10
