How Does Prompt Refiner Optimize AI Agent Prompts and Reduce API Costs? (With 2026 Archive Notice)

⚠️ Project Archived (April 2026)

This project is no longer maintained. Token optimization is less critical now due to:

Token costs dropped 12x (GPT-4: $2.50/M tokens)
Context windows expanded to 200K-1M tokens
Better alternatives: caching (90%+ savings), ML compression (20x), model routing

Still useful for: HTML cleaning, PII redaction, high-volume production, tool compression.
See ARCHIVED.md for details and alternatives.

🚀 Lightweight Python library for AI agents, RAG apps, and chatbots with smart context management and automatic token optimization.
Save 5-70% on API costs: 57% average reduction on function calling, 5-15% on RAG contexts.

🎯 Use Cases

RAG Applications • AI Agents • Chatbots • Document Processing • Cost Optimization

Why use Prompt Refiner?

Build AI agents, RAG applications, and chatbots with automatic token optimization and smart context management. Here's a condensed example (see examples/quickstart.py for the full code):

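import json

from openai import OpenAI, pydantic_function_tool
from prompt_refiner import MessagesPacker, SchemaCompressor, ResponseCompressor, StripHTML, NormalizeWhitespace

# SearchBooksInput (a pydantic model) and search_books (the tool function) are
# assumed to be defined elsewhere; see examples/quickstart.py for the full code.
client = OpenAI()  # reads OPENAI_API_KEY from the environment

# 1. Pack messages (default strategies refine each part automatically)
packer = MessagesPacker(
    track_tokens=True,
    system="<p>You are a helpful AI assistant.</p>",
    context=(["<div>Installation Guide...</div>"], StripHTML() | NormalizeWhitespace()),
    query="<span>Search for Python books.</span>"
)
messages = packer.pack()

# 2. Compress the tool schema
tool_schema = pydantic_function_tool(SearchBooksInput, name="search_books")
compressed_schema = SchemaCompressor().process(tool_schema)

# 3. Call the LLM with the compressed schema
response = client.chat.completions.create(
    model="gpt-4o-mini", messages=messages, tools=[compressed_schema]
)

# 4. Compress the tool response
tool_call = response.choices[0].message.tool_calls[0]  # assumes the model requested a tool call
tool_response = search_books(**json.loads(tool_call.function.arguments))
compressed_response = ResponseCompressor().process(tool_response)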

💡 Run python examples/quickstart.py to see the complete workflow with real OpenAI API verification.

Key benefits:

Default strategies - Automatic refining (MinimalStrategy for system/query, StandardStrategy for context/history)
Tool schema compression - Save 10-70% of tokens on AI agent function definitions (avg: 57%)
Tool response compression - Save 30-70% of tokens on agent tool outputs
Compose operations with | - Chain multiple cleaners into a pipeline
Save 5-15% of tokens on RAG contexts - Remove HTML, whitespace, and duplicates automatically
All items included - No token budget limits; the LLM API handles any final truncation
Track savings - Measure token optimization impact with built-in savings tracking
Production ready - Output goes directly to OpenAI without extra steps
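
The `|` composition mentioned in the list above can be pictured with a small, self-contained sketch. This is not Prompt Refiner's implementation: the `Step` and `Pipeline` classes and the two regex-based cleaners are hypothetical stand-ins that only illustrate how operator overloading can chain text cleaners into a pipeline.

```python
import re
from typing import Callable, List


class Step:
    """Hypothetical stand-in for a cleaner: wraps a str -> str function."""

    def __init__(self, fn: Callable[[str], str]) -> None:
        self.fn = fn

    def process(self, text: str) -> str:
        return self.fn(text)

    def __or__(self, other: "Step") -> "Pipeline":
        # step_a | step_b builds a pipeline that runs step_a, then step_b.
        return Pipeline([self, other])


class Pipeline(Step):
    """Runs its steps in order, feeding each output into the next step."""

    def __init__(self, steps: List[Step]) -> None:
        self.steps = steps

    def process(self, text: str) -> str:
        for step in self.steps:
            text = step.process(text)
        return text

    def __or__(self, other: Step) -> "Pipeline":
        # Appending keeps the pipeline flat when chaining three or more steps.
        return Pipeline(self.steps + [other])


# Two illustrative cleaners (simplified; robust HTML handling needs a parser).
strip_tags = Step(lambda t: re.sub(r"<[^>]+>", " ", t))
squeeze_spaces = Step(lambda t: re.sub(r"\s+", " ", t).strip())

pipeline = strip_tags | squeeze_spaces
cleaned = pipeline.process("<div>Installation   Guide</div>")
print(cleaned)  # Installation Guide
```

Each step stays independently testable, and the order of the `|` chain is the order of execution, which matters when one cleaner produces whitespace that a later one is meant to collapse.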

✨ Key Features


Module	Description	Components
Cleaner	Remove noise and save tokens	`StripHTML()`, `NormalizeWhitespace()`, `FixUnicode()`, `JsonCleaner()`
Compressor	Reduce size aggressively	`TruncateTokens()`, `Deduplicate()`
Scrubber	Protect sensitive data	`RedactPII()`
Tools	Optimize AI agent function calling (tool schemas and responses)	`SchemaCompressor()`, `ResponseCompressor()`
Packer	Smart message composition with priority-based ordering	`MessagesPacker` (chat APIs), `TextPacker` (completion APIs)
Strategy	Benchmark-tested presets for quick setup	`MinimalStrategy`, `StandardStrategy`, `AggressiveStrategy`
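
To make the Scrubber and Compressor rows more concrete, here is a minimal standard-library sketch of what PII redaction and deduplication can look like. The function names `redact_pii` and `deduplicate` and both regex patterns are assumptions made for illustration; they are not the library's `RedactPII()` or `Deduplicate()` components and cover far fewer cases.

```python
import re
from typing import List

# Simplified patterns for illustration only; production PII detection needs
# broader coverage (international formats, names, addresses, and so on).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def redact_pii(text: str) -> str:
    """Replace emails and US-style phone numbers with placeholder tags."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


def deduplicate(chunks: List[str]) -> List[str]:
    """Drop repeated chunks (ignoring case and outer whitespace), keeping order."""
    seen = set()
    unique = []
    for chunk in chunks:
        key = chunk.strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(chunk)
    return unique


redacted = redact_pii("Contact jane.doe@example.com or 555-123-4567.")
print(redacted)  # Contact [EMAIL] or [PHONE].

unique_chunks = deduplicate(["Install with pip.", "install with pip. ", "Run the tests."])
print(unique_chunks)  # ['Install with pip.', 'Run the tests.']
```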

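Installation

# Basic installation (lightweight, zero dependencies)
pip install llm-prompt-refiner

# With precise token counting (optional, installs tiktoken)
# Quotes keep shells such as zsh from expanding the brackets
pip install "llm-prompt-refiner[token]"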

Examples

Check out the examples/ folder for detailed examples:

strategy/ - Preset strategies (Minimal, Standard, Aggressive) with benchmark results
cleaner/ - HTML cleaning, JSON compression, whitespace normalization, Unicode fixing
compressor/ - Smart truncation, deduplication
scrubber/ - PII redaction (emails, phones, credit cards, etc.)
tools/ - Tool/API output cleaning for agent systems
packer/ - Context budget management with OpenAI integration
analyzer/ - Token counting and cost savings tracking

📖 Full documentation: examples/README.md

📊 Proven Effectiveness

Prompt Refiner has been tested across 3 benchmark suites covering function calling, RAG applications, and performance. Here's what the data shows:

🎯 Function Calling Benchmark: 57% Average Token Reduction

SchemaCompressor was tested on 20 real-world API schemas from Stripe, Salesforce, HubSpot, Slack, OpenAI, Anthropic, and more:


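Category	Schemas	Avg. Reduction	Top Performer
Very Verbose (enterprise APIs)	11	67.4%	HubSpot: 73.2%
Complex (feature-rich APIs)	6	61.7%	Slack: 70.8%
Medium (standard APIs)	2	13.1%	Weather: 20.1%
Simple (minimal APIs)	1	0.0%	Calculator (already minimal)
Overall Average	20	56.9%	-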
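
As a quick consistency check, the overall figure in the table above can be recomputed as a schema-count-weighted mean of the four category averages. The snippet below only reuses the numbers from the table; the category averages are already rounded, so the result can only be expected to agree to about one decimal place.

```python
# (category, number of schemas, average reduction in percent), from the table above.
categories = [
    ("Very verbose", 11, 67.4),
    ("Complex", 6, 61.7),
    ("Medium", 2, 13.1),
    ("Simple", 1, 0.0),
]

total_schemas = sum(count for _, count, _ in categories)
weighted_sum = sum(count * avg for _, count, avg in categories)
overall = weighted_sum / total_schemas

print(total_schemas)      # 20
print(round(overall, 1))  # 56.9
```

The result shows that the headline average is dominated by the 17 verbose and complex schemas; the medium and simple schemas see much smaller gains.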

Key Highlights:

✨ 56.9% average reduction across all schemas (15,342 tokens saved)
🔒 100% lossless compression - all protocol fields preserved (name, type, required, enum)
✅ 100% callable (20/20 validated) - all compressed schemas work correctly with OpenAI function calling
🏢 Enterprise APIs see 70%+ reduction - HubSpot, Salesforce, OpenAI File Search
📊 Real-world schemas from production APIs, not synthetic examples
⚡ Zero API cost - local processing with tiktoken
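
The "lossless" claim above means that only documentation text shrinks while the protocol fields (name, type, required, enum) stay intact. A minimal sketch of that idea follows. It is an assumption-laden illustration, not SchemaCompressor's actual algorithm: `compress_schema`, `DROP_KEYS`, and the 60-character limit are all hypothetical choices.

```python
import json
from typing import Any

DROP_KEYS = {"title", "examples"}  # assumed to be documentation-only metadata
MAX_DESCRIPTION_CHARS = 60         # hypothetical limit; may cut mid-word


def compress_schema(node: Any, in_properties: bool = False) -> Any:
    """Return a trimmed copy; name, type, required, and enum are never changed."""
    if isinstance(node, list):
        return [compress_schema(item) for item in node]
    if not isinstance(node, dict):
        return node  # strings, numbers, booleans, None are immutable
    out = {}
    for key, value in node.items():
        if in_properties:
            # Keys here are parameter names, so never drop or rewrite them.
            out[key] = compress_schema(value)
        elif key in DROP_KEYS:
            continue
        elif key == "description" and isinstance(value, str):
            # Collapse whitespace, then truncate the description.
            out[key] = " ".join(value.split())[:MAX_DESCRIPTION_CHARS]
        elif key == "properties":
            out[key] = compress_schema(value, in_properties=True)
        else:
            out[key] = compress_schema(value)
    return out


tool = {
    "type": "function",
    "function": {
        "name": "search_books",
        "description": "Search   for books.\n  Returns the best matches from "
                       "the catalog, ranked by relevance to the query.",
        "parameters": {
            "type": "object",
            "title": "SearchBooksInput",
            "properties": {
                "title": {"type": "string", "title": "Title",
                          "description": "Book title to search for."},
                "format": {"type": "string", "enum": ["ebook", "print"]},
            },
            "required": ["title"],
        },
    },
}

compressed = compress_schema(tool)
print(len(json.dumps(tool)), "->", len(json.dumps(compressed)), "characters")
```

Note the special handling of `properties`: a parameter that happens to be named `title` must survive even though the documentation-only `title` key is dropped elsewhere.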

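Figure: Token reduction by category. SchemaCompressor achieves 60%+ reduction on complex APIs.

Figure: Cost savings projection. Estimated monthly savings for agents of different sizes (based on GPT-4 pricing).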

✅ Functional Validation:

We tested all 20 compressed schemas with real OpenAI function calling to check that they work correctly:

100% callable (20/20): Every compressed schema successfully triggers function calls
60% identical (12/20): The majority produce exactly the same arguments as the original schemas
40% different but valid (8/20): Compressed descriptions may influence the LLM's choice among valid options (e.g., default values, placeholders)
Bottom line: Compression is safe for production - schemas remain functionally correct
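
The three outcomes in the list above (identical, different but valid, and by implication invalid) can be expressed as a small classifier. This is a sketch under stated assumptions, not the benchmark's actual harness: `is_valid` and `classify` are hypothetical helpers, and validity is reduced here to required keys, known parameter names, and enum membership.

```python
import json


def is_valid(args: dict, parameters: dict) -> bool:
    """Check required keys, unknown keys, and enum membership only."""
    props = parameters.get("properties", {})
    if any(name not in args for name in parameters.get("required", [])):
        return False
    for name, value in args.items():
        spec = props.get(name)
        if spec is None:
            return False  # argument not declared in the schema
        if "enum" in spec and value not in spec["enum"]:
            return False
    return True


def classify(original_json: str, compressed_json: str, parameters: dict) -> str:
    """Compare arguments produced with the original vs. the compressed schema."""
    original = json.loads(original_json)
    compressed = json.loads(compressed_json)
    if not is_valid(compressed, parameters):
        return "invalid"
    return "identical" if original == compressed else "different but valid"


# Hypothetical schema and model outputs, for illustration only.
parameters = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "format": {"type": "string", "enum": ["ebook", "print"]},
    },
    "required": ["query"],
}

same = classify('{"query": "python"}', '{"query": "python"}', parameters)
differs = classify('{"query": "python"}',
                   '{"query": "python", "format": "ebook"}', parameters)
broken = classify('{"query": "python"}', '{"format": "audio"}', parameters)
print(same, "|", differs, "|", broken)
```

In the second case the compressed run adds an optional argument with an allowed value, which is the kind of difference the benchmark counts as "different but valid".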

💰 Cost Savings Example: A medium agent (10 tools, 500 calls/day) saves $541/month with SchemaCompressor.
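
For readers who want to estimate savings for their own workload, the arithmetic is tokens saved per call, times calls per month, times the price per token. The sketch below uses hypothetical inputs (5,000 tokens saved per call and a 30-day month) together with the $2.50/M price quoted in the archive notice; it is not meant to reproduce the $541 figure, whose underlying assumptions are not stated here.

```python
def monthly_savings_usd(
    tokens_saved_per_call: int,
    calls_per_day: int,
    price_per_million_tokens: float,
    days_per_month: int = 30,
) -> float:
    """Estimate monthly input-token savings in US dollars."""
    tokens_saved = tokens_saved_per_call * calls_per_day * days_per_month
    return tokens_saved / 1_000_000 * price_per_million_tokens


# Hypothetical workload: tool schemas shrink by 5,000 tokens per request.
example = monthly_savings_usd(
    tokens_saved_per_call=5_000,
    calls_per_day=500,
    price_per_million_tokens=2.50,
)
print(f"${example:.2f}")  # $187.50
```

Because the estimate is linear in every input, halving the token price halves the savings, which is the effect the archive notice describes.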

📖 See the full benchmark report: benchmark/README.md#function-calling-benchmark

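Frequently Asked Questions (FAQ)

Can Prompt Refiner still be used? Has the project been abandoned?

The project was archived in April 2026 and is no longer maintained. Falling token costs and larger context windows have made token optimization less important, but the library is still useful in specific scenarios such as HTML cleaning and PII redaction.

How much API cost can Prompt Refiner actually save?

According to the benchmarks, savings range from 5% to 70% of API costs depending on the workload: a 57% average token reduction on function calling and 5-15% on RAG contexts.

Which AI application scenarios is this library mainly suited for?

It is designed for AI agents, RAG applications, and chatbots, and suits tasks such as document processing and cost optimization. It optimizes prompts through smart context management and automatic token compression.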
