How Does Mira Automate Company Research? An Open-Source Multi-Agent AI System Explained

Information about a company is scattered across many channels: its official website, LinkedIn page, news coverage, press releases, job postings, social media, and review sites. Building a complete, accurate profile means consulting several sources and judging how credible each one is, a process that is slow and error-prone.

Mira automates this process. It is a multi-agent AI system that gathers information from company websites, LinkedIn profiles, and Google Search, then assembles it into a structured company profile with a confidence score and source attribution for each data point.

The system supports early termination: once all configured data points reach high confidence scores, it stops processing to save time and API costs. Sources are configurable, so you can enable or disable website crawling, LinkedIn analysis, and Google Search as needed.
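
The early termination check can be sketched as a simple predicate. This is a minimal illustration, not Mira's actual code: the type names, the `shouldStopEarly` function, and the cutoff of 4 for "high confidence" are all assumptions, since the text only says that confidence scores run from 1 to 5.

```typescript
// Hypothetical types for illustration; not Mira's actual API.
type SourceName = "website" | "linkedin" | "google";

interface DataPointResult {
  value: string;
  confidence: number; // 1 (low) to 5 (high)
  source: SourceName;
}

// Assumed cutoff for "high confidence"; the real threshold is not stated.
const HIGH_CONFIDENCE = 4;

// Returns true when every configured data point already has a value at or
// above the threshold, so the remaining sources can be skipped.
function shouldStopEarly(
  configuredPoints: string[],
  results: Record<string, DataPointResult | undefined>,
  threshold: number = HIGH_CONFIDENCE,
): boolean {
  return configuredPoints.every((name) => {
    const result = results[name];
    return result !== undefined && result.confidence >= threshold;
  });
}
```

Note that with an empty list of configured data points the predicate is trivially true, so a caller would probably want to validate the configuration before relying on it.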

The core of Mira is a framework-agnostic library that can be published as an npm package or integrated directly into your applications, data pipelines, or custom workflows. To show it in use, the repository includes a complete Next.js frontend application that consumes the core library and provides an interface for managing workspaces, running research, and viewing results. It also includes a bulk processing tool for large datasets.

Mira in Action

[Screenshot: Mira application interface]

Key Features

Data Collection

Configurable Data Points – Define exactly what to collect (e.g., company name, industry, funding status), with a custom description for each data point to guide extraction.

Intelligent Source Selection – Enable or disable website crawling, LinkedIn analysis, and Google Search as needed.

Service Layer for Data Gathering – Built-in services handle web scraping, Google Search, LinkedIn company data fetching, and cookie consent banners.

Intelligence & Orchestration

Multi-Agent Architecture – Specialized agents handle website discovery, internal page analysis, LinkedIn retrieval, Google Search, and synthesis, all coordinated by an orchestrator.

Smart Early Termination – Processing stops automatically once all data points reach high confidence scores, saving time and API costs.

Confidence Scoring & Source Attribution – Each extracted fact carries a confidence score from 1 to 5 and a reference to its source, so results can be traced and checked.
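
To make confidence scoring and source attribution concrete, here is a small sketch of how facts reported by several agents might be reconciled. The `Fact` shape and the `mergeFacts` function are hypothetical, and the "highest confidence wins" rule is a deliberate simplification; the text does not describe how Mira's synthesis step actually resolves conflicts.

```typescript
// Hypothetical shape of one extracted fact; not Mira's actual data model.
interface Fact {
  dataPoint: string; // e.g. "industry"
  value: string;
  confidence: 1 | 2 | 3 | 4 | 5;
  source: string; // where the value was found, e.g. "linkedin"
}

// Keeps, for each data point, the fact with the highest confidence score.
// On a tie, the fact seen first is kept. The winning fact retains its
// source, so attribution survives the merge.
function mergeFacts(facts: Fact[]): Map<string, Fact> {
  const best = new Map<string, Fact>();
  for (const fact of facts) {
    const current = best.get(fact.dataPoint);
    if (current === undefined || fact.confidence > current.confidence) {
      best.set(fact.dataPoint, fact);
    }
  }
  return best;
}
```

Carrying the source on each fact, rather than tracking it separately, means any downstream consumer can show where a value came from without a second lookup.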

Analysis & Output

Company Analysis & Criteria Matching – Generate an executive summary and evaluate the company against custom criteria, producing a fit score from 0 to 10 with detailed reasoning.

Personalized Outreach Generation – Generate personalized LinkedIn messages and outreach emails from the enriched company data, with customizable prompt templates.

Developer Experience

Real-Time Progress Events – Structured events are emitted during execution so you can track and display live progress.

Composable Core Library – Framework-agnostic and publishable as an npm package, ready for Node.js/TypeScript projects.

Complete Next.js Frontend Application – A full application that shows how to consume the core library, with workspace management, user authentication, and live progress updates.

Bulk Processing – Process large datasets with resume capability, concurrent processing, and automatic result export.
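
The resume and concurrency ideas behind bulk processing can be illustrated with a small planning helper. This is a sketch under assumptions: `planBatches` is a hypothetical function, and it assumes the set of already completed URLs is available from a previous run. The actual tool may track progress differently.

```typescript
// Hypothetical helper; not part of Mira's bulk processing tool.
// Skips URLs that a previous run already finished (resume), then groups the
// rest into batches of at most `concurrency` items to be processed together.
function planBatches(
  allUrls: string[],
  completed: Set<string>,
  concurrency: number,
): string[][] {
  if (!Number.isInteger(concurrency) || concurrency < 1) {
    throw new Error("concurrency must be a positive integer");
  }
  const pending = allUrls.filter((url) => !completed.has(url));
  const batches: string[][] = [];
  for (let i = 0; i < pending.length; i += concurrency) {
    batches.push(pending.slice(i, i + concurrency));
  }
  return batches;
}
```

Each batch could then be run with `Promise.all`, recording every finished URL so that an interrupted run resumes from where it stopped.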

How It Works

Mira takes a company's website URL and your configuration, then orchestrates multiple AI agents to gather comprehensive company information. You control both what data is collected and which sources are used.
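
The orchestration described above can be pictured as a loop that runs agents in turn, reports progress, and stops once enough has been found. The sketch below is an assumption-laden illustration: the event names, the `Agent` interface, and `runAgents` are all hypothetical. Agents are modeled as synchronous functions to keep the example short, whereas real agents would make network calls and be asynchronous.

```typescript
// Hypothetical event and agent shapes; not Mira's actual API.
type ProgressEvent =
  | { type: "agent_started"; agent: string }
  | { type: "agent_finished"; agent: string; factsFound: number }
  | { type: "run_finished"; stoppedEarly: boolean };

interface Agent {
  name: string;
  run: (url: string) => string[]; // returns extracted facts as plain strings
}

// Runs agents in order, emitting a structured event at each step. If the
// caller's completeness check passes while agents remain, the rest are
// skipped and the run is marked as stopped early.
function runAgents(
  url: string,
  agents: Agent[],
  isComplete: (facts: string[]) => boolean,
  onEvent: (event: ProgressEvent) => void,
): string[] {
  const facts: string[] = [];
  for (let i = 0; i < agents.length; i++) {
    const agent = agents[i];
    onEvent({ type: "agent_started", agent: agent.name });
    const found = agent.run(url);
    facts.push(...found);
    onEvent({ type: "agent_finished", agent: agent.name, factsFound: found.length });
    const agentsRemaining = agents.length - 1 - i;
    if (agentsRemaining > 0 && isComplete(facts)) {
      onEvent({ type: "run_finished", stoppedEarly: true });
      return facts;
    }
  }
  onEvent({ type: "run_finished", stoppedEarly: false });
  return facts;
}
```

Passing the listener in as a callback keeps the loop independent of any UI framework, which fits the framework-agnostic design the article describes.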

Configuration

Data Points: Define custom data points with names and descriptions (e.g., "industry": "Primary business sector or market vertical").

Sources: Enable or disable website crawling, LinkedIn analysis, and Google Search (the landing page is always analyzed).

Analysis: Configure options for generating executive summaries, fit assessments, and personalized outreach content.
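
Taken together, the three configuration areas above might look like the following object. The field names and the `enabledSteps` helper are illustrative assumptions rather than Mira's real configuration schema. Only the "industry" description and the rule that the landing page is always analyzed come from the text.

```typescript
// Hypothetical configuration shape; field names are assumptions.
interface ResearchConfig {
  dataPoints: Record<string, string>; // data point name -> description
  sources: { crawl: boolean; linkedin: boolean; google: boolean };
  analysis: { executiveSummary: boolean; fitCriteria?: string; outreach: boolean };
}

const exampleConfig: ResearchConfig = {
  dataPoints: {
    industry: "Primary business sector or market vertical",
    funding: "Most recent funding round and amount, if public",
  },
  sources: { crawl: true, linkedin: true, google: false },
  analysis: { executiveSummary: true, fitCriteria: "B2B SaaS, 50+ employees", outreach: false },
};

// Lists the collection steps a run would perform for a given configuration.
// The landing page is always included, matching the description above.
function enabledSteps(config: ResearchConfig): string[] {
  const steps = ["landing_page"];
  if (config.sources.crawl) steps.push("website_crawl");
  if (config.sources.linkedin) steps.push("linkedin");
  if (config.sources.google) steps.push("google_search");
  return steps;
}
```

With the example configuration, Google Search is disabled, so a run would analyze the landing page, crawl the website, and query LinkedIn.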

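Frequently Asked Questions (FAQ)

How does Mira make sure the company information it collects is accurate and reliable?

Mira gives every extracted data point a confidence score from 1 to 5 and cites its source. Specialized agents cross-check information across sources under a central orchestrator, which keeps results accurate and transparent.

How much can Mira's early termination save?

Once all configured data points reach high confidence scores, the system stops processing. This avoids unnecessary API calls and compute, reducing both run time and cost.

Can Mira be integrated into existing company workflows?

Yes. Mira's core is a framework-agnostic library that can be published as an npm package or integrated directly into applications, data pipelines, and custom workflows. A bulk processing tool is also included.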