837 lines
31 KiB
Markdown
837 lines
31 KiB
Markdown
---
|
||
title: "AgentKit v2 架构设计:通用 Agent 平台"
|
||
type: design
|
||
status: draft
|
||
date: 2026-06-05
|
||
origin: brainstorm session
|
||
---
|
||
|
||
# AgentKit v2 架构设计
|
||
|
||
## 1. 定位与目标
|
||
|
||
AgentKit 是一个**通用 Agent 平台**,以独立服务模式部署,提供:
|
||
|
||
1. **通用 Agent 框架** — 类似 OpenClaw/Hermes,非 GEO 专属
|
||
2. **多 Agent 协同编排** — Pipeline + Handoff + 动态路由
|
||
3. **运行时自由增减** — 通过 API 动态创建/删除/更新 Agent 和编排
|
||
4. **LLM 统一管理** — API Key 集中管理、用量统计、成本控制
|
||
5. **知识库连接** — RAG 检索、向量存储
|
||
6. **产出质量管理** — 质量门禁、自动重试
|
||
7. **记忆系统** — Working + Episodic + Semantic 三层记忆
|
||
8. **能力自我进化** — 反思、优化、A/B 测试
|
||
9. **Skill + MCP** — 可插拔技能 + MCP 协议
|
||
10. **意图识别** — 三级路由(关键词 → Embedding → LLM)
|
||
11. **标准化输出** — Schema 校验 + 格式统一
|
||
|
||
### 与现有方案的关系
|
||
|
||
AgentKit 不是重复造轮子,而是**垂直整合的 Agent 平台**:
|
||
|
||
- 核心运行时自研(轻量、可控,当前 BaseAgent 已有基础)
|
||
- MCP 协议用标准 SDK(不重复造轮子)
|
||
- RAG/知识库集成 LlamaIndex 或对接业务现有系统
|
||
- LLM Gateway 参考 LiteLLM 设计但自研(更轻量、用量统计更灵活)
|
||
|
||
差异化竞争力:**自我进化** + **质量管理** + **标准化输出** — 这三项在 LangChain/CrewAI/Dify 中均无完整实现。
|
||
|
||
---
|
||
|
||
## 2. 核心架构
|
||
|
||
### 2.1 整体架构图
|
||
|
||
```
|
||
┌──────────────────────────────────────────────────────────────┐
|
||
│ AgentKit Server (FastAPI) │
|
||
│ │
|
||
│ ┌────────────────────────────────────────────────────────┐ │
|
||
│ │ API Gateway │ │
|
||
│ │ /api/v1/agents /api/v1/tasks /api/v1/skills │ │
|
||
│ │ /api/v1/pipelines /api/v1/llm /api/v1/mcp │ │
|
||
│ └────────────────────────────────────────────────────────┘ │
|
||
│ │
|
||
│ ┌──────────────┐ ┌──────────────┐ ┌───────────────────┐ │
|
||
│ │ Agent Runtime │ │ Orchestrator │ │ LLM Gateway │ │
|
||
│ │ │ │ │ │ │ │
|
||
│ │ AgentFactory │ │ PipelineEngine│ │ Provider Registry │ │
|
||
│ │ AgentPool │ │ HandoffMgr │ │ Model Router │ │
|
||
│ │ Lifecycle │ │ DynamicRoute │ │ Usage Tracker │ │
|
||
│ │ ReAct Engine │ │ │ │ Rate Limiter │ │
|
||
│ └──────────────┘ └──────────────┘ │ Budget Controller │ │
|
||
│ └───────────────────┘ │
|
||
│ ┌──────────────┐ ┌──────────────┐ ┌───────────────────┐ │
|
||
│ │ Skill System │ │ Memory │ │ Evolution │ │
|
||
│ │ │ │ │ │ │ │
|
||
│ │ SkillRegistry│ │ Working(Redis)│ │ Reflector │ │
|
||
│ │ SkillLoader │ │ Episodic(PG) │ │ PromptOptimizer │ │
|
||
│ │ MCP Bridge │ │ Semantic(RAG)│ │ ABTester │ │
|
||
│ └──────────────┘ │ Retriever │ │ QualityGate │ │
|
||
│ └──────────────┘ └───────────────────┘ │
|
||
│ ┌──────────────┐ ┌──────────────┐ ┌───────────────────┐ │
|
||
│ │Intent Router │ │Output Std │ │ Knowledge Base │ │
|
||
│ │ │ │ │ │ │ │
|
||
│ │ 关键词匹配 │ │ Schema 校验 │ │ RAG 检索 │ │
|
||
│ │ Embedding │ │ 格式标准化 │ │ 向量存储 │ │
|
||
│ │ LLM 分类 │ │ 质量评估 │ │ 文档管理 │ │
|
||
│ └──────────────┘ └──────────────┘ └───────────────────┘ │
|
||
│ │
|
||
│ ┌────────────────────────────────────────────────────────┐ │
|
||
│ │ Configuration Store (YAML/DB) │ │
|
||
│ │ Agent 配置 | Skill 配置 | Pipeline 配置 | LLM 配置 │ │
|
||
│ └────────────────────────────────────────────────────────┘ │
|
||
└──────────────────────────────────────────────────────────────┘
|
||
│ │ │ │
|
||
┌────┴────┐ ┌─────┴─────┐ ┌────┴────┐ ┌────┴────┐
|
||
│ Redis │ │ PostgreSQL │ │ LLM │ │ MCP │
|
||
│ +PubSub│ │ +pgvector │ │ APIs │ │ Servers │
|
||
└─────────┘ └───────────┘ └─────────┘ └─────────┘
|
||
```
|
||
|
||
### 2.2 请求处理流程
|
||
|
||
```
|
||
POST /api/v1/tasks
|
||
│
|
||
▼
|
||
API Gateway → 认证/限流
|
||
│
|
||
▼
|
||
Intent Router → 识别意图,匹配 Skill
|
||
│
|
||
▼
|
||
Agent Runtime → 获取/创建 Agent 实例
|
||
│
|
||
▼
|
||
ReAct Engine → Think → Act → Observe 循环
|
||
│ │ │ │
|
||
│ ▼ ▼ ▼
|
||
│ LLM Gateway Tool 观察结果
|
||
│ │
|
||
│ ▼
|
||
│ MCP/Skill/Function
|
||
│
|
||
▼
|
||
Quality Gate → 质量检查
|
||
│
|
||
├── 不合格 → 反馈给 ReAct 循环重试
|
||
│
|
||
▼
|
||
Output Standardizer → Schema 校验 + 格式标准化
|
||
│
|
||
▼
|
||
返回标准化结果 + 记录到 Memory + 记录到 Usage Tracker
|
||
```
|
||
|
||
---
|
||
|
||
## 3. 核心组件设计
|
||
|
||
### 3.1 ReAct Engine(推理-行动循环)
|
||
|
||
这是 AgentKit v2 最关键的改造,让 Agent 从"LLM 调用封装"变为"真正的智能体"。
|
||
|
||
#### 执行循环
|
||
|
||
```python
|
||
class ReActEngine:
|
||
"""ReAct 推理-行动循环引擎"""
|
||
|
||
async def execute(
|
||
self,
|
||
task: TaskMessage,
|
||
skill: Skill,
|
||
llm_gateway: LLMGateway,
|
||
tools: list[Tool],
|
||
memory: Memory | None = None,
|
||
max_steps: int = 10,
|
||
) -> ReActResult:
|
||
# 1. 构建初始消息(Skill Prompt + 任务输入)
|
||
messages = self._build_initial_messages(task, skill, tools)
|
||
|
||
trajectory: list[ReActStep] = []
|
||
|
||
for step in range(max_steps):
|
||
# Think: LLM 推理下一步
|
||
response = await llm_gateway.chat(
|
||
messages=messages,
|
||
agent_name=task.agent_name,
|
||
task_type=task.task_type,
|
||
tools=self._build_tool_schemas(tools), # Function Calling
|
||
tool_choice="auto",
|
||
)
|
||
|
||
if response.has_tool_calls:
|
||
# Act + Observe: 执行 Tool 并反馈结果
|
||
for tool_call in response.tool_calls:
|
||
tool = self._find_tool(tool_call.name, tools)
|
||
result = await tool.safe_execute(**tool_call.arguments)
|
||
messages.append(tool_result_message(tool_call.id, result))
|
||
trajectory.append(ReActStep(
|
||
step=step, action="tool_call",
|
||
tool_name=tool_call.name,
|
||
arguments=tool_call.arguments,
|
||
result=result,
|
||
))
|
||
else:
|
||
# LLM 认为任务完成
|
||
trajectory.append(ReActStep(
|
||
step=step, action="final_answer",
|
||
content=response.content,
|
||
))
|
||
break
|
||
|
||
# 存储轨迹到记忆
|
||
if memory:
|
||
await memory.store_trajectory(task, trajectory)
|
||
|
||
return ReActResult(
|
||
output=self._parse_output(response.content),
|
||
trajectory=trajectory,
|
||
total_steps=len(trajectory),
|
||
total_tokens=sum(s.tokens for s in trajectory),
|
||
)
|
||
```
|
||
|
||
#### 停止条件
|
||
|
||
| 条件 | 说明 |
|
||
|------|------|
|
||
| LLM 不再调用 Tool | LLM 认为任务完成,直接输出最终答案 |
|
||
| 达到 max_steps | 防止无限循环,返回当前最佳结果 |
|
||
| Quality Gate 通过 | 输出满足质量要求,提前终止 |
|
||
| 异常/超时 | LLM 调用失败或超时,返回已有结果 |
|
||
|
||
#### 与当前代码的映射
|
||
|
||
| 当前 | v2 | 变化 |
|
||
|------|-----|------|
|
||
| `ConfigDrivenAgent._handle_llm_generate()` | `ReActEngine.execute()` | 单次 LLM 调用 → 循环推理 |
|
||
| `ConfigDrivenAgent._handle_tool_call()` | ReAct 循环中的 Tool 调用 | 硬编码调用 → LLM 自主选择 |
|
||
| `ConfigDrivenAgent._handle_custom()` | 保留为 ReAct 的"外部 Tool" | custom_handler 变为 Tool |
|
||
| `DynamicSelector` | ReAct + Function Calling | 关键词/LLM 选择 → LLM 自主决策 |
|
||
|
||
---
|
||
|
||
### 3.2 Intent Router(意图路由器)
|
||
|
||
#### 三级路由策略
|
||
|
||
```python
|
||
class IntentRouter:
|
||
"""三级意图路由:关键词 → Embedding → LLM"""
|
||
|
||
def __init__(self, llm_gateway: LLMGateway, embedding_service=None):
|
||
self._keyword_rules: dict[str, KeywordRule] = {}
|
||
self._skill_embeddings: dict[str, list[float]] = {}
|
||
self._llm_gateway = llm_gateway
|
||
|
||
async def route(
|
||
self,
|
||
input_data: dict,
|
||
skills: list[Skill],
|
||
) -> RoutingResult:
|
||
# Level 1: 关键词匹配(零成本,~0ms)
|
||
skill = self._match_keywords(input_data, skills)
|
||
if skill:
|
||
return RoutingResult(skill=skill, method="keyword", confidence=1.0)
|
||
|
||
# Level 2: Embedding 相似度(极低成本,~50ms)
|
||
if self._skill_embeddings:
|
||
result = self._match_embedding(input_data, skills)
|
||
if result and result.confidence > 0.8:
|
||
return result
|
||
|
||
# Level 3: LLM 分类(兜底,~200 tokens,~500ms)
|
||
return await self._classify_with_llm(input_data, skills)
|
||
```
|
||
|
||
#### 成本分析
|
||
|
||
| 路由级别 | 延迟 | Token 消耗 | 成本/次 | 命中率预期 |
|
||
|---------|------|-----------|---------|-----------|
|
||
| 关键词匹配 | ~0ms | 0 | $0 | 60-70% |
|
||
| Embedding | ~50ms | ~100 tokens | ~$0.00001 | 20-25% |
|
||
| LLM 分类 | ~500ms | ~200 tokens | ~$0.00003 | 5-10% |
|
||
|
||
**关键设计**:意图识别只在 Router 层做一次,不是每个 Skill 各自做。8 个 Skill 不需要 8 次意图识别。
|
||
|
||
#### Skill 的意图配置
|
||
|
||
```yaml
|
||
intent:
|
||
keywords: ["生成内容", "写文章", "选题", "generate", "content"]
|
||
description: "用户需要生成SEO/GEO优化内容、推荐选题或撰写文章"
|
||
examples:
|
||
- "帮我写一篇关于AI的文章"
|
||
- "推荐一些选题"
|
||
- "生成品牌内容"
|
||
```
|
||
|
||
- `keywords`:用于 Level 1 关键词匹配
|
||
- `description` + `examples`:用于 Level 3 LLM 分类的 Prompt 构建
|
||
- Embedding 自动从 `description` + `examples` 计算,无需手动配置
|
||
|
||
---
|
||
|
||
### 3.3 LLM Gateway(LLM 统一网关)
|
||
|
||
#### 架构
|
||
|
||
```python
|
||
class LLMGateway:
|
||
"""LLM 统一网关:调用、路由、计量、限流"""
|
||
|
||
def __init__(self, config: LLMConfig):
|
||
self._providers: dict[str, LLMProvider] = {}
|
||
self._usage_tracker = UsageTracker()
|
||
self._rate_limiter = RateLimiter()
|
||
self._budget_controller = BudgetController()
|
||
|
||
async def chat(
|
||
self,
|
||
messages: list[dict],
|
||
model: str, # 模型别名或具体模型名
|
||
agent_name: str = "", # 用于用量追踪
|
||
task_type: str = "", # 用于模型路由
|
||
tools: list[dict] | None = None, # Function Calling schemas
|
||
tool_choice: str = "auto",
|
||
**kwargs,
|
||
) -> LLMResponse:
|
||
# 1. 模型路由:别名 → 实际模型 + Provider
|
||
provider, actual_model = self._resolve_model(model, task_type)
|
||
|
||
# 2. 预算检查
|
||
await self._budget_controller.check(agent_name)
|
||
|
||
# 3. 限流
|
||
await self._rate_limiter.acquire(agent_name, actual_model)
|
||
|
||
# 4. 调用 LLM
|
||
try:
|
||
response = await provider.chat(
|
||
messages=messages,
|
||
model=actual_model,
|
||
tools=tools,
|
||
tool_choice=tool_choice,
|
||
**kwargs,
|
||
)
|
||
except LLMError as e:
|
||
# 5. 降级策略
|
||
fallback = self._get_fallback_model(model)
|
||
if fallback:
|
||
response = await fallback.provider.chat(...)
|
||
else:
|
||
raise
|
||
|
||
# 6. 记录用量
|
||
await self._usage_tracker.record(
|
||
agent_name=agent_name,
|
||
task_type=task_type,
|
||
model=actual_model,
|
||
usage=response.usage,
|
||
cost=self._calculate_cost(actual_model, response.usage),
|
||
latency_ms=response.latency_ms,
|
||
)
|
||
|
||
return response
|
||
```
|
||
|
||
#### Provider 配置
|
||
|
||
```yaml
|
||
# llm_config.yaml
|
||
providers:
|
||
openai:
|
||
api_key: "${OPENAI_API_KEY}" # 环境变量引用
|
||
base_url: "https://api.openai.com/v1"
|
||
models:
|
||
gpt-4o: { max_tokens: 128000, cost_per_1k_input: 0.0025, cost_per_1k_output: 0.01 }
|
||
gpt-4o-mini: { max_tokens: 128000, cost_per_1k_input: 0.00015, cost_per_1k_output: 0.0006 }
|
||
|
||
deepseek:
|
||
api_key: "${DEEPSEEK_API_KEY}"
|
||
base_url: "https://api.deepseek.com/v1"
|
||
models:
|
||
deepseek-chat: { max_tokens: 64000, cost_per_1k_input: 0.00014, cost_per_1k_output: 0.00028 }
|
||
deepseek-reasoner: { max_tokens: 64000, cost_per_1k_input: 0.00055, cost_per_1k_output: 0.00219 }
|
||
|
||
anthropic:
|
||
api_key: "${ANTHROPIC_API_KEY}"
|
||
base_url: "https://api.anthropic.com/v1"
|
||
models:
|
||
claude-sonnet-4-20250514: { max_tokens: 200000, cost_per_1k_input: 0.003, cost_per_1k_output: 0.015 }
|
||
|
||
# 模型别名(Skill 配置中使用别名,Gateway 解析为实际模型)
|
||
model_aliases:
|
||
default: "deepseek-chat"
|
||
fast: "gpt-4o-mini"
|
||
powerful: "claude-sonnet-4-20250514"
|
||
reasoning: "deepseek-reasoner"
|
||
|
||
# 降级策略
|
||
fallbacks:
|
||
deepseek-chat: ["gpt-4o-mini", "gpt-4o"]
|
||
claude-sonnet-4-20250514: ["gpt-4o", "deepseek-chat"]
|
||
|
||
# 预算控制
|
||
budgets:
|
||
default:
|
||
daily_limit: 50.0 # USD
|
||
monthly_limit: 1000.0 # USD
|
||
content_generator:
|
||
daily_limit: 20.0
|
||
monthly_limit: 500.0
|
||
```
|
||
|
||
#### 用量统计 API
|
||
|
||
```
|
||
GET /api/v1/llm/usage?agent_name=content_gen&time_range=today
|
||
|
||
Response:
|
||
{
|
||
"agent_name": "content_gen",
|
||
"time_range": "today",
|
||
"total_tokens": 1250000,
|
||
"total_cost": 0.35,
|
||
"by_model": {
|
||
"deepseek-chat": { "tokens": 1000000, "cost": 0.28, "calls": 45 },
|
||
"gpt-4o-mini": { "tokens": 250000, "cost": 0.07, "calls": 12 }
|
||
},
|
||
"budget": {
|
||
"daily_limit": 20.0,
|
||
"daily_used": 0.35,
|
||
"monthly_limit": 500.0,
|
||
"monthly_used": 8.50
|
||
}
|
||
}
|
||
```
|
||
|
||
---
|
||
|
||
### 3.4 Skill System(技能系统)
|
||
|
||
#### Skill vs Tool
|
||
|
||
| | Tool | Skill |
|
||
|---|---|---|
|
||
| 粒度 | 原子操作 | 业务能力 |
|
||
| 组成 | 函数 + Schema | Prompt + Tool 组合 + 输出 Schema + 质量门禁 |
|
||
| 路由 | 代码硬编码 | Intent Router 动态选择 |
|
||
| 示例 | `retrieve_knowledge` | `content_generation` |
|
||
|
||
#### Skill YAML 完整规范
|
||
|
||
```yaml
|
||
# ── 基本信息 ──────────────────────────
|
||
name: content_generation # 必填,唯一标识
|
||
version: "1.0.0" # 必填
|
||
description: "AI内容生成:支持选题推荐和文章生成" # 必填
|
||
|
||
# ── 意图识别 ──────────────────────────
|
||
intent:
|
||
keywords: ["生成内容", "写文章", "选题", "generate", "content"]
|
||
description: "用户需要生成SEO/GEO优化内容、推荐选题或撰写文章"
|
||
examples:
|
||
- "帮我写一篇关于AI的文章"
|
||
- "推荐一些选题"
|
||
|
||
# ── 执行配置 ──────────────────────────
|
||
execution_mode: react # react | direct | custom
|
||
max_steps: 5 # ReAct 循环最大步数
|
||
|
||
# ── Prompt ──────────────────────────
|
||
prompt:
|
||
identity: "你是一个专业的内容生成助手"
|
||
context: "品牌需要通过优质内容提升在AI搜索引擎中的可见性"
|
||
instructions: |
|
||
根据用户提供的关键词和品牌信息,生成符合要求的内容。
|
||
如果需要知识库信息,先调用 retrieve_knowledge 工具。
|
||
constraints:
|
||
- 内容必须原创
|
||
- 关键词密度适中
|
||
output_format: "JSON: {topics: [{title, reason, keywords}]} 或 {content, word_count}"
|
||
|
||
# ── 工具绑定 ──────────────────────────
|
||
tools:
|
||
- name: retrieve_knowledge
|
||
required: false # 可选工具
|
||
- name: search_web
|
||
required: false
|
||
|
||
# ── LLM 配置 ──────────────────────────
|
||
llm:
|
||
model: "deepseek" # 模型别名,由 LLM Gateway 解析
|
||
temperature: 0.7
|
||
max_tokens: 4000
|
||
|
||
# ── 输入输出 Schema ──────────────────────────
|
||
input_schema:
|
||
type: object
|
||
required: [target_keyword]
|
||
properties:
|
||
target_keyword: { type: string, description: "目标关键词" }
|
||
brand_name: { type: string, description: "品牌名称" }
|
||
|
||
output_schema:
|
||
type: object
|
||
required: [content]
|
||
properties:
|
||
content: { type: string }
|
||
word_count: { type: integer }
|
||
|
||
# ── 质量门禁 ──────────────────────────
|
||
quality_gate:
|
||
required_fields: ["content"]
|
||
min_word_count: 500
|
||
max_retries: 1 # 质量不合格时重试次数
|
||
custom_validator: null # 可选:dotted path 到校验函数
|
||
|
||
# ── 记忆配置 ──────────────────────────
|
||
memory:
|
||
working: { enabled: true }
|
||
episodic: { enabled: true, track_success: true }
|
||
semantic: { enabled: true, knowledge_base_ids_field: "knowledge_base_ids" }
|
||
```
|
||
|
||
#### Skill 注册与发现
|
||
|
||
```python
|
||
class SkillRegistry:
|
||
"""Skill 注册中心"""
|
||
|
||
async def register(self, skill_config: SkillConfig) -> Skill:
|
||
"""注册 Skill(从 YAML 或 Dict)"""
|
||
|
||
async def unregister(self, name: str) -> None:
|
||
"""注销 Skill"""
|
||
|
||
async def list_skills(self) -> list[SkillInfo]:
|
||
"""列出所有已注册 Skill"""
|
||
|
||
async def get_skill(self, name: str) -> Skill:
|
||
"""获取 Skill"""
|
||
|
||
async def update_skill(self, name: str, config: SkillConfig) -> Skill:
|
||
"""热更新 Skill 配置"""
|
||
```
|
||
|
||
---
|
||
|
||
### 3.5 Quality Gate + Output Standardizer
|
||
|
||
#### Quality Gate
|
||
|
||
```python
|
||
class QualityGate:
|
||
"""产出质量管理"""
|
||
|
||
async def validate(
|
||
self,
|
||
output: dict,
|
||
skill: Skill,
|
||
) -> QualityResult:
|
||
checks = []
|
||
|
||
# 1. 必填字段检查
|
||
for field in skill.quality_gate.required_fields:
|
||
present = field in output and output[field] is not None
|
||
checks.append(QualityCheck(
|
||
name=f"required_field:{field}",
|
||
passed=present,
|
||
message=f"Field '{field}' is missing" if not present else None,
|
||
))
|
||
|
||
# 2. 数值范围检查
|
||
if skill.quality_gate.min_word_count:
|
||
word_count = len(output.get("content", "").split())
|
||
checks.append(QualityCheck(
|
||
name="min_word_count",
|
||
passed=word_count >= skill.quality_gate.min_word_count,
|
||
message=f"Word count {word_count} < minimum {skill.quality_gate.min_word_count}",
|
||
))
|
||
|
||
# 3. Schema 校验
|
||
if skill.output_schema:
|
||
try:
|
||
jsonschema.validate(output, skill.output_schema)
|
||
checks.append(QualityCheck(name="schema", passed=True))
|
||
except jsonschema.ValidationError as e:
|
||
checks.append(QualityCheck(name="schema", passed=False, message=str(e)))
|
||
|
||
# 4. 自定义校验(可选)
|
||
if skill.quality_gate.custom_validator:
|
||
validator = import_handler(skill.quality_gate.custom_validator)
|
||
result = await validator(output)
|
||
checks.append(QualityCheck(name="custom", passed=result))
|
||
|
||
return QualityResult(
|
||
passed=all(c.passed for c in checks),
|
||
checks=checks,
|
||
can_retry=skill.quality_gate.max_retries > 0,
|
||
)
|
||
```
|
||
|
||
#### Output Standardizer
|
||
|
||
```python
|
||
class OutputStandardizer:
|
||
"""标准化输出"""
|
||
|
||
async def standardize(
|
||
self,
|
||
raw_output: dict,
|
||
skill: Skill,
|
||
) -> StandardOutput:
|
||
# 1. Schema 校验
|
||
validated = self._validate_schema(raw_output, skill.output_schema)
|
||
|
||
# 2. 字段标准化(确保类型一致)
|
||
normalized = self._normalize_types(validated, skill.output_schema)
|
||
|
||
# 3. 添加元数据
|
||
return StandardOutput(
|
||
skill_name=skill.name,
|
||
data=normalized,
|
||
metadata=OutputMetadata(
|
||
version=skill.version,
|
||
produced_at=datetime.now(timezone.utc),
|
||
quality_score=self._calculate_quality_score(normalized, skill),
|
||
),
|
||
)
|
||
```
|
||
|
||
---
|
||
|
||
### 3.6 服务化改造
|
||
|
||
#### API 设计
|
||
|
||
```
|
||
# ── Agent 管理 ──────────────────────────
|
||
POST /api/v1/agents # 创建 Agent 实例
|
||
GET /api/v1/agents # 列出所有 Agent
|
||
GET /api/v1/agents/{name} # 获取 Agent 详情
|
||
DELETE /api/v1/agents/{name} # 删除 Agent
|
||
PUT /api/v1/agents/{name}/config # 更新 Agent 配置(热更新)
|
||
|
||
# ── 任务执行 ──────────────────────────
|
||
POST /api/v1/tasks # 提交任务(Router 自动路由)
|
||
GET /api/v1/tasks/{id} # 查询任务状态
|
||
POST /api/v1/tasks/{id}/cancel # 取消任务
|
||
|
||
# ── Skill 管理 ──────────────────────────
|
||
POST /api/v1/skills # 注册 Skill
|
||
GET /api/v1/skills # 列出所有 Skill
|
||
GET /api/v1/skills/{name} # 获取 Skill 详情
|
||
DELETE /api/v1/skills/{name} # 注销 Skill
|
||
PUT /api/v1/skills/{name} # 更新 Skill 配置
|
||
|
||
# ── Pipeline 编排 ──────────────────────────
|
||
POST /api/v1/pipelines # 创建 Pipeline
|
||
GET /api/v1/pipelines # 列出所有 Pipeline
|
||
POST /api/v1/pipelines/{id}/execute # 执行 Pipeline
|
||
PUT /api/v1/pipelines/{id} # 更新 Pipeline(运行时变更编排)
|
||
|
||
# ── LLM 管理 ──────────────────────────
|
||
GET /api/v1/llm/providers # 列出 LLM 提供商
|
||
GET /api/v1/llm/usage # 查询用量统计
|
||
GET /api/v1/llm/usage/{agent_name} # 按 Agent 查询用量
|
||
POST /api/v1/llm/budgets # 设置预算
|
||
|
||
# ── MCP ──────────────────────────
|
||
GET /api/v1/mcp/tools # 列出 MCP 工具
|
||
POST /api/v1/mcp/tools/{name}/call # 调用 MCP 工具
|
||
|
||
# ── Health ──────────────────────────
|
||
GET /api/v1/health # 健康检查
|
||
```
|
||
|
||
#### AgentPool 生命周期
|
||
|
||
```python
|
||
class AgentPool:
|
||
"""运行时 Agent 实例池"""
|
||
|
||
def __init__(self, llm_gateway, skill_registry, memory_factory):
|
||
self._agents: dict[str, Agent] = {}
|
||
self._llm_gateway = llm_gateway
|
||
self._skill_registry = skill_registry
|
||
self._memory_factory = memory_factory
|
||
|
||
async def create_agent(self, config: AgentConfig) -> Agent:
|
||
"""创建 Agent 实例"""
|
||
agent = Agent(
|
||
config=config,
|
||
llm_gateway=self._llm_gateway,
|
||
skills=[self._skill_registry.get(s) for s in config.skills],
|
||
memory=self._memory_factory.create(config.memory),
|
||
)
|
||
await agent.start()
|
||
self._agents[config.name] = agent
|
||
return agent
|
||
|
||
async def remove_agent(self, name: str) -> None:
|
||
"""停止并移除 Agent"""
|
||
agent = self._agents.pop(name, None)
|
||
if agent:
|
||
await agent.stop()
|
||
|
||
async def update_config(self, name: str, config: AgentConfig) -> None:
|
||
"""热更新 Agent 配置(无需重启)"""
|
||
agent = self._agents[name]
|
||
await agent.update_config(config)
|
||
|
||
async def get_agent(self, name: str) -> Agent | None:
|
||
return self._agents.get(name)
|
||
```
|
||
|
||
#### 与 GEO 项目的集成
|
||
|
||
```
|
||
GEO Backend (Python)
|
||
│
|
||
│ from agentkit_client import AgentKitClient
|
||
│ client = AgentKitClient(base_url="http://agentkit:8000")
|
||
│
|
||
│ # 提交任务
|
||
│ result = await client.submit_task({
|
||
│ "input_data": {"target_keyword": "AI", "brand_name": "BrandX"},
|
||
│ })
|
||
│
|
||
│ # 动态调整编排
|
||
│ await client.update_pipeline("content_production", new_config)
|
||
│
|
||
▼
|
||
AgentKit Server (独立部署)
|
||
│
|
||
├── Intent Router → 匹配 Skill
|
||
├── ReAct Engine → 执行任务
|
||
└── 返回标准化结果
|
||
```
|
||
|
||
---
|
||
|
||
## 4. 与当前代码的映射
|
||
|
||
### 4.1 保留的模块(改造升级)
|
||
|
||
| 当前模块 | v2 对应 | 改造内容 |
|
||
|---------|---------|---------|
|
||
| `BaseAgent` | `Agent` | 加入 ReAct Engine、LLM Gateway 替换 llm_client |
|
||
| `ConfigDrivenAgent` | 删除 | 被 `Agent` + `Skill` 组合取代 |
|
||
| `AgentConfig` | `SkillConfig` | 增加 intent、quality_gate、execution_mode |
|
||
| `ToolRegistry` | `ToolRegistry` | 保持不变 |
|
||
| `FunctionTool` | `FunctionTool` | 保持不变 |
|
||
| `AgentTool` | `AgentTool` | 保持不变 |
|
||
| `MCPTool` | `MCPTool` | 保持不变 |
|
||
| `SequentialChain/ParallelFanOut` | `SequentialChain/ParallelFanOut` | 保持不变 |
|
||
| `DynamicSelector` | 删除 | 被 ReAct + Function Calling 取代 |
|
||
| `WorkingMemory` | `WorkingMemory` | 保持不变 |
|
||
| `EpisodicMemory` | `EpisodicMemory` | 实现 pgvector cosine distance |
|
||
| `SemanticMemory` | `SemanticMemory` | 增强 RAG 集成 |
|
||
| `MemoryRetriever` | `MemoryRetriever` | 保持不变 |
|
||
| `Reflector` | `Reflector` | 保持不变 |
|
||
| `PromptOptimizer` | `PromptOptimizer` | 保持不变 |
|
||
| `ABTester` | `ABTester` | 保持不变 |
|
||
| `EvolutionMixin` | `EvolutionMixin` | 保持不变 |
|
||
| `PipelineEngine` | `PipelineEngine` | 保持不变 |
|
||
| `HandoffManager` | `HandoffManager` | 保持不变 |
|
||
| `DynamicPipeline` | `DynamicPipeline` | 保持不变 |
|
||
| `MCPServer` | `MCPServer` | 增加 SSE 流式响应 |
|
||
| `MCPClient` | `MCPClient` | 增加自动发现 |
|
||
| `PromptTemplate` | `PromptTemplate` | 保持不变 |
|
||
| `PromptSection` | `PromptSection` | 保持不变 |
|
||
| `TaskDispatcher` | `TaskDispatcher` | 保持不变 |
|
||
| `AgentRegistry` | `AgentRegistry` | 保持不变 |
|
||
|
||
### 4.2 新增的模块
|
||
|
||
| v2 模块 | 职责 |
|
||
|---------|------|
|
||
| `ReActEngine` | ReAct 推理-行动循环 |
|
||
| `IntentRouter` | 三级意图路由(关键词 → Embedding → LLM) |
|
||
| `LLMGateway` | LLM 统一网关(调用、路由、计量、限流) |
|
||
| `LLMProvider` | LLM 提供商适配器(OpenAI/DeepSeek/Anthropic) |
|
||
| `UsageTracker` | 用量统计 |
|
||
| `BudgetController` | 预算控制 |
|
||
| `RateLimiter` | 限流 |
|
||
| `QualityGate` | 产出质量管理 |
|
||
| `OutputStandardizer` | 标准化输出 |
|
||
| `SkillRegistry` | Skill 注册中心 |
|
||
| `SkillLoader` | Skill YAML 加载 |
|
||
| `AgentPool` | Agent 实例池 |
|
||
| `AgentKitServer` | FastAPI 服务入口 |
|
||
| `AgentKitClient` | Python SDK 客户端 |
|
||
|
||
### 4.3 删除的模块
|
||
|
||
| 当前模块 | 原因 |
|
||
|---------|------|
|
||
| `ConfigDrivenAgent` | 被 `Agent` + `Skill` 组合取代 |
|
||
| `DynamicSelector` | 被 ReAct + Function Calling 取代 |
|
||
| `StandaloneRunner` | 被 `AgentKitServer` 取代 |
|
||
|
||
---
|
||
|
||
## 5. 实施路线图
|
||
|
||
### Phase 1: 核心引擎升级
|
||
|
||
**目标**:让 Agent 有"思考"能力
|
||
|
||
1. 实现 `ReActEngine`(含 Function Calling 支持)
|
||
2. 实现 `LLMGateway`(统一调用 + 用量统计)
|
||
3. 重构 `Agent` 类(集成 ReAct + LLM Gateway)
|
||
4. 实现 `SkillConfig` 和 `SkillRegistry`
|
||
|
||
**验证标准**:一个 Agent 实例能通过 ReAct 循环自主选择 Tool 完成任务
|
||
|
||
### Phase 2: 意图识别 + 质量管理
|
||
|
||
**目标**:让 Agent 能自动路由和保证输出质量
|
||
|
||
1. 实现 `IntentRouter`(三级路由)
|
||
2. 实现 `QualityGate`
|
||
3. 实现 `OutputStandardizer`
|
||
4. 将 GEO 的 8 个 YAML 配置迁移为 Skill 配置
|
||
|
||
**验证标准**:提交任意任务,Router 自动路由到正确 Skill,输出通过质量检查
|
||
|
||
### Phase 3: 服务化
|
||
|
||
**目标**:让 AgentKit 成为独立部署的服务
|
||
|
||
1. 实现 `AgentKitServer`(FastAPI)
|
||
2. 实现 `AgentPool`
|
||
3. 实现 `AgentKitClient`(Python SDK)
|
||
4. 实现配置热更新 API
|
||
|
||
**验证标准**:GEO 项目通过 HTTP API 调用 AgentKit,无需 import 内部类
|
||
|
||
### Phase 4: 增强与优化
|
||
|
||
**目标**:生产级质量
|
||
|
||
1. 实现 `BudgetController` 和 `RateLimiter`
|
||
2. 实现 Embedding 路由
|
||
3. 实现 MCP SSE 流式响应
|
||
4. 实现 MCP Client 自动发现
|
||
5. 实现流式输出(SSE)
|
||
6. 添加认证/授权
|
||
|
||
**验证标准**:生产环境可用,有完整的监控和成本控制
|
||
|
||
---
|
||
|
||
## 6. 风险与缓解
|
||
|
||
| 风险 | 影响 | 缓解 |
|
||
|------|------|------|
|
||
| ReAct 循环 token 消耗高 | 成本增加 | max_steps 限制 + 小模型路由 + 关键词预路由 |
|
||
| Function Calling 不是所有模型都支持 | 兼容性 | 降级到文本解析模式(解析 LLM 输出中的 Tool 调用) |
|
||
| 服务化增加延迟 | 性能 | 本地缓存 + 异步执行 + 流式输出 |
|
||
| Skill 配置迁移工作量大 | 进度 | 提供迁移脚本,自动转换 AgentConfig → SkillConfig |
|
||
| 多 Agent 协同复杂度 | 可靠性 | 保持现有 Pipeline + Handoff 架构,ReAct 只在单 Agent 内 |
|