title: "refactor: AgentKit architecture evolution, aligned with industry best practices"
status: active
plan_id: "2026-06-16-006"
created: 2026-06-16
depth: deep
origin: "Optimize the AgentKit architecture based on an in-depth analysis of Codex/Claude Code/Trae/Qoder"

AgentKit Architecture Optimization and Evolution Plan

Summary

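This plan evolves the AgentKit architecture based on an in-depth analysis of Codex CLI, Claude Code, Trae Agent 2.0, and Qoder. Core changes: (1) replace the "routing" concept with request preprocessing, (2) simplify the expert team from decentralized collaboration to hub-and-spoke, (3) simplify PlanExec into a Spec-Driven mode, (4) persist chat history in SQLite, (5) delete the legacy routing-layer code, (6) add verifiable execution and slimmer tool descriptions.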

Problem Frame

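AgentKit's current architecture has three classes of problems:

Misleading concept: SimpleRouter still implies that a "routing layer" exists, but it only parses the @skill prefix and applies a greeting fast-path; the core decisions are already made by the LLM inside the REACT agent loop.
Over-engineering: ExpertTeam's decentralized collaboration (CollaborationPlan + HandoffTransport + SharedWorkspace + 3 MergeStrategy variants) goes far beyond industry practice and adds ~1500 lines of code without corresponding value.
Capability gaps: PlanExecEngine defaults to _LLMStepExecutor (plain LLM calls with no tools), chat history is not persisted, and there is no automatic verification loop.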

Requirements

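R1: Eliminate the "routing" concept: rename SimpleRouter to RequestPreprocessor, changing its semantics from "routing decision" to "request preprocessing"
R2: Simplify the expert team to a hub-and-spoke model (Lead Expert + parallel Tasks, depth = 1); remove the ExpertTeam-specific logic in CollaborationPlan/HandoffTransport/SharedWorkspace
R3: PlanExecEngine uses ReActStepExecutor by default; delete _LLMStepExecutor
R4: Persist chat history in SQLite (modeled on Codex's Thread persistence)
R5: Delete CostAwareRouter and related code (HeuristicClassifier, IntentRouter, QualityGate, SemanticRouter)
R6: Add verifiable execution (a Test-and-Verify loop, modeled on Codex Cloud)
R7: Inject tool descriptions in tiers (core tools in full + one-line descriptions for extended tools + on-demand lookup via tool_search)
R8: Enable context compression by default
R9: Make Spec documents first-class citizens (modeled on Qoder Quest Mode)
R10: Unify the event model (SQ/EQ dual queues, modeled on Codex)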

Key Technical Decisions

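KTD1: Rename SimpleRouter to RequestPreprocessor

Decision: Rename SimpleRouter to RequestPreprocessor and rename the route() method to preprocess().

Rationale:

Codex has no routing layer, Claude Code has no routing layer, and Trae removed its Proposal stage; the industry consensus is to have no routing layer.
All three SimpleRouter functions (@skill prefix, greeting regex, default REACT) are preprocessing, not routing decisions.
The "routing" concept misleads developers into assuming an intent-prediction layer exists, when in fact the LLM decides autonomously inside the agent loop.

Alternative considered: delete SimpleRouter entirely and send every request straight into the REACT loop. Rejected: the greeting fast-path saves ~100 tokens + 500 ms per request, and the @skill prefix is an explicit user instruction that needs code-level parsing.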

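KTD2: Hub-and-spoke expert team

Decision: Simplify ExpertTeam from decentralized collaboration to Lead Expert + parallel Tasks (depth = 1).

Rationale:

Claude Code: the Task tool has depth = 1; a sub-agent cannot spawn further sub-agents.
Codex: spawn_agent is hierarchical; results return to the parent agent.
Qoder: multiple experts execute independently in parallel, and the main agent aggregates the results.
Decentralized collaboration has O(N²) communication complexity; hub-and-spoke has O(N).
The same LLM playing different "experts" does not produce genuine diversity of viewpoints; it is equivalent to sampling multiple times and merging.

Keep: ExpertConfig/ExpertTemplate/Registry (define expert personas) and the BEST merge strategy (the Lead Agent picks the best result).

Remove: CollaborationPlan's phase dependency graph, HandoffTransport's inter-agent communication, SharedWorkspace's cross-phase state sharing, and the VOTE/FUSION merge strategies.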

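KTD3: PlanExec defaults to ReActStepExecutor + Spec-Driven

Decision: Use ReActStepExecutor (already implemented) by default and delete _LLMStepExecutor; add Spec document persistence.

Rationale:

_LLMStepExecutor does not support tool calls, so it has no ability to execute anything.
ReActStepExecutor is already implemented on top of ReActEngine and supports tool calls and multi-step reasoning.
Qoder Quest Mode: Spec First is the contract between human and AI; execution starts only after the user confirms.
The current PlanExec plan is invisible to the user, who cannot correct its direction before execution.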

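KTD4: SQLite persistence for chat history

Decision: Persist chat history with SQLite (modeled on Codex's Thread persistence).

Rationale:

Codex uses SQLite (lightweight, zero-configuration, cross-platform) and supports resume/fork/archive.
Claude Code uses append-only JSONL (simpler, but weaker for search).
The current ConversationStore is purely in-memory, so data is lost when the service restarts.
SQLite supports session search and paginated loading without requiring an extra service.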

Implementation Units

U1. Rename SimpleRouter to RequestPreprocessor

Goal: Eliminate the "routing" concept by renaming SimpleRouter to RequestPreprocessor

Dependencies: None

Files:

src/agentkit/chat/simple_router.py → rename to src/agentkit/chat/request_preprocessor.py
src/agentkit/chat/skill_routing.py: update references
src/agentkit/server/routes/portal.py: update imports and calls
src/agentkit/server/routes/chat.py: update imports and calls
src/agentkit/server/app.py: update imports and calls
tests/unit/chat/test_simple_router.py → rename and update

Approach:

Create request_preprocessor.py with class RequestPreprocessor; rename the method route() → preprocess()
Rename _is_direct_chat() to _is_trivial_input()
Keep SkillRoutingResult (it is a data structure and does not carry the routing concept)
Update all call sites
Delete the old file

Patterns to follow: the existing SimpleRouter code structure

Test scenarios:

An @skill:xxx prefix is parsed into SKILL_REACT mode
A greeting regex match returns DIRECT_CHAT
Default input returns REACT mode
An unknown skill falls back to REACT
The preprocess() signature is compatible with route()

Verification: ruff check src/ && pytest tests/unit/chat/ -v
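
The three preprocessing behaviors listed for U1 can be sketched as follows. This is a minimal illustration, not the existing SimpleRouter code: the regex patterns, the `known_skills` constructor argument, and the `skill` and `text` fields on SkillRoutingResult are assumptions made for the example.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ExecutionMode(Enum):
    DIRECT_CHAT = "direct_chat"
    SKILL_REACT = "skill_react"
    REACT = "react"


@dataclass
class SkillRoutingResult:
    mode: ExecutionMode
    skill: Optional[str] = None  # hypothetical field
    text: str = ""               # hypothetical field


# Hypothetical patterns; the real ones live in the existing SimpleRouter.
_SKILL_PREFIX = re.compile(r"^@skill:(?P<name>[\w-]+)\s*(?P<rest>.*)$", re.DOTALL)
_GREETING = re.compile(r"^(hi|hello|hey|你好)[\s!.！。]*$", re.IGNORECASE)


class RequestPreprocessor:
    def __init__(self, known_skills: set[str]) -> None:
        self._known_skills = known_skills

    def _is_trivial_input(self, text: str) -> bool:
        # Greeting fast-path: skip the agent loop for trivial input.
        return bool(_GREETING.match(text.strip()))

    def preprocess(self, text: str) -> SkillRoutingResult:
        stripped = text.strip()
        match = _SKILL_PREFIX.match(stripped)
        if match and match.group("name") in self._known_skills:
            return SkillRoutingResult(
                ExecutionMode.SKILL_REACT, match.group("name"), match.group("rest")
            )
        if self._is_trivial_input(stripped):
            return SkillRoutingResult(ExecutionMode.DIRECT_CHAT, text=stripped)
        # Unknown skills and all other input fall through to the REACT loop.
        return SkillRoutingResult(ExecutionMode.REACT, text=stripped)
```

Nothing here predicts intent: the explicit prefix and the greeting check are the only code-level branches, and every other request reaches the LLM unchanged.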

U2. Delete CostAwareRouter and related code

Goal: Delete the legacy routing-layer code that SimpleRouter has already replaced

Dependencies: U1

Files:

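Most files under src/agentkit/router/ (keep __init__.py and the necessary exports)
src/agentkit/server/app.py: clean up commented-out references
src/agentkit/chat/cost_aware_router.py: delete

Approach:

Confirm that CostAwareRouter has no active references in the code (it is already commented out in app.py)
Delete cost_aware_router.py, heuristic_classifier.py, intent.py (IntentRouter), quality_gate.py, and semantic_router.py
Keep router/__init__.py exporting the necessary types (e.g. ExecutionMode, if the frontend depends on it)
Clean up the commented-out references in app.py
Update __init__.py in the router/ directory

Test scenarios:

ruff check reports no errors after the deletion
pytest -m "not integration" passes in full
No import errors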

Verification: ruff check src/ && pytest -m "not integration" -x

U3. Simplify the expert team to hub-and-spoke

Goal: Simplify ExpertTeam from decentralized collaboration to the Lead Expert + parallel Tasks model

Dependencies: None (can proceed in parallel with U1/U2)

Files:

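src/agentkit/experts/orchestrator.py: rewrite as hub-and-spoke
src/agentkit/experts/team.py: simplify; remove the CollaborationPlan dependency
src/agentkit/experts/plan.py: simplify; keep MergeStrategy.BEST
src/agentkit/core/handoff_transport.py: remove ExpertTeam-specific logic
src/agentkit/core/shared_workspace.py: remove ExpertTeam-specific logic
src/agentkit/experts/router.py: simplify to @team prefix + RequestPreprocessor integration
tests/unit/experts/test_orchestrator.py: update

Approach:

Rewrite TeamOrchestrator: the Lead Expert plans autonomously and spawns Tasks in parallel
Remove CollaborationPlan's phase dependency graph; the Lead Expert decides the execution order itself
Remove HandoffTransport's inter-agent communication; Task results return directly to the Lead Expert
Remove SharedWorkspace's cross-phase state sharing; the Lead Expert holds all state
Keep MergeStrategy.BEST (the Lead Agent picks the best result); delete VOTE/FUSION
Simplify ExpertTeamRouter to trigger on the @team prefix
Leave ExpertConfig/ExpertTemplate/Registry unchanged

Technical design:

New TeamOrchestrator flow:
1. User input → @team:xxx prefix → ExpertTeamMode
2. The Lead Expert receives the task and decomposes it into subtasks on its own
3. Spawn Tasks in parallel (each Task is an independent ReActEngine instance)
4. Wait for all Tasks to finish
5. The Lead Expert aggregates the results (BEST strategy)
6. Return the final result

Constraints:
- Task depth = 1 (a Task cannot spawn another Task)
- No communication between Tasks
- The Lead Expert holds all state

Test scenarios:

The Lead Expert correctly decomposes the task into subtasks
Parallel Tasks execute independently and return results
The Lead Expert aggregates the results
A single Task failure does not affect the other Tasks
If all Tasks fail, fall back to the Lead Expert executing alone
The @team prefix correctly triggers ExpertTeamMode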

Verification: ruff check src/ && pytest tests/unit/experts/ -v
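
The hub-and-spoke flow and its failure scenarios can be sketched with `asyncio.gather`. This is an illustration under stated assumptions, not the planned TeamOrchestrator: `run_team`, `TaskResult`, and the injected `decompose`, `worker`, `pick_best`, and `lead_fallback` callables are hypothetical names, and a plain async function stands in for each ReActEngine-backed Task.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class TaskResult:
    subtask: str
    output: Optional[str] = None
    error: Optional[str] = None


# A worker stands in for one independent ReActEngine instance. It receives
# only a subtask string and no spawn handle, which keeps Task depth at 1.
Worker = Callable[[str], Awaitable[str]]


async def _run_task(worker: Worker, subtask: str) -> TaskResult:
    try:
        return TaskResult(subtask, output=await worker(subtask))
    except Exception as exc:
        # Capture the failure so one Task cannot take down the others.
        return TaskResult(subtask, error=str(exc))


async def run_team(
    task: str,
    decompose: Callable[[str], list[str]],
    worker: Worker,
    pick_best: Callable[[list[TaskResult]], str],
    lead_fallback: Worker,
) -> str:
    subtasks = decompose(task)  # the Lead Expert plans on its own
    # Tasks run concurrently and never talk to each other.
    results = await asyncio.gather(*(_run_task(worker, s) for s in subtasks))
    succeeded = [r for r in results if r.error is None]
    if not succeeded:
        # Every Task failed: the Lead Expert handles the task alone.
        return await lead_fallback(task)
    return pick_best(succeeded)  # BEST strategy: the lead picks one result
```

All state stays in `run_team`, so the hub is the only place where results meet; each added Task adds one edge to the hub rather than one edge per peer.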

U4. PlanExec defaults to ReActStepExecutor + delete _LLMStepExecutor

Goal: PlanExecEngine uses ReActStepExecutor by default; delete _LLMStepExecutor

Dependencies: None

Files:

src/agentkit/core/plan_exec_engine.py: delete _LLMStepExecutor and _LLMStepAgent; default step_executor_type="react"
tests/unit/core/test_plan_exec_engine.py: update

Approach:

Delete the _LLMStepExecutor and _LLMStepAgent classes
Remove the step_executor_type parameter from _create_executor() so that it always uses ReActStepExecutor
Clean up the related imports

Test scenarios:

PlanExecEngine creates a ReActStepExecutor by default
ReActStepExecutor correctly executes steps that involve tool calls
A step failure triggers replanning

Verification: ruff check src/ && pytest tests/unit/core/test_plan_exec_engine.py -v

U5. SQLite persistence for chat history

Goal: Persist chat history in SQLite so that it survives service restarts

Dependencies: U1 (update call sites once the RequestPreprocessor rename is complete)

Files:

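src/agentkit/chat/sqlite_conversation_store.py: new file
src/agentkit/server/routes/portal.py: replace ConversationStore
src/agentkit/chat/conversation_store.py: keep as the interface / in-memory implementation

Approach:

Create SqliteConversationStore, implementing the same interface as ConversationStore
SQLite table schema: conversations(id, session_id, role, content, timestamp, metadata)
Support querying by session_id, paginated loading, and search
Database file path: ~/.agentkit/conversations.db
In portal.py, replace ConversationStore with SqliteConversationStore
Keep ConversationStore as the in-memory implementation (for tests)

Test scenarios:

Messages are persisted to SQLite correctly
Querying by session_id returns the full conversation
Paginated loading works correctly
Data survives a service restart
The SQLite file is created automatically if it does not exist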

Verification: ruff check src/ && pytest tests/unit/chat/test_sqlite_conversation_store.py -v
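
A minimal sketch of such a store, using only the standard-library `sqlite3` module. The table columns and default path come from the plan; the method names (`append`, `load`, `search`, `close`), the column types, the index, and the JSON encoding of `metadata` are assumptions, since the actual ConversationStore interface is not shown here.

```python
import json
import sqlite3
import time
from pathlib import Path
from typing import Any, Optional

_SCHEMA = """
CREATE TABLE IF NOT EXISTS conversations (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT NOT NULL,
    role       TEXT NOT NULL,
    content    TEXT NOT NULL,
    timestamp  REAL NOT NULL,
    metadata   TEXT
);
CREATE INDEX IF NOT EXISTS idx_conversations_session
    ON conversations (session_id, id);
"""


class SqliteConversationStore:
    def __init__(self, db_path: str = "~/.agentkit/conversations.db") -> None:
        path = Path(db_path).expanduser()
        path.parent.mkdir(parents=True, exist_ok=True)  # create on first use
        self._conn = sqlite3.connect(str(path))
        self._conn.execute("PRAGMA journal_mode=WAL")  # readers do not block the writer
        self._conn.executescript(_SCHEMA)

    def append(self, session_id: str, role: str, content: str,
               metadata: Optional[dict[str, Any]] = None) -> int:
        with self._conn:  # commits on success, rolls back on error
            cur = self._conn.execute(
                "INSERT INTO conversations "
                "(session_id, role, content, timestamp, metadata) "
                "VALUES (?, ?, ?, ?, ?)",
                (session_id, role, content, time.time(), json.dumps(metadata or {})),
            )
        return cur.lastrowid

    def load(self, session_id: str, limit: int = 50, offset: int = 0) -> list[dict]:
        rows = self._conn.execute(
            "SELECT role, content, timestamp, metadata FROM conversations "
            "WHERE session_id = ? ORDER BY id LIMIT ? OFFSET ?",
            (session_id, limit, offset),
        ).fetchall()
        return [
            {"role": r, "content": c, "timestamp": t, "metadata": json.loads(m)}
            for r, c, t, m in rows
        ]

    def search(self, keyword: str, limit: int = 20) -> list[dict]:
        rows = self._conn.execute(
            "SELECT session_id, role, content FROM conversations "
            "WHERE content LIKE ? ORDER BY id DESC LIMIT ?",
            (f"%{keyword}%", limit),
        ).fetchall()
        return [{"session_id": s, "role": r, "content": c} for s, r, c in rows]

    def close(self) -> None:
        self._conn.close()
```

The `LIKE` search is the simplest option that satisfies the search requirement; a full-text index could replace it later without changing the method signature.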

U6. Verifiable execution (Test-and-Verify loop)

Goal: After ReActEngine finishes, optionally run the project's tests automatically to verify the result

Dependencies: U4

Files:

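src/agentkit/core/verification_loop.py: new file
src/agentkit/core/react.py: integrate the verification loop
src/agentkit/tools/builtin.py: add the run_tests tool

Approach:

Create VerificationLoop: after execution, run pytest/ruff/typecheck and retry automatically on failure
The maximum number of retries is configurable (default 2)
Attach the verification result to ReActResult
Add a built-in run_tests tool that the LLM can call proactively
The verification loop is off by default and is enabled with the parameter verification_enabled=True

Test scenarios:

Behavior is unchanged when the verification loop is off
When the verification loop is on, tests run automatically after execution
Passing tests return success
Failing tests trigger an automatic retry
A failure result is returned once the maximum number of retries is reached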

Verification: ruff check src/ && pytest tests/unit/core/test_verification_loop.py -v
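
The retry behavior described for VerificationLoop can be sketched as below. This is one possible shape, not the planned implementation: `VerificationResult`, `run_command`, the injectable `runner`, and the convention of passing failure output back to `execute` as feedback are all assumptions. Here `max_retries=2` is read as one initial attempt plus up to two retries.

```python
import subprocess
from dataclasses import dataclass, field
from typing import Callable, Sequence


@dataclass
class VerificationResult:
    passed: bool
    attempts: int
    logs: list[str] = field(default_factory=list)


def run_command(cmd: Sequence[str]) -> tuple[bool, str]:
    """Run one check (e.g. ["pytest", "-x"]) and report success plus output."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr


class VerificationLoop:
    def __init__(
        self,
        checks: list[list[str]],
        max_retries: int = 2,
        runner: Callable[[Sequence[str]], tuple[bool, str]] = run_command,
    ) -> None:
        self.checks = checks
        self.max_retries = max_retries
        self.runner = runner  # injectable so tests need no real subprocess

    def run(self, execute: Callable[[str], None]) -> VerificationResult:
        feedback = ""  # empty on the first attempt
        logs: list[str] = []
        for attempt in range(1, self.max_retries + 2):
            execute(feedback)  # the agent acts, seeing the last failure output
            failures = []
            for cmd in self.checks:
                ok, output = self.runner(cmd)
                if not ok:
                    failures.append(output)
            if not failures:
                return VerificationResult(True, attempt, logs)
            feedback = "\n".join(failures)
            logs.append(feedback)
        return VerificationResult(False, self.max_retries + 1, logs)
```

A caller would pass something like `checks=[["ruff", "check", "src/"], ["pytest", "-x"]]`; keeping the loop off by default means existing callers never construct it.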

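U7. Tiered tool-description injection + tool_search

Goal: Inject core tools in full and extended tools as name + one-line description only; the LLM can fetch full descriptions via tool_search

Dependencies: None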

Files:

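src/agentkit/core/react.py: modify _build_tool_use_prompt
src/agentkit/tools/builtin.py: add the tool_search tool
src/agentkit/tools/search.py: new file, BM25 tool search

Approach:

Split tools into core (read/write/bash/search) and extended (everything else)
Inject core tools into the prompt in full
Inject only name + one-line description for extended tools
Add the tool_search tool: BM25 search over tool descriptions, returning the full description
The LLM calls tool_search on demand inside the agent loop

Test scenarios:

Core tools appear in the prompt in full
Extended tools appear with only a name and a one-line description
tool_search returns the tool's full description
BM25 search ranks results by relevance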

Verification: ruff check src/ && pytest tests/unit/tools/test_tool_search.py -v
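
The tiering and the BM25 lookup can be sketched together in a few dozen lines. The `Tool` dataclass, `build_tool_prompt`, and `ToolSearch` are hypothetical names for illustration; the scoring uses the common Okapi BM25 form with `k1=1.5` and `b=0.75`, which are conventional defaults rather than values taken from the plan.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Tool:
    name: str
    summary: str       # one-line description, always cheap to inject
    description: str   # full description, injected only for core tools
    core: bool = False


def build_tool_prompt(tools: list[Tool]) -> str:
    """Core tools get the full description; extended tools get one line."""
    return "\n".join(
        f"- {t.name}: {t.description if t.core else t.summary}" for t in tools
    )


def _tokens(text: str) -> list[str]:
    return re.findall(r"[a-z0-9_]+", text.lower())


class ToolSearch:
    def __init__(self, tools: list[Tool], k1: float = 1.5, b: float = 0.75) -> None:
        self._tools = list(tools)
        self._docs = [Counter(_tokens(f"{t.name} {t.description}")) for t in self._tools]
        self._lens = [sum(d.values()) for d in self._docs]
        self._avg_len = sum(self._lens) / max(len(self._lens), 1)
        self._k1, self._b = k1, b

    def _idf(self, term: str) -> float:
        n = sum(1 for d in self._docs if term in d)
        return math.log((len(self._docs) - n + 0.5) / (n + 0.5) + 1.0)

    def search(self, query: str, top_k: int = 3) -> list[Tool]:
        scored = []
        for tool, doc, length in zip(self._tools, self._docs, self._lens):
            score = 0.0
            for term in _tokens(query):
                tf = doc[term]
                if tf == 0:
                    continue
                # Length normalization: long descriptions are penalized.
                norm = tf + self._k1 * (1 - self._b + self._b * length / self._avg_len)
                score += self._idf(term) * tf * (self._k1 + 1) / norm
            if score > 0:
                scored.append((score, tool))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [tool for _, tool in scored[:top_k]]
```

A `tool_search` tool would wrap `ToolSearch.search` and return each hit's `description`, so the full text enters the context only when the LLM asks for it.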

U8. Enable context compression by default

Goal: ReActEngine enables sliding-window compression by default

Dependencies: None

Files:

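src/agentkit/core/react.py: change the default compressor argument
src/agentkit/core/compressor.py: confirm the sliding-window implementation

Approach:

In ReActEngine's __init__, change the compressor default from None to SlidingWindowCompressor
Keep the most recent N turns + the system prompt + tool descriptions
N is configurable (default 10)

Test scenarios:

Long conversations are compressed automatically
The system prompt and tool descriptions are kept after compression
Compression does not affect the most recent N turns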

Verification: ruff check src/ && pytest tests/unit/core/test_compressor.py -v
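
A minimal sketch of the sliding-window rule, assuming messages are dicts with a `role` key, that the system prompt and tool descriptions both live in `system` messages, and that a turn begins at each `user` message. The existing compressor.py may define turns and pinned content differently; the `compress` method name and `max_turns` field are illustrative.

```python
from dataclasses import dataclass

Message = dict[str, str]


@dataclass
class SlidingWindowCompressor:
    max_turns: int = 10

    def __post_init__(self) -> None:
        if self.max_turns < 1:
            raise ValueError("max_turns must be at least 1")

    def compress(self, messages: list[Message]) -> list[Message]:
        # Pinned content (system prompt, tool descriptions) is never dropped.
        pinned = [m for m in messages if m["role"] == "system"]
        dialogue = [m for m in messages if m["role"] != "system"]
        # Each user message opens a new turn.
        starts = [i for i, m in enumerate(dialogue) if m["role"] == "user"]
        if len(starts) <= self.max_turns:
            return pinned + dialogue
        cut = starts[-self.max_turns]  # first message of the oldest kept turn
        return pinned + dialogue[cut:]
```

Cutting at a turn boundary keeps each retained user message together with its assistant and tool replies, so the window never starts mid-exchange.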

U9. Spec documents as first-class citizens

Goal: Persist the plan generated by PlanExec as a Spec document that the user can view, edit, and confirm before execution

Dependencies: U4

Files:

src/agentkit/core/spec_manager.py: new file
src/agentkit/core/plan_exec_engine.py: integrate SpecManager
src/agentkit/server/routes/tasks.py: add Spec-related APIs

Approach:

Create SpecManager: CRUD for Spec documents
Spec file path: .agentkit/specs/<plan_id>.yaml
After PlanExecEngine generates a plan, persist it as a Spec first
New APIs: GET /api/v1/specs, GET /api/v1/specs/{id}, PUT /api/v1/specs/{id}, POST /api/v1/specs/{id}/confirm
Execution starts only after the user confirms

Test scenarios:

The plan is persisted as a Spec file correctly
The Spec file can be read and edited
An unconfirmed Spec is not executed
Confirmation triggers execution

Verification: ruff check src/ && pytest tests/unit/core/test_spec_manager.py -v
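
The confirm-before-execute gate can be sketched as follows. To stay within the standard library, this example stores Specs as JSON rather than the YAML format named in the plan; the `Spec` fields, `SpecNotConfirmedError`, and the `require_confirmed` method are hypothetical and only illustrate the gating idea.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class Spec:
    plan_id: str
    goal: str
    steps: list[str] = field(default_factory=list)
    confirmed: bool = False


class SpecNotConfirmedError(RuntimeError):
    """Raised when execution is requested for an unconfirmed Spec."""


class SpecManager:
    def __init__(self, root: str = ".agentkit/specs") -> None:
        self._root = Path(root)
        self._root.mkdir(parents=True, exist_ok=True)

    def _path(self, plan_id: str) -> Path:
        # JSON is used here only to avoid a third-party YAML dependency.
        return self._root / f"{plan_id}.json"

    def save(self, spec: Spec) -> None:
        text = json.dumps(asdict(spec), ensure_ascii=False, indent=2)
        self._path(spec.plan_id).write_text(text, encoding="utf-8")

    def load(self, plan_id: str) -> Spec:
        return Spec(**json.loads(self._path(plan_id).read_text(encoding="utf-8")))

    def confirm(self, plan_id: str) -> Spec:
        spec = self.load(plan_id)
        spec.confirmed = True
        self.save(spec)
        return spec

    def require_confirmed(self, plan_id: str) -> Spec:
        # The engine would call this before running any step.
        spec = self.load(plan_id)
        if not spec.confirmed:
            raise SpecNotConfirmedError(plan_id)
        return spec
```

Because the Spec is a plain file on disk, the user can edit it with any editor between generation and confirmation, and the PUT endpoint reduces to `load`, modify, `save`.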

U10. Unified event model (SQ/EQ dual queues)

Goal: Unify the CLI and WebSocket event models into SQ/EQ dual queues

Dependencies: U3

Files:

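src/agentkit/core/protocol.py: add SQ/EQ event types
src/agentkit/server/routes/portal.py: connect to the EQ
src/agentkit/cli/chat.py: connect to the EQ

Approach:

Define SubmissionQueue (user input) and EventQueue (agent output)
Event types: a three-level Session/Task/Turn model
The Portal WebSocket and the CLI share the same event stream
The frontend can render events uniformly

Test scenarios:

The SQ receives user input correctly
The EQ pushes agent events correctly
WebSocket and CLI share the event stream
Event types are classified correctly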

Verification: ruff check src/ && pytest tests/unit/core/test_protocol.py -v
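
One way to picture the dual-queue model is a pair of `asyncio.Queue` objects around the agent. The sketch below is an assumption-heavy illustration: `Submission`, `Event`, `AgentSession`, the event `kind` strings, and the echo behavior are hypothetical and not taken from Codex or from the planned protocol.py.

```python
import asyncio
from dataclasses import asdict, dataclass
from typing import Any


@dataclass
class Submission:
    session_id: str
    text: str


@dataclass
class Event:
    # Session / Task / Turn: the three levels named in the plan.
    session_id: str
    task_id: str
    turn: int
    kind: str  # hypothetical kinds: task_started, agent_message, task_complete
    payload: Any = None


class AgentSession:
    """Create inside a running event loop so both queues bind to it."""

    def __init__(self) -> None:
        self.sq: asyncio.Queue = asyncio.Queue()  # Submission Queue: user input
        self.eq: asyncio.Queue = asyncio.Queue()  # Event Queue: agent output

    async def run_once(self) -> None:
        sub = await self.sq.get()
        task_id = "task-1"  # placeholder id for the sketch
        await self.eq.put(Event(sub.session_id, task_id, 1, "task_started"))
        # A real agent loop would run here; the sketch just echoes the input.
        await self.eq.put(
            Event(sub.session_id, task_id, 1, "agent_message", f"echo: {sub.text}")
        )
        await self.eq.put(Event(sub.session_id, task_id, 1, "task_complete"))


def to_wire(event: Event) -> dict:
    """Single serialization that both the WebSocket and the CLI could consume."""
    return asdict(event)
```

With this shape the Portal WebSocket handler and the CLI differ only in how they render `to_wire(event)`; neither needs to know how the agent produced it.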

Scope Boundaries

In Scope

The 10 Implementation Units above
Unit test coverage

Out of Scope

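DockerComputerUseSession implementation (P3, pending validation of user demand)
Frontend component updates (later iteration)
Agent configuration hot reload (P4)
Progressive context loading (P4)
Multi-dimensional triggers for Soul evolution (closed)
OTel instrumentation (deferred)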

Deferred to Follow-Up Work

Connect the frontend ExpertTeamView to real data
SWE-bench end-to-end validation
Performance monitoring and cost tracking

Risks & Dependencies

Risk	Impact	Mitigation
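U3: the expert-team rewrite may affect existing @team functionality	Medium	Keep ExpertConfig/ExpertTemplate/Registry; rewrite only the Orchestrator
U5: SQLite may see lock contention under high concurrency	Low	Chat workloads write infrequently; SQLite WAL mode is sufficient
U7: tool_search may add LLM call rounds	Medium	Core tools are injected in full; only extended tools need a search
U9: Spec documents may add user steps	Low	Auto-confirm by default (configurable), so automated flows are not blocked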

Phased Delivery

Phase 1 (cleanup): U1 → U2 → U4 → U8
Phase 2 (core capabilities): U5 → U6 → U9
Phase 3 (multi-agent): U3 → U7 → U10
