fischer-agentkit/2026-06-05-001-feat-agentkit-tdd-validation-plan.md at 2296d0b2093725243e1d0e2df4ddb50e2952b632

title	type	status	date	origin	execution_posture
feat: fischer-agentkit TDD validation and completion plan	feat	active	2026-06-05	geo/docs/plans/2026-06-04-010-refactor-unified-agent-framework-plan.md	tdd

Summary

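Validate the six implemented modules of fischer-agentkit with TDD: first fill in the missing unit-test coverage (6 modules with zero coverage plus 4 weakly tested modules), then fix the problems the tests uncover (pgvector vector retrieval, the datetime deprecation, missing test infrastructure), and finally add 4 integration tests that verify the end-to-end flows. Tests run against real Redis and PostgreSQL services so that the validation results are reliable.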

Problem Frame

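The code for fischer-agentkit's six modules (Core/Tools/Memory/Evolution/Orchestrator/MCP) is fully implemented and all 189 existing tests pass, but the following structural problems remain:

6 modules have no tests at all: dispatcher, registry, mcp/server, evolution_store, agent_tool, prompts. The code exists, but its behavior is unverified.
4 modules are weakly tested: working_memory (no Redis mock), episodic_memory (only the decay formula is tested), mcp/client (tested only indirectly), handoff (only the no-Redis scenario)
Integration tests are missing entirely: the tests/integration/ directory is empty, so end-to-end flows cannot be verified
Code-quality issues: 21 datetime.utcnow() deprecation warnings, and pgvector vector retrieval in EpisodicMemory is still marked TODO
Test infrastructure is missing: there is no conftest.py, and fixtures are defined redundantly in 4 files

In other words, the code runs, but its core functions (task dispatch, Agent registration, the MCP server, evolution persistence) have never been verified by automated tests.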

Requirements

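This plan traces back to the following items in the original requirements document:

Requirement ID	Description	Verification status
R2	BaseAgent unified lifecycle	Partially verified (dispatcher/registry missing)
R6	Three Tool types (Function/Agent/MCP)	AgentTool not verified
R7	ToolRegistry registration, discovery, and version management	Largely verified
R8	MCP Server exposes Agent capabilities	Not verified
R9	MCP Client calls external tools	Verified only indirectly
R11	Working Memory on Redis	Not verified
R12	Episodic Memory vector retrieval	Not verified (TODO)
R13	Semantic Memory RAG+Graph	Largely verified
R14	Hybrid retrieval strategy	Partially verified
R15	Automatic recording of accumulated experience	Partially verified
R20	Handoff task transfer	No-Redis scenario only
R22	Event-driven replacement for polling	Not implemented (out of scope for this plan)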

Key Technical Decisions

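KTD1. Test against real services: both unit and integration tests use real Redis and PostgreSQL (pgvector) services, started as test-only containers via docker-compose. Rationale: fakeredis does not support every Redis command (for example, the full behavior of Pub/Sub), and a mocked SQLAlchemy session cannot verify real SQL or pgvector queries. Tests against real services are more reliable, and the GEO project already has Docker images for pgvector/pg15 and Redis 7.

KTD2. Test infrastructure first: create conftest.py and extract the shared fixtures before filling in tests module by module. Rationale: 4 files each define helpers such as _make_task(); without consolidation, later tests would keep duplicating them.

KTD3. TDD red-green cycle: for each module, first write tests that define the expected behavior (they may fail), then fix the code until the tests pass. For the pgvector TODO in EpisodicMemory, first write tests that define the expected vector-retrieval behavior, then implement cosine distance ordering.

KTD4. Fix datetime.utcnow() in one pass: fix the 21 deprecation warnings before adding tests, so that new tests do not inherit the technical debt. Replace each call with datetime.now(timezone.utc), consistent with the project's later code (agent_tool.py, pipeline_engine.py, and others).

KTD5. Class-based test style throughout: new tests use class TestXxx grouping with async def methods (relying on asyncio_mode = "auto") and no longer use the @pytest.mark.asyncio decorator. This matches the style of the project's newer test files.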
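
A minimal sketch of the class-based style from KTD5. The `echo` coroutine is a hypothetical stand-in for any async unit under test and is not part of fischer-agentkit. With pytest-asyncio configured as asyncio_mode = "auto", the async method below is collected and awaited without any decorator.

```python
import asyncio


async def echo(payload: dict) -> dict:
    """Hypothetical async unit under test (not a fischer-agentkit API)."""
    await asyncio.sleep(0)  # yield to the event loop once
    return dict(payload)


class TestEcho:
    """Related cases grouped in a class; no @pytest.mark.asyncio decorator."""

    async def test_returns_copy_of_payload(self):
        payload = {"task": "demo"}
        result = await echo(payload)
        assert result == payload
        assert result is not payload
```

Outside pytest the same method can be driven with asyncio.run(TestEcho().test_returns_copy_of_payload()), which is a quick way to check a test body in isolation.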

High-Level Technical Design

Test layering architecture

flowchart TB
    subgraph Infrastructure["Test infrastructure"]
        DC["docker-compose.test.yml<br/>Redis 7 + pgvector/pg15"]
        Conf["conftest.py<br/>Shared fixtures"]
        Env[".env.test<br/>Test environment variables"]
    end

    subgraph UnitTests["Unit tests (tests/unit/)"]
        P0["P0: zero-coverage modules<br/>dispatcher, registry<br/>mcp/server, evolution_store<br/>agent_tool, prompts"]
        P1["P1: weakly tested modules<br/>working_memory, episodic_memory<br/>mcp/client, handoff"]
        Fix["Code fixes<br/>datetime.utcnow, pgvector TODO"]
    end

    subgraph IntegrationTests["Integration tests (tests/integration/)"]
        AL["test_agent_lifecycle.py<br/>Full lifecycle"]
        TC["test_tool_composition.py<br/>Tool composition end to end"]
        EL["test_evolution_loop.py<br/>Evolution loop"]
        MR["test_mcp_roundtrip.py<br/>MCP round trip"]
    end

    Infrastructure --> UnitTests
    P0 --> Fix
    P1 --> Fix
    UnitTests --> IntegrationTests

Test execution flow

stateDiagram-v2
    [*] --> SetupInfra: Start test containers
    SetupInfra --> WriteTests: Write tests (RED)
    WriteTests --> RunTests: Run tests
    RunTests --> FixCode: Tests fail → fix code (GREEN)
    FixCode --> RunTests: Rerun
    RunTests --> WriteTests: All pass → next module
    RunTests --> Integration: All unit tests pass
    Integration --> [*]: Integration tests pass

Implementation Units

U1. Test infrastructure setup

Goal: Create the docker-compose test configuration, the shared conftest.py fixtures, and the .env.test environment variables to give the later TDD work a reliable foundation.

Requirements: R2, R11, R12

Dependencies: None

Files:

fischer-agentkit/docker-compose.test.yml (new)
fischer-agentkit/.env.test (new)
fischer-agentkit/tests/conftest.py (new)
fischer-agentkit/tests/unit/conftest.py (new)
fischer-agentkit/tests/integration/conftest.py (new)
fischer-agentkit/pyproject.toml (modify: add the pytest-docker or testcontainers dependency)

Approach:

Create docker-compose.test.yml with Redis 7 and pgvector/pg15 services, on ports that avoid conflicts with the GEO project (Redis 6379 → 6381, PostgreSQL 5432 → 5434)
Create .env.test to declare the test environment variables
Create tests/conftest.py and extract the shared fixtures:
- make_task(): builds a TaskMessage
- make_result(): builds a TaskResult
- redis_client: async fixture connected to the test Redis
- pg_session_factory: async fixture connected to the test PostgreSQL
- clean_redis: flushes Redis before each test
- clean_db: empties the database before each test
Create tests/unit/conftest.py and tests/integration/conftest.py to provide the fixtures specific to each layer
Add pytest-docker>=0.4 or testcontainers[postgres,redis]>=4.0 to the dev dependencies in pyproject.toml
Add env_file = ".env.test" to the pytest configuration (core pytest does not load .env files itself, so this assumes a dotenv plugin), or manage the environment variables through a fixture
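
As a rough illustration of the fixture-managed alternative, the sketch below builds the two connection strings that the redis_client and pg_session_factory fixtures would need, defaulting to the remapped ports 6381 and 5434. Every environment variable name, the credentials, and the database name are hypothetical placeholders; the plan does not specify them.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServiceEndpoints:
    redis_url: str
    postgres_dsn: str


def load_endpoints(env: Optional[Mapping[str, str]] = None) -> ServiceEndpoints:
    """Build connection strings, defaulting to the remapped test ports."""
    env = os.environ if env is None else env
    # All variable names and default credentials below are hypothetical.
    redis_port = env.get("TEST_REDIS_PORT", "6381")
    pg_port = env.get("TEST_POSTGRES_PORT", "5434")
    pg_user = env.get("TEST_POSTGRES_USER", "test")
    pg_password = env.get("TEST_POSTGRES_PASSWORD", "test")
    pg_db = env.get("TEST_POSTGRES_DB", "agentkit_test")
    return ServiceEndpoints(
        redis_url=f"redis://localhost:{redis_port}/0",
        postgres_dsn=(
            f"postgresql://{pg_user}:{pg_password}@localhost:{pg_port}/{pg_db}"
        ),
    )
```

Passing a plain dict instead of os.environ keeps the helper testable without touching the real process environment.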

Patterns to follow: the Redis and PostgreSQL configuration pattern in the GEO project's geo/docker-compose.yml

Test scenarios:

After docker-compose.test.yml starts, Redis accepts connections and answers PING
After docker-compose.test.yml starts, PostgreSQL accepts connections and the pgvector extension can be queried
The redis_client fixture in conftest.py performs set/get operations correctly
The pg_session_factory fixture in conftest.py can create tables and run queries
The TaskMessage produced by the make_task() fixture is accepted by BaseAgent.execute()
The clean_redis fixture isolates data between tests

Verification: docker compose -f docker-compose.test.yml up -d && pytest tests/ -v passes in full

U2. datetime.utcnow() deprecation fix

Goal: Replace all 21 occurrences of datetime.utcnow() in the project with datetime.now(timezone.utc) to eliminate the DeprecationWarning.

Requirements: Code quality (non-functional requirement)

Dependencies: None (can run in parallel with U1)

Files:

fischer-agentkit/src/agentkit/core/protocol.py (7 occurrences)
fischer-agentkit/src/agentkit/memory/base.py (1 occurrence)
fischer-agentkit/src/agentkit/memory/working.py (3 occurrences)
fischer-agentkit/src/agentkit/memory/episodic.py (2 occurrences)
fischer-agentkit/src/agentkit/evolution/reflector.py (1 occurrence)
fischer-agentkit/src/agentkit/evolution/lifecycle.py (2 occurrences)
fischer-agentkit/tests/unit/test_memory_system.py (4 occurrences)
fischer-agentkit/tests/unit/test_protocol.py (1 occurrence)

Approach:

Add from datetime import timezone to each file's import section (if not already imported)
Replace datetime.utcnow() with datetime.now(timezone.utc)
Replace field(default_factory=lambda: datetime.utcnow()) with field(default_factory=lambda: datetime.now(timezone.utc))
Run the existing 189 tests to confirm there are no regressions

Execution note: Run the tests first to confirm the current baseline passes; after the change, rerun them to confirm there are no regressions and no DeprecationWarning.
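
The replacement can be sketched on a small dataclass. `Message` is a hypothetical stand-in for TaskMessage, whose real fields are not shown in this plan; only the default_factory pattern and the ISO round trip are the point.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Timezone-aware replacement for the deprecated datetime.utcnow()."""
    return datetime.now(timezone.utc)


@dataclass
class Message:  # hypothetical stand-in for TaskMessage
    created_at: datetime = field(default_factory=utc_now)

    def to_dict(self) -> dict:
        # An aware datetime serializes with an explicit "+00:00" offset.
        return {"created_at": self.created_at.isoformat()}

    @classmethod
    def from_dict(cls, data: dict) -> "Message":
        return cls(created_at=datetime.fromisoformat(data["created_at"]))
```

One thing to watch for during the migration: comparing a naive datetime (as returned by utcnow()) with an aware one raises TypeError, so stored timestamps and test fixtures should be converted together with the production code.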

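Patterns to follow: files in the project that already use datetime.now(timezone.utc) correctly: agent_tool.py, pipeline_engine.py, registry.py, dispatcher.py, base.py

Test scenarios:

After the change, pytest tests/ -W error::DeprecationWarning reports no deprecation warnings
After the change, all 189 existing tests pass
TaskMessage.from_dict() correctly deserializes JSON that contains UTC timestamps

Verification: pytest tests/ -W error::DeprecationWarning -v passes in full with zero warnings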

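U3. Unit tests for zero-coverage modules (Core layer)

Goal: Add unit tests for core/dispatcher.py and core/registry.py to verify the core logic of task dispatch and of Agent registration and discovery.

Requirements: R2

Dependencies: U1

Files:

fischer-agentkit/tests/unit/test_dispatcher.py (new)
fischer-agentkit/tests/unit/test_registry.py (new)

Approach:

test_dispatcher.py:
- Test task dispatch by TaskDispatcher in local mode (no Redis)
- Test FIFO ordering of the task queue
- Test task retry logic
- Test task cancellation
- Test the callback mechanism
- Test concurrent dispatch (multiple tasks enqueued at once)
test_registry.py:
- Test dynamic registration of a new AgentType in AgentRegistry
- Test handling of a duplicate AgentType registration
- Test the round-robin strategy of get_available_agent
- Test Agent heartbeats and expiry cleanup
- Test querying Agents by capability

Execution note: TDD: write the tests first to define the expected behavior, run them to confirm the result, then adjust as needed.

Patterns to follow: the class-based test style of the existing test_base_agent.py

Test scenarios:

test_dispatcher.py:

Local mode dispatches a task to the specified Agent and returns a TaskResult
The task queue is processed in FIFO order
A failed task is retried the specified number of times
Cancelling a pending task returns a cancelled status
The callback is invoked after the task completes
Multiple tasks dispatched concurrently return correct results

test_registry.py:

Dynamically registering a new AgentType raises no error
Registering a duplicate AgentType overwrites the old configuration
The round-robin strategy of get_available_agent returns different Agents
An Agent is removed from the available list after its heartbeat times out
Querying by supported_tasks returns the matching Agents
Querying an empty registry returns an empty list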
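
To make the round-robin and empty-registry scenarios concrete, here is a sketch written against a hypothetical in-memory registry. `InMemoryRegistry` and its method signatures are assumptions for illustration only; the real AgentRegistry API may differ, and the tests would target that class instead.

```python
class InMemoryRegistry:
    """Hypothetical stand-in for AgentRegistry (signatures are assumed)."""

    def __init__(self):
        self._agents = {}  # agent_type -> list of agent ids
        self._cursor = {}  # agent_type -> next index to hand out

    def register(self, agent_type: str, agent_id: str) -> None:
        self._agents.setdefault(agent_type, []).append(agent_id)

    def get_available_agent(self, agent_type: str):
        agents = self._agents.get(agent_type, [])
        if not agents:
            return None
        index = self._cursor.get(agent_type, 0) % len(agents)
        self._cursor[agent_type] = index + 1
        return agents[index]


class TestRoundRobin:
    def test_rotates_between_agents(self):
        registry = InMemoryRegistry()
        registry.register("writer", "writer-1")
        registry.register("writer", "writer-2")
        picks = [registry.get_available_agent("writer") for _ in range(4)]
        assert picks == ["writer-1", "writer-2", "writer-1", "writer-2"]

    def test_empty_registry_returns_none(self):
        assert InMemoryRegistry().get_available_agent("writer") is None
```

Asserting on a full rotation, rather than only on "two different Agents", also catches an off-by-one in the cursor.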

Verification: pytest tests/unit/test_dispatcher.py tests/unit/test_registry.py -v passes in full

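U4. Unit tests for zero-coverage modules (Tools + Prompts layer)

Goal: Add unit tests for tools/agent_tool.py and the prompts/ module to verify the logic for wrapping an Agent as a Tool and for template rendering.

Requirements: R6

Dependencies: U1

Files:

fischer-agentkit/tests/unit/test_agent_tool.py (new)
fischer-agentkit/tests/unit/test_prompt_template.py (new)
fischer-agentkit/tests/unit/test_prompt_section.py (new)

Approach:

test_agent_tool.py:
- Test AgentTool input mapping (input_mapping)
- Test AgentTool output mapping (output_mapping)
- Test that AgentTool dispatches tasks through the Dispatcher
- Test AgentTool timeout handling
- Test automatic schema generation for AgentTool
test_prompt_template.py:
- Test PromptTemplate variable substitution with ${key}
- Test handling of missing variables
- Test the rendered template output
test_prompt_section.py:
- Test conditional rendering of PromptSection
- Test combined rendering of multiple Sections

Execution note: TDD: AgentTool's polling wait (1-second interval) requires mocking asyncio.sleep in tests to keep them fast.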
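
A sketch of the sleep-mocking technique using only the standard library. `wait_for_result` is a hypothetical polling loop written for this example; AgentTool's actual wait logic is not shown in this plan and may be structured differently.

```python
import asyncio
from unittest.mock import AsyncMock, patch


async def wait_for_result(fetch, timeout=5.0, interval=1.0):
    """Hypothetical polling loop: call fetch() until it returns non-None."""
    waited = 0.0
    while waited < timeout:
        result = fetch()
        if result is not None:
            return result
        await asyncio.sleep(interval)
        waited += interval
    raise TimeoutError(f"no result after {timeout} seconds")


async def demo():
    answers = iter([None, None, "done"])
    # Replace asyncio.sleep with an AsyncMock so each poll returns instantly.
    with patch("asyncio.sleep", new=AsyncMock()) as fake_sleep:
        result = await wait_for_result(lambda: next(answers))
    return result, fake_sleep.await_count
```

Here asyncio.run(demo()) yields ("done", 2): two polls came back empty, so the mocked sleep was awaited twice. The patch target matters: if the module under test does from asyncio import sleep, the name to patch is that module's sleep attribute rather than asyncio.sleep.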

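Patterns to follow: the Mock pattern in the existing test_tool_composition.py

Test scenarios:

test_agent_tool.py:

AgentTool maps input parameters to a TaskMessage correctly
AgentTool maps a TaskResult to the output dict correctly
AgentTool dispatches the task through the Dispatcher and waits for the result
AgentTool raises TimeoutError after the timeout elapses
AgentTool's input_schema is inferred from input_mapping
AgentTool's output_schema is inferred from output_mapping

test_prompt_template.py:

${name} is replaced with the actual value
A missing variable raises KeyError or leaves the original placeholder in place
A multi-variable template replaces every variable correctly
Rendering an empty template returns an empty string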
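
The two candidate behaviors for a missing variable can be seen side by side with the standard library's string.Template, which uses the same ${key} syntax. This only illustrates the semantics the tests need to pin down; whether PromptTemplate is built on string.Template is not stated in this plan.

```python
from string import Template

template = Template("Hello ${name}, your task is ${task}.")

# All variables supplied: every placeholder is replaced.
full = template.substitute(name="Ada", task="review")

# Strict mode: a missing variable raises KeyError.
try:
    template.substitute(name="Ada")
    strict_error = None
except KeyError as exc:
    strict_error = exc.args[0]  # name of the missing variable

# Lenient mode: the missing placeholder is left untouched.
lenient = template.safe_substitute(name="Ada")

# An empty template renders to an empty string.
empty = Template("").substitute()
```

Writing the test first forces a decision between the strict and lenient behavior, which the scenario list above still leaves open.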

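test_prompt_section.py:

A Section whose condition is True is included in the rendered output
A Section whose condition is False is excluded from the rendered output
Multiple Sections are rendered and combined in order
An unconditional Section is always included

Verification: pytest tests/unit/test_agent_tool.py tests/unit/test_prompt_template.py tests/unit/test_prompt_section.py -v passes in full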

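U5. Unit tests for zero-coverage modules (MCP Server + Evolution Store)

Goal: Add unit tests for mcp/server.py and evolution/evolution_store.py to verify the MCP server endpoints and the evolution persistence logic.

Requirements: R8, R15

Dependencies: U1

Files:

fischer-agentkit/tests/unit/test_mcp_server.py (new)
fischer-agentkit/tests/unit/test_evolution_store.py (new)

Approach:

test_mcp_server.py:
- Use httpx.AsyncClient + ASGITransport to test the FastAPI endpoints
- Test that /tools/list returns the tools registered in ToolRegistry
- Test that /tools/call invokes the specified tool and returns its result
- Test that calling a nonexistent tool returns an error
- Test the /resources/read endpoint
- Test the JSON-RPC 2.0 protocol format
test_evolution_store.py:
- Test that EvolutionStore records evolution changes
- Test querying the change history by agent_name
- Test the rollback operation
- Test change-status management (active/rolled_back)

Execution note: The MCP Server tests use httpx.AsyncClient + ASGITransport, so no real HTTP server needs to be started.

Patterns to follow: the httpx_mock pattern in the existing test_mcp_transport.py; the AsyncClient testing pattern recommended by the FastAPI documentation

Test scenarios:

test_mcp_server.py:

/tools/list returns the names and schemas of the registered tools
/tools/call on a FunctionTool returns the correct result
/tools/call on a nonexistent tool returns a JSON-RPC error
/resources/read returns the list of available resources
JSON-RPC 2.0 requests are parsed correctly
JSON-RPC 2.0 responses contain the jsonrpc (version) and id fields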
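
The response-shape scenarios can be backed by a small checker. In JSON-RPC 2.0 a response carries "jsonrpc": "2.0", the request's id, and exactly one of result or error. The helper names below are hypothetical, and using the standard "Invalid params" code for an unknown tool is an assumption; the server may choose a different code.

```python
import json

# JSON-RPC 2.0 "Invalid params"; using it for unknown tools is an assumption.
INVALID_PARAMS = -32602


def make_result(request_id, result) -> dict:
    return {"jsonrpc": "2.0", "id": request_id, "result": result}


def make_error(request_id, code: int, message: str) -> dict:
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "error": {"code": code, "message": message},
    }


def check_response(response: dict, request_id) -> None:
    """Assert the envelope rules every JSON-RPC 2.0 response must satisfy."""
    assert response["jsonrpc"] == "2.0"
    assert response["id"] == request_id
    # Exactly one of "result" and "error" may be present.
    assert ("result" in response) != ("error" in response)
    if "error" in response:
        assert isinstance(response["error"]["code"], int)
        assert isinstance(response["error"]["message"], str)
```

In the real tests, check_response would be applied to the JSON body returned through httpx.AsyncClient, so every endpoint test shares one definition of a well-formed response.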

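test_evolution_store.py:

A prompt-type evolution change is recorded
A strategy-type evolution change is recorded
Querying by agent_name returns all changes for that Agent
The rollback operation sets the change status to rolled_back
A query after rollback returns the rolled_back status
Querying an empty store returns an empty list

Verification: pytest tests/unit/test_mcp_server.py tests/unit/test_evolution_store.py -v passes in full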

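U6. Strengthened tests for weakly tested modules (Memory layer)

Goal: Add real-service tests for WorkingMemory and EpisodicMemory to verify Redis storage and retrieval and pgvector vector retrieval. Implement pgvector cosine distance ordering in EpisodicMemory (currently marked TODO).

Requirements: R11, R12, R14

Dependencies: U1, U2

Files:

fischer-agentkit/tests/unit/test_working_memory.py (new)
fischer-agentkit/tests/unit/test_episodic_memory.py (new)
fischer-agentkit/tests/unit/test_memory_retriever.py (new)
fischer-agentkit/src/agentkit/memory/episodic.py (modify: implement pgvector cosine distance)

Approach:

test_working_memory.py (real Redis):
- Test the basic store/retrieve/delete operations
- Test automatic TTL expiry
- Test the formatted output of get_context()
- Test key isolation between different Agent instances
- Test graceful degradation when the Redis connection fails
test_episodic_memory.py (real pgvector):
- Test that store writes a task experience and generates an embedding
- Test that search retrieves by semantic similarity (pgvector cosine distance)
- Test that search orders by time decay
- Test hybrid ordering in search (semantic + time decay)
- Test that delete removes the specified record
test_memory_retriever.py:
- Test parallel retrieval across the three memory layers
- Test weighted fusion ranking
- Test token budget management (truncating results that exceed the limit)
Implement pgvector cosine distance:
- In the search method of episodic.py, replace the comment # TODO: 使用 pgvector 的 cosine distance 排序 ("order by pgvector cosine distance") with a real pgvector query
- Order by cosine distance using the embedding <=> :query_embedding operator
- Combine with a time-decay factor: final score = semantic similarity × time decay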
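
The scoring rule can be prototyped in plain Python before it is pushed into SQL. pgvector's <=> operator returns cosine distance, that is, 1 minus cosine similarity, so the similarity term is recovered by subtracting from 1. The exponential half-life decay and its 30-day default are assumptions for this sketch; the plan only says "time decay", and the formula already tested in episodic_memory may have a different shape.

```python
import math
from datetime import datetime, timedelta, timezone


def cosine_distance(a, b) -> float:
    """Same quantity pgvector's <=> returns: 1 - cosine similarity.

    Assumes both vectors are non-zero.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def time_decay(created_at, now, half_life_days=30.0) -> float:
    """Assumed decay shape: the weight halves every half_life_days."""
    age_days = (now - created_at).total_seconds() / 86400.0
    return 0.5 ** (age_days / half_life_days)


def final_score(query, embedding, created_at, now) -> float:
    """final score = semantic similarity x time decay."""
    similarity = 1.0 - cosine_distance(query, embedding)
    return similarity * time_decay(created_at, now)
```

A reference implementation like this gives the pgvector tests an independent oracle: rank a handful of seeded rows with final_score and compare that order with what search returns.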

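Execution note: TDD: first write the vector-retrieval tests for EpisodicMemory (expected behavior), run them and confirm they fail (the TODO is unimplemented), then implement pgvector cosine distance ordering so that the tests pass.

Patterns to follow: the RRF fusion ranking pattern of HybridRetriever in the GEO project's backend/app/services/knowledge/retriever.py

Test scenarios:

test_working_memory.py:

store followed by retrieve returns the same value
retrieve returns nothing after the TTL expires
get_context() returns a formatted context string
The working_memory keys of different Agents do not interfere with each other
retrieve returns nothing after delete
Complex objects (nested dicts) are serialized and deserialized correctly

test_episodic_memory.py:

A record written by store can be queried by agent_name
search returns the most relevant records by semantic similarity (cosine distance)
search time decay: recent records rank above older ones
search hybrid ordering: ranks by semantic similarity and time decay combined
delete removes the record with the specified ID
search on an empty store returns an empty list

test_memory_retriever.py:

The three memory layers are queried in parallel and the results are merged
Ranking uses weighted fusion (vector 0.5 + keyword 0.2 + graph 0.3)
Token budget management: all results are kept when the total tokens stay within the budget
Token budget management: low-scoring results are truncated when the budget is exceeded
A layer with no results does not affect the other layers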
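
The fusion and budget scenarios can be sketched as two small functions. The layer keys follow the weights listed above; the function names, the (text, score) result shape, and the whitespace-based token count are all assumptions made for this example.

```python
WEIGHTS = {"vector": 0.5, "keyword": 0.2, "graph": 0.3}


def fuse(results_by_layer: dict, weights: dict = WEIGHTS) -> list:
    """Scale each (text, score) by its layer weight, then sort descending."""
    fused = []
    for layer, results in results_by_layer.items():
        weight = weights.get(layer, 0.0)
        fused.extend((text, score * weight) for text, score in results)
    return sorted(fused, key=lambda item: item[1], reverse=True)


def apply_token_budget(ranked: list, budget: int, count_tokens=None) -> list:
    """Keep the highest-ranked results until the next one would exceed budget."""
    if count_tokens is None:
        # Assumed tokenizer: whitespace split, for illustration only.
        count_tokens = lambda text: len(text.split())
    kept, used = [], 0
    for text, score in ranked:
        cost = count_tokens(text)
        if used + cost > budget:
            break  # everything below this point is lower-scored; truncate
        kept.append((text, score))
        used += cost
    return kept
```

An empty layer simply contributes nothing to the fused list, which is the behavior the last scenario asks for.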

Verification: pytest tests/unit/test_working_memory.py tests/unit/test_episodic_memory.py tests/unit/test_memory_retriever.py -v passes in full, and the EpisodicMemory TODO is implemented

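U7. Strengthened tests for weakly tested modules (MCP Client + Handoff)

Goal: Add tests for MCPClient and HandoffManager to verify MCP client tool discovery and the Redis Pub/Sub mechanism behind Handoff.

Requirements: R9, R20

Dependencies: U1, U2

Files:

fischer-agentkit/tests/unit/test_mcp_client.py (new)
fischer-agentkit/tests/unit/test_handoff.py (new)

Approach:

test_mcp_client.py:
- Test that MCPClient connects to a remote Server through a Transport
- Test that list_tools() returns the tool list
- Test that call_tool() invokes a remote tool
- Test MCPClient's direct HTTP mode (no Transport)
- Test error handling when the connection fails
test_handoff.py (real Redis):
- Test that HandoffManager sends handoff requests over Redis Pub/Sub
- Test that the target Agent listens for and receives handoff messages
- Test that handoff messages carry context
- Test graceful degradation without Redis (local mode)
- Test multiple Agents listening on different channels at the same time

Execution note: The Handoff tests use real Redis Pub/Sub, so channels must be isolated between tests.

Patterns to follow: the HTTP mock pattern in the existing test_mcp_transport.py

Test scenarios:

test_mcp_client.py:

list_tools through a Transport returns the list of tool names
call_tool through a Transport returns the tool's execution result
A tool can be called in direct HTTP mode
Connecting to a nonexistent Server raises a connection error
call_tool with invalid arguments returns an error response
JSON-RPC 2.0 requests are formatted correctly

test_handoff.py:

send_handoff sends a message over Redis Pub/Sub
listen_for_handoffs receives the handoff message
The handoff message contains source_agent, target_agent, context, and reason
Without Redis, HandoffManager degrades to a local call
Agents listening on different channels do not interfere with each other
Handoff messages are serialized and deserialized correctly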
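
A sketch of the message round trip and of one way to keep Pub/Sub channels isolated between tests. The four field names come from the scenario list above; the `HandoffMessage` class, the JSON encoding, and the channel naming scheme are assumptions, since the real HandoffManager wire format is not shown here.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field


@dataclass
class HandoffMessage:  # hypothetical shape; field names taken from the plan
    source_agent: str
    target_agent: str
    reason: str
    context: dict = field(default_factory=dict)

    def to_json(self) -> str:
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "HandoffMessage":
        return cls(**json.loads(raw))


def unique_channel(agent_name: str, prefix: str = "handoff") -> str:
    """Per-test channel name so concurrent tests never share a channel."""
    return f"{prefix}:{agent_name}:{uuid.uuid4().hex}"
```

Generating the channel name inside a fixture gives each test its own namespace on the shared Redis instance, which matters because Pub/Sub messages are not removed by flushing keys.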

Verification: pytest tests/unit/test_mcp_client.py tests/unit/test_handoff.py -v passes in full

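U8. Integration test completion

Goal: Add 4 integration test files that verify the end-to-end flows: the full Agent lifecycle, tool composition, the evolution loop, and the MCP round trip.

Requirements: R2, R6, R8, R9, R15, R16, R18, R20

Dependencies: U1, U3, U4, U5, U6, U7

Files:

fischer-agentkit/tests/integration/test_agent_lifecycle.py (new)
fischer-agentkit/tests/integration/test_tool_composition.py (new)
fischer-agentkit/tests/integration/test_evolution_loop.py (new)
fischer-agentkit/tests/integration/test_mcp_roundtrip.py (new)

Approach:

test_agent_lifecycle.py:
- The full flow: start the Agent → send a task → receive the result → stop the Agent
- Verify the call order of the on_task_start/on_task_complete hooks
- Verify that the on_task_failed hook fires when a task fails
- Verify Memory reads and writes during task execution
test_tool_composition.py:
- SequentialChain: two tools run in sequence, with the first one's output feeding the second
- ParallelFanOut: three tools run in parallel and their results are merged
- DynamicSelector: the LLM selects a tool based on the task
- AgentTool: wrap an Agent as a Tool and invoke it
test_evolution_loop.py:
- The full loop: reflect → optimize → A/B test → apply or roll back
- Verify that EvolutionStore persists the evolution records
- Verify automatic application when the A/B test shows an improvement
- Verify automatic rollback when the A/B test shows a regression
test_mcp_roundtrip.py:
- Start the MCP Server → MCP Client connects → list_tools → call_tool → result returned
- Verify that the Tools exposed by the Server match the ToolRegistry
- Verify that results obtained through the Client match direct Tool calls

Execution note: Integration tests use real Redis and PostgreSQL, are marked @pytest.mark.integration, and can be skipped with pytest -m "not integration".

Patterns to follow: the end-to-end test pattern of the existing test_u8_geo_integration.py

Test scenarios:

test_agent_lifecycle.py:

ConfigDrivenAgent loads from YAML → starts → executes a task → returns a TaskResult → stops
BaseAgent lifecycle hooks are called in order: start → on_task_start → handle_task → on_task_complete → stop
When a task fails, on_task_failed fires and the TaskResult status is FAILED
WorkingMemory stores and retrieves context automatically while the Agent executes a task
EpisodicMemory records the experience automatically after the Agent executes a task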
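
The hook-order scenarios can be asserted by recording every call. `RecordingAgent` below is a hypothetical stand-in that mirrors the hook names listed above; its signatures, the dict-shaped result, and the "COMPLETED" status string are assumptions. The real test would subclass BaseAgent and append to a list in the same way.

```python
import asyncio


class RecordingAgent:
    """Hypothetical stand-in for BaseAgent that records hook calls."""

    def __init__(self, fail: bool = False):
        self.calls = []
        self.fail = fail

    async def start(self):
        self.calls.append("start")

    async def stop(self):
        self.calls.append("stop")

    async def on_task_start(self, task):
        self.calls.append("on_task_start")

    async def on_task_complete(self, task, result):
        self.calls.append("on_task_complete")

    async def on_task_failed(self, task, error):
        self.calls.append("on_task_failed")

    async def handle_task(self, task):
        self.calls.append("handle_task")
        if self.fail:
            raise RuntimeError("boom")
        return {"echo": task}

    async def execute(self, task):
        await self.on_task_start(task)
        try:
            result = await self.handle_task(task)
        except Exception as exc:
            await self.on_task_failed(task, exc)
            return {"status": "FAILED", "error": str(exc)}
        await self.on_task_complete(task, result)
        return {"status": "COMPLETED", "result": result}


async def run_once(agent, task):
    await agent.start()
    try:
        return await agent.execute(task)
    finally:
        await agent.stop()
```

Comparing the whole recorded list against the expected sequence, instead of checking hooks one by one, also catches hooks that fire twice or out of order.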

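test_tool_composition.py:

SequentialChain runs two FunctionTools in sequence, and the second receives the first one's output
ParallelFanOut runs three FunctionTools in parallel and merges the results
DynamicSelector picks a suitable tool based on the LLM's judgment
AgentTool wraps an Agent and dispatches the task through the Dispatcher

test_evolution_loop.py:

Reflector generates a reflection after 5 task executions
PromptOptimizer generates few-shot examples from successful cases
ABTester splits traffic and applies the change automatically when the experiment group improves
ABTester splits traffic and rolls back automatically when the experiment group regresses
EvolutionStore records every change and supports history queries

test_mcp_roundtrip.py:

After the MCP Server starts, the Client can call list_tools
Client call_tool returns the same result as calling the Tool directly
The tool list exposed by the Server matches the ToolRegistry registrations
The JSON-RPC 2.0 protocol is correct end to end

Verification: pytest tests/integration/ -v passes in full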

Scope Boundaries

In Scope

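Add unit tests for the 6 zero-coverage modules
Strengthen unit tests for the 4 weakly tested modules
Implement pgvector cosine distance ordering in EpisodicMemory (currently a TODO)
Fix the 21 datetime.utcnow() deprecation warnings
Create the test infrastructure (docker-compose.test.yml, conftest.py)
Add the 4 integration test files

Deferred for Later

MIPROv2 multi-objective prompt optimization (R16 advanced feature)
Bayesian Optimization strategy tuning (R17 advanced feature)
Event-driven replacement for polling in Pipeline (R22)
MCP Client automatic discovery of remote tools and registration in the local ToolRegistry (R9 advanced feature)
MCP Server SSE streaming responses (R8 advanced feature)
Automatic integration of EvolutionMixin with BaseAgent (R15 enhancement)
Switching AgentTool from polling to an event-driven design
CI/CD configuration
mypy/pyright type-checking configuration

Outside This Project's Identity

Full migration of the GEO business system (U8)
Frontend Agent management UI
A2A Protocol support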

Risks & Dependencies

Risk	Impact	Mitigation
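The pgvector cosine distance implementation may require table-schema changes	A database migration is needed	Write tests first to define the expected behavior; if the implementation needs a migration, update the init-db script of docker-compose.test.yml at the same time
Real-service tests require a Docker environment	The CI environment may not have Docker	Mark integration tests with a pytest marker so they can be skipped when Docker is unavailable; mark the Redis/PG-dependent unit tests with a marker as well
AgentTool's polling wait is slow in tests	Slow test runs	Mock asyncio.sleep to speed it up, or set a short timeout
Existing tests may be affected by the conftest.py refactoring	Fixture name conflicts	Use new fixture names in conftest.py and migrate the old tests gradually
pytest-httpx is not declared in pyproject.toml	Missing dependency	Add it to the dev dependencies in U1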

System-Wide Impact

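Test run time: increases from the current ~3 seconds to an estimated ~30 seconds (real services + integration tests)
Development dependencies: adds pytest-docker/testcontainers and pytest-httpx
Docker requirement: the development environment needs Docker installed to run the tests
CI/CD: GitHub Actions will later need to be configured to start the test services with docker-compose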
