title	status	date	type	origin
fix: P0 security fixes and code quality improvements	active	2026-06-21	fix	ocr code review + comprehensive project quality assessment report

P0 Security Fixes and Code Quality Improvement Plan

Summary

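Based on the open-code-review (ocr) review of the auth feature branch (9 files, 15 comments) and a comprehensive project quality assessment, this plan fixes 4 P0 security/bug issues, 1 documentation inconsistency, and 1 missing core UX feature, and strengthens integration test coverage. Every fix is backed by concrete code evidence and ocr review comments.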

Problem Frame

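The ocr review found 2 security vulnerabilities in the whoami cold-start path (disabled-user bypass and token amplification risk), 1 data inconsistency bug (list_active_by_provider does not filter expired sessions), and 1 test-crash bug (asyncio.run called inside a running event loop). In addition, the routing system described in CLAUDE.md/AGENTS.md seriously mismatches the actual implementation, and the frontend lacks a basic "stop generating" feature.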

Requirements

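R1: The whoami cold start must check is_active; a disabled user must not be able to obtain a new access token
R2: The whoami cold start must implement refresh token rotation to prevent token amplification attacks
R3: list_active_by_provider must filter out expired sessions, consistent with what its docstring promises
R4: asyncio.run() in the integration tests must be fixed so that the cross-user revocation test can run
R5: The routing system description in CLAUDE.md/AGENTS.md must match the actual RequestPreprocessor implementation
R6: The frontend chat UI must support a "stop generating" feature so users can cancel LLM output midway
R7: Integration tests must close the coverage gaps identified by ocr (404 boundaries, end-to-end password change, session revocation verification)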

Key Technical Decisions

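KTD-1: Refresh token handling strategy for the whoami cold start

Decision: Adopt option B: issue only a short-lived access token and do not create a new refresh token.

Rationale: The client already holds a valid refresh token, so the cold start only needs an access token to restore the session. Calling svc.rotate() would add complexity and change the refresh token the client holds, whereas option B is simpler and safer (the original refresh token remains protected by reuse detection). Instead of calling create_token_pair, issue an access token on its own.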

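KTD-2: How "stop generating" is implemented

Decision: The frontend sends a cancel message over WebSocket (already defined in the protocol), and the backend cancels the running Agent task through a CancellationToken.

Rationale: The WebSocket protocol already has a cancel message type (see AGENTS.md), and the backend BaseAgent already implements cooperative cancellation with CancellationToken. Only the frontend button and send logic are needed; no new backend endpoint is required.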
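
As an illustration of what cooperative cancellation means here, the following is a minimal sketch. The CancellationToken class, run_agent, and demo below are hypothetical stand-ins written for this example; they are not the actual BaseAgent API, whose method names may differ.

```python
import asyncio


class CancellationToken:
    """Hypothetical stand-in for the token an agent checks between steps."""

    def __init__(self) -> None:
        self._event = asyncio.Event()

    def cancel(self) -> None:
        self._event.set()

    @property
    def cancelled(self) -> bool:
        return self._event.is_set()


async def run_agent(token: CancellationToken, steps: int) -> list[str]:
    """Emit one chunk per step, stopping early once the token is cancelled."""
    chunks: list[str] = []
    for i in range(steps):
        if token.cancelled:  # cooperative check at a safe point
            chunks.append("cancelled")
            break
        chunks.append(f"chunk-{i}")
        await asyncio.sleep(0)  # yield so a cancel message can be processed
    return chunks


async def demo() -> list[str]:
    token = CancellationToken()
    task = asyncio.create_task(run_agent(token, steps=1000))
    await asyncio.sleep(0)  # let the agent start producing output
    token.cancel()  # what a WebSocket handler might do on a cancel message
    return await task
```

The task is never force-killed: it notices the flag at its next check and returns normally, which is why the frontend only needs to send the message and reset its own state.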

Scope Boundaries

In Scope

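whoami cold-start security fixes (is_active + token strategy)
list_active_by_provider filtering fix
asyncio.run test fix + integration test hardening
CLAUDE.md/AGENTS.md routing documentation update
Frontend "stop generating" button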

Deferred to Follow-Up Work

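Multimodal input support (C1, requires a separate plan)
User feedback mechanism, thumbs up/down (C3, requires a separate plan)
Prompt Caching (C9, requires LLM Gateway changes)
Multi-tenant isolation logic (C7, requires a separate plan)
: Any type cleanup (213 occurrences, large-scale refactor)
Swallowed-exception cleanup (109 occurrences, requires file-by-file review)
LLM module test coverage (ratio 0.06, requires a separate plan)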

Implementation Units

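U1. Fix the whoami cold-start security vulnerabilities

Goal: Fix the 2 security vulnerabilities in the whoami cold-start path: the missing is_active check and the token amplification risk.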

Requirements: R1, R2

Dependencies: None

Files:

Modify: src/agentkit/server/routes/auth.py (whoami route)
Modify: src/agentkit/server/auth/jwt_utils.py (if a function for issuing an access token on its own needs to be added)
Test: tests/integration/auth/test_auth_routes.py

Approach:

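In the whoami route, change if row is None to if row is None or not bool(row["is_active"]), returning 401
Remove the create_token_pair call from the cold-start path and issue only an access token instead (using the existing access token creation logic in jwt_utils)
Do not create a new refresh token and do not call svc.rotate(); the client keeps its original refresh token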
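
The three steps above can be sketched as follows. This is only an outline under stated assumptions: Unauthorized, whoami_cold_start, and the issue_access_token callable are hypothetical names standing in for the route's 401 response and for whatever access-token helper jwt_utils exposes; the row shape (id, is_active) is also assumed.

```python
from typing import Callable, Mapping, Optional


class Unauthorized(Exception):
    """Hypothetical stand-in for the route's 401 response."""


def whoami_cold_start(
    row: Optional[Mapping[str, object]],
    issue_access_token: Callable[[str], str],
) -> dict:
    """Return the cold-start payload, or raise Unauthorized (401)."""
    # Reject both unknown users and disabled users before issuing anything.
    if row is None or not bool(row["is_active"]):
        raise Unauthorized("user not found or disabled")
    # Issue only a short-lived access token; the client keeps its
    # existing refresh token, so no new refresh token is created here.
    return {
        "user_id": row["id"],
        "access_token": issue_access_token(str(row["id"])),
        "refresh_token": None,
    }
```

Keeping the is_active check and the token issuance in one place makes it harder for a later change to reintroduce a path that issues tokens before the check.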

Patterns to follow: the is_active check pattern in the /auth/refresh route (auth.py, line 514)

Test scenarios:

Happy path: A disabled user (is_active=0) calls whoami with a refresh token → 401
Happy path: An active user calls whoami with a refresh token → 200 + new access token + no new refresh token
Edge case: An active user calls whoami with an access token → 200 + access_token is None (behavior unchanged)
Error path: Nonexistent user ID (tampered token) → 401

Verification: pytest tests/integration/auth/test_auth_routes.py::TestWhoamiColdStart -v passes in full

U2. Fix list_active_by_provider to filter expired sessions

Goal: Make the SQL query consistent with what the docstring promises by filtering out expired sessions.

Requirements: R3

Dependencies: None

Files:

Modify: src/agentkit/server/auth/session_service.py (list_active_by_provider method)
Test: tests/unit/auth/test_session_service.py

Approach:

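Add an AND expires_at > ? condition to the SQL query
Pass the current UTC time as an ISO-format string for the parameter
Import datetime and timezone at the top of the method (if not already imported)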
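
A minimal sketch of the filter, using an in-memory SQLite database. The sessions table and its columns (id, provider, expires_at) are assumptions made for this example and may not match the real schema; the function takes now as a parameter only to keep the demo deterministic.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def list_active_by_provider(
    conn: sqlite3.Connection, provider: str, now: datetime
) -> list[str]:
    """Return IDs of sessions for `provider` that expire strictly after `now`."""
    rows = conn.execute(
        "SELECT id FROM sessions WHERE provider = ? AND expires_at > ? ORDER BY id",
        (provider, now.isoformat()),
    ).fetchall()
    return [r[0] for r in rows]


def demo() -> list[str]:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sessions (id TEXT, provider TEXT, expires_at TEXT)")
    now = datetime(2026, 6, 21, 12, 0, tzinfo=timezone.utc)
    conn.executemany(
        "INSERT INTO sessions VALUES (?, ?, ?)",
        [
            ("active", "local", (now + timedelta(hours=1)).isoformat()),
            ("expired", "local", (now - timedelta(hours=1)).isoformat()),
            ("boundary", "local", now.isoformat()),  # equal to now: excluded by >
        ],
    )
    return list_active_by_provider(conn, "local", now)
```

Because expires_at is compared as text, this only works if stored values and the parameter share the same ISO format and timezone offset; mixing naive and UTC-aware strings would make the comparison unreliable.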

Patterns to follow: the SQL construction pattern in the list_all method

Test scenarios:

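Happy path: 1 active session + 1 expired session → only the active session is returned
Happy path: All sessions expired → empty list
Edge case: No sessions → empty list
Edge case: Session expires_at exactly equals the current time → not returned (the boundary is > rather than >=)

Verification: pytest tests/unit/auth/test_session_service.py -k list_active_by_provider -v passes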

U3. Fix the asyncio.run test crash + strengthen integration tests

Goal: Fix the asyncio.run crash in test_revoke_other_user_session_returns_404 and close the 6 test coverage gaps identified by ocr.

Requirements: R4, R7

Dependencies: U1

Files:

Modify: tests/integration/auth/test_auth_routes.py
Modify: tests/integration/auth/test_admin_routes.py

Approach:

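Change _login_sync_create_user to async def, and change its caller test_revoke_other_user_session_returns_404 to async def as well
Remove the unused auth_db_with_admin fixture
Add tests:
- test_non_admin_cannot_list_all_sessions (403 test for the admin route)
- test_revoke_session_belonging_to_different_user_returns_404 (admin revokes a session that does not match the user)
- test_admin_list_sessions_for_nonexistent_user_returns_404
- test_admin_revoke_nonexistent_session_returns_404
- Password change test: verify the current session survives + log in with the new password
- Session revocation test: after revocation, calling whoami with the original token → 401
- Global session list: verify both users appear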
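
The asyncio.run problem and its fix can be reproduced in isolation. In the sketch below, create_user, login_sync_create_user, login_create_user, and scenario are hypothetical names chosen for this example; only the failure mode (calling asyncio.run() while an event loop is already running) is taken from the plan.

```python
import asyncio


async def create_user(name: str) -> str:
    """Hypothetical async helper standing in for the user-creation call."""
    await asyncio.sleep(0)
    return f"user:{name}"


def login_sync_create_user(name: str) -> str:
    """Broken pattern: asyncio.run() fails when a loop is already running."""
    coro = create_user(name)
    try:
        return asyncio.run(coro)
    finally:
        coro.close()  # avoid a "never awaited" warning when asyncio.run raises


async def login_create_user(name: str) -> str:
    """Fixed pattern: an async helper that the async test simply awaits."""
    return await create_user(name)


async def scenario() -> tuple[str, str]:
    """Simulate an async test body calling both variants."""
    try:
        login_sync_create_user("bob")
        broken = "no error"
    except RuntimeError as exc:
        broken = str(exc)
    fixed = await login_create_user("bob")
    return broken, fixed
```

Inside an async test the sync helper raises RuntimeError, while the async variant runs on the loop the test already owns, which is why both the helper and its caller need to become async def.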

Patterns to follow: the test patterns in the existing test_admin_routes.py

Test scenarios:

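Happy path: After the asyncio fix, test_revoke_other_user_session_returns_404 runs normally
Happy path: All newly added tests pass
Error path: A non-admin accesses /admin/sessions → 403
Error path: An admin revokes a nonexistent session → 404
Integration: Password change → login with old password fails → login with new password succeeds

Verification: pytest tests/integration/auth/ -v passes in full with no crashes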

U4. Update the routing system documentation

Goal: Make the routing system description in CLAUDE.md and AGENTS.md match the actual RequestPreprocessor implementation.

Requirements: R5

Dependencies: None

Files:

Modify: CLAUDE.md (Request Flow section)
Modify: AGENTS.md (Request Flow section)

Approach:

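Change the Request Flow section from "CostAwareRouter (3-layer)" to "RequestPreprocessor (2-layer)"
Update the layer descriptions:
- Layer 0: @skill:xxx prefix → explicit skill selection
- Layer 1: regex fast path (greetings/small talk/identity/knowledge/arithmetic/translation) → DIRECT_CHAT
- Layer 2: everything else → REACT (the LLM decides autonomously inside the agent loop)
Remove the descriptions of HeuristicClassifier, SemanticRouter, Capability matching, and Vickrey Auction
Add a design decision note (citing the rationale in the request_preprocessor.py docstring)
Update the ExecutionMode description (TEAM_COLLAB is no longer triggered from the routing layer; it is triggered by the @team prefix instead)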
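
To make the layer order concrete for the documentation update, here is a rough sketch of the dispatch described above. It is not the code in request_preprocessor.py: the regular expressions are invented for illustration, the SKILL member of ExecutionMode is an assumption (the plan only names DIRECT_CHAT, REACT, and TEAM_COLLAB), and the @team prefix is omitted.

```python
import re
from enum import Enum
from typing import Optional, Tuple


class ExecutionMode(Enum):
    SKILL = "skill"  # hypothetical member, used here for Layer 0
    DIRECT_CHAT = "direct_chat"
    REACT = "react"


# Hypothetical patterns; the real fast-path regexes are not shown in this plan.
_SKILL_PREFIX = re.compile(r"^@skill:(?P<name>[\w-]+)\s*")
_FAST_PATH = re.compile(
    r"^\s*(hi|hello|hey|who are you|\d+\s*[-+*/]\s*\d+)\b", re.IGNORECASE
)


def preprocess(message: str) -> Tuple[ExecutionMode, Optional[str]]:
    """Return (mode, skill_name) following the layer order described above."""
    # Layer 0: explicit skill selection via the @skill:xxx prefix.
    match = _SKILL_PREFIX.match(message)
    if match:
        return ExecutionMode.SKILL, match.group("name")
    # Layer 1: regex fast path for trivial requests.
    if _FAST_PATH.match(message):
        return ExecutionMode.DIRECT_CHAT, None
    # Layer 2: everything else goes to the ReAct agent loop.
    return ExecutionMode.REACT, None
```

For example, preprocess("@skill:search find docs") selects the skill named search, a greeting takes the fast path, and any other request falls through to REACT.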

Patterns to follow: the docstring of src/agentkit/chat/request_preprocessor.py

Test scenarios:

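Test expectation: none, since this is a documentation-only change with no behavior change

Verification: "CostAwareRouter", "HeuristicClassifier", and "SemanticRouter" no longer appear in the docs (as descriptions of the current architecture), confirmed with grep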

U5. Add a "stop generating" button to the frontend

Goal: Users can cancel generation midway through a long LLM output.

Requirements: R6

Dependencies: None

Files:

Modify: src/agentkit/server/frontend/src/components/chat/ChatInput.vue (add the stop button)
Modify: src/agentkit/server/frontend/src/stores/chat.ts (add cancel state and send logic)
Test: src/agentkit/server/frontend/src/components/chat/__tests__/ChatInput.test.ts (if it exists)

Approach:

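Add an isGenerating state to the chat store (show the stop button when true, the send button when false)
Add a cancelGeneration() action that sends a cancel message over WebSocket
In the ChatInput component, switch the button based on isGenerating: send button ↔ stop button
Clicking the stop button calls cancelGeneration()
On receiving a final_answer or error event, automatically set isGenerating to false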

Patterns to follow: the implementation pattern of the send button in the existing ChatInput.vue

Test scenarios:

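Happy path: The send button is shown when idle, the stop button while generating
Happy path: Click the stop button → cancel message is sent → isGenerating becomes false
Edge case: While generating, the input box is either disabled or still accepts input (depending on the existing UX pattern)
Integration: isGenerating resets automatically after final_answer is received

Verification: npm run typecheck passes; manually verify button switching and cancellation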

Risks & Dependencies

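Risk	Impact	Mitigation
U1 changes to whoami may affect the existing cold-start flow	Medium	Integration tests cover all whoami scenarios
U5 changes to the chat store may affect the WebSocket state machine	Medium	Only add state; do not modify existing transitions
U4 documentation update may miss other places that reference CostAwareRouter	Low	grep the whole repository for CostAwareRouter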

System-Wide Impact

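U1: Affects whoami endpoint behavior; the cold-start flow of the frontend auth store must be checked for compatibility
U2: Affects admin session query results; the data becomes more accurate
U3: Test code only, no production impact
U4: Documentation only, no code impact
U5: Affects chat UI interaction; WebSocket cancel message handling must be verified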
