chiguyong
92fc38de7e
fix(review): Wave 3 code review fixes
...
P1: bash/shell tool name mismatch. PhasePolicy whitelist used "bash" but
ShellTool registers as "shell". The bash_command_filter was dead code
(never matched the real tool name). Fixed in phase.py whitelist,
react.py filter check, agentkit.yaml config, and all tests.
P1: AdvancePhaseTool missing import in tools/__init__.py. Was in
__all__ but never imported. Added the import.
P2: chat.py phase policy error message echoed verbatim to WS client.
Truncated to 200 chars to match nearby error paths and avoid leaking
config internals.
P2: policy_from_config rebuilt PhasePolicy 3x via full-field copy.
Replaced with dataclasses.replace() so new PhasePolicy fields are not
silently dropped in future reconstructions.
ce-code-review (mode:agent) step of LFG pipeline.
2026-06-30 05:08:03 +08:00
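Note on 92fc38de7e (policy_from_config): a minimal sketch of why dataclasses.replace() is safer than a full-field copy. The PhasePolicy below is a simplified stand-in; its field names and the with_overrides helper are assumptions for illustration, not the project's actual definitions.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class PhasePolicy:
    # Hypothetical, simplified fields; the real dataclass may differ.
    whitelist: dict = field(default_factory=dict)
    auto_advance_after_steps: int = 0
    enabled: bool = True


def with_overrides(base: PhasePolicy, **overrides) -> PhasePolicy:
    # replace() copies every field that is not named in overrides, so a field
    # added to PhasePolicy later is carried over without touching this code.
    return replace(base, **overrides)


base = PhasePolicy(whitelist={"planning": ["search"]}, auto_advance_after_steps=5)
updated = with_overrides(base, auto_advance_after_steps=10)
```

A hand-written copy such as PhasePolicy(whitelist=base.whitelist, ...) would have to be edited every time a field is added, which is the failure mode the commit message describes.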
chiguyong
ce6eb004a0
refactor(advance_phase): simplify previous_phase capture
...
Remove over-engineered _previous_value static method that did index
math on a hardcoded phase list. Instead, capture the previous phase
before the transition — clearer intent, fewer lines, same behavior.
ce-simplify-code step of LFG pipeline.
2026-06-30 05:06:38 +08:00
chiguyong
7869caddc7
feat(U4): G6 PLAN_EXEC wiring at chat WebSocket path
...
- PLAN_EXEC branch builds PhasePolicy from ServerConfig.plan_exec
- Empty config → default_policy(); enabled=False → falls back to REACT
- Bad config → error event sent, returns early (no engine constructed)
- ReActEngine created with phase_policy; AdvancePhaseTool registered
- phase_changed events emitted on phase transitions (PLAN_EXEC only)
- REST send_message with execution_mode=plan_exec → HTTP 501 (KTD4)
- REWOO/REFLEXION/TEAM_COLLAB still fall back to REACT (no regression)
- 9 unit tests covering REST 501, characterization, happy path, edge cases, error path, phase_changed events
2026-06-30 00:16:39 +08:00
chiguyong
c0a44b413b
feat(U3): G6 AdvancePhaseTool + ReActEngine phase enforcement
...
- AdvancePhaseTool calls engine.advance_phase(), returns new phase or error
- ReActEngine.__init__ accepts phase_policy param (None = no enforcement, backward compat)
- _current_phase + _steps_in_phase fields track state machine
- advance_phase() transitions PLANNING → BUILDING → VERIFICATION → DELIVERY
- _check_phase_permission() returns structured error dict if tool blocked
- _execute_tool checks phase before dispatch (advance_phase name bypasses)
- Auto-advance safety net via _maybe_auto_advance() + auto_advance_after_steps
- Phase reset in reset() method
- 27 unit tests covering characterization, permission, transitions, auto-advance, tool integration
2026-06-30 00:07:58 +08:00
chiguyong
abf758fa9c
feat(U2): G6 PhaseState + PhasePolicy + ServerConfig.plan_exec
...
- PhaseState enum (PLANNING/BUILDING/VERIFICATION/DELIVERY) with next_of/from_string
- PhasePolicy dataclass with whitelist + bash_command_filter + auto_advance_after_steps
- default_policy() factory — KTD5 whitelist matching R24 (Planning: search/read_file;
Building: write_file; Delivery: wildcard)
- bash_command_filter blocks rm/mv/cp/>/>> in PLANNING/VERIFICATION phases
- policy_from_config() parses plan_exec YAML section (R26) with override merge
- ServerConfig.plan_exec field + from_dict parsing (extends Wave 1/2 pattern)
- agentkit.yaml gains commented plan_exec section (opt-in)
- 37 unit tests covering PhaseState, default_policy, is_tool_allowed,
bash filter, config parsing, and ServerConfig integration
2026-06-29 23:58:56 +08:00
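Note on abf758fa9c: a rough sketch of a phase enum with next_of/from_string and a per-phase whitelist check. The whitelist mirrors the abbreviated summary in the commit message; the VERIFICATION entry, the string values, and the function bodies are assumptions, not the project's code.

```python
from enum import Enum
from typing import Optional


class PhaseState(Enum):
    PLANNING = "planning"
    BUILDING = "building"
    VERIFICATION = "verification"
    DELIVERY = "delivery"

    @classmethod
    def from_string(cls, value: str) -> "PhaseState":
        # Raises ValueError for unknown names.
        return cls(value.strip().lower())

    @classmethod
    def next_of(cls, phase: "PhaseState") -> Optional["PhaseState"]:
        members = list(cls)  # definition order is the transition order
        idx = members.index(phase)
        return members[idx + 1] if idx + 1 < len(members) else None


WHITELIST = {
    PhaseState.PLANNING: {"search", "read_file"},
    PhaseState.BUILDING: {"write_file"},
    PhaseState.VERIFICATION: {"read_file"},  # assumption: not stated in the commit
    PhaseState.DELIVERY: {"*"},  # wildcard: every tool allowed
}


def is_tool_allowed(phase: PhaseState, tool_name: str) -> bool:
    allowed = WHITELIST[phase]
    return "*" in allowed or tool_name in allowed
```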
chiguyong
50885fbc62
feat(U1): G5 SymbolExtractor + ReadFileTool with symbol slicing
...
- SymbolExtractor protocol + SymbolSpan dataclass
- AstSymbolExtractor for Python (stdlib ast, no tree-sitter dep — KTD1)
- RegexSymbolExtractor for TS/JS/Go/Rust/Java (language-aware regex)
- ReadFileTool with path/symbol/start_line/end_line params
- symbol=None returns full file (characterization baseline matching _FakeTool)
- symbol='foo' returns first matching symbol's line range
- symbol not found returns available_symbols list (soft error)
- Unsupported extension returns full file with note
- Manual start_line/end_line overrides symbol
- 66 unit tests covering R22/R23 + characterization + edge cases
2026-06-29 23:54:44 +08:00
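Note on 50885fbc62: one way symbol slicing with the stdlib ast module could look. SymbolSpan matches the name in the commit message, but its fields, the extract_symbols and read_symbol helpers, and the returned dict shape are assumptions made for this sketch.

```python
import ast
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SymbolSpan:
    name: str
    start_line: int  # 1-based, inclusive
    end_line: int    # 1-based, inclusive


def extract_symbols(source: str) -> List[SymbolSpan]:
    """Collect function and class definitions, ordered by position in the file."""
    spans = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            spans.append(SymbolSpan(node.name, node.lineno, node.end_lineno))
    return sorted(spans, key=lambda s: s.start_line)


def read_symbol(source: str, symbol: Optional[str] = None) -> dict:
    if symbol is None:
        return {"content": source}  # full file
    spans = extract_symbols(source)
    for span in spans:
        if span.name == symbol:  # first match wins
            lines = source.splitlines()
            body = "\n".join(lines[span.start_line - 1:span.end_line])
            return {"content": body, "start_line": span.start_line,
                    "end_line": span.end_line}
    # Soft error: report what is available instead of raising.
    return {"error": "symbol_not_found",
            "available_symbols": [s.name for s in spans]}


SAMPLE = "import os\n\ndef foo():\n    return 1\n\nclass Bar:\n    def baz(self):\n        return 2\n"
```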
chiguyong
23f7448d55
docs(plan): Wave 3 strategic coupling plan (G5/G6)
2026-06-29 23:48:45 +08:00
chiguyong
80b02f58a6
feat(U3): G7 three-tier fallback chain wired into chat REST
...
- Add agentkit/server/_fallback_chain.py: execute_with_fallback_chain
Main (ReActEngine) → Recovery (ReflexionEngine) → Emergency (EmergencyRules)
- chat.py send_message wraps react_engine.execute with the chain (KTD5)
- ReAct calls made inside ReflexionEngine bypass the chain (avoids recursion)
- TaskCancelledError propagates directly and never enters Emergency (KTD3)
- Soft failures (empty_fallback/verify_failed) also trigger Recovery
- Recovery failure/exception → Emergency classifies it with EmergencyRules.classify
- ServerConfig.from_dict reads fallback_chain.{recovery,emergency}
- 17 tests covering the Main/Recovery/Emergency tiers plus configuration
2026-06-29 23:07:38 +08:00
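Note on 80b02f58a6: a synchronous sketch of the Main → Recovery → Emergency control flow. The real chain wraps engine objects and is likely async; here each tier is a plain callable returning a dict with a "status" key, which is an assumption for illustration.

```python
class TaskCancelledError(Exception):
    """Stand-in for the project's cancellation error."""


SOFT_FAILURES = {"empty_fallback", "verify_failed"}


def _attempt(step):
    """Run one tier; return (result, None) on success or (None, error) on failure."""
    try:
        result = step()
    except TaskCancelledError:
        raise  # cancellation propagates and never reaches Emergency
    except Exception as exc:
        return None, exc
    if result.get("status") in SOFT_FAILURES:
        return None, RuntimeError(f"soft failure: {result['status']}")
    return result, None


def execute_with_fallback_chain(main, recovery, emergency):
    result, error = _attempt(main)
    if error is None:
        return result
    result, error = _attempt(recovery)
    if error is None:
        return result
    # Both tiers failed: hand the last error to the rule-based classifier.
    return emergency(error)
```

Treating a soft failure like an exception keeps the chain logic in one place, at the cost of losing the original result object; a fuller version would probably pass it along.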
chiguyong
b1841ce21b
feat(U4): G9 PlanPhase rollback + RollbackExecutor
...
- PlanPhase gains optional validation_command / rollback_command fields (KTD6 opt-in)
- to_dict emits the new keys only when the fields are not None, preserving the existing dict shape (KTD6 contract)
- Add RollbackExecutor (orchestrator/rollback.py), which reuses the VerificationLoop
subprocess pattern and bypasses ShellTool to avoid confirm_callback interception (KTD7)
- TeamOrchestrator._run_phase_rollback implements the R21 order:
validation → rollback → checkpoint.save (called only if the former succeeds)
- ServerConfig.from_dict reads rollback.default_timeout
- 20 tests covering characterization / happy path / timeout / git integration / configuration
2026-06-29 22:55:08 +08:00
chiguyong
5b2377469a
feat(U2): G7 Emergency rule templates + TaskResult.error_struct
...
Add the EmergencyError dataclass (stable error_code + Chinese message + suggestions
+ retryable + original_error) and the EmergencyRules classifier (pure rules, no LLM):
- TaskTimeoutError → timeout (retryable)
- LoopDetectedError → loop_detected (retryable)
- LLMProviderError → llm_failure (retryable)
- Exception → internal_error (not retryable)
- TaskCancelledError is not classified (callers must check for it first; otherwise ValueError is raised)
TaskResult gains a parallel error_struct field (defaults to None, preserving the existing contract).
to_dict omits the key when error_struct=None (byte-for-byte compatible).
- The 3 existing constants in fallback.py are unchanged (EMPTY_LLM_RESPONSE/MAX_STEPS_REACHED/SHELL_NO_OUTPUT)
- Config overrides are supported (suggestions/retryable/message grouped by error_code)
- 27 tests covering classification/serialization/config overrides/contract preservation
2026-06-29 22:43:22 +08:00
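Note on 5b2377469a: a sketch of a pure-rule classifier following the mapping in the commit message. The exception classes are local stand-ins, the messages are placeholders (the commit says the real ones are Chinese), and the function is written at module level rather than as EmergencyRules.classify for brevity.

```python
from dataclasses import dataclass, field
from typing import List


class TaskTimeoutError(Exception): ...
class LoopDetectedError(Exception): ...
class LLMProviderError(Exception): ...
class TaskCancelledError(Exception): ...


@dataclass
class EmergencyError:
    error_code: str
    message: str
    retryable: bool
    suggestions: List[str] = field(default_factory=list)
    original_error: str = ""


# Ordered (exception type, stable error_code, retryable) rules.
_RULES = [
    (TaskTimeoutError, "timeout", True),
    (LoopDetectedError, "loop_detected", True),
    (LLMProviderError, "llm_failure", True),
]


def classify(exc: Exception) -> EmergencyError:
    if isinstance(exc, TaskCancelledError):
        # Cancellation is not an emergency; the caller must handle it first.
        raise ValueError("TaskCancelledError must be handled by the caller")
    for exc_type, code, retryable in _RULES:
        if isinstance(exc, exc_type):
            return EmergencyError(code, f"placeholder message for {code}",
                                  retryable, original_error=str(exc))
    # Catch-all for anything the rules do not recognize.
    return EmergencyError("internal_error", "placeholder message for internal_error",
                          False, original_error=str(exc))
```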
chiguyong
8d5ccca604
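feat(U1): G4 ContextCompressor auxiliary LLM routing
...
_summarize first tries auxiliary_model (a cheaper, cost-optimized model such as qwen-turbo).
If that fails or returns empty content (the Finding 4 anti-pattern), it falls back to the main model; if the main model also fails, it still
falls through to _simple_summary. With auxiliary_model=None the existing single-model behavior is preserved.
- ContextCompressor gains an auxiliary_model parameter
- LLMConfig gains an auxiliary_model field, passed through by ServerConfig._build_llm_config
- agentkit.yaml documents llm.auxiliary_model: fast (commented out, keeping the default behavior)
- Tests: 9 scenarios covering success/empty content/exception/both models failing/skip when aux=main/audit fields/config wiring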
2026-06-29 22:37:14 +08:00
chiguyong
88bfe71d30
docs(plan): Wave 2 medium-coupling plan (G4/G7/G9)
2026-06-29 22:30:51 +08:00
Fischer
78ed93fc81
Merge pull request 'feat(agent): Wave 1 quick wins (G1/G2/G3/G8) + review fixes' (#4) from feat/agent-wave1-quick-wins into main
2026-06-29 22:08:56 +08:00
chiguyong
d7ca6e8065
fix(review): W1 ServerConfig from_dict wiring, W3 internal kwargs filter, N3 status docstring
...
Code review fixes for Wave 1:
- W1: ServerConfig.from_dict now wires prompt_cache/streaming/verification sections
from YAML to constructor (previously these params existed but were never read)
- W3: Tool._validate_input filters _-prefixed kwargs (e.g. _skip_dangerous_check)
before jsonschema.validate, preventing additionalProperties:false schemas from
rejecting internal control parameters
- N3: ReActResult.status docstring now lists "empty_fallback" and "verify_failed"
Added test test_internal_kwargs_underscore_prefixed_skipped_by_validation for W3.
2026-06-29 21:58:40 +08:00
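Note on d7ca6e8065 (W3): a stdlib-only sketch of filtering underscore-prefixed kwargs before validation. The commit uses jsonschema.validate; check_no_extra_properties below is a deliberately tiny stand-in for the additionalProperties:false rule, and both helper names are hypothetical.

```python
from typing import Any, Dict


def strip_internal_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Drop internal control parameters such as _skip_dangerous_check."""
    return {k: v for k, v in kwargs.items() if not k.startswith("_")}


def check_no_extra_properties(schema: Dict[str, Any], kwargs: Dict[str, Any]) -> None:
    """Stand-in for the additionalProperties:false part of schema validation."""
    if schema.get("additionalProperties", True) is False:
        extra = set(kwargs) - set(schema.get("properties", {}))
        if extra:
            raise ValueError(f"unexpected properties: {sorted(extra)}")


SCHEMA = {
    "type": "object",
    "properties": {"command": {"type": "string"}},
    "additionalProperties": False,
}

call_kwargs = {"command": "ls", "_skip_dangerous_check": True}
public_kwargs = strip_internal_kwargs(call_kwargs)
check_no_extra_properties(SCHEMA, public_kwargs)  # passes once filtered
```

Validating the unfiltered call_kwargs against the same schema would be rejected, which is the behavior the fix avoids.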
chiguyong
cd211c6cd9
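feat(U4): G1 re-inject verify failures into ReAct
...
- ReActEngine gains a max_reinjections constructor parameter (default 1; =0 is equivalent to the previous behavior)
- In execute()/execute_stream(), the verify block moves from after the loop to the final-answer detection point inside the loop:
- verify passes → normal break
- verify fails + reinjections < max + step < max_steps → errors are re-injected into the conversation as a user message, then continue so the LLM can self-correct
- verify fails + max_reinjections or max_steps reached → record the verify log in the trajectory, set trace_outcome="verify_failed", break
- execute_stream yields the final_answer event only after verify passes, so clients do not receive a completion signal prematurely
- ReActResult.status now carries trace_outcome (previously defaulted to "success")
- ServerConfig.verification config section (max_reinjections)
- test_verify_reinjection.py: 10 tests covering characterization (max=0) + new behavior (R1/R2/R3/R14)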
2026-06-29 21:35:08 +08:00
chiguyong
0f3f0a7550
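feat(U3): G8 delta_flush_interval throttling
...
- ReActEngine gains a flush_interval_ms constructor parameter (default 0 = yield per chunk, backward compatible)
- The execute_stream chunk loop throttles with time.monotonic, accumulating into _flush_buffer and yielding in batches
- With flush_interval_ms=0 the condition short-circuits to True and yields per chunk, preserving current behavior
- When the stream ends mid-interval, a final flush emits the remaining buffer so no characters are lost
- ServerConfig.streaming config section (flush_interval_ms)
- test_delta_flush.py covers R11/R12/R14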
2026-06-29 20:49:52 +08:00
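Note on 0f3f0a7550: a sketch of time-based chunk batching with time.monotonic. throttle_chunks and its injectable clock parameter are assumptions made so the example can be tested deterministically; the engine's own loop presumably differs.

```python
import time
from typing import Callable, Iterable, Iterator, List


def throttle_chunks(
    chunks: Iterable[str],
    flush_interval_ms: int = 0,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[str]:
    """Yield text in batches, at most once per flush interval."""
    interval = flush_interval_ms / 1000.0
    buffer: List[str] = []
    last_flush = clock()
    for chunk in chunks:
        buffer.append(chunk)
        now = clock()
        # interval 0 short-circuits to per-chunk yields (previous behavior)
        if flush_interval_ms == 0 or now - last_flush >= interval:
            yield "".join(buffer)
            buffer.clear()
            last_flush = now
    if buffer:
        # The stream ended mid-interval: flush the rest so nothing is lost.
        yield "".join(buffer)
```

A monotonic clock is used because wall-clock time can jump backwards or forwards, which would distort the interval measurement.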
chiguyong
c4aaef05aa
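feat(U2): G2 prompt cache two-block structure
...
- ReActEngine gains _build_system_message(stable+volatile), which builds the two blocks
- For the Anthropic provider it returns content blocks, with cache_control on the stable block
- For non-Anthropic providers it returns a concatenated string, relying on the stable prefix to hit automatic prefix caching
- In execute_stream/execute, memory injection moves from the end of system_prompt to the volatile layer
- LLMGateway.get_provider_name_for_model exposes provider detection
- anthropic.py _convert_messages passes list-type system content through
- ServerConfig.prompt_cache config section (default enable=True)
- ReActEngine.prompt_cache_enable constructor parameter (default True, preserving current behavior)
- test_prompt_cache_layers.py covers R4-R7/R13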
2026-06-29 20:47:23 +08:00
chiguyong
c66a7773b5
feat(U1): G3 tool-call schema validation
...
- base.py adds ToolValidationError (error_code/details) and _validate_input
- safe_execute validates kwargs with jsonschema.validate before execute
- input_schema=None skips validation for backward compatibility
- _execute_tool catches ToolValidationError first to preserve error_code
- function_tool._infer_schema no longer lets VAR_KEYWORD/VAR_POSITIONAL leak into the schema
- test_tool_schema_validation.py covers R8-R10
2026-06-29 20:34:14 +08:00
chiguyong
2747bb4e64
chore(prior): malformed tool call handling, auth whitelist, dev scripts, wave1 plan
2026-06-29 20:25:03 +08:00
Fischer
6e65352df8
Merge PR #3: feat(bitable): bitable file layer + default fields + in-table field operations (Stage 1)
...
Merge feat/bitable-ui-stage1 into main: Bitable UI completeness Stage 1 (U1-U6) + ce-code-review P0/P1 fixes
2026-06-29 09:25:30 +08:00
chiguyong
a6e1bf5884
feat(bitable): bitable file layer + default fields + in-table field operations + ce-code-review fixes (Stage 1)
...
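Implement Bitable UI completeness Stage 1 (U1-U6): add the file layer, default fields, and
in-table field operations that were missing for parity with Feishu/twenty, and fix the P0/P1 issues found in the ce-code-review walkthrough.
Backend (U1-U2):
- Add the BitableFile entity (models/db/repository/service/routes) with a three-level hierarchy: file → table → fields/records
- Schema V2 migration: bitable_files table + tables.file_id column; idempotent (IF NOT EXISTS); V1 orphan tables are kept
- New tables automatically get 5 default fields (title/status/date/creator/created time)
- agent-owned fields are auto-filled in create_record (matched by type+owner, passing actor_user_id)
- 7 file REST endpoints + IDOR ownership checks (404-before-403, bypassed for the internal token)
Frontend (U3-U5):
- File list page (FileCard grid + create/rename/delete) + file detail page (sidebar table list + vxe-table grid)
- Vue Router nested routes /bitable → /bitable/:fileId → /bitable/:fileId/:tableId
- Column header menu (edit/hide/delete field) + a trailing "+" column for adding fields
- Custom cell editor + Tag display for select/multiselect fields
- Pinia store extended with file state and actions; deep links fall back to getFile; watch on fileId changes
Tests (U6):
- Unit tests for file CRUD (12 cases) + default fields (10 cases)
- 3 E2E specs (view load, file flow, field operations), skipped gracefully when the backend is unavailable
ce-code-review fixes (P0/P1):
- P0 route conflict: GET /files/{file_id} shadowed the download endpoint → download moved to /uploads/{filename}
- P0 IDOR: add ownership checks to the five update/delete field/record/view endpoints
- P1 missing is_initialized property caused a crash on second initialization
- P1 direct URL navigation failed (files array empty) → selectFile falls back to getFile
- P1 switching fileId did not reload → add a watch
- P1 polling dropped the final formula value (wasCalculating guard) + reuse view filters
- P1 test assertions 200→201; the test_db no-URL cases lose the postgres marker so they can run
- P2 _check_table_ownership 403→404; input length validation; upload field-table consistency check
- P2 multiselect shallow comparison → deep comparison; E2E bitable-view gains a waitForServer guard
Verification: ruff check passes; pytest 91 passed/116 skipped; vue-tsc --noEmit passes.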
2026-06-29 04:07:45 +08:00
chiguyong
f476d3339c
Merge branch 'test/calendar-ui-manual-testing': fix UI not refreshing after the agent creates a calendar event + three-root-cause documentation trilogy + E2E test suite
2026-06-29 02:23:20 +08:00
chiguyong
5c15238a5a
fix(calendar): fix UI not refreshing after the agent creates a calendar event + document the three-root-cause trilogy
...
Code fixes (ce-debug):
- CalendarService.create_event gets an injected notify_callback and broadcasts a calendar_event_created WS message on success
- app.py reorders the _calendar_ws_sender closure definition so it can be injected into CalendarService (shared with ReminderScheduler)
- tauri-auth.ts keychain fallback fix (localStorage is always kept as a backup)
- Add 2 broadcast regression tests
Docs (ce-compound + ce-compound-refresh):
- Add docs/solutions/ui-bugs/calendar-agent-create-no-refresh.md (third root cause: missing WS broadcast)
- Update calendar-capability-and-ui-fixes.md: refresh the test count + add a Related Issues forward reference
- Update jwt-secret-dev-mode-user-id-mismatch.md: expand the e2e bullet + add a reference to the third root cause
- CONCEPTS.md gains a Service Broadcast Callback entry (Real-Time Fan-Out section)
Tests:
- Add an E2E test suite (admin/auth-persistence/bitable/calendar/conversation/documents/evolution/settings/skills)
- Add tests/e2e/test_api_coverage.py
- CI: .gitea/.github workflows/test.yml
2026-06-29 02:20:33 +08:00
chiguyong
d27681a93c
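fix(portal-auth): fix JWT verification wrongly activating in dev mode + README sync
...
## Portal 401 root-cause fix
Problem: when AGENTKIT_JWT_SECRET is unset, jwt_utils generates an ephemeral non-empty secret.
Once that secret is passed to AuthMiddleware, _is_dev_mode() returns False (the secret is non-empty, so the dev-mode check fails),
and requests without a JWT/API key are rejected with 401 (17 portal tests failing).
Fix: separate explicit_jwt_secret from jwt_secret:
- explicit_jwt_secret = get_jwt_secret() # None when env unset
- jwt_secret = explicit_jwt_secret or get_or_create_jwt_secret() # for signing
- AuthMiddleware(jwt_secret=explicit_jwt_secret or "") # only explicit activates JWT verify
The ephemeral secret is used only by the token-signing routes and does not activate JWT verification in the middleware.
Production behavior (AGENTKIT_JWT_SECRET set) is unchanged.
Verification:
- _is_dev_mode(): False → True
- GET /api/v1/portal/conversations: 401 → 200
- All 27 portal tests pass (17 failed before)
- 232 tests pass (portal + auth + calendar), 0 failures
## README sync
The CostAwareRouter / RegexRules / HeuristicClassifier / SemanticRouter / LLMClassifier
classes have been fully removed from the code; only RequestPreprocessor remains. 6 stale references in README.md are synced:
- Section 4 "Intent routing" now references RequestPreprocessor (see Section 7)
- Section 7 rewritten as "Request preprocessing (RequestPreprocessor)", following the AGENTS.md architecture description
- Section 8 "Semantic routing" removed (merged into the historical notes in Section 7)
- Architecture diagram CostAwareRouter → RequestPreprocessor, 22→28 route modules
- Module details: chat/skill_routing + chat/semantic_router merged into chat/request_preprocessor
- Module details: router/intent description updated to "not wired into the chat flow"
- Directory comments CostAwareRouter → RequestPreprocessor
- Sections renumbered 1-16 consecutively (previously 1-17, skipping 9)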
2026-06-28 15:26:42 +08:00
chiguyong
c9ce15fa4b
fix(code-review): fix 13 High + Medium security/reliability issues found in the walkthrough
...
Code fixes (8 High + 9 Medium):
- portal.py: C1 IDOR docs / C2 type fix / C3 WS connection cap of 16 / C4 early ws_user_id initialization / M silent swallow now logged
- auth/middleware.py: C5 add the missing WS sid
- calendar_tool.py: C6 two-sided ±43200 offset validation + reminder_channels type/whitelist validation
- sqlite_conversation_store.py: C7 transaction rollback on DELETE
- chat.ts (Pinia): C8 deleteConversation clears the pending cache
- app.py: M except: pass → logger.debug(exc_info=True)
- Scene6Error.vue: M clear setTimeout in onUnmounted
- DocumentsTab.vue: M Invalid Date guard
- ChatSidebar/RightPanel/TopNav.vue: M aria-label accessibility labels
- SystemMonitorPanel.vue: M v-else fallback + active border color + tablist keyboard navigation
- CalendarDrawer.vue: M overflow-y: auto
- CalendarGrid.vue: M ResizeObserver feedback-loop guard
- SkillsTab.vue: M always fetchSkills in onMounted
Doc fixes (5 High + 6 Medium):
- portal-platform-security-reliability-fixes.md: D2 test paths / D3 Root Cause+Impact sections / D4 severity: mixed / title translated to Chinese / 12 absolute paths converted to relative / P2 #12 counting convention
- AGENTS.md: D5 route table 22→28 / expert templates 5→15 / LiteLLM U15 migration / config lookup fallback
- README.md: 8 port references 8000→8001
New tests:
- tests/unit/calendar/test_calendar_tool.py: ponytail self-check assertions
Verification:
- ruff check (5 files): All checks passed
- vue-tsc --noEmit: exit 0
- git stash baseline check: the 17 portal 401 failures are pre-existing
Known limitations (pre-existing):
- 17 portal tests fail with 401: needs a separate ce-debug investigation
- 7 stale CostAwareRouter references in README.md: doc sync tracked as a separate task
2026-06-28 15:06:41 +08:00
chiguyong
8ae8ed4e9b
Merge branch 'feat/calendar-ui-fixes': calendar capability fixes + UI layout improvements + conversation 404 handling + ce-code-review fixes
2026-06-28 14:25:44 +08:00
chiguyong
43e9025c6d
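fix(calendar): calendar capability fixes + UI layout improvements + conversation 404 handling
...
P0: calendar_tool did not pass reminder_rules to create_event, so reminders did not work at all. P1: chat.ts deleteConversation did not clear pending + 404 recursion guard. P2: duplicated paragraph in the app.py system prompt + gui_mode F821 + SystemMonitorPanel flex layout. P3: portal send_json snapshot + WS connected clears is_local + dead code removed. Verification: ruff + pytest (98 passed) + typecheck pass.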
2026-06-28 14:24:58 +08:00
chiguyong
31c65e01b8
fix(security): P0 security hardening + multi-instance deployment consistency (U1-U4 + U5c)
...
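U1: LLM gateway KB cache fails closed: on exceptions, caching is disabled by default to prevent KB data leaks
U2: MCP dangerous-tool blacklist filtering: covers 6+1 endpoints, preventing bypass of chat confirmation
U3: SecretsStore migrated to Redis: credentials are shared across workers, with an in-memory fallback kept for development mode
U4: channels webhook state in Redis: ZSET sliding-window rate limiting + nonce dedup + backpressure
U5c: ce-code-review fix batch:
- P0: align the MCP blacklist with publisher.py (terminal_execute -> terminal, +file_read)
- P1: ZSET rate-limit members get a uuid suffix to avoid collisions on identical timestamps
- P1: SecretsStore redis parameter Any -> aioredis.Redis | None (AGENTS.md compliance)
- P1: Redis client gains socket_timeout so a single point of failure cannot hang requests
Tests: 171 scoped tests pass, ruff clean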
2026-06-26 04:05:33 +08:00
chiguyong
c62d435c43
Merge branch 'feat/portal-platform-evolution' — portal platform evolution (U1-U17 + RAG + channels + LiteLLM + ce-code-review fixes + ce-compound doc)
2026-06-26 01:48:19 +08:00
chiguyong
75e9b58e46
docs(ce-compound): document the portal-platform security/reliability fix batch
...
Documents the 10 P1/P2/P3 fixes in the ce-code-review fix batch (commit 53faa60):
- P1: WeCom replay, cross-user cache leak, webhook exception storm, shutdown leak
- P2: Feishu TTL, unbounded task set, quota N+1, redundant SHA-256, unused parameter
- P3: DIRECT_CHAT deduplication
Add docs/solutions/security-issues/portal-platform-security-reliability-fixes.md
CONCEPTS.md gains 3 domain terms: Per-User Cache Namespace, Webhook Signature Freshness, Webhook Backpressure
2026-06-26 01:47:57 +08:00
chiguyong
53faa60472
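fix(review): ce-code-review P1+P2 fixes: security/reliability/performance
...
P1 security and reliability (4 items):
- wecom: verify_signature adds a timestamp freshness check (5-minute window to prevent replay)
- cache: with per_user_namespace enabled, should_cache rejects anonymous requests with
user_id=None to avoid cross-user cache leaks (security requirements a/e)
- channels: webhook receive_message gets a catch-all exception handler so a 500 cannot trigger a platform retry storm
- app: shutdown calls close_all_adapters + await _pending_webhook_tasks,
preventing httpx connection leaks and lost IM replies
P2 efficiency and maintainability (5 items):
- feishu: _TOKEN_CACHE_TTL 300 → 6900 (2h minus a 5min margin, avoiding 24x over-frequent refreshes)
- channels: _pending_webhook_tasks is now bounded (429 rejection at 2x the concurrency limit)
- gateway: the quota check calls get_usage once per period and reuses the summary to check token+cost
- cache_key: generate_cache_key collapses to a single SHA-256 (removing 8-10 redundant hashes)
- config: ProviderConfig.get_api_key drops the unused secrets_store parameter
P3 deduplication (1 item):
- channels: the DIRECT_CHAT path of _process_inbound_message is extracted into a _direct_chat helper
Tests:
- test_wecom: timestamps now use int(time.time()); add test_expired_timestamp_rejected
- test_cache: should_cache tests cover anonymous rejection + namespace_off compatibility
- test_config_migration: get_api_key tests adapted to the new signature
- channels/config_migration/quota_enforcement tests all pass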
2026-06-26 01:40:31 +08:00
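Note on 53faa60472 (wecom): a sketch of a 5-minute timestamp freshness check as a replay guard. Only the window size comes from the commit message; is_fresh, its parameters, and the symmetric tolerance for clock skew are assumptions, and the signature comparison itself is left out.

```python
import time
from typing import Optional

MAX_SKEW_SECONDS = 300  # 5-minute window, per the commit message


def is_fresh(timestamp: str, now: Optional[float] = None) -> bool:
    """Return True if the webhook timestamp is within the allowed window."""
    try:
        ts = int(timestamp)
    except (TypeError, ValueError):
        return False  # malformed timestamps are rejected outright
    current = time.time() if now is None else now
    # abs() also rejects timestamps too far in the future.
    return abs(current - ts) <= MAX_SKEW_SECONDS
```

A check like this would typically run before the signature comparison, so stale requests are rejected even when their signature is valid.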
chiguyong
1ccaf56b9a
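refactor: ce-simplify-code review fixes: deduplication + efficiency + dead-code cleanup
...
3 review agents (reuse/quality/efficiency) found 15 issues, all fixed:
Efficiency and safety (6 items):
- MCPClient caches a MultiServerMCPClient singleton + aclose(), fixing connection/subprocess leaks
- _rate_limits prunes empty IP entries, fixing a memory leak under X-Forwarded-For spoofing
- _seen_nonces switches to OrderedDict for amortized O(1) expiry cleanup
- webhook background tasks get Semaphore(20) + task reference tracking to bound concurrency
- _build_adapter decrypts secrets in parallel with asyncio.gather
- Adapter instance cache (_adapter_cache), so the token TTL cache hits across requests
Deduplication (4 items):
- header_get extracted to channels/base.py; the 4 adapters import it from there
- _get_client/close() moved into the MessageAdapter base class and inherited by subclasses
- URLVerificationChallenge unified in base.py, shared by feishu/slack/wecom
- Transport ABC gains an endpoint_url property; from_transport no longer touches private fields
Dead code and type safety (5 items):
- The dead detect_cache_hit method is replaced by the public record_cache_result API
- execution_mode.value == "direct_chat" now uses an enum comparison
- Remove the dead yielded_any variable, a duplicate from fastapi import Request, and
redundant getattr defensiveness
453 tests passed, ruff clean (the pre-existing F841 was not introduced here)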
2026-06-25 23:54:14 +08:00
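Note on 1ccaf56b9a (_seen_nonces): a sketch of nonce deduplication with OrderedDict, where expired entries are evicted from the oldest end. NonceCache, its TTL default, and the injectable clock are hypothetical; only the OrderedDict idea comes from the commit message.

```python
import time
from collections import OrderedDict
from typing import Callable


class NonceCache:
    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._seen: "OrderedDict[str, float]" = OrderedDict()

    def check_and_add(self, nonce: str) -> bool:
        """Return True for a new nonce, False for a replay."""
        now = self._clock()
        # Insertion order equals time order, so expired entries are always at
        # the front; each entry is popped at most once (amortized O(1)).
        while self._seen:
            _, ts = next(iter(self._seen.items()))
            if now - ts < self._ttl:
                break
            self._seen.popitem(last=False)
        if nonce in self._seen:
            return False
        self._seen[nonce] = now
        return True
```

With a plain dict and a full scan, every request would pay O(n) for cleanup; ordering by insertion time lets the loop stop at the first unexpired entry.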
chiguyong
793476cafa
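feat(llm): U17 - replace semantic cache with LiteLLM + per-user/ACL scope isolation
...
- Add LitellmCacheManager: configures the global litellm.cache with a three-level backend fallback
(RedisSemanticCache -> RedisCache -> InMemoryCache); redisvl is imported lazily
- cache_key gains user_id + kb_acl_hash parameters (security requirements a/b/e)
- Gateway integration: reads the KB caching_disabled flag (security requirement c), builds a scoped
cache_key, and sets cost=0 on a hit
- LLMResponse gains a cache_hit field; LLMRequest gains a cache parameter
- litellm_provider passes the cache parameter through + detects cache hits via _hidden_params
- 33 new tests covering 13 scenarios (including User A != User B cache isolation)
- The old InMemoryLLMCache/RedisLLMCache are kept for backward compatibility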
2026-06-25 22:49:59 +08:00
chiguyong
86541d7172
feat(mcp): U16 — langchain-mcp-adapters client replacement + transport deprecation
...
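- Rewrite MCPClient: URL scheme auto-detection (stdio/http/sse) → langchain config
- The old Transport injection path is kept (DeprecationWarning) for backward compatibility
- Module-level deprecation warning in transport.py
- 28 new tests covering URL detection, list_tools, call_tool, the legacy path, and ImportError
- Fix pre-existing F401/F841 in manager.py / transport.py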
2026-06-25 22:04:37 +08:00
chiguyong
069dbc22b1
feat(llm): U15 — LiteLLM unified provider + api_key encrypted secrets migration
2026-06-25 21:41:15 +08:00
chiguyong
13c516a54f
feat(mcp): U14 — Skill/Team MCP publish with admin auth + dangerous-tool opt-in
2026-06-25 21:10:06 +08:00
chiguyong
16c33be295
feat(mcp): U13 — refactor MCPServer to route factory + mount at /api/v1/mcp with auth
2026-06-25 20:58:41 +08:00
chiguyong
8998f94c42
feat(channels): U12 — DingTalk/WeCom/Slack adapters + multi-channel webhook dispatch
2026-06-25 20:45:43 +08:00
chiguyong
4b58e8f661
feat(channels): U11 — Feishu IM adapter end-to-end (webhook + signature + AES-CBC decrypt + chat integration)
2026-06-25 20:24:21 +08:00
chiguyong
5572387c01
feat(channels): U10 — message adapter ABC + AES-256-GCM secrets store + channel CRUD routes
2026-06-25 20:13:37 +08:00
chiguyong
af96cb49bd
docs(plan): deepen portal platform evolution plan — KTD5/7/8/9 expanded, KTD11 added
2026-06-25 20:13:27 +08:00
chiguyong
864bb95a30
feat(server): wire rag_platform components to app.state lifespan
...
Initialize in lifespan() (after bitable, before yield):
- KBStore + ensure_tables() → app.state.kb_store (if database_url available)
- RetrievalEngine + vector_store → app.state.retrieval_engine (if database_url available)
- HitProcessor → app.state.hit_processor (with llm_gateway)
- TaskManager → app.state.task_manager (degraded mode, InMemoryTaskStore)
- KBSettingsStore → app.state.kb_settings_store (singleton)
Each component wrapped in try/except — failures logged but don't block startup.
Follows same pattern as episodic memory initialization.
2026-06-25 20:02:01 +08:00
chiguyong
1f691ca178
feat(frontend): U9 — KB management extension with segment preview, status display, settings
...
New: SegmentPreview.vue, KBSettings.vue
Extended: DocumentUpload.vue (status badges, retry, preview), SearchTest.vue (3 modes), SourceConfig.vue (ACL), KnowledgeBaseView.vue (settings + task history tabs)
API+Store: kb.ts new types/methods, knowledge.ts new state/actions
typecheck: passed
2026-06-25 13:14:58 +08:00
chiguyong
e3ae2f3a56
feat(rag_platform): U8 — TaskIQ async task integration
...
Add tasks.py: TaskManager with vectorize/batch_index tasks, per-user concurrency limits, degraded mode (sync execution without broker), WorkerSweeper for timeout detection, error message sanitization
Add taskiq>=0.11 and taskiq-redis>=0.5 to pyproject.toml
Task parameter schema validation (VectorizeTaskParams, BatchIndexTaskParams)
Tests: 41 new tests, 289 total passing
2026-06-25 12:58:51 +08:00
chiguyong
d026a91f43
feat(rag_platform): U6 — hit processing mode + KB settings
...
Add hit_processing.py: HitProcessor with model_opt (LLM-generated) and direct (concatenated chunks) modes, with in-process cache
Add settings.py: KBSettings/KBSettingsUpdate models, KBSettingsStore with async CRUD
Add KB settings endpoints to kb_management.py: GET/PUT /kb-management/kbs/{kb_id}/settings with owner-only modification
Tests: 43 new tests (25 hit_processing + 18 settings), 293 total passing
2026-06-25 12:44:47 +08:00
chiguyong
5c562dbff3
feat(rag_platform): U5 — rerank + question generation + termbase
...
Add rerank.py: Reranker with Cohere/BGE provider support, data export risk annotation, graceful degradation
Add question_gen.py: LLM-based question generation following ContextualChunker pattern, with caching
Add termbase.py: jieba custom dictionary management, add/remove/load terms
Tests: 58 new tests (14 rerank + 19 question_gen + 25 termbase), 205 total passing
2026-06-25 12:31:43 +08:00
chiguyong
fb9f16d6e5
feat(rag_platform): U4 — dual-index retrieval (pgvector semantic + PG fulltext jieba)
...
Add fulltext.py: jieba tokenization + tsvector write/query
Add retrieval.py: RetrievalEngine with embedding/keywords/blend modes
Update models.py: add RetrievalRequest model
Tests: 35 new tests, 147 total passing
2026-06-25 12:20:48 +08:00
chiguyong
3f9588e673
feat(rag_platform): U3+U7 — rewrite upload endpoint with sanitization + pipeline
...
Rewrite upload_document() to use rag_platform sanitize + DocumentProcessor:
- File type whitelist validation (8 allowed types, reject .exe/.sh)
- File size limit (50MB) + zip bomb detection for ZIP-based formats
- DocumentProcessor.parse() (with content sanitization) + segment()
- Return chunks preview, status="segmenting" (pending vectorization)
Add POST /kb-management/documents/preview endpoint:
- Pre-upload preview with adjustable chunk_size/chunk_overlap
- Same security validation as upload, no document record created
Add POST /kb-management/documents/{id}/vectorize placeholder:
- Returns 503 — full async vectorization deferred to U8 (TaskIQ)
Test: update test_upload_document assertion (status "indexed" → "segmenting")
2026-06-25 12:06:16 +08:00
chiguyong
b55c896794
feat(rag_platform): U3+U7 — document processing pipeline + upload security
...
U3: Document processing pipeline (document_processor.py)
- DocumentProcessor class wrapping parse → segment → vectorize
- parse() uses memory/document_loader.py for multi-format extraction
- segment() uses LlamaIndex SentenceSplitter
- preview() returns chunks for read-only preview (no vectorization)
- vectorize() embeds chunks and stores in pgvector (all-or-nothing)
- process() orchestrates full pipeline with status transitions:
pending → parsing → segmenting → vectorizing → indexed | failed
U7: Upload security & content sanitization (sanitize.py)
- ALLOWED_FILE_TYPES whitelist (pdf/docx/xlsx/pptx/txt/md/csv/html)
- MAX_FILE_SIZE 50MB limit
- validate_file_type() / validate_file_size() guards
- check_zip_bomb() for ZIP-based formats (ratio > 100:1 or > 500MB)
- check_image_bomb() for pixel count > 100MP (PNG/JPEG/GIF header parsing)
- is_safe_ip() SSRF protection (loopback/RFC1918/link-local/ULA denied)
- sanitize_markdown() removes dangerous HTML tags (script/iframe/object/embed)
- sanitize_content() main entry point for text format sanitization
- parse_xml_safe() XXE protection (forbid_dtd/forbid_entities/forbid_external)
Preview API (preview.py)
- PreviewChunk / PreviewResult Pydantic models
- generate_preview() returns read-only segmentation preview
Tests: 112 tests passing (45 new + 67 existing)
- test_sanitize.py: file type/size, markdown sanitization, SSRF, zip/image bomb
- test_document_processor.py: parse/segment, preview, vectorize, failure status
2026-06-25 11:21:42 +08:00
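Note on b55c896794 (U7, check_zip_bomb): a sketch of a zip-bomb heuristic using the stdlib zipfile module. The two thresholds (ratio > 100:1 or > 500MB uncompressed) come from the commit message; the function name looks_like_zip_bomb, the boolean return, and the make_zip helper are assumptions for illustration.

```python
import io
import zipfile

MAX_RATIO = 100                         # uncompressed : compressed
MAX_UNCOMPRESSED = 500 * 1024 * 1024    # 500MB


def looks_like_zip_bomb(data: bytes) -> bool:
    """Inspect the archive's directory without extracting anything."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        infos = zf.infolist()
    total_uncompressed = sum(info.file_size for info in infos)
    total_compressed = sum(info.compress_size for info in infos)
    if total_uncompressed > MAX_UNCOMPRESSED:
        return True
    if total_compressed > 0 and total_uncompressed / total_compressed > MAX_RATIO:
        return True
    return False


def make_zip(payload: bytes, compression: int) -> bytes:
    """Test helper: build a one-entry archive in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=compression) as zf:
        zf.writestr("data.bin", payload)
    return buf.getvalue()


suspicious = make_zip(b"\0" * 1_000_000, zipfile.ZIP_DEFLATED)
harmless = make_zip(b"hello world", zipfile.ZIP_STORED)
```

The sizes are read from the archive's own headers, which an attacker can falsify, so a stricter implementation would also cap the bytes actually read during extraction.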
chiguyong
c1a21f57a1
feat(rag_platform): U2 — KB persistence + per-KB ACL
...
Add PostgreSQL-backed KB store replacing in-memory KnowledgeSourceStore:
- models.py: ORM models (KBModel, DocumentModel, KBAclModel) using
SQLAlchemy 2 DeclarativeBase + Mapped style
- store.py: KBStore with async CRUD for KBs and documents,
create_kb creates owner ACL in same transaction
- acl.py: filter_kb_by_user_acl(), grant_access(), revoke_access(),
list_acl() — follows filter_kb_sources_by_department pattern
Schema: rag_platform_kbs, rag_platform_documents, rag_platform_kb_acl
with FK CASCADE on kb_id. UniqueConstraint on (kb_id, user_id).
Tests: 23 unit tests covering KB CRUD, document operations, ACL
filtering, grant/revoke. All 37 rag_platform tests pass.
2026-06-25 11:01:04 +08:00