Commit Graph

360 Commits

Author SHA1 Message Date
chiguyong d7ca6e8065 fix(review): W1 ServerConfig from_dict wiring, W3 internal kwargs filter, N3 status docstring
Test / backend-test (pull_request) Has been cancelled Details
Test / frontend-unit (pull_request) Has been cancelled Details
Test / api-e2e (pull_request) Has been cancelled Details
Test / frontend-e2e (pull_request) Has been cancelled Details
Code review fixes for Wave 1:
- W1: ServerConfig.from_dict now wires prompt_cache/streaming/verification sections
  from YAML to constructor (previously these params existed but were never read)
- W3: Tool._validate_input filters _-prefixed kwargs (e.g. _skip_dangerous_check)
  before jsonschema.validate, preventing additionalProperties:false schemas from
  rejecting internal control parameters
- N3: ReActResult.status docstring now lists "empty_fallback" and "verify_failed"

Added test test_internal_kwargs_underscore_prefixed_skipped_by_validation for W3.
2026-06-29 21:58:40 +08:00
chiguyong cd211c6cd9 feat(U4): G1 verify 失败回灌 ReAct
- ReActEngine 新增 max_reinjections 构造参数(默认 1,=0 等价原行为)
- execute()/execute_stream() verify 块从循环后移到循环内 final-answer 检测点:
  - verify 通过 → 正常 break
  - verify 失败 + reinjections < max + step < max_steps → errors 作为 user 消息回灌 conversation, continue 让 LLM 自纠正
  - verify 失败 + 达到 max_reinjections 或 max_steps → 记录 verify log 到 trajectory, trace_outcome="verify_failed", break
- execute_stream 的 final_answer 事件在 verify 通过后才 yield,避免客户端过早收到完成信号
- ReActResult.status 现在传递 trace_outcome(原默认 "success")
- ServerConfig.verification 配置项(max_reinjections)
- test_verify_reinjection.py 10 测试:characterization(max=0)+ 新行为(R1/R2/R3/R14)
2026-06-29 21:35:08 +08:00
chiguyong 0f3f0a7550 feat(U3): G8 delta_flush_interval 调速
- ReActEngine 新增 flush_interval_ms 构造参数(默认 0 = 逐 chunk yield 向后兼容)
- execute_stream chunk 循环用 time.monotonic 节流,累积 _flush_buffer 批量 yield
- flush_interval_ms=0 条件短路为 True 逐 chunk yield 保当前行为
- 流结束 mid-interval 最终 flush 剩余 buffer 不丢字符
- ServerConfig.streaming 配置项(flush_interval_ms)
- test_delta_flush.py 覆盖 R11/R12/R14
2026-06-29 20:49:52 +08:00
chiguyong c4aaef05aa feat(U2): G2 prompt cache 双块结构
- ReActEngine 新增 _build_system_message(stable+volatile) 双块构造
- Anthropic provider 返回 content blocks,stable 块带 cache_control
- 非 Anthropic provider 返回字符串拼接,依赖 stable 前缀命中自动前缀缓存
- execute_stream/execute 记忆注入从 system_prompt 末尾移到 volatile 层
- LLMGateway.get_provider_name_for_model 暴露 provider 检测能力
- anthropic.py _convert_messages 支持 list-type system content 透传
- ServerConfig.prompt_cache 配置项(默认 enable=True)
- ReActEngine.prompt_cache_enable 构造参数(默认 True 保当前行为)
- test_prompt_cache_layers.py 覆盖 R4-R7/R13
2026-06-29 20:47:23 +08:00
chiguyong c66a7773b5 feat(U1): G3 工具调用 schema 校验
- base.py 新增 ToolValidationError(error_code/details)与 _validate_input
- safe_execute 在 execute 前用 jsonschema.validate 校验 kwargs
- input_schema=None 跳过校验保持向后兼容
- _execute_tool 优先捕获 ToolValidationError 保留 error_code
- function_tool._infer_schema 修复 VAR_KEYWORD/VAR_POSITIONAL 误入 schema
- test_tool_schema_validation.py 覆盖 R8-R10
2026-06-29 20:34:14 +08:00
chiguyong 2747bb4e64 chore(prior): malformed tool call handling, auth whitelist, dev scripts, wave1 plan 2026-06-29 20:25:03 +08:00
Fischer 6e65352df8 Merge PR #3: feat(bitable): 多维表格文件层 + 默认字段 + 表内字段操作 (Stage 1)
Deploy to Production / deploy (push) Waiting to run Details
Test / backend-test (push) Waiting to run Details
Test / frontend-unit (push) Waiting to run Details
Test / api-e2e (push) Waiting to run Details
Test / frontend-e2e (push) Waiting to run Details
合并 feat/bitable-ui-stage1 到 main — 多维表格 UI 完整性 Stage 1(U1-U6)+ ce-code-review P0/P1 修复
2026-06-29 09:25:30 +08:00
chiguyong a6e1bf5884 feat(bitable): 多维表格文件层 + 默认字段 + 表内字段操作 + ce-code-review 修复 (Stage 1)
Test / backend-test (pull_request) Has been cancelled Details
Test / frontend-unit (pull_request) Has been cancelled Details
Test / api-e2e (pull_request) Has been cancelled Details
Test / frontend-e2e (pull_request) Has been cancelled Details
实现多维表格 UI 完整性 Stage 1(U1-U6),补齐飞书/twenty 对齐缺失的文件层、
默认字段与表内字段操作能力,并修复 ce-code-review 走查发现的 P0/P1 级问题。

后端(U1-U2):
- 新增 BitableFile 实体(models/db/repository/service/routes),三级层级:文件→数据表→字段/记录
- Schema V2 迁移:bitable_files 表 + tables.file_id 列,幂等(IF NOT EXISTS),保留 V1 孤儿表
- 新建数据表自动创建 5 个默认字段(标题/状态/日期/创建人/创建时间)
- agent-owned 字段在 create_record 时自动填充(按 type+owner 匹配,传 actor_user_id)
- 7 个文件 REST 端点 + IDOR ownership 检查(404-before-403,internal token 旁路)

前端(U3-U5):
- 文件列表页(FileCard 网格 + 新建/重命名/删除)+ 文件详情页(侧栏表格列表 + vxe-table 网格)
- Vue Router 嵌套路由 /bitable → /bitable/:fileId → /bitable/:fileId/:tableId
- 列头菜单(编辑/隐藏/删除字段)+ 末尾 + 列新增字段
- select/multiselect 字段自定义单元格编辑器 + Tag 展示
- Pinia store 扩展 file 状态与动作,深链直访回退 getFile,fileId 切换 watch

测试(U6):
- 文件 CRUD(12 例)+ 默认字段(10 例)单元测试
- 3 个 E2E spec(视图加载、文件流、字段操作),后端不可用时优雅跳过

ce-code-review 修复(P0/P1):
- P0 路由冲突:GET /files/{file_id} 遮蔽下载端点 → 下载改 /uploads/{filename}
- P0 IDOR:update/delete field/record/view 五端点补 ownership 检查
- P1 is_initialized property 缺失致二次初始化崩溃
- P1 直接 URL 导航失效(files 数组为空)→ selectFile 回退 getFile
- P1 fileId 切换不重载 → 增加 watch
- P1 轮询丢弃最终公式值(wasCalculating 守卫)+ 复用视图 filters
- P1 测试断言 200→201;test_db 无 URL 用例解除 postgres 标记得以执行
- P2 _check_table_ownership 403→404;输入长度校验;upload field-table 一致性校验
- P2 multiselect 浅比较 → 深比较;E2E bitable-view 补 waitForServer 守卫

验证:ruff check 通过;pytest 91 passed/116 skipped;vue-tsc --noEmit 通过。
2026-06-29 04:07:45 +08:00
chiguyong f476d3339c Merge branch 'test/calendar-ui-manual-testing' — 修复 agent 创建日历事件后 UI 不刷新 + 三根因文档三部曲 + E2E 测试套件
Deploy to Production / deploy (push) Waiting to run Details
Test / backend-test (push) Waiting to run Details
Test / frontend-unit (push) Waiting to run Details
Test / api-e2e (push) Waiting to run Details
Test / frontend-e2e (push) Waiting to run Details
2026-06-29 02:23:20 +08:00
chiguyong 5c15238a5a fix(calendar): 修复 agent 创建日历事件后 UI 不刷新 + 文档化三根因三部曲
Test / backend-test (pull_request) Has been cancelled Details
Test / frontend-unit (pull_request) Has been cancelled Details
Test / api-e2e (pull_request) Has been cancelled Details
Test / frontend-e2e (pull_request) Has been cancelled Details
代码修复 (ce-debug):
- CalendarService.create_event 注入 notify_callback,成功后广播 calendar_event_created WS 消息
- app.py 调整 _calendar_ws_sender 闭包定义顺序,注入 CalendarService(与 ReminderScheduler 共享)
- tauri-auth.ts keychain fallback 修复(localStorage 始终作为备份)
- 新增 2 个广播回归测试

文档 (ce-compound + ce-compound-refresh):
- 新增 docs/solutions/ui-bugs/calendar-agent-create-no-refresh.md(第三根因:WS 广播缺失)
- 更新 calendar-capability-and-ui-fixes.md:刷新 test count + 加 Related Issues 前向引用
- 更新 jwt-secret-dev-mode-user-id-mismatch.md:扩展 e2e bullet + 加第三个根因引用
- CONCEPTS.md 新增 Service Broadcast Callback 条目 (Real-Time Fan-Out 节)

测试:
- 新增 E2E 测试套件 (admin/auth-persistence/bitable/calendar/conversation/documents/evolution/settings/skills)
- 新增 tests/e2e/test_api_coverage.py
- CI: .gitea/.github workflows/test.yml
2026-06-29 02:20:33 +08:00
chiguyong d27681a93c fix(portal-auth): 修复 dev mode JWT 验证误激活 + README 文档同步
## Portal 401 根因修复

问题:AGENTKIT_JWT_SECRET 未设置时,jwt_utils 生成 ephemeral 非空 secret,
该 secret 被传给 AuthMiddleware 后 _is_dev_mode() 返回 False(not "" = False),
导致无 JWT/API key 的请求被拒为 401(17 个 portal 测试失败)。

修复:分离 explicit_jwt_secret 与 jwt_secret —
- explicit_jwt_secret = get_jwt_secret()  # None when env unset
- jwt_secret = explicit_jwt_secret or get_or_create_jwt_secret()  # for signing
- AuthMiddleware(jwt_secret=explicit_jwt_secret or "")  # only explicit activates JWT verify

ephemeral secret 仅供 token 签名 routes,不激活 middleware 的 JWT 验证。
生产环境(AGENTKIT_JWT_SECRET 已设置)行为不变。

验证:
- _is_dev_mode(): False → True
- GET /api/v1/portal/conversations: 401 → 200
- 27 个 portal 测试全部通过(之前 17 失败)
- 232 个测试通过 (portal + auth + calendar),0 失败

## README 文档同步

代码中 CostAwareRouter / RegexRules / HeuristicClassifier / SemanticRouter / LLMClassifier
类已完全删除,仅 RequestPreprocessor 存在。README.md 6 处过时引用同步:

- 第 4 节"意图路由"改为引用 RequestPreprocessor(详见第 7 节)
- 第 7 节重写为"请求预处理(RequestPreprocessor)",按 AGENTS.md 架构描述
- 第 8 节"语义路由"删除(合并入第 7 节历史说明)
- 架构图 CostAwareRouter → RequestPreprocessor,22→28 路由模块
- 模块详解 chat/skill_routing + chat/semantic_router 合并为 chat/request_preprocessor
- 模块详解 router/intent 描述更新为"未接入 chat 流程"
- 目录注释 CostAwareRouter → RequestPreprocessor
- 章节重新编号 1-16 连续(原 1-17 跳过 9)
2026-06-28 15:26:42 +08:00
chiguyong c9ce15fa4b fix(code-review): 修复走查发现的 13 High + Medium 安全/可靠性问题
代码修复(8 High + 9 Medium):
- portal.py — C1 IDOR 文档 / C2 类型修复 / C3 WS 连接上限 16 / C4 ws_user_id 早初始化 / M silent swallow 日志化
- auth/middleware.py — C5 WS sid 补齐
- calendar_tool.py — C6 偏移量 ±43200 双向校验 + reminder_channels 类型/白名单校验
- sqlite_conversation_store.py — C7 DELETE 事务回滚
- chat.ts (Pinia) — C8 deleteConversation 清理 pending 缓存
- app.py — M except: pass → logger.debug(exc_info=True)
- Scene6Error.vue — M onUnmounted 清理 setTimeout
- DocumentsTab.vue — M Invalid Date 守卫
- ChatSidebar/RightPanel/TopNav.vue — M aria-label 无障碍标签
- SystemMonitorPanel.vue — M v-else 兜底 + active 边框色 + tablist 键盘导航
- CalendarDrawer.vue — M overflow-y: auto
- CalendarGrid.vue — M ResizeObserver 反馈循环防护
- SkillsTab.vue — M onMounted 始终 fetchSkills

文档修复(5 High + 6 Medium):
- portal-platform-security-reliability-fixes.md — D2 测试路径 / D3 Root Cause+Impact 章节 / D4 severity: mixed / 标题中文化 / 12 处绝对路径转相对 / P2 #12 数字口径
- AGENTS.md — D5 路由表 22→28 / 专家模板 5→15 / LiteLLM U15 迁移 / 配置查找 fallback
- README.md — 8 处端口 8000→8001

新增测试:
- tests/unit/calendar/test_calendar_tool.py — ponytail 自检断言

验证:
- ruff check (5 文件) — All checks passed
- vue-tsc --noEmit — exit 0
- git stash baseline 验证 — portal 17 个 401 失败为预存在问题

已知限制(预存在):
- 17 个 portal 测试 401 失败 — 需另起 ce-debug 调查
- README.md 7 处 CostAwareRouter 引用过时 — 文档同步另起任务
2026-06-28 15:06:41 +08:00
chiguyong 8ae8ed4e9b Merge branch 'feat/calendar-ui-fixes' — 日历能力缺失修复 + UI布局优化 + 会话404处理 + ce-code-review 修复 2026-06-28 14:25:44 +08:00
chiguyong 43e9025c6d fix(calendar): 日历能力缺失修复 + UI 布局优化 + 会话404处理
P0: calendar_tool reminder_rules 未传入 create_event,提醒功能完全失效。P1: chat.ts deleteConversation 未清理 pending + 404 递归保护。P2: app.py 系统提示重复段落 + gui_mode F821 + SystemMonitorPanel flex 布局。P3: portal send_json 快照 + WS connected 清除 is_local + 移除死代码。验证: ruff+pytest 98passed+typecheck 通过。
2026-06-28 14:24:58 +08:00
chiguyong 31c65e01b8 fix(security): P0 安全加固 + 多实例部署一致性 (U1-U4 + U5c)
Deploy to Production / deploy (push) Has been cancelled Details
U1: LLM gateway KB 缓存 fail-closed — 异常时默认禁用缓存防止 KB 数据泄漏
U2: MCP 危险工具黑名单过滤 — 6+1 端点覆盖,防止绕过 chat confirmation
U3: SecretsStore Redis 迁移 — 多 worker 共享凭证,内存降级保留开发模式
U4: channels webhook Redis 状态 — ZSET 滑动窗口限流 + nonce dedup + backpressure
U5c: ce-code-review 修复批次:
  - P0: 统一 MCP 黑名单与 publisher.py 一致 (terminal_execute -> terminal, +file_read)
  - P1: ZSET 限流 member 加 uuid 后缀避免同时间戳碰撞
  - P1: SecretsStore redis 参数 Any -> aioredis.Redis | None (AGENTS.md 合规)
  - P1: Redis client 添加 socket_timeout 防止单点故障请求挂死

测试: 171 scoped tests pass, ruff clean
2026-06-26 04:05:33 +08:00
chiguyong c62d435c43 Merge branch 'feat/portal-platform-evolution' — portal platform evolution (U1-U17 + RAG + channels + LiteLLM + ce-code-review fixes + ce-compound doc)
Deploy to Production / deploy (push) Waiting to run Details
2026-06-26 01:48:19 +08:00
chiguyong 75e9b58e46 docs(ce-compound): 记录 portal-platform 安全/可靠性修复批次
记录 ce-code-review 修复批次(commit 53faa60)的 10 个 P1/P2/P3 修复:
- P1: WeCom 重放、缓存跨用户泄漏、webhook 异常风暴、shutdown 泄漏
- P2: Feishu TTL、无界任务集、配额 N+1、冗余 SHA-256、未用参数
- P3: DIRECT_CHAT 去重

新增 docs/solutions/security-issues/portal-platform-security-reliability-fixes.md
CONCEPTS.md 补充 3 个领域术语:Per-User Cache Namespace、Webhook Signature Freshness、Webhook Backpressure
2026-06-26 01:47:57 +08:00
chiguyong 53faa60472 fix(review): ce-code-review P1+P2 修复 — 安全/可靠性/性能
P1 安全与可靠性(4 项):
- wecom: verify_signature 增加时间戳新鲜度校验(5 分钟窗口防重放)
- cache: should_cache 在 per_user_namespace 开启时拒绝 user_id=None
  匿名请求,避免跨用户缓存泄漏(安全要求 a/e)
- channels: webhook receive_message 异常兜底,防止 500 触发平台重试风暴
- app: shutdown 调用 close_all_adapters + await _pending_webhook_tasks,
  防止 httpx 连接泄漏和丢失 IM 回复

P2 效率与可维护性(5 项):
- feishu: _TOKEN_CACHE_TTL 300 → 6900(2h 减 5min 余量,避免 24x 过频刷新)
- channels: _pending_webhook_tasks 有界化(2x 并发上限时 429 拒绝)
- gateway: quota 检查每 period 单次 get_usage,复用 summary 检查 token+cost
- cache_key: generate_cache_key 合并为单次 SHA-256(消除 8-10 次冗余哈希)
- config: ProviderConfig.get_api_key 移除未用的 secrets_store 参数

P3 去重(1 项):
- channels: _process_inbound_message DIRECT_CHAT 路径提取 _direct_chat 辅助函数

测试:
- test_wecom: 时间戳改用 int(time.time()),新增 test_expired_timestamp_rejected
- test_cache: should_cache 测试覆盖匿名拒绝 + namespace_off 兼容
- test_config_migration: get_api_key 测试适配新签名
- channels/config_migration/quota_enforcement 测试全部通过
2026-06-26 01:40:31 +08:00
chiguyong 1ccaf56b9a refactor: ce-simplify-code 审查修复 — 去重 + 效率 + 死代码清理
3 个审查代理(复用/质量/效率)发现 15 个问题,全部修复:

效率与安全(6 项):
- MCPClient 缓存 MultiServerMCPClient 单例 + aclose(),修复连接/子进程泄漏
- _rate_limits 清理空 IP 条目,修复 X-Forwarded-For 欺骗下内存泄漏
- _seen_nonces 改用 OrderedDict,O(1) 摊销过期清理
- webhook 后台任务加 Semaphore(20) + 任务引用追踪,限制无界并发
- _build_adapter 用 asyncio.gather 并行解密 secrets
- 适配器实例缓存(_adapter_cache),token TTL 缓存跨请求命中

去重(4 项):
- header_get 提取到 channels/base.py,4 个适配器统一 import
- _get_client/close() 移入 MessageAdapter 基类,子类继承
- URLVerificationChallenge 统一到 base.py,feishu/slack/wecom 共用
- Transport ABC 添加 endpoint_url 属性,from_transport 不再访问私有字段

死代码与类型安全(5 项):
- detect_cache_hit 死方法替换为 record_cache_result 公开 API
- execution_mode.value == "direct_chat" 改用枚举比较
- 删除 yielded_any 死变量、重复 from fastapi import Request、
  多余 getattr 防御

453 tests passed, ruff clean(预存 F841 非本次引入)
2026-06-25 23:54:14 +08:00
chiguyong 793476cafa feat(llm): U17 — LiteLLM 语义缓存替换 + per-user/ACL scope 安全隔离
- 新增 LitellmCacheManager:配置 litellm.cache 全局,三级后端 fallback
  (RedisSemanticCache -> RedisCache -> InMemoryCache),redisvl lazy import
- cache_key 扩展 user_id + kb_acl_hash 参数(安全要求 a/b/e)
- gateway 集成:读取 KB caching_disabled flag(安全要求 c),构建带 scope
  的 cache_key,命中时 cost=0
- LLMResponse 新增 cache_hit 字段;LLMRequest 新增 cache 参数
- litellm_provider 透传 cache 参数 + 检测 _hidden_params 缓存命中
- 33 个新测试覆盖 13 场景(含 User A != User B 缓存隔离)
- 旧 InMemoryLLMCache/RedisLLMCache 保留向后兼容
2026-06-25 22:49:59 +08:00
chiguyong 86541d7172 feat(mcp): U16 — langchain-mcp-adapters client replacement + transport deprecation
- 重写 MCPClient:URL scheme 自动检测(stdio/http/sse)→ langchain config
- 旧 Transport 注入路径保留(DeprecationWarning),向后兼容
- transport.py 模块级弃用警告
- 28 个新测试覆盖 URL 检测、list_tools、call_tool、legacy 路径、ImportError
- 修复 manager.py / transport.py 预存 F401/F841
2026-06-25 22:04:37 +08:00
chiguyong 069dbc22b1 feat(llm): U15 — LiteLLM unified provider + api_key encrypted secrets migration 2026-06-25 21:41:15 +08:00
chiguyong 13c516a54f feat(mcp): U14 — Skill/Team MCP publish with admin auth + dangerous-tool opt-in 2026-06-25 21:10:06 +08:00
chiguyong 16c33be295 feat(mcp): U13 — refactor MCPServer to route factory + mount at /api/v1/mcp with auth 2026-06-25 20:58:41 +08:00
chiguyong 8998f94c42 feat(channels): U12 — DingTalk/WeCom/Slack adapters + multi-channel webhook dispatch 2026-06-25 20:45:43 +08:00
chiguyong 4b58e8f661 feat(channels): U11 — Feishu IM adapter end-to-end (webhook + signature + AES-CBC decrypt + chat integration) 2026-06-25 20:24:21 +08:00
chiguyong 5572387c01 feat(channels): U10 — message adapter ABC + AES-256-GCM secrets store + channel CRUD routes 2026-06-25 20:13:37 +08:00
chiguyong af96cb49bd docs(plan): deepen portal platform evolution plan — KTD5/7/8/9 expanded, KTD11 added 2026-06-25 20:13:27 +08:00
chiguyong 864bb95a30 feat(server): wire rag_platform components to app.state lifespan
Initialize in lifespan() (after bitable, before yield):
- KBStore + ensure_tables() → app.state.kb_store (if database_url available)
- RetrievalEngine + vector_store → app.state.retrieval_engine (if database_url available)
- HitProcessor → app.state.hit_processor (with llm_gateway)
- TaskManager → app.state.task_manager (degraded mode, InMemoryTaskStore)
- KBSettingsStore → app.state.kb_settings_store (singleton)

Each component wrapped in try/except — failures logged but don't block startup.
Follows same pattern as episodic memory initialization.
2026-06-25 20:02:01 +08:00
chiguyong 1f691ca178 feat(frontend): U9 — KB management extension with segment preview, status display, settings
New: SegmentPreview.vue, KBSettings.vue
Extended: DocumentUpload.vue (status badges, retry, preview), SearchTest.vue (3 modes), SourceConfig.vue (ACL), KnowledgeBaseView.vue (settings + task history tabs)
API+Store: kb.ts new types/methods, knowledge.ts new state/actions

typecheck: passed
2026-06-25 13:14:58 +08:00
chiguyong e3ae2f3a56 feat(rag_platform): U8 — TaskIQ async task integration
Add tasks.py: TaskManager with vectorize/batch_index tasks, per-user concurrency limits, degraded mode (sync execution without broker), WorkerSweeper for timeout detection, error message sanitization
Add taskiq>=0.11 and taskiq-redis>=0.5 to pyproject.toml
Task parameter schema validation (VectorizeTaskParams, BatchIndexTaskParams)

Tests: 41 new tests, 289 total passing
2026-06-25 12:58:51 +08:00
chiguyong d026a91f43 feat(rag_platform): U6 — hit processing mode + KB settings
Add hit_processing.py: HitProcessor with model_opt (LLM-generated) and direct (concatenated chunks) modes, with in-process cache
Add settings.py: KBSettings/KBSettingsUpdate models, KBSettingsStore with async CRUD
Add KB settings endpoints to kb_management.py: GET/PUT /kb-management/kbs/{kb_id}/settings with owner-only modification

Tests: 43 new tests (25 hit_processing + 18 settings), 293 total passing
2026-06-25 12:44:47 +08:00
chiguyong 5c562dbff3 feat(rag_platform): U5 — rerank + question generation + termbase
Add rerank.py: Reranker with Cohere/BGE provider support, data export risk annotation, graceful degradation
Add question_gen.py: LLM-based question generation following ContextualChunker pattern, with caching
Add termbase.py: jieba custom dictionary management, add/remove/load terms

Tests: 58 new tests (14 rerank + 19 question_gen + 25 termbase), 205 total passing
2026-06-25 12:31:43 +08:00
chiguyong fb9f16d6e5 feat(rag_platform): U4 — dual-index retrieval (pgvector semantic + PG fulltext jieba)
Add fulltext.py: jieba tokenization + tsvector write/query
Add retrieval.py: RetrievalEngine with embedding/keywords/blend modes
Update models.py: add RetrievalRequest model
Tests: 35 new tests, 147 total passing
2026-06-25 12:20:48 +08:00
chiguyong 3f9588e673 feat(rag_platform): U3+U7 — rewrite upload endpoint with sanitization + pipeline
Rewrite upload_document() to use rag_platform sanitize + DocumentProcessor:
- File type whitelist validation (8 allowed types, reject .exe/.sh)
- File size limit (50MB) + zip bomb detection for ZIP-based formats
- DocumentProcessor.parse() (with content sanitization) + segment()
- Return chunks preview, status="segmenting" (pending vectorization)

Add POST /kb-management/documents/preview endpoint:
- Pre-upload preview with adjustable chunk_size/chunk_overlap
- Same security validation as upload, no document record created

Add POST /kb-management/documents/{id}/vectorize placeholder:
- Returns 503 — full async vectorization deferred to U8 (TaskIQ)

Test: update test_upload_document assertion (status "indexed" → "segmenting")
2026-06-25 12:06:16 +08:00
chiguyong b55c896794 feat(rag_platform): U3+U7 — document processing pipeline + upload security
U3: Document processing pipeline (document_processor.py)
- DocumentProcessor class wrapping parse → segment → vectorize
- parse() uses memory/document_loader.py for multi-format extraction
- segment() uses LlamaIndex SentenceSplitter
- preview() returns chunks for read-only preview (no vectorization)
- vectorize() embeds chunks and stores in pgvector (all-or-nothing)
- process() orchestrates full pipeline with status transitions:
  pending → parsing → segmenting → vectorizing → indexed | failed

U7: Upload security & content sanitization (sanitize.py)
- ALLOWED_FILE_TYPES whitelist (pdf/docx/xlsx/pptx/txt/md/csv/html)
- MAX_FILE_SIZE 50MB limit
- validate_file_type() / validate_file_size() guards
- check_zip_bomb() for ZIP-based formats (ratio > 100:1 or > 500MB)
- check_image_bomb() for pixel count > 100MP (PNG/JPEG/GIF header parsing)
- is_safe_ip() SSRF protection (loopback/RFC1918/link-local/ULA denied)
- sanitize_markdown() removes dangerous HTML tags (script/iframe/object/embed)
- sanitize_content() main entry point for text format sanitization
- parse_xml_safe() XXE protection (forbid_dtd/forbid_entities/forbid_external)

Preview API (preview.py)
- PreviewChunk / PreviewResult Pydantic models
- generate_preview() returns read-only segmentation preview

Tests: 112 tests passing (45 new + 67 existing)
- test_sanitize.py: file type/size, markdown sanitization, SSRF, zip/image bomb
- test_document_processor.py: parse/segment, preview, vectorize, failure status
2026-06-25 11:21:42 +08:00
chiguyong c1a21f57a1 feat(rag_platform): U2 — KB persistence + per-KB ACL
Add PostgreSQL-backed KB store replacing in-memory KnowledgeSourceStore:
- models.py: ORM models (KBModel, DocumentModel, KBAclModel) using
  SQLAlchemy 2 DeclarativeBase + Mapped style
- store.py: KBStore with async CRUD for KBs and documents,
  create_kb creates owner ACL in same transaction
- acl.py: filter_kb_by_user_acl(), grant_access(), revoke_access(),
  list_acl() — follows filter_kb_sources_by_department pattern

Schema: rag_platform_kbs, rag_platform_documents, rag_platform_kb_acl
with FK CASCADE on kb_id. UniqueConstraint on (kb_id, user_id).

Tests: 23 unit tests covering KB CRUD, document operations, ACL
filtering, grant/revoke. All 37 rag_platform tests pass.
2026-06-25 11:01:04 +08:00
chiguyong 27d0184392 feat(rag_platform): U1 — RAG platform skeleton + LlamaIndex integration
Create src/agentkit/rag_platform/ module with:
- models.py: Pydantic domain models (KB, Document, Chunk, QueryResult)
- indexing.py: PGVectorStore wrapper with explicit table name
  (rag_platform_kb_chunks) for schema isolation from episodic_memory
- pipeline.py: RAGPipeline wrapping LlamaIndex IngestionPipeline
  (SentenceSplitter + embedding + vector store)

Add dependencies: llama-index-core, llama-index-vector-stores-postgres,
llama-index-embeddings-openai, pgvector, jieba.

Tests: 14 unit tests covering models, indexing (URL conversion, table
name isolation, embed_dim), and pipeline (ingest, query, chunk params).
2026-06-25 10:49:35 +08:00
chiguyong 22c89763e2 docs: add long-horizon reliability fixes learning + scrub CONCEPTS.md
- New solution doc: logic-errors/long-horizon-reliability-code-review-fixes.md
  Documents 13 code-review fixes (2 P0, 5 P1, 6 P2) across U1-U7
  long-horizon reliability features (disclosure_level default, resume
  plan_id mismatch, middleware dataclass compat, state offload readback,
  checkpoint dedup, dynamic phase persistence, debate count restore,
  loop detection reset, concurrent resume lock, FAILED phase handling,
  checkpoint cleanup, offload type guard).

- CONCEPTS.md: add Expert Orchestration cluster (Disclosure Level,
  State Offloading, Pipeline Checkpoint, Debate Phase, Resume).
  Scrub Bitable entries to remove implementation specifics per
  vocabulary rules (API paths, library calls, SQL syntax, class names,
  enum values).
2026-06-25 02:40:22 +08:00
chiguyong 71eaf8dc7c docs: add bitable security/reliability patterns solution doc + CONCEPTS.md
Deploy to Production / deploy (push) Has been cancelled Details
- docs/solutions/architecture-patterns/bitable-companion-service-security-reliability-patterns.md
  Knowledge-track doc capturing 10 security/reliability patterns from the
  bitable companion service (SSRF prevention, SQL injection, IDOR, atomic
  task claiming, cache invalidation, composite cursor, batch ops, async
  I/O safety, OOM prevention, internal token auth)

- CONCEPTS.md
  Seeded with 3 core domain nouns: Bitable, Field Ownership, Recalc

- AGENTS.md
  Added discoverability tips for docs/solutions/ and CONCEPTS.md
2026-06-25 01:25:06 +08:00
chiguyong bbbf9cd40a feat(bitable): add bitable companion service with full P0-P2 fixes
Bitable is a multi-dimensional table companion service that runs alongside
the main AgentKit server. It provides structured data storage with formula
fields, views, and ingestion pipelines.

Major components:
- Domain models (Pydantic v2): Table, Field, Record, View, RecalcTask
- SQLAlchemy 2 async ORM with independent bitable PostgreSQL schema
- Formula engine: AST parser, DAG, Kahn topological sort, safe eval
- RecalcWorker: atomic task claiming (FOR UPDATE SKIP LOCKED), topo-order
  processing, stale-threshold reaper for crash recovery
- REST API (/api/v1/bitable): tables, fields, records, views, files
- BitableTool: agent-facing tool with batch chunking (500/batch)
- CLI: agentkit bitable subcommands (create, list, import-excel, etc.)
- Frontend: Vue 3 + vxe-table grid with field management, views, filters
- Ingestion: Excel (openpyxl), database reflection, API collector

Security fixes (ce-code-review P0 + ce-debug P1):
- SQL injection prevention (field_id validation, parameterized queries)
- IDOR protection (_check_table_ownership on all table-level endpoints)
- SSRF prevention (URL scheme + private IP validation in parse_excel_url)
- OOM prevention (streaming file upload, batch delete, batch insert)
- Atomic recalc task claiming (FOR UPDATE SKIP LOCKED)
- Formula engine cache invalidation on field changes
- Composite cursor pagination for non-id sort orders
- Batch upsert (eliminates N+1 queries)
- Sync I/O offloaded to thread pool in async contexts
- Internal token auth (X-Internal-Token, hmac.compare_digest)
- PK unique index enforcement

Test coverage: 88 unit tests (95 skipped without Docker)
2026-06-25 01:09:59 +08:00
chiguyong 567cbc9c9b refactor: simplify code across U1-U7 (bug fix + efficiency + reuse + quality) 2026-06-24 22:35:52 +08:00
chiguyong 0847c0e086 fix(checkpoint): add TTL expiration for memory fallback mode
内存降级模式之前没有 TTL 过期机制,长期运行进程会导致内存泄漏。
现在 list_checkpoints 和 load_plan 在内存模式下会过滤/清除过期数据。

- list_checkpoints: 内存降级分支过滤过期 checkpoint
- load_plan: 内存降级分支检查 TTL 过期,过期则清除并返回 None
- 新增 _is_expired 方法检查 saved_at 是否超过 TTL
- _memory_plans 类型改为 tuple(plan_dict, timestamp) 以支持 TTL
- 新增 5 个 TTL 过期测试覆盖内存模式和 Redis 降级场景
2026-06-24 22:04:55 +08:00
chiguyong fa152e24ac feat(skills): add progressive skill loading with disclosure_level=0 (U5)
When disclosure_level=0, system prompt only injects skill name + description
(summary mode). SkillDetailTool is injected into the tool set, allowing the
LLM to load full instructions on-demand via skill_detail(query). This reduces
context window consumption when many skills are registered.
2026-06-24 21:49:00 +08:00
chiguyong dfd188b1a4 feat(orchestrator): add pipeline checkpoint and crash recovery (U7)
Add PipelineCheckpoint for stage-level crash recovery with Redis-first
+ memory fallback. TeamOrchestrator saves checkpoints after each phase
finalizes and supports resume(plan_id) to continue from the last
completed phase. New POST /api/v1/tasks/{id}/resume endpoint recreates
the team from saved plan and calls resume.
2026-06-24 21:04:18 +08:00
chiguyong 3dfda904d7 feat(core): add middleware pipeline architecture with onion model
U6: Unified middleware protocol (before/after) with MiddlewareChain
implementing onion model execution. Parallel integration (KTD1) —
middleware path controlled by presence of middleware_chain parameter,
existing ReActEngine path unchanged when None.

- New core/middleware.py: RequestContext, Middleware protocol,
  MiddlewareChain (onion model: before outer→inner, after inner→outer)
- 3 example middlewares: SummarizationMiddleware (U3 headroom compression),
  TokenUsageMiddleware, LoopDetectionMiddleware (request-level audit)
- ReActEngine.__init__ accepts middleware_chain parameter
- execute() branches: middleware path when chain present, existing path otherwise
- 22 tests covering ordering, error handling, state passing, backward compat
2026-06-24 20:52:15 +08:00
chiguyong ef84e3fd53 feat(experts): add SharedWorkspace state offloading for long-horizon runs
U4: ExpertTeam accepts redis_client, passes to SharedWorkspace. After phase
completion, full result is written to workspace and in-memory phase.result
is replaced with a 500-char summary + _ref_key. Dependency output reading
resolves offloaded content from workspace on demand, with graceful fallback
to summary on read failure.

Tests: 8 scenarios (offload creation, short content, dependency resolution,
workspace failure fallback, non-offloaded passthrough, redis_client wiring,
memory dict fallback, pipeline integration) — all pass.
2026-06-24 20:32:10 +08:00
chiguyong 122173ec2c feat(core): add headroom-based compression trigger
U3: ContextCompressor now accepts model_context_limit, headroom_threshold,
and min_tokens. should_compress() triggers when token ratio exceeds 0.8 of
model limit OR exceeds min_tokens (8000 fallback). ReActEngine._should_compress
delegates to compressor when available, checks is_available() first.

Tests: 6 scenarios (headroom trigger, min_tokens guard, small model,
unavailable compressor, delegation, fallback) — all pass.
2026-06-24 20:28:14 +08:00
chiguyong 717aad1303 feat(experts): add concurrency limit to TeamOrchestrator parallel phases
U2: Add asyncio.Semaphore to bound concurrent phase execution and debate
argument generation. Default limit=3, configurable via max_concurrent_phases.
Prevents LLM rate-limit spikes when many phases run in the same layer.

Tests: 5 scenarios (happy path, 5-phase edge case, serial mode, failure
release, debate integration) — all pass.
2026-06-24 20:23:30 +08:00
chiguyong 018b342d96 feat(react): add loop detection to prevent repeated identical tool calls
U1: Sliding window hash detection in ReAct loop. When the same tool is
called with identical arguments >= threshold times (default 2), injects
a correction message first, then raises LoopDetectedError if the LLM
doesn't change strategy. Covers both _execute_loop and execute_stream.
2026-06-24 20:12:35 +08:00