fischer-agentkit

Commit Graph

Author	SHA1	Message	Date
chiguyong	42fe7bcbc9	feat(calendar): U3 agent calendar tool for ReAct integration Adds CalendarTool implementing the Tool ABC so the ReAct engine can create, query, update, and delete events autonomously. Resolves event_type_name and tag_names (look up or create), sets source="agent" to distinguish agent-created events from manual ones. - src/agentkit/tools/calendar_tool.py — CalendarTool(Tool) - tests/unit/tools/test_calendar_tool.py — 13 tests covering all actions	2026-06-23 21:56:08 +08:00
chiguyong	d36e45bbe7	feat(calendar): U2 backend service & REST API Add CalendarService business logic layer and 14 REST endpoints: - service.py: event CRUD with RRULE expansion, event types, tags, invitations, non-admin user search (G5/A3), type-level default reminder rule cloning - routes/calendar.py: JWT-authenticated endpoints for events, types, tags, invitations, user search — with ownership checks - 17 new tests (12 service + 5 routes), 33 total calendar tests passing	2026-06-23 21:43:39 +08:00
chiguyong	2ea799f6c4	feat(calendar): U1 backend data model, storage & RRULE expansion Add calendar subsystem foundation mirroring documents/ pattern: - models.py: 8 dataclasses (CalendarEvent with is_invited, EventType, Tag, EventTag, ReminderRule, ReminderDelivery, ExternalCalendarConfig, Invitation) - db.py: aiosqlite bare-connection CRUD for all 8 tables with WAL mode - recurrence.py: RRULE expansion via dateutil.rrule (RFC 5545) - 16 unit tests covering DB CRUD and RRULE edge cases (DST, UNTIL, range) - Add python-dateutil>=2.9 to pyproject.toml	2026-06-23 21:30:39 +08:00
chiguyong	3337589395	fix(review): document-processing code review fixes — validation, tests, formatting Deploy to Production / deploy (push) Waiting to run Details - SkillConfig._validate_v2: validate fallback_strategies against ReWOOEngine.VALID_STRATEGIES (lazy import, #20) - test_skill_config: +4 tests for fallback_strategies validation - test_document_loader: +8 xlsx edge case tests (empty workbook, malformed bytes, column mismatch, row/cell truncation, multi-sheet, file size limit, None cells, #16) - test_execution_modes: fix ReWOOEngine patch path (lazy import -> patch at source) + FakeReWOOEngine.execute return .output attribute - config_driven: ruff formatting (quotes, blank lines after imports) - project_rules: remove stale "known failing test" note (now passes)	2026-06-23 20:21:19 +08:00
chiguyong	a672dddc9a	feat(skills): distinguish agent templates from business skills in UI Deploy to Production / deploy (push) Waiting to run Details The skills tab mixed generic execution-engine templates (react/direct/ rewoo/...) with business-domain skills (monitor/geo_optimizer/...) with no visual or data distinction. Adds a derived `category` field to the SkillInfo/SkillDetail API models and groups the frontend display. Backend: - SkillInfo/SkillDetail: add category (Literal), agent_type, execution_mode, task_mode fields - _skill_to_info: derive category from explicit _ENGINE_TEMPLATE_NAMES set (not name suffix — trend_agent/deai_agent are business skills despite the _agent suffix) - Simplify repetitive hasattr pattern with getattr Frontend: - ISkillInfo/ISkillDetail: add category + mode fields - skills store: agentTemplates/businessSkills computed getters (businessSkills is defensive: anything not explicitly engine template) - SkillsView: group into 执行引擎 / 业务技能 sections with counts - SkillCard: type badge (引擎/技能), category-based icon, mode display, dark-mode-aware accent color Tests: - test_category_derived_from_name_suffix: verifies field exposure - test_category_no_orphans: invariant — every skill has a valid category - test_trend_agent_classified_as_business_skill: regression guard for the _agent suffix misclassification bug Code review (ce-code-review): 2 P1 + 5 P2 findings applied.	2026-06-23 15:55:59 +08:00
chiguyong	47f3bfecfc	feat(documents): add document processing capability (U1-U9) Implements end-to-end document generation, template filling, and reading: - DocumentService: unified business layer for create/query/download - Renderers: Word (Markdown->docx), Excel (Markdown/JSON->xlsx), PDF (Markdown->pdf with CJK font), Template (Jinja2 sandbox .docx fill) - DocumentLoader: read PDF/Word/Excel/Markdown/HTML/text -> Document - DocumentTool: Agent tool with action=create\|read - REST API: /api/v1/documents (create, upload-template, list, download) - Frontend: DocumentPanel, DocumentCard, documents Pinia store, chat store tool_result detection - Security: path traversal guard (Path.resolve + relative_to), SSTI guard (SandboxedEnvironment), API key auth, 50MB upload limit - Bug fixes: template path traversal (400 not 500), TemplateRenderer lazy-load (no external registration dependency) - Tests: 168 tests (unit + security + E2E F1/F2/F3 + bug hunt) - Docs: README section 17, requirements + plan + test-plan docs Requirements R1-R28 verified, F1-F3 user flows pass.	2026-06-23 15:05:01 +08:00
chiguyong	4f261523c2	fix(review): U3 atomic file writes for YAML + .env + skill config All config file writes now use the write-temp + fsync + os.replace pattern (KTD-4) so a crash mid-write leaves the original file intact. - Add src/agentkit/server/utils/atomic_write.py with write_text_atomic - settings.py: _write_yaml_config and _write_env_var use atomic write - skill_service.py: import_skill uses atomic write - skill_service.py: update_skill_config uses atomic write + fcntl.flock around the read-modify-write cycle to serialize concurrent updates - Add 11 unit tests covering happy path, crash safety, concurrency, errors	2026-06-22 17:03:27 +08:00
chiguyong	698a8fafba	fix(review): U7 refresh token hash verification on whoami The whoami route accepted rotated/old refresh tokens for cold-start because it only checked session revocation status, not the token hash. Now when token_type == "refresh", the route computes hash_token(token) and compares it with the session's stored refresh_token_hash using hmac.compare_digest (constant-time). Mismatch returns 401. - Add SessionService.get_stored_refresh_hash(session_id) helper - Add hash verification in whoami route (R9) - Add TestWhoamiTokenHash with 5 integration tests	2026-06-22 16:55:20 +08:00
chiguyong	278d76b381	fix(review): U6 frontend field alignment + CLI top-users field fix	2026-06-22 16:28:44 +08:00
chiguyong	00c8386939	fix(review): U1 Redis quota enforcement — key construction + fail-closed + degradation recovery + async	2026-06-22 16:22:33 +08:00
chiguyong	abe2a66436	fix(review): CLI field names, Pydantic validation, exception chaining	2026-06-22 15:24:31 +08:00
chiguyong	5e977539c7	test(admin): U10 — E2E + security isolation + quota enforcement tests 23 integration tests across 3 files: - test_e2e_admin_flow: 5 end-to-end lifecycle tests (department, user, LLM config, skill management, usage dashboard) - test_security_isolation: 7 department isolation tests + non-admin 403 tests (cross-dept skill/KB access, multi-dept union, admin sees all, removed user loses access, disabled dept, API key client) - test_quota_enforcement: 10 quota tests (token/cost/whitelist limits, multi-dept strictest-wins, real gateway integration, usage recording) 418 admin tests pass, no regressions.	2026-06-21 19:57:49 +08:00
chiguyong	2dd0091bda	feat(admin): U8 — CLI admin command group AdminHttpClient: sync HTTP client with JWT/API key auth, config file support (~/.agentkit/admin_config.yaml), env var fallback. 35+ CLI commands across 7 groups: login, department (CRUD + bind/unbind skill/KB + quotas), user (CRUD + reset-password + assign/remove dept), llm (providers + api-key + fallbacks), skill (list/enable/disable/ import/reload), kb (documents CRUD + sync/rebuild), usage (summary/ timeseries/by-model/top-users/export). All commands support --server-url, --token, --api-key, --json flags. Rich table output by default, raw JSON with --json. Friendly error handling for connection/auth/not-found/conflict errors. 64 new tests, 102 CLI tests pass, no regressions.	2026-06-21 18:56:14 +08:00
chiguyong	09feca3307	feat(admin): U7 — usage dashboard + quota enforcement UsageRecord extended with user_id + department_id (backward compatible). UsageStore Protocol extended: record() accepts user_id/department_id, get_usage() accepts filters, new get_usage_by_user/department methods. RedisUsageStore uses versioned keys (v2) for new records. LLMGateway.chat()/chat_stream() accept user_id, department_ids, db_path. Quota check before provider call: model whitelist + token limit + cost limit (daily). Multi-department uses strictest-wins (any exceed → reject). QuotaExceededError → 429 at route layer. UsageService: summary, timeseries, by-model, top-users, export (CSV/JSON). 5 new admin endpoints under /admin/usage/*. llm_gateway.py routes pass DepartmentContext + db_path to gateway, catch QuotaExceededError → 429 (JSON for /chat, SSE error for /stream). 84 new tests. 441 admin+usage tests pass, no regressions.	2026-06-21 17:23:20 +08:00
chiguyong	fd7f6816b8	feat(admin): U6 — Skill & KB management endpoints + department binding SkillService: enable/disable (persisted in skill_states table, schema v4), import from YAML (with path traversal + name validation), reload from file, update config. GET /skills now filters disabled skills. KbService: list/upload/delete documents with department_id binding. Added department_id field to KnowledgeSource + UploadedDocument. Department visibility: (bound to user depts) ∪ (global = None). 10 new admin endpoints: skill enable/disable/import/reload/update, KB documents CRUD, source sync/rebuild. All guarded by _require_admin. Implemented reload stub in skill_management.py (was no-op). 54 new tests (26 unit + 28 integration). Fixed 4 pre-existing lint errors. 357 admin tests pass, no regressions.	2026-06-21 16:19:51 +08:00
chiguyong	980919fc95	feat(admin): U5 — LLM config admin endpoints + department quotas QuotaService: set/get/list/delete quotas, check_quota (hard reject), is_model_allowed. JSON-serialized limit_value, upsert with ON CONFLICT. LlmConfigService: provider CRUD + set_api_key + fallback management. fcntl.flock file lock prevents concurrent YAML writes. Reuses settings.py helpers (_read_yaml_config, _write_yaml_config, _write_env_var, _mask_api_key). 11 new admin endpoints: provider CRUD, api-key, fallback CRUD, department quotas CRUD. All guarded by _require_admin. 93 new tests (30 quota unit + 32 llm-config unit + 31 integration).	2026-06-21 15:03:38 +08:00
chiguyong	ad65f7a8d7	feat(admin): U1+U2+U4 — schema v3, department service, context filtering U1: Bump _SCHEMA_VERSION to 3, add 5 department tables (departments, user_departments, department_skill_bindings, department_kb_bindings, department_quotas) + 5 ORM models + helpers. U2: DepartmentService (12 async methods: CRUD + bind/unbind skill/KB + count_users). Mount admin_router in app.py. 36 unit + 28 integration tests. U4: DepartmentContext FastAPI dependency (per-route, admin bypasses filtering). filter_skills_by_department / filter_kb_sources_by_department helpers. Applied to GET /skills and GET /kb-management/* routes. 15 integration tests for department isolation. Also includes brainstorm + plan docs. 108 new tests, all pass.	2026-06-21 15:03:27 +08:00
chiguyong	6dca9ba4f2	feat(admin): U3 — user CRUD + password reset + multi-department Add create_user method to LocalAuthProvider (bcrypt hash + INSERT, raises ValueError on duplicate username/email). Add UserService with 9 async methods: create/list/get/update/delete (soft)/reset_password/assign_department/remove_department/list_user_departments. reset_password revokes all sessions via SessionService. delete_user is soft (is_active=0, row preserved). Add 9 user endpoints to routes/admin.py: POST/GET/PATCH/DELETE users, reset-password, assign/remove department, list departments. All guarded by _require_admin. Tests: 40 unit + 37 integration = 77 new tests. Full admin suite 170 tests pass, no regressions.	2026-06-21 13:45:42 +08:00
chiguyong	67c0d67262	fix(auth,chat): P0 security fixes + stop-generation button + doc sync U1: whoami cold-start security — add is_active check (disabled users now get 401, not 200) and replace create_token_pair with create_access_token to avoid minting a discarded refresh token (token-amplification risk). U2: list_active_by_provider now filters expired sessions (expires_at > now) matching its docstring promise; previously only checked revoked = 0. U3: Fix asyncio.run() crash in test_revoke_other_user_session_returns_404 (converted to async). Add U1/U2 verification tests (disabled-user whoami, no-refresh-leak, expired-session filtering, provider filtering) and strengthen admin route tests (404 boundary, non-admin 403 on /admin/sessions). U4: Update CLAUDE.md/AGENTS.md Request Flow — CostAwareRouter 3-layer diagram replaced with actual RequestPreprocessor architecture (@board/@team prefix intercepts then @skill: prefix then trivial-input regex then default REACT). ExecutionMode list expanded to all 7 values. U5: Frontend stop-generation button — ChatInput.vue shows a stop button when isGenerating is true; chat store gains stopGeneration() that sends {type:"cancel"} over WebSocket (backend portal.py already handles cancel). Tests: 120 auth tests pass (unit + integration). ruff clean. vue-tsc clean.	2026-06-21 11:36:58 +08:00
chiguyong	aee7362665	feat(auth): U3/U4/U9 logout-others + whoami cold-start + admin UI + integration tests	2026-06-21 09:08:34 +08:00
chiguyong	9328451050	feat(auth): U7-U10 会话管理 UI + admin API + 测试修复 - U7: 前端 ActiveSessionsPanel + ChangePasswordPanel 组件 - U8: 用户会话管理（查看/撤销/改密）集成到 SettingsView - U9: 管理员会话管理 API + UserSessionsPanel + AdminApiClient - U10: 认证中间件支持 sid 会话验证 + legacy client 兼容 - 修复 test_auth.py 测试夹具：注入 SessionService 单例绑定测试 DB - 修复 wrong-password 断言大小写匹配 - ruff: 清理未使用导入	2026-06-21 08:48:25 +08:00
chiguyong	b418c3dc95	feat(auth): U3 SessionService + validation cache Adds the central business-logic layer for ``auth_sessions`` so routes, the auth middleware, and the admin endpoints can call a single service instead of touching the table directly. Server - session_service.SessionService: CRUD + lifecycle for auth_sessions. - create() enforces the per-user cap (default 10): the oldest active session is evicted with reason=session_cap_eviction. - rotate() swaps a refresh token, adds the old hash to the denylist, and raises SessionReuseDetected (revoking all sessions for the user) when the old token is replayed. - revoke() / revoke_by_refresh_token() / revoke_all_for_user() with explicit reasons: user_terminated, admin_revoked, password_changed, reuse_detected, session_cap_eviction. - touch() bumps last_active_at (called on /auth/whoami). - session_cache.SessionValidationCache: bounded LRU+TTL wrapper (default 30s/1k entries) around SessionService.is_session_valid. The middleware hits this on every request carrying a V2 sid claim; one SQLite round-trip per 30s per session instead of per request. - get_session_service() / get_validation_cache() module-level singletons overridable in tests via set_session_service() / set_validation_cache(). Tests - tests/unit/auth/test_session_service.py: 15 cases covering create/rotate/revoke/list/cap-eviction/reuse-detection/expired sessions. Refs: U3 in docs/plans/2026-06-20-002-feat-centralized-auth-token-persistence-plan.md	2026-06-21 01:58:30 +08:00
chiguyong	5ba1aceb96	feat(auth): U2 JWT sid/jti claims + refresh-token denylist Adds V2 JWT claim schema that closes the kicked-out window and enables refresh-token rotation with reuse detection. Server - jwt_utils.create_token_pair now takes ``session_id`` and ``remember_me`` kwargs. When ``session_id`` is provided, both tokens carry a ``sid`` claim and the access token also carries a ``jti`` claim; the refresh token's jti is intentionally absent (rotation uses the token hash). - New ``REFRESH_TOKEN_TTL_REMEMBER_ME = 30d`` (default 7d) selected by the ``remember_me`` flag. - ``verify_token`` now supports an optional ``expected_type`` filter (e.g. ``"access"`` / ``"refresh"``); when omitted, both types pass (used by /auth/whoami's cold-start path). - New ``auth.denylist`` module: ``InMemoryRecentlyRevoked`` (default for the Tauri sidecar / dev mode) and ``RedisRecentlyRevoked`` (multi- process server). Bounded LRU with auto-expiry via ``time.monotonic()``. Backwards-compat - Tokens issued before U2 (no ``sid``) are still accepted by ``verify_token``; validation falls through to the legacy ``user_sessions`` table via the U10 shim (next commit). Tests - tests/unit/auth/test_jwt_utils.py: 12 cases — V1/V2 claim presence, default + remember-me TTL, expected_type filter, expiry, wrong secret. - tests/unit/auth/test_denylist.py: 6 cases — add/contains, TTL expiry, LRU eviction, re-add refresh, clear, hash stability. Refs: U2 in docs/plans/2026-06-20-002-feat-centralized-auth-token-persistence-plan.md	2026-06-21 01:53:13 +08:00
chiguyong	2f55fc7434	feat(auth): U11 AuthProvider 抽象层 + auth_sessions schema 为未来对接集团 IdP（OIDC / SAML / LDAP / 飞书 / 钉钉 / 企微）留扩展点，同时落地 auth_sessions 表（V2 替代 user_sessions）。变更 - models.py: 新增 auth_sessions + auth_meta 表，V1→V2 数据回填 - providers/base.py: AuthProvider Protocol 接口契约 - providers/local.py: LocalAuthProvider 默认实现（封装 SQLite + bcrypt） - providers/oidc_stub.py: StubOIDCProvider 占位（NotImplementedError） - providers/__init__.py: get_auth_provider DI 工厂（lru_cache 单例） - providers/exceptions.py: AuthProviderError / InvalidCredentials / ProviderNotImplemented - providers/user.py: Provider-agnostic User 值对象 - tests/unit/auth/: 37 个测试覆盖 Protocol / DI / Local / OIDC 行为 auth_sessions.auth_provider 字段记录登录来源（local / oidc-stub / 未来 oidc-keycloak / saml / ldap），未来切 IdP 时审计可溯源。测试: 37 passed (providers) + 62 passed (auth 全集) + ruff check clean	2026-06-21 01:28:14 +08:00
chiguyong	cac9c73dd5	fix(routing): U1-U6 路由优化 + 修复方案 + 代码审查修复实现 6 个修复单元（U1-U6）并应用 ce-code-review 发现的 5 项安全修复。 ## U1: benchmark 超时阈值 - 按 difficulty 分级超时：easy=45s, medium=60s, hard=90s - 替换原单一 60s 硬编码 ## U2: OpenAICompatibleProvider httpx 超时 - 新增 timeout 参数（默认 120s），替换硬编码 60s - ProviderConfig.timeout 透传到 Provider - 新增 2 项单元测试 ## U3: 激活 QualityGate skill_match 校验 - BaseAgent._build_skill_context() 构造 skill_context - 在 base.py / tasks.py / runner.py 三处传入 QualityGate.validate() ## U4: 添加 disambiguation_keywords 字段 - IntentConfig 新增 disambiguation_keywords 字段 - 8 个 skill YAML 补充该字段 ## U5: 优化 RequestPreprocessor 路由正则 - 拆分 _FACTUAL_RE 为 CN/EN 双正则（中文无空格） - 新增 _MATH_RE / _TRANSLATION_RE 纯模式 - _TOOL_CONTEXT_RE 排除需要工具的实时查询 - 多行输入守卫 + 结尾标点支持 - 新增 21 项单元测试（共 40 项全通过） ## U6: 重新基准测试 - 真实 LLM benchmark：准确率 60% -> 93.3% - 4/5 通过，p50=40.8s，一致性=100% - 旧基线备份至 baseline_2026-06-17_old_arch.json ## ce-code-review 修复（5 项） - 修复 \s 字符类匹配换行符的安全隐患 - 添加事实/数学正则的结尾标点支持 - 修复 geo_optimizer.yaml 关键词重复 - 修复 _login_with_retry 不可达 return - 修复 real_llm_server fixture stderr_fh 资源泄漏测试：tests/unit/chat/ 63 项全通过，ruff 检查通过。	2026-06-20 19:31:49 +08:00
chiguyong	2e404cf1a0	test: 全面回测 + 真实 LLM E2E + 能力 benchmark + 问题修复 ## 测试结果 ### 后端 E2E（真实 LLM，真实服务器）— 13/13 通过 - tests/e2e/test_real_llm_e2e.py: 认证流程、LLM 网关、Chat API、WebSocket - 使用百炼 coding plan（qwen3.7-plus）真实 LLM，无 mock - 修复 SQLite 写锁竞争导致的间歇性 500（_login_with_retry 重试机制） ### 前端 E2E（Playwright + 真实 LLM）— 11/11 通过 - login.spec.ts (4): 登录流程、表单验证、token 存储 - chat.spec.ts (3): 真实 LLM 对话、消息渲染 - terminal.spec.ts (4): 终端面板、白名单管理 - 使用系统 Chrome（channel: 'chrome'）避免浏览器下载 ### Benchmark 能力评估（真实 LLM） - full 模式: 60% 准确率（5 用例 3 通过 2 超时） - fast 模式: 100% 准确率 - 失败用例: llm-001 (intent_understanding) / llm-004 (code_generation) 均为超时 ### 单元测试 - 174 个新测试通过 - 28 个预存失败（非本次架构变更引入） ## 代码修复 ### chat.ts: 消除 any 类型 TODO（line 406） - handleWsMessage 参数从 Record<string, any> 改为 WsServerMessage 联合类型 - 使用判别联合窄化，每个 case 分支直接访问类型化字段 - 移除通用 payload 变量，移除未使用的类型导入 - vue-tsc --noEmit 零错误 ### 基础设施修复 - playwright.config.ts: 修复 PROJECT_ROOT 路径（4 级而非 2 级） - playwright.config.ts: 用 uvicorn.run() 替代 agentkit serve（避免非 tty 交互提示） - helpers.ts: API_BASE 改为绝对 URL（Node.js fetch 不支持相对 URL） - helpers.ts: clearAuth 修复 page.evaluate 上下文问题（Node 常量传入浏览器） - helpers.ts: loginViaApi 添加 429 限流重试 + token 缓存 - login.spec.ts / terminal.spec.ts: 修复 Ant Design Vue autoInsertSpace 导致的选择器不匹配 - chat.spec.ts: .first() 改 .last() 避免拾取历史消息 - setup-test-user.py: .local 邮箱改为 .com（EmailStr 拒绝 .local TLD） - .gitignore: Playwright 产物路径限定到 frontend 目录 ### 依赖 - pyproject.toml: 补充 pyjwt, bcrypt, aiosqlite 依赖 - package.json: 添加 @playwright/test 依赖 ## 未完成计划清单（核对结果） ### 计划 001（聊天主区 VI 重梳）— active - U7: SkillsTab/SystemTab/KnowledgeTab 三子组件未实现 - U8: Preview 样例场景精修未完成 - U9: BoardMeetingModal VI 适配收尾未完成 - U10: 质量门与后端回归测试未完成 ### 计划 002（企业级 C/S 架构）— 方案评审中 - 8 个待决策问题未明确（卖给谁/部署位置/终端形态等） - P2/P3/P4 模块延后 ### 计划 003（企业级 C/S 演进）— completed - 7 项 Deferred（Web 管理台/技能市场/SSO/代码索引/多租户等） ### 代码 stub - DockerComputerUseSession: start/stop/screenshot/execute_action 4 个方法为 stub （需真实 Docker + VNC + Anthropic Computer Use API，属未来功能）	2026-06-20 18:22:10 +08:00
chiguyong	91f56ca663	feat: 企业级客户端-服务端架构 + 代码审查修复 ## 主要变更 ### 新增功能 - 企业级客户端-服务端架构（JWT 认证 + RBAC 权限 + 终端安全） - Tauri 桌面客户端与服务端配置同步 - 远程 LLM 网关（RemoteLLMProvider，支持 401 token 刷新重试） - 服务端终端 WebSocket（带管理员审批流程） - 终端白名单六层防御（黑名单 → shell 操作符检测 → 内置安全 → 全局/用户/会话白名单 → 危险检测） ### 代码审查修复（P0/P1/P2） - P0: 危险二进制（rm/docker 等）不再加入白名单，compute_whitelist_entry 返回 None - P1: 终端审批所有权追踪（_approval_owners dict）+ 会话清理防泄漏 - P1: 本地终端 WebSocket URL 补齐 JWT token - P1: 审计日志支持 terminal_mode 过滤 - P1: /system/resources 端点强制 SYSTEM_CONFIG 权限 - P1: RemoteLLMProvider 增加 401 token 刷新重试机制 - P1: auth/models.py 使用 Mapping[str, object] 替代 Any 类型 - P2: 终端授权依赖检查 is_active 账户状态 - 修复 app.py 未使用的 APIKeyAuthMiddleware 导入 ### 文档更新 - README.md: 新增第 16 章「企业级客户端-服务端架构」 - AGENTS.md / CLAUDE.md: 同步模块映射、路由表、前端页面 - 计划文档标记为 completed Closes: docs/plans/2026-06-19-003-feat-enterprise-client-server-evolution-plan.md	2026-06-20 06:48:18 +08:00
chiguyong	771756814f	fix(review): 修复代码审查发现的 P0/P1/P2 问题 P0 (Critical): - orchestrator: plan_update 事件 key 从 phases 改为 plan_phases 匹配前端契约 - orchestrator: team_formed 事件 payload 从 string[] 改为 IExpertInfo[] + plan_phases:[] P1 (High): - orchestrator: 新增 phase_failed 事件广播 (3处: gather 失败/_execute_phase 异常/_mark_dependents_failed 级联) - orchestrator: 新增 team_dissolved 事件广播 (3处: 正常完成/ValueError/Exception) - orchestrator: _mark_dependents_failed 改为 async 以支持事件广播 - orchestrator: gather 结果检查增加 asyncio.CancelledError (Python 3.11+ BaseException) - plan: PhaseStatus.RUNNING 值从 running 改为 in_progress 匹配前端联合类型 - team.ts: updatePhaseStatus 增加 plan_phases undefined 防御守卫 - chat.py: 增加 asyncio.CancelledError 处理 + team.dissolve() 移入 finally 块 P2 (Medium): - orchestrator: _get_isolated_agent 返回类型 Any 改为 ConfigDrivenAgent - orchestrator: _get_llm_gateway 返回类型 Any 改为 LLMGateway \| None - orchestrator: 依赖输出从 SharedWorkspace 读取改为内存 dep_phase.result (减少冗余 I/O) - plan: PlanPhase.to_dict() result 序列化为 string 匹配前端 ITeamPlanPhase.result 类型 - types.ts: expert_step.step 类型从 number 改为 string (后端发送 phase ID) Tests: 377 passed (experts + chat_team + expert_team)	2026-06-18 13:00:59 +08:00
chiguyong	871e20876f	test(integration): U9 重写集成测试覆盖流水线模式 - 33 个测试覆盖 F1-F16 全部场景 - F1: 手动团队组建 (@team:expert1,expert2) - F2: 默认团队模板 (@team:dev_team) - F3: 流水线串行执行 (3阶段 A→B→C) - F4: 并行阶段执行 (无依赖) - F5: 阶段失败和依赖失败传播 - F6: SharedWorkspace 数据传递 - F7: 上下文隔离 (独立 ConfigDrivenAgent) - F8: 事件序列验证 (team_formed → plan_update → phase_started → phase_completed → team_synthesis) - F9: TeamStatus.PLANNING 状态流转 - F10: 循环依赖检测 - F11: 无效专家引用 fallback - F12: LLM 分解失败 fallback - F13-F16: 去中心化协作、用户干预、团队解散、动态专家管理	2026-06-18 02:26:59 +08:00
chiguyong	1e818b507d	feat(server): U6 新增 _execute_team_collab 集成 @team 流水线到 WebSocket	2026-06-18 02:08:29 +08:00
chiguyong	ee6d16345c	feat(experts): U7 新增 5 个编程专家模板 + dev_team 团队模板 + ExpertTeamRouter 模板展开	2026-06-18 01:50:43 +08:00
chiguyong	0f8ea6e21e	feat(experts):重写 TeamOrchestrator 为流水线模式 + TeamStatus.PLANNING	2026-06-18 01:39:22 +08:00
chiguyong	1075598ebf	feat(experts):恢复 plan.py 阶段依赖图 (PlanPhase + topological_sort) - 新增 PhaseStatus 枚举 (PENDING/RUNNING/COMPLETED/FAILED) - 新增 PlanPhase 数据类 (id/name/assigned_expert/task_description/depends_on/status/result) - TeamPlan 新增 phases 字段及配套方法: get_phase/update_phase_status/topological_sort/get_ready_phases - topological_sort 使用 Kahn 算法返回执行层 (list[list[PlanPhase]])，检测循环依赖 - 保留 SubTask/MergeStrategy 向后兼容 - 新增 54 个单元测试覆盖线性/并行/循环依赖、无效引用、就绪阶段、序列化	2026-06-18 01:28:18 +08:00
chiguyong	28ca5b6001	fix(experts):修复 ExpertTeamRouter 模板引用 bug + 修复损坏的集成测试 U1: resolve_expert_configs 中使用 copy.deepcopy(template.config) 替代直接引用，防止 is_lead 赋值污染共享模板（与 BoardRouter 的 P1 修复保持一致）。 U2: 移除 test_expert_team.py 中对已移除类的导入（CollaborationPlan, MergeStrategy, ParallelType, PhaseStatus, PlanPhase），删除使用这些类的测试。保留不依赖已移除类的 8 个测试。U9 将重写为流水线模式测试。	2026-06-18 01:23:25 +08:00
chiguyong	dddcbd24e3	feat: 私董会讨论模式 + 回测集成 + WS持久化修复私董会讨论模式 (Board Meeting Mode): - BoardRouter: @board 前缀路由, 专家名验证, 模板回退 - BoardTeam: 讨论容器, 状态机 (FORMING->DISCUSSING->CONCLUDING->COMPLETED) - BoardOrchestrator: 多轮自主循环讨论引擎, 主持人小结, 停止命令检测 - 9个预设名人专家 YAML (马斯克/贝佐斯/张小龙/芒格等) - 前端 BoardStatusView 群聊式 UI + WebSocket 事件处理 - 后端 chat.py 集成 @board 路由到主聊天流程回测集成: - benchmark.py: 新增 board_meeting 维度 (18 tasks, 6 categories) - benchmark_dataset.py: 新增 BOARD_BENCHMARKS (11 E2E cases) - test_board_backtest.py: 66 个回测测试 (9 test classes) Bug 修复: - resolve_expert_configs: deep-copy 防止 is_lead 修改污染共享模板 - 所有专家名无效时回退到默认模板 - board_router: 非匹配路径 topic 未 strip - benchmark_dataset: board-name-invalid-001 输入修正 WebSocket 持久化修复: - chat.py: 三层防御机制确保任务结果不丢失 - chat store: 断线恢复逻辑部署配置: - Gitea Actions CI/CD workflow - docker-compose.deploy.yaml 部署编排 - scripts/deploy.sh 自动化部署脚本测试结果: 120 单元测试通过, 71 benchmark 测试 100% 通过, ruff 全部通过	2026-06-17 23:52:53 +08:00
chiguyong	5b5291c7e5	fix: WebSocket task persistence three-layer defense with security hardening Fix chat history empty content and task stops on refresh. Implements: result persistence on disconnect, task backgrounding via asyncio + EventQueue, frontend reconnection recovery. Security: fail-closed conversation_id ownership, asyncio.shield on CancelledError cleanup, async TaskStore shim, EventQueue subscriber limit, connection error resilience. 23 tests added.	2026-06-17 22:11:51 +08:00
chiguyong	1fbfd9d132	refactor: standardize benchmark with industry methodology (P/R/F1, multi-run, baseline)	2026-06-17 12:01:34 +08:00
chiguyong	d00995504d	feat: comprehensive capability benchmark and agentkit benchmark CLI	2026-06-17 11:28:09 +08:00
chiguyong	ecf87391a5	feat: integrate SQ/EQ into portal WebSocket and CLI (Phase 4) - app.py: initialize EventQueue + SubmissionQueue in app.state, close on shutdown - portal.py: emit unified events (task.created/started/completed/failed, turn.thinking/tool_call/tool_result/final_answer) to EQ alongside WebSocket messages - cli/chat.py: optional --event-queue flag for event emission - EQ is bypass-only: emit failures never affect WebSocket or CLI main flow - WebSocket message format unchanged (backward compatible) Tests: 650 passed, 0 failed, 4 skipped	2026-06-17 11:05:04 +08:00
chiguyong	bbedfff597	feat: hub-and-spoke experts, tiered tool injection, unified event model (U3/U7/U10)	2026-06-17 10:46:16 +08:00
chiguyong	200174c5c7	feat: SQLite persistence, verification loop, spec-driven execution Phase 2 of architecture optimization (U5/U6/U9): - U5: SqliteConversationStore with WAL mode + LRU cache (1000 convs) Replaces in-memory ConversationStore in portal.py Data survives server restarts (ref: Codex Thread persistence) - U6: VerificationLoop with verify/verify_and_retry Default commands: pytest + ruff check ReActEngine integration via verification_enabled flag New run_tests tool for LLM to invoke verification - U9: SpecManager for plan-as-contract (ref: Qoder Quest Mode) Plans persisted to .agentkit/specs/{spec_id}.yaml API: GET/PUT /api/v1/specs, POST /api/v1/specs/{id}/confirm PlanExecEngine emits spec_created event after plan generation Also fixes: portal skill_name routing, app.py SessionManager guard, test_telemetry CostAwareRouter removal, test_compression_config fixture	2026-06-17 10:45:20 +08:00
chiguyong	5374bc8501	refactor: eliminate routing layer, align with industry best practices Phase 1 of architecture optimization (U1/U2/U4/U8): - U1: Rename SimpleRouter to RequestPreprocessor, route() to preprocess() Eliminates misleading routing concept; LLM decides autonomously in REACT agent loop (matches Codex/Claude Code/Trae pattern) - U2: Delete CostAwareRouter, HeuristicClassifier, SemanticRouter (~700 lines removed). skill_routing.py: 1688 to 220 lines - U4: PlanExecEngine defaults to ReActStepExecutor, delete _LLMStepExecutor (pure LLM calls without tools = no execution capability) - U8: ReActEngine defaults to ContextCompressor(keep_recent=10) Supersedes plans 2026-06-15-002/003/004. New plan: 2026-06-16-006-refactor-architecture-optimization-evolution-plan.md	2026-06-17 10:44:40 +08:00
chiguyong	c4257591d4	refactor(router): replace CostAwareRouter with SimpleRouter and prompt-based tool calling	2026-06-16 03:31:05 +08:00
chiguyong	a27eed3714	fix(config): unify config loading chain and protect ${VAR} references - Settings API: reverse-resolve env vars to preserve ${VAR} refs in yaml, write new API keys to .env instead of agentkit.yaml, extract env_key from existing ${VAR} reference when updating providers - Onboarding: merge-update instead of overwrite when config exists, use config_arg to determine output path, .env merge instead of overwrite - Unified templates: bailian-coding provider name, full model_aliases, docker-compose with postgres, expanded .env.example - Optional ruamel.yaml for comment/format preservation in Settings API - clients.yaml: add _deep_resolve for ${VAR} env var references - All CLI commands use load_config_with_dotenv() consistently - Tests: mock find_config_path and CWD auto-discovery to avoid env leaks	2026-06-16 00:26:54 +08:00
chiguyong	11e2009cb8	feat(router): improve colloquial/mixed-lang routing, fix low-complexity IntentRouter bypass Key improvements: - Low-complexity queries (<0.3) now try IntentRouter keyword match before falling back to DIRECT_CHAT, fixing 0% F1 on keyword_match - SemanticRouter similarity_low lowered from 0.6 to 0.4 - Short text (<20 chars) uses effective_low = max(0.25, low - 0.15) - Short text with no semantic match forces LLM classify fallback - Added colloquial keywords to 7 skill YAMLs - Fixed code_reviewer.yaml output_schema placement - Fixed SemanticRouter build in e2e tests - Fixed base_url detection for bailian-coding API keys Results: keyword_match F1 0->60.87%, colloquial F1 0->100%, mixed_lang F1 0->100%	2026-06-15 23:54:57 +08:00
chiguyong	fa2a6dece2	feat(router): enable SemanticRouter + upgrade benchmark to L3/L5 - Enable SemanticRouter in agentkit.yaml (router.semantic.enabled: true) - Integrate SemanticRouter into e2e backtest (_build_real_components) - Add 8 new semantic test cases: 5 colloquial + 3 mixed-lang expressions - Add L3 output quality evaluation framework (LLM-as-Judge, 1-5 score) - Add L5 adaptive capability metrics (consistency rate from overfitting data) - Add OutputQualityObservation model and evaluate_output_quality() method - Report now includes L3 and L5 sections Results: 52 tests pass, description_match F1=66.67%, L5 adaptive rate=100%	2026-06-15 23:02:47 +08:00
chiguyong	e984b4c462	feat(router): optimize routing intelligence — ExecutionMode expansion, multi-candidate scoring, quality gate skill match - Expand ExecutionMode enum with REWOO/REFLEXION/PLAN_EXEC - Add _resolve_execution_mode() to respect skill.config.execution_mode - Rewrite IntentRouter._match_keywords() for multi-candidate scoring - Add QualityGate 5th dimension: skill_match validation with warning escalation - Calibrate HeuristicClassifier: low-complexity signals only when no high signals - Fix negation regex for Chinese text (avoid matching past punctuation) - Fix backtest mode_map normalization and .env loading - Add 61 unit tests (21 HeuristicClassifier + 14 IntentRouter + 13 QualityGate + 13 existing) Results: execution_mode_accuracy 9.09%→36.36%, skill_routing_F1 66.67%→77.78%	2026-06-15 22:43:13 +08:00
chiguyong	64d62a2b60	feat: autonomous task execution - connect PlanExecEngine + TeamOrchestrator U1: TeamOrchestrator._execute_phase real execution (Expert.agent.execute) U2: LLM-based merge strategies (BEST/VOTE/FUSION) with fallback U3: ReActStepExecutor replacing _LLMStepAgent for tool-enabled steps U4: SharedWorkspace integration for cross-phase/cross-execution state U5: GoalPlanner prompt tuning with few-shot and verb pattern matching U6: Replan-before-fallback in TeamOrchestrator U7: End-to-end validation tests for multi-step research tasks U8: WebSocket progress events (step_event_callback + new event types) Code review fixes: P0 response.strip fix, P1 competitor status check, milestone real impl, VOTE self-bias fix, confirmation_handler wiring, ExpertTeam public API, DRY _build_result_summaries, replan tests Also: geo_server.py refactor (ServerConfig.from_yaml), delete llm_config.yaml	2026-06-15 12:41:32 +08:00
chiguyong	99fe4c99f7	fix: comprehensive code review fixes + WS test stability	2026-06-15 08:17:34 +08:00
chiguyong	7384ecb03e	feat: Expert Team Mode — plan-execute collaboration with conversation UI Implements B+C hybrid Expert Team Mode with ExpertConfig, CollaborationPlan, TeamOrchestrator, ExpertTeamRouter, HandoffTransport, SharedWorkspace, and Expert wrapper. Frontend includes ExpertTeamView, ExpertMessage, PlanVisualization, team store, and WS event handlers. Code review fixes: sentinel-based close, per-phase retry, name validation, Vue component integration, teamState dedup, Redis reset, plan reassign, event_type validation, hmac timing-safe compare, message dedup, reactive updatePhases, O(1) phase lookup, iterative DFS, bounded Queue. 232 unit tests passing.	2026-06-14 22:20:14 +08:00
chiguyong	94c4c8b887	feat: accumulated frontend enhancements, docs, and static assets - Frontend view updates (ChatView, EvolutionView, SkillsView, etc.) - Updated portal routes and chat store - New frontend components (FilePreview, ToolCallCard, IconNav) - Updated static build assets - New test files (merged router, parallel tools, ReWOO fallback) - Documentation and brainstorm files - Codegraph and understand-anything artifacts	2026-06-14 16:35:01 +08:00
chiguyong	6e0e081f23	feat: gap closure sprint — dark theme, @-mention, LocalComputerUse, tests P0: U4 UsageStore + U5 CascadeStateStore independent test files (57 tests) P1: Dark theme — tokens.css [data-theme="dark"] + theme.ts Pinia store + TopNav toggle button + App.vue dynamic Ant Design theme P1: @-mention — MentionDropdown.vue + /skills/mention-suggest API + ChatInput integration with @ detection P2: LocalComputerUseSession — pyautogui + screencapture (replaces Docker stub) P2: Integration tests for gap closure (12 tests) Fix: create_cascade_state_store() now passes session_ttl to InMemory fallback	2026-06-14 16:16:50 +08:00
chiguyong	0ccef7be5c	feat: P0 production hardening — LLM cache, semantic routing, state persistence U1: LLM Cache Core (exact + semantic match, InMemory + Redis backends) U2: Cache integration into LLMGateway with CacheConfig U3: Semantic Router as Layer 1.5 in CostAwareRouter U4: UsageStore persistence (Redis Hash + InMemory fallback) U5: CascadeStateStore persistence (Redis INCR + InMemory TTL) U6: EvolutionStore interface unification (Protocol + PostgreSQL backend) U7: Configuration integration + E2E tests Code review fixes: - P0: date iteration bug (day>=28), semantic router index never built, Redis connection leak (per-call → persistent pool) - P1: cache degradation recovery, semantic_search degradation, double miss counting, asyncio.Lock for PG init, LIMIT on queries, __import__ anti-pattern → _utcnow() - P2: InMemory TTL cleanup, embedding preservation on put(), data TTL = max(exact_ttl, semantic_ttl)	2026-06-14 15:16:00 +08:00
chiguyong	09698d7a06	feat: frontend productization with code review fixes - Workflow: visual canvas, undo/redo, drag-and-drop, real-time execution WebSocket - Evolution: dashboard, ECharts metrics, experience timeline, pitfall warnings, usage panel - KB: source CRUD, document upload, search test - Terminal: interactive PTY WebSocket, whitelist security - Security: hmac.compare_digest, API key auth on all endpoints, whitelist bypass fix - Fixes: ECharts async init, WebSocket intentional disconnect, TOCTOU race, Pydantic models	2026-06-13 01:29:58 +08:00
chiguyong	5ef08a3b30	fix(review): comprehensive P0-P2 code review fixes	2026-06-12 22:18:25 +08:00
chiguyong	a36bc3d1c1	feat: optimize chat response speed for sub-1s first token latency - Add HeuristicClassifier to replace LLM quick_classify with zero-cost local heuristic (keyword/length/code-pattern scoring), gated by router.classifier config (default: heuristic) - Add parallel tool execution in ReActEngine via asyncio.gather for multiple independent tool_calls, gated by parallel_tools param - Add AsyncWriteQueue for non-blocking session persistence with WAL buffer, gated by async_writes param on SessionManager - Add httpx.Limits connection pool config to all LLM providers - Add router config section to ServerConfig and agentkit.yaml - All optimizations have config switches for safe rollback	2026-06-12 13:15:06 +08:00
chiguyong	8c365486e2	fix(pipeline): address code review findings for adversarial loop Critical: - C1: Add verifier_timeout_seconds for independent Verifier timeout - C2: Verifier parse failure raises RuntimeError instead of dead-loop Major: - M1: Inject previous_output into Worker retry context - M2: Add Pydantic ge/le constraint on ReviewFeedback.score - M3: Use Literal type for feedback_mode enum validation - M4: Use Literal types for ReviewIssue severity and category - M5: Merge error messages when escalation agent also fails Tests: 8 new test cases added (19 total), all passing	2026-06-12 10:02:37 +08:00
chiguyong	ddc735b078	test(pipeline): add coding harness integration tests 5 passing tests covering: - Pipeline config loading and validation - Review stage adversarial config verification - Stage dependencies validation - Code reviewer skill config and output schema 3 skipped tests (complex mock sequencing covered by unit tests)	2026-06-12 09:42:21 +08:00
chiguyong	3392413614	test(pipeline): add adversarial loop unit tests 11 test cases covering: - PipelineSchemaAdversarial (4): verifier fields, backward compat, serialization, state tracking - AdversarialExecution (3): no verifier passthrough, first round pass, max rounds exhausted - FeedbackContext (3): structured+natural, structured, natural modes - Escalation (1): no escalation configured	2026-06-12 09:40:19 +08:00
chiguyong	32c800d1e4	fix: portal routing + response speed + IME input 1. Portal unified routing: ws_chat now uses CostAwareRouter uniformly (handles Layer 0/1/2), replacing direct IntentRouter calls. Greeting/chat_mode requests skip IntentRouter LLM call entirely. 2. Response speed: greeting & simple chat now use direct LLM call (no ReAct loop), zero-cost Layer 0 detection. 3. IME input fix: use e.isComposing (native browser property) instead of compositionstart/end for Enter key detection. 4. Test: fix InMemoryMessageBus.request() parameter name timeout -> timeout_seconds.	2026-06-11 21:30:25 +08:00
chiguyong	d47f279887	fix: resolve code review issues from deferred improvements 1. InMemoryMessageBus.request(): fix param name (timeout→timeout_seconds) to match ABC 2. InMemoryMessageBus: track consumer tasks, cancel on unsubscribe 3. InMemoryMessageBus: _try_resolve_pending() in queue consumer path 4. evolve_soul(): use "default" category when patterns is empty 5. quick_classify(): use delimiter-based prompt to mitigate injection risk 6. Use asyncio.get_running_loop() instead of deprecated get_event_loop()	2026-06-11 13:49:02 +08:00
chiguyong	79eb8469f9	fix: address remaining code review issues - AlignmentGuard: direction-aware constraint checking (negation/affirmation detection) instead of simple substring matching to reduce false positives - Reflexion: extract actual token usage from LLM response instead of hardcoded 1 - MemoryTool: protect version/history sections from update_soul modification - Fix AsyncMock warnings for sync find_best_agent method	2026-06-11 00:14:11 +08:00
chiguyong	bba394be38	fix(marketplace): address code review findings - Fix str.format() crash when user input contains curly braces - Fix Layer 2 passing str to find_best_agent (expects list[str]) - Fix AlignmentGuard fail-open on LLM audit failure (now fail-closed) - Fix _config_reload_lock not initialized in create_app() - Fix evolve_soul redundant reflector.reflect() call (reuse existing reflection) - Fix test mocks using AsyncMock for sync find_best_agent method - Remove unused _COMPLEXITY_CLASSIFY_PROMPT constant	2026-06-10 19:21:40 +08:00
chiguyong	8713636d50	feat(marketplace): add Phase B/C - CostAwareRouter, OrganizationContext, AlignmentGuard, Soul Evolution, Auction, Server Integration Phase B: - U1: CostAwareRouter with 3-layer routing (rule/LLM/capability matching) - U6: OrganizationContext with agent profiles and capability-based discovery - U7: AlignmentGuard with constraint injection and cascade detection Phase C: - U8: Soul dynamic evolution with version tracking and reflection-triggered updates - U9: Auction mechanism as optional advanced routing mode - U10: Server integration + end-to-end integration tests 250 new tests passing across all units.	2026-06-10 19:09:02 +08:00
chiguyong	5b42487d8a	feat(core): add ReWOO, Plan-and-Execute, Reflexion execution engines Phase A of Multi-Agent Marketplace architecture: - ReWOOEngine: plan-all-then-execute pattern for parallel data fetch - PlanExecEngine: adapter wrapping GoalPlanner+PlanExecutor+PipelineReplanner - ReflexionEngine: ReAct + Evaluate + Reflect + Retry for high-precision tasks - SkillConfig: extend VALID_EXECUTION_MODES with rewoo/plan_exec/reflexion - ConfigDrivenAgent: add _handle_rewoo/_handle_plan_exec/_handle_reflexion routes - 5 professional agent YAML configs with layered model defaults - 107 unit tests passing	2026-06-10 17:08:48 +08:00
chiguyong	6852dfe892	fix(security,reliability): resolve all P2 findings from code review	2026-06-10 15:05:40 +08:00
chiguyong	658e188939	fix(review): resolve P0/P1 findings from final code review	2026-06-10 09:57:29 +08:00
chiguyong	1d1805753c	fix: resolve key P2 findings from code review - Shell whitelist: use exact binary match instead of startswith - Shell audit log: use deque(maxlen=10000) to cap memory - Terminal history: use deque(maxlen) for O(1) eviction - Path optimizer: cap _pending_paths at 50 entries per task_type - Pitfall detector: only add tips to matching steps, not all - Experience store: handle non-numeric _parse_time_window input - Extract shared is_safe_url() to utils/security.py (DRY) - Workflow condition evaluator: handle float() ValueError	2026-06-10 09:01:23 +08:00
chiguyong	b46a10973f	fix(tests): clean up test_shell_tool.py lint issues	2026-06-10 08:46:35 +08:00
chiguyong	9646b0f0dd	fix(tests): update test_shell_tool.py to match new ShellTool API	2026-06-10 08:22:15 +08:00
chiguyong	7874e875af	merge: integrate feat/agentkit-phase8-chat-adaptive (chat/gui commands + GUI mode) Restores agentkit chat, agentkit gui CLI commands, onboarding wizard, and GUI mode (AGENTKIT_GUI_MODE) with static file serving. Resolves merge conflicts in orchestrator.py, app.py, tools/__init__.py, shell.py.	2026-06-10 07:44:06 +08:00
chiguyong	9e9f1314f6	fix(security): resolve all P0/P1 findings from code review	2026-06-10 07:12:41 +08:00
chiguyong	b34f74f598	feat(phase6): implement end-to-end enterprise scenario validation (U15) - Add goal-driven agent skill config and pipeline config - Add 9 E2E integration tests covering all 7 capabilities: - SC1: Goal-driven SEO analysis (GoalPlanner→PlanExecutor→PlanChecker→ExperienceStore) - SC2: Knowledge Q&A with system operation (MultiSourceRAG) - SC3: Workflow with approval (WorkflowStore + approval node) - SC4: Self-evolution experience accumulation (ExperienceStore→PitfallDetector→PathOptimizer) - SC5: Parallel execution efficiency verification - SC6: Skill registry integration (capabilities, versions, health) - Cross-capability: Plan+Experience+Pitfall, Review+Experience, RAG+Workflow - All 2472 tests passing (9 integration + 2463 unit)	2026-06-10 01:38:28 +08:00
chiguyong	c606ffa64a	feat(phase5): implement management pages, evolution dashboard, and workflow editor (U13b/U13c/U14)	2026-06-10 01:29:01 +08:00
chiguyong	a1deeecede	feat(phase5): implement Vue3 portal foundation with chat interface and routing (U13a) - Add Portal API routes: chat, stream, capabilities, conversations, WebSocket - Add ConversationStore for in-memory conversation management - Add CAPABILITY_CATEGORIES mapping for 8 capability types - Create Vue3 SPA with TypeScript, Pinia, Vue Router, Ant Design Vue - Implement ChatView with message bubbles, input, sidebar, WebSocket support - Add side navigation skeleton for all 8 capability sections - Add placeholder views for workflow, knowledge, skills, terminal, etc. - 31 backend tests passing	2026-06-10 01:06:48 +08:00
chiguyong	901e4d9d0a	feat(phase4): implement Computer Use integration (U12) - ComputerUseTool: Anthropic API + fallback chain (API→Session→ShellTool→AskHuman) - ComputerUseSession: Docker sandbox + InMemory test session - ComputerUseRecorder: action recording, replay, and persistence 89 new tests passing. Degradation chain verified.	2026-06-10 00:54:31 +08:00
chiguyong	c99aee1423	feat(phase3): implement knowledge base and RAG enhancement (U9-U11) - U9: LocalDocumentIngestion - multi-format doc parsing and chunking - U10: ExternalKBAdapters - Feishu/Confluence/GenericHTTP adapters - U11: MultiSourceRAG - multi-source retrieval with source tracing KnowledgeBase protocol defined (KTD-7). 145 new tests passing.	2026-06-10 00:45:17 +08:00
chiguyong	e3d4f811dd	feat(phase2): implement self-evolution and smart terminal (U6-U8) - U6: PitfallDetector - detect historical failure patterns and warn - U7: PathOptimizer - discover and update optimal execution paths - U8: TerminalSession - session state, PTY interactive, output parsing 160 new tests passing. ShellTool enhanced with session_id support.	2026-06-10 00:22:36 +08:00
chiguyong	fd4a811929	feat(phase1): implement core kernel and experience foundation (U1-U5) - U1: GoalPlanner - structured goal decomposition wrapping _decompose_task() - U2: PlanExecutor - parallel execution with retry/skip/replace strategies - U3: PlanChecker - quality gate + review + experience writing - U4: Skill spec upgrade - dependencies, capabilities, version management - U5: ExperienceStore - PostgreSQL+pgvector task experience storage 208 new tests passing, fully backward compatible.	2026-06-09 23:57:03 +08:00
chiguyong	31bd3b126c	feat(phase8): chat adaptive enhancements, pipeline reflection, search tools upgrade - Enhanced chat CLI with adaptive mode and session management - Added pipeline reflection and schema extensions - Upgraded BaiduSearch and WebSearch tools with advanced capabilities - Expanded server routes for skills and chat - Added session store enhancements - New chat module and pipeline reflection support	2026-06-09 23:18:06 +08:00
chiguyong	045fecd4ce	feat(tools): add ShellTool + WebSearchTool, memory system, onboarding wizard, chat mode - ShellTool: safe command execution with allowlist, blocked patterns (regex), timeout, output truncation - WebSearchTool: multi-backend search with Tavily → Serper → DuckDuckGo Lite fallback - MemoryTool: agent-callable tool with add/replace/remove/read actions - MemoryStore/MemoryFile: file-based memory (SOUL.md, USER.md, MEMORY.md, DAILY.md) - Onboarding wizard: provider selection, API key, model selection, agent personality - Chat mode: interactive CLI with streaming, memory injection, tool integration - Add 百炼 Coding Plan provider with 10 models - 102 unit tests (34 new for ShellTool + WebSearchTool)	2026-06-09 01:06:45 +08:00
chiguyong	9874a4aac0	test: add Phase 8 integration tests for Chat + Adaptive + Multi-Agent (U8) End-to-end integration tests covering session lifecycle, adaptive pipeline, multi-agent communication via MessageBus, and config serialization.	2026-06-08 01:17:04 +08:00
chiguyong	45283d31e8	feat(core): integrate MessageBus into Orchestrator and AgentPool (U7) - Orchestrator accepts optional message_bus parameter; workers publish task.progress messages via MessageBus after each subtask execution - AgentPool accepts optional message_bus; auto-registers agents on create and auto-unregisters on remove - app.py initializes MessageBus from config and injects into AgentPool - ServerConfig adds bus configuration field - 5 new tests, all passing	2026-06-08 00:03:40 +08:00
chiguyong	13d6e74099	feat(bus): add MessageBus abstraction layer with InMemory + Redis Streams (U6) - AgentMessage: message model with sender/recipient/topic/payload/correlation_id - MessageBus Protocol: publish/subscribe/unsubscribe/request/broadcast/health_check - InMemoryMessageBus: asyncio.Queue-based implementation for testing - RedisMessageBus: Redis Streams (XADD/XREADGROUP) implementation with consumer groups, message acknowledgment, and dead letter queue - create_message_bus() factory with graceful Redis→InMemory fallback - Request-response pattern via correlation_id + asyncio.Future - 13 new tests, all passing	2026-06-07 23:58:16 +08:00
chiguyong	88d8298871	feat(core): add Orchestrator adaptive task decomposition (U5) - execute_adaptive(): iterative execute→evaluate→re-decompose loop - OrchestratorConfig: adaptive, max_iterations, quality_threshold - _evaluate_quality(): LLM-based or rule-based quality scoring (0-1) - _reexecute_failed(): preserves completed subtask results, retries failed ones with improvement feedback injected into input_data - OrchestrationResult.metadata field for tracking iteration history - 10 new tests, all passing	2026-06-07 23:50:54 +08:00
chiguyong	7054ac02b6	feat(tools): add AskHumanTool + token streaming in ReAct execute_stream - AskHumanTool: Human-in-the-Loop tool for Chat mode, pushes questions via WebSocket callback and waits for user reply via asyncio.Future - Token streaming: execute_stream() now uses chat_stream() instead of chat(), yielding token-type ReActEvents for each StreamChunk - _build_response_from_stream() static method constructs LLMResponse from accumulated stream data - Export AskHumanTool from tools/__init__.py - 12 new tests (7 AskHumanTool + 5 token streaming), all passing	2026-06-07 23:40:43 +08:00
chiguyong	6013d5189b	feat(chat): add Chat API routes with REST + WebSocket bidirectional communication	2026-06-07 22:49:26 +08:00
chiguyong	493187782c	feat(session): add Session/Message models and SessionManager with InMemory/Redis stores	2026-06-07 22:43:14 +08:00
chiguyong	b34b06724d	fix(agentkit): resolve all P0/P1/P2/P3 issues from code review	2026-06-07 22:05:18 +08:00
chiguyong	bad66445ff	feat(compression): U6 GEO Pipeline compression integration tests and config Add GEO Pipeline end-to-end compression integration tests with MockHeadroomCompressor. Add compression configuration section to llm_config.yaml with headroom and summary mode examples.	2026-06-07 18:20:41 +08:00
chiguyong	9c04362dba	feat(compression): U5 HeadroomRetrieveTool for CCR cache retrieval Add HeadroomRetrieveTool that allows LLM to retrieve original uncompressed data from CCR cache via Function Calling. Auto-registered when HeadroomCompressor is active and available.	2026-06-07 18:20:17 +08:00
chiguyong	286804792d	feat(compression): U4 ServerConfig compression field and Agent injection Add compression config to ServerConfig (following telemetry pattern), create compressor in create_app, pass through AgentPool to ConfigDrivenAgent, and inject into ReActEngine.execute() calls.	2026-06-07 18:20:05 +08:00
chiguyong	fcb4fb33f3	feat(compression): U3 ReAct engine tool result compression and incremental compress Extend _build_tool_result_message to accept compressor parameter for tool output compression. Add _should_compress helper for token budget checking. Add incremental compression within ReAct loop when conversation exceeds threshold.	2026-06-07 18:19:53 +08:00
chiguyong	ea705b979b	feat(compression): U2 HeadroomCompressor with SmartCrusher and CCR cache Add HeadroomCompressor implementing CompressionStrategy Protocol with content-type routing (JSON→SmartCrusher, code→CodeCompressor), CCR reversible compression cache, and graceful degradation when headroom-ai is not installed.	2026-06-07 18:19:41 +08:00
chiguyong	5d3a5f2bf3	feat(compression): U1 CompressionStrategy Protocol and create_compressor factory Add runtime-checkable CompressionStrategy Protocol with compress(), compress_tool_result(), and is_available() methods. Add compress_tool_result and is_available to existing ContextCompressor. Add create_compressor() factory function with headroom/summary provider routing and ImportError fallback.	2026-06-07 18:19:27 +08:00
chiguyong	239009357a	feat(telemetry): U7 OpenTelemetry integration with zero-dependency no-op pattern Add telemetry module with tracing (agent/tool/llm/pipeline_step spans), metrics (5 histograms/counters), and setup with optional OTLP exporters. Uses no-op pattern when opentelemetry not installed. GenAI Semantic Conventions for LLM spans. Integrated into ReactEngine, LLMGateway, ToolBase, and FastAPI app.	2026-06-07 17:26:21 +08:00
chiguyong	03a5167366	feat(pipeline): U6 step-level retry with exponential backoff and saga compensation Add StepRetryPolicy with jitter-based exponential backoff, SagaOrchestrator with LIFO compensation pattern, integrate retry_policy and compensate fields into PipelineStage/PipelineStep schema, add GEO pipeline compensation definitions for all 7 steps.	2026-06-07 17:26:07 +08:00
chiguyong	4db637cd4f	feat(pipeline): U5 state persistence with Redis hot + PG cold dual-write Add PipelineStateMemory/Redis/PG backends, PipelineStateManager with Redis Sorted Set hot state + PostgreSQL JSONB cold persistence. Integrated into PipelineEngine with state persistence calls at each step transition.	2026-06-07 17:25:52 +08:00
chiguyong	2e547e345a	feat(geo): U4 GEO skill tool binding with BaiduSearch and E2E tests Add BaiduSearchTool (API mode + scraping fallback), bind tools to GEO skill YAML configs (baidu_search, web_crawl, schema_extract, schema_generate), extend geo_full_pipeline with generate_content and deai steps, add 36 E2E integration tests.	2026-06-07 17:25:37 +08:00
chiguyong	9ec1740047	feat(tools): U3 built-in Python tools - WebCrawl, SchemaExtract, SchemaGenerate Add WebCrawlTool (Crawl4AI wrapper with graceful degradation), SchemaExtractTool (extruct-based Schema.org extraction), and SchemaGenerateTool (JSON-LD generation with optional pydantic-schemaorg validation). All tools work without optional dependencies.	2026-06-07 17:25:24 +08:00

1 2 3 4

183 Commits