Commit Graph

205 Commits

Author SHA1 Message Date
chiguyong 09feca3307 feat(admin): U7 — usage dashboard + quota enforcement
UsageRecord extended with user_id + department_id (backward compatible).
UsageStore Protocol extended: record() accepts user_id/department_id,
get_usage() accepts filters, new get_usage_by_user/department methods.
RedisUsageStore uses versioned keys (v2) for new records.

LLMGateway.chat()/chat_stream() accept user_id, department_ids, db_path.
Quota check before provider call: model whitelist + token limit + cost
limit (daily). Multi-department uses strictest-wins (any exceed → reject).
QuotaExceededError → 429 at route layer.
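
The strictest-wins rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the gateway's actual code: `DepartmentQuota`, `check_quotas`, the field names, and the `>=` threshold are hypothetical. Only `QuotaExceededError` and the three checks (model whitelist, daily token limit, daily cost limit) come from the commit message.

```python
from dataclasses import dataclass
from typing import Optional


class QuotaExceededError(Exception):
    """Raised when any department's quota rejects the request."""


@dataclass(frozen=True)
class DepartmentQuota:
    # Hypothetical shape; None means "no limit configured".
    department_id: str
    allowed_models: Optional[frozenset] = None
    daily_token_limit: Optional[int] = None
    daily_cost_limit: Optional[float] = None


def check_quotas(quotas, model, tokens_today, cost_today) -> None:
    """Strictest-wins: a single failing department rejects the call."""
    for q in quotas:
        dept = q.department_id
        if q.allowed_models is not None and model not in q.allowed_models:
            raise QuotaExceededError(f"{dept}: model {model!r} is not whitelisted")
        # Assumption: usage at or above the limit counts as exceeded.
        if q.daily_token_limit is not None and tokens_today.get(dept, 0) >= q.daily_token_limit:
            raise QuotaExceededError(f"{dept}: daily token limit reached")
        if q.daily_cost_limit is not None and cost_today.get(dept, 0.0) >= q.daily_cost_limit:
            raise QuotaExceededError(f"{dept}: daily cost limit reached")
```

A route handler would catch `QuotaExceededError` and map it to HTTP 429, as the commit describes.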

UsageService: summary, timeseries, by-model, top-users, export (CSV/JSON).
5 new admin endpoints under /admin/usage/*.

llm_gateway.py routes pass DepartmentContext + db_path to gateway,
catch QuotaExceededError → 429 (JSON for /chat, SSE error for /stream).

84 new tests. 441 admin+usage tests pass, no regressions.
2026-06-21 17:23:20 +08:00
chiguyong fd7f6816b8 feat(admin): U6 — Skill & KB management endpoints + department binding
SkillService: enable/disable (persisted in skill_states table, schema
v4), import from YAML (with path traversal + name validation), reload
from file, update config. GET /skills now filters disabled skills.

KbService: list/upload/delete documents with department_id binding.
Added department_id field to KnowledgeSource + UploadedDocument.
Department visibility: (bound to user depts) ∪ (global = None).
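
The visibility rule above (documents bound to one of the user's departments, plus global documents whose `department_id` is `None`) might look like the sketch below. `Doc` and `visible_documents` are hypothetical names used only for illustration; the real `UploadedDocument` model is not shown in this log.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Doc:
    name: str
    department_id: Optional[str] = None  # None marks a global document


def visible_documents(docs: Iterable[Doc], user_department_ids: Iterable[str]) -> list:
    """Return docs bound to the user's departments plus global docs."""
    allowed = set(user_department_ids)
    return [d for d in docs if d.department_id is None or d.department_id in allowed]
```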

10 new admin endpoints: skill enable/disable/import/reload/update,
KB documents CRUD, source sync/rebuild. All guarded by _require_admin.

Implemented reload stub in skill_management.py (was no-op).

54 new tests (26 unit + 28 integration). Fixed 4 pre-existing lint
errors. 357 admin tests pass, no regressions.
2026-06-21 16:19:51 +08:00
chiguyong 980919fc95 feat(admin): U5 — LLM config admin endpoints + department quotas
QuotaService: set/get/list/delete quotas, check_quota (hard reject),
is_model_allowed. JSON-serialized limit_value, upsert with ON CONFLICT.

LlmConfigService: provider CRUD + set_api_key + fallback management.
fcntl.flock file lock prevents concurrent YAML writes. Reuses
settings.py helpers (_read_yaml_config, _write_yaml_config,
_write_env_var, _mask_api_key).
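
Serializing config writes with `fcntl.flock` could be sketched as below. This assumes a Unix host (the `fcntl` module is not available on Windows) and a sidecar `.lock` file. `exclusive_lock` and `write_config_text` are hypothetical helpers, not the `settings.py` functions named in the commit.

```python
import fcntl
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def exclusive_lock(lock_path: Path):
    """Hold an exclusive advisory lock on lock_path for the with-block."""
    with open(lock_path, "a") as fh:
        fcntl.flock(fh.fileno(), fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            yield
        finally:
            fcntl.flock(fh.fileno(), fcntl.LOCK_UN)


def write_config_text(config_path: Path, text: str) -> None:
    """Write the config while holding the lock so writers do not interleave."""
    lock_path = config_path.with_name(config_path.name + ".lock")
    with exclusive_lock(lock_path):
        config_path.write_text(text, encoding="utf-8")
```

The lock is advisory: it only protects against other writers that take the same lock.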

11 new admin endpoints: provider CRUD, api-key, fallback CRUD,
department quotas CRUD. All guarded by _require_admin.

93 new tests (30 quota unit + 32 llm-config unit + 31 integration).
2026-06-21 15:03:38 +08:00
chiguyong ad65f7a8d7 feat(admin): U1+U2+U4 — schema v3, department service, context filtering
U1: Bump _SCHEMA_VERSION to 3, add 5 department tables (departments,
user_departments, department_skill_bindings, department_kb_bindings,
department_quotas) + 5 ORM models + helpers.

U2: DepartmentService (12 async methods: CRUD + bind/unbind skill/KB +
count_users). Mount admin_router in app.py. 36 unit + 28 integration tests.

U4: DepartmentContext FastAPI dependency (per-route, admin bypasses
filtering). filter_skills_by_department / filter_kb_sources_by_department
helpers. Applied to GET /skills and GET /kb-management/* routes.
15 integration tests for department isolation.

Also includes brainstorm + plan docs. 108 new tests, all pass.
2026-06-21 15:03:27 +08:00
chiguyong 6dca9ba4f2 feat(admin): U3 — user CRUD + password reset + multi-department
Add create_user method to LocalAuthProvider (bcrypt hash + INSERT,
raises ValueError on duplicate username/email).

Add UserService with 9 async methods: create/list/get/update/delete
(soft)/reset_password/assign_department/remove_department/
list_user_departments. reset_password revokes all sessions via
SessionService. delete_user is soft (is_active=0, row preserved).

Add 9 user endpoints to routes/admin.py: POST/GET/PATCH/DELETE users,
reset-password, assign/remove department, list departments. All
guarded by _require_admin.

Tests: 40 unit + 37 integration = 77 new tests. Full admin suite
170 tests pass, no regressions.
2026-06-21 13:45:42 +08:00
chiguyong 67c0d67262 fix(auth,chat): P0 security fixes + stop-generation button + doc sync
U1: whoami cold-start security — add is_active check (disabled users
now get 401, not 200) and replace create_token_pair with create_access_token
to avoid minting a discarded refresh token (token-amplification risk).

U2: list_active_by_provider now filters expired sessions (expires_at > now)
matching its docstring promise; previously only checked revoked = 0.

U3: Fix asyncio.run() crash in test_revoke_other_user_session_returns_404
(converted to async). Add U1/U2 verification tests (disabled-user whoami,
no-refresh-leak, expired-session filtering, provider filtering) and
strengthen admin route tests (404 boundary, non-admin 403 on /admin/sessions).

U4: Update CLAUDE.md/AGENTS.md Request Flow — CostAwareRouter 3-layer
diagram replaced with actual RequestPreprocessor architecture (@board/@team
prefix intercepts then @skill: prefix then trivial-input regex then default
REACT). ExecutionMode list expanded to all 7 values.

U5: Frontend stop-generation button — ChatInput.vue shows a stop button
when isGenerating is true; chat store gains stopGeneration() that sends
{type:"cancel"} over WebSocket (backend portal.py already handles cancel).

Tests: 120 auth tests pass (unit + integration). ruff clean. vue-tsc clean.
2026-06-21 11:36:58 +08:00
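chiguyong 9328451050 feat(auth): U7-U10 session management UI + admin API + test fixes
- U7: frontend ActiveSessionsPanel + ChangePasswordPanel components
- U8: user session management (view/revoke/change password) integrated into SettingsView
- U9: admin session management API + UserSessionsPanel + AdminApiClient
- U10: auth middleware supports sid session validation + legacy client compatibility
- Fix test_auth.py fixtures: inject a SessionService singleton bound to the test DB
- Fix case matching in the wrong-password assertion
- ruff: remove unused imports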
2026-06-21 08:48:25 +08:00
chiguyong b418c3dc95 feat(auth): U3 SessionService + validation cache
Adds the central business-logic layer for ``auth_sessions`` so routes,
the auth middleware, and the admin endpoints can call a single service
instead of touching the table directly.

Server
- session_service.SessionService: CRUD + lifecycle for auth_sessions.
  - create() enforces the per-user cap (default 10): the oldest
    active session is evicted with reason=session_cap_eviction.
  - rotate() swaps a refresh token, adds the old hash to the
    denylist, and raises SessionReuseDetected (revoking all sessions
    for the user) when the old token is replayed.
  - revoke() / revoke_by_refresh_token() / revoke_all_for_user()
    with explicit reasons: user_terminated, admin_revoked,
    password_changed, reuse_detected, session_cap_eviction.
  - touch() bumps last_active_at (called on /auth/whoami).
- session_cache.SessionValidationCache: bounded LRU+TTL wrapper
  (default 30s/1k entries) around SessionService.is_session_valid.
  The middleware hits this on every request carrying a V2 sid claim;
  one SQLite round-trip per 30s per session instead of per request.
- get_session_service() / get_validation_cache() module-level
  singletons overridable in tests via set_session_service() /
  set_validation_cache().
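
The rotate-with-reuse-detection behavior listed above can be sketched with an in-memory store. This is an assumption-laden illustration: `InMemorySessions`, its dictionaries, and the SHA-256 hashing are hypothetical, and the real `SessionService` persists to the `auth_sessions` table. Only the names `rotate`, `revoke_all_for_user`, and `SessionReuseDetected` come from the commit message.

```python
import hashlib
import secrets


class SessionReuseDetected(Exception):
    """An already-rotated refresh token was presented again."""


class InMemorySessions:
    def __init__(self) -> None:
        self._active = {}   # token hash -> user id
        self._rotated = {}  # hashes of tokens already swapped out

    @staticmethod
    def _hash(token: str) -> str:
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def create(self, user_id: str) -> str:
        token = secrets.token_urlsafe(32)
        self._active[self._hash(token)] = user_id
        return token

    def revoke_all_for_user(self, user_id: str) -> None:
        self._active = {h: u for h, u in self._active.items() if u != user_id}

    def rotate(self, old_token: str) -> str:
        old_hash = self._hash(old_token)
        if old_hash in self._rotated:
            # Replay of a rotated token: treat as theft, revoke everything.
            user_id = self._rotated[old_hash]
            self.revoke_all_for_user(user_id)
            raise SessionReuseDetected(user_id)
        user_id = self._active.pop(old_hash, None)
        if user_id is None:
            raise KeyError("unknown refresh token")
        self._rotated[old_hash] = user_id
        return self.create(user_id)
```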

Tests
- tests/unit/auth/test_session_service.py: 15 cases covering
  create/rotate/revoke/list/cap-eviction/reuse-detection/expired
  sessions.

Refs: U3 in docs/plans/2026-06-20-002-feat-centralized-auth-token-persistence-plan.md
2026-06-21 01:58:30 +08:00
chiguyong 5ba1aceb96 feat(auth): U2 JWT sid/jti claims + refresh-token denylist
Adds V2 JWT claim schema that closes the kicked-out window and enables
refresh-token rotation with reuse detection.

Server
- jwt_utils.create_token_pair now takes ``session_id`` and ``remember_me``
  kwargs.  When ``session_id`` is provided, both tokens carry a ``sid``
  claim and the access token also carries a ``jti`` claim; the refresh
  token's jti is intentionally absent (rotation uses the token hash).
- New ``REFRESH_TOKEN_TTL_REMEMBER_ME = 30d`` (default 7d) selected by
  the ``remember_me`` flag.
- ``verify_token`` now supports an optional ``expected_type`` filter
  (e.g. ``"access"`` / ``"refresh"``); when omitted, both types pass
  (used by /auth/whoami's cold-start path).
- New ``auth.denylist`` module: ``InMemoryRecentlyRevoked`` (default for
  the Tauri sidecar / dev mode) and ``RedisRecentlyRevoked`` (multi-
  process server).  Bounded LRU with auto-expiry via ``time.monotonic()``.
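
A bounded denylist with LRU eviction and expiry driven by `time.monotonic()` might be structured as below. The class name, the default TTL, and the default size are placeholders chosen for the sketch, not values from the `auth.denylist` module.

```python
import time
from collections import OrderedDict
from typing import Callable


class RecentlyRevokedSketch:
    """Bounded set of revoked token hashes; entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 60.0, max_entries: int = 10_000,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._max = max_entries
        self._clock = clock
        self._expiry = OrderedDict()  # token hash -> monotonic expiry time

    def add(self, token_hash: str) -> None:
        # Re-adding refreshes both the expiry and the LRU position.
        self._expiry.pop(token_hash, None)
        self._expiry[token_hash] = self._clock() + self._ttl
        while len(self._expiry) > self._max:
            self._expiry.popitem(last=False)  # evict the oldest entry

    def __contains__(self, token_hash: str) -> bool:
        expiry = self._expiry.get(token_hash)
        if expiry is None:
            return False
        if expiry <= self._clock():
            del self._expiry[token_hash]  # lazy expiry on lookup
            return False
        return True
```

Using a monotonic clock keeps expiry stable when the wall clock is adjusted; injecting `clock` makes the TTL behavior testable without sleeping.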

Backwards-compat
- Tokens issued before U2 (no ``sid``) are still accepted by
  ``verify_token``; validation falls through to the legacy
  ``user_sessions`` table via the U10 shim (next commit).

Tests
- tests/unit/auth/test_jwt_utils.py: 12 cases — V1/V2 claim presence,
  default + remember-me TTL, expected_type filter, expiry, wrong secret.
- tests/unit/auth/test_denylist.py: 6 cases — add/contains, TTL expiry,
  LRU eviction, re-add refresh, clear, hash stability.

Refs: U2 in docs/plans/2026-06-20-002-feat-centralized-auth-token-persistence-plan.md
2026-06-21 01:53:13 +08:00
chiguyong 2f55fc7434 feat(auth): U11 AuthProvider abstraction layer + auth_sessions schema
Leaves an extension point for future integration with a corporate IdP (OIDC / SAML / LDAP / Feishu / DingTalk / WeCom),
and lands the auth_sessions table (the V2 replacement for user_sessions).

Changes
- models.py: add auth_sessions + auth_meta tables, V1→V2 data backfill
- providers/base.py: AuthProvider Protocol interface contract
- providers/local.py: LocalAuthProvider default implementation (wraps SQLite + bcrypt)
- providers/oidc_stub.py: StubOIDCProvider placeholder (NotImplementedError)
- providers/__init__.py: get_auth_provider DI factory (lru_cache singleton)
- providers/exceptions.py: AuthProviderError / InvalidCredentials / ProviderNotImplemented
- providers/user.py: provider-agnostic User value object
- tests/unit/auth/: 37 tests covering Protocol / DI / Local / OIDC behavior

The auth_sessions.auth_provider field records the login source (local / oidc-stub / future
oidc-keycloak / saml / ldap), so audits remain traceable when the IdP is switched later.

Tests: 37 passed (providers) + 62 passed (full auth suite) + ruff check clean
2026-06-21 01:28:14 +08:00
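chiguyong cac9c73dd5 fix(routing): U1-U6 routing optimization + fix plan + code review fixes
Implements 6 fix units (U1-U6) and applies 5 security fixes found by ce-code-review.

## U1: benchmark timeout thresholds
- Tiered timeouts by difficulty: easy=45s, medium=60s, hard=90s
- Replaces the single hard-coded 60s

## U2: OpenAICompatibleProvider httpx timeout
- Add a timeout parameter (default 120s), replacing the hard-coded 60s
- ProviderConfig.timeout is passed through to the Provider
- Add 2 unit tests

## U3: activate QualityGate skill_match validation
- BaseAgent._build_skill_context() builds skill_context
- Passed into QualityGate.validate() at three call sites: base.py / tasks.py / runner.py

## U4: add disambiguation_keywords field
- IntentConfig gains a disambiguation_keywords field
- 8 skill YAML files add this field

## U5: optimize RequestPreprocessor routing regexes
- Split _FACTUAL_RE into separate CN/EN regexes (Chinese has no spaces)
- Add _MATH_RE / _TRANSLATION_RE pure-pattern regexes
- _TOOL_CONTEXT_RE excludes real-time queries that need tools
- Multi-line input guard + trailing punctuation support
- Add 21 unit tests (40 total, all passing)

## U6: re-run benchmark
- Real LLM benchmark: accuracy 60% -> 93.3%
- 4/5 passed, p50=40.8s, consistency=100%
- Old baseline backed up to baseline_2026-06-17_old_arch.json

## ce-code-review fixes (5 items)
- Fix the security risk of the \s character class matching newlines
- Add trailing punctuation support to the factual/math regexes
- Fix duplicate keywords in geo_optimizer.yaml
- Fix unreachable return in _login_with_retry
- Fix stderr_fh resource leak in the real_llm_server fixture

Tests: all 63 tests in tests/unit/chat/ pass; ruff check passes.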
2026-06-20 19:31:49 +08:00
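chiguyong 91f56ca663 feat: enterprise client-server architecture + code review fixes
## Main changes

### New features
- Enterprise client-server architecture (JWT authentication + RBAC permissions + terminal security)
- Config sync between the Tauri desktop client and the server
- Remote LLM gateway (RemoteLLMProvider, with 401 token-refresh retry)
- Server-side terminal WebSocket (with admin approval flow)
- Six-layer terminal whitelist defense (blacklist → shell operator detection → built-in safety → global/user/session whitelists → danger detection)

### Code review fixes (P0/P1/P2)
- P0: dangerous binaries (rm/docker, etc.) are no longer added to the whitelist; compute_whitelist_entry returns None
- P1: terminal approval ownership tracking (_approval_owners dict) + session cleanup to prevent leaks
- P1: local terminal WebSocket URL now includes the JWT token
- P1: audit log supports terminal_mode filtering
- P1: /system/resources endpoint enforces the SYSTEM_CONFIG permission
- P1: RemoteLLMProvider adds a 401 token-refresh retry mechanism
- P1: auth/models.py uses Mapping[str, object] instead of the Any type
- P2: terminal authorization dependency checks the is_active account status
- Fix unused APIKeyAuthMiddleware import in app.py

### Documentation updates
- README.md: add Chapter 16, "Enterprise client-server architecture"
- AGENTS.md / CLAUDE.md: sync module map, route table, frontend pages
- Plan document marked as completed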

Closes: docs/plans/2026-06-19-003-feat-enterprise-client-server-evolution-plan.md
2026-06-20 06:48:18 +08:00
chiguyong 771756814f fix(review): fix P0/P1/P2 issues found in code review
P0 (Critical):
- orchestrator: plan_update event key changed from phases to plan_phases to match the frontend contract
- orchestrator: team_formed event payload changed from string[] to IExpertInfo[] + plan_phases:[]

P1 (High):
- orchestrator: add phase_failed event broadcast (3 sites: gather failure / _execute_phase exception / _mark_dependents_failed cascade)
- orchestrator: add team_dissolved event broadcast (3 sites: normal completion / ValueError / Exception)
- orchestrator: _mark_dependents_failed made async to support event broadcast
- orchestrator: gather result check now also handles asyncio.CancelledError (Python 3.11+ BaseException)
- plan: PhaseStatus.RUNNING value changed from running to in_progress to match the frontend union type
- team.ts: updatePhaseStatus adds a defensive guard for undefined plan_phases
- chat.py: add asyncio.CancelledError handling + move team.dissolve() into the finally block

P2 (Medium):
- orchestrator: _get_isolated_agent return type changed from Any to ConfigDrivenAgent
- orchestrator: _get_llm_gateway return type changed from Any to LLMGateway | None
- orchestrator: dependency outputs now read from in-memory dep_phase.result instead of SharedWorkspace (less redundant I/O)
- plan: PlanPhase.to_dict() serializes result as a string to match the frontend ITeamPlanPhase.result type
- types.ts: expert_step.step type changed from number to string (the backend sends the phase ID)

Tests: 377 passed (experts + chat_team + expert_team)
2026-06-18 13:00:59 +08:00
chiguyong 1e818b507d feat(server): U6 add _execute_team_collab to integrate the @team pipeline into WebSocket 2026-06-18 02:08:29 +08:00
chiguyong ee6d16345c feat(experts): U7 add 5 programming expert templates + dev_team team template + ExpertTeamRouter template expansion 2026-06-18 01:50:43 +08:00
chiguyong 0f8ea6e21e feat(experts): rewrite TeamOrchestrator in pipeline mode + TeamStatus.PLANNING 2026-06-18 01:39:22 +08:00
chiguyong 1075598ebf feat(experts): restore plan.py phase dependency graph (PlanPhase + topological_sort)
- Add PhaseStatus enum (PENDING/RUNNING/COMPLETED/FAILED)
- Add PlanPhase dataclass (id/name/assigned_expert/task_description/depends_on/status/result)
- TeamPlan gains a phases field and supporting methods: get_phase/update_phase_status/topological_sort/get_ready_phases
- topological_sort uses Kahn's algorithm to return execution layers (list[list[PlanPhase]]) and detects circular dependencies
- Keep SubTask/MergeStrategy for backward compatibility
- Add 54 unit tests covering linear/parallel/circular dependencies, invalid references, ready phases, and serialization
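
A layered variant of Kahn's algorithm, as described for topological_sort above, can be sketched like this. For brevity it works on phase IDs rather than PlanPhase objects; `topological_layers` and the `depends_on` mapping are hypothetical stand-ins for the real TeamPlan method.

```python
def topological_layers(depends_on: dict) -> list:
    """Group phase IDs into layers; each layer depends only on earlier ones.

    depends_on maps a phase ID to the list of phase IDs it depends on.
    Raises ValueError on an unknown dependency or a dependency cycle.
    """
    indegree = {phase: 0 for phase in depends_on}
    dependents = {phase: [] for phase in depends_on}
    for phase, deps in depends_on.items():
        for dep in deps:
            if dep not in depends_on:
                raise ValueError(f"phase {phase!r} depends on unknown phase {dep!r}")
            indegree[phase] += 1
            dependents[dep].append(phase)

    layer = sorted(p for p, degree in indegree.items() if degree == 0)
    layers = []
    placed = 0
    while layer:
        layers.append(layer)
        placed += len(layer)
        next_layer = []
        for phase in layer:
            for child in dependents[phase]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    next_layer.append(child)
        layer = sorted(next_layer)  # sorted only to make output deterministic

    if placed != len(depends_on):
        raise ValueError("circular dependency detected")
    return layers
```

Phases within one layer have no dependencies on each other, so an orchestrator could run each layer concurrently and move to the next layer once it completes.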
2026-06-18 01:28:18 +08:00
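chiguyong dddcbd24e3 feat: board meeting discussion mode + backtest integration + WS persistence fix
Board Meeting Mode:
- BoardRouter: @board prefix routing, expert name validation, template fallback
- BoardTeam: discussion container, state machine (FORMING->DISCUSSING->CONCLUDING->COMPLETED)
- BoardOrchestrator: multi-round autonomous discussion loop engine, moderator summaries, stop-command detection
- 9 preset celebrity expert YAMLs (Musk/Bezos/Allen Zhang/Munger, etc.)
- Frontend BoardStatusView group-chat-style UI + WebSocket event handling
- Backend chat.py integrates @board routing into the main chat flow

Backtest integration:
- benchmark.py: add board_meeting dimension (18 tasks, 6 categories)
- benchmark_dataset.py: add BOARD_BENCHMARKS (11 E2E cases)
- test_board_backtest.py: 66 backtest tests (9 test classes)

Bug fixes:
- resolve_expert_configs: deep-copy to keep is_lead changes from polluting the shared template
- Fall back to the default template when all expert names are invalid
- board_router: topic was not stripped on the non-matching path
- benchmark_dataset: corrected the input for board-name-invalid-001

WebSocket persistence fix:
- chat.py: three-layer defense to ensure task results are not lost
- chat store: reconnection recovery logic

Deployment configuration:
- Gitea Actions CI/CD workflow
- docker-compose.deploy.yaml deployment orchestration
- scripts/deploy.sh automated deployment script

Test results: 120 unit tests pass, 71 benchmark tests pass at 100%, ruff all clean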
2026-06-17 23:52:53 +08:00
chiguyong 5b5291c7e5 fix: WebSocket task persistence three-layer defense with security hardening
Fix chat history empty content and task stops on refresh. Implements: result persistence on disconnect, task backgrounding via asyncio + EventQueue, frontend reconnection recovery. Security: fail-closed conversation_id ownership, asyncio.shield on CancelledError cleanup, async TaskStore shim, EventQueue subscriber limit, connection error resilience. 23 tests added.
2026-06-17 22:11:51 +08:00
chiguyong ecf87391a5 feat: integrate SQ/EQ into portal WebSocket and CLI (Phase 4)
- app.py: initialize EventQueue + SubmissionQueue in app.state, close on shutdown
- portal.py: emit unified events (task.created/started/completed/failed,
  turn.thinking/tool_call/tool_result/final_answer) to EQ alongside WebSocket messages
- cli/chat.py: optional --event-queue flag for event emission
- EQ is bypass-only: emit failures never affect WebSocket or CLI main flow
- WebSocket message format unchanged (backward compatible)

Tests: 650 passed, 0 failed, 4 skipped
2026-06-17 11:05:04 +08:00
chiguyong bbedfff597 feat: hub-and-spoke experts, tiered tool injection, unified event model (U3/U7/U10) 2026-06-17 10:46:16 +08:00
chiguyong 200174c5c7 feat: SQLite persistence, verification loop, spec-driven execution
Phase 2 of architecture optimization (U5/U6/U9):

- U5: SqliteConversationStore with WAL mode + LRU cache (1000 convs)
  Replaces in-memory ConversationStore in portal.py
  Data survives server restarts (ref: Codex Thread persistence)
- U6: VerificationLoop with verify/verify_and_retry
  Default commands: pytest + ruff check
  ReActEngine integration via verification_enabled flag
  New run_tests tool for LLM to invoke verification
- U9: SpecManager for plan-as-contract (ref: Qoder Quest Mode)
  Plans persisted to .agentkit/specs/{spec_id}.yaml
  API: GET/PUT /api/v1/specs, POST /api/v1/specs/{id}/confirm
  PlanExecEngine emits spec_created event after plan generation

Also fixes: portal skill_name routing, app.py SessionManager guard,
test_telemetry CostAwareRouter removal, test_compression_config fixture
2026-06-17 10:45:20 +08:00
chiguyong 5374bc8501 refactor: eliminate routing layer, align with industry best practices
Phase 1 of architecture optimization (U1/U2/U4/U8):

- U1: Rename SimpleRouter to RequestPreprocessor, route() to preprocess()
  Eliminates misleading routing concept; LLM decides autonomously
  in REACT agent loop (matches Codex/Claude Code/Trae pattern)
- U2: Delete CostAwareRouter, HeuristicClassifier, SemanticRouter
  (~700 lines removed). skill_routing.py: 1688 to 220 lines
- U4: PlanExecEngine defaults to ReActStepExecutor, delete _LLMStepExecutor
  (pure LLM calls without tools = no execution capability)
- U8: ReActEngine defaults to ContextCompressor(keep_recent=10)

Supersedes plans 2026-06-15-002/003/004.
New plan: 2026-06-16-006-refactor-architecture-optimization-evolution-plan.md
2026-06-17 10:44:40 +08:00
chiguyong c4257591d4 refactor(router): replace CostAwareRouter with SimpleRouter and prompt-based tool calling 2026-06-16 03:31:05 +08:00
chiguyong a27eed3714 fix(config): unify config loading chain and protect ${VAR} references
- Settings API: reverse-resolve env vars to preserve ${VAR} refs in yaml,
  write new API keys to .env instead of agentkit.yaml, extract env_key
  from existing ${VAR} reference when updating providers
- Onboarding: merge-update instead of overwrite when config exists,
  use config_arg to determine output path, .env merge instead of overwrite
- Unified templates: bailian-coding provider name, full model_aliases,
  docker-compose with postgres, expanded .env.example
- Optional ruamel.yaml for comment/format preservation in Settings API
- clients.yaml: add _deep_resolve for ${VAR} env var references
- All CLI commands use load_config_with_dotenv() consistently
- Tests: mock find_config_path and CWD auto-discovery to avoid env leaks
2026-06-16 00:26:54 +08:00
chiguyong e984b4c462 feat(router): optimize routing intelligence — ExecutionMode expansion, multi-candidate scoring, quality gate skill match
- Expand ExecutionMode enum with REWOO/REFLEXION/PLAN_EXEC
- Add _resolve_execution_mode() to respect skill.config.execution_mode
- Rewrite IntentRouter._match_keywords() for multi-candidate scoring
- Add QualityGate 5th dimension: skill_match validation with warning escalation
- Calibrate HeuristicClassifier: low-complexity signals only when no high signals
- Fix negation regex for Chinese text (avoid matching past punctuation)
- Fix backtest mode_map normalization and .env loading
- Add 61 unit tests (21 HeuristicClassifier + 14 IntentRouter + 13 QualityGate + 13 existing)

Results: execution_mode_accuracy 9.09%→36.36%, skill_routing_F1 66.67%→77.78%
2026-06-15 22:43:13 +08:00
chiguyong 64d62a2b60 feat: autonomous task execution - connect PlanExecEngine + TeamOrchestrator
U1: TeamOrchestrator._execute_phase real execution (Expert.agent.execute)
U2: LLM-based merge strategies (BEST/VOTE/FUSION) with fallback
U3: ReActStepExecutor replacing _LLMStepAgent for tool-enabled steps
U4: SharedWorkspace integration for cross-phase/cross-execution state
U5: GoalPlanner prompt tuning with few-shot and verb pattern matching
U6: Replan-before-fallback in TeamOrchestrator
U7: End-to-end validation tests for multi-step research tasks
U8: WebSocket progress events (step_event_callback + new event types)

Code review fixes: P0 response.strip fix, P1 competitor status check,
milestone real impl, VOTE self-bias fix, confirmation_handler wiring,
ExpertTeam public API, DRY _build_result_summaries, replan tests

Also: geo_server.py refactor (ServerConfig.from_yaml), delete llm_config.yaml
2026-06-15 12:41:32 +08:00
chiguyong 99fe4c99f7 fix: comprehensive code review fixes + WS test stability 2026-06-15 08:17:34 +08:00
chiguyong 7384ecb03e feat: Expert Team Mode — plan-execute collaboration with conversation UI
Implements B+C hybrid Expert Team Mode with ExpertConfig, CollaborationPlan,
TeamOrchestrator, ExpertTeamRouter, HandoffTransport, SharedWorkspace, and
Expert wrapper. Frontend includes ExpertTeamView, ExpertMessage,
PlanVisualization, team store, and WS event handlers.

Code review fixes: sentinel-based close, per-phase retry, name validation,
Vue component integration, teamState dedup, Redis reset, plan reassign,
event_type validation, hmac timing-safe compare, message dedup,
reactive updatePhases, O(1) phase lookup, iterative DFS, bounded Queue.

232 unit tests passing.
2026-06-14 22:20:14 +08:00
chiguyong 6e0e081f23 feat: gap closure sprint — dark theme, @-mention, LocalComputerUse, tests
P0: U4 UsageStore + U5 CascadeStateStore independent test files (57 tests)
P1: Dark theme — tokens.css [data-theme="dark"] + theme.ts Pinia store
    + TopNav toggle button + App.vue dynamic Ant Design theme
P1: @-mention — MentionDropdown.vue + /skills/mention-suggest API
    + ChatInput integration with @ detection
P2: LocalComputerUseSession — pyautogui + screencapture (replaces Docker stub)
P2: Integration tests for gap closure (12 tests)
Fix: create_cascade_state_store() now passes session_ttl to InMemory fallback
2026-06-14 16:16:50 +08:00
chiguyong 0ccef7be5c feat: P0 production hardening — LLM cache, semantic routing, state persistence
U1: LLM Cache Core (exact + semantic match, InMemory + Redis backends)
U2: Cache integration into LLMGateway with CacheConfig
U3: Semantic Router as Layer 1.5 in CostAwareRouter
U4: UsageStore persistence (Redis Hash + InMemory fallback)
U5: CascadeStateStore persistence (Redis INCR + InMemory TTL)
U6: EvolutionStore interface unification (Protocol + PostgreSQL backend)
U7: Configuration integration + E2E tests

Code review fixes:
- P0: date iteration bug (day>=28), semantic router index never built,
      Redis connection leak (per-call → persistent pool)
- P1: cache degradation recovery, semantic_search degradation,
      double miss counting, asyncio.Lock for PG init, LIMIT on queries,
      __import__ anti-pattern → _utcnow()
- P2: InMemory TTL cleanup, embedding preservation on put(),
      data TTL = max(exact_ttl, semantic_ttl)
2026-06-14 15:16:00 +08:00
chiguyong 09698d7a06 feat: frontend productization with code review fixes
- Workflow: visual canvas, undo/redo, drag-and-drop, real-time execution WebSocket
- Evolution: dashboard, ECharts metrics, experience timeline, pitfall warnings, usage panel
- KB: source CRUD, document upload, search test
- Terminal: interactive PTY WebSocket, whitelist security
- Security: hmac.compare_digest, API key auth on all endpoints, whitelist bypass fix
- Fixes: ECharts async init, WebSocket intentional disconnect, TOCTOU race, Pydantic models
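
The `hmac.compare_digest` item above refers to constant-time comparison of secrets. A minimal sketch, with `api_key_matches` as a hypothetical helper name:

```python
import hmac


def api_key_matches(presented: str, expected: str) -> bool:
    """Compare keys without leaking the mismatch position through timing."""
    # Encoding to bytes allows non-ASCII input; compare_digest accepts
    # str arguments only when they are ASCII.
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

A plain `==` on strings can return as soon as the first differing character is found, which is the timing signal this comparison avoids.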
2026-06-13 01:29:58 +08:00
chiguyong 5ef08a3b30 fix(review): comprehensive P0-P2 code review fixes 2026-06-12 22:18:25 +08:00
chiguyong a36bc3d1c1 feat: optimize chat response speed for sub-1s first token latency
- Add HeuristicClassifier to replace LLM quick_classify with zero-cost
  local heuristic (keyword/length/code-pattern scoring), gated by
  router.classifier config (default: heuristic)
- Add parallel tool execution in ReActEngine via asyncio.gather for
  multiple independent tool_calls, gated by parallel_tools param
- Add AsyncWriteQueue for non-blocking session persistence with WAL
  buffer, gated by async_writes param on SessionManager
- Add httpx.Limits connection pool config to all LLM providers
- Add router config section to ServerConfig and agentkit.yaml
- All optimizations have config switches for safe rollback
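
Running independent tool calls concurrently with `asyncio.gather`, gated by a `parallel_tools` flag, could look like the sketch below. `run_tool_calls` and the `(name, kwargs)` call format are assumptions made for illustration; the ReActEngine's real interfaces are not shown in this log.

```python
import asyncio
from typing import Any, Awaitable, Callable

ToolFn = Callable[..., Awaitable[Any]]


async def run_tool_calls(tools: dict, calls: list, parallel_tools: bool = True) -> list:
    """Run (name, kwargs) tool calls; results keep the order of `calls`.

    A failing tool yields its exception object in the result list
    instead of aborting the other calls.
    """
    async def run_one(call):
        name, kwargs = call
        return await tools[name](**kwargs)

    if parallel_tools:
        # gather preserves argument order in its result list.
        return list(await asyncio.gather(*(run_one(c) for c in calls),
                                         return_exceptions=True))

    results = []
    for call in calls:  # sequential fallback when the switch is off
        try:
            results.append(await run_one(call))
        except Exception as exc:
            results.append(exc)
    return results
```

Keeping the sequential path behind the same function is one way to provide the rollback switch the commit mentions.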
2026-06-12 13:15:06 +08:00
chiguyong 8c365486e2 fix(pipeline): address code review findings for adversarial loop
Critical:
- C1: Add verifier_timeout_seconds for independent Verifier timeout
- C2: Verifier parse failure raises RuntimeError instead of dead-loop

Major:
- M1: Inject previous_output into Worker retry context
- M2: Add Pydantic ge/le constraint on ReviewFeedback.score
- M3: Use Literal type for feedback_mode enum validation
- M4: Use Literal types for ReviewIssue severity and category
- M5: Merge error messages when escalation agent also fails

Tests: 8 new test cases added (19 total), all passing
2026-06-12 10:02:37 +08:00
chiguyong 3392413614 test(pipeline): add adversarial loop unit tests
11 test cases covering:
- PipelineSchemaAdversarial (4): verifier fields, backward compat, serialization, state tracking
- AdversarialExecution (3): no verifier passthrough, first round pass, max rounds exhausted
- FeedbackContext (3): structured+natural, structured, natural modes
- Escalation (1): no escalation configured
2026-06-12 09:40:19 +08:00
chiguyong 32c800d1e4 fix: portal routing + response speed + IME input
1. Portal unified routing: ws_chat now uses CostAwareRouter uniformly
   (handles Layer 0/1/2), replacing direct IntentRouter calls.
   Greeting/chat_mode requests skip IntentRouter LLM call entirely.

2. Response speed: greeting & simple chat now use direct LLM call
   (no ReAct loop), zero-cost Layer 0 detection.

3. IME input fix: use e.isComposing (native browser property)
   instead of compositionstart/end for Enter key detection.

4. Test: fix InMemoryMessageBus.request() parameter name
   timeout -> timeout_seconds.
2026-06-11 21:30:25 +08:00
chiguyong d47f279887 fix: resolve code review issues from deferred improvements
1. InMemoryMessageBus.request(): fix param name (timeout→timeout_seconds) to match ABC
2. InMemoryMessageBus: track consumer tasks, cancel on unsubscribe
3. InMemoryMessageBus: _try_resolve_pending() in queue consumer path
4. evolve_soul(): use "default" category when patterns is empty
5. quick_classify(): use delimiter-based prompt to mitigate injection risk
6. Use asyncio.get_running_loop() instead of deprecated get_event_loop()
2026-06-11 13:49:02 +08:00
chiguyong 79eb8469f9 fix: address remaining code review issues
- AlignmentGuard: direction-aware constraint checking (negation/affirmation detection)
  instead of simple substring matching to reduce false positives
- Reflexion: extract actual token usage from LLM response instead of hardcoded 1
- MemoryTool: protect version/history sections from update_soul modification
- Fix AsyncMock warnings for sync find_best_agent method
2026-06-11 00:14:11 +08:00
chiguyong bba394be38 fix(marketplace): address code review findings
- Fix str.format() crash when user input contains curly braces
- Fix Layer 2 passing str to find_best_agent (expects list[str])
- Fix AlignmentGuard fail-open on LLM audit failure (now fail-closed)
- Fix _config_reload_lock not initialized in create_app()
- Fix evolve_soul redundant reflector.reflect() call (reuse existing reflection)
- Fix test mocks using AsyncMock for sync find_best_agent method
- Remove unused _COMPLEXITY_CLASSIFY_PROMPT constant
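
The `str.format()` crash above is a common pitfall when user text ends up inside the format string itself. The log does not show the original code, so the sketch below only illustrates the general failure and two usual remedies; the template text and function names are hypothetical.

```python
TEMPLATE = "Classify the following request:\n{user_input}"


def render_prompt(user_input: str) -> str:
    """Safe: user text is passed as a value, never parsed as a format string."""
    return TEMPLATE.format(user_input=user_input)


def escape_braces(text: str) -> str:
    """Double the braces so text can be embedded in a format string literally."""
    return text.replace("{", "{{").replace("}", "}}")


def render_unsafe(user_input: str) -> str:
    """Unsafe: user text becomes part of the format string being parsed."""
    return ("Classify the following request:\n" + user_input).format()
```

With input such as `{x}`, `render_unsafe` raises `KeyError`, while `render_prompt` returns the text unchanged.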
2026-06-10 19:21:40 +08:00
chiguyong 8713636d50 feat(marketplace): add Phase B/C - CostAwareRouter, OrganizationContext, AlignmentGuard, Soul Evolution, Auction, Server Integration
Phase B:
- U1: CostAwareRouter with 3-layer routing (rule/LLM/capability matching)
- U6: OrganizationContext with agent profiles and capability-based discovery
- U7: AlignmentGuard with constraint injection and cascade detection

Phase C:
- U8: Soul dynamic evolution with version tracking and reflection-triggered updates
- U9: Auction mechanism as optional advanced routing mode
- U10: Server integration + end-to-end integration tests

250 new tests passing across all units.
2026-06-10 19:09:02 +08:00
chiguyong 5b42487d8a feat(core): add ReWOO, Plan-and-Execute, Reflexion execution engines
Phase A of Multi-Agent Marketplace architecture:
- ReWOOEngine: plan-all-then-execute pattern for parallel data fetch
- PlanExecEngine: adapter wrapping GoalPlanner+PlanExecutor+PipelineReplanner
- ReflexionEngine: ReAct + Evaluate + Reflect + Retry for high-precision tasks
- SkillConfig: extend VALID_EXECUTION_MODES with rewoo/plan_exec/reflexion
- ConfigDrivenAgent: add _handle_rewoo/_handle_plan_exec/_handle_reflexion routes
- 5 professional agent YAML configs with layered model defaults
- 107 unit tests passing
2026-06-10 17:08:48 +08:00
chiguyong 6852dfe892 fix(security,reliability): resolve all P2 findings from code review 2026-06-10 15:05:40 +08:00
chiguyong 658e188939 fix(review): resolve P0/P1 findings from final code review 2026-06-10 09:57:29 +08:00
chiguyong 1d1805753c fix: resolve key P2 findings from code review
- Shell whitelist: use exact binary match instead of startswith
- Shell audit log: use deque(maxlen=10000) to cap memory
- Terminal history: use deque(maxlen) for O(1) eviction
- Path optimizer: cap _pending_paths at 50 entries per task_type
- Pitfall detector: only add tips to matching steps, not all
- Experience store: handle non-numeric _parse_time_window input
- Extract shared is_safe_url() to utils/security.py (DRY)
- Workflow condition evaluator: handle float() ValueError
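
Two of the fixes above, exact binary matching and `deque(maxlen=...)` for bounded logs, can be illustrated together. `ALLOWED_BINARIES`, `is_allowed`, and `record` are hypothetical names; the actual ShellTool whitelist is not shown in this log.

```python
import shlex
from collections import deque

ALLOWED_BINARIES = frozenset({"ls", "cat", "git"})  # illustrative list only


def is_allowed(command: str) -> bool:
    """Allow a command only if its first token exactly equals a listed binary."""
    try:
        parts = shlex.split(command)
    except ValueError:  # e.g. unbalanced quotes
        return False
    # A startswith("ls") check would also accept "lsof"; exact match does not.
    return bool(parts) and parts[0] in ALLOWED_BINARIES


# Once full, appending drops the oldest entry in O(1), capping memory use.
audit_log = deque(maxlen=10000)


def record(command: str) -> None:
    audit_log.append((command, is_allowed(command)))
```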
2026-06-10 09:01:23 +08:00
chiguyong b46a10973f fix(tests): clean up test_shell_tool.py lint issues 2026-06-10 08:46:35 +08:00
chiguyong 9646b0f0dd fix(tests): update test_shell_tool.py to match new ShellTool API 2026-06-10 08:22:15 +08:00
chiguyong 7874e875af merge: integrate feat/agentkit-phase8-chat-adaptive (chat/gui commands + GUI mode)
Restores agentkit chat, agentkit gui CLI commands, onboarding wizard,
and GUI mode (AGENTKIT_GUI_MODE) with static file serving.
Resolves merge conflicts in orchestrator.py, app.py, tools/__init__.py, shell.py.
2026-06-10 07:44:06 +08:00
chiguyong 9e9f1314f6 fix(security): resolve all P0/P1 findings from code review 2026-06-10 07:12:41 +08:00
chiguyong c606ffa64a feat(phase5): implement management pages, evolution dashboard, and workflow editor (U13b/U13c/U14) 2026-06-10 01:29:01 +08:00
chiguyong a1deeecede feat(phase5): implement Vue3 portal foundation with chat interface and routing (U13a)
- Add Portal API routes: chat, stream, capabilities, conversations, WebSocket
- Add ConversationStore for in-memory conversation management
- Add CAPABILITY_CATEGORIES mapping for 8 capability types
- Create Vue3 SPA with TypeScript, Pinia, Vue Router, Ant Design Vue
- Implement ChatView with message bubbles, input, sidebar, WebSocket support
- Add side navigation skeleton for all 8 capability sections
- Add placeholder views for workflow, knowledge, skills, terminal, etc.
- 31 backend tests passing
2026-06-10 01:06:48 +08:00
chiguyong 901e4d9d0a feat(phase4): implement Computer Use integration (U12)
- ComputerUseTool: Anthropic API + fallback chain (API→Session→ShellTool→AskHuman)
- ComputerUseSession: Docker sandbox + InMemory test session
- ComputerUseRecorder: action recording, replay, and persistence

89 new tests passing. Degradation chain verified.
2026-06-10 00:54:31 +08:00
chiguyong c99aee1423 feat(phase3): implement knowledge base and RAG enhancement (U9-U11)
- U9: LocalDocumentIngestion - multi-format doc parsing and chunking
- U10: ExternalKBAdapters - Feishu/Confluence/GenericHTTP adapters
- U11: MultiSourceRAG - multi-source retrieval with source tracing

KnowledgeBase protocol defined (KTD-7). 145 new tests passing.
2026-06-10 00:45:17 +08:00
chiguyong e3d4f811dd feat(phase2): implement self-evolution and smart terminal (U6-U8)
- U6: PitfallDetector - detect historical failure patterns and warn
- U7: PathOptimizer - discover and update optimal execution paths
- U8: TerminalSession - session state, PTY interactive, output parsing

160 new tests passing. ShellTool enhanced with session_id support.
2026-06-10 00:22:36 +08:00
chiguyong fd4a811929 feat(phase1): implement core kernel and experience foundation (U1-U5)
- U1: GoalPlanner - structured goal decomposition wrapping _decompose_task()
- U2: PlanExecutor - parallel execution with retry/skip/replace strategies
- U3: PlanChecker - quality gate + review + experience writing
- U4: Skill spec upgrade - dependencies, capabilities, version management
- U5: ExperienceStore - PostgreSQL+pgvector task experience storage

208 new tests passing, fully backward compatible.
2026-06-09 23:57:03 +08:00
chiguyong 31bd3b126c feat(phase8): chat adaptive enhancements, pipeline reflection, search tools upgrade
- Enhanced chat CLI with adaptive mode and session management
- Added pipeline reflection and schema extensions
- Upgraded BaiduSearch and WebSearch tools with advanced capabilities
- Expanded server routes for skills and chat
- Added session store enhancements
- New chat module and pipeline reflection support
2026-06-09 23:18:06 +08:00
chiguyong 045fecd4ce feat(tools): add ShellTool + WebSearchTool, memory system, onboarding wizard, chat mode
- ShellTool: safe command execution with allowlist, blocked patterns (regex), timeout, output truncation
- WebSearchTool: multi-backend search with Tavily → Serper → DuckDuckGo Lite fallback
- MemoryTool: agent-callable tool with add/replace/remove/read actions
- MemoryStore/MemoryFile: file-based memory (SOUL.md, USER.md, MEMORY.md, DAILY.md)
- Onboarding wizard: provider selection, API key, model selection, agent personality
- Chat mode: interactive CLI with streaming, memory injection, tool integration
- Add Bailian (百炼) Coding Plan provider with 10 models
- 102 unit tests (34 new for ShellTool + WebSearchTool)
2026-06-09 01:06:45 +08:00
chiguyong 45283d31e8 feat(core): integrate MessageBus into Orchestrator and AgentPool (U7)
- Orchestrator accepts optional message_bus parameter; workers publish
  task.progress messages via MessageBus after each subtask execution
- AgentPool accepts optional message_bus; auto-registers agents on
  create and auto-unregisters on remove
- app.py initializes MessageBus from config and injects into AgentPool
- ServerConfig adds bus configuration field
- 5 new tests, all passing
2026-06-08 00:03:40 +08:00
chiguyong 13d6e74099 feat(bus): add MessageBus abstraction layer with InMemory + Redis Streams (U6)
- AgentMessage: message model with sender/recipient/topic/payload/correlation_id
- MessageBus Protocol: publish/subscribe/unsubscribe/request/broadcast/health_check
- InMemoryMessageBus: asyncio.Queue-based implementation for testing
- RedisMessageBus: Redis Streams (XADD/XREADGROUP) implementation with
  consumer groups, message acknowledgment, and dead letter queue
- create_message_bus() factory with graceful Redis→InMemory fallback
- Request-response pattern via correlation_id + asyncio.Future
- 13 new tests, all passing
2026-06-07 23:58:16 +08:00
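
Note on the request-response pattern named in the commit above: a minimal sketch of matching replies to requests through a correlation_id and an asyncio.Future. TinyBus, reply() and echo_handler are hypothetical names for illustration; they are not taken from the repository, and the actual MessageBus Protocol may differ.

```python
import asyncio
import uuid


class TinyBus:
    """Hypothetical in-memory bus; not the repository's InMemoryMessageBus."""

    def __init__(self) -> None:
        self._pending = {}   # correlation_id -> Future awaiting the reply
        self._handlers = {}  # topic -> async handler

    def subscribe(self, topic, handler) -> None:
        self._handlers[topic] = handler

    async def request(self, topic: str, payload: dict, timeout: float = 1.0) -> dict:
        correlation_id = uuid.uuid4().hex
        future = asyncio.get_running_loop().create_future()
        self._pending[correlation_id] = future
        # Keep a reference so the handler task is not garbage-collected early.
        task = asyncio.create_task(
            self._handlers[topic](self, correlation_id, payload)
        )
        try:
            # Resolved by reply(); raises asyncio.TimeoutError if no reply arrives.
            return await asyncio.wait_for(future, timeout)
        finally:
            self._pending.pop(correlation_id, None)
            task.cancel()  # no-op if the handler already finished

    def reply(self, correlation_id: str, payload: dict) -> None:
        future = self._pending.get(correlation_id)
        if future is not None and not future.done():
            future.set_result(payload)


async def echo_handler(bus: TinyBus, correlation_id: str, payload: dict) -> None:
    bus.reply(correlation_id, {"echo": payload["text"]})


async def demo() -> dict:
    bus = TinyBus()
    bus.subscribe("echo", echo_handler)
    return await bus.request("echo", {"text": "hi"})
```

Calling asyncio.run(demo()) drives one round trip. Removing the pending entry in the finally block keeps timed-out requests from leaking futures.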
chiguyong 88d8298871 feat(core): add Orchestrator adaptive task decomposition (U5)
- execute_adaptive(): iterative execute→evaluate→re-decompose loop
- OrchestratorConfig: adaptive, max_iterations, quality_threshold
- _evaluate_quality(): LLM-based or rule-based quality scoring (0-1)
- _reexecute_failed(): preserves completed subtask results, retries
  failed ones with improvement feedback injected into input_data
- OrchestrationResult.metadata field for tracking iteration history
- 10 new tests, all passing
2026-06-07 23:50:54 +08:00
chiguyong 7054ac02b6 feat(tools): add AskHumanTool + token streaming in ReAct execute_stream
- AskHumanTool: Human-in-the-Loop tool for Chat mode, pushes questions
  via WebSocket callback and waits for user reply via asyncio.Future
- Token streaming: execute_stream() now uses chat_stream() instead of
  chat(), yielding token-type ReActEvents for each StreamChunk
- _build_response_from_stream() static method constructs LLMResponse
  from accumulated stream data
- Export AskHumanTool from tools/__init__.py
- 12 new tests (7 AskHumanTool + 5 token streaming), all passing
2026-06-07 23:40:43 +08:00
chiguyong 6013d5189b feat(chat): add Chat API routes with REST + WebSocket bidirectional communication 2026-06-07 22:49:26 +08:00
chiguyong 493187782c feat(session): add Session/Message models and SessionManager with InMemory/Redis stores 2026-06-07 22:43:14 +08:00
chiguyong b34b06724d fix(agentkit): resolve all P0/P1/P2/P3 issues from code review 2026-06-07 22:05:18 +08:00
chiguyong 9c04362dba feat(compression): U5 HeadroomRetrieveTool for CCR cache retrieval
Add HeadroomRetrieveTool that allows LLM to retrieve original
uncompressed data from CCR cache via Function Calling. Auto-registered
when HeadroomCompressor is active and available.
2026-06-07 18:20:17 +08:00
chiguyong 286804792d feat(compression): U4 ServerConfig compression field and Agent injection
Add compression config to ServerConfig (following telemetry pattern),
create compressor in create_app, pass through AgentPool to
ConfigDrivenAgent, and inject into ReActEngine.execute() calls.
2026-06-07 18:20:05 +08:00
chiguyong fcb4fb33f3 feat(compression): U3 ReAct engine tool result compression and incremental compress
Extend _build_tool_result_message to accept compressor parameter for
tool output compression. Add _should_compress helper for token budget
checking. Add incremental compression within ReAct loop when
conversation exceeds threshold.
2026-06-07 18:19:53 +08:00
chiguyong ea705b979b feat(compression): U2 HeadroomCompressor with SmartCrusher and CCR cache
Add HeadroomCompressor implementing CompressionStrategy Protocol with
content-type routing (JSON→SmartCrusher, code→CodeCompressor), CCR
reversible compression cache, and graceful degradation when headroom-ai
is not installed.
2026-06-07 18:19:41 +08:00
chiguyong 5d3a5f2bf3 feat(compression): U1 CompressionStrategy Protocol and create_compressor factory
Add runtime-checkable CompressionStrategy Protocol with compress(),
compress_tool_result(), and is_available() methods. Add compress_tool_result
and is_available to existing ContextCompressor. Add create_compressor()
factory function with headroom/summary provider routing and ImportError
fallback.
2026-06-07 18:19:27 +08:00
chiguyong 239009357a feat(telemetry): U7 OpenTelemetry integration with zero-dependency no-op pattern
Add telemetry module with tracing (agent/tool/llm/pipeline_step spans),
metrics (5 histograms/counters), and setup with optional OTLP exporters.
Uses no-op pattern when opentelemetry not installed. GenAI Semantic
Conventions for LLM spans. Integrated into ReActEngine, LLMGateway,
ToolBase, and FastAPI app.
2026-06-07 17:26:21 +08:00
chiguyong 03a5167366 feat(pipeline): U6 step-level retry with exponential backoff and saga compensation
Add StepRetryPolicy with jitter-based exponential backoff, SagaOrchestrator
with LIFO compensation pattern, integrate retry_policy and compensate
fields into PipelineStage/PipelineStep schema, add GEO pipeline
compensation definitions for all 7 steps.
2026-06-07 17:26:07 +08:00
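
Note on the retry and compensation behaviour named in the commit above: a rough sketch of exponential backoff with jitter and of LIFO saga compensation. The "full jitter" variant, the default base and cap values, and the function names backoff_delay and run_saga are assumptions made for illustration; the commit message does not say how StepRetryPolicy or SagaOrchestrator are actually implemented.

```python
import random


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Full-jitter delay: uniform in [0, min(cap, base * 2**attempt)] seconds."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def run_saga(steps) -> bool:
    """Run (action, compensate) pairs in order.

    If an action raises, the compensations of the steps that already
    completed are run in reverse (LIFO) order and False is returned.
    """
    completed = []
    try:
        for action, compensate in steps:
            action()
            completed.append(compensate)
    except Exception:
        for compensate in reversed(completed):
            compensate()
        return False
    return True
```

The failing step itself is not compensated here, on the assumption that it left nothing behind to undo; a real pipeline may need to treat partial side effects differently.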
chiguyong 4db637cd4f feat(pipeline): U5 state persistence with Redis hot + PG cold dual-write
Add PipelineStateMemory/Redis/PG backends, PipelineStateManager with
Redis Sorted Set hot state + PostgreSQL JSONB cold persistence.
Integrated into PipelineEngine with state persistence calls at each
step transition.
2026-06-07 17:25:52 +08:00
chiguyong 9ec1740047 feat(tools): U3 built-in Python tools - WebCrawl, SchemaExtract, SchemaGenerate
Add WebCrawlTool (Crawl4AI wrapper with graceful degradation),
SchemaExtractTool (extruct-based Schema.org extraction), and
SchemaGenerateTool (JSON-LD generation with optional pydantic-schemaorg
validation). All tools work without optional dependencies.
2026-06-07 17:25:24 +08:00
chiguyong 550d29a139 feat(mcp): U2 MCP config system and MCPManager lifecycle
Add MCPServerConfig dataclass with stdio/streamable_http/sse transport
validation, MCPManager for declarative YAML-driven MCP server lifecycle
(start_all/stop_all), tool discovery and registration. Integrated
into FastAPI lifespan startup/shutdown.
2026-06-07 17:25:07 +08:00
chiguyong 66b9217569 feat(mcp): U1 StdioTransport for subprocess-based MCP communication
Add StdioTransport class supporting stdio JSON-RPC over subprocess
stdin/stdout with asyncio.create_subprocess_exec, pending futures
for request/response matching, and stderr forwarding.
2026-06-07 17:24:52 +08:00
chiguyong 83cdddd199 feat(evaluation): U9 Ragas evaluation pipeline for RAG quality assessment
- RagasEvaluator: LLM-as-Judge evaluation with ragas lib or built-in fallback
- EvalDatasetBuilder: from traces or dict list
- EvalMetrics: faithfulness, answer_relevancy, context_precision, context_recall
- Built-in heuristic evaluation using keyword overlap and Jaccard similarity
- 13 tests passing
2026-06-06 22:49:27 +08:00
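
Note on the built-in heuristic named in the commit above: a minimal sketch of Jaccard similarity over word sets, the kind of measure a keyword-overlap fallback could use when the ragas library is unavailable. The tokenizer and the function names are assumptions for illustration, not the repository's RagasEvaluator code.

```python
import re


def tokens(text: str) -> set:
    """Lower-cased word tokens; a deliberately simple tokenizer."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """|A ∩ B| / |A ∪ B| over the two token sets, 0.0 if both are empty."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)
```

For example, "the cat sat" and "the cat ran" share two of four distinct tokens, giving 0.5. Note that `\w+` treats an unbroken run of CJK characters as a single token, so Chinese text would need a proper segmenter.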
chiguyong 9753a08ac8 feat(llm): U8 Chinese LLM providers - Wenxin, Doubao, Yuanbao
- WenxinProvider: Baidu ERNIE via Qianfan v2 OpenAI-compatible API, AK/SK token auth
- DoubaoProvider: ByteDance Doubao via Volcengine Ark API
- YuanbaoProvider: Tencent Hunyuan via OpenAI-compatible API with enhancement mode
- All inherit from OpenAICompatibleProvider for retry/circuit breaker support
- 16 tests passing
2026-06-06 22:46:53 +08:00
chiguyong 34e083abde feat(evolution): U7 multi-objective fitness and extended strategy space
- MultiObjectiveFitness: weighted scoring, NSGA-II Pareto ranking, crowding distance
- FitnessWeights: configurable accuracy/latency/cost weights with auto-normalization
- ExtendedStrategyTuner: multi-dim Bayesian optimization (temperature, max_iterations, top_k, retrieval_mode)
- ExtendedStrategyConfig: expanded parameter space
- 20 tests passing
2026-06-06 22:42:54 +08:00
chiguyong d5998aaddd feat(evolution): U6 GEPA genetic algorithm evolution framework
- PromptChromosome: instructions + demos + constraints gene segments
- CrossoverOperator: paragraph-level text, demo, constraint crossover
- MutationOperator: LLM-driven instruction mutation + demo/constraint mutation
- GEPAPopulation: tournament selection, elite preservation, Pareto front
- FitnessScore: multi-objective (accuracy, latency, cost) with Pareto dominance
- 29 tests passing
2026-06-06 22:38:55 +08:00
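
Note on the Pareto dominance mentioned in the commit above: a small sketch over the three objectives the message lists (accuracy, latency, cost). The Score class and both functions are hypothetical stand-ins, not the repository's FitnessScore; the sketch assumes accuracy is maximized while latency and cost are minimized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Score:
    accuracy: float  # higher is better
    latency: float   # lower is better
    cost: float      # lower is better


def dominates(a: Score, b: Score) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    no_worse = a.accuracy >= b.accuracy and a.latency <= b.latency and a.cost <= b.cost
    better = a.accuracy > b.accuracy or a.latency < b.latency or a.cost < b.cost
    return no_worse and better


def pareto_front(scores: list) -> list:
    """Scores that no other score dominates, in input order."""
    return [s for s in scores if not any(dominates(o, s) for o in scores)]
```

This front computation is quadratic in the population size, which is usually acceptable for small prompt populations.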
chiguyong 1390bd8d6e feat(skills): U5 GEO Pipeline orchestration with DAG execution
- GEOPipeline: YAML-driven DAG pipeline with parallel/sequential execution
- PipelineStep with input_mapping ($.input.xxx, $.steps.name.output.xxx)
- Topological sort for execution groups, SharedWorkspace integration
- geo_full_pipeline.yaml: detect→analyze→optimize→track workflow
- 10 tests passing
2026-06-06 22:34:24 +08:00
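
Note on the topological sort named in the commit above: a sketch of how a dependency map can be split into execution groups whose members may run in parallel. The function name, the shape of the input, and the example step dependencies are assumptions for illustration; the actual GEOPipeline code may differ.

```python
def execution_groups(deps: dict) -> list:
    """Layered topological sort (Kahn-style).

    deps maps each step name to the set of step names it depends on.
    Returns a list of groups; every step in a group has all of its
    dependencies in earlier groups.
    """
    remaining = {step: set(d) for step, d in deps.items()}
    groups = []
    while remaining:
        ready = sorted(step for step, d in remaining.items() if not d)
        if not ready:
            # Also raised when a step depends on a name that is not in deps.
            raise ValueError("circular or unresolved dependency")
        groups.append(ready)
        for step in ready:
            del remaining[step]
        for d in remaining.values():
            d.difference_update(ready)
    return groups
```

With a hypothetical map in which analyze depends on detect, and both optimize and track depend on analyze, the result is three groups: detect, then analyze, then optimize and track together.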
chiguyong 23934602c0 feat(core): U4 multi-agent Orchestrator with SharedWorkspace
- Orchestrator: Orchestrator-Worker pattern with LLM-driven task decomposition
- SharedWorkspace: Redis-backed shared state with versioning and distributed locks
- SubTask dependency graph, parallel group building, result aggregation
- 16 tests passing
2026-06-06 22:25:12 +08:00
chiguyong f16dcb5ebe feat(memory): U2 Contextual Retrieval - LLM-generated context prefixes for chunks
- ContextualChunker: generates context prefixes per chunk via LLM
- Integrated into HttpRAGService ingest with contextual_chunking option
- Caching, batch processing, graceful LLM failure handling
- 12 tests passing
2026-06-06 22:19:02 +08:00
chiguyong a6c9babfdc feat(memory): U1 RAG self-correction loop (CRAG)
- RelevanceScorer: keyword overlap + query coverage + retrieval score + length penalty
- RAGSelfCorrectionLoop: state machine driven retrieve-evaluate-correct-degrade cycle
- Integrated into MemoryRetriever with enable_self_correction option
- 21 tests passing
2026-06-06 22:16:23 +08:00
chiguyong 6e362a8ae7 feat(agentkit): Phase 4 enterprise production upgrade — 12 Implementation Units
Phase A (P0): EpisodicMemory pgvector search+EmbeddingCache, ReAct timeout+CancellationToken, evolution system fix (A/B test+LLMPromptOptimizer+StrategyTuner), AnthropicProvider native Messages API
Phase B (P1): RetryPolicy+CircuitBreaker, chat_stream fallback chain, WebSocket endpoint, SSE stream fix, Evolution+Memory API routes (7 endpoints), embedding cache+Enhanced Search per-KB degradation fix
Phase C (P2): GeminiProvider native generateContent API, Agent state lock+config hot-reload

Tests: 1301 passed, 18 skipped, 0 failed
2026-06-06 21:51:04 +08:00
chiguyong e33dc25ad3 feat(memory): RAG pipeline optimization — 5 Implementation Units
U1: QueryTransformer — LLM/rule-based query rewriting + sub-query decomposition
U2: HttpRAGService enhanced_search() — rerank + compression via /bases/{kb_id}/retrieve
U3: Structured context injection — source attribution headers in RAG results
U4: RetrieveKnowledgeTool — built-in tool for mid-reasoning knowledge retrieval
U5: Configurable retrieval params + per-KB weights + CJK token estimation

Config example:
  memory:
    retrieval:
      top_k: 5
      token_budget: 2000
      context_template: structured
    query_transform:
      enabled: true
      strategy: llm
    semantic:
      search_mode: enhanced
      use_rerank: true
      kb_weights:
        industry-kb-id: 1.2
        enterprise-kb-id: 0.8

Tests: 1037 passed, 18 skipped, 0 failed
2026-06-06 19:27:09 +08:00
chiguyong cd5b39087e feat(memory): add HttpRAGService for config-driven knowledge base integration 2026-06-06 18:36:05 +08:00
chiguyong 0456429beb fix(review): address all 14 P2 advisory findings 2026-06-06 18:20:46 +08:00
chiguyong 8620751864 fix(review): address P0+P1 findings from Tier 2 code review
P0: MemoryRetriever.retrieve score mutation fix
P1: Redis atomic Lua script, deprecated API fix, SQLite WAL mode,
Redis URL masking, UniqueConstraint, TraceRecorder completed flag,
EpisodicMemory recall improvement, LLMReflector sanitization,
A/B test safety, generator cleanup, ContextCompressor guards,
OpenAIEmbedder reuse, Pipeline failure handling, Metrics O(1),
Health check Redis PING, CLI skill loading, CORS config,
API key direct pass-through

Tests: 924 passed, 18 skipped, 0 failed
2026-06-06 17:57:47 +08:00
chiguyong f858d279f3 feat(agentkit): Phase 3 upgrade - persistence, memory, evolution, observability
10 Implementation Units across 3 phases:

Phase A - Infrastructure:
- U1: RedisTaskStore with Redis/memory backend + factory function
- U2: TraceRecorder for execution trace recording
- U3: PersistentEvolutionStore with SQLite backend

Phase B - Core Capabilities:
- U4: MemoryRetriever integration into ReAct engine
- U5: Embedder abstraction + EpisodicMemory vector search
- U6: LLMReflector for LLM-in-the-loop reflection
- U7: SkillPipeline for multi-skill orchestration

Phase C - Enhancement:
- U8: SKILL.md format + progressive disclosure levels
- U9: ContextCompressor + prompt cache rendering
- U10: Structured logging + metrics endpoint + enhanced health check

Tests: 924 passed, 18 skipped, 0 failed
2026-06-06 17:17:45 +08:00
chiguyong 74e2223153 feat(cli): pair command + doctor rename + client config priority
- health → doctor (better naming)
- agentkit pair --name <client> generates ak_live_ API key
- agentkit pair --list / --revoke for client management
- ClientConfig class: client config > init defaults > hardcoded
- README updated with pair usage + business system pairing guide
- 38 CLI tests passing
2026-06-06 13:08:14 +08:00
chiguyong b2709da08b feat(cli): AgentKit CLI with serve/version/health/task/skill/init/usage
U1: CLI framework (Typer) + serve/version/health commands + __main__.py + pyproject scripts
U2: task command group (submit/status/list/cancel) with remote mode
U3: skill command group (list/load/info) with local and remote modes
U4: init command (generates agentkit.yaml/.env.example/docker-compose/skills) + usage command

31 tests passing, TDD workflow.
2026-06-06 12:45:51 +08:00
chiguyong acec8ff743 feat(evolution): Phase A - lifecycle hooks + EvolutionConfig
U11: EvolutionMixin integrated into ConfigDrivenAgent lifecycle
  - on_task_complete triggers evolve_after_task
  - on_task_failed records failure patterns
  - Evolution errors never break main task flow

U12: EvolutionConfig added to SkillConfig
  - enabled, reflect_on_failure, auto_apply, min_quality_threshold
  - Backward compatible: defaults to enabled=False

21 new tests passing, no regression.
2026-06-06 12:05:56 +08:00
chiguyong 2844eeb548 feat(streaming): Phase C - LLM streaming + ReAct events + SSE endpoint
U8: StreamChunk protocol + OpenAI chat_stream + Gateway streaming with usage tracking
U9: ReActEvent dataclass + execute_stream() yielding thinking/tool_call/tool_result/final_answer
U10: POST /tasks/stream SSE endpoint + Client SDK stream_task()

15 new tests passing, no regression.
2026-06-06 11:54:17 +08:00
chiguyong ec0e221beb feat(server): Phase D - async task system (TaskStore + BackgroundRunner + API)
U5: TaskStore - in-memory task state with TTL cleanup and max records
U6: BackgroundRunner - async task execution with semaphore concurrency control
U7: Task status/result API + cancel endpoint + async submit mode

45 tests passing (28 new + 17 existing, no regression).
2026-06-06 11:39:41 +08:00
chiguyong 5f1c51cf9a feat(server): Phase B - auth, rate limiting, SSRF protection, handler whitelist
U1: API Key authentication middleware (dev mode skip, health whitelist)
U2: Rate limiting middleware (fixed-window, 60 req/min default)
U3: Callback URL SSRF protection (private IP blocking)
U4: custom_handler module prefix whitelist

65 tests passing. CORS conflict fixed.
2026-06-05 23:37:36 +08:00
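
Note on U2 and U3 in the commit above: a minimal sketch of a fixed-window rate limiter (60 requests per minute by default, as the message states) and of blocking callback URLs that point at private addresses. Class and function names are hypothetical, and the check below only covers IP literals; a real SSRF guard would also have to resolve hostnames and check the resolved addresses.

```python
import ipaddress
from urllib.parse import urlsplit


class FixedWindowLimiter:
    """Counts requests per client inside fixed, non-overlapping windows."""

    def __init__(self, limit: int = 60, window_seconds: int = 60) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._counts = {}  # (client, window index) -> request count

    def allow(self, client: str, now: float) -> bool:
        key = (client, int(now // self.window_seconds))
        count = self._counts.get(key, 0)
        if count >= self.limit:
            return False
        self._counts[key] = count + 1
        return True


def is_blocked_callback(url: str) -> bool:
    """True for URLs without a host or with a private/loopback/link-local IP literal."""
    host = urlsplit(url).hostname
    if host is None:
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # a hostname, not an IP literal; not resolved in this sketch
    return ip.is_private or ip.is_loopback or ip.is_link_local
```

The sketch never evicts counters for old windows; a long-running server would need to prune them or keep them in a store with expiry.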
chiguyong f87b790c0f feat(agentkit): v2 Phase 1 - ReAct/LLM Gateway/Skill/Server + review fixes
535 unit + 52 integration tests passing. README added.
2026-06-05 23:32:16 +08:00
chiguyong 133ae3927e test(u8): add GEO business integration tests for YAML configs and agent creation
41 tests covering:
- 8 YAML config file loading and validation
- ConfigDrivenAgent creation from YAML (llm_generate/tool_call/custom)
- Custom handler routing (citation/monitor/schema)
- Tool registration completeness
- Adapter compatibility (unique names/types, no task overlap)
2026-06-05 17:25:51 +08:00
chiguyong 35f84fd770 test(orchestrator): add multi-agent collaboration integration tests
- Pipeline parallel execution with topological grouping
- Conditional stages (execute/skip)
- Circular dependency detection
- Variable resolution ${var.path}
- Handoff message creation and serialization
- DynamicPipeline: conditional, loop, nested
- End-to-end: content production pipeline with handoff
- 18 new tests, total 148 passing
2026-06-04 22:59:29 +08:00
chiguyong 96ea0c2972 feat(mcp,evolution): add Transport layer and Evolution lifecycle integration
U5 - MCP Transport:
- Transport abstract base class with connect/disconnect/send_request
- HTTPTransport: Streamable HTTP with JSON-RPC 2.0
- SSETransport: Server-Sent Events + HTTP POST hybrid
- MCPClient: from_transport() factory method
- 31 transport tests

U6 - Evolution Lifecycle:
- EvolutionMixin: reflect → optimize → A/B test → apply/rollback
- EvolutionLogEntry: tracks each evolution step
- Integrates with BaseAgent on_task_complete hook
- 10 lifecycle tests

Total: 130 tests passing
2026-06-04 22:55:23 +08:00
chiguyong cc6a858150 test(memory): add memory system tests with BaseAgent lifecycle integration
- InMemoryMemory test implementation
- SemanticMemory: RAG + graph search tests
- MemoryRetriever: weight-based ranking, token budget
- BaseAgent lifecycle: on_task_start loads, on_task_complete stores, on_task_failed records
- 19 new tests, total 89 passing
2026-06-04 22:44:27 +08:00
chiguyong d73a3391ab feat(tools): add MCPTool, SequentialChain, ParallelFanOut, DynamicSelector
- MCPTool: call remote MCP tools via MCPClient
- SequentialChain: chain tools with output-to-input piping
- ParallelFanOut: execute tools concurrently with merge strategies
- DynamicSelector: keyword/LLM-based tool selection
- 14 new tests, total 70 passing
2026-06-04 22:42:22 +08:00
chiguyong 5a90824c77 feat(core): add ConfigDrivenAgent with YAML-driven agent definition
- AgentConfig: YAML/dict config model with validation
- ConfigDrivenAgent: 3 task modes (llm_generate, tool_call, custom)
- StandaloneRunner: auto-discover YAML configs and build agents
- 25 new tests covering all modes and edge cases
- Total: 56 tests passing
2026-06-04 22:39:25 +08:00
chiguyong 2ddffcdf37 chore: add .gitignore and remove cached files 2026-06-04 22:28:44 +08:00
chiguyong cc3dfd44e3 fix: switch to setuptools for Python 3.14 compatibility 2026-06-04 22:27:06 +08:00
chiguyong 9a6d6fee4e feat: initial fischer-agentkit package with unified agent architecture
- BaseAgent with handle_task() pattern (execute template moved up)
- Protocol: TaskMessage, TaskResult, HandoffMessage, EvolutionEvent
- Tool system: FunctionTool, AgentTool, ToolRegistry with versioning
- Memory system: WorkingMemory (Redis), EpisodicMemory (pgvector), SemanticMemory (RAG adapter), MemoryRetriever (hybrid)
- Evolution engine: Reflector, PromptOptimizer (DSPy-style), StrategyTuner, ABTester, EvolutionStore
- Orchestrator: PipelineEngine (parallel DAG), PipelineLoader (YAML), HandoffManager, DynamicPipeline
- MCP: Server (FastAPI), Client (httpx), MCPTool
- Prompts: PromptTemplate, PromptSection
- Exceptions: full hierarchy including Tool, Schema, Handoff, Evolution errors
- Tests: unit tests for core, tools, protocol, evolution, pipeline
2026-06-04 22:24:06 +08:00