Commit Graph

43 Commits

Author SHA1 Message Date
chiguyong cac9c73dd5 fix(routing): U1-U6 路由优化 + 修复方案 + 代码审查修复
实现 6 个修复单元(U1-U6)并应用 ce-code-review 发现的 5 项安全修复。

## U1: benchmark 超时阈值
- 按 difficulty 分级超时:easy=45s, medium=60s, hard=90s
- 替换原单一 60s 硬编码

## U2: OpenAICompatibleProvider httpx 超时
- 新增 timeout 参数(默认 120s),替换硬编码 60s
- ProviderConfig.timeout 透传到 Provider
- 新增 2 项单元测试

## U3: 激活 QualityGate skill_match 校验
- BaseAgent._build_skill_context() 构造 skill_context
- 在 base.py / tasks.py / runner.py 三处传入 QualityGate.validate()

## U4: 添加 disambiguation_keywords 字段
- IntentConfig 新增 disambiguation_keywords 字段
- 8 个 skill YAML 补充该字段

## U5: 优化 RequestPreprocessor 路由正则
- 拆分 _FACTUAL_RE 为 CN/EN 双正则(中文无空格)
- 新增 _MATH_RE / _TRANSLATION_RE 纯模式
- _TOOL_CONTEXT_RE 排除需要工具的实时查询
- 多行输入守卫 + 结尾标点支持
- 新增 21 项单元测试(共 40 项全通过)

## U6: 重新基准测试
- 真实 LLM benchmark:准确率 60% -> 93.3%
- 4/5 通过,p50=40.8s,一致性=100%
- 旧基线备份至 baseline_2026-06-17_old_arch.json

## ce-code-review 修复(5 项)
- 修复 \s 字符类匹配换行符的安全隐患
- 添加事实/数学正则的结尾标点支持
- 修复 geo_optimizer.yaml 关键词重复
- 修复 _login_with_retry 不可达 return
- 修复 real_llm_server fixture stderr_fh 资源泄漏

测试:tests/unit/chat/ 63 项全通过,ruff 检查通过。
2026-06-20 19:31:49 +08:00
chiguyong 91f56ca663 feat: 企业级客户端-服务端架构 + 代码审查修复
## 主要变更

### 新增功能
- 企业级客户端-服务端架构(JWT 认证 + RBAC 权限 + 终端安全)
- Tauri 桌面客户端与服务端配置同步
- 远程 LLM 网关(RemoteLLMProvider,支持 401 token 刷新重试)
- 服务端终端 WebSocket(带管理员审批流程)
- 终端白名单六层防御(黑名单 → shell 操作符检测 → 内置安全 → 全局/用户/会话白名单 → 危险检测)

### 代码审查修复(P0/P1/P2)
- P0: 危险二进制(rm/docker 等)不再加入白名单,compute_whitelist_entry 返回 None
- P1: 终端审批所有权追踪(_approval_owners dict)+ 会话清理防泄漏
- P1: 本地终端 WebSocket URL 补齐 JWT token
- P1: 审计日志支持 terminal_mode 过滤
- P1: /system/resources 端点强制 SYSTEM_CONFIG 权限
- P1: RemoteLLMProvider 增加 401 token 刷新重试机制
- P1: auth/models.py 使用 Mapping[str, object] 替代 Any 类型
- P2: 终端授权依赖检查 is_active 账户状态
- 修复 app.py 未使用的 APIKeyAuthMiddleware 导入

### 文档更新
- README.md: 新增第 16 章「企业级客户端-服务端架构」
- AGENTS.md / CLAUDE.md: 同步模块映射、路由表、前端页面
- 计划文档标记为 completed

Closes: docs/plans/2026-06-19-003-feat-enterprise-client-server-evolution-plan.md
2026-06-20 06:48:18 +08:00
chiguyong dddcbd24e3 feat: 私董会讨论模式 + 回测集成 + WS持久化修复
私董会讨论模式 (Board Meeting Mode):
- BoardRouter: @board 前缀路由, 专家名验证, 模板回退
- BoardTeam: 讨论容器, 状态机 (FORMING->DISCUSSING->CONCLUDING->COMPLETED)
- BoardOrchestrator: 多轮自主循环讨论引擎, 主持人小结, 停止命令检测
- 9个预设名人专家 YAML (马斯克/贝佐斯/张小龙/芒格等)
- 前端 BoardStatusView 群聊式 UI + WebSocket 事件处理
- 后端 chat.py 集成 @board 路由到主聊天流程

回测集成:
- benchmark.py: 新增 board_meeting 维度 (18 tasks, 6 categories)
- benchmark_dataset.py: 新增 BOARD_BENCHMARKS (11 E2E cases)
- test_board_backtest.py: 66 个回测测试 (9 test classes)

Bug 修复:
- resolve_expert_configs: deep-copy 防止 is_lead 修改污染共享模板
- 所有专家名无效时回退到默认模板
- board_router: 非匹配路径 topic 未 strip
- benchmark_dataset: board-name-invalid-001 输入修正

WebSocket 持久化修复:
- chat.py: 三层防御机制确保任务结果不丢失
- chat store: 断线恢复逻辑

部署配置:
- Gitea Actions CI/CD workflow
- docker-compose.deploy.yaml 部署编排
- scripts/deploy.sh 自动化部署脚本

测试结果: 120 单元测试通过, 71 benchmark 测试 100% 通过, ruff 全部通过
2026-06-17 23:52:53 +08:00
chiguyong ecf87391a5 feat: integrate SQ/EQ into portal WebSocket and CLI (Phase 4)
- app.py: initialize EventQueue + SubmissionQueue in app.state, close on shutdown
- portal.py: emit unified events (task.created/started/completed/failed,
  turn.thinking/tool_call/tool_result/final_answer) to EQ alongside WebSocket messages
- cli/chat.py: optional --event-queue flag for event emission
- EQ is bypass-only: emit failures never affect WebSocket or CLI main flow
- WebSocket message format unchanged (backward compatible)

Tests: 650 passed, 0 failed, 4 skipped
2026-06-17 11:05:04 +08:00
chiguyong 773a62ead2 refactor: remove IntentRouter from tasks.py, delete legacy ConversationStore
- tasks.py: replace IntentRouter.route() with default agent fallback (REACT mode)
- app.py: remove IntentRouter import and initialization
- portal.py: delete legacy in-memory ConversationStore class (~120 lines),
  SqliteConversationStore is the sole implementation now
- Remove unused SessionManager import from portal.py

Tests: 622 passed, 0 failed
2026-06-17 10:50:41 +08:00
chiguyong 5374bc8501 refactor: eliminate routing layer, align with industry best practices
Phase 1 of architecture optimization (U1/U2/U4/U8):

- U1: Rename SimpleRouter to RequestPreprocessor, route() to preprocess()
  Eliminates misleading routing concept; LLM decides autonomously
  in REACT agent loop (matches Codex/Claude Code/Trae pattern)
- U2: Delete CostAwareRouter, HeuristicClassifier, SemanticRouter
  (~700 lines removed). skill_routing.py: 1688 to 220 lines
- U4: PlanExecEngine defaults to ReActStepExecutor, delete _LLMStepExecutor
  (pure LLM calls without tools = no execution capability)
- U8: ReActEngine defaults to ContextCompressor(keep_recent=10)

Supersedes plans 2026-06-15-002/003/004.
New plan: 2026-06-16-006-refactor-architecture-optimization-evolution-plan.md
2026-06-17 10:44:40 +08:00
chiguyong b54213b3c6 fix(review): resolve all P0/P1/P2 findings from code review 2026-06-16 09:08:03 +08:00
chiguyong 2c5e90104d feat: message persistence, traceability and empty response auto-retry 2026-06-16 08:13:22 +08:00
chiguyong 87c59bb3e2 feat(tools): add SkillSearchTool and improve skill_install workflow
Add skill_search tool so agent can search for skills before installing.
Update skill_install description to guide LLM to search first.
Update system prompt to use skill_search -> skill_install flow.
This fixes the issue where agent returns empty when asked to find a skill.
2026-06-16 07:52:04 +08:00
chiguyong c4257591d4 refactor(router): replace CostAwareRouter with SimpleRouter and prompt-based tool calling 2026-06-16 03:31:05 +08:00
chiguyong a27eed3714 fix(config): unify config loading chain and protect ${VAR} references
- Settings API: reverse-resolve env vars to preserve ${VAR} refs in yaml,
  write new API keys to .env instead of agentkit.yaml, extract env_key
  from existing ${VAR} reference when updating providers
- Onboarding: merge-update instead of overwrite when config exists,
  use config_arg to determine output path, .env merge instead of overwrite
- Unified templates: bailian-coding provider name, full model_aliases,
  docker-compose with postgres, expanded .env.example
- Optional ruamel.yaml for comment/format preservation in Settings API
- clients.yaml: add _deep_resolve for ${VAR} env var references
- All CLI commands use load_config_with_dotenv() consistently
- Tests: mock find_config_path and CWD auto-discovery to avoid env leaks
2026-06-16 00:26:54 +08:00
chiguyong 99fe4c99f7 fix: comprehensive code review fixes + WS test stability 2026-06-15 08:17:34 +08:00
chiguyong 0ccef7be5c feat: P0 production hardening — LLM cache, semantic routing, state persistence
U1: LLM Cache Core (exact + semantic match, InMemory + Redis backends)
U2: Cache integration into LLMGateway with CacheConfig
U3: Semantic Router as Layer 1.5 in CostAwareRouter
U4: UsageStore persistence (Redis Hash + InMemory fallback)
U5: CascadeStateStore persistence (Redis INCR + InMemory TTL)
U6: EvolutionStore interface unification (Protocol + PostgreSQL backend)
U7: Configuration integration + E2E tests

Code review fixes:
- P0: date iteration bug (day>=28), semantic router index never built,
      Redis connection leak (per-call → persistent pool)
- P1: cache degradation recovery, semantic_search degradation,
      double miss counting, asyncio.Lock for PG init, LIMIT on queries,
      __import__ anti-pattern → _utcnow()
- P2: InMemory TTL cleanup, embedding preservation on put(),
      data TTL = max(exact_ttl, semantic_ttl)
2026-06-14 15:16:00 +08:00
chiguyong 6945b78c55 fix(server): 自动发现 CWD 下的 agentkit.yaml 和 .env
之前 create_app 只在 AGENTKIT_CONFIG_PATH 环境变量设置时才加载配置和 .env,
导致 uvicorn 直接启动时 LLM provider 未注册(No provider registered)。
现在当 AGENTKIT_CONFIG_PATH 未设置时,自动查找 CWD 下的 agentkit.yaml,
并加载同目录的 .env 文件注入环境变量。
2026-06-13 11:40:26 +08:00
chiguyong 5b63214bc1 fix(gui): address all P1 code review findings
- AgentLayout: lazy-load views via defineAsyncComponent, wire route meta to quadrant tab switching
- QuadrantPanel: ARIA tablist/tab/tabpanel roles, keyboard nav, v-if via computed, expose setActiveTab
- SplitPane: touch support, keyboard resize, ARIA separator role
- ChatMessage: DOMPurify sanitization, anchor toolCalls regex to line start
- TerminalEmulator: fix ANSI span imbalance with depth tracking
- theme.ts: read CSS custom properties at runtime via readToken()
- responsive.css: fix bottom-right auto-collapse selector
- app.py: path traversal protection, exclude docs/openapi.json
- skills.ts: use BaseApiClient.request() for installSkill/uninstallSkill
2026-06-13 10:01:26 +08:00
chiguyong 4d051c2f25 feat(gui): add transitions, responsive breakpoints and bug fixes (U7)
- Add transitions.css with fade, slide, collapse, scale, stagger-list, skeleton-pulse, pulse-dot animations
- Add responsive.css with breakpoints (≥1440px full, 1280-1439px compact, <1280px prompt)
- Add small-screen prompt in AgentLayout with DesktopOutlined icon
- Fix SPA serving in app.py for Vue build output
- Fix TypeScript errors in kb.ts, skills.ts, workflow.ts, FlowCanvas.vue, SideNav.vue
- Fix unused imports in ExperienceTimeline, PathOptimizerPanel, PitfallPanel
2026-06-13 02:47:51 +08:00
chiguyong 09698d7a06 feat: frontend productization with code review fixes
- Workflow: visual canvas, undo/redo, drag-and-drop, real-time execution WebSocket
- Evolution: dashboard, ECharts metrics, experience timeline, pitfall warnings, usage panel
- KB: source CRUD, document upload, search test
- Terminal: interactive PTY WebSocket, whitelist security
- Security: hmac.compare_digest, API key auth on all endpoints, whitelist bypass fix
- Fixes: ECharts async init, WebSocket intentional disconnect, TOCTOU race, Pydantic models
2026-06-13 01:29:58 +08:00
chiguyong 5ef08a3b30 fix(review): comprehensive P0-P2 code review fixes 2026-06-12 22:18:25 +08:00
chiguyong a36bc3d1c1 feat: optimize chat response speed for sub-1s first token latency
- Add HeuristicClassifier to replace LLM quick_classify with zero-cost
  local heuristic (keyword/length/code-pattern scoring), gated by
  router.classifier config (default: heuristic)
- Add parallel tool execution in ReActEngine via asyncio.gather for
  multiple independent tool_calls, gated by parallel_tools param
- Add AsyncWriteQueue for non-blocking session persistence with WAL
  buffer, gated by async_writes param on SessionManager
- Add httpx.Limits connection pool config to all LLM providers
- Add router config section to ServerConfig and agentkit.yaml
- All optimizations have config switches for safe rollback
2026-06-12 13:15:06 +08:00
chiguyong ec51dbb259 feat: optimize劣势项 — 拍卖开关/审计采样/线程安全/评分锚定
1. 拍卖机制: 已有配置开关(marketplace.auction_enabled), 默认关闭
2. LLM审计采样: 新增 audit_sample_rate (0.0-1.0), 默认1.0, 可降低审计频率
3. AlignmentConfig.from_dict: 忽略未知键, 防止YAML额外字段崩溃
4. 配置热重载线程安全: 用 threading.Event 替代布尔标志, 消除数据竞态
5. Reflexion评分锚定: 添加评分维度(Completeness/Correctness/Clarity)和锚定点
2026-06-11 13:04:36 +08:00
chiguyong bba394be38 fix(marketplace): address code review findings
- Fix str.format() crash when user input contains curly braces
- Fix Layer 2 passing str to find_best_agent (expects list[str])
- Fix AlignmentGuard fail-open on LLM audit failure (now fail-closed)
- Fix _config_reload_lock not initialized in create_app()
- Fix evolve_soul redundant reflector.reflect() call (reuse existing reflection)
- Fix test mocks using AsyncMock for sync find_best_agent method
- Remove unused _COMPLEXITY_CLASSIFY_PROMPT constant
2026-06-10 19:21:40 +08:00
chiguyong 8713636d50 feat(marketplace): add Phase B/C - CostAwareRouter, OrganizationContext, AlignmentGuard, Soul Evolution, Auction, Server Integration
Phase B:
- U1: CostAwareRouter with 3-layer routing (rule/LLM/capability matching)
- U6: OrganizationContext with agent profiles and capability-based discovery
- U7: AlignmentGuard with constraint injection and cascade detection

Phase C:
- U8: Soul dynamic evolution with version tracking and reflection-triggered updates
- U9: Auction mechanism as optional advanced routing mode
- U10: Server integration + end-to-end integration tests

250 new tests passing across all units.
2026-06-10 19:09:02 +08:00
chiguyong 6852dfe892 fix(security,reliability): resolve all P2 findings from code review 2026-06-10 15:05:40 +08:00
chiguyong 658e188939 fix(review): resolve P0/P1 findings from final code review 2026-06-10 09:57:29 +08:00
chiguyong 7874e875af merge: integrate feat/agentkit-phase8-chat-adaptive (chat/gui commands + GUI mode)
Restores agentkit chat, agentkit gui CLI commands, onboarding wizard,
and GUI mode (AGENTKIT_GUI_MODE) with static file serving.
Resolves merge conflicts in orchestrator.py, app.py, tools/__init__.py, shell.py.
2026-06-10 07:44:06 +08:00
chiguyong c606ffa64a feat(phase5): implement management pages, evolution dashboard, and workflow editor (U13b/U13c/U14) 2026-06-10 01:29:01 +08:00
chiguyong a1deeecede feat(phase5): implement Vue3 portal foundation with chat interface and routing (U13a)
- Add Portal API routes: chat, stream, capabilities, conversations, WebSocket
- Add ConversationStore for in-memory conversation management
- Add CAPABILITY_CATEGORIES mapping for 8 capability types
- Create Vue3 SPA with TypeScript, Pinia, Vue Router, Ant Design Vue
- Implement ChatView with message bubbles, input, sidebar, WebSocket support
- Add side navigation skeleton for all 8 capability sections
- Add placeholder views for workflow, knowledge, skills, terminal, etc.
- 31 backend tests passing
2026-06-10 01:06:48 +08:00
chiguyong 31bd3b126c feat(phase8): chat adaptive enhancements, pipeline reflection, search tools upgrade
- Enhanced chat CLI with adaptive mode and session management
- Added pipeline reflection and schema extensions
- Upgraded BaiduSearch and WebSearch tools with advanced capabilities
- Expanded server routes for skills and chat
- Added session store enhancements
- New chat module and pipeline reflection support
2026-06-09 23:18:06 +08:00
chiguyong 45283d31e8 feat(core): integrate MessageBus into Orchestrator and AgentPool (U7)
- Orchestrator accepts optional message_bus parameter; workers publish
  task.progress messages via MessageBus after each subtask execution
- AgentPool accepts optional message_bus; auto-registers agents on
  create and auto-unregisters on remove
- app.py initializes MessageBus from config and injects into AgentPool
- ServerConfig adds bus configuration field
- 5 new tests, all passing
2026-06-08 00:03:40 +08:00
chiguyong 6013d5189b feat(chat): add Chat API routes with REST + WebSocket bidirectional communication 2026-06-07 22:49:26 +08:00
chiguyong b34b06724d fix(agentkit): resolve all P0/P1/P2/P3 issues from code review 2026-06-07 22:05:18 +08:00
chiguyong 286804792d feat(compression): U4 ServerConfig compression field and Agent injection
Add compression config to ServerConfig (following telemetry pattern),
create compressor in create_app, pass through AgentPool to
ConfigDrivenAgent, and inject into ReActEngine.execute() calls.
2026-06-07 18:20:05 +08:00
chiguyong 550d29a139 feat(mcp): U2 MCP config system and MCPManager lifecycle
Add MCPServerConfig dataclass with stdio/streamable_http/sse transport
validation, MCPManager for declarative YAML-driven MCP server lifecycle
(start_all/stop_all), tool discovery and registration. Integrated
into FastAPI lifespan startup/shutdown.
2026-06-07 17:25:07 +08:00
chiguyong 24e501f745 fix(core): U10 Agent status lock timeout and config hot-reload audit
- Added _acquire_status_lock with timeout (30s) to prevent deadlocks
- Added _release_status_lock for safe lock release
- Added config_version tracking on BaseAgent
- Config hot-reload now increments version and propagates to agents
- Audit logging with config version in _on_config_change
2026-06-06 22:52:51 +08:00
chiguyong 364fe6bd6d feat(memory): U3 EpisodicMemory ORM integration - EpisodeModel and session factory
- EpisodeModel ORM model with pgvector embedding support
- create_episodic_session_factory for async PostgreSQL sessions
- Server app.py now resolves session_factory from database_url config
- Graceful fallback when database_url not configured
2026-06-06 22:21:00 +08:00
chiguyong 6e362a8ae7 feat(agentkit): Phase 4 enterprise production upgrade — 12 Implementation Units
Phase A (P0): EpisodicMemory pgvector search+EmbeddingCache, ReAct timeout+CancellationToken, evolution system fix (A/B test+LLMPromptOptimizer+StrategyTuner), AnthropicProvider native Messages API
Phase B (P1): RetryPolicy+CircuitBreaker, chat_stream fallback chain, WebSocket endpoint, SSE stream fix, Evolution+Memory API routes (7 endpoints), embedding cache+Enhanced Search per-KB degradation fix
Phase C (P2): GeminiProvider native generateContent API, Agent state lock+config hot-reload

Tests: 1301 passed, 18 skipped, 0 failed
2026-06-06 21:51:04 +08:00
chiguyong e33dc25ad3 feat(memory): RAG pipeline optimization — 5 Implementation Units
U1: QueryTransformer — LLM/rule-based query rewriting + sub-query decomposition
U2: HttpRAGService enhanced_search() — rerank + compression via /bases/{kb_id}/retrieve
U3: Structured context injection — source attribution headers in RAG results
U4: RetrieveKnowledgeTool — built-in tool for mid-reasoning knowledge retrieval
U5: Configurable retrieval params + per-KB weights + CJK token estimation

Config example:
  memory:
    retrieval:
      top_k: 5
      token_budget: 2000
      context_template: structured
    query_transform:
      enabled: true
      strategy: llm
    semantic:
      search_mode: enhanced
      use_rerank: true
      kb_weights:
        industry-kb-id: 1.2
        enterprise-kb-id: 0.8

Tests: 1037 passed, 18 skipped, 0 failed
2026-06-06 19:27:09 +08:00
chiguyong cd5b39087e feat(memory): add HttpRAGService for config-driven knowledge base integration 2026-06-06 18:36:05 +08:00
chiguyong 8620751864 fix(review): address P0+P1 findings from Tier 2 code review
P0: MemoryRetriever.retrieve score mutation fix
P1: Redis atomic Lua script, deprecated API fix, SQLite WAL mode,
Redis URL masking, UniqueConstraint, TraceRecorder completed flag,
EpisodicMemory recall improvement, LLMReflector sanitization,
A/B test safety, generator cleanup, ContextCompressor guards,
OpenAIEmbedder reuse, Pipeline failure handling, Metrics O(1),
Health check Redis PING, CLI skill loading, CORS config,
API key direct pass-through

Tests: 924 passed, 18 skipped, 0 failed
2026-06-06 17:57:47 +08:00
chiguyong f858d279f3 feat(agentkit): Phase 3 upgrade - persistence, memory, evolution, observability
10 Implementation Units across 3 phases:

Phase A - Infrastructure:
- U1: RedisTaskStore with Redis/memory backend + factory function
- U2: TraceRecorder for execution trace recording
- U3: PersistentEvolutionStore with SQLite backend

Phase B - Core Capabilities:
- U4: MemoryRetriever integration into ReAct engine
- U5: Embedder abstraction + EpisodicMemory vector search
- U6: LLMReflector for LLM-in-the-loop reflection
- U7: SkillPipeline for multi-skill orchestration

Phase C - Enhancement:
- U8: SKILL.md format + progressive disclosure levels
- U9: ContextCompressor + prompt cache rendering
- U10: Structured logging + metrics endpoint + enhanced health check

Tests: 924 passed, 18 skipped, 0 failed
2026-06-06 17:17:45 +08:00
chiguyong ec0e221beb feat(server): Phase D - async task system (TaskStore + BackgroundRunner + API)
U5: TaskStore - in-memory task state with TTL cleanup and max records
U6: BackgroundRunner - async task execution with semaphore concurrency control
U7: Task status/result API + cancel endpoint + async submit mode

45 tests passing (28 new + 17 existing, no regression).
2026-06-06 11:39:41 +08:00
chiguyong 5f1c51cf9a feat(server): Phase B - auth, rate limiting, SSRF protection, handler whitelist
U1: API Key authentication middleware (dev mode skip, health whitelist)
U2: Rate limiting middleware (fixed-window, 60 req/min default)
U3: Callback URL SSRF protection (private IP blocking)
U4: custom_handler module prefix whitelist

65 tests passing. CORS conflict fixed.
2026-06-05 23:37:36 +08:00
chiguyong f87b790c0f feat(agentkit): v2 Phase 1 - ReAct/LLM Gateway/Skill/Server + review fixes
535 unit + 52 integration tests passing. README added.
2026-06-05 23:32:16 +08:00