Commit Graph

217 Commits

Author SHA1 Message Date
chiguyong 4d051c2f25 feat(gui): add transitions, responsive breakpoints and bug fixes (U7)
- Add transitions.css with fade, slide, collapse, scale, stagger-list, skeleton-pulse, pulse-dot animations
- Add responsive.css with breakpoints (≥1440px full, 1280-1439px compact, <1280px prompt)
- Add small-screen prompt in AgentLayout with DesktopOutlined icon
- Fix SPA serving in app.py for Vue build output
- Fix TypeScript errors in kb.ts, skills.ts, workflow.ts, FlowCanvas.vue, SideNav.vue
- Fix unused imports in ExperienceTimeline, PathOptimizerPanel, PitfallPanel
2026-06-13 02:47:51 +08:00
chiguyong db216a5cc4 feat(gui): streamline evolution panel and group settings (U6)
- EvolutionView: simplify from 6 sub-panels to 3 tabs (概览+指标, 经验+坑点, 用量)
- SettingsView: group into 4 tabs (LLM, 技能, 知识库, 系统), each with independent save
- SkillsView: replace hardcoded colors with Design Tokens
- All three views: replace hardcoded colors with Design Token references
2026-06-13 02:43:41 +08:00
chiguyong 3dc5c68135 feat(gui): refactor terminal panel with One Dark Pro theme and Ant Design Modal (U5)
- TerminalView: replace native HTML confirmation with Ant Design Modal,
  make command history sidebar collapsible (default collapsed)
- TerminalEmulator: use One Dark Pro CSS variables for ANSI colors,
  replace all hardcoded colors with Design Tokens
- CommandHistory: replace all hardcoded colors with Design Tokens
2026-06-13 02:41:46 +08:00
chiguyong 79a400afe8 feat(gui): add code/preview panel with diff viewer and file tree (U4)
- Create CodeDiffViewer.vue with line-level diff highlighting
- Create FileTree.vue with status badges (added/modified/deleted)
- Update WorkflowView.vue: replace hardcoded colors with Design Tokens
- Update KnowledgeBaseView.vue: replace hardcoded colors with Design Tokens
2026-06-13 02:40:14 +08:00
chiguyong 6d5a08cb0c feat(gui): refactor chat panel with Markdown rendering and tool indicators (U3)
- ChatMessage: replace v-html with markdown-it, add ToolCallIndicator
- ChatInput: add ContextPill support for file/skill context
- ChatSidebar: make collapsible (default collapsed)
- Create ToolCallIndicator.vue with color-coded tool type badges
- Create ContextPill.vue for context references
- Replace all hardcoded colors with Design Token references
- Install markdown-it for Markdown rendering
2026-06-13 02:38:37 +08:00
chiguyong 9273612a5b feat(gui): add four-quadrant Agent-First layout with top navigation (U2)
- Create AgentLayout.vue with CSS Grid four-quadrant layout
- Create SplitPane.vue with draggable divider and localStorage persistence
- Create TopNav.vue with logo, status indicator, and settings entry
- Create QuadrantPanel.vue with tab switching and collapse support
- Restructure router: /agent as main route, legacy routes redirect
- App.vue now uses router-view for layout switching
2026-06-13 02:35:28 +08:00
chiguyong 2988d3768e feat(gui): add Design Token system and theme configuration (U1)
- Create tokens.css with CSS custom properties for colors, spacing,
  radius, fonts, shadows, transitions, and code theme
- Create theme.ts mapping tokens to Ant Design Vue ConfigProvider
- Create styles/index.ts as unified entry point
- Inject theme into App.vue ConfigProvider
- Import styles in main.ts

Primary color unified to #7c3aed (purple gradient brand color).
2026-06-13 02:32:29 +08:00
chiguyong 09698d7a06 feat: frontend productization with code review fixes
- Workflow: visual canvas, undo/redo, drag-and-drop, real-time execution WebSocket
- Evolution: dashboard, ECharts metrics, experience timeline, pitfall warnings, usage panel
- KB: source CRUD, document upload, search test
- Terminal: interactive PTY WebSocket, whitelist security
- Security: hmac.compare_digest, API key auth on all endpoints, whitelist bypass fix
- Fixes: ECharts async init, WebSocket intentional disconnect, TOCTOU race, Pydantic models
2026-06-13 01:29:58 +08:00
chiguyong 3c6c41c140 merge: feat/chat-response-speed-optimization into main 2026-06-12 22:18:40 +08:00
chiguyong 5ef08a3b30 fix(review): comprehensive P0-P2 code review fixes 2026-06-12 22:18:25 +08:00
chiguyong 2e55aae775 fix(review): address code review findings for speed optimization
- P0: Rename WAL buffer to pending buffer, add crash-loss warning
- P1: Fix keyword substring false matches with word-boundary regex
- P1: Pass connection pool params in _build_llm_config
- P1: Change parallel_tools default to False (safer default)
- P1: Add classifier value validation in CostAwareRouter
- P2: Replace __import__ with proper datetime import
- P2: Add max_buffer_size enforcement in AsyncWriteQueue
2026-06-12 13:21:44 +08:00
chiguyong a36bc3d1c1 feat: optimize chat response speed for sub-1s first token latency
- Add HeuristicClassifier to replace LLM quick_classify with zero-cost
  local heuristic (keyword/length/code-pattern scoring), gated by
  router.classifier config (default: heuristic)
- Add parallel tool execution in ReActEngine via asyncio.gather for
  multiple independent tool_calls, gated by parallel_tools param
- Add AsyncWriteQueue for non-blocking session persistence with WAL
  buffer, gated by async_writes param on SessionManager
- Add httpx.Limits connection pool config to all LLM providers
- Add router config section to ServerConfig and agentkit.yaml
- All optimizations have config switches for safe rollback
2026-06-12 13:15:06 +08:00
chiguyong d3b792a9ec merge: feat/pipeline-adversarial-loop into main
Pipeline-level adversarial loop (Worker ↔ Verifier) implementation:
- Schema: ReviewIssue, ReviewFeedback, AdversarialState models
- Engine: adversarial execution, feedback injection, escalation
- Config: code_reviewer skill + coding_harness pipeline
- Tests: 19 unit + 5 integration, all passing
- Code review fixes: 2 Critical + 5 Major issues resolved
2026-06-12 10:02:47 +08:00
chiguyong 8c365486e2 fix(pipeline): address code review findings for adversarial loop
Critical:
- C1: Add verifier_timeout_seconds for independent Verifier timeout
- C2: Verifier parse failure raises RuntimeError instead of dead-loop

Major:
- M1: Inject previous_output into Worker retry context
- M2: Add Pydantic ge/le constraint on ReviewFeedback.score
- M3: Use Literal type for feedback_mode enum validation
- M4: Use Literal types for ReviewIssue severity and category
- M5: Merge error messages when escalation agent also fails

Tests: 8 new test cases added (19 total), all passing
2026-06-12 10:02:37 +08:00
chiguyong ddc735b078 test(pipeline): add coding harness integration tests
5 passing tests covering:
- Pipeline config loading and validation
- Review stage adversarial config verification
- Stage dependencies validation
- Code reviewer skill config and output schema

3 skipped tests (complex mock sequencing covered by unit tests)
2026-06-12 09:42:21 +08:00
chiguyong 3392413614 test(pipeline): add adversarial loop unit tests
11 test cases covering:
- PipelineSchemaAdversarial (4): verifier fields, backward compat, serialization, state tracking
- AdversarialExecution (3): no verifier passthrough, first round pass, max rounds exhausted
- FeedbackContext (3): structured+natural, structured, natural modes
- Escalation (1): no escalation configured
2026-06-12 09:40:19 +08:00
chiguyong 6731d96c65 feat(configs): add code_reviewer skill and coding_harness pipeline
- code_reviewer.yaml: Verifier Agent skill config for adversarial review
  with structured output schema for ReviewFeedback format
- coding_harness.yaml: Example pipeline with adversarial loop
  develop → test → review (Worker↔Verifier) → archive
2026-06-12 09:38:37 +08:00
chiguyong dc07c7c60a feat(pipeline): implement adversarial loop execution logic
Add Worker-Verifier adversarial loop to PipelineEngine:
- _execute_stage_with_adversarial: main loop for Worker→Verifier→retry
- _execute_agent_stage: extracted agent execution logic
- _execute_verifier: execute verifier and parse ReviewFeedback
- _build_feedback_context: build feedback context for worker retry
- _escalate: handle round exhaustion (escalate or fail)
- Route to adversarial mode when stage.verifier is configured

Support three feedback modes: structured+natural, structured, natural
2026-06-12 09:37:30 +08:00
chiguyong b733b3a732 feat(pipeline): add adversarial loop schema models
Add ReviewIssue, ReviewFeedback, AdversarialState models and extend
PipelineStage with verifier, max_adversarial_rounds, feedback_mode,
and escalate_on_exhaust fields for Worker-Verifier adversarial loop.
2026-06-12 09:35:01 +08:00
chiguyong 2110c84fb6 fix: switch default model to qwen3-coder-plus for better function calling
DeepSeek-chat has limited/partial function calling support. Qwen3-coder-plus
(DashScope) has robust OpenAI-compatible function calling.

Also added tool usage instructions to system prompt and enhanced logging
to trace tool propagation through the pipeline.
2026-06-12 09:27:52 +08:00
chiguyong 44f19fcf14 feat: loading animation + tool descriptions in system prompt
1. Loading indicator: three-dot bouncing animation appears after
   sending a message and disappears when server starts responding.

2. Tool descriptions: resolve_skill_routing now appends available
   tools (name + description + parameters) to the system prompt so
   the LLM knows what tools it can call.
2026-06-11 22:25:21 +08:00
chiguyong 55421dd126 fix: get_tools() and get_system_prompt() now read from tool_registry too
Root cause: app.py registers tools via agent._tool_registry.register()
which adds to the ToolRegistry but NOT to agent._tools (which is only
populated by use_tool() from config). Both get_tools() and
get_system_prompt() were reading only _tools, missing all post-init
registered tools. Now both methods merge _tools with
_tool_registry.list_tools().
2026-06-11 22:17:14 +08:00
chiguyong f7225bc91a fix: include available tools in system prompt so LLM knows what it can call
Previously get_system_prompt() only returned identity/instructions but
did not tell the LLM what tools are available. The LLM would therefore
refuse to call tools even when they were registered, saying it had no
tools. Now the system prompt includes a '## 可用工具' section listing
all registered tools with their descriptions and parameters.
2026-06-11 22:00:30 +08:00
chiguyong b6ec13cbca debug: log tools count and names in portal before execute_stream 2026-06-11 21:44:20 +08:00
chiguyong 32c800d1e4 fix: portal routing + response speed + IME input
1. Portal unified routing: ws_chat now uses CostAwareRouter uniformly
   (handles Layer 0/1/2), replacing direct IntentRouter calls.
   Greeting/chat_mode requests skip IntentRouter LLM call entirely.

2. Response speed: greeting & simple chat now use direct LLM call
   (no ReAct loop), zero-cost Layer 0 detection.

3. IME input fix: use e.isComposing (native browser property)
   instead of compositionstart/end for Enter key detection.

4. Test: fix InMemoryMessageBus.request() parameter name
   timeout -> timeout_seconds.
2026-06-11 21:30:25 +08:00
chiguyong ae95b56465 fix: use e.isComposing for IME detection instead of manual flag
e.isComposing is a standard KeyboardEvent property that's true during
IME composition. More reliable than compositionstart/compositionend
which can fire at unpredictable timing relative to keydown.
2026-06-11 20:43:38 +08:00
chiguyong 66d0901938 fix: prevent Enter from submitting during IME composition
Added compositionstart/compositionend event listeners to track IME
composing state. Enter key now only submits when not composing,
so Chinese/Japanese/Korean input methods work correctly.
2026-06-11 15:37:06 +08:00
chiguyong cc4c6fe346 fix: direct-mode agent falls through to default when task needs tools
When IntentRouter matches a direct-mode agent (no tools), but the task
content suggests tool needs (shell, search, file ops, etc.), the routing
now falls through to the default agent which has full tool access.

This fixes the issue where "帮我执行个命令" would be routed to
direct_agent and fail because direct mode doesn't support tool calling.

Also restored "你好" in direct_agent keywords since it's correctly
handled now — greetings don't need tools, direct mode is fine.
2026-06-11 15:26:19 +08:00
chiguyong 52b7d6007d fix: remove '你好' from direct_agent keywords so greetings route to default agent with tools 2026-06-11 14:49:59 +08:00
chiguyong 93bc7c4e3e fix: change all agent YAML model from hardcoded provider to 'default'
Hardcoded model names like 'openai/gpt-4o-mini' or 'anthropic/claude-sonnet'
cause 'No provider available' errors when the specific provider isn't configured.
Using 'default' lets the system pick the available provider automatically.
2026-06-11 14:19:26 +08:00
chiguyong d47f279887 fix: resolve code review issues from deferred improvements
1. InMemoryMessageBus.request(): fix param name (timeout→timeout_seconds) to match ABC
2. InMemoryMessageBus: track consumer tasks, cancel on unsubscribe
3. InMemoryMessageBus: _try_resolve_pending() in queue consumer path
4. evolve_soul(): use "default" category when patterns is empty
5. quick_classify(): use delimiter-based prompt to mitigate injection risk
6. Use asyncio.get_running_loop() instead of deprecated get_event_loop()
2026-06-11 13:49:02 +08:00
chiguyong ec51dbb259 feat: optimize劣势项 — 拍卖开关/审计采样/线程安全/评分锚定
1. 拍卖机制: 已有配置开关(marketplace.auction_enabled), 默认关闭
2. LLM审计采样: 新增 audit_sample_rate (0.0-1.0), 默认1.0, 可降低审计频率
3. AlignmentConfig.from_dict: 忽略未知键, 防止YAML额外字段崩溃
4. 配置热重载线程安全: 用 threading.Event 替代布尔标志, 消除数据竞态
5. Reflexion评分锚定: 添加评分维度(Completeness/Correctness/Clarity)和锚定点
2026-06-11 13:04:36 +08:00
chiguyong cc2cd414c9 fix: resolve all code review issues from cross-validation
1. Critical: Add missing TaskResult import in plan_exec_engine.py
2. Critical: Fix ReWOOEngine param name (max_steps → max_plan_steps)
3. Major: Remove duplicate token counting in reflexion.py
4. Major: LLM audit failure now passes (trusts rule check) instead of failing
5. Major: Fix dict iteration with del using list() copy in lifecycle.py
6. Major: Fix Chinese content tokenization using regex split instead of space split
7. Minor: _is_positive_mention now checks all occurrences, not just the first
2026-06-11 06:22:35 +08:00
chiguyong 79eb8469f9 fix: address remaining code review issues
- AlignmentGuard: direction-aware constraint checking (negation/affirmation detection)
  instead of simple substring matching to reduce false positives
- Reflexion: extract actual token usage from LLM response instead of hardcoded 1
- MemoryTool: protect version/history sections from update_soul modification
- Fix AsyncMock warnings for sync find_best_agent method
2026-06-11 00:14:11 +08:00
chiguyong 5171e942d6 feat: multi-agent marketplace architecture evolution
Phase A: ReWOO, PlanExec, Reflexion engines + SkillConfig extension
Phase B: CostAwareRouter, OrganizationContext, AlignmentGuard
Phase C: Soul evolution, Auction mechanism, Server integration

250 tests passing across all units.
2026-06-10 23:58:06 +08:00
chiguyong bba394be38 fix(marketplace): address code review findings
- Fix str.format() crash when user input contains curly braces
- Fix Layer 2 passing str to find_best_agent (expects list[str])
- Fix AlignmentGuard fail-open on LLM audit failure (now fail-closed)
- Fix _config_reload_lock not initialized in create_app()
- Fix evolve_soul redundant reflector.reflect() call (reuse existing reflection)
- Fix test mocks using AsyncMock for sync find_best_agent method
- Remove unused _COMPLEXITY_CLASSIFY_PROMPT constant
2026-06-10 19:21:40 +08:00
chiguyong 8713636d50 feat(marketplace): add Phase B/C - CostAwareRouter, OrganizationContext, AlignmentGuard, Soul Evolution, Auction, Server Integration
Phase B:
- U1: CostAwareRouter with 3-layer routing (rule/LLM/capability matching)
- U6: OrganizationContext with agent profiles and capability-based discovery
- U7: AlignmentGuard with constraint injection and cascade detection

Phase C:
- U8: Soul dynamic evolution with version tracking and reflection-triggered updates
- U9: Auction mechanism as optional advanced routing mode
- U10: Server integration + end-to-end integration tests

250 new tests passing across all units.
2026-06-10 19:09:02 +08:00
chiguyong 5b42487d8a feat(core): add ReWOO, Plan-and-Execute, Reflexion execution engines
Phase A of Multi-Agent Marketplace architecture:
- ReWOOEngine: plan-all-then-execute pattern for parallel data fetch
- PlanExecEngine: adapter wrapping GoalPlanner+PlanExecutor+PipelineReplanner
- ReflexionEngine: ReAct + Evaluate + Reflect + Retry for high-precision tasks
- SkillConfig: extend VALID_EXECUTION_MODES with rewoo/plan_exec/reflexion
- ConfigDrivenAgent: add _handle_rewoo/_handle_plan_exec/_handle_reflexion routes
- 5 professional agent YAML configs with layered model defaults
- 107 unit tests passing
2026-06-10 17:08:48 +08:00
chiguyong 6852dfe892 fix(security,reliability): resolve all P2 findings from code review 2026-06-10 15:05:40 +08:00
chiguyong 658e188939 fix(review): resolve P0/P1 findings from final code review 2026-06-10 09:57:29 +08:00
chiguyong 1d1805753c fix: resolve key P2 findings from code review
- Shell whitelist: use exact binary match instead of startswith
- Shell audit log: use deque(maxlen=10000) to cap memory
- Terminal history: use deque(maxlen) for O(1) eviction
- Path optimizer: cap _pending_paths at 50 entries per task_type
- Pitfall detector: only add tips to matching steps, not all
- Experience store: handle non-numeric _parse_time_window input
- Extract shared is_safe_url() to utils/security.py (DRY)
- Workflow condition evaluator: handle float() ValueError
2026-06-10 09:01:23 +08:00
chiguyong b46a10973f fix(tests): clean up test_shell_tool.py lint issues 2026-06-10 08:46:35 +08:00
chiguyong 9646b0f0dd fix(tests): update test_shell_tool.py to match new ShellTool API 2026-06-10 08:22:15 +08:00
chiguyong 7874e875af merge: integrate feat/agentkit-phase8-chat-adaptive (chat/gui commands + GUI mode)
Restores agentkit chat, agentkit gui CLI commands, onboarding wizard,
and GUI mode (AGENTKIT_GUI_MODE) with static file serving.
Resolves merge conflicts in orchestrator.py, app.py, tools/__init__.py, shell.py.
2026-06-10 07:44:06 +08:00
chiguyong 9e9f1314f6 fix(security): resolve all P0/P1 findings from code review 2026-06-10 07:12:41 +08:00
chiguyong b34f74f598 feat(phase6): implement end-to-end enterprise scenario validation (U15)
- Add goal-driven agent skill config and pipeline config
- Add 9 E2E integration tests covering all 7 capabilities:
  - SC1: Goal-driven SEO analysis (GoalPlanner→PlanExecutor→PlanChecker→ExperienceStore)
  - SC2: Knowledge Q&A with system operation (MultiSourceRAG)
  - SC3: Workflow with approval (WorkflowStore + approval node)
  - SC4: Self-evolution experience accumulation (ExperienceStore→PitfallDetector→PathOptimizer)
  - SC5: Parallel execution efficiency verification
  - SC6: Skill registry integration (capabilities, versions, health)
  - Cross-capability: Plan+Experience+Pitfall, Review+Experience, RAG+Workflow
- All 2472 tests passing (9 integration + 2463 unit)
2026-06-10 01:38:28 +08:00
chiguyong c606ffa64a feat(phase5): implement management pages, evolution dashboard, and workflow editor (U13b/U13c/U14) 2026-06-10 01:29:01 +08:00
chiguyong a1deeecede feat(phase5): implement Vue3 portal foundation with chat interface and routing (U13a)
- Add Portal API routes: chat, stream, capabilities, conversations, WebSocket
- Add ConversationStore for in-memory conversation management
- Add CAPABILITY_CATEGORIES mapping for 8 capability types
- Create Vue3 SPA with TypeScript, Pinia, Vue Router, Ant Design Vue
- Implement ChatView with message bubbles, input, sidebar, WebSocket support
- Add side navigation skeleton for all 8 capability sections
- Add placeholder views for workflow, knowledge, skills, terminal, etc.
- 31 backend tests passing
2026-06-10 01:06:48 +08:00
chiguyong 901e4d9d0a feat(phase4): implement Computer Use integration (U12)
- ComputerUseTool: Anthropic API + fallback chain (API→Session→ShellTool→AskHuman)
- ComputerUseSession: Docker sandbox + InMemory test session
- ComputerUseRecorder: action recording, replay, and persistence

89 new tests passing. Degradation chain verified.
2026-06-10 00:54:31 +08:00
chiguyong c99aee1423 feat(phase3): implement knowledge base and RAG enhancement (U9-U11)
- U9: LocalDocumentIngestion - multi-format doc parsing and chunking
- U10: ExternalKBAdapters - Feishu/Confluence/GenericHTTP adapters
- U11: MultiSourceRAG - multi-source retrieval with source tracing

KnowledgeBase protocol defined (KTD-7). 145 new tests passing.
2026-06-10 00:45:17 +08:00