82 Commits

Author SHA1 Message Date
chiguyong bba394be38 fix(marketplace): address code review findings
- Fix str.format() crash when user input contains curly braces
- Fix Layer 2 passing str to find_best_agent (expects list[str])
- Fix AlignmentGuard fail-open on LLM audit failure (now fail-closed)
- Fix _config_reload_lock not initialized in create_app()
- Fix evolve_soul redundant reflector.reflect() call (reuse existing reflection)
- Fix test mocks using AsyncMock for sync find_best_agent method
- Remove unused _COMPLEXITY_CLASSIFY_PROMPT constant
2026-06-10 19:21:40 +08:00
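Note on the str.format() fix above: the commit does not show the code, so the following is only a sketch of the general failure mode and one common remedy. The template text and the function names `render_unsafe` and `render_safe` are hypothetical, not taken from the repository.

```python
# Hypothetical template; the real prompt text is not shown in the commit.
PROMPT_TEMPLATE = "Classify the request: {user_input}"


def render_unsafe(user_input: str) -> str:
    # Failure mode: user text becomes part of the format string itself,
    # so any "{" or "}" in it is parsed as a replacement field.
    template = "Classify the request: " + user_input
    return template.format()


def render_safe(user_input: str) -> str:
    # Remedy: pass user text as an argument. Argument values are inserted
    # verbatim and are never re-parsed as format fields.
    return PROMPT_TEMPLATE.format(user_input=user_input)


if __name__ == "__main__":
    payload = '{"a": 1}'
    try:
        render_unsafe(payload)
    except (KeyError, IndexError, ValueError) as exc:
        print("unsafe rendering failed:", type(exc).__name__)
    print(render_safe(payload))
```

Doubling the braces in user input (`{` to `{{`, `}` to `}}`) before formatting is another option when the text has to be embedded in the template itself.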
chiguyong 8713636d50 feat(marketplace): add Phase B/C - CostAwareRouter, OrganizationContext, AlignmentGuard, Soul Evolution, Auction, Server Integration
Phase B:
- U1: CostAwareRouter with 3-layer routing (rule/LLM/capability matching)
- U6: OrganizationContext with agent profiles and capability-based discovery
- U7: AlignmentGuard with constraint injection and cascade detection

Phase C:
- U8: Soul dynamic evolution with version tracking and reflection-triggered updates
- U9: Auction mechanism as optional advanced routing mode
- U10: Server integration + end-to-end integration tests

250 new tests passing across all units.
2026-06-10 19:09:02 +08:00
chiguyong 5b42487d8a feat(core): add ReWOO, Plan-and-Execute, Reflexion execution engines
Phase A of Multi-Agent Marketplace architecture:
- ReWOOEngine: plan-all-then-execute pattern for parallel data fetch
- PlanExecEngine: adapter wrapping GoalPlanner+PlanExecutor+PipelineReplanner
- ReflexionEngine: ReAct + Evaluate + Reflect + Retry for high-precision tasks
- SkillConfig: extend VALID_EXECUTION_MODES with rewoo/plan_exec/reflexion
- ConfigDrivenAgent: add _handle_rewoo/_handle_plan_exec/_handle_reflexion routes
- 5 professional agent YAML configs with layered model defaults
- 107 unit tests passing
2026-06-10 17:08:48 +08:00
chiguyong 6852dfe892 fix(security,reliability): resolve all P2 findings from code review 2026-06-10 15:05:40 +08:00
chiguyong 658e188939 fix(review): resolve P0/P1 findings from final code review 2026-06-10 09:57:29 +08:00
chiguyong 1d1805753c fix: resolve key P2 findings from code review
- Shell whitelist: use exact binary match instead of startswith
- Shell audit log: use deque(maxlen=10000) to cap memory
- Terminal history: use deque(maxlen) for O(1) eviction
- Path optimizer: cap _pending_paths at 50 entries per task_type
- Pitfall detector: only add tips to matching steps, not all
- Experience store: handle non-numeric _parse_time_window input
- Extract shared is_safe_url() to utils/security.py (DRY)
- Workflow condition evaluator: handle float() ValueError
2026-06-10 09:01:23 +08:00
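Note on the shell whitelist and audit log items above: a minimal sketch of why a prefix check is weaker than an exact binary match, and of how `deque(maxlen=...)` bounds memory. The allowlist contents and all function names here are assumptions for illustration, not the ShellTool implementation.

```python
import shlex
from collections import deque

# Hypothetical allowlist; the real one is not shown in the commit.
ALLOWED_BINARIES = {"ls", "cat", "git"}


def is_allowed_prefix(command: str) -> bool:
    # Old behaviour as described: any command that merely starts with an
    # allowed name passes, so "lsblk" slips through because of "ls".
    return any(command.startswith(name) for name in ALLOWED_BINARIES)


def is_allowed_exact(command: str) -> bool:
    # Compare the first shell token against the allowlist exactly.
    try:
        tokens = shlex.split(command)
    except ValueError:  # e.g. unbalanced quotes
        return False
    return bool(tokens) and tokens[0] in ALLOWED_BINARIES


def make_audit_log(maxlen: int = 10000) -> deque:
    # A bounded deque drops the oldest entry in O(1) once it is full.
    return deque(maxlen=maxlen)
```

This covers only the binary-name check. Shell operators later in the command (for example `&&`) would need separate handling, such as the blocked-pattern checks mentioned elsewhere in this log.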
chiguyong b46a10973f fix(tests): clean up test_shell_tool.py lint issues 2026-06-10 08:46:35 +08:00
chiguyong 9646b0f0dd fix(tests): update test_shell_tool.py to match new ShellTool API 2026-06-10 08:22:15 +08:00
chiguyong 7874e875af merge: integrate feat/agentkit-phase8-chat-adaptive (chat/gui commands + GUI mode)
Restores agentkit chat, agentkit gui CLI commands, onboarding wizard,
and GUI mode (AGENTKIT_GUI_MODE) with static file serving.
Resolves merge conflicts in orchestrator.py, app.py, tools/__init__.py, shell.py.
2026-06-10 07:44:06 +08:00
chiguyong 9e9f1314f6 fix(security): resolve all P0/P1 findings from code review 2026-06-10 07:12:41 +08:00
chiguyong b34f74f598 feat(phase6): implement end-to-end enterprise scenario validation (U15)
- Add goal-driven agent skill config and pipeline config
- Add 9 E2E integration tests covering all 7 capabilities:
  - SC1: Goal-driven SEO analysis (GoalPlanner→PlanExecutor→PlanChecker→ExperienceStore)
  - SC2: Knowledge Q&A with system operation (MultiSourceRAG)
  - SC3: Workflow with approval (WorkflowStore + approval node)
  - SC4: Self-evolution experience accumulation (ExperienceStore→PitfallDetector→PathOptimizer)
  - SC5: Parallel execution efficiency verification
  - SC6: Skill registry integration (capabilities, versions, health)
  - Cross-capability: Plan+Experience+Pitfall, Review+Experience, RAG+Workflow
- All 2472 tests passing (9 integration + 2463 unit)
2026-06-10 01:38:28 +08:00
chiguyong c606ffa64a feat(phase5): implement management pages, evolution dashboard, and workflow editor (U13b/U13c/U14) 2026-06-10 01:29:01 +08:00
chiguyong a1deeecede feat(phase5): implement Vue3 portal foundation with chat interface and routing (U13a)
- Add Portal API routes: chat, stream, capabilities, conversations, WebSocket
- Add ConversationStore for in-memory conversation management
- Add CAPABILITY_CATEGORIES mapping for 8 capability types
- Create Vue3 SPA with TypeScript, Pinia, Vue Router, Ant Design Vue
- Implement ChatView with message bubbles, input, sidebar, WebSocket support
- Add side navigation skeleton for all 8 capability sections
- Add placeholder views for workflow, knowledge, skills, terminal, etc.
- 31 backend tests passing
2026-06-10 01:06:48 +08:00
chiguyong 901e4d9d0a feat(phase4): implement Computer Use integration (U12)
- ComputerUseTool: Anthropic API + fallback chain (API→Session→ShellTool→AskHuman)
- ComputerUseSession: Docker sandbox + InMemory test session
- ComputerUseRecorder: action recording, replay, and persistence

89 new tests passing. Degradation chain verified.
2026-06-10 00:54:31 +08:00
chiguyong c99aee1423 feat(phase3): implement knowledge base and RAG enhancement (U9-U11)
- U9: LocalDocumentIngestion - multi-format doc parsing and chunking
- U10: ExternalKBAdapters - Feishu/Confluence/GenericHTTP adapters
- U11: MultiSourceRAG - multi-source retrieval with source tracing

KnowledgeBase protocol defined (KTD-7). 145 new tests passing.
2026-06-10 00:45:17 +08:00
chiguyong e3d4f811dd feat(phase2): implement self-evolution and smart terminal (U6-U8)
- U6: PitfallDetector - detect historical failure patterns and warn
- U7: PathOptimizer - discover and update optimal execution paths
- U8: TerminalSession - session state, PTY interactive, output parsing

160 new tests passing. ShellTool enhanced with session_id support.
2026-06-10 00:22:36 +08:00
chiguyong fd4a811929 feat(phase1): implement core kernel and experience foundation (U1-U5)
- U1: GoalPlanner - structured goal decomposition wrapping _decompose_task()
- U2: PlanExecutor - parallel execution with retry/skip/replace strategies
- U3: PlanChecker - quality gate + review + experience writing
- U4: Skill spec upgrade - dependencies, capabilities, version management
- U5: ExperienceStore - PostgreSQL+pgvector task experience storage

208 new tests passing, fully backward compatible.
2026-06-09 23:57:03 +08:00
chiguyong 31bd3b126c feat(phase8): chat adaptive enhancements, pipeline reflection, search tools upgrade
- Enhanced chat CLI with adaptive mode and session management
- Added pipeline reflection and schema extensions
- Upgraded BaiduSearch and WebSearch tools with advanced capabilities
- Expanded server routes for skills and chat
- Added session store enhancements
- New chat module and pipeline reflection support
2026-06-09 23:18:06 +08:00
chiguyong 045fecd4ce feat(tools): add ShellTool + WebSearchTool, memory system, onboarding wizard, chat mode
- ShellTool: safe command execution with allowlist, blocked patterns (regex), timeout, output truncation
- WebSearchTool: multi-backend search with Tavily → Serper → DuckDuckGo Lite fallback
- MemoryTool: agent-callable tool with add/replace/remove/read actions
- MemoryStore/MemoryFile: file-based memory (SOUL.md, USER.md, MEMORY.md, DAILY.md)
- Onboarding wizard: provider selection, API key, model selection, agent personality
- Chat mode: interactive CLI with streaming, memory injection, tool integration
- Add Bailian (百炼) Coding Plan provider with 10 models
- 102 unit tests (34 new for ShellTool + WebSearchTool)
2026-06-09 01:06:45 +08:00
chiguyong 9874a4aac0 test: add Phase 8 integration tests for Chat + Adaptive + Multi-Agent (U8)
End-to-end integration tests covering session lifecycle, adaptive pipeline,
multi-agent communication via MessageBus, and config serialization.
2026-06-08 01:17:04 +08:00
chiguyong 45283d31e8 feat(core): integrate MessageBus into Orchestrator and AgentPool (U7)
- Orchestrator accepts optional message_bus parameter; workers publish
  task.progress messages via MessageBus after each subtask execution
- AgentPool accepts optional message_bus; auto-registers agents on
  create and auto-unregisters on remove
- app.py initializes MessageBus from config and injects into AgentPool
- ServerConfig adds bus configuration field
- 5 new tests, all passing
2026-06-08 00:03:40 +08:00
chiguyong 13d6e74099 feat(bus): add MessageBus abstraction layer with InMemory + Redis Streams (U6)
- AgentMessage: message model with sender/recipient/topic/payload/correlation_id
- MessageBus Protocol: publish/subscribe/unsubscribe/request/broadcast/health_check
- InMemoryMessageBus: asyncio.Queue-based implementation for testing
- RedisMessageBus: Redis Streams (XADD/XREADGROUP) implementation with
  consumer groups, message acknowledgment, and dead letter queue
- create_message_bus() factory with graceful Redis→InMemory fallback
- Request-response pattern via correlation_id + asyncio.Future
- 13 new tests, all passing
2026-06-07 23:58:16 +08:00
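Note on the request-response item above: a minimal sketch of matching replies to requests with a correlation_id and an asyncio.Future. `TinyBus` and its methods are hypothetical stand-ins, not the MessageBus Protocol from the commit, and delivery here is a direct in-process call rather than a queue or Redis stream.

```python
import asyncio
import uuid


class TinyBus:
    """Hypothetical in-memory bus illustrating correlation-based replies."""

    def __init__(self):
        self._pending = {}   # correlation_id -> Future awaiting the reply
        self._handlers = {}  # recipient name -> async handler

    def register(self, recipient, handler):
        self._handlers[recipient] = handler

    async def request(self, recipient, payload, timeout=1.0):
        correlation_id = uuid.uuid4().hex
        future = asyncio.get_running_loop().create_future()
        self._pending[correlation_id] = future
        # Keep a reference so the delivery task is not garbage collected.
        task = asyncio.create_task(
            self._deliver(recipient, payload, correlation_id)
        )
        try:
            return await asyncio.wait_for(future, timeout)
        finally:
            self._pending.pop(correlation_id, None)
            task.cancel()  # no-op if delivery already finished

    async def _deliver(self, recipient, payload, correlation_id):
        reply = await self._handlers[recipient](payload)
        self.reply(correlation_id, reply)

    def reply(self, correlation_id, payload):
        # Resolve the waiting caller, if it is still waiting.
        future = self._pending.get(correlation_id)
        if future is not None and not future.done():
            future.set_result(payload)


async def echo(payload):
    return {"echo": payload}


if __name__ == "__main__":
    bus = TinyBus()
    bus.register("agent-a", echo)
    print(asyncio.run(bus.request("agent-a", "ping")))
```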
chiguyong 88d8298871 feat(core): add Orchestrator adaptive task decomposition (U5)
- execute_adaptive(): iterative execute→evaluate→re-decompose loop
- OrchestratorConfig: adaptive, max_iterations, quality_threshold
- _evaluate_quality(): LLM-based or rule-based quality scoring (0-1)
- _reexecute_failed(): preserves completed subtask results, retries
  failed ones with improvement feedback injected into input_data
- OrchestrationResult.metadata field for tracking iteration history
- 10 new tests, all passing
2026-06-07 23:50:54 +08:00
chiguyong 7054ac02b6 feat(tools): add AskHumanTool + token streaming in ReAct execute_stream
- AskHumanTool: Human-in-the-Loop tool for Chat mode, pushes questions
  via WebSocket callback and waits for user reply via asyncio.Future
- Token streaming: execute_stream() now uses chat_stream() instead of
  chat(), yielding token-type ReActEvents for each StreamChunk
- _build_response_from_stream() static method constructs LLMResponse
  from accumulated stream data
- Export AskHumanTool from tools/__init__.py
- 12 new tests (7 AskHumanTool + 5 token streaming), all passing
2026-06-07 23:40:43 +08:00
chiguyong 6013d5189b feat(chat): add Chat API routes with REST + WebSocket bidirectional communication 2026-06-07 22:49:26 +08:00
chiguyong 493187782c feat(session): add Session/Message models and SessionManager with InMemory/Redis stores 2026-06-07 22:43:14 +08:00
chiguyong e4d6efb4bf Merge feat/agentkit-phase7-headroom: Phase 6-7 + all review fixes 2026-06-07 22:05:34 +08:00
chiguyong b34b06724d fix(agentkit): resolve all P0/P1/P2/P3 issues from code review 2026-06-07 22:05:18 +08:00
chiguyong 3645c7a080 docs: mark Phase 7 Headroom integration plan as completed 2026-06-07 18:21:27 +08:00
chiguyong bad66445ff feat(compression): U6 GEO Pipeline compression integration tests and config
Add GEO Pipeline end-to-end compression integration tests with
MockHeadroomCompressor. Add compression configuration section to
llm_config.yaml with headroom and summary mode examples.
2026-06-07 18:20:41 +08:00
chiguyong 9c04362dba feat(compression): U5 HeadroomRetrieveTool for CCR cache retrieval
Add HeadroomRetrieveTool that allows LLM to retrieve original
uncompressed data from CCR cache via Function Calling. Auto-registered
when HeadroomCompressor is active and available.
2026-06-07 18:20:17 +08:00
chiguyong 286804792d feat(compression): U4 ServerConfig compression field and Agent injection
Add compression config to ServerConfig (following telemetry pattern),
create compressor in create_app, pass through AgentPool to
ConfigDrivenAgent, and inject into ReActEngine.execute() calls.
2026-06-07 18:20:05 +08:00
chiguyong fcb4fb33f3 feat(compression): U3 ReAct engine tool result compression and incremental compress
Extend _build_tool_result_message to accept compressor parameter for
tool output compression. Add _should_compress helper for token budget
checking. Add incremental compression within ReAct loop when
conversation exceeds threshold.
2026-06-07 18:19:53 +08:00
chiguyong ea705b979b feat(compression): U2 HeadroomCompressor with SmartCrusher and CCR cache
Add HeadroomCompressor implementing CompressionStrategy Protocol with
content-type routing (JSON→SmartCrusher, code→CodeCompressor), CCR
reversible compression cache, and graceful degradation when headroom-ai
is not installed.
2026-06-07 18:19:41 +08:00
chiguyong 5d3a5f2bf3 feat(compression): U1 CompressionStrategy Protocol and create_compressor factory
Add runtime-checkable CompressionStrategy Protocol with compress(),
compress_tool_result(), and is_available() methods. Add compress_tool_result
and is_available to existing ContextCompressor. Add create_compressor()
factory function with headroom/summary provider routing and ImportError
fallback.
2026-06-07 18:19:27 +08:00
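Note on the Protocol and factory described above: a sketch of a runtime-checkable Protocol plus a factory that falls back when an optional dependency raises ImportError. The method signatures, the `TruncatingCompressor` class, and the `loaders` argument are assumptions; the commit names only the three methods and `create_compressor()`.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class CompressionStrategy(Protocol):
    # Signatures are assumed; only the method names come from the commit.
    def compress(self, text: str) -> str: ...
    def compress_tool_result(self, text: str) -> str: ...
    def is_available(self) -> bool: ...


class TruncatingCompressor:
    """Hypothetical fallback that simply truncates its input."""

    def __init__(self, limit: int = 200):
        self.limit = limit

    def compress(self, text: str) -> str:
        return text[: self.limit]

    def compress_tool_result(self, text: str) -> str:
        return self.compress(text)

    def is_available(self) -> bool:
        return True


def create_compressor(provider, loaders, fallback):
    # loaders maps a provider name to a zero-argument callable that builds
    # the compressor and may raise ImportError if its package is missing.
    loader = loaders.get(provider)
    if loader is None:
        return fallback
    try:
        return loader()
    except ImportError:
        return fallback
```

Because the Protocol is runtime-checkable, `isinstance(obj, CompressionStrategy)` verifies that the methods exist, but it does not check their signatures.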
chiguyong 80a505b1c1 docs: mark Phase 6 plan as completed 2026-06-07 17:27:01 +08:00
chiguyong 239009357a feat(telemetry): U7 OpenTelemetry integration with zero-dependency no-op pattern
Add telemetry module with tracing (agent/tool/llm/pipeline_step spans),
metrics (5 histograms/counters), and setup with optional OTLP exporters.
Uses no-op pattern when opentelemetry not installed. GenAI Semantic
Conventions for LLM spans. Integrated into ReActEngine, LLMGateway,
ToolBase, and FastAPI app.
2026-06-07 17:26:21 +08:00
chiguyong 03a5167366 feat(pipeline): U6 step-level retry with exponential backoff and saga compensation
Add StepRetryPolicy with jitter-based exponential backoff, SagaOrchestrator
with LIFO compensation pattern, integrate retry_policy and compensate
fields into PipelineStage/PipelineStep schema, add GEO pipeline
compensation definitions for all 7 steps.
2026-06-07 17:26:07 +08:00
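Note on the retry and saga items above: a sketch of exponential backoff with jitter and of LIFO compensation. The commit does not say which jitter variant StepRetryPolicy uses, so "full jitter" is assumed here, and `backoff_delays` and `run_saga` are hypothetical names.

```python
import random


def backoff_delays(max_retries, base=0.5, cap=30.0, rng=random.random):
    # Full jitter (assumed variant): each delay is drawn uniformly from
    # [0, min(cap, base * 2**attempt)).
    return [
        rng() * min(cap, base * 2 ** attempt)
        for attempt in range(max_retries)
    ]


def run_saga(steps):
    # steps is a list of (action, compensate) pairs of zero-arg callables.
    # If an action fails, compensations of the already completed steps run
    # in reverse (LIFO) order, then the original error is re-raised.
    completed = []
    try:
        for action, compensate in steps:
            action()
            completed.append(compensate)
    except Exception:
        for compensate in reversed(completed):
            compensate()
        raise
```

The compensation of the failing step itself is not run in this sketch, since that step never completed.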
chiguyong 4db637cd4f feat(pipeline): U5 state persistence with Redis hot + PG cold dual-write
Add PipelineStateMemory/Redis/PG backends, PipelineStateManager with
Redis Sorted Set hot state + PostgreSQL JSONB cold persistence.
Integrated into PipelineEngine with state persistence calls at each
step transition.
2026-06-07 17:25:52 +08:00
chiguyong 2e547e345a feat(geo): U4 GEO skill tool binding with BaiduSearch and E2E tests
Add BaiduSearchTool (API mode + scraping fallback), bind tools to
GEO skill YAML configs (baidu_search, web_crawl, schema_extract,
schema_generate), extend geo_full_pipeline with generate_content
and deai steps, add 36 E2E integration tests.
2026-06-07 17:25:37 +08:00
chiguyong 9ec1740047 feat(tools): U3 built-in Python tools - WebCrawl, SchemaExtract, SchemaGenerate
Add WebCrawlTool (Crawl4AI wrapper with graceful degradation),
SchemaExtractTool (extruct-based Schema.org extraction), and
SchemaGenerateTool (JSON-LD generation with optional pydantic-schemaorg
validation). All tools work without optional dependencies.
2026-06-07 17:25:24 +08:00
chiguyong 550d29a139 feat(mcp): U2 MCP config system and MCPManager lifecycle
Add MCPServerConfig dataclass with stdio/streamable_http/sse transport
validation, MCPManager for declarative YAML-driven MCP server lifecycle
(start_all/stop_all), tool discovery and registration. Integrated
into FastAPI lifespan startup/shutdown.
2026-06-07 17:25:07 +08:00
chiguyong 66b9217569 feat(mcp): U1 StdioTransport for subprocess-based MCP communication
Add StdioTransport class supporting stdio JSON-RPC over subprocess
stdin/stdout with asyncio.create_subprocess_exec, pending futures
for request/response matching, and stderr forwarding.
2026-06-07 17:24:52 +08:00
chiguyong 9b6c0230c0 docs: add Phase 6 toolkit plan 2026-06-07 16:21:50 +08:00
chiguyong 11a12fed29 docs: mark Phase 5 plan as completed 2026-06-06 22:53:14 +08:00
chiguyong 24e501f745 fix(core): U10 Agent status lock timeout and config hot-reload audit
- Added _acquire_status_lock with timeout (30s) to prevent deadlocks
- Added _release_status_lock for safe lock release
- Added config_version tracking on BaseAgent
- Config hot-reload now increments version and propagates to agents
- Audit logging with config version in _on_config_change
2026-06-06 22:52:51 +08:00
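Note on the status lock timeout above: a sketch of acquiring a lock with a timeout so that a stuck holder cannot block callers forever. It assumes an `asyncio.Lock`; the commit does not say which lock type `_acquire_status_lock` wraps, and `acquire_with_timeout` is a hypothetical helper.

```python
import asyncio


async def acquire_with_timeout(lock: asyncio.Lock, timeout: float) -> bool:
    # Returns True if the lock was acquired, False if the wait timed out.
    try:
        await asyncio.wait_for(lock.acquire(), timeout)
        return True
    except asyncio.TimeoutError:
        return False


async def demo():
    lock = asyncio.Lock()
    first = await acquire_with_timeout(lock, 0.05)   # lock is free
    second = await acquire_with_timeout(lock, 0.05)  # still held: times out
    lock.release()
    third = await acquire_with_timeout(lock, 0.05)   # free again
    return first, second, third


if __name__ == "__main__":
    print(asyncio.run(demo()))
```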
chiguyong 83cdddd199 feat(evaluation): U9 Ragas evaluation pipeline for RAG quality assessment
- RagasEvaluator: LLM-as-Judge evaluation with ragas lib or built-in fallback
- EvalDatasetBuilder: from traces or dict list
- EvalMetrics: faithfulness, answer_relevancy, context_precision, context_recall
- Built-in heuristic evaluation using keyword overlap and Jaccard similarity
- 13 tests passing
2026-06-06 22:49:27 +08:00
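Note on the built-in heuristic evaluation above: a sketch of Jaccard similarity over keyword sets. The tokenizer (lowercased `\w+` matches) and the function names are assumptions; the commit does not describe how RagasEvaluator tokenizes text or combines scores.

```python
import re


def tokenize(text: str) -> set:
    # Assumed tokenizer: lowercase word characters, duplicates removed.
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a: str, b: str) -> float:
    # |A ∩ B| / |A ∪ B|, defined here as 0.0 when both sets are empty.
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


if __name__ == "__main__":
    print(jaccard("the cat sat", "the cat ran"))
```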
chiguyong 9753a08ac8 feat(llm): U8 Chinese LLM providers - Wenxin, Doubao, Yuanbao
- WenxinProvider: Baidu ERNIE via Qianfan v2 OpenAI-compatible API, AK/SK token auth
- DoubaoProvider: ByteDance Doubao via Volcengine Ark API
- YuanbaoProvider: Tencent Hunyuan via OpenAI-compatible API with enhancement mode
- All inherit from OpenAICompatibleProvider for retry/circuit breaker support
- 16 tests passing
2026-06-06 22:46:53 +08:00
chiguyong 34e083abde feat(evolution): U7 multi-objective fitness and extended strategy space
- MultiObjectiveFitness: weighted scoring, NSGA-II Pareto ranking, crowding distance
- FitnessWeights: configurable accuracy/latency/cost weights with auto-normalization
- ExtendedStrategyTuner: multi-dim Bayesian optimization (temperature, max_iterations, top_k, retrieval_mode)
- ExtendedStrategyConfig: expanded parameter space
- 20 tests passing
2026-06-06 22:42:54 +08:00
chiguyong d5998aaddd feat(evolution): U6 GEPA genetic algorithm evolution framework
- PromptChromosome: instructions + demos + constraints gene segments
- CrossoverOperator: paragraph-level text, demo, constraint crossover
- MutationOperator: LLM-driven instruction mutation + demo/constraint mutation
- GEPAPopulation: tournament selection, elite preservation, Pareto front
- FitnessScore: multi-objective (accuracy, latency, cost) with Pareto dominance
- 29 tests passing
2026-06-06 22:38:55 +08:00
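Note on the Pareto items above: a sketch of Pareto dominance over the three objectives the commit names (accuracy, latency, cost). The `Score` class and function names are hypothetical, and the direction of each objective (higher accuracy is better, lower latency and cost are better) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Score:
    accuracy: float  # assumed: higher is better
    latency: float   # assumed: lower is better
    cost: float      # assumed: lower is better


def dominates(a: Score, b: Score) -> bool:
    # a dominates b if it is no worse on every objective and strictly
    # better on at least one.
    no_worse = (
        a.accuracy >= b.accuracy
        and a.latency <= b.latency
        and a.cost <= b.cost
    )
    strictly_better = (
        a.accuracy > b.accuracy
        or a.latency < b.latency
        or a.cost < b.cost
    )
    return no_worse and strictly_better


def pareto_front(scores):
    # Keep every score that no other score dominates.
    return [s for s in scores if not any(dominates(o, s) for o in scores)]
```

This quadratic scan is fine for small populations; NSGA-II style non-dominated sorting, mentioned in the U7 commit, additionally ranks the dominated candidates into successive fronts.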