fischer-agentkit

Commit Graph

Author	SHA1	Message	Date
chiguyong	a27eed3714	fix(config): unify config loading chain and protect ${VAR} references - Settings API: reverse-resolve env vars to preserve ${VAR} refs in yaml, write new API keys to .env instead of agentkit.yaml, extract env_key from existing ${VAR} reference when updating providers - Onboarding: merge-update instead of overwrite when config exists, use config_arg to determine output path, .env merge instead of overwrite - Unified templates: bailian-coding provider name, full model_aliases, docker-compose with postgres, expanded .env.example - Optional ruamel.yaml for comment/format preservation in Settings API - clients.yaml: add _deep_resolve for ${VAR} env var references - All CLI commands use load_config_with_dotenv() consistently - Tests: mock find_config_path and CWD auto-discovery to avoid env leaks	2026-06-16 00:26:54 +08:00
chiguyong	11e2009cb8	feat(router): improve colloquial/mixed-lang routing, fix low-complexity IntentRouter bypass Key improvements: - Low-complexity queries (<0.3) now try IntentRouter keyword match before falling back to DIRECT_CHAT, fixing 0% F1 on keyword_match - SemanticRouter similarity_low lowered from 0.6 to 0.4 - Short text (<20 chars) uses effective_low = max(0.25, low - 0.15) - Short text with no semantic match forces LLM classify fallback - Added colloquial keywords to 7 skill YAMLs - Fixed code_reviewer.yaml output_schema placement - Fixed SemanticRouter build in e2e tests - Fixed base_url detection for bailian-coding API keys Results: keyword_match F1 0->60.87%, colloquial F1 0->100%, mixed_lang F1 0->100%	2026-06-15 23:54:57 +08:00
chiguyong	fa2a6dece2	feat(router): enable SemanticRouter + upgrade benchmark to L3/L5 - Enable SemanticRouter in agentkit.yaml (router.semantic.enabled: true) - Integrate SemanticRouter into e2e backtest (_build_real_components) - Add 8 new semantic test cases: 5 colloquial + 3 mixed-lang expressions - Add L3 output quality evaluation framework (LLM-as-Judge, 1-5 score) - Add L5 adaptive capability metrics (consistency rate from overfitting data) - Add OutputQualityObservation model and evaluate_output_quality() method - Report now includes L3 and L5 sections Results: 52 tests pass, description_match F1=66.67%, L5 adaptive rate=100%	2026-06-15 23:02:47 +08:00
chiguyong	e984b4c462	feat(router): optimize routing intelligence — ExecutionMode expansion, multi-candidate scoring, quality gate skill match - Expand ExecutionMode enum with REWOO/REFLEXION/PLAN_EXEC - Add _resolve_execution_mode() to respect skill.config.execution_mode - Rewrite IntentRouter._match_keywords() for multi-candidate scoring - Add QualityGate 5th dimension: skill_match validation with warning escalation - Calibrate HeuristicClassifier: low-complexity signals only when no high signals - Fix negation regex for Chinese text (avoid matching past punctuation) - Fix backtest mode_map normalization and .env loading - Add 61 unit tests (21 HeuristicClassifier + 14 IntentRouter + 13 QualityGate + 13 existing) Results: execution_mode_accuracy 9.09%→36.36%, skill_routing_F1 66.67%→77.78%	2026-06-15 22:43:13 +08:00
chiguyong	64d62a2b60	feat: autonomous task execution - connect PlanExecEngine + TeamOrchestrator U1: TeamOrchestrator._execute_phase real execution (Expert.agent.execute) U2: LLM-based merge strategies (BEST/VOTE/FUSION) with fallback U3: ReActStepExecutor replacing _LLMStepAgent for tool-enabled steps U4: SharedWorkspace integration for cross-phase/cross-execution state U5: GoalPlanner prompt tuning with few-shot and verb pattern matching U6: Replan-before-fallback in TeamOrchestrator U7: End-to-end validation tests for multi-step research tasks U8: WebSocket progress events (step_event_callback + new event types) Code review fixes: P0 response.strip fix, P1 competitor status check, milestone real impl, VOTE self-bias fix, confirmation_handler wiring, ExpertTeam public API, DRY _build_result_summaries, replan tests Also: geo_server.py refactor (ServerConfig.from_yaml), delete llm_config.yaml	2026-06-15 12:41:32 +08:00
chiguyong	99fe4c99f7	fix: comprehensive code review fixes + WS test stability	2026-06-15 08:17:34 +08:00
chiguyong	7384ecb03e	feat: Expert Team Mode — plan-execute collaboration with conversation UI Implements B+C hybrid Expert Team Mode with ExpertConfig, CollaborationPlan, TeamOrchestrator, ExpertTeamRouter, HandoffTransport, SharedWorkspace, and Expert wrapper. Frontend includes ExpertTeamView, ExpertMessage, PlanVisualization, team store, and WS event handlers. Code review fixes: sentinel-based close, per-phase retry, name validation, Vue component integration, teamState dedup, Redis reset, plan reassign, event_type validation, hmac timing-safe compare, message dedup, reactive updatePhases, O(1) phase lookup, iterative DFS, bounded Queue. 232 unit tests passing.	2026-06-14 22:20:14 +08:00
chiguyong	94c4c8b887	feat: accumulated frontend enhancements, docs, and static assets - Frontend view updates (ChatView, EvolutionView, SkillsView, etc.) - Updated portal routes and chat store - New frontend components (FilePreview, ToolCallCard, IconNav) - Updated static build assets - New test files (merged router, parallel tools, ReWOO fallback) - Documentation and brainstorm files - Codegraph and understand-anything artifacts	2026-06-14 16:35:01 +08:00
chiguyong	6e0e081f23	feat: gap closure sprint — dark theme, @-mention, LocalComputerUse, tests P0: U4 UsageStore + U5 CascadeStateStore independent test files (57 tests) P1: Dark theme — tokens.css [data-theme="dark"] + theme.ts Pinia store + TopNav toggle button + App.vue dynamic Ant Design theme P1: @-mention — MentionDropdown.vue + /skills/mention-suggest API + ChatInput integration with @ detection P2: LocalComputerUseSession — pyautogui + screencapture (replaces Docker stub) P2: Integration tests for gap closure (12 tests) Fix: create_cascade_state_store() now passes session_ttl to InMemory fallback	2026-06-14 16:16:50 +08:00
chiguyong	0ccef7be5c	feat: P0 production hardening — LLM cache, semantic routing, state persistence U1: LLM Cache Core (exact + semantic match, InMemory + Redis backends) U2: Cache integration into LLMGateway with CacheConfig U3: Semantic Router as Layer 1.5 in CostAwareRouter U4: UsageStore persistence (Redis Hash + InMemory fallback) U5: CascadeStateStore persistence (Redis INCR + InMemory TTL) U6: EvolutionStore interface unification (Protocol + PostgreSQL backend) U7: Configuration integration + E2E tests Code review fixes: - P0: date iteration bug (day>=28), semantic router index never built, Redis connection leak (per-call → persistent pool) - P1: cache degradation recovery, semantic_search degradation, double miss counting, asyncio.Lock for PG init, LIMIT on queries, __import__ anti-pattern → _utcnow() - P2: InMemory TTL cleanup, embedding preservation on put(), data TTL = max(exact_ttl, semantic_ttl)	2026-06-14 15:16:00 +08:00
chiguyong	09698d7a06	feat: frontend productization with code review fixes - Workflow: visual canvas, undo/redo, drag-and-drop, real-time execution WebSocket - Evolution: dashboard, ECharts metrics, experience timeline, pitfall warnings, usage panel - KB: source CRUD, document upload, search test - Terminal: interactive PTY WebSocket, whitelist security - Security: hmac.compare_digest, API key auth on all endpoints, whitelist bypass fix - Fixes: ECharts async init, WebSocket intentional disconnect, TOCTOU race, Pydantic models	2026-06-13 01:29:58 +08:00
chiguyong	5ef08a3b30	fix(review): comprehensive P0-P2 code review fixes	2026-06-12 22:18:25 +08:00
chiguyong	a36bc3d1c1	feat: optimize chat response speed for sub-1s first token latency - Add HeuristicClassifier to replace LLM quick_classify with zero-cost local heuristic (keyword/length/code-pattern scoring), gated by router.classifier config (default: heuristic) - Add parallel tool execution in ReActEngine via asyncio.gather for multiple independent tool_calls, gated by parallel_tools param - Add AsyncWriteQueue for non-blocking session persistence with WAL buffer, gated by async_writes param on SessionManager - Add httpx.Limits connection pool config to all LLM providers - Add router config section to ServerConfig and agentkit.yaml - All optimizations have config switches for safe rollback	2026-06-12 13:15:06 +08:00
chiguyong	8c365486e2	fix(pipeline): address code review findings for adversarial loop Critical: - C1: Add verifier_timeout_seconds for independent Verifier timeout - C2: Verifier parse failure raises RuntimeError instead of dead-loop Major: - M1: Inject previous_output into Worker retry context - M2: Add Pydantic ge/le constraint on ReviewFeedback.score - M3: Use Literal type for feedback_mode enum validation - M4: Use Literal types for ReviewIssue severity and category - M5: Merge error messages when escalation agent also fails Tests: 8 new test cases added (19 total), all passing	2026-06-12 10:02:37 +08:00
chiguyong	ddc735b078	test(pipeline): add coding harness integration tests 5 passing tests covering: - Pipeline config loading and validation - Review stage adversarial config verification - Stage dependencies validation - Code reviewer skill config and output schema 3 skipped tests (complex mock sequencing covered by unit tests)	2026-06-12 09:42:21 +08:00
chiguyong	3392413614	test(pipeline): add adversarial loop unit tests 11 test cases covering: - PipelineSchemaAdversarial (4): verifier fields, backward compat, serialization, state tracking - AdversarialExecution (3): no verifier passthrough, first round pass, max rounds exhausted - FeedbackContext (3): structured+natural, structured, natural modes - Escalation (1): no escalation configured	2026-06-12 09:40:19 +08:00
chiguyong	32c800d1e4	fix: portal routing + response speed + IME input 1. Portal unified routing: ws_chat now uses CostAwareRouter uniformly (handles Layer 0/1/2), replacing direct IntentRouter calls. Greeting/chat_mode requests skip IntentRouter LLM call entirely. 2. Response speed: greeting & simple chat now use direct LLM call (no ReAct loop), zero-cost Layer 0 detection. 3. IME input fix: use e.isComposing (native browser property) instead of compositionstart/end for Enter key detection. 4. Test: fix InMemoryMessageBus.request() parameter name timeout -> timeout_seconds.	2026-06-11 21:30:25 +08:00
chiguyong	d47f279887	fix: resolve code review issues from deferred improvements 1. InMemoryMessageBus.request(): fix param name (timeout→timeout_seconds) to match ABC 2. InMemoryMessageBus: track consumer tasks, cancel on unsubscribe 3. InMemoryMessageBus: _try_resolve_pending() in queue consumer path 4. evolve_soul(): use "default" category when patterns is empty 5. quick_classify(): use delimiter-based prompt to mitigate injection risk 6. Use asyncio.get_running_loop() instead of deprecated get_event_loop()	2026-06-11 13:49:02 +08:00
chiguyong	79eb8469f9	fix: address remaining code review issues - AlignmentGuard: direction-aware constraint checking (negation/affirmation detection) instead of simple substring matching to reduce false positives - Reflexion: extract actual token usage from LLM response instead of hardcoded 1 - MemoryTool: protect version/history sections from update_soul modification - Fix AsyncMock warnings for sync find_best_agent method	2026-06-11 00:14:11 +08:00
chiguyong	bba394be38	fix(marketplace): address code review findings - Fix str.format() crash when user input contains curly braces - Fix Layer 2 passing str to find_best_agent (expects list[str]) - Fix AlignmentGuard fail-open on LLM audit failure (now fail-closed) - Fix _config_reload_lock not initialized in create_app() - Fix evolve_soul redundant reflector.reflect() call (reuse existing reflection) - Fix test mocks using AsyncMock for sync find_best_agent method - Remove unused _COMPLEXITY_CLASSIFY_PROMPT constant	2026-06-10 19:21:40 +08:00
chiguyong	8713636d50	feat(marketplace): add Phase B/C - CostAwareRouter, OrganizationContext, AlignmentGuard, Soul Evolution, Auction, Server Integration Phase B: - U1: CostAwareRouter with 3-layer routing (rule/LLM/capability matching) - U6: OrganizationContext with agent profiles and capability-based discovery - U7: AlignmentGuard with constraint injection and cascade detection Phase C: - U8: Soul dynamic evolution with version tracking and reflection-triggered updates - U9: Auction mechanism as optional advanced routing mode - U10: Server integration + end-to-end integration tests 250 new tests passing across all units.	2026-06-10 19:09:02 +08:00
chiguyong	5b42487d8a	feat(core): add ReWOO, Plan-and-Execute, Reflexion execution engines Phase A of Multi-Agent Marketplace architecture: - ReWOOEngine: plan-all-then-execute pattern for parallel data fetch - PlanExecEngine: adapter wrapping GoalPlanner+PlanExecutor+PipelineReplanner - ReflexionEngine: ReAct + Evaluate + Reflect + Retry for high-precision tasks - SkillConfig: extend VALID_EXECUTION_MODES with rewoo/plan_exec/reflexion - ConfigDrivenAgent: add _handle_rewoo/_handle_plan_exec/_handle_reflexion routes - 5 professional agent YAML configs with layered model defaults - 107 unit tests passing	2026-06-10 17:08:48 +08:00
chiguyong	6852dfe892	fix(security,reliability): resolve all P2 findings from code review	2026-06-10 15:05:40 +08:00
chiguyong	658e188939	fix(review): resolve P0/P1 findings from final code review	2026-06-10 09:57:29 +08:00
chiguyong	1d1805753c	fix: resolve key P2 findings from code review - Shell whitelist: use exact binary match instead of startswith - Shell audit log: use deque(maxlen=10000) to cap memory - Terminal history: use deque(maxlen) for O(1) eviction - Path optimizer: cap _pending_paths at 50 entries per task_type - Pitfall detector: only add tips to matching steps, not all - Experience store: handle non-numeric _parse_time_window input - Extract shared is_safe_url() to utils/security.py (DRY) - Workflow condition evaluator: handle float() ValueError	2026-06-10 09:01:23 +08:00
chiguyong	b46a10973f	fix(tests): clean up test_shell_tool.py lint issues	2026-06-10 08:46:35 +08:00
chiguyong	9646b0f0dd	fix(tests): update test_shell_tool.py to match new ShellTool API	2026-06-10 08:22:15 +08:00
chiguyong	7874e875af	merge: integrate feat/agentkit-phase8-chat-adaptive (chat/gui commands + GUI mode) Restores agentkit chat, agentkit gui CLI commands, onboarding wizard, and GUI mode (AGENTKIT_GUI_MODE) with static file serving. Resolves merge conflicts in orchestrator.py, app.py, tools/__init__.py, shell.py.	2026-06-10 07:44:06 +08:00
chiguyong	9e9f1314f6	fix(security): resolve all P0/P1 findings from code review	2026-06-10 07:12:41 +08:00
chiguyong	b34f74f598	feat(phase6): implement end-to-end enterprise scenario validation (U15) - Add goal-driven agent skill config and pipeline config - Add 9 E2E integration tests covering all 7 capabilities: - SC1: Goal-driven SEO analysis (GoalPlanner→PlanExecutor→PlanChecker→ExperienceStore) - SC2: Knowledge Q&A with system operation (MultiSourceRAG) - SC3: Workflow with approval (WorkflowStore + approval node) - SC4: Self-evolution experience accumulation (ExperienceStore→PitfallDetector→PathOptimizer) - SC5: Parallel execution efficiency verification - SC6: Skill registry integration (capabilities, versions, health) - Cross-capability: Plan+Experience+Pitfall, Review+Experience, RAG+Workflow - All 2472 tests passing (9 integration + 2463 unit)	2026-06-10 01:38:28 +08:00
chiguyong	c606ffa64a	feat(phase5): implement management pages, evolution dashboard, and workflow editor (U13b/U13c/U14)	2026-06-10 01:29:01 +08:00
chiguyong	a1deeecede	feat(phase5): implement Vue3 portal foundation with chat interface and routing (U13a) - Add Portal API routes: chat, stream, capabilities, conversations, WebSocket - Add ConversationStore for in-memory conversation management - Add CAPABILITY_CATEGORIES mapping for 8 capability types - Create Vue3 SPA with TypeScript, Pinia, Vue Router, Ant Design Vue - Implement ChatView with message bubbles, input, sidebar, WebSocket support - Add side navigation skeleton for all 8 capability sections - Add placeholder views for workflow, knowledge, skills, terminal, etc. - 31 backend tests passing	2026-06-10 01:06:48 +08:00
chiguyong	901e4d9d0a	feat(phase4): implement Computer Use integration (U12) - ComputerUseTool: Anthropic API + fallback chain (API→Session→ShellTool→AskHuman) - ComputerUseSession: Docker sandbox + InMemory test session - ComputerUseRecorder: action recording, replay, and persistence 89 new tests passing. Degradation chain verified.	2026-06-10 00:54:31 +08:00
chiguyong	c99aee1423	feat(phase3): implement knowledge base and RAG enhancement (U9-U11) - U9: LocalDocumentIngestion - multi-format doc parsing and chunking - U10: ExternalKBAdapters - Feishu/Confluence/GenericHTTP adapters - U11: MultiSourceRAG - multi-source retrieval with source tracing KnowledgeBase protocol defined (KTD-7). 145 new tests passing.	2026-06-10 00:45:17 +08:00
chiguyong	e3d4f811dd	feat(phase2): implement self-evolution and smart terminal (U6-U8) - U6: PitfallDetector - detect historical failure patterns and warn - U7: PathOptimizer - discover and update optimal execution paths - U8: TerminalSession - session state, PTY interactive, output parsing 160 new tests passing. ShellTool enhanced with session_id support.	2026-06-10 00:22:36 +08:00
chiguyong	fd4a811929	feat(phase1): implement core kernel and experience foundation (U1-U5) - U1: GoalPlanner - structured goal decomposition wrapping _decompose_task() - U2: PlanExecutor - parallel execution with retry/skip/replace strategies - U3: PlanChecker - quality gate + review + experience writing - U4: Skill spec upgrade - dependencies, capabilities, version management - U5: ExperienceStore - PostgreSQL+pgvector task experience storage 208 new tests passing, fully backward compatible.	2026-06-09 23:57:03 +08:00
chiguyong	31bd3b126c	feat(phase8): chat adaptive enhancements, pipeline reflection, search tools upgrade - Enhanced chat CLI with adaptive mode and session management - Added pipeline reflection and schema extensions - Upgraded BaiduSearch and WebSearch tools with advanced capabilities - Expanded server routes for skills and chat - Added session store enhancements - New chat module and pipeline reflection support	2026-06-09 23:18:06 +08:00
chiguyong	045fecd4ce	feat(tools): add ShellTool + WebSearchTool, memory system, onboarding wizard, chat mode - ShellTool: safe command execution with allowlist, blocked patterns (regex), timeout, output truncation - WebSearchTool: multi-backend search with Tavily → Serper → DuckDuckGo Lite fallback - MemoryTool: agent-callable tool with add/replace/remove/read actions - MemoryStore/MemoryFile: file-based memory (SOUL.md, USER.md, MEMORY.md, DAILY.md) - Onboarding wizard: provider selection, API key, model selection, agent personality - Chat mode: interactive CLI with streaming, memory injection, tool integration - Add 百炼 Coding Plan provider with 10 models - 102 unit tests (34 new for ShellTool + WebSearchTool)	2026-06-09 01:06:45 +08:00
chiguyong	9874a4aac0	test: add Phase 8 integration tests for Chat + Adaptive + Multi-Agent (U8) End-to-end integration tests covering session lifecycle, adaptive pipeline, multi-agent communication via MessageBus, and config serialization.	2026-06-08 01:17:04 +08:00
chiguyong	45283d31e8	feat(core): integrate MessageBus into Orchestrator and AgentPool (U7) - Orchestrator accepts optional message_bus parameter; workers publish task.progress messages via MessageBus after each subtask execution - AgentPool accepts optional message_bus; auto-registers agents on create and auto-unregisters on remove - app.py initializes MessageBus from config and injects into AgentPool - ServerConfig adds bus configuration field - 5 new tests, all passing	2026-06-08 00:03:40 +08:00
chiguyong	13d6e74099	feat(bus): add MessageBus abstraction layer with InMemory + Redis Streams (U6) - AgentMessage: message model with sender/recipient/topic/payload/correlation_id - MessageBus Protocol: publish/subscribe/unsubscribe/request/broadcast/health_check - InMemoryMessageBus: asyncio.Queue-based implementation for testing - RedisMessageBus: Redis Streams (XADD/XREADGROUP) implementation with consumer groups, message acknowledgment, and dead letter queue - create_message_bus() factory with graceful Redis→InMemory fallback - Request-response pattern via correlation_id + asyncio.Future - 13 new tests, all passing	2026-06-07 23:58:16 +08:00
chiguyong	88d8298871	feat(core): add Orchestrator adaptive task decomposition (U5) - execute_adaptive(): iterative execute→evaluate→re-decompose loop - OrchestratorConfig: adaptive, max_iterations, quality_threshold - _evaluate_quality(): LLM-based or rule-based quality scoring (0-1) - _reexecute_failed(): preserves completed subtask results, retries failed ones with improvement feedback injected into input_data - OrchestrationResult.metadata field for tracking iteration history - 10 new tests, all passing	2026-06-07 23:50:54 +08:00
chiguyong	7054ac02b6	feat(tools): add AskHumanTool + token streaming in ReAct execute_stream - AskHumanTool: Human-in-the-Loop tool for Chat mode, pushes questions via WebSocket callback and waits for user reply via asyncio.Future - Token streaming: execute_stream() now uses chat_stream() instead of chat(), yielding token-type ReActEvents for each StreamChunk - _build_response_from_stream() static method constructs LLMResponse from accumulated stream data - Export AskHumanTool from tools/__init__.py - 12 new tests (7 AskHumanTool + 5 token streaming), all passing	2026-06-07 23:40:43 +08:00
chiguyong	6013d5189b	feat(chat): add Chat API routes with REST + WebSocket bidirectional communication	2026-06-07 22:49:26 +08:00
chiguyong	493187782c	feat(session): add Session/Message models and SessionManager with InMemory/Redis stores	2026-06-07 22:43:14 +08:00
chiguyong	b34b06724d	fix(agentkit): resolve all P0/P1/P2/P3 issues from code review	2026-06-07 22:05:18 +08:00
chiguyong	bad66445ff	feat(compression): U6 GEO Pipeline compression integration tests and config Add GEO Pipeline end-to-end compression integration tests with MockHeadroomCompressor. Add compression configuration section to llm_config.yaml with headroom and summary mode examples.	2026-06-07 18:20:41 +08:00
chiguyong	9c04362dba	feat(compression): U5 HeadroomRetrieveTool for CCR cache retrieval Add HeadroomRetrieveTool that allows LLM to retrieve original uncompressed data from CCR cache via Function Calling. Auto-registered when HeadroomCompressor is active and available.	2026-06-07 18:20:17 +08:00
chiguyong	286804792d	feat(compression): U4 ServerConfig compression field and Agent injection Add compression config to ServerConfig (following telemetry pattern), create compressor in create_app, pass through AgentPool to ConfigDrivenAgent, and inject into ReActEngine.execute() calls.	2026-06-07 18:20:05 +08:00
chiguyong	fcb4fb33f3	feat(compression): U3 ReAct engine tool result compression and incremental compress Extend _build_tool_result_message to accept compressor parameter for tool output compression. Add _should_compress helper for token budget checking. Add incremental compression within ReAct loop when conversation exceeds threshold.	2026-06-07 18:19:53 +08:00


90 Commits
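
Commit 11e2009cb8 above states the short-text threshold rule in a single line: text under 20 characters uses effective_low = max(0.25, low - 0.15), with similarity_low lowered to 0.4. A minimal sketch of that arithmetic follows. The function name `effective_low_threshold` and its parameter names are hypothetical and are not taken from the repository; only the numeric constants come from the commit message.

```python
def effective_low_threshold(
    text: str,
    low: float = 0.4,      # similarity_low after the commit (previously 0.6)
    short_len: int = 20,   # "short text" cutoff quoted in the commit message
    floor: float = 0.25,   # lower bound quoted in the commit message
    relax: float = 0.15,   # amount subtracted for short text
) -> float:
    """Return the lower similarity threshold to apply to a query.

    Hypothetical helper: short queries get a relaxed threshold,
    but never below `floor`; longer queries keep `low` unchanged.
    """
    if len(text) < short_len:
        return max(floor, low - relax)
    return low


if __name__ == "__main__":
    for query in ("hi", "please review this pull request for me"):
        print(len(query), effective_low_threshold(query))
```

With the default `low` of 0.4, the relaxed value and the floor coincide at about 0.25. The floor would only take effect if `low` were configured below 0.4.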