fischer-agentkit

Commit Graph

Author	SHA1	Message	Date
chiguyong	1d09fafec9	feat(core): reflexion in main flow - verify fail → reflect → retry (U5, R4)	2026-07-03 13:29:54 +08:00
chiguyong	dd259153fa	feat(core): wire evolution hooks into execute_stream path (U2, OQ6 fix) ConfigDrivenAgent.execute_stream() now fires on_task_complete/on_task_failed evolution hooks in its finally block, achieving lifecycle parity with the sync execute() path. This fixes the OQ6 gap where WebSocket-routed streaming tasks bypassed evolution entirely. Implementation: - Module-level backpressure manager (_schedule_evolution / drain_pending_evolution_tasks) with cap = max(2, max_concurrency * 2), drop + log + counter on exceed, and shutdown drain via asyncio.gather(return_exceptions=True). - _trigger_evolution_hooks / _evolve_safe methods on ConfigDrivenAgent: fire-and-forget via asyncio.create_task, evolution errors swallowed (never fail the stream). - execute_stream finally block distinguishes cancelled (CancelledError / TaskCancelledError -> CANCELLED), failed (Exception -> FAILED), completed (final_answer received -> COMPLETED), and early-close (no completion, no error -> CANCELLED "stream closed before completion"). - app.py shutdown drains pending evolution tasks. - plan_exec_engine.py / reflexion.py: doc comments noting hooks fire at the ConfigDrivenAgent layer (single chokepoint, no double-fire). - portal.py: verification comments at 3 execute_stream call sites (these call react_engine.execute_stream directly, bypassing ConfigDrivenAgent - known gap tracked separately). Tests (8 new in test_execute_stream_hooks.py): - Happy path: success fires COMPLETED, failure fires FAILED. - Edge cases: cancellation fires CANCELLED, early aclose fires CANCELLED, evolution error suppressed, backpressure cap drops + counts. - Parity: REST on_task_complete vs execute_stream both fire COMPLETED. - Disabled: _evolution_enabled=False fires no hooks.	2026-07-03 12:16:02 +08:00
chiguyong	47a437c5e3	fix(experts): resolve residual review findings from PR #13 Addresses 4 actionable findings (1 P1 + 3 P2) from ce-code-review of feat/ui-ue-enhancement (PR #13), now merged to main (`8066e0b`). P1 - expert_step payload alignment (_phase_executor.py) The thinking/tool_call/tool_result event payloads were missing the fields the frontend WsServerMessage contract requires (expert_name/expert_color/content/step). Frontend code consuming these events silently degraded. Now all expert_step broadcasts carry the full contract; tool_call/tool_result keep step_data for the raw payload. P2 #1 - execute_stream CancellationToken registration (config_driven.py) execute_stream() bypassed BaseAgent.execute() and never registered a CancellationToken, so cancel_task() could not cooperatively cancel a streaming task. Now registers the token and cleans it up in finally. P2 #2 - team_synthesis orphan milestone cleanup (orchestrator.py) If synthesis streaming was interrupted (cancel/exception), no terminal team_synthesis event was emitted, leaving the frontend streaming milestone spinning forever. Now an inner try/except emits a terminal team_synthesis with status=cancelled|error before re-raising, so the frontend can finalize the milestone. The success path also carries the synthesis_id. P2 #3 - synthesis_id dedup (orchestrator.py + types.ts + chatStream.ts) Without an identifier, the frontend could not precisely match a team_synthesis terminal event to its streaming milestone (especially across retries/concurrent teams). The backend now injects a stable synthesis_id (`{plan.id}:synthesis`) into both team_synthesis_chunk and team_synthesis events; the frontend uses it for exact milestone matching and treats error/cancelled status as terminal. Test updates - Updated test_thinking_events_forwarded_as_expert_step to assert the new payload contract (expert_id/name/color/content/step). - Added test_tool_call_events_forwarded_as_expert_step covering tool_call/tool_result payload shape (content=tool_name summary + step_data=raw payload). Verification - ruff check: clean - pytest tests/unit/experts/test_phase_executor_streaming.py: 14/14 - npm run typecheck: clean - vitest: 126/127 (1 unrelated baseline failure in tauri-auth.test.ts) Residuals doc: docs/residual-review-findings/feat-ui-ue-enhancement.md	2026-07-01 13:26:19 +08:00
chiguyong	f872a3fac6	feat: UI/UE enhancement - streaming, sticky header, hover actions, calendar tokens U1 ThinkingBlock: streaming cursor + auto-collapse to summary bar U2 StickyModeHeader: new component replacing ExpertTeamView + BoardStatusView U3 Backend _phase_executor: execute_stream() with token/thinking/final_answer forwarding U4 Frontend chatStream: expert_result_chunk/team_synthesis_chunk token accumulation U5 AssistantText: routing tag hover fade-in U6 UserBubble: hover actions (copy/delete/refill) U7 CalendarGrid: token-based color redesign Review fixes (ce-code-review): - P0: _VALID_TEAM_EVENT_TYPES whitelist adds 3 new streaming event types - P0: final_answer no longer double-accumulates token content - P2: exception handling expanded to except Exception for LLMProviderError etc. Simplification (ce-simplify-code): - _synthesizer.py: O(n²) concat -> list+join, _concat_results extraction - config_driven.py: 4 duplicate _handle_*_stream -> _wrap_sync_as_stream - chatStream.ts: 5x [...messages].reverse().find() -> findLastMessage helper Tests: pytest 13/13, vitest 126/127 (1 baseline), typecheck pass, ruff clean	2026-07-01 12:51:45 +08:00
Fischer	a778f816c5	refactor: tech debt Wave 1+2 (finish except Exception cleanup + core/experts Any type governance) (#10)	2026-07-01 03:54:53 +08:00
chiguyong	3337589395	fix(review): document-processing code review fixes - validation, tests, formatting - SkillConfig._validate_v2: validate fallback_strategies against ReWOOEngine.VALID_STRATEGIES (lazy import, #20) - test_skill_config: +4 tests for fallback_strategies validation - test_document_loader: +8 xlsx edge case tests (empty workbook, malformed bytes, column mismatch, row/cell truncation, multi-sheet, file size limit, None cells, #16) - test_execution_modes: fix ReWOOEngine patch path (lazy import -> patch at source) + FakeReWOOEngine.execute return .output attribute - config_driven: ruff formatting (quotes, blank lines after imports) - project_rules: remove stale "known failing test" note (now passes)	2026-06-23 20:21:19 +08:00
chiguyong	99fe4c99f7	fix: comprehensive code review fixes + WS test stability	2026-06-15 08:17:34 +08:00
chiguyong	0ccef7be5c	feat: P0 production hardening — LLM cache, semantic routing, state persistence U1: LLM Cache Core (exact + semantic match, InMemory + Redis backends) U2: Cache integration into LLMGateway with CacheConfig U3: Semantic Router as Layer 1.5 in CostAwareRouter U4: UsageStore persistence (Redis Hash + InMemory fallback) U5: CascadeStateStore persistence (Redis INCR + InMemory TTL) U6: EvolutionStore interface unification (Protocol + PostgreSQL backend) U7: Configuration integration + E2E tests Code review fixes: - P0: date iteration bug (day>=28), semantic router index never built, Redis connection leak (per-call → persistent pool) - P1: cache degradation recovery, semantic_search degradation, double miss counting, asyncio.Lock for PG init, LIMIT on queries, __import__ anti-pattern → _utcnow() - P2: InMemory TTL cleanup, embedding preservation on put(), data TTL = max(exact_ttl, semantic_ttl)	2026-06-14 15:16:00 +08:00
chiguyong	55421dd126	fix: get_tools() and get_system_prompt() now read from tool_registry too Root cause: app.py registers tools via agent._tool_registry.register() which adds to the ToolRegistry but NOT to agent._tools (which is only populated by use_tool() from config). Both get_tools() and get_system_prompt() were reading only _tools, missing all post-init registered tools. Now both methods merge _tools with _tool_registry.list_tools().	2026-06-11 22:17:14 +08:00
chiguyong	f7225bc91a	fix: include available tools in system prompt so LLM knows what it can call Previously get_system_prompt() only returned identity/instructions but did not tell the LLM what tools are available. The LLM would therefore refuse to call tools even when they were registered, saying it had no tools. Now the system prompt includes a '## 可用工具' ("Available tools") section listing all registered tools with their descriptions and parameters.	2026-06-11 22:00:30 +08:00
chiguyong	cc2cd414c9	fix: resolve all code review issues from cross-validation 1. Critical: Add missing TaskResult import in plan_exec_engine.py 2. Critical: Fix ReWOOEngine param name (max_steps → max_plan_steps) 3. Major: Remove duplicate token counting in reflexion.py 4. Major: LLM audit failure now passes (trusts rule check) instead of failing 5. Major: Fix dict iteration with del using list() copy in lifecycle.py 6. Major: Fix Chinese content tokenization using regex split instead of space split 7. Minor: _is_positive_mention now checks all occurrences, not just the first	2026-06-11 06:22:35 +08:00
chiguyong	5b42487d8a	feat(core): add ReWOO, Plan-and-Execute, Reflexion execution engines Phase A of Multi-Agent Marketplace architecture: - ReWOOEngine: plan-all-then-execute pattern for parallel data fetch - PlanExecEngine: adapter wrapping GoalPlanner+PlanExecutor+PipelineReplanner - ReflexionEngine: ReAct + Evaluate + Reflect + Retry for high-precision tasks - SkillConfig: extend VALID_EXECUTION_MODES with rewoo/plan_exec/reflexion - ConfigDrivenAgent: add _handle_rewoo/_handle_plan_exec/_handle_reflexion routes - 5 professional agent YAML configs with layered model defaults - 107 unit tests passing	2026-06-10 17:08:48 +08:00
chiguyong	286804792d	feat(compression): U4 ServerConfig compression field and Agent injection Add compression config to ServerConfig (following telemetry pattern), create compressor in create_app, pass through AgentPool to ConfigDrivenAgent, and inject into ReActEngine.execute() calls.	2026-06-07 18:20:05 +08:00
chiguyong	6e362a8ae7	feat(agentkit): Phase 4 enterprise production upgrade — 12 Implementation Units Phase A (P0): EpisodicMemory pgvector search+EmbeddingCache, ReAct timeout+CancellationToken, evolution system fix (A/B test+LLMPromptOptimizer+StrategyTuner), AnthropicProvider native Messages API Phase B (P1): RetryPolicy+CircuitBreaker, chat_stream fallback chain, WebSocket endpoint, SSE stream fix, Evolution+Memory API routes (7 endpoints), embedding cache+Enhanced Search per-KB degradation fix Phase C (P2): GeminiProvider native generateContent API, Agent state lock+config hot-reload Tests: 1301 passed, 18 skipped, 0 failed	2026-06-06 21:51:04 +08:00
chiguyong	e33dc25ad3	feat(memory): RAG pipeline optimization — 5 Implementation Units U1: QueryTransformer — LLM/rule-based query rewriting + sub-query decomposition U2: HttpRAGService enhanced_search() — rerank + compression via /bases/{kb_id}/retrieve U3: Structured context injection — source attribution headers in RAG results U4: RetrieveKnowledgeTool — built-in tool for mid-reasoning knowledge retrieval U5: Configurable retrieval params + per-KB weights + CJK token estimation Config example: memory: retrieval: top_k: 5 token_budget: 2000 context_template: structured query_transform: enabled: true strategy: llm semantic: search_mode: enhanced use_rerank: true kb_weights: industry-kb-id: 1.2 enterprise-kb-id: 0.8 Tests: 1037 passed, 18 skipped, 0 failed	2026-06-06 19:27:09 +08:00
chiguyong	cd5b39087e	feat(memory): add HttpRAGService for config-driven knowledge base integration	2026-06-06 18:36:05 +08:00
chiguyong	f858d279f3	feat(agentkit): Phase 3 upgrade - persistence, memory, evolution, observability 10 Implementation Units across 3 phases: Phase A - Infrastructure: - U1: RedisTaskStore with Redis/memory backend + factory function - U2: TraceRecorder for execution trace recording - U3: PersistentEvolutionStore with SQLite backend Phase B - Core Capabilities: - U4: MemoryRetriever integration into ReAct engine - U5: Embedder abstraction + EpisodicMemory vector search - U6: LLMReflector for LLM-in-the-loop reflection - U7: SkillPipeline for multi-skill orchestration Phase C - Enhancement: - U8: SKILL.md format + progressive disclosure levels - U9: ContextCompressor + prompt cache rendering - U10: Structured logging + metrics endpoint + enhanced health check Tests: 924 passed, 18 skipped, 0 failed	2026-06-06 17:17:45 +08:00
chiguyong	acec8ff743	feat(evolution): Phase A - lifecycle hooks + EvolutionConfig U11: EvolutionMixin integrated into ConfigDrivenAgent lifecycle - on_task_complete triggers evolve_after_task - on_task_failed records failure patterns - Evolution errors never break main task flow U12: EvolutionConfig added to SkillConfig - enabled, reflect_on_failure, auto_apply, min_quality_threshold - Backward compatible: defaults to enabled=False 21 new tests passing, no regression.	2026-06-06 12:05:56 +08:00
chiguyong	5f1c51cf9a	feat(server): Phase B - auth, rate limiting, SSRF protection, handler whitelist U1: API Key authentication middleware (dev mode skip, health whitelist) U2: Rate limiting middleware (fixed-window, 60 req/min default) U3: Callback URL SSRF protection (private IP blocking) U4: custom_handler module prefix whitelist 65 tests passing. CORS conflict fixed.	2026-06-05 23:37:36 +08:00
chiguyong	f87b790c0f	feat(agentkit): v2 Phase 1 - ReAct/LLM Gateway/Skill/Server + review fixes 535 unit + 52 integration tests passing. README added.	2026-06-05 23:32:16 +08:00
chiguyong	5a90824c77	feat(core): add ConfigDrivenAgent with YAML-driven agent definition - AgentConfig: YAML/dict config model with validation - ConfigDrivenAgent: 3 task modes (llm_generate, tool_call, custom) - StandaloneRunner: auto-discover YAML configs and build agents - 25 new tests covering all modes and edge cases - Total: 56 tests passing	2026-06-04 22:39:25 +08:00

21 Commits