fischer-agentkit

Commit Graph

Author	SHA1	Message	Date
Fischer	8633f60831	feat: complex-task-quality-loop (R1-R12) - 11 P1 blockers fixed (#22) Merge feat/complex-task-quality-loop into main. Includes U1-U9/R1-R12 implementation + 11 P1 blocker fixes from ce-code-review. P1 fixes: trace_outcome propagation, portal execute_stream routing, network_block reentrancy, spec review gate wiring, max_reflections threading, phase budgets, plan aggregation, failure status mapping, evolution drain timeout, portal spec_review_reply, spec_review persistence.	2026-07-05 22:31:21 +08:00
chiguyong	e5e76697a9	fix(review): resolve 11 P1 blockers from ce-code-review P1#1 config_driven: propagate trace_outcome into output_data so lifecycle._is_failure_path() detects non-success outcomes P1#2 portal: route through ConfigDrivenAgent.execute_stream (not react_engine.execute_stream directly) so evolution hooks fire and trace_outcome propagates; add pre-built messages support in _build_llm_messages P1#3 sandbox: make network_block reentrant via module-level reference counter + threading.Lock - concurrent VERIFICATION phases no longer permanently block all new connections P1#4 chat: replace dead isinstance(_PlanExecEngine) check with hasattr(_spec_review_handler) to wire the spec review gate P1#5 plan_exec_engine: complete max_reflections threading chain (PlanExecEngine + ReActStepExecutor constructors) P1#6 plan_exec_engine: enforce phase budgets (max_steps from phase_budgets, not hardcoded 5) P1#7 plan_exec_engine: use current plan (not stale plan var) in aggregation after replan P1#8 plan_exec_engine: map failure to failed status (not success) P1#9 app: add drain timeout for pending evolution tasks on shutdown P1#10 portal: handle spec_review_reply in WS handler P1#11 chat: persist spec_review_request/reply/timeout to conversation store so reload can reconstruct gate state Tests: 116 related tests pass; 26 pre-existing failures unchanged (stash-verified). ruff lint clean.	2026-07-04 01:10:01 +08:00
chiguyong	229dc0b2f3	feat(bitable): U6 R15a BitableTool 4 new actions + DELETE /views endpoint Extend BitableTool from 6 to 10 actions (create_view, update_view, update_field, delete_view) and add the DELETE /views/{view_id} backend endpoint with 404-before-403 ownership, 409 last-view protection, and X-Internal-Token passthrough (KTD11). Backend: - repository.py: add delete_view(): DELETE row by view_id, returns rowcount > 0 - service.py: add LastViewDeletionError domain exception + delete_view() with last-view guard (siblings <= 1 → raise → route maps to 409) - routes/bitable.py: add DELETE /views/{view_id} (204 No Content), 404-before-403 ownership pattern, 409 on LastViewDeletionError, X-Internal-Token passthrough via require_bitable_auth - tools/bitable_tool.py: add 4 new actions (_create_view, _update_view, _update_field, _delete_view), register in BOTH handlers dict AND input_schema.action.enum (KTD10: 10 actions each) Frontend: - api/bitable.ts: add deleteView(viewId): Promise<void> - stores/bitable.ts: add deleteView action: removes from local state, switches to first remaining view if active was deleted, 409 warning - ViewSwitcher.vue: add delete button (a-popconfirm "确认删除此视图？", i.e. "Confirm deleting this view?"), hidden when views.length <= 1 (preempt last-view 409) - BitableFileDetailView.vue: handle @delete event from ViewSwitcher Tests: - test_routes.py: 6 new DELETE /views tests (204, 404 missing, 404 non-owner, 409 last-view, internal-token passthrough, internal-token 404) - test_bitable_tool.py: 13 new tests (action count = 10, handlers = 10, 4 action happy paths, missing-field errors, 409 last-view, R3/R4 config parity, X-Internal-Token passthrough on all 4 new actions) - e2e/bitable-agent-parity.spec.ts: 10 scenarios (P1-P10) covering delete button visibility, popconfirm, 204/409/404 flows, tab removal, view switch after delete, create view adds tab Verification: - ruff check: all files pass - pytest: 62 passed, 12 pre-existing failures (unchanged from `e931fbe` baseline) - typecheck: pass (EXIT_CODE=0) - build:frontend: pass (BUILD_EXIT=0) - action count: ENUM=10, HANDLERS=10, delete_view in both - no blue hex colors in ViewSwitcher.vue Pre-existing test failures (12, unchanged from `e931fbe`): test_create_table_success, test_create_field_success, test_list_fields, test_create_records_batch, test_upsert_inserts_then_updates, test_upsert_preserves_user_columns, test_create_view_success, test_batch_upsert_1200_records, test_resume_from_partial_failure, test_query_records, test_query_records_with_limit, test_collect_api Constraints honored: - No emojis, no `any` type, no blue hex colors, no pyproject.toml changes - 404-before-403 for non-owned resources (Pattern 4) - X-Internal-Token transparent passthrough (KTD11) - KTD10: actions registered in both handlers dict AND enum	2026-07-03 23:13:46 +08:00
chiguyong	e931fbef2d	feat(bitable): U5 R4 grouping (max 3 fields) + conditional formatting (7 operators) - GroupingEditor: multi-select field picker (max 3), per-level direction toggle, reorder buttons, "已知限制：不支持跨分组多选" note, empty state - ConditionalFormatEditor: per-rule enable/field/operator/value/color/bold, 8 color keys, WCAG 1.4.1 bold default true, first-match-wins footer legend - BitableGrid: unified section rendering (grouped/ungrouped via single vxe-grid declaration), group headers as separate divs (CF only on data cells), CF via row-config.className, multi-grid instance map for refresh - groupingRulesUtils: pure functions for CF matching (7 operators), group tree builder, SUM/AVG aggregation, CSS var mappers, self-check on load - view_config.py: Pydantic v2 validation (MAX_GROUP_BY_FIELDS=3, 7 operators, 8 color keys, extra="forbid" on sub-models) - routes/bitable.py: validate_view_config on PATCH (HTTP 422 on error) - stores/bitable.ts: updateViewConfig action (merges U5 sub-keys, preserves filters/sort/hidden_fields) - ViewConfigPanel: grouping + conditional-format tabs - E2E: 8 scenarios (G1-G8: single/multi grouping, collapse/expand, CF equals/between, combined, aggregation) - Tests: 54 unit tests (19 grouping + 35 CF), 2 PG-marked skipped	2026-07-03 22:33:18 +08:00
chiguyong	ffb7a51d77	fix(review): wire pitfall_detector/spec_review to PlanExecEngine + fix restore_budget_state reset order	2026-07-03 22:05:51 +08:00
chiguyong	120892e305	feat(chat): TEAM_COLLAB surfaces failure instead of silent REACT fall-back (U9, R7) - chat.py: TEAM_COLLAB execution_mode sends error + returns (no REACT fall-back) - REWOO/REFLEXION-as-mode keep deferred fall-back (RV10) - AGENTS.md: update stale "not yet supported" claim - Known gap: portal.py REST path still falls back (out of U9 scope)	2026-07-03 15:47:45 +08:00
chiguyong	786f921c5e	feat(core): spec review gate - pause PLAN_EXEC for user review (U8, R8) Add a spec review gate to PlanExecEngine that pauses execution after the first Spec is generated, awaiting the user's confirm/reject decision. On approval execution continues; on rejection the engine replans (capped at 2 replans); on 30-min timeout the Spec is parked (not failed) so the user can resume later. - spec_manager: add parked status + park()/resume() methods - plan_exec_engine: add spec_review_handler param, wire gate into both execute_stream and _execute_loop with replan cap, emit spec_review_request/spec_review_reply events, handle timeout to park - chat.py: whitelist new events, add spec_review_reply WS handler, wire _spec_review_handler closure (30-min timeout), cleanup on disconnect - portal.py: persist spec_review_id/decision/feedback for page reload - tests: 20 unit tests covering happy path, rejection/replan, timeout, cancellation, backward compat, handler errors, park/resume round-trips	2026-07-03 15:20:38 +08:00
chiguyong	a763396011	feat(evolution): pitfall retrieval/injection at planning phase (U7, R12)	2026-07-03 14:27:48 +08:00
chiguyong	91a61f9b49	feat(evolution): auto-trigger + quality gate + actor marking (U6, R5/R6) U6 of the complex task quality loop plan. R5 (auto evolution trigger + quality gate): - EvolutionConfig (Pydantic v2): success_sample_rate=0.1, min_confidence=0.5, min_examples=3, observe_only=True, cross_workspace_sharing=False - Success path gated by success_sample_rate; failure path always runs (100%) - Observe-only mode records reflections without feeding optimizer (RV14: avoids noise-driven prompt degradation during initial rollout) - PromptOptimizer.can_optimize() consumption gate: sample count >= min_examples AND mean quality >= min_confidence - PitfallDetector confidence threshold: low-confidence warnings marked observe-only; confidence = failure_rate * min(1.0, total/3) linear ramp (ponytail: upgrade to Wilson interval) R6 (actor marking + cross-workspace sharing): - All evolution artifacts (EvolutionLogEntry, Module, PitfallWarning) carry actor field; defaults to result.agent_name - can_share_artifact(): same-workspace always allowed; cross-workspace requires explicit opt-in via EvolutionConfig.cross_workspace_sharing=True KTD-8: gave_up_after_reflections treated as failure path (triggers 100% evolution) even when stream wrapper marks status as COMPLETED. Detection via output_data.trace_outcome or error_message substring (ponytail: heuristic; upgrade path is a dedicated TaskResult.trace_outcome field). Backward compat: all gates conditional on auto_evolution_config is not None; existing EvolutionMixin usage without config preserves prior behavior. Tests: tests/unit/test_evolution_auto_trigger.py (37 tests) covers R5/R6 scenarios - sample rate gate, observe-only, consumption gate, pitfall confidence, actor marking, cross-workspace sharing, gave_up_after_reflections, error handling, fire-and-forget, backpressure cap, AE3 happy path.	2026-07-03 13:54:37 +08:00
chiguyong	1d09fafec9	feat(core): reflexion in main flow - verify fail → reflect → retry (U5, R4)	2026-07-03 13:29:54 +08:00
chiguyong	4255cb33ba	feat(core): step budget phases + keep working bias (U4, R11/R10)	2026-07-03 13:10:28 +08:00
chiguyong	b8418968c2	feat(core): verification defaults for PLAN_EXEC/TEAM_COLLAB + minimum sandbox (U3, R2/R3/RV3)	2026-07-03 12:32:22 +08:00
chiguyong	dd259153fa	feat(core): wire evolution hooks into execute_stream path (U2, OQ6 fix) ConfigDrivenAgent.execute_stream() now fires on_task_complete/on_task_failed evolution hooks in its finally block, achieving lifecycle parity with the sync execute() path. This fixes the OQ6 gap where WebSocket-routed streaming tasks bypassed evolution entirely. Implementation: - Module-level backpressure manager (_schedule_evolution / drain_pending_evolution_tasks) with cap = max(2, max_concurrency * 2), drop + log + counter on exceed, and shutdown drain via asyncio.gather(return_exceptions=True). - _trigger_evolution_hooks / _evolve_safe methods on ConfigDrivenAgent: fire-and-forget via asyncio.create_task, evolution errors swallowed (never fail the stream). - execute_stream finally block distinguishes cancelled (CancelledError / TaskCancelledError -> CANCELLED), failed (Exception -> FAILED), completed (final_answer received -> COMPLETED), and early-close (no completion, no error -> CANCELLED "stream closed before completion"). - app.py shutdown drains pending evolution tasks. - plan_exec_engine.py / reflexion.py: doc comments noting hooks fire at the ConfigDrivenAgent layer (single chokepoint, no double-fire). - portal.py: verification comments at 3 execute_stream call sites (these call react_engine.execute_stream directly, bypassing ConfigDrivenAgent - known gap tracked separately). Tests (8 new in test_execute_stream_hooks.py): - Happy path: success fires COMPLETED, failure fires FAILED. - Edge cases: cancellation fires CANCELLED, early aclose fires CANCELLED, evolution error suppressed, backpressure cap drops + counts. - Parity: REST on_task_complete vs execute_stream both fire COMPLETED. - Disabled: _evolution_enabled=False fires no hooks.	2026-07-03 12:16:02 +08:00
chiguyong	2932ee51ed	feat(tools): add str_replace_editor tool with workspace-root security (U1, R1) Replaces the broken write_file placeholder (no real implementation, only _FakeTool stubs in cli/benchmark.py) with a structured editor offering four commands: create, str_replace, insert_at_line, view. Security model (file-system analog of the 6-layer terminal security paradigm, reject-by-default + prefix match): 1. Reject absolute paths (force relative interpretation vs workspace root). 2. Reject any .. path component (path traversal). 3. Path.resolve() follows symlinks, then relative_to(workspace_root) rejects symlink escape and residual traversal. Data-loss guard: create refuses to overwrite existing files. str_replace requires a unique anchor (0 or >1 matches error). insert_at_line is 1-based (0 = prepend, > EOF = append). All FS I/O wrapped in asyncio.to_thread. Registers str_replace_editor in _DEFAULT_CORE_TOOLS (replacing write_file) so its full description is always injected into the LLM prompt. Updates test_tool_search.py which used write_file as a sample core tool. Tests: 34 cases in test_str_replace_editor.py cover happy path, edge cases (empty file, multi-match, insert at 0/beyond EOF, view range), error paths (overwrite refusal, anchor not found, path traversal, absolute path, symlink escape, unknown command, missing args), and integration contract (in _DEFAULT_CORE_TOOLS, exported from agentkit.tools, schema enum, prompt injection via _build_tool_use_prompt). Verification: ruff check clean; targeted regression suite 412 passed (the single failure in test_calendar_tool.py is a pre-existing date-sensitive bug in an untouched file, today 2026-07-03 Friday makes the next-Wednesday assertion fail).	2026-07-03 11:42:59 +08:00
Fischer	00b2dad36e	feat(compressor): CJK-aware token estimation + linear compress flow (#21) Squash merge PR #21: CJK-aware token estimation + linear compress flow + solution doc	2026-07-03 09:40:28 +08:00
chiguyong	1599d193c7	test: fix async generator mock for U3 streaming orchestrator U3 streaming refactor switched orchestrator from agent.execute() to agent.execute_stream() (async gen), but tests still mocked execute(). AsyncMock() returns a coroutine lacking __aiter__, causing: - 'async for' requires an object with __aiter__ method, got coroutine - RuntimeWarning: coroutine was never awaited Add shared helpers in tests/unit/experts/_helpers.py: - make_chat_stream_mock: async gen for gateway.chat_stream - make_execute_stream_mock: async gen yielding final_answer event - make_execute_stream_raising_mock: async gen that raises (for failure tests) Update 3 test files to use the helpers: - test_team_orchestrator.py: _make_mock_expert, _make_mock_pool, failure tests (phase_failed, all_phases_fail, fallback_uses_lead, phase_failure_marks_dependents), assertion updates (execute_stream instead of execute), synthesizer warning cleanup - test_pm_collaboration.py: _make_mock_expert, _make_mock_llm_gateway, collaboration/risk/rework assertions - test_board_orchestrator.py: _make_mock_gateway (warning cleanup) All 483 experts/ tests pass with 0 warnings.	2026-07-02 22:52:10 +08:00
chiguyong	b98e7cb42f	test: update login test to expect standardized port 18001 The test was asserting port 8001 (old default) but config.py now loads .env.dev which sets AGENTKIT_SERVER_PORT=18001 per the project port standardization (18001/18002/15173/15174).	2026-07-02 21:30:21 +08:00
chiguyong	7376005868	fix: correct transient state reset criteria + ReAct tool-call rules Bug 1: three chatStore actions reset boardState/debateState/collaborationState - createConversation: add reset of all three states (previously missing, so stale private-board state leaked into new conversations) - selectConversation: unify as a conditional reset (prevConvId !== id) so a force-reload does not clear state by mistake - deleteConversation: add the missing collaborationState reset - Also: in selectConversation, when board_speech/board_summary messages lack expert_avatar/expert_color, fill them in from boardState.experts as a fallback Bug 2: adjust the L0 rules in ReAct _build_tool_use_prompt - Add rule 1: tools must be used when the task involves external information, real-time data, multi-step analysis, or uncertain facts - Former rule 3 becomes rule 4 and is narrowed to allow a direct answer only when no tool is genuinely needed - base_prompt and tool descriptions unchanged (L1/L2 split into a separate plan) Tests: 5 frontend transient-state reset matrix + 6 backend prompt-rule assertions Plan: docs/plans/2026-07-02-002-fix-transient-state-reset-and-react-tool-guidance-plan.md	2026-07-02 20:51:57 +08:00
chiguyong	78a7faa17b	refactor: remove all emoji from agentkit Replace emoji across codebase: YAML avatars -> first char, frontend banners -> Ant Design Vue components, CLI status -> OK/FAIL/WARN labels, terminal -> [WARN]/[OK]/[PENDING], Bitable DB default -> table, App.vue font cleanup, test fixtures -> first char letters. shell.avatar type upgraded to string | Component.	2026-07-02 01:33:28 +08:00
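chiguyong	36b0296730	fix: private-board data persistence fix + emoji removal plan - Fix persistence of board_started/expert_speech/round_summary/board_concluded events - Add is_board flag to the conversation list and detail endpoints - Implement restoreBoardStateFromMessages to restore boardState from persisted messages - Add private-board badge to ChatSidebar - Add emoji removal plan document (docs/plans/2026-07-02-001)	2026-07-02 01:07:12 +08:00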
chiguyong	47a437c5e3	fix(experts): resolve residual review findings from PR #13 Addresses 4 actionable findings (1 P1 + 3 P2) from ce-code-review of feat/ui-ue-enhancement (PR #13), now merged to main (`8066e0b`). P1: expert_step payload alignment (_phase_executor.py) The thinking/tool_call/tool_result event payloads were missing the fields the frontend WsServerMessage contract requires (expert_name/expert_color/content/step). Frontend code consuming these events silently degraded. Now all expert_step broadcasts carry the full contract; tool_call/tool_result keep step_data for the raw payload. P2 #1: execute_stream CancellationToken registration (config_driven.py) execute_stream() bypassed BaseAgent.execute() and never registered a CancellationToken, so cancel_task() could not cooperatively cancel a streaming task. Now registers the token and cleans it up in finally. P2 #2: team_synthesis orphan milestone cleanup (orchestrator.py) If synthesis streaming was interrupted (cancel/exception), no terminal team_synthesis event was emitted, leaving the frontend streaming milestone spinning forever. Now an inner try/except emits a terminal team_synthesis with status=cancelled|error before re-raising, so the frontend can finalize the milestone. The success path also carries the synthesis_id. P2 #3: synthesis_id dedup (orchestrator.py + types.ts + chatStream.ts) Without an identifier, the frontend could not precisely match a team_synthesis terminal event to its streaming milestone (especially across retries/concurrent teams). The backend now injects a stable synthesis_id (`{plan.id}:synthesis`) into both team_synthesis_chunk and team_synthesis events; the frontend uses it for exact milestone matching and treats error/cancelled status as terminal. Test updates - Updated test_thinking_events_forwarded_as_expert_step to assert the new payload contract (expert_id/name/color/content/step). - Added test_tool_call_events_forwarded_as_expert_step covering tool_call/tool_result payload shape (content=tool_name summary + step_data=raw payload). Verification - ruff check: clean - pytest tests/unit/experts/test_phase_executor_streaming.py: 14/14 - npm run typecheck: clean - vitest: 126/127 (1 unrelated baseline failure in tauri-auth.test.ts) Residuals doc: docs/residual-review-findings/feat-ui-ue-enhancement.md	2026-07-01 13:26:19 +08:00
chiguyong	f872a3fac6	feat: UI/UE enhancement - streaming, sticky header, hover actions, calendar tokens U1 ThinkingBlock: streaming cursor + auto-collapse to summary bar U2 StickyModeHeader: new component replacing ExpertTeamView + BoardStatusView U3 Backend _phase_executor: execute_stream() with token/thinking/final_answer forwarding U4 Frontend chatStream: expert_result_chunk/team_synthesis_chunk token accumulation U5 AssistantText: routing tag hover fade-in U6 UserBubble: hover actions (copy/delete/refill) U7 CalendarGrid: token-based color redesign Review fixes (ce-code-review): - P0: _VALID_TEAM_EVENT_TYPES whitelist adds 3 new streaming event types - P0: final_answer no longer double-accumulates token content - P2: exception handling expanded to except Exception for LLMProviderError etc. Simplification (ce-simplify-code): - _synthesizer.py: O(n²) concat -> list+join, _concat_results extraction - config_driven.py: 4 duplicate _handle_*_stream -> _wrap_sync_as_stream - chatStream.ts: 5x [...messages].reverse().find() -> findLastMessage helper Tests: pytest 13/13, vitest 126/127 (1 baseline), typecheck pass, ruff clean	2026-07-01 12:51:45 +08:00
chiguyong	be5c4e09f8	refactor(core,experts): classify except Exception + structured ReviewResult (U3) ReviewResult dataclass (passed/degraded/feedback) replaces tuple+[DEGRADED] prefix in _review_phase_output; 3 review_result WS payloads now carry degraded field (AE3). except Exception narrowed to specific types across 10 files (core/react, rewoo, base, orchestrator, dispatcher, plan_exec_engine + experts/orchestrator, _phase_executor, _review_gate + orchestrator/pipeline_engine). Baseline 140 -> 66 occurrences (>=50% reduction). Fix RuntimeError regression: review-gate + compression paths now catch RuntimeError (LLM/provider internal errors) to preserve degradation semantics. Test side_effect switched to functional form to avoid StopIteration on list exhaustion. ruff clean; 135 key + 469 experts + 163 core tests pass.	2026-06-30 18:03:58 +08:00
chiguyong	e61f98898f	refactor(core): unify ReActEngine execute/execute_stream via async generator (U1) - Convert _execute_loop to async generator yielding ReActEvent; both execute and execute_stream delegate to it, eliminating ~760 lines of duplicated loop logic (execute_stream 813 -> 53 lines). - Add 'final_result' event_type carrying ReActResult; execute extracts result from final event, execute_stream forwards events (backward-compatible 'final_answer' retained). - Unify _drain_phase_violations across both paths. - Add 14 golden-trajectory characterization tests. - Fix test_execute_stream_with_compressor mock gateway (chat_stream test-infra gap). 130 react tests pass, 762 core+experts pass, no regressions.	2026-06-30 16:07:00 +08:00
chiguyong	a3cecd4b50	fix(review): apply P0/P2 findings from dual-agent review - Dockerfile: split ENTRYPOINT/CMD to align with docker-compose serve - test_termbase: guard jieba import with pytest.importorskip - orchestrator: mark silent review-degradation with [DEGRADED] prefix - chat.py: accurate ExecutionMode log message - agentkit.yaml: document OTel exporter config - skill_routing: replace 12 Any with object/typed (AGENTS.md compliance) - AssistantText.vue: add aria-live/role for a11y	2026-06-30 14:27:46 +08:00
chiguyong	8627777f87	fix(review): apply ce-code-review findings Six safe fixes from Stage 5c review: phase.py: delete dead _DEFAULT_BASH_FILTER constant (no references after U1) chat.py: drop Any from _build_phase_engine params (AGENTS.md prohibits any) chat.ts: delete stale comment about phase_changed emission chat-phase.test.ts: rename misleading 'capped at 5' test name test_chat_plan_exec_ws.py: tighten test_rest_react_mode_still_works assertion test_plan_exec_e2e.py: clarify test_auto_advance assertion comment Known limitations documented in PR description (not fixed): loop detector + advance_phase (P1), parallel path phase_violation ordering (P2), REST cancellation_token (P2), Callable filter exceptions (P3).	2026-06-30 12:42:15 +08:00
chiguyong	b032e08866	feat(U3): extract _build_phase_engine helper + wire REST PLAN_EXEC Extract the WS path's inline phase_policy construction into a shared _build_phase_engine helper so the REST send_message endpoint can reuse it. Replace the former 501 stub with actual PLAN_EXEC execution: - REST POST /chat/sessions/{id}/messages with execution_mode=plan_exec now builds a phase-policy-backed ReActEngine, calls execute() (non-streaming), and returns a MessageResponse. - KTD5: PLAN_EXEC bypasses execute_with_fallback_chain — phase policy and fallback chain are mutually exclusive. - When plan_exec.enabled=False, REST falls through to the REACT path (matching WS behavior). - WS path refactored to call the same helper; behavior unchanged. Tests: - Replace TestRestPlanExec501 with TestRestPlanExec (happy path, bad config → 500, disabled → falls through to REACT, REACT mode unchanged). - Add TestBuildPhaseEngineHelper covering all return branches: not-PLAN_EXEC, disabled, empty-config, invalid-config, tool append, default-policy fallback. - All 109 tests pass across the three PLAN_EXEC test files.	2026-06-30 10:59:43 +08:00
chiguyong	4dc58c24bc	feat(U2): emit phase_violation WS event alongside LLM reinjection Wave 3 only injected the violation error dict back to the LLM as a tool result. Wave 4 U2 adds a parallel WS event so the frontend PhaseIndicator can surface violations to the user. - ReActEngine: add _phase_violations accumulator (list[dict]). Cleared in reset(). _check_phase_permission appends a structured violation dict (with new violation_kind field: tool_not_allowed | bash_command_blocked) before returning the error. - Add _drain_phase_violations(step) helper that pops pending violations and returns ReActEvent(event_type="phase_violation", ...) list. Events carry a shallow copy of the violation dict so callers can't mutate the accumulator. - execute_stream: drain after each tool_result yield at all 3 tool execution sites (parallel, serial-with-confirmation, parsed_calls). Non-streaming execute() ignores the accumulator (the LLM reinjection via the error dict is the only signal there). - chat.py WS handler: new elif branch forwards phase_violation ReActEvents to the client as {"type": "phase_violation", "data": ...} WS messages. - Tests: 11 new tests covering accumulator lifecycle, drain semantics, shallow-copy isolation, and execute_stream event emission for both tool_block and bash_block paths. 2 new WS forwarding tests pin the chat.py path (forward + characterization for REACT mode).	2026-06-30 10:48:35 +08:00
chiguyong	9e28ab315e	feat(U1): widen PhasePolicy bash_command_filter to accept Callable Reuses ShellTool._is_dangerous as the default bash filter for PLANNING and VERIFICATION phases, closing the regex ceiling documented in Wave 3. - Convert ShellTool._is_dangerous and _is_single_command_dangerous to @staticmethod (backward-compatible; instance calls still work via Python's descriptor protocol). - Widen PhasePolicy.bash_command_filter field type to dict[PhaseState, Callable[[str], bool] | re.Pattern | None]. - is_bash_command_allowed dispatches on callable vs pattern at call time. Empty commands short-circuit to allowed (Wave 3 contract; ShellTool emits the clearer empty-command error). - to_dict serializes callables as <callable> for log readability. - default_policy() now wires ShellTool._is_dangerous for PLANNING and VERIFICATION. _DEFAULT_BASH_FILTER kept for backward compat with configs that pass a re.Pattern. - Tests: characterization tests pin Wave 3 behavior (rm/mv/cp/echo > still blocked) plus new edge-case coverage for ceiling closed (dd of=/dev/sda, :>file, chain operators, pipe segments).	2026-06-30 10:39:44 +08:00
Fischer	2b8a7d8909	feat(agent): Wave 3 strategic coupling (G5/G6) (#6)	2026-06-30 09:17:19 +08:00
Fischer	a2dcde01b8	feat(agent): Wave 2 medium coupling (G4/G7/G9) (#5)	2026-06-30 09:09:33 +08:00
chiguyong	d7ca6e8065	fix(review): W1 ServerConfig from_dict wiring, W3 internal kwargs filter, N3 status docstring Code review fixes for Wave 1: - W1: ServerConfig.from_dict now wires prompt_cache/streaming/verification sections from YAML to constructor (previously these params existed but were never read) - W3: Tool._validate_input filters _-prefixed kwargs (e.g. _skip_dangerous_check) before jsonschema.validate, preventing additionalProperties:false schemas from rejecting internal control parameters - N3: ReActResult.status docstring now lists "empty_fallback" and "verify_failed" Added test test_internal_kwargs_underscore_prefixed_skipped_by_validation for W3.	2026-06-29 21:58:40 +08:00
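chiguyong	cd211c6cd9	feat(U4): G1 reinject verify failures into ReAct - ReActEngine adds a max_reinjections constructor parameter (default 1; 0 is equivalent to the previous behavior) - execute()/execute_stream(): the verify block moves from after the loop to the final-answer detection point inside the loop: - verify passes → normal break - verify fails + reinjections < max + step < max_steps → errors are reinjected into the conversation as a user message, then continue so the LLM can self-correct - verify fails + max_reinjections or max_steps reached → record the verify log in the trajectory, trace_outcome="verify_failed", break - execute_stream yields the final_answer event only after verify passes, so the client does not receive a completion signal prematurely - ReActResult.status now carries trace_outcome (previously defaulted to "success") - ServerConfig.verification config section (max_reinjections) - test_verify_reinjection.py: 10 tests, characterization (max=0) + new behavior (R1/R2/R3/R14)	2026-06-29 21:35:08 +08:00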
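chiguyong	0f3f0a7550	feat(U3): G8 delta_flush_interval throttling - ReActEngine adds a flush_interval_ms constructor parameter (default 0 = yield per chunk, backward compatible) - execute_stream chunk loop throttles with time.monotonic, accumulating into _flush_buffer and yielding in batches - With flush_interval_ms=0 the condition short-circuits to True, yielding per chunk to preserve current behavior - When the stream ends mid-interval, a final flush emits the remaining buffer so no characters are lost - ServerConfig.streaming config section (flush_interval_ms) - test_delta_flush.py covers R11/R12/R14	2026-06-29 20:49:52 +08:00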
chiguyong	c4aaef05aa	feat(U2): G2 prompt cache two-block structure - ReActEngine adds _build_system_message(stable+volatile) two-block construction - For the Anthropic provider it returns content blocks, with cache_control on the stable block - For non-Anthropic providers it returns a concatenated string, relying on the stable prefix to hit automatic prefix caching - execute_stream/execute: memory injection moves from the end of system_prompt to the volatile layer - LLMGateway.get_provider_name_for_model exposes provider detection - anthropic.py _convert_messages passes through list-type system content - ServerConfig.prompt_cache config section (default enable=True) - ReActEngine.prompt_cache_enable constructor parameter (default True, preserving current behavior) - test_prompt_cache_layers.py covers R4-R7/R13	2026-06-29 20:47:23 +08:00
chiguyong	c66a7773b5	feat(U1): G3 tool-call schema validation - base.py adds ToolValidationError (error_code/details) and _validate_input - safe_execute validates kwargs with jsonschema.validate before execute - input_schema=None skips validation for backward compatibility - _execute_tool catches ToolValidationError first to preserve error_code - function_tool._infer_schema: fix VAR_KEYWORD/VAR_POSITIONAL parameters leaking into the schema - test_tool_schema_validation.py covers R8-R10	2026-06-29 20:34:14 +08:00
chiguyong	2747bb4e64	chore(prior): malformed tool call handling, auth whitelist, dev scripts, wave1 plan	2026-06-29 20:25:03 +08:00
chiguyong	a6e1bf5884	feat(bitable): bitable file layer + default fields + in-table field operations + ce-code-review fixes (Stage 1) Implement bitable UI completeness Stage 1 (U1-U6): add the file layer, default fields, and in-table field operations that were missing for parity with Feishu/twenty, and fix the P0/P1 issues found in the ce-code-review walkthrough. Backend (U1-U2): - Add the BitableFile entity (models/db/repository/service/routes) with a three-level hierarchy: file → table → fields/records - Schema V2 migration: bitable_files table + tables.file_id column, idempotent (IF NOT EXISTS), V1 orphan tables retained - New tables automatically get 5 default fields (title/status/date/creator/created time) - agent-owned fields are auto-filled on create_record (matched by type+owner, passing actor_user_id) - 7 file REST endpoints + IDOR ownership checks (404-before-403, internal token bypass) Frontend (U3-U5): - File list page (FileCard grid + create/rename/delete) + file detail page (sidebar table list + vxe-table grid) - Vue Router nested routes /bitable → /bitable/:fileId → /bitable/:fileId/:tableId - Column header menu (edit/hide/delete field) + trailing "+" column to add a field - Custom cell editors for select/multiselect fields + Tag display - Pinia store extended with file state and actions; deep-link direct access falls back to getFile; watch on fileId changes Tests (U6): - Unit tests for file CRUD (12 cases) + default fields (10 cases) - 3 E2E specs (view loading, file flow, field operations), skipped gracefully when the backend is unavailable ce-code-review fixes (P0/P1): - P0 route conflict: GET /files/{file_id} shadowed the download endpoint → download moved to /uploads/{filename} - P0 IDOR: add ownership checks to the five update/delete field/record/view endpoints - P1 missing is_initialized property caused a crash on second initialization - P1 direct URL navigation failed (empty files array) → selectFile falls back to getFile - P1 no reload on fileId change → add watch - P1 polling dropped the final formula value (wasCalculating guard) + reuse view filters - P1 test assertion 200→201; the test_db no-URL case has its postgres marker removed so it can run - P2 _check_table_ownership 403→404; input length validation; upload field-table consistency validation - P2 multiselect shallow comparison → deep comparison; E2E bitable-view adds a waitForServer guard Verification: ruff check passes; pytest 91 passed/116 skipped; vue-tsc --noEmit passes.	2026-06-29 04:07:45 +08:00
chiguyong	5c15238a5a	fix(calendar): fix UI not refreshing after the agent creates a calendar event + document the three-root-cause trilogy Code fixes (ce-debug): - CalendarService.create_event gets an injected notify_callback and broadcasts a calendar_event_created WS message on success - app.py reorders the _calendar_ws_sender closure definition and injects it into CalendarService (shared with ReminderScheduler) - tauri-auth.ts keychain fallback fix (localStorage always kept as a backup) - Add 2 broadcast regression tests Docs (ce-compound + ce-compound-refresh): - Add docs/solutions/ui-bugs/calendar-agent-create-no-refresh.md (third root cause: missing WS broadcast) - Update calendar-capability-and-ui-fixes.md: refresh test count + add Related Issues forward references - Update jwt-secret-dev-mode-user-id-mismatch.md: expand the e2e bullet + add a reference to the third root cause - CONCEPTS.md adds a Service Broadcast Callback entry (Real-Time Fan-Out section) Tests: - Add E2E test suites (admin/auth-persistence/bitable/calendar/conversation/documents/evolution/settings/skills) - Add tests/e2e/test_api_coverage.py - CI: .gitea/.github workflows/test.yml	2026-06-29 02:20:33 +08:00
chiguyong	c9ce15fa4b	fix(code-review): fix the 13 High + Medium security/reliability issues found in the walkthrough Code fixes (8 High + 9 Medium): - portal.py: C1 IDOR docs / C2 type fix / C3 WS connection cap of 16 / C4 early ws_user_id initialization / M log silently swallowed exceptions - auth/middleware.py: C5 fill in the missing WS sid - calendar_tool.py: C6 two-sided ±43200 offset validation + reminder_channels type/whitelist validation - sqlite_conversation_store.py: C7 DELETE transaction rollback - chat.ts (Pinia): C8 deleteConversation clears the pending cache - app.py: M except: pass → logger.debug(exc_info=True) - Scene6Error.vue: M clear setTimeout in onUnmounted - DocumentsTab.vue: M Invalid Date guard - ChatSidebar/RightPanel/TopNav.vue: M aria-label accessibility labels - SystemMonitorPanel.vue: M v-else fallback + active border color + tablist keyboard navigation - CalendarDrawer.vue: M overflow-y: auto - CalendarGrid.vue: M ResizeObserver feedback-loop guard - SkillsTab.vue: M onMounted always calls fetchSkills Doc fixes (5 High + 6 Medium): - portal-platform-security-reliability-fixes.md: D2 test paths / D3 Root Cause+Impact sections / D4 severity: mixed / title translated to Chinese / 12 absolute paths converted to relative / P2 #12 count consistency - AGENTS.md: D5 route table 22→28 / expert templates 5→15 / LiteLLM U15 migration / config lookup fallback - README.md: 8 port references 8000→8001 New tests: - tests/unit/calendar/test_calendar_tool.py: ponytail self-check assertions Verification: - ruff check (5 files): All checks passed - vue-tsc --noEmit: exit 0 - git stash baseline verification: the 17 portal 401 failures are pre-existing Known limitations (pre-existing): - 17 portal tests fail with 401: needs a separate ce-debug investigation - 7 outdated CostAwareRouter references in README.md: doc sync tracked as a separate task	2026-06-28 15:06:41 +08:00
chiguyong	31c65e01b8	fix(security): P0 security hardening + multi-instance deployment consistency (U1-U4 + U5c) U1: LLM gateway KB cache fails closed: on exceptions, caching is disabled by default to prevent KB data leakage U2: MCP dangerous-tool blacklist filtering: covers 6+1 endpoints, preventing bypass of chat confirmation U3: SecretsStore Redis migration: credentials shared across workers, in-memory fallback retained for dev mode U4: channels webhook state in Redis: ZSET sliding-window rate limiting + nonce dedup + backpressure U5c: ce-code-review fix batch: - P0: align the MCP blacklist with publisher.py (terminal_execute -> terminal, +file_read) - P1: ZSET rate-limit members get a uuid suffix to avoid same-timestamp collisions - P1: SecretsStore redis parameter typed Any -> aioredis.Redis \| None (AGENTS.md compliance) - P1: Redis client adds socket_timeout so requests do not hang on a single point of failure Tests: 171 scoped tests pass, ruff clean	2026-06-26 04:05:33 +08:00
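chiguyong	53faa60472	fix(review): ce-code-review P1+P2 fixes — security/reliability/performance P1 security and reliability (4 items): - wecom: verify_signature adds a timestamp freshness check (5-minute window against replay) - cache: should_cache rejects anonymous requests with user_id=None when per_user_namespace is enabled, avoiding cross-user cache leakage (security requirements a/e) - channels: webhook receive_message gets a catch-all exception handler so a 500 does not trigger a platform retry storm - app: shutdown calls close_all_adapters + await _pending_webhook_tasks, preventing httpx connection leaks and lost IM replies P2 efficiency and maintainability (5 items): - feishu: _TOKEN_CACHE_TTL 300 → 6900 (2h minus a 5-minute margin, avoiding 24x over-frequent refreshes) - channels: _pending_webhook_tasks is now bounded (429 rejection at 2x the concurrency limit) - gateway: the quota check makes a single get_usage call per period and reuses the summary to check token+cost - cache_key: generate_cache_key merged into a single SHA-256 (removing 8-10 redundant hashes) - config: ProviderConfig.get_api_key drops the unused secrets_store parameter P3 deduplication (1 item): - channels: the _process_inbound_message DIRECT_CHAT path is extracted into a _direct_chat helper Tests: - test_wecom: timestamps now use int(time.time()); add test_expired_timestamp_rejected - test_cache: should_cache tests cover anonymous rejection + namespace_off compatibility - test_config_migration: get_api_key tests adapted to the new signature - channels/config_migration/quota_enforcement tests all pass	2026-06-26 01:40:31 +08:00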
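chiguyong	1ccaf56b9a	refactor: ce-simplify-code review fixes — deduplication + efficiency + dead-code cleanup 3 review agents (reuse/quality/efficiency) found 15 issues, all fixed: Efficiency and safety (6 items): - MCPClient caches a MultiServerMCPClient singleton + aclose(), fixing connection/subprocess leaks - _rate_limits purges empty IP entries, fixing a memory leak under X-Forwarded-For spoofing - _seen_nonces switches to OrderedDict for amortized O(1) expiry cleanup - webhook background tasks get Semaphore(20) + task reference tracking, bounding previously unbounded concurrency - _build_adapter decrypts secrets in parallel with asyncio.gather - Adapter instance cache (_adapter_cache), so the token TTL cache hits across requests Deduplication (4 items): - header_get extracted to channels/base.py, imported uniformly by the 4 adapters - _get_client/close() moved into the MessageAdapter base class and inherited by subclasses - URLVerificationChallenge unified in base.py, shared by feishu/slack/wecom - Transport ABC adds an endpoint_url property; from_transport no longer accesses private fields Dead code and type safety (5 items): - detect_cache_hit dead method replaced with the public record_cache_result API - execution_mode.value == "direct_chat" replaced with an enum comparison - Remove the dead yielded_any variable, the duplicate from fastapi import Request, and redundant getattr defenses 453 tests passed, ruff clean (pre-existing F841 not introduced by this change)	2026-06-25 23:54:14 +08:00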
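chiguyong	793476cafa	feat(llm): U17 — LiteLLM semantic cache replacement + per-user/ACL scope security isolation - Add LitellmCacheManager: configures the global litellm.cache with a three-tier backend fallback (RedisSemanticCache -> RedisCache -> InMemoryCache), redisvl lazily imported - cache_key extended with user_id + kb_acl_hash parameters (security requirements a/b/e) - gateway integration: reads the KB caching_disabled flag (security requirement c), builds a scoped cache_key, cost=0 on a hit - LLMResponse adds a cache_hit field; LLMRequest adds a cache parameter - litellm_provider passes through the cache parameter + detects cache hits via _hidden_params - 33 new tests covering 13 scenarios (including User A != User B cache isolation) - Legacy InMemoryLLMCache/RedisLLMCache retained for backward compatibility	2026-06-25 22:49:59 +08:00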
chiguyong	86541d7172	feat(mcp): U16 — langchain-mcp-adapters client replacement + transport deprecation - Rewrite MCPClient: automatic URL scheme detection (stdio/http/sse) → langchain config - Legacy Transport injection path retained (DeprecationWarning) for backward compatibility - transport.py emits a module-level deprecation warning - 28 new tests covering URL detection, list_tools, call_tool, the legacy path, ImportError - Fix pre-existing F401/F841 in manager.py / transport.py	2026-06-25 22:04:37 +08:00
chiguyong	069dbc22b1	feat(llm): U15 — LiteLLM unified provider + api_key encrypted secrets migration	2026-06-25 21:41:15 +08:00
chiguyong	13c516a54f	feat(mcp): U14 — Skill/Team MCP publish with admin auth + dangerous-tool opt-in	2026-06-25 21:10:06 +08:00
chiguyong	16c33be295	feat(mcp): U13 — refactor MCPServer to route factory + mount at /api/v1/mcp with auth	2026-06-25 20:58:41 +08:00
chiguyong	8998f94c42	feat(channels): U12 — DingTalk/WeCom/Slack adapters + multi-channel webhook dispatch	2026-06-25 20:45:43 +08:00
chiguyong	4b58e8f661	feat(channels): U11 — Feishu IM adapter end-to-end (webhook + signature + AES-CBC decrypt + chat integration)	2026-06-25 20:24:21 +08:00

205 Commits