Commit Graph

368 Commits

Author SHA1 Message Date
chiguyong 4255cb33ba feat(core): step budget phases + keep working bias (U4, R11/R10) 2026-07-03 13:10:28 +08:00
chiguyong b8418968c2 feat(core): verification defaults for PLAN_EXEC/TEAM_COLLAB + minimum sandbox (U3, R2/R3/RV3) 2026-07-03 12:32:22 +08:00
chiguyong dd259153fa feat(core): wire evolution hooks into execute_stream path (U2, OQ6 fix)
ConfigDrivenAgent.execute_stream() now fires on_task_complete/on_task_failed
evolution hooks in its finally block, achieving lifecycle parity with the
sync execute() path. This fixes the OQ6 gap where WebSocket-routed streaming
tasks bypassed evolution entirely.

Implementation:
- Module-level backpressure manager (_schedule_evolution / drain_pending_evolution_tasks)
  with cap = max(2, max_concurrency * 2), drop + log + counter on exceed, and
  shutdown drain via asyncio.gather(return_exceptions=True).
- _trigger_evolution_hooks / _evolve_safe methods on ConfigDrivenAgent: fire-and-forget
  via asyncio.create_task, evolution errors swallowed (never fail the stream).
- execute_stream finally block distinguishes cancelled (CancelledError /
  TaskCancelledError -> CANCELLED), failed (Exception -> FAILED), completed
  (final_answer received -> COMPLETED), and early-close (no completion, no
  error -> CANCELLED "stream closed before completion").
- app.py shutdown drains pending evolution tasks.
- plan_exec_engine.py / reflexion.py: doc comments noting hooks fire at the
  ConfigDrivenAgent layer (single chokepoint, no double-fire).
- portal.py: verification comments at 3 execute_stream call sites (these call
  react_engine.execute_stream directly, bypassing ConfigDrivenAgent - known gap
  tracked separately).
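
The backpressure manager in the implementation notes above can be sketched as follows. This is a minimal illustration, not the project's code: the class name EvolutionScheduler, its attributes, and the demo hook are hypothetical stand-ins for the module-level _schedule_evolution / drain_pending_evolution_tasks pair. Only the cap formula, the drop + log + counter behavior, and the gather-based drain are taken from the commit message.

```python
import asyncio
import logging
from typing import Any, Coroutine

logger = logging.getLogger("evolution")


class EvolutionScheduler:
    """Hypothetical stand-in for the module-level backpressure manager."""

    def __init__(self, max_concurrency: int) -> None:
        # Cap formula taken from the commit message.
        self.cap = max(2, max_concurrency * 2)
        self.pending: set = set()
        self.dropped = 0

    def schedule(self, coro: Coroutine[Any, Any, None]) -> bool:
        """Fire-and-forget a hook; drop it when the cap is reached."""
        if len(self.pending) >= self.cap:
            coro.close()  # prevents a "coroutine was never awaited" warning
            self.dropped += 1
            logger.warning("evolution hook dropped (cap=%d)", self.cap)
            return False
        task = asyncio.create_task(coro)
        self.pending.add(task)
        task.add_done_callback(self.pending.discard)
        return True

    async def drain(self) -> None:
        """Wait for in-flight hooks at shutdown; their errors are swallowed."""
        if self.pending:
            await asyncio.gather(*self.pending, return_exceptions=True)


async def demo_backpressure() -> tuple:
    scheduler = EvolutionScheduler(max_concurrency=1)  # cap == 2

    async def hook(fail: bool) -> None:
        await asyncio.sleep(0)
        if fail:
            raise RuntimeError("evolution failed")

    accepted = [scheduler.schedule(hook(fail=(i == 0))) for i in range(3)]
    await scheduler.drain()
    return accepted, scheduler.dropped
```

Running asyncio.run(demo_backpressure()) schedules three hooks against a cap of two: the third is dropped and counted, and the failing first hook does not propagate out of drain().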

Tests (8 new in test_execute_stream_hooks.py):
- Happy path: success fires COMPLETED, failure fires FAILED.
- Edge cases: cancellation fires CANCELLED, early aclose fires CANCELLED,
  evolution error suppressed, backpressure cap drops + counts.
- Parity: REST on_task_complete vs execute_stream both fire COMPLETED.
- Disabled: _evolution_enabled=False fires no hooks.
2026-07-03 12:16:02 +08:00
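
The outcome classification exercised by the tests above (COMPLETED / FAILED / CANCELLED, including early close) might look roughly like this. It is a sketch under stated assumptions: classify_stream, the event dict shape, and the on_outcome callback are hypothetical names, and the project-specific TaskCancelledError is left out.

```python
import asyncio
from typing import AsyncIterator, Callable


async def classify_stream(
    events: AsyncIterator[dict], on_outcome: Callable[[str], None]
) -> AsyncIterator[dict]:
    """Forward events and report exactly one terminal outcome."""
    outcome = None
    try:
        async for event in events:
            if event["type"] == "final_answer":
                outcome = "COMPLETED"
            yield event
    except asyncio.CancelledError:
        outcome = "CANCELLED"
        raise
    except Exception:
        outcome = "FAILED"
        raise
    finally:
        if outcome is None:
            # Early close: aclose() raises GeneratorExit at the yield, which
            # is not an Exception subclass, so neither branch above ran.
            outcome = "CANCELLED"
        on_outcome(outcome)


async def source(events, error=None):
    """Hypothetical upstream: yields events, then optionally raises."""
    for event in events:
        yield event
    if error is not None:
        raise error


async def demo_outcomes() -> list:
    outcomes = []
    done = [{"type": "token"}, {"type": "final_answer"}]
    async for _ in classify_stream(source(done), outcomes.append):
        pass
    for error in (RuntimeError("boom"), asyncio.CancelledError()):
        try:
            async for _ in classify_stream(source(done[:1], error), outcomes.append):
                pass
        except (RuntimeError, asyncio.CancelledError):
            pass
    stream = classify_stream(source(done), outcomes.append)
    await stream.__anext__()  # consume one event, then close early
    await stream.aclose()
    return outcomes
```

The finally block is the single place where the outcome is reported, which is what keeps the streaming path from firing a hook twice or not at all.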
chiguyong 2932ee51ed feat(tools): add str_replace_editor tool with workspace-root security (U1, R1)
Replaces the broken write_file placeholder (no real implementation, only
_FakeTool stubs in cli/benchmark.py) with a structured editor offering four
commands: create, str_replace, insert_at_line, view.

Security model (file-system analog of the 6-layer terminal security paradigm,
reject-by-default + prefix match):
  1. Reject absolute paths (force relative interpretation vs workspace root).
  2. Reject any .. path component (path traversal).
  3. Path.resolve() follows symlinks, then relative_to(workspace_root)
     rejects symlink escape and residual traversal.

Data-loss guard: create refuses to overwrite existing files. str_replace
requires a unique anchor (0 or >1 matches error). insert_at_line is 1-based
(0 = prepend, > EOF = append). All FS I/O wrapped in asyncio.to_thread.

Registers str_replace_editor in _DEFAULT_CORE_TOOLS (replacing write_file)
so its full description is always injected into the LLM prompt. Updates
test_tool_search.py which used write_file as a sample core tool.

Tests: 34 cases in test_str_replace_editor.py cover happy path, edge cases
(empty file, multi-match, insert at 0/beyond EOF, view range), error paths
(overwrite refusal, anchor not found, path traversal, absolute path, symlink
escape, unknown command, missing args), and integration contract (in
_DEFAULT_CORE_TOOLS, exported from agentkit.tools, schema enum, prompt
injection via _build_tool_use_prompt).

Verification: ruff check clean; targeted regression suite 412 passed
(the single failure, in test_calendar_tool.py, is a pre-existing date-sensitive
bug in an untouched file: today, 2026-07-03, is a Friday, which makes the
next-Wednesday assertion fail).
2026-07-03 11:42:59 +08:00
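
The three-step workspace-root check described in the commit above can be illustrated with pathlib. This is a minimal sketch assuming a POSIX file system; resolve_in_workspace and PathRejected are hypothetical names, not the tool's actual API.

```python
from pathlib import Path


class PathRejected(ValueError):
    """Raised when a requested path is not allowed (hypothetical name)."""


def resolve_in_workspace(workspace_root: Path, user_path: str) -> Path:
    candidate = Path(user_path)
    # 1. Reject absolute paths: everything is relative to the workspace.
    if candidate.is_absolute():
        raise PathRejected("absolute paths are not allowed")
    # 2. Reject any '..' component before touching the file system.
    if ".." in candidate.parts:
        raise PathRejected("'..' components are not allowed")
    # 3. resolve() follows symlinks; relative_to() then rejects anything
    #    that ended up outside the resolved workspace root.
    root = workspace_root.resolve()
    resolved = (root / candidate).resolve()
    try:
        resolved.relative_to(root)
    except ValueError:
        raise PathRejected("path escapes the workspace root") from None
    return resolved
```

Steps 1 and 2 are cheap lexical rejections; step 3 is the one that catches a symlink inside the workspace pointing outside it.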
Fischer 00b2dad36e feat(compressor): CJK-aware token estimation + linear compress flow (#21)
Squash merge PR #21: CJK-aware token estimation + linear compress flow + solution doc
2026-07-03 09:40:28 +08:00
Fischer 2296d0b209 refactor: remove all emoji from source code (#20)
Replace emoji/glyph characters with Ant Design Vue Outlined icons (frontend), text labels with ANSI colors (CLI/shell), and ASCII art (docstrings). Add pre-commit guard (scripts/check-no-emoji.sh) and style guide to prevent regression.

Closes: docs/plans/2026-07-02-001-refactor-remove-all-emoji-plan.md
2026-07-03 02:46:40 +08:00
Fischer 76c9c08756 feat(ui): private board restrictions + scheme B assistant/user bubbles (#19)
Implements U1-U4 from plan docs/plans/2026-07-02-001-feat-private-board-restrictions-and-scheme-b-bubbles-plan.md

U1: ChatInput @board button blocks existing-conversation board creation with modal
U2: BoardBannerCard simplified to plain title + round meta
U3: MessageShell assistant bubble (scheme B neutral grayscale) with F4-A card exclusion + G1 empty-bubble hide
U4: UserBubble dark text bubble for plain text

Code review fixes: P1 color token, P2 CARD_BEARING_TYPES error type, P2 expertColor dead code, P0/P1 bubbleUtils.ts + 42 tests

Tests: 180/181 pass (1 pre-existing tauri-auth failure). Typecheck clean.
2026-07-03 01:58:19 +08:00
chiguyong e04e2868c3 docs(compound): message bubble empty-content and card-type exclusion pattern
Documents the G1 (:empty never matches Vue root), F4-A (card-bearing type
exclusion via messageType prop + Set), and pure-function extraction pattern
for testability without @vue/test-utils.
2026-07-03 01:58:00 +08:00
chiguyong cc6634b2ab feat(ui): private board restrictions + scheme B assistant/user bubbles
U1: ChatInput @board button blocks existing-conversation board creation
    with modal — enforces "one board per conversation" constraint.
U2: BoardBannerCard simplified to plain title + round meta
    (no icons/bars/progress/expert chips).
U3: MessageShell assistant bubble (scheme B neutral grayscale) with
    F4-A card-type exclusion + G1 empty-bubble hide.
U4: UserBubble dark text bubble for plain text
    (command card/file keep light bg).

Code review fixes (ce-code-review step 5):
- P1: UserBubble focus-visible --accent-primary → --color-primary
  (dark mode visibility fix).
- P2: CARD_BEARING_TYPES adds 'error' (ErrorCard double-bubble regression).
- P2: Remove dead expertColor prop (scheme B leftover).
- P0/P1: Extract bubbleUtils.ts pure functions + add 42 tests
  covering G1/F4-A/U4/U2 key decisions.

Tests: 180/181 pass (1 pre-existing tauri-auth failure unrelated).
Typecheck: clean.
2026-07-03 01:47:37 +08:00
chiguyong 981a794a54 docs(plan): private-board restrictions + scheme B bubbles plan ready
Plan document finalized after 4 rounds of ce-doc-review:
- F4-A exclusion list extended from 5 to 9 card-bearing types
- Verified root class names for all 9 card components
- Corrected chrome description (2 full chrome + 7 partial chrome)
- Added U1 modal focus restoration note (WAI-ARIA)
- Documented R4-DA1/R4-A3/R4-A4 as Open Questions for implementation
2026-07-03 01:14:37 +08:00
Fischer 6826ceb2a9 Merge PR #18: fix async generator mock for U3 streaming orchestrator
2026-07-02 22:57:16 +08:00
chiguyong 1599d193c7 test: fix async generator mock for U3 streaming orchestrator
U3 streaming refactor switched orchestrator from agent.execute() to
agent.execute_stream() (async gen), but tests still mocked execute().
AsyncMock() returns a coroutine lacking __aiter__, causing:
- 'async for' requires an object with __aiter__ method, got coroutine
- RuntimeWarning: coroutine was never awaited

Add shared helpers in tests/unit/experts/_helpers.py:
- make_chat_stream_mock: async gen for gateway.chat_stream
- make_execute_stream_mock: async gen yielding final_answer event
- make_execute_stream_raising_mock: async gen that raises (for failure tests)

Update 3 test files to use the helpers:
- test_team_orchestrator.py: _make_mock_expert, _make_mock_pool,
  failure tests (phase_failed, all_phases_fail, fallback_uses_lead,
  phase_failure_marks_dependents), assertion updates (execute_stream
  instead of execute), synthesizer warning cleanup
- test_pm_collaboration.py: _make_mock_expert, _make_mock_llm_gateway,
  collaboration/risk/rework assertions
- test_board_orchestrator.py: _make_mock_gateway (warning cleanup)

All 483 experts/ tests pass with 0 warnings.
2026-07-02 22:52:10 +08:00
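
The mock problem fixed in the commit above can be reproduced in a few lines. The helper below is a sketch of the idea behind make_execute_stream_mock, not the code in tests/unit/experts/_helpers.py; the event dict shape is an assumption for illustration.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


def make_execute_stream_mock(answer: str) -> MagicMock:
    """Return a mock whose call result supports `async for`."""

    async def _stream(*args, **kwargs):
        # Event shape is assumed; a real stream would yield more events.
        yield {"type": "final_answer", "content": answer}

    # A plain MagicMock returns the async generator object directly,
    # whereas AsyncMock would wrap the call in a coroutine.
    return MagicMock(side_effect=_stream)


async def demo_mocks() -> tuple:
    broken = AsyncMock()
    coro = broken("task")  # a coroutine, not an async iterator
    broken_has_aiter = hasattr(coro, "__aiter__")
    coro.close()  # avoid the "coroutine was never awaited" warning

    fixed = make_execute_stream_mock("done")
    events = [event async for event in fixed("task")]
    fixed.assert_called_once_with("task")
    return broken_has_aiter, events
```

Because the mock is still a MagicMock, call assertions such as assert_called_once_with keep working on the streaming path.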
chiguyong d17863d01d Merge PR #17: fix transient state reset + ReAct tool guidance + scheme B UI + dev port isolation
2026-07-02 22:25:52 +08:00
chiguyong 23160be055 fix(types): resolve 3 pre-existing typecheck errors in transient-state test
message_type: 'board_started' as const (line 93) fixes TS2322 on lines
107 and 122 — TypeScript was inferring message_type as string instead
of the literal 'board_started'.

boardState local variable: replace 'as never' with proper shape +
status: 'discussing' as const (line 159-160) fixes TS2339 on line 168
where .topic was accessed on type 'never'.

All 5 transient-state tests still pass. vue-tsc --noEmit now clean.
2026-07-02 22:13:28 +08:00
chiguyong 53347ed1fe test(u6): add L4 real-LLM smoke test for ReAct tool-use prompt
Manual smoke test verifying U4 L0 prompt rule rearrangement under real
LLM calls (bailian-coding/qwen3.7-plus). 5 probe queries covering
external_info / realtime_data / multi_step / realtime_simple / no_tool.

Results:
- Probe #1 external_info: PASS (8 web_search calls, 99.9s)
- Probe #2 realtime_data: ERROR (120s timeout, not LLM refusal)
- Probe #3 multi_step: PASS (8 web_search calls, 62.6s)
- Probe #4 realtime_data_simple: PASS (3 web_search calls, 23.8s)
- Probe #5 no_tool_escape_hatch: PASS (0 tool calls, direct answer, 4.2s)

Verdict: 3/4 tool-call pass (>=3/4 threshold) + 1/1 direct pass
Bug 2 status upgraded to 'L4 verified'.

Plan Progress table updated: U6 done, U7 done.
2026-07-02 22:08:45 +08:00
chiguyong 44f4f1c46f fix: add null check for chatStore.conversations in StickyModeHeader
Optional chaining prevents TypeError when test mocks don't provide conversations array.
2026-07-02 21:48:41 +08:00
chiguyong b98e7cb42f test: update login test to expect standardized port 18001
The test was asserting port 8001 (old default) but config.py now loads .env.dev which sets AGENTKIT_SERVER_PORT=18001 per the project port standardization (18001/18002/15173/15174).
2026-07-02 21:30:21 +08:00
chiguyong 96f459c27d docs: add brainstorm/plan decision artifacts + plan progress update
Add ce-brainstorm requirements doc and ce-plan plan doc for private board restrictions and scheme B bubbles (decision artifacts). Update 2026-07-02-002 plan with U6/U7 progress table. Add .compound-engineering/config.local.example.yaml from ce-setup. gitignore tmp_*.html and delete_old_cluster.sh.
2026-07-02 21:27:20 +08:00
chiguyong 8188e8861d feat(ui): scheme B neutral grayscale for board messages + assistant bubbles
expertIdentity.ts PALETTE -> neutral grayscale; useMessageRenderer.ts removes assistant fallback for board_* events; BoardRoundCard/MessageShell apply GitHub-style gray; chatStream.ts prefers event-provided moderator avatar/color; StickyModeHeader/Scene4/LoginView/types aligned.
2026-07-02 21:26:22 +08:00
chiguyong 32746652aa fix(board): persist moderator avatar/color in round_summary events
board_orchestrator.py: include moderator_avatar and moderator_color in
the round_summary event payload so downstream consumers have the
moderator's identity metadata.

chat.py: persist expert_avatar and expert_color from the event data into
the board_summary message metadata, ensuring avatar/color survive page
reload instead of falling back to defaults.
2026-07-02 21:24:13 +08:00
chiguyong 484b7ddb95 fix(dev): isolate dev environment ports and fix env loading
- docker-compose.yaml: production mode uses expose (container-only) for
  Redis/PostgreSQL instead of ports (host-mapped)
- docker-compose.dev.yml: dev override maps Redis 6381 and PostgreSQL 5435
  to avoid conflicts with other projects (pms-redis 6379, geo_redis 6380,
  geo_db 5433)
- config.py: fix empty env var handling — only skip .env override when
  os.environ[key] is non-empty; load .env, .env.dev, .env.local in sequence
- scripts/dev-start.sh: manage agentkit-specific Docker containers
- .gitignore: add .env.dev and .env.local (contain API keys)
2026-07-02 21:23:50 +08:00
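
The empty-variable rule from the config.py fix above can be expressed as a pure function. This is a sketch, not the project's loader: apply_env_layers is a hypothetical name, the layers are passed in as already-parsed dicts, and "later files override earlier ones" is an assumption (the commit only says the files are loaded in sequence).

```python
def apply_env_layers(environ: dict, layers: list) -> dict:
    """Merge .env-style layers into a copy of `environ`."""
    # Only non-empty process values are protected from file overrides;
    # an exported-but-empty variable is treated as if it were unset.
    protected = {key for key, value in environ.items() if value}
    merged = dict(environ)
    for layer in layers:  # e.g. .env, then .env.dev, then .env.local
        for key, value in layer.items():
            if key not in protected:
                merged[key] = value  # assumption: later layers win
    return merged


# Illustrative values only.
dotenv = {"AGENTKIT_SERVER_PORT": "8001"}
dotenv_dev = {"AGENTKIT_SERVER_PORT": "18001"}

from_empty = apply_env_layers({"AGENTKIT_SERVER_PORT": ""}, [dotenv, dotenv_dev])
from_shell = apply_env_layers({"AGENTKIT_SERVER_PORT": "9000"}, [dotenv, dotenv_dev])
```

Here from_empty picks up the file value because the process variable was empty, while from_shell keeps the explicitly exported port.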
chiguyong 754d70623c refactor(experts): replace brand colors with neutral grayscale palette
Update color field in 15 expert YAML configs to use neutral grayscale
and deep accent tones (gray 400-800, stone, amber, dark blue/green),
consistent with the expertIdentity.ts PALETTE and the project convention
for GitHub-style neutral UI coloring.
2026-07-02 21:22:50 +08:00
chiguyong 9e2ccf5ac9 chore: gitignore .understand-anything (local knowledge graph index)
The .understand-anything/ directory is a tool-generated local index,
not project code. Remove 4 tracked files from index and add to .gitignore.
2026-07-02 21:22:00 +08:00
chiguyong 7376005868 fix: correct transient state reset criteria + ReAct tool-call rules
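Bug 1: three chatStore actions reset boardState/debateState/collaborationState
- createConversation: add the three-state reset (previously missing, so stale private board state leaked into new conversations)
- selectConversation: unify to a conditional reset (prevConvId !== id) so that a force-reload does not clear state by mistake
- deleteConversation: add the missing collaborationState reset
- Also: in selectConversation, when board_speech/board_summary messages lack
  expert_avatar/expert_color, fill them in from boardState.experts as a fallback

Bug 2: ReAct _build_tool_use_prompt L0 rule adjustments
- Add rule 1: tools must be used for external information, real-time data, multi-step analysis, or uncertain facts
- Former rule 3 demoted to rule 4 and narrowed to allow a direct answer only when no tool is genuinely needed
- base_prompt and tool descriptions unchanged (L1/L2 split into a separate plan)

Tests: 5 frontend transient-state reset matrix + 6 backend prompt rules assertions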

Plan: docs/plans/2026-07-02-002-fix-transient-state-reset-and-react-tool-guidance-plan.md
2026-07-02 20:51:57 +08:00
chiguyong 78a7faa17b refactor: remove all emoji from agentkit
Replace emoji across codebase: YAML avatars -> first char, frontend banners -> Ant Design Vue components, CLI status -> OK/FAIL/WARN labels, terminal -> [WARN]/[OK]/[PENDING], Bitable DB default -> table, App.vue font cleanup, test fixtures -> first char letters. shell.avatar type upgraded to string | Component.
2026-07-02 01:33:28 +08:00
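chiguyong 36b0296730 fix: private board data persistence fix + emoji removal plan
- Fix persistence of board_started/expert_speech/round_summary/board_concluded events
- Add is_board flag to the conversation list and detail endpoints
- Implement restoreBoardStateFromMessages to restore boardState from persisted messages
- Add private board badge to ChatSidebar
- Add emoji removal plan document (docs/plans/2026-07-02-001)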
2026-07-02 01:07:12 +08:00
Fischer ba0baabfcd Merge PR #15: docs: compound streaming-event-contract-residuals learning
Docs-only change. ce-compound Full mode knowledge sedimentation for PR #14 residuals.
2026-07-01 13:53:46 +08:00
chiguyong fe93b0f2a4 docs: compound streaming-event-contract-residuals learning
Knowledge sedimentation for PR #14's 4 residual findings (1 P1 + 3 P2)
from ce-code-review of feat/ui-ue-enhancement. ce-compound Full mode run.

Created:
- docs/solutions/integration-issues/streaming-event-contract-residuals.md
  Bug-track doc covering the 4-fix cluster: expert_step payload alignment,
  execute_stream CancellationToken registration, team_synthesis orphan
  milestone cleanup, synthesis_id dedup. Includes code examples, root cause
  analysis, and prevention strategies (streaming contract testing,
  cancellation registration checklist, terminal event symmetry, milestone
  identifier pattern).

Updated:
- AGENTS.md: "WebSocket Chat 协议" (WebSocket Chat protocol) section expanded with streaming event
  types (expert_step/expert_result_chunk/team_synthesis_chunk), synthesis_id
  dedup contract, and execute_stream cancellation contract.
- CONCEPTS.md: Added "Streaming Milestone" entry to Expert Orchestration
  cluster — the UI pattern for streaming progress indicators that transition
  through streaming → completed|error states, including orphan failure mode
  and synthesis_id matching semantics.

Overlap with existing docs/solutions/runtime-errors/streaming-event-whitelist-and-accumulation.md
is MODERATE (same area, different specific bugs). Flagged for potential
consolidation via ce-compound-refresh.
2026-07-01 13:53:10 +08:00
Fischer 8e8843c363 Merge PR #14: fix(experts): resolve residual review findings from PR #13
Merges fix/ui-ue-residual-findings → main.

Addresses 4 actionable findings (1 P1 + 3 P2) from ce-code-review of PR #13:
- P1: expert_step payload alignment (_phase_executor.py)
- P2 #1: execute_stream CancellationToken registration (config_driven.py)
- P2 #2: team_synthesis orphan milestone cleanup (orchestrator.py)
- P2 #3: synthesis_id dedup (orchestrator.py + types.ts + chatStream.ts)

Verification: ruff clean, pytest 14/14, typecheck clean, vitest 126/127 (1 baseline)
2026-07-01 13:37:52 +08:00
chiguyong 47a437c5e3 fix(experts): resolve residual review findings from PR #13
Addresses 4 actionable findings (1 P1 + 3 P2) from ce-code-review of
feat/ui-ue-enhancement (PR #13), now merged to main (8066e0b).

P1 — expert_step payload alignment (_phase_executor.py)
  The thinking/tool_call/tool_result event payloads were missing the
  fields the frontend WsServerMessage contract requires
  (expert_name/expert_color/content/step). Frontend code consuming these
  events silently degraded. Now all expert_step broadcasts carry the
  full contract; tool_call/tool_result keep step_data for the raw payload.

P2 #1 — execute_stream CancellationToken registration (config_driven.py)
  execute_stream() bypassed BaseAgent.execute() and never registered a
  CancellationToken, so cancel_task() could not cooperatively cancel a
  streaming task. Now registers the token and cleans it up in finally.

P2 #2 — team_synthesis orphan milestone cleanup (orchestrator.py)
  If synthesis streaming was interrupted (cancel/exception), no terminal
  team_synthesis event was emitted, leaving the frontend streaming
  milestone spinning forever. Now an inner try/except emits a terminal
  team_synthesis with status=cancelled|error before re-raising, so the
  frontend can finalize the milestone. The success path also carries
  the synthesis_id.

P2 #3 — synthesis_id dedup (orchestrator.py + types.ts + chatStream.ts)
  Without an identifier, the frontend could not precisely match a
  team_synthesis terminal event to its streaming milestone (especially
  across retries/concurrent teams). The backend now injects a stable
  synthesis_id (`{plan.id}:synthesis`) into both team_synthesis_chunk
  and team_synthesis events; the frontend uses it for exact milestone
  matching and treats error/cancelled status as terminal.

Test updates
  - Updated test_thinking_events_forwarded_as_expert_step to assert the
    new payload contract (expert_id/name/color/content/step).
  - Added test_tool_call_events_forwarded_as_expert_step covering
    tool_call/tool_result payload shape (content=tool_name summary +
    step_data=raw payload).

Verification
  - ruff check: clean
  - pytest tests/unit/experts/test_phase_executor_streaming.py: 14/14
  - npm run typecheck: clean
  - vitest: 126/127 (1 unrelated baseline failure in tauri-auth.test.ts)

Residuals doc: docs/residual-review-findings/feat-ui-ue-enhancement.md
2026-07-01 13:26:19 +08:00
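
The P2 #2 and P2 #3 fixes in the commit above combine into one pattern: every synthesis stream ends with exactly one terminal team_synthesis event carrying the same synthesis_id as its chunks. The sketch below illustrates that pattern only; stream_synthesis, the emit callback, the payload fields beyond synthesis_id/status, and the "completed" status name are assumptions.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable


async def stream_synthesis(
    plan_id: str,
    chunks: AsyncIterator[str],
    emit: Callable[[dict], Awaitable[None]],
) -> str:
    synthesis_id = f"{plan_id}:synthesis"  # stable id, per the commit message
    parts = []

    async def emit_terminal(status: str, content: str = "") -> None:
        await emit({"type": "team_synthesis", "synthesis_id": synthesis_id,
                    "status": status, "content": content})

    try:
        async for chunk in chunks:
            parts.append(chunk)
            await emit({"type": "team_synthesis_chunk",
                        "synthesis_id": synthesis_id, "content": chunk})
    except asyncio.CancelledError:
        await emit_terminal("cancelled")  # let the UI finalize the milestone
        raise
    except Exception:
        await emit_terminal("error")
        raise
    text = "".join(parts)
    await emit_terminal("completed", text)  # status name is an assumption
    return text


async def demo_synthesis(fail: bool) -> list:
    events = []

    async def emit(event: dict) -> None:
        events.append(event)

    async def chunks():
        yield "part-1 "
        if fail:
            raise RuntimeError("provider error")
        yield "part-2"

    try:
        await stream_synthesis("plan-42", chunks(), emit)
    except RuntimeError:
        pass
    return events
```

In the failing run the last event is a team_synthesis with status "error", so a consumer matching on synthesis_id can stop its spinner even though no full text arrived.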
Fischer 8066e0bf8b Merge PR #13: feat: UI/UE enhancement — streaming, sticky header, hover actions, calendar tokens
Implements U1-U7 UI/UE enhancement with streaming, sticky header, hover actions, calendar tokens.

Review fixes applied (P0 whitelist + P0 double accumulation + P2 exception handling).
See PR #13 description for details.
2026-07-01 13:15:33 +08:00
chiguyong 4866a16109 docs: compound streaming-event-whitelist-and-accumulation learning
Captures the ReAct streaming contract bug + WS event whitelist governance
from PR #13's review fixes. Three intertwined runtime issues documented:

1. P0: final_answer double-accumulated token content (logic_error)
2. P0: _VALID_TEAM_EVENT_TYPES whitelist missing 3 new streaming event types
3. P2: except (RuntimeError, TimeoutError, ConnectionError) too narrow for
   LLMProviderError/ConfigValidationError in async generator

Adds ReAct Streaming Contract entry to CONCEPTS.md — defines the protocol
execute_stream() yields (token events with incremental content, then one
final_answer event with the concatenated full text). Consumers must pick
one accumulation strategy, cannot mix both without doubled output.
2026-07-01 13:15:01 +08:00
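
The double-accumulation bug described in the commit above is easy to see on a toy event list. The functions below are illustrative only; the event dict shape ("type" / "content") is an assumption about the contract, not the project's actual types.

```python
def text_from_tokens(events: list) -> str:
    """Strategy A: concatenate incremental token events only."""
    return "".join(e["content"] for e in events if e["type"] == "token")


def text_from_final_answer(events: list) -> str:
    """Strategy B: ignore tokens and take the final_answer text."""
    finals = [e["content"] for e in events if e["type"] == "final_answer"]
    return finals[-1] if finals else ""


def text_mixed(events: list) -> str:
    """The bug: appending final_answer on top of accumulated tokens."""
    return "".join(
        e["content"] for e in events if e["type"] in ("token", "final_answer")
    )


stream = [
    {"type": "token", "content": "Hel"},
    {"type": "token", "content": "lo"},
    {"type": "final_answer", "content": "Hello"},
]
```

Both single strategies yield "Hello" for this stream, while text_mixed(stream) yields "HelloHello", which is the doubled output the contract warns about.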
chiguyong f872a3fac6 feat: UI/UE enhancement — streaming, sticky header, hover actions, calendar tokens
U1 ThinkingBlock: streaming cursor + auto-collapse to summary bar
U2 StickyModeHeader: new component replacing ExpertTeamView + BoardStatusView
U3 Backend _phase_executor: execute_stream() with token/thinking/final_answer forwarding
U4 Frontend chatStream: expert_result_chunk/team_synthesis_chunk token accumulation
U5 AssistantText: routing tag hover fade-in
U6 UserBubble: hover actions (copy/delete/refill)
U7 CalendarGrid: token-based color redesign

Review fixes (ce-code-review):
- P0: _VALID_TEAM_EVENT_TYPES whitelist adds 3 new streaming event types
- P0: final_answer no longer double-accumulates token content
- P2: exception handling expanded to except Exception for LLMProviderError etc.

Simplification (ce-simplify-code):
- _synthesizer.py: O(n²) concat -> list+join, _concat_results extraction
- config_driven.py: 4 duplicate _handle_*_stream -> _wrap_sync_as_stream
- chatStream.ts: 5x [...messages].reverse().find() -> findLastMessage helper

Tests: pytest 13/13, vitest 126/127 (1 baseline), typecheck pass, ruff clean
2026-07-01 12:51:45 +08:00
Ether 521f573d4a docs: compound any-and-except-exception-governance convention (#12)
2026-07-01 08:16:54 +08:00
chiguyong 975b7c4e57 docs: compound any-and-except-exception-governance convention
Record the strategies established during PR #8-#11 (1214+ tech debt
governance) for Any replacement priority, except Exception classification,
framework boundary preservation, and intentional-design retention.
2026-07-01 08:16:02 +08:00
Fischer c005642851 refactor: tech debt Wave 3+4 (tools/skills/mcp/rag/calendar/auth/cli/quality/channels/telemetry/session/bus/documents Any governance) (#11)
2026-07-01 08:08:36 +08:00
Fischer a778f816c5 refactor: tech debt Wave 1+2 (except Exception wrap-up + core/experts Any governance) (#10)
2026-07-01 03:54:53 +08:00
Fischer 838a05772e refactor: follow-up tech debt cleanup (except Exception + Any governance) (#9)
2026-07-01 03:03:02 +08:00
Fischer cc531d0663 refactor: systematic tech debt cleanup (U1-U5) (#8)
Merge PR #8: U1-U5 systematic tech debt cleanup
2026-07-01 00:45:34 +08:00
chiguyong ec9a0a1f70 refactor(frontend): split chat.ts (2025 lines) into chatStore/chatSocket/chatStream (U5)
chatStore.ts (498 lines, <=500 target met): Pinia store entry composing useChatSocket + useChatStream; retains all actions + backward-compat export aliases.

chatSocket.ts (165 lines): resolveIncomingConvId pure fn + useChatSocket composable (connect/disconnect/heartbeat/reconnect).

chatStream.ts (1557 lines): dispatchWsEvent pure fn for 30+ WS event types + useChatStream composable. Exceeds plan ~300 estimate due to discriminated union breadth (each case 30-50 lines); core testability goal met.

8 components + chat-phase.test.ts migrated from @/stores/chat to @/stores/chatStore.

vitest: 35 new tests (chatStream 19 + chatSocket 13 + chat-phase 3) all green; typecheck passes.
2026-06-30 22:32:48 +08:00
chiguyong 1033346913 refactor(bitable,tools): replace Any with concrete types + Protocol (U4)
BitableRecord/FormulaResult/SessionState TypeAlias replace dict[str, Any]; _redis/_engine/_session_factory typed as object | None with TYPE_CHECKING Protocol (_RedisLike, _RecalcWorker); Coroutine[Any, Any, Any] retained as legitimate type param.

Baseline: 40 ': Any' occurrences -> 0 across 6 in-scope files (target <=5). Deferred: repository.py/recalc_worker.py/ingestion/* (10 occurrences, separate PR).

ruff clean; 367 passed + 116 skipped (bitable + pipeline_state + tools).
2026-06-30 22:32:30 +08:00
chiguyong be5c4e09f8 refactor(core,experts): classify except Exception + structured ReviewResult (U3)
ReviewResult dataclass (passed/degraded/feedback) replaces tuple+[DEGRADED] prefix in _review_phase_output; 3 review_result WS payloads now carry degraded field (AE3).

except Exception narrowed to specific types across 10 files (core/react, rewoo, base, orchestrator, dispatcher, plan_exec_engine + experts/orchestrator, _phase_executor, _review_gate + orchestrator/pipeline_engine). Baseline 140 -> 66 occurrences (>=50% reduction).

Fix RuntimeError regression: review-gate + compression paths now catch RuntimeError (LLM/provider internal errors) to preserve degradation semantics. Test side_effect switched to functional form to avoid StopIteration on list exhaustion.

ruff clean; 135 key + 469 experts + 163 core tests pass.
2026-06-30 18:03:58 +08:00
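
The move from a tuple with a "[DEGRADED]" string prefix to a structured ReviewResult, described in the commit above, might look like this. The three field names come from the commit message; the field types, defaults, and the helper functions from_legacy and to_ws_payload are assumptions for illustration.

```python
from dataclasses import asdict, dataclass

DEGRADED_PREFIX = "[DEGRADED]"


@dataclass(frozen=True)
class ReviewResult:
    passed: bool
    degraded: bool = False
    feedback: str = ""


def from_legacy(passed: bool, feedback: str) -> ReviewResult:
    """Convert the old (passed, '[DEGRADED] ...') convention (hypothetical)."""
    if feedback.startswith(DEGRADED_PREFIX):
        cleaned = feedback[len(DEGRADED_PREFIX):].strip()
        return ReviewResult(passed=passed, degraded=True, feedback=cleaned)
    return ReviewResult(passed=passed, degraded=False, feedback=feedback)


def to_ws_payload(result: ReviewResult) -> dict:
    """Build a review_result payload that carries the degraded flag."""
    return {"type": "review_result", **asdict(result)}


legacy = from_legacy(True, "[DEGRADED] reviewer unavailable")
payload = to_ws_payload(legacy)
```

With the flag as a real field, consumers no longer need to parse feedback text to learn that a review was degraded.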
chiguyong 47ee2449df refactor(experts): split TeamOrchestrator god class into 7 mixins (U2)
- Split 2085-line orchestrator.py into main class (592 lines) + 7 responsibility-focused mixins: PhaseExecutor, DebateRunner, ReviewGate, DivergenceDetector, RollbackHandler, Synthesizer, InterventionHandler.

- Mixin pattern preserves self access to shared state (_experts/_workspace/_broadcast_event); method bodies moved verbatim to minimize regression risk. Each mixin declares TYPE_CHECKING Protocol for shared state.

- Split _execute_execution_phase (~290 lines) into _prepare_phase_context/_run_agent_steps/_finalize_phase (each <=100 lines).

- All mixins <=400 lines, main class <=600 lines. [DEGRADED] prefix annotations preserved in ReviewGateMixin.

- 60 team_orchestrator tests pass (behavior unchanged), 469 experts tests pass, ruff clean.
2026-06-30 16:47:20 +08:00
chiguyong e61f98898f refactor(core): unify ReActEngine execute/execute_stream via async generator (U1)
- Convert _execute_loop to async generator yielding ReActEvent; both execute and execute_stream delegate to it, eliminating ~760 lines of duplicated loop logic (execute_stream 813 -> 53 lines).

- Add 'final_result' event_type carrying ReActResult; execute extracts result from final event, execute_stream forwards events (backward-compatible 'final_answer' retained).

- Unify _drain_phase_violations across both paths.

- Add 14 golden-trajectory characterization tests.

- Fix test_execute_stream_with_compressor mock gateway (chat_stream test-infra gap). 130 react tests pass, 762 core+experts pass, no regressions.
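The unification described in this commit can be sketched with a single async generator that both entry points consume. `MiniEngine` and its fixed two-step loop are assumptions for illustration; only the names `_execute_loop`, `execute`, `execute_stream`, `ReActEvent`, and the `final_result` event type come from the commit message.

```python
from __future__ import annotations

import asyncio
from collections.abc import AsyncIterator
from dataclasses import dataclass


@dataclass
class ReActEvent:
    event_type: str
    data: object


class MiniEngine:
    async def _execute_loop(self, task: str) -> AsyncIterator[ReActEvent]:
        # Single source of truth for the loop logic.
        for step in range(2):
            yield ReActEvent("step", f"{task}: step {step}")
        yield ReActEvent("final_result", f"{task}: done")

    async def execute_stream(self, task: str) -> AsyncIterator[ReActEvent]:
        # Streaming path: forward every event unchanged.
        async for event in self._execute_loop(task):
            yield event

    async def execute(self, task: str) -> object:
        # Non-streaming path: drain the same loop, keep only the result.
        result: object = None
        async for event in self._execute_loop(task):
            if event.event_type == "final_result":
                result = event.data
        if result is None:
            raise RuntimeError("loop ended without a final_result event")
        return result


async def collect(engine: MiniEngine, task: str) -> list[str]:
    return [e.event_type async for e in engine.execute_stream(task)]
```

Both paths can be exercised with `asyncio.run(MiniEngine().execute("demo"))` and `asyncio.run(collect(MiniEngine(), "demo"))`. Since there is one loop, a fix applied to it reaches the streaming and non-streaming callers alike.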
2026-06-30 16:07:00 +08:00
chiguyong 03b1e3d751 docs: add systematic tech debt cleanup plan (U1-U5) 2026-06-30 14:27:47 +08:00
chiguyong a3cecd4b50 fix(review): apply P0/P2 findings from dual-agent review
- Dockerfile: split ENTRYPOINT/CMD to align with docker-compose serve
- test_termbase: guard jieba import with pytest.importorskip
- orchestrator: mark silent review-degradation with [DEGRADED] prefix
- chat.py: accurate ExecutionMode log message
- agentkit.yaml: document OTel exporter config
- skill_routing: replace 12 Any with object/typed (AGENTS.md compliance)
- AssistantText.vue: add aria-live/role for a11y
2026-06-30 14:27:46 +08:00
Fischer 0962df11b5 feat(agent): Wave 4 PLAN_EXEC Hardening (U1-U5) (#7)
Merge PR #7: Wave 4 PLAN_EXEC Hardening — U1-U5 + ce-code-review fixes + PLAN_EXEC concepts docs.
2026-06-30 12:46:35 +08:00
chiguyong a872a459a6 docs: add PLAN_EXEC concepts + commit Wave 4 plan
CONCEPTS.md: new PLAN_EXEC section (Phase State Machine, PhasePolicy, Phase Violation, AdvancePhaseTool, _build_phase_engine).

docs/plans/: commit the Wave 4 plan document (was untracked).
2026-06-30 12:46:24 +08:00
chiguyong 8627777f87 fix(review): apply ce-code-review findings
Six safe fixes from Stage 5c review:

- phase.py: delete dead _DEFAULT_BASH_FILTER constant (no references after U1)
- chat.py: drop Any from _build_phase_engine params (AGENTS.md prohibits any)
- chat.ts: delete stale comment about phase_changed emission
- chat-phase.test.ts: rename misleading 'capped at 5' test name
- test_chat_plan_exec_ws.py: tighten test_rest_react_mode_still_works assertion
- test_plan_exec_e2e.py: clarify test_auto_advance assertion comment

Known limitations documented in PR description (not fixed): loop detector + advance_phase (P1), parallel path phase_violation ordering (P2), REST cancellation_token (P2), Callable filter exceptions (P3).
2026-06-30 12:42:15 +08:00
chiguyong cbbe937940 chore(shell): fix ruff F401/F841 + apply ruff format
Pre-existing ruff errors surfaced during Wave 4 QC:
- F401: drop unused `TerminalSession` import (only `TerminalSessionManager` is used)
- F841: drop unused `start = time.monotonic()` local in `_execute_standalone`

`ruff format` then reformatted a few long lines in the same file
(frozenset literal, curl exfiltration regex, pipe operators, session.env
call). No behavior change — formatting only.

Why now: shell.py was already touched by U1 (widen
`bash_command_filter`). Leaving known ruff failures in a file this PR
modifies would make future CI gates noisy.
2026-06-30 11:52:51 +08:00