Replace emoji across codebase: YAML avatars -> first char, frontend banners -> Ant Design Vue components, CLI status -> OK/FAIL/WARN labels, terminal -> [WARN]/[OK]/[PENDING], Bitable DB default -> table, App.vue font cleanup, test fixtures -> first char letters. shell.avatar type upgraded to string | Component.
Addresses 4 actionable findings (1 P1 + 3 P2) from ce-code-review of
feat/ui-ue-enhancement (PR #13), now merged to main (8066e0b).
P1 — expert_step payload alignment (_phase_executor.py)
The thinking/tool_call/tool_result event payloads were missing the
fields the frontend WsServerMessage contract requires
(expert_name/expert_color/content/step). Frontend code consuming these
events silently degraded. Now all expert_step broadcasts carry the
full contract; tool_call/tool_result keep step_data for the raw payload.
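A minimal sketch of what a full-contract expert_step payload could look like. The field names expert_id, expert_name, expert_color, content, step and step_data come from the description above; the helper `build_expert_step`, the `type` envelope key and the sample expert are assumptions for illustration, not the code in _phase_executor.py.

```python
from typing import Any, Dict, Optional


def build_expert_step(
    expert: Dict[str, str],
    step: str,
    content: str,
    step_data: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build one expert_step event (hypothetical helper, not the project's code)."""
    payload: Dict[str, Any] = {
        "type": "expert_step",  # assumed envelope key
        "expert_id": expert["id"],
        "expert_name": expert["name"],
        "expert_color": expert["color"],
        "step": step,  # "thinking", "tool_call" or "tool_result"
        "content": content,
    }
    # Only tool_call/tool_result carry the raw payload alongside the summary.
    if step_data is not None:
        payload["step_data"] = step_data
    return payload


expert = {"id": "sec-1", "name": "Security", "color": "#d4380d"}
thinking = build_expert_step(expert, "thinking", "Checking auth flow")
tool_call = build_expert_step(
    expert,
    "tool_call",
    "grep",
    step_data={"tool_name": "grep", "args": {"pattern": "token"}},
)
```

Building every variant through one function keeps the required fields from drifting apart between thinking, tool_call and tool_result.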
P2 #1 — execute_stream CancellationToken registration (config_driven.py)
execute_stream() bypassed BaseAgent.execute() and never registered a
CancellationToken, so cancel_task() could not cooperatively cancel a
streaming task. Now registers the token and cleans it up in finally.
P2 #2 — team_synthesis orphan milestone cleanup (orchestrator.py)
If synthesis streaming was interrupted (cancel/exception), no terminal
team_synthesis event was emitted, leaving the frontend streaming
milestone spinning forever. Now an inner try/except emits a terminal
team_synthesis with status=cancelled|error before re-raising, so the
frontend can finalize the milestone. The success path also carries
the synthesis_id.
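A rough sketch of the inner try/except described above. The event names and the cancelled/error statuses follow the commit message; the `broadcast` helper, the `synthesize` function, the `error` field and the `completed` success status are assumptions made for the example.

```python
import asyncio
from typing import Any, AsyncIterator, Dict, List

events: List[Dict[str, Any]] = []


async def broadcast(event: Dict[str, Any]) -> None:
    """Hypothetical stand-in for the WebSocket broadcast."""
    events.append(event)


async def synthesize(plan_id: str, source: AsyncIterator[str]) -> str:
    synthesis_id = f"{plan_id}:synthesis"
    parts: List[str] = []
    try:
        async for chunk in source:
            parts.append(chunk)
            await broadcast({"type": "team_synthesis_chunk",
                             "synthesis_id": synthesis_id, "content": chunk})
    except asyncio.CancelledError:
        # Emit a terminal event so the milestone can be finalized, then re-raise.
        await broadcast({"type": "team_synthesis",
                         "synthesis_id": synthesis_id, "status": "cancelled"})
        raise
    except Exception as exc:
        await broadcast({"type": "team_synthesis", "synthesis_id": synthesis_id,
                         "status": "error", "error": str(exc)})
        raise
    text = "".join(parts)
    await broadcast({"type": "team_synthesis", "synthesis_id": synthesis_id,
                     "status": "completed", "content": text})  # status name assumed
    return text


async def failing_source() -> AsyncIterator[str]:
    yield "partial "
    raise RuntimeError("LLM stream dropped")


async def demo() -> Dict[str, Any]:
    try:
        await synthesize("plan-1", failing_source())
    except RuntimeError:
        pass  # the caller still sees the original exception
    return events[-1]


last = asyncio.run(demo())
```

Re-raising after the broadcast keeps the orchestrator's existing error handling intact while guaranteeing that the stream always ends with a team_synthesis event.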
P2 #3 — synthesis_id dedup (orchestrator.py + types.ts + chatStream.ts)
Without an identifier, the frontend could not precisely match a
team_synthesis terminal event to its streaming milestone (especially
across retries/concurrent teams). The backend now injects a stable
synthesis_id (`{plan.id}:synthesis`) into both team_synthesis_chunk
and team_synthesis events; the frontend uses it for exact milestone
matching and treats error/cancelled status as terminal.
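The matching logic lives in the TypeScript frontend; the following Python model only illustrates the idea of keying milestones by synthesis_id. The `apply_event` function, the milestone dictionary shape and the `streaming` status label are assumptions, not the contents of chatStream.ts.

```python
from typing import Any, Dict


def make_synthesis_id(plan_id: str) -> str:
    # Format taken from the commit message: {plan.id}:synthesis
    return f"{plan_id}:synthesis"


def apply_event(milestones: Dict[str, Dict[str, Any]], event: Dict[str, Any]) -> None:
    """Route an event to exactly one milestone using its synthesis_id."""
    sid = event["synthesis_id"]
    milestone = milestones.setdefault(sid, {"text": "", "status": "streaming"})
    if event["type"] == "team_synthesis_chunk":
        milestone["text"] += event["content"]
    elif event["type"] == "team_synthesis":
        # error and cancelled end the milestone just like a successful result
        milestone["status"] = event["status"]


milestones: Dict[str, Dict[str, Any]] = {}
a, b = make_synthesis_id("plan-a"), make_synthesis_id("plan-b")
apply_event(milestones, {"type": "team_synthesis_chunk", "synthesis_id": a, "content": "A1"})
apply_event(milestones, {"type": "team_synthesis_chunk", "synthesis_id": b, "content": "B1"})
apply_event(milestones, {"type": "team_synthesis", "synthesis_id": a, "status": "cancelled"})
```

With two teams streaming at once, the cancelled event for plan-a finalizes only that milestone and leaves plan-b streaming.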
Test updates
- Updated test_thinking_events_forwarded_as_expert_step to assert the
new payload contract (expert_id/name/color/content/step).
- Added test_tool_call_events_forwarded_as_expert_step covering
tool_call/tool_result payload shape (content=tool_name summary +
step_data=raw payload).
Verification
- ruff check: clean
- pytest tests/unit/experts/test_phase_executor_streaming.py: 14/14
- npm run typecheck: clean
- vitest: 126/127 (1 unrelated baseline failure in tauri-auth.test.ts)
Residuals doc: docs/residual-review-findings/feat-ui-ue-enhancement.md
U4: ExpertTeam accepts redis_client, passes to SharedWorkspace. After phase
completion, full result is written to workspace and in-memory phase.result
is replaced with a 500-char summary + _ref_key. Dependency output reading
resolves offloaded content from workspace on demand, with graceful fallback
to summary on read failure.
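The offload-and-resolve flow can be sketched like this. The 500-character limit, the _ref_key field and the fallback to the summary come from the description above; the workspace classes, the key format and the use of plain truncation as the summary are assumptions, and the real SharedWorkspace API may differ.

```python
from typing import Any, Dict

SUMMARY_CHARS = 500


class MemoryWorkspace:
    """Dict-backed stand-in for SharedWorkspace (hypothetical interface)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def write(self, key: str, value: str) -> None:
        self._data[key] = value

    def read(self, key: str) -> str:
        return self._data[key]


class BrokenWorkspace(MemoryWorkspace):
    """Simulates a workspace whose reads fail."""

    def read(self, key: str) -> str:
        raise ConnectionError("workspace unavailable")


def offload_result(workspace: MemoryWorkspace, phase_id: str, result: str) -> Dict[str, Any]:
    """Keep short results inline; move long ones to the workspace."""
    if len(result) <= SUMMARY_CHARS:
        return {"result": result}
    key = f"phase:{phase_id}:result"  # assumed key format
    workspace.write(key, result)
    return {"result": result[:SUMMARY_CHARS], "_ref_key": key}  # truncation assumed


def resolve_result(workspace: MemoryWorkspace, stored: Dict[str, Any]) -> str:
    """Return the full content when available, else the in-memory summary."""
    key = stored.get("_ref_key")
    if key is None:
        return stored["result"]
    try:
        return workspace.read(key)
    except Exception:
        return stored["result"]  # graceful fallback on read failure


full = "x" * 1200
ws = MemoryWorkspace()
stored = offload_result(ws, "p1", full)
resolved = resolve_result(ws, stored)

broken = BrokenWorkspace()
fallback = resolve_result(broken, offload_result(broken, "p1", full))
short = offload_result(ws, "p2", "ok")
```

Downstream phases always call the resolver, so they do not need to know whether a dependency was offloaded.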
Tests: 8 scenarios (offload creation, short content, dependency resolution,
workspace failure fallback, non-offloaded passthrough, redis_client wiring,
memory dict fallback, pipeline integration) — all pass.
U2: Add asyncio.Semaphore to bound concurrent phase execution and debate
argument generation. Default limit=3, configurable via max_concurrent_phases.
Prevents LLM rate-limit spikes when many phases run in the same layer.
Tests: 5 scenarios (happy path, 5-phase edge case, serial mode, failure
release, debate integration) — all pass.
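A minimal sketch of bounding a layer of phases with asyncio.Semaphore. The default of 3 and the max_concurrent_phases name come from the commit message; `run_layer` and `fake_phase` are hypothetical helpers used to make the effect observable.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_layer(
    phases: List[str],
    run_phase: Callable[[str], Awaitable[str]],
    max_concurrent_phases: int = 3,
) -> List[str]:
    semaphore = asyncio.Semaphore(max_concurrent_phases)

    async def bounded(phase: str) -> str:
        # "async with" releases the slot even if run_phase raises.
        async with semaphore:
            return await run_phase(phase)

    return await asyncio.gather(*(bounded(p) for p in phases))


active = 0
peak = 0


async def fake_phase(name: str) -> str:
    """Pretend LLM call that records how many phases run at once."""
    global active, peak
    active += 1
    peak = max(peak, active)
    await asyncio.sleep(0.01)
    active -= 1
    return name.upper()


results = asyncio.run(run_layer(["a", "b", "c", "d", "e"], fake_phase))
```

Five phases are scheduled together, but at most three are inside the semaphore at any moment, and gather still returns results in input order.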
Implement _execute_debate_phase() with Lead-facilitated structured debate:
- Lead opens with divergence point + dependency context
- Experts argue in parallel per round (asyncio.gather)
- Lead summarizes each round, then adjudicates final verdict
- Verdict produces decision (adopt/compromise/shelve/inconclusive) + conclusion
- Conclusion written to SharedWorkspace for downstream phases
Escape hatches:
- debate_config.skip=true short-circuits with template text
- MAX_DEBATE_ROUNDS=4 hard cap on rounds
- User /stop intervention ends debate early (U4-compatible via getattr fallback)
- LLM unavailable falls back to template verdict, no crash
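The round loop and its escape hatches can be sketched as below. MAX_DEBATE_ROUNDS=4, the skip flag, the parallel arguments and the four decision values follow the commit message; the `run_debate` signature, the `max_rounds` default, the callbacks and the template wording are assumptions, and the Lead's opening and per-round summaries are omitted for brevity.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List, Optional

MAX_DEBATE_ROUNDS = 4
DECISIONS = ("adopt", "compromise", "shelve", "inconclusive")


async def run_debate(
    topic: str,
    experts: List[str],
    argue: Callable[[str, str, int], Awaitable[str]],
    adjudicate: Callable[[List[Dict[str, str]]], Dict[str, str]],
    debate_config: Optional[Dict[str, Any]] = None,
    should_stop: Callable[[], bool] = lambda: False,
) -> Dict[str, Any]:
    config = debate_config or {}
    template = {"decision": "inconclusive",
                "conclusion": f"Debate on '{topic}' was not held.", "rounds": 0}
    if config.get("skip"):
        return template  # short-circuit with template text
    rounds = min(config.get("max_rounds", 2), MAX_DEBATE_ROUNDS)  # hard cap
    transcript: List[Dict[str, str]] = []
    for round_no in range(1, rounds + 1):
        if should_stop():  # user intervention ends the debate early
            break
        arguments = await asyncio.gather(
            *(argue(expert, topic, round_no) for expert in experts)
        )
        transcript.append(dict(zip(experts, arguments)))
    if not transcript:
        return template
    verdict = adjudicate(transcript)
    return {**verdict, "rounds": len(transcript)}


async def fake_argue(expert: str, topic: str, round_no: int) -> str:
    return f"{expert} r{round_no}"


def fake_adjudicate(transcript: List[Dict[str, str]]) -> Dict[str, str]:
    return {"decision": "compromise", "conclusion": f"{len(transcript)} rounds heard"}


calls = {"n": 0}


def stop_after_first() -> bool:
    calls["n"] += 1
    return calls["n"] > 1


capped = asyncio.run(run_debate("cache layer", ["A", "B"], fake_argue,
                                fake_adjudicate, {"max_rounds": 10}))
skipped = asyncio.run(run_debate("cache layer", ["A", "B"], fake_argue,
                                 fake_adjudicate, {"skip": True}))
stopped = asyncio.run(run_debate("cache layer", ["A", "B"], fake_argue,
                                 fake_adjudicate, {"max_rounds": 3}, stop_after_first))
```

A request for ten rounds is capped at four, the skip flag returns the template without calling any expert, and a stop signal after the first round still yields a verdict from the arguments heard so far.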
New events: debate_started, expert_argument, debate_round_summary,
debate_resolved (plus existing phase_completed for consistency).
Phase dispatcher (_execute_phase) routes by phase_type:
EXECUTION to _execute_execution_phase, DEBATE to _execute_debate_phase.
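One way to express this routing is a handler table keyed by phase type. The function names mirror the ones above, but the bodies, the dictionary-shaped phase and the enum string values are placeholders assumed for the sketch.

```python
import asyncio
from enum import Enum
from typing import Any, Dict


class PhaseType(Enum):
    EXECUTION = "execution"  # string values are assumed
    DEBATE = "debate"


async def _execute_execution_phase(phase: Dict[str, Any]) -> str:
    return f"executed {phase['id']}"  # placeholder body


async def _execute_debate_phase(phase: Dict[str, Any]) -> str:
    return f"debated {phase['id']}"  # placeholder body


_HANDLERS = {
    PhaseType.EXECUTION: _execute_execution_phase,
    PhaseType.DEBATE: _execute_debate_phase,
}


async def _execute_phase(phase: Dict[str, Any]) -> str:
    # Phases without an explicit type are treated as EXECUTION.
    phase_type = phase.get("phase_type", PhaseType.EXECUTION)
    handler = _HANDLERS.get(phase_type)
    if handler is None:
        raise ValueError(f"unsupported phase_type: {phase_type!r}")
    return await handler(phase)


plain = asyncio.run(_execute_phase({"id": "p1"}))
debate = asyncio.run(_execute_phase({"id": "p2", "phase_type": PhaseType.DEBATE}))
```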
36 new tests in test_orchestrator_debate.py covering happy path (2 rounds,
2 experts), max_rounds=1 boundary, empty participants, user stop, skip
escape hatch, LLM unavailable, SharedWorkspace integration, event
broadcasting, intervention channel compatibility, and helper methods.
All 377 expert tests pass.
Also includes planning artifacts (brainstorm requirements + implementation
plan with 6 units U1-U6).
U1: Data model foundation for structured debate collaboration.
- Add PhaseType enum (EXECUTION | DEBATE)
- Add phase_type and debate_config fields to PlanPhase
- Update to_dict/from_dict for serialization with backward compatibility
- Add tests for PhaseType, debate phase creation, serialization, and
mixed EXECUTION+DEBATE topological sort
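A minimal sketch of the data model with backward-compatible serialization. PhaseType, PlanPhase, phase_type, debate_config and to_dict/from_dict are named in the commit message; the enum string values, the reduced field set and the choice to default missing keys to EXECUTION are assumptions about how compatibility is achieved.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, Optional


class PhaseType(str, Enum):
    EXECUTION = "execution"  # string values are assumed
    DEBATE = "debate"


@dataclass
class PlanPhase:
    id: str
    phase_type: PhaseType = PhaseType.EXECUTION
    debate_config: Optional[Dict[str, Any]] = None

    def to_dict(self) -> Dict[str, Any]:
        data: Dict[str, Any] = {"id": self.id, "phase_type": self.phase_type.value}
        if self.debate_config is not None:
            data["debate_config"] = self.debate_config
        return data

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "PlanPhase":
        # Plans serialized before this change have no phase_type key.
        raw_type = data.get("phase_type", PhaseType.EXECUTION.value)
        return cls(
            id=data["id"],
            phase_type=PhaseType(raw_type),
            debate_config=data.get("debate_config"),
        )


legacy = PlanPhase.from_dict({"id": "p1"})
debate = PlanPhase(id="p2", phase_type=PhaseType.DEBATE, debate_config={"max_rounds": 2})
round_trip = PlanPhase.from_dict(debate.to_dict())
```

Old payloads load as EXECUTION phases, and a DEBATE phase survives a to_dict/from_dict round trip unchanged.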