Commit Graph

3 Commits

Author SHA1 Message Date
chiguyong 53347ed1fe test(u6): add L4 real-LLM smoke test for ReAct tool-use prompt
Manual smoke test verifying U4 L0 prompt rule rearrangement under real
LLM calls (bailian-coding/qwen3.7-plus). 5 probe queries covering
external_info / realtime_data / multi_step / realtime_simple / no_tool.

Results:
- Probe #1 external_info: PASS (8 web_search calls, 99.9s)
- Probe #2 realtime_data: ERROR (120s timeout, not LLM refusal)
- Probe #3 multi_step: PASS (8 web_search calls, 62.6s)
- Probe #4 realtime_data_simple: PASS (3 web_search calls, 23.8s)
- Probe #5 no_tool_escape_hatch: PASS (0 tool calls, direct answer, 4.2s)

Verdict: 3/4 tool-call pass (>=3/4 threshold) + 1/1 direct pass
Bug 2 status upgraded to 'L4 verified'.

Plan Progress table updated: U6 done, U7 done.
2026-07-02 22:08:45 +08:00
chiguyong 96f459c27d docs: add brainstorm/plan decision artifacts + plan progress update
Add ce-brainstorm requirements doc and ce-plan plan doc for private board restrictions and scheme B bubbles (decision artifacts). Update 2026-07-02-002 plan with U6/U7 progress table. Add .compound-engineering/config.local.example.yaml from ce-setup. gitignore tmp_*.html and delete_old_cluster.sh.
2026-07-02 21:27:20 +08:00
chiguyong 7376005868 fix: 修复 transient state 重置口径 + ReAct 工具调用规则
Test / backend-test (pull_request) Has been cancelled Details
Test / frontend-unit (pull_request) Has been cancelled Details
Test / api-e2e (pull_request) Has been cancelled Details
Test / frontend-e2e (pull_request) Has been cancelled Details
Bug 1: chatStore 三个 action 重置 boardState/debateState/collaborationState
- createConversation: 新增三态重置(原缺失,旧私董会状态泄漏到新会话)
- selectConversation: 统一为条件重置(prevConvId !== id),避免 force-reload 误清空
- deleteConversation: 补全 collaborationState 重置
- 附带:selectConversation 中 board_speech/board_summary 消息缺失
  expert_avatar/expert_color 时从 boardState.experts 兜底补全

Bug 2: ReAct _build_tool_use_prompt L0 规则调整
- 新增规则 1:涉及外部信息/实时数据/多步骤分析/不确定事实时必须使用工具
- 原规则 3 降为规则 4,收窄为仅在确实无需工具时可直接回答
- base_prompt 与工具描述不动(L1/L2 拆为独立 plan)

测试:5 前端 transient-state reset matrix + 6 后端 prompt rules 断言

Plan: docs/plans/2026-07-02-002-fix-transient-state-reset-and-react-tool-guidance-plan.md
2026-07-02 20:51:57 +08:00