fix: unify transient state reset behavior + ReAct tool-call rules #17

Closed
fischer wants to merge 0 commits from fix/transient-state-reset-and-react-tool-guidance into main
Owner

Summary

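Fixes two bugs: the private board header title was not cleared when creating a new conversation (a transient state leak), and the Agent did not call tools automatically for complex requests (a wording problem in the ReAct prompt rules).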

Changes

Bug 1: Inconsistent transient state reset behavior

The three stream-owned refs in chatStore (boardState / debateState / collaborationState) were reset inconsistently across three actions:

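| Action | Before | After |
|--------|--------|-------|
| createConversation | No reset of the three states | Resets all three states |
| selectConversation | Unconditionally reset collaborationState only | Resets all three states conditionally (prevConvId !== id), so a force-reload does not clear them by mistake |
| deleteConversation | Missed collaborationState | Resets all three states |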
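
The reset matrix above can be sketched roughly as follows. This is a minimal, framework-free illustration and not the actual chatStore code: the `ChatStoreSketch` shape, the `makeStore` and `resetTransient` helpers, and the use of `null` as the cleared value are all assumptions made for the example. Only the three action names, the three state names, and the `prevConvId !== id` condition come from this PR.

```typescript
// Hypothetical stand-in for one piece of transient, stream-owned state.
type Transient = { topic: string } | null;

// Hypothetical plain-object model of the store (the real store holds refs).
interface ChatStoreSketch {
  currentId: string | null;
  boardState: Transient;
  debateState: Transient;
  collaborationState: Transient;
}

// Build a store whose three transient states are populated with stale data.
function makeStore(currentId: string | null): ChatStoreSketch {
  return {
    currentId,
    boardState: { topic: "stale" },
    debateState: { topic: "stale" },
    collaborationState: { topic: "stale" },
  };
}

// Single helper so every action clears the same three states.
function resetTransient(s: ChatStoreSketch): void {
  s.boardState = null;
  s.debateState = null;
  s.collaborationState = null;
}

// A new conversation always starts with clean transient state.
function createConversation(s: ChatStoreSketch, newId: string): void {
  resetTransient(s);
  s.currentId = newId;
}

// Reset only when the conversation actually changes, so a force-reload
// of the current conversation keeps its in-flight state.
function selectConversation(s: ChatStoreSketch, id: string): void {
  const prevConvId = s.currentId;
  s.currentId = id;
  if (prevConvId !== id) {
    resetTransient(s);
  }
}

// Deleting the current conversation clears state; deleting another one does not.
function deleteConversation(s: ChatStoreSketch, id: string): void {
  if (s.currentId === id) {
    s.currentId = null;
    resetTransient(s);
  }
}
```

Routing all three actions through one helper is the point of the sketch: a state added later only needs to be cleared in one place.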

Also: in selectConversation, when board_speech / board_summary messages lack expert_avatar / expert_color, the values are backfilled from boardState.experts, so that the StickyModeHeader avatar and the MessageShell avatar color stay consistent.
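
A rough sketch of that backfill, under stated assumptions: the `Expert` and `BoardMessage` shapes, the `backfillExpertIdentity` helper, and matching experts by an `expert_name` field are hypothetical. The PR only states that missing `expert_avatar` / `expert_color` values are filled from `boardState.experts`.

```typescript
// Hypothetical shapes; the real types in the codebase may differ.
interface Expert {
  name: string;
  avatar: string;
  color: string;
}

interface BoardMessage {
  message_type: string;
  expert_name?: string; // assumed lookup key, not confirmed by the PR
  expert_avatar?: string;
  expert_color?: string;
}

// Return a copy of the message with missing avatar/color filled from the
// expert list. Messages of other types, or without a matching expert,
// are returned unchanged.
function backfillExpertIdentity(msg: BoardMessage, experts: Expert[]): BoardMessage {
  const isBoardMessage =
    msg.message_type === "board_speech" || msg.message_type === "board_summary";
  if (!isBoardMessage) return msg;
  if (msg.expert_avatar && msg.expert_color) return msg;

  const expert = experts.find((e) => e.name === msg.expert_name);
  if (!expert) return msg;

  return {
    ...msg,
    // Keep any value already persisted on the message; fall back otherwise.
    expert_avatar: msg.expert_avatar || expert.avatar,
    expert_color: msg.expert_color || expert.color,
  };
}
```

Values already present on the message win over the fallback, so persisted metadata is never overwritten.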

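Bug 2: ReAct _build_tool_use_prompt L0 rule adjustment

The original rule 3, "如果不需要工具就能回答,直接回答即可" ("If you can answer without tools, just answer directly"), gave the LLM a loophole to skip tool use, so it did not call tools for complex requests such as GitHub Trending.

Adjustments:

  • Add rule 1: tools must be used when the request involves external information, real-time data, multi-step analysis, or facts the model is unsure about
  • Demote original rule 3 to rule 4: a direct answer is allowed only when tools are genuinely unnecessary
  • base_prompt and the tool descriptions are unchanged (L1/L2 are split into a separate plan)

Bug 2 status: hypothesis applied, pending L4 verification (not yet fixed).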

Test Coverage

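  • Frontend: 5 transient state reset matrix tests (chatStore.transient-state.test.ts)
    • createConversation resets all three states
    • selectConversation resets all three states when switching conversations
    • selectConversation does not reset on a force-reload of the same conversation
    • deleteConversation resets all three states for the current conversation
    • deleteConversation leaves the three states untouched for a non-current conversation
  • Backend: 6 prompt rules assertions (test_react_engine.py)
    • TestReActToolUsePromptRules (4 tests): new rule 1 is present, old rule 3 is gone, rule order is correct, XML format is preserved
    • TestBug2L0PromptRules (2 tests): the web_search tool description is in the prompt, new rule 1 is present when tools are provided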

Plan

docs/plans/2026-07-02-002-fix-transient-state-reset-and-react-tool-guidance-plan.md

Validation

  • ruff check src/agentkit/core/react.py — clean
  • npm run typecheck — clean
  • Frontend unit tests 5/5 pass
  • Backend unit tests 6/6 pass

Notes

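  • The agent-browser CLI is not installed, so the ce-test-browser step was skipped; the changes are covered by 11 unit tests
  • Bug 2 L1 (tool description expansion) and L2 (enabling PLAN_EXEC) are split into a separate plan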
fischer added 1 commit 2026-07-02 20:53:08 +08:00
Test / backend-test (pull_request) Has been cancelled Details
Test / frontend-unit (pull_request) Has been cancelled Details
Test / api-e2e (pull_request) Has been cancelled Details
Test / frontend-e2e (pull_request) Has been cancelled Details
7376005868
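fix: unify transient state reset behavior + ReAct tool-call rules
Bug 1: three chatStore actions reset boardState/debateState/collaborationState
- createConversation: add reset of all three states (previously missing, so stale private board state leaked into new conversations)
- selectConversation: unify to a conditional reset (prevConvId !== id) so a force-reload does not clear state by mistake
- deleteConversation: add the missing collaborationState reset
- Also: in selectConversation, board_speech/board_summary messages missing
  expert_avatar/expert_color are backfilled from boardState.experts

Bug 2: ReAct _build_tool_use_prompt L0 rule adjustment
- Add rule 1: tools must be used for external information / real-time data / multi-step analysis / uncertain facts
- Demote original rule 3 to rule 4, narrowed to allow a direct answer only when tools are genuinely unnecessary
- base_prompt and tool descriptions unchanged (L1/L2 split into a separate plan)

Tests: 5 frontend transient-state reset matrix + 6 backend prompt rules assertions

Plan: docs/plans/2026-07-02-002-fix-transient-state-reset-and-react-tool-guidance-plan.md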
fischer added 10 commits 2026-07-02 22:13:55 +08:00
9e2ccf5ac9 chore: gitignore .understand-anything (local knowledge graph index)
The .understand-anything/ directory is a tool-generated local index,
not project code. Remove 4 tracked files from index and add to .gitignore.
754d70623c refactor(experts): replace brand colors with neutral grayscale palette
Update color field in 15 expert YAML configs to use neutral grayscale
and deep accent tones (gray 400-800, stone, amber, dark blue/green),
consistent with the expertIdentity.ts PALETTE and the project convention
for GitHub-style neutral UI coloring.
484b7ddb95 fix(dev): isolate dev environment ports and fix env loading
- docker-compose.yaml: production mode uses expose (container-only) for
  Redis/PostgreSQL instead of ports (host-mapped)
- docker-compose.dev.yml: dev override maps Redis 6381 and PostgreSQL 5435
  to avoid conflicts with other projects (pms-redis 6379, geo_redis 6380,
  geo_db 5433)
- config.py: fix empty env var handling — only skip .env override when
  os.environ[key] is non-empty; load .env, .env.dev, .env.local in sequence
- scripts/dev-start.sh: manage agentkit-specific Docker containers
- .gitignore: add .env.dev and .env.local (contain API keys)
32746652aa fix(board): persist moderator avatar/color in round_summary events
board_orchestrator.py: include moderator_avatar and moderator_color in
the round_summary event payload so downstream consumers have the
moderator's identity metadata.

chat.py: persist expert_avatar and expert_color from the event data into
the board_summary message metadata, ensuring avatar/color survive page
reload instead of falling back to defaults.
8188e8861d feat(ui): scheme B neutral grayscale for board messages + assistant bubbles
expertIdentity.ts PALETTE -> neutral grayscale; useMessageRenderer.ts removes assistant fallback for board_* events; BoardRoundCard/MessageShell apply GitHub-style gray; chatStream.ts prefers event-provided moderator avatar/color; StickyModeHeader/Scene4/LoginView/types aligned.
96f459c27d docs: add brainstorm/plan decision artifacts + plan progress update
Add ce-brainstorm requirements doc and ce-plan plan doc for private board restrictions and scheme B bubbles (decision artifacts). Update 2026-07-02-002 plan with U6/U7 progress table. Add .compound-engineering/config.local.example.yaml from ce-setup. gitignore tmp_*.html and delete_old_cluster.sh.
b98e7cb42f test: update login test to expect standardized port 18001
The test was asserting port 8001 (old default) but config.py now loads .env.dev which sets AGENTKIT_SERVER_PORT=18001 per the project port standardization (18001/18002/15173/15174).
44f4f1c46f fix: add null check for chatStore.conversations in StickyModeHeader
Optional chaining prevents TypeError when test mocks don't provide conversations array.
53347ed1fe test(u6): add L4 real-LLM smoke test for ReAct tool-use prompt
Manual smoke test verifying U4 L0 prompt rule rearrangement under real
LLM calls (bailian-coding/qwen3.7-plus). 5 probe queries covering
external_info / realtime_data / multi_step / realtime_simple / no_tool.

Results:
- Probe #1 external_info: PASS (8 web_search calls, 99.9s)
- Probe #2 realtime_data: ERROR (120s timeout, not LLM refusal)
- Probe #3 multi_step: PASS (8 web_search calls, 62.6s)
- Probe #4 realtime_data_simple: PASS (3 web_search calls, 23.8s)
- Probe #5 no_tool_escape_hatch: PASS (0 tool calls, direct answer, 4.2s)

Verdict: 3/4 tool-call pass (>=3/4 threshold) + 1/1 direct pass
Bug 2 status upgraded to 'L4 verified'.

Plan Progress table updated: U6 done, U7 done.
Test / backend-test (pull_request) Has been cancelled Details
Test / frontend-unit (pull_request) Has been cancelled Details
Test / api-e2e (pull_request) Has been cancelled Details
Test / frontend-e2e (pull_request) Has been cancelled Details
23160be055
fix(types): resolve 3 pre-existing typecheck errors in transient-state test
message_type: 'board_started' as const (line 93) fixes TS2322 on lines
107 and 122 — TypeScript was inferring message_type as string instead
of the literal 'board_started'.

boardState local variable: replace 'as never' with proper shape +
'status: discussing' as const (line 159-160) fixes TS2339 on line 168
where .topic was accessed on type 'never'.

All 5 transient-state tests still pass. vue-tsc --noEmit now clean.
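
The literal-widening issue behind the TS2322 fix can be illustrated with a small standalone sketch. The `MessageType` union, the `Msg` interface, and the `label` helper are hypothetical and only mirror the shape of the problem described in the commit message.

```typescript
// Hypothetical union standing in for the project's message type field.
type MessageType = "board_started" | "board_speech" | "board_summary";

interface Msg {
  id: string;
  message_type: MessageType;
}

function label(m: Msg): string {
  return `${m.id}:${m.message_type}`;
}

// Without a const assertion the property is inferred as plain `string`,
// so passing this object where Msg is expected fails with TS2322.
const widened = { id: "m1", message_type: "board_started" };
// label(widened); // error TS2322: type 'string' is not assignable to MessageType
console.log(widened.message_type.toUpperCase()); // usable only as a generic string

// With `as const` on the literal, the property keeps its literal type
// and the object is assignable to Msg.
const literal = { id: "m1", message_type: "board_started" as const };
console.log(label(literal));
```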
fischer closed this pull request 2026-07-02 22:26:01 +08:00
Reference: fischer/fischer-agentkit#17