fischer-agentkit

chiguyong 53347ed1fe test(u6): add L4 real-LLM smoke test for ReAct tool-use prompt	2026-07-02 22:08:45 +08:00

Manual smoke test verifying U4 L0 prompt rule rearrangement under real LLM calls (bailian-coding/qwen3.7-plus). 5 probe queries covering external_info / realtime_data / multi_step / realtime_simple / no_tool.

Results:
- Probe #1 external_info: PASS (8 web_search calls, 99.9s)
- Probe #2 realtime_data: ERROR (120s timeout, not LLM refusal)
- Probe #3 multi_step: PASS (8 web_search calls, 62.6s)
- Probe #4 realtime_data_simple: PASS (3 web_search calls, 23.8s)
- Probe #5 no_tool_escape_hatch: PASS (0 tool calls, direct answer, 4.2s)

Verdict: 3/4 tool-call pass (>=3/4 threshold) + 1/1 direct pass

Bug 2 status upgraded to 'L4 verified'. Plan Progress table updated: U6 done, U7 done.

brainstorms	docs: add brainstorm/plan decision artifacts + plan progress update	2026-07-02 21:27:20 +08:00
plans	test(u6): add L4 real-LLM smoke test for ReAct tool-use prompt	2026-07-02 22:08:45 +08:00
research	feat: private board meeting discussion mode + backtest integration + WS persistence fix	2026-06-17 23:52:53 +08:00
residual-review-findings	feat: UI/UE enhancement — streaming, sticky header, hover actions, calendar tokens	2026-07-01 12:51:45 +08:00
solutions	docs: compound streaming-event-contract-residuals learning	2026-07-01 13:53:10 +08:00
DEPLOYMENT-GITEA-ACTIONS.md	feat: private board meeting discussion mode + backtest integration + WS persistence fix	2026-06-17 23:52:53 +08:00
GEO-INTEGRATION-GUIDE.md	feat: accumulated frontend enhancements, docs, and static assets	2026-06-14 16:35:01 +08:00