| title | date | type | status | origin | execution |
|---|---|---|---|---|---|
| feat: Agent Wave 3 strategic coupling (G5/G6) | 2026-06-29 | feat | draft | docs/brainstorms/2026-06-29-advanced-agent-gap-optimization-requirements.md | code |
Wave 3 Strategic Coupling — G5 Function-level Sharding + G6 SOLO Phase Constraints
Summary
Wave 3 of the advanced-agent gap optimization closes two strategic gaps deferred from Waves 1-2:
- G5: function-level code sharding (R22, R23). File reading gains an optional symbol parameter for symbol/function-granularity slicing, backward compatible with full-file reads.
- G6: SOLO four-stage state machine (R24, R25). The ReAct loop enforces per-phase tool whitelists (Planning → Building → Verification → Delivery). Extends the existing ExecutionMode.PLAN_EXEC rather than introducing a new mode.
Wave 1 (G1/G2/G3/G8 — PR #4 merged) and Wave 2 (G4/G7/G9 — PR #5 open) shipped independently. Wave 3 is the strategic-risk wave: it introduces a new tool (G5) and touches ReAct core (G6). Per the brainstorm's KTD6/KTD7 locked decisions, G5 integration approach is decided here (not deferred further), and G6 extends PLAN_EXEC rather than adding a new mode.
Problem Frame
The brainstorm (docs/brainstorms/2026-06-29-advanced-agent-gap-optimization-requirements.md) identified nine gaps across three dimensions. Waves 1-2 closed seven (G1-G4, G7-G9). The remaining two gaps are strategic:
- G5 (Long-context cost): Large files (e.g. 5000-line modules) blow the context budget when the agent only needs one function. Today the agent shells out to cat file.py or greps; both pull the whole file into context. A symbol-aware slice would cut context cost 10-50x for typical edits.
- G6 (Unsafe tool sequencing): The ReAct loop lets the LLM call write_file during early exploration, before committing to a plan. This wastes tokens on premature edits, causes half-baked refactors, and breaks the "plan-then-build" discipline that production agents (Qoder, Trae Work) enforce via phase state machines.
KTD6 (brainstorm) defers the G5 integration decision to this plan. KTD7 locks G6 to extend PLAN_EXEC rather than introduce a new mode.
Requirements
Carried forward from the brainstorm, Wave 3 section:
- R22: File reading supports symbol/function granularity sharding.
- R23: Sharding capability exposed as a tool parameter (symbol="function_name"), backward compatible with full-file reads.
- R24: ReAct loop enforces phase constraints: the Planning phase only allows think/search; the Building phase only allows write_file (and similar write tools).
- R25: Phase state is configurable; extends ExecutionMode.PLAN_EXEC, does NOT introduce a new mode.
Cross-cutting (brainstorm):
- R26: All optimizations configurable via agentkit.yaml (follow the ServerConfig.from_dict pattern established in Waves 1-2).
- R27: Each optimization ships a minimal self-check test (ponytail rule).
Acceptance examples relevant to Wave 3:
- AE5 already covered by Wave 2 (G9) — not in Wave 3 scope.
- No explicit AE for G5/G6 in the brainstorm — the plan below specifies test scenarios as the acceptance contract.
Key Technical Decisions
KTD1: G5 uses Python ast + language-aware regex, NOT tree-sitter
Decision: Implement symbol extraction with the Python stdlib ast module for Python files, and a small regex-based extractor for TypeScript/JavaScript/Go/Rust/Java. No new dependency.
Rationale:
- tree-sitter requires native compilation plus per-language grammar files (~30MB installed). That violates the ponytail rule "no new dependency if it can be avoided" and AGENTS.md "use of the any type is prohibited" → prefers minimal stack.
- ast is stdlib, always available, and parses Python accurately.
- The regex extractor covers the 80% case for TS/JS/Go/Rust/Java (function/class/struct declarations) and falls back gracefully to "no symbols found → read full file".
- If a future Wave 4 needs more accurate multi-language parsing, tree-sitter can replace the regex layer behind the same SymbolExtractor interface.
Upgrade path: replace RegexSymbolExtractor with TreeSitterSymbolExtractor implementing the same SymbolExtractor protocol; no caller changes.
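The interface named in KTD1 could look roughly like the sketch below. It is a minimal illustration under assumptions, not the planned implementation: the SymbolSpan field layout follows the U1 description, while the decorator handling and the "return an empty list on a syntax error" behavior are choices made here for the example.

```python
import ast
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class SymbolSpan:
    name: str
    kind: str        # "function", "class", or "method"
    start_line: int  # 1-indexed, inclusive
    end_line: int    # 1-indexed, inclusive


class SymbolExtractor(Protocol):
    def extract_symbols(self, content: str, language: str) -> list[SymbolSpan]: ...


class AstSymbolExtractor:
    """Python-only extractor built on the stdlib ast module."""

    def extract_symbols(self, content: str, language: str) -> list[SymbolSpan]:
        if language != "python":
            return []
        try:
            tree = ast.parse(content)
        except SyntaxError:
            return []  # unparsable source: caller falls back to a full-file read
        spans: list[SymbolSpan] = []
        self._collect(tree, in_class=False, spans=spans)
        return spans

    def _collect(self, node: ast.AST, in_class: bool, spans: list[SymbolSpan]) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.ClassDef):
                spans.append(self._span(child, "class"))
                self._collect(child, in_class=True, spans=spans)
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                spans.append(self._span(child, "method" if in_class else "function"))
                self._collect(child, in_class=False, spans=spans)
            else:
                self._collect(child, in_class=in_class, spans=spans)

    @staticmethod
    def _span(node, kind: str) -> SymbolSpan:
        # Assumption: start at the first decorator so the slice includes decorators.
        start = min([node.lineno] + [d.lineno for d in node.decorator_list])
        return SymbolSpan(node.name, kind, start, node.end_lineno)
```

Because callers depend only on extract_symbols, a tree-sitter-backed class with the same method could be swapped in later without touching them, which is the upgrade path described above.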
KTD2: G5 adds a new ReadFileTool, does not extend ShellTool
Decision: Add a dedicated ReadFileTool in src/agentkit/tools/file_read.py with path + optional symbol + optional start_line/end_line parameters.
Rationale:
- ShellTool is for shell execution; grafting symbol extraction onto it muddies the contract.
- A dedicated tool gives the LLM a clear schema ({"path": "...", "symbol": "function_name"}) and a focused system-prompt description.
- Aligns with the existing _DEFAULT_CORE_TOOLS list in core/react.py:148, which already references read_file: the name is reserved but the implementation is missing.
KTD3: G6 phase state machine lives in ReActEngine, not in the skill config
Decision: Phase state (Planning/Building/Verification/Delivery) is tracked as a mutable field on ReActEngine instance. Transitions are driven by LLM-detected phase-completion signals (e.g., the LLM emits Phase: Building in its thinking) OR by an explicit advance_phase tool.
Rationale:
- Skill config declares the policy (which tools per phase, auto-advance vs manual); the engine enforces it per-step. This matches R24 ("add phase constraints to the ReAct loop").
- Alternative considered: phase state in the agent instance (not engine). Rejected because ReActEngine already owns max_steps/verification_enabled etc.; phase state belongs with the loop that enforces it.
KTD4: PLAN_EXEC mode is wired at chat.py WebSocket path (REST already has fallback chain from Wave 2)
Decision: chat.py:1084 (currently warns "not yet supported, falling back to REACT") will route ExecutionMode.PLAN_EXEC to a new _execute_plan_exec_ws handler that constructs PhasePolicy from ServerConfig.plan_exec and passes it to ReActEngine.execute.
Rationale:
- REST send_message already uses the Wave 2 three-tier fallback chain; PLAN_EXEC at REST would also need that wrapper. Out of scope for Wave 3: only the WebSocket path is wired. REST PLAN_EXEC remains "not yet supported" and explicitly raises if invoked.
- A single integration point keeps Wave 3 scope bounded; REST wiring is a one-line follow-up once the WebSocket path is proven.
KTD5: Default phase whitelist matches brainstorm R24
Decision: Default whitelist:
- Planning: search, tool_search, read_file, bash (read-only commands like git status, ls)
- Building: write_file, bash (write commands), read_file, search
- Verification: bash (test commands), read_file, search
- Delivery: all tools (final synthesis)
Rationale:
- R24 explicitly names think/search for Planning and write_file for Building.
- bash is split: read-only in Planning, full in Building. Enforced by adding a bash_command_filter callback (regex-based, blocks rm/mv/>/>> in Planning/Verification).
- Delivery allows all tools to support last-mile formatting/cleanup.
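A regex-based bash_command_filter along the lines described above might be sketched as follows. The pattern and the function name is_bash_command_allowed are hypothetical; the plan leaves the exact regex to the implementer. Note that the redirect operators are matched outside the word-boundary group, because > is not a word character and \b would not anchor around it.

```python
import re
from typing import Optional

# Hypothetical default: block rm/mv as whole words, and any output redirect (> or >>).
READ_ONLY_FILTER = re.compile(r"\b(?:rm|mv)\b|>")


def is_bash_command_allowed(command: str, blocked: Optional[re.Pattern]) -> bool:
    """Return True when the command does not match the phase's blocked pattern."""
    return blocked is None or blocked.search(command) is None


# Planning and Verification would pass READ_ONLY_FILTER; Building would pass None.
print(is_bash_command_allowed("git status", READ_ONLY_FILTER))
print(is_bash_command_allowed("echo hi >> notes.txt", READ_ONLY_FILTER))
```

This filter is deliberately conservative: it also rejects harmless commands that merely contain a > character (for example a quoted grep pattern or 2>&1), which may or may not be acceptable for the read-only phases.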
KTD6: Phase transitions are LLM-driven via advance_phase tool (opt-in auto-advance)
Decision: Add an AdvancePhaseTool that the LLM can call to transition Planning→Building→Verification→Delivery. Auto-advance (after N steps in current phase) is opt-in via plan_exec.auto_advance_after_steps.
Rationale:
- LLM-driven transitions match Qoder/Trae Work pattern: LLM declares "planning done" explicitly.
- Auto-advance is a safety net for LLMs that forget to call advance_phase; default off (ponytail: less code is better).
Scope Boundaries
In Scope
- New ReadFileTool with symbol parameter (G5, R22/R23).
- SymbolExtractor protocol + AstSymbolExtractor (Python) + RegexSymbolExtractor (TS/JS/Go/Rust/Java).
- PhasePolicy dataclass + PhaseState enum + per-step tool whitelist enforcement in ReActEngine.execute (G6, R24/R25).
- AdvancePhaseTool for LLM-driven phase transitions.
- WebSocket chat path routes PLAN_EXEC to the new _execute_plan_exec_ws handler (KTD4).
- plan_exec config section in agentkit.yaml + ServerConfig.from_dict extension (R26).
- Tests for each new module (R27).
Out of Scope (Deferred to Follow-Up Work)
- REST send_message PLAN_EXEC wiring: once the WebSocket path is proven, REST wiring is a follow-up commit.
- tree-sitter integration for more accurate multi-language parsing (KTD1 upgrade path).
- Phase-aware prompt engineering (per-phase system prompt templates): the current plan keeps a single system prompt; phase-specific guidance is a prompt-engineering concern, not a code change.
- Phase persistence across session resume (U7 checkpoint already saves plan state; phase state restoration is a separate concern).
- Phase rollback on Building→Planning regression (Wave 2 G9 rollback handles file-level rollback; phase regression is a UX/prompt concern).
- Tool-filter UI in the frontend (Wave 3 ships backend-only; frontend surfaces phase via existing event channel if needed in a follow-up).
Outside This Product's Identity
- Replacing the existing ReAct loop with LangGraph (inherited from brainstorm).
- Disc-based file system à la DeerFlow (inherited).
- Docker sandbox (inherited; only command-level safety via bash_command_filter).
High-Level Technical Design
flowchart LR
subgraph G5[Function Sharding]
RF[ReadFileTool] --> SE[SymbolExtractor protocol]
SE --> AST[AstSymbolExtractor<br/>Python stdlib ast]
SE --> RX[RegexSymbolExtractor<br/>TS/JS/Go/Rust/Java]
end
subgraph G6[Phase State Machine]
PP[PhasePolicy config] --> PS[PhaseState enum<br/>Planning/Building/Verify/Delivery]
PS --> Filt[Tool filter per step]
Filt --> RE[ReActEngine.execute]
AP[AdvancePhaseTool] -->|transitions| PS
end
RF -->|file content for symbol| RE
RE -->|enforces| Filt
The two subsystems compose at the ReAct engine boundary: ReadFileTool is one of the tools the LLM can call during any phase (filtered by PhasePolicy); PhaseState is enforced at the tool-call step before dispatch.
Implementation Units
U1. SymbolExtractor + ReadFileTool (G5)
Goal: Add ReadFileTool with optional symbol parameter; implement SymbolExtractor protocol with AstSymbolExtractor (Python) and RegexSymbolExtractor (TS/JS/Go/Rust/Java).
Requirements: R22, R23, R27.
Dependencies: none.
Files:
- src/agentkit/tools/file_read.py (new)
- src/agentkit/tools/symbol_extractor.py (new)
- src/agentkit/tools/__init__.py (modify: register ReadFileTool)
- tests/unit/test_symbol_extractor.py (new)
- tests/unit/test_read_file_tool.py (new)
Approach:
- SymbolExtractor is a Protocol with one method: extract_symbols(content: str, language: str) -> list[SymbolSpan]. SymbolSpan carries name, kind (function/class/method/struct), start_line, end_line.
- AstSymbolExtractor walks ast.parse(content); for each FunctionDef/AsyncFunctionDef/ClassDef it collects the name and line range. Ranges come from each node's lineno/end_lineno (the same positions ast.get_source_segment relies on), so the API stays line-based rather than node-based.
- RegexSymbolExtractor ships patterns for TS/JS (function X, const X = (...) =>, class X), Go (func X), Rust (fn X, struct X, impl X), Java (public ... X(...)). Falls back to "no symbols" if no pattern matches.
- ReadFileTool.execute(path, symbol=None, start_line=None, end_line=None):
  - symbol=None → read full file (backward compat with the existing _FakeTool benchmark shape).
  - symbol="foo" → detect language from extension; call extract_symbols; return the line range of the first matching symbol; if no match, return an error result with available symbol names listed (so the LLM can retry).
  - start_line/end_line override symbol; this allows manual slicing.
- Tool registered as read_file (matches the reserved name in core/react.py:148).
Execution note: characterization-first — write a test that asserts the tool returns the full file content when symbol=None (matches pre-existing benchmark _FakeTool shape) before adding symbol-extraction behavior.
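The parameter resolution described in the Approach can be sketched as a pure function. This is an illustration only: read_slice and the dict-shaped result are hypothetical names and shapes, and the sketch assumes an explicit start_line or end_line simply replaces the corresponding bound of the symbol's range, then clamps to the file.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SymbolSpan:
    name: str
    start_line: int  # 1-indexed, inclusive
    end_line: int    # 1-indexed, inclusive


def read_slice(
    content: str,
    spans: list[SymbolSpan],
    symbol: Optional[str] = None,
    start_line: Optional[int] = None,
    end_line: Optional[int] = None,
) -> dict:
    """Resolve symbol/start_line/end_line into a slice of the file content."""
    if symbol is None and start_line is None and end_line is None:
        return {"content": content}  # backward-compatible full-file read

    lines = content.splitlines()
    lo, hi = 1, len(lines)
    if symbol is not None:
        match = next((s for s in spans if s.name == symbol), None)
        if match is None:
            names = ", ".join(s.name for s in spans)
            return {"error": f"Symbol '{symbol}' not found. Available symbols: {names}"}
        lo, hi = match.start_line, match.end_line

    # Assumption: explicit bounds override the symbol's bounds.
    if start_line is not None:
        lo = start_line
    if end_line is not None:
        hi = end_line
    lo, hi = max(1, lo), min(len(lines), hi)
    return {"content": "\n".join(lines[lo - 1:hi]), "start_line": lo, "end_line": hi}
```

Keeping the resolution free of file I/O makes the characterization test (symbol=None returns the full content) trivial to write before any extractor exists.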
Patterns to follow:
- src/agentkit/tools/document_tool.py for tool structure (dataclass, Tool base class, input_schema).
- src/agentkit/tools/schema_tools.py:SchemaExtractTool for the "extract-from-source" pattern.
Test scenarios (covers R22, R23):
- Happy paths:
  - Python file, symbol="MyClass" → returns class body only (lines from class MyClass: through end of class).
  - Python file, symbol="my_func" → returns function body only.
  - TypeScript file, symbol="renderComponent" → returns arrow/function body.
  - Go file, symbol="HandleRequest" → returns func body.
- Edge cases:
  - symbol=None → returns full file content (characterization).
  - symbol="nonexistent" → returns error result listing available symbols ("Available symbols: foo, bar, baz").
  - Unsupported file extension (.md, .txt) → returns full file with note: symbol extraction not supported for .md.
  - Empty file → returns empty content.
  - File with nested classes → outer class symbol returns including inner class.
- Error paths:
  - Path does not exist → raises FileNotFoundError (or returns error result matching other tools' convention).
  - Path is a directory → returns error result.
  - Permission denied → returns error result.
- Integration scenarios:
  - Symbol extraction + line slicing: symbol="foo", end_line=50 truncates at line 50 even if symbol extends further.
  - Round-trip: extract symbol, write back via ShellTool sed (not in scope for the tool; just verify the extracted range is well-formed).
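The TypeScript and Go scenarios above could be served by a regex layer shaped like the sketch below. Every pattern here is a placeholder invented for illustration (the plan defers the exact patterns to implementation), and the end of a symbol is found by naive brace counting that ignores braces inside strings and comments.

```python
import re

# Hypothetical declaration patterns; group 1 captures the symbol name.
_PATTERNS = {
    "go": [re.compile(r"^func\s+(?:\([^)]*\)\s*)?(\w+)\s*\(")],
    "typescript": [
        re.compile(r"^(?:export\s+)?(?:async\s+)?function\s+(\w+)"),
        re.compile(r"^(?:export\s+)?class\s+(\w+)"),
        re.compile(r"^(?:export\s+)?const\s+(\w+)\s*=\s*(?:async\s*)?\(.*\)\s*=>"),
    ],
}


def _block_end(lines: list[str], start_idx: int) -> int:
    """Return the 1-indexed line where the block opened at start_idx closes."""
    depth = 0
    opened = False
    for idx in range(start_idx, len(lines)):
        opens, closes = lines[idx].count("{"), lines[idx].count("}")
        opened = opened or opens > 0
        depth += opens - closes
        if opened and depth <= 0:
            return idx + 1
    return len(lines)  # unbalanced braces: run to end of file


def extract_symbols(content: str, language: str) -> list[tuple[str, int, int]]:
    """Return (name, start_line, end_line); an empty list means read the full file."""
    lines = content.splitlines()
    found = []
    for idx, line in enumerate(lines):
        for pattern in _PATTERNS.get(language, []):
            match = pattern.match(line)
            if match:
                found.append((match.group(1), idx + 1, _block_end(lines, idx)))
                break
    return found
```

Only top-level declarations that start at column zero are matched, so indented methods are skipped; that limitation is consistent with the "80% case" framing and the graceful fallback to a full-file read.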
Verification:
- python3 -m pytest tests/unit/test_symbol_extractor.py tests/unit/test_read_file_tool.py -q passes.
- ruff check src/agentkit/tools/file_read.py src/agentkit/tools/symbol_extractor.py clean.
- ReadFileTool appears in ToolRegistry.list_tools() after registration.
U2. PhasePolicy + PhaseState + ServerConfig (G6 core)
Goal: Add PhasePolicy dataclass, PhaseState enum, default whitelist config. Extend ServerConfig.from_dict with plan_exec section. Wire config to agentkit.yaml.
Requirements: R25, R26.
Dependencies: none.
Files:
- src/agentkit/core/phase.py (new): PhaseState enum, PhasePolicy dataclass, default_policy() factory.
- src/agentkit/server/config.py (modify: add plan_exec field + from_dict parsing).
- agentkit.yaml (modify: document plan_exec: section).
- tests/unit/test_phase_policy.py (new).
Approach:
- PhaseState = enum("planning building verification delivery").
- PhasePolicy carries:
  - whitelist: dict[PhaseState, set[str]] (tool names allowed per phase).
  - bash_command_filter: dict[PhaseState, re.Pattern | None] (regex that bash args must NOT match, e.g., r"\b(rm|mv)\b|>" in Planning; the redirect sits outside the \b group because > is not a word character).
  - auto_advance_after_steps: int | None (None = manual, the LLM calls advance_phase; int = auto-advance after N steps).
  - start_phase: PhaseState = PhaseState.PLANNING.
- default_policy() returns the KTD5 whitelist above.
- ServerConfig.from_dict reads the plan_exec section: enabled, whitelist_override (dict), auto_advance_after_steps.
- agentkit.yaml gains a commented-out plan_exec: block (commented to preserve default behavior; opt-in).
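A compact sketch of the data model described above, assuming the KTD5 defaults and the "*" wildcard from the test scenarios. The fail-fast check in __post_init__ and the shared read-only regex are illustrative choices, not settled design.

```python
import re
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class PhaseState(Enum):
    PLANNING = "planning"
    BUILDING = "building"
    VERIFICATION = "verification"
    DELIVERY = "delivery"


@dataclass
class PhasePolicy:
    whitelist: dict[PhaseState, set[str]]
    bash_command_filter: dict[PhaseState, Optional[re.Pattern]] = field(default_factory=dict)
    auto_advance_after_steps: Optional[int] = None
    start_phase: PhaseState = PhaseState.PLANNING

    def __post_init__(self) -> None:
        # Fail fast: every phase needs a non-empty whitelist.
        for phase in PhaseState:
            if not self.whitelist.get(phase):
                raise ValueError(f"Empty whitelist for phase '{phase.value}'")

    def is_tool_allowed(self, tool_name: str, phase: PhaseState) -> bool:
        allowed = self.whitelist[phase]
        return "*" in allowed or tool_name in allowed


def default_policy() -> PhasePolicy:
    read_only = re.compile(r"\b(?:rm|mv)\b|>")  # hypothetical default pattern
    return PhasePolicy(
        whitelist={
            PhaseState.PLANNING: {"search", "tool_search", "read_file", "bash"},
            PhaseState.BUILDING: {"write_file", "bash", "read_file", "search"},
            PhaseState.VERIFICATION: {"bash", "read_file", "search"},
            PhaseState.DELIVERY: {"*"},
        },
        bash_command_filter={
            PhaseState.PLANNING: read_only,
            PhaseState.VERIFICATION: read_only,
        },
    )
```

One open point this sketch surfaces: advance_phase itself is not on any whitelist, so the enforcing loop would need to either exempt it or add it to every phase.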
Patterns to follow:
- src/agentkit/core/fallback.py for the dataclass + classmethod factory pattern.
- src/agentkit/server/config.py from_dict extension template (established in Wave 1 for prompt_cache/streaming/verification; Wave 2 added rollback/fallback_chain; Wave 3 adds plan_exec).
Test scenarios (covers R25, R26):
- Happy paths:
  - default_policy() returns a policy with all four phases; Planning whitelist contains search, read_file; Building contains write_file.
  - PhasePolicy.is_tool_allowed("search", PhaseState.PLANNING) returns True.
  - PhasePolicy.is_tool_allowed("write_file", PhaseState.PLANNING) returns False.
  - PhasePolicy.is_tool_allowed("write_file", PhaseState.BUILDING) returns True.
- Edge cases:
  - Empty whitelist for a phase → all tools rejected (raises ValueError at construction time; fail-fast).
  - Delivery phase whitelist contains "*" (wildcard) → all tools allowed.
  - Custom whitelist override merges with default (override wins on conflict).
- Error paths:
  - Invalid phase name in config → ValueError with message naming the bad value.
  - bash_command_filter regex compile failure → ValueError.
- Config integration:
  - ServerConfig.from_dict({"plan_exec": {"enabled": True, "auto_advance_after_steps": 5}}) populates fields correctly.
  - ServerConfig.from_dict({}) → plan_exec = {} (default).
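The config-integration scenarios could be exercised against a helper like the one below. parse_plan_exec is a hypothetical stand-in, not ServerConfig.from_dict itself, and it assumes "override wins on conflict" means a phase listed in whitelist_override replaces that phase's default set while unlisted phases keep their defaults.

```python
from typing import Any

VALID_PHASES = ("planning", "building", "verification", "delivery")

# KTD5 defaults, keyed by phase name as it would appear in agentkit.yaml.
DEFAULT_WHITELIST = {
    "planning": {"search", "tool_search", "read_file", "bash"},
    "building": {"write_file", "bash", "read_file", "search"},
    "verification": {"bash", "read_file", "search"},
    "delivery": {"*"},
}


def parse_plan_exec(raw: dict[str, Any]) -> dict[str, Any]:
    """Resolve the plan_exec section of a config dict into effective settings."""
    section = raw.get("plan_exec") or {}

    whitelist = {phase: set(tools) for phase, tools in DEFAULT_WHITELIST.items()}
    for phase, tools in (section.get("whitelist_override") or {}).items():
        if phase not in VALID_PHASES:
            raise ValueError(f"Invalid phase name in plan_exec.whitelist_override: {phase!r}")
        whitelist[phase] = set(tools)  # assumption: override replaces the default set

    steps = section.get("auto_advance_after_steps")
    if steps is not None and (isinstance(steps, bool) or not isinstance(steps, int) or steps < 1):
        raise ValueError(f"auto_advance_after_steps must be a positive int, got {steps!r}")

    return {
        "enabled": bool(section.get("enabled", False)),  # opt-in by default
        "whitelist": whitelist,
        "auto_advance_after_steps": steps,
    }
```

Validating phase names at parse time gives the "ValueError with message naming the bad value" behavior listed under error paths before any engine is constructed.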
Verification:
- python3 -m pytest tests/unit/test_phase_policy.py -q passes.
- ruff check src/agentkit/core/phase.py clean.
U3. AdvancePhaseTool + ReActEngine phase enforcement (G6 wiring)
Goal: Add AdvancePhaseTool. Wire PhasePolicy into ReActEngine.execute so each tool-call step checks is_tool_allowed(tool_name, current_phase) before dispatch; blocked calls return a structured error to the LLM ("Tool 'write_file' not allowed in Planning phase — call advance_phase first").
Requirements: R24.
Dependencies: U2.
Files:
- src/agentkit/tools/advance_phase.py (new): AdvancePhaseTool calls react_engine.advance_phase().
- src/agentkit/core/react.py (modify: add phase_policy param to __init__ + execute; add _current_phase field; add advance_phase() method; enforce in _execute_loop).
- tests/unit/test_react_phase_enforcement.py (new).
Approach:
- ReActEngine.__init__ accepts phase_policy: PhasePolicy | None = None. None = no enforcement (backward compat; all existing callers unaffected).
- _current_phase: PhaseState | None is initialized from phase_policy.start_phase if a policy is set, else None.
- advance_phase() advances _current_phase to the next enum value; raises ValueError if already at DELIVERY.
- In _execute_loop, before dispatching a tool call:

      if self._phase_policy is not None and self._current_phase is not None:
          if not self._phase_policy.is_tool_allowed(tool_name, self._current_phase):
              # Record this structured error in the conversation; do NOT dispatch the tool.
              # This counts as a "step" for max_steps purposes.
              observation = {
                  "error": "phase_violation",
                  "message": f"Tool '{tool_name}' not allowed in {self._current_phase.value} phase",
                  "current_phase": self._current_phase.value,
                  "hint": "Call advance_phase to move to the next phase",
              }
              continue  # next loop iteration

- Auto-advance: if phase_policy.auto_advance_after_steps is set and _steps_in_phase >= auto_advance_after_steps, call advance_phase() automatically.
- AdvancePhaseTool.execute() calls the bound engine's advance_phase() and returns the new phase name. Registered only when phase_policy is not None.
Execution note: characterization-first — test that ReActEngine with phase_policy=None behaves identically to pre-change (no enforcement, no advance_phase tool, no _current_phase mutation). Then add enforcement tests.
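To make the enforcement rules concrete, here is a toy stand-in for the engine that models only the phase gate. MiniEngine, its step method, and the events list are hypothetical; the sketch also assumes advance_phase is always callable regardless of the whitelist, and that blocked calls count as steps.

```python
from enum import Enum
from typing import Any, Callable, Optional


class PhaseState(Enum):
    PLANNING = "planning"
    BUILDING = "building"
    VERIFICATION = "verification"
    DELIVERY = "delivery"


_ORDER = list(PhaseState)  # Enum iteration preserves definition order


class MiniEngine:
    def __init__(
        self,
        whitelist: dict[PhaseState, set[str]],
        tools: dict[str, Callable[..., Any]],
        auto_advance_after_steps: Optional[int] = None,
    ) -> None:
        self._whitelist = whitelist
        self._tools = tools
        self._auto = auto_advance_after_steps
        self.current_phase = PhaseState.PLANNING
        self._steps_in_phase = 0
        self.events: list[dict[str, str]] = []  # stands in for phase_changed events

    def advance_phase(self) -> PhaseState:
        idx = _ORDER.index(self.current_phase)
        if idx == len(_ORDER) - 1:
            raise ValueError("Already at final phase")
        previous = self.current_phase
        self.current_phase = _ORDER[idx + 1]
        self._steps_in_phase = 0
        self.events.append({"phase": self.current_phase.value, "previous": previous.value})
        return self.current_phase

    def step(self, tool_name: str, **kwargs: Any) -> dict[str, Any]:
        """Handle one tool call and return the observation fed back to the LLM."""
        if (self._auto is not None and self._steps_in_phase >= self._auto
                and self.current_phase is not PhaseState.DELIVERY):
            self.advance_phase()
        self._steps_in_phase += 1  # blocked calls still count as steps
        if tool_name == "advance_phase":  # assumption: exempt from the whitelist
            try:
                return {"phase": self.advance_phase().value}
            except ValueError as exc:
                return {"error": str(exc)}
        allowed = self._whitelist[self.current_phase]
        if "*" not in allowed and tool_name not in allowed:
            return {
                "error": "phase_violation",
                "message": f"Tool '{tool_name}' not allowed in {self.current_phase.value} phase",
                "current_phase": self.current_phase.value,
                "hint": "Call advance_phase to move to the next phase",
            }
        return {"result": self._tools[tool_name](**kwargs)}
```

With auto_advance_after_steps=3, the first three calls run in Planning and the fourth triggers the transition before it is checked, matching the "transitions automatically on 4th step" scenario.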
Patterns to follow:
- src/agentkit/core/react.py verification_enabled pattern (feature flag + step-level check).
- src/agentkit/tools/ask_human.py for a tool that interacts with engine state.
Test scenarios (covers R24):
- Characterization (no policy):
  - ReActEngine(phase_policy=None): all tools allowed in all steps; no advance_phase tool registered; behavior matches pre-change.
- Happy paths:
  - Planning phase: LLM calls search → executes; LLM calls advance_phase → phase becomes Building.
  - Building phase: LLM calls write_file → executes; LLM calls advance_phase → phase becomes Verification.
  - Verification phase: LLM calls bash with pytest → executes; LLM calls advance_phase → phase becomes Delivery.
  - Delivery phase: LLM calls any tool → executes (wildcard).
- Edge cases:
  - advance_phase called at Delivery → returns error "Already at final phase".
  - Auto-advance after 3 steps in Planning → phase transitions automatically on 4th step.
  - bash command in Planning contains rm file → blocked by bash_command_filter.
- Error paths:
  - LLM calls write_file in Planning → tool NOT dispatched; structured error returned to LLM; loop continues.
  - LLM calls non-existent tool → existing error path (not phase-related).
- Integration scenarios:
  - Phase transition emits a phase_changed event (use existing _broadcast_event pattern from experts/orchestrator.py).
  - max_steps reached mid-phase → ReActResult.status = "max_steps_reached" (existing path, no change).
Verification:
- python3 -m pytest tests/unit/test_react_phase_enforcement.py -q passes.
- Existing tests/unit/test_react_engine.py still passes (characterization: no policy = no change).
- ruff check src/agentkit/core/react.py src/agentkit/tools/advance_phase.py clean.
U4. Wire PLAN_EXEC at chat.py WebSocket path (G6 chat integration)
Goal: Replace the chat.py:1084 "not yet supported, falling back to REACT" warning with a real PLAN_EXEC handler that constructs PhasePolicy from ServerConfig.plan_exec and dispatches to ReActEngine.execute with the policy set.
Requirements: R24, R25 (end-to-end wiring).
Dependencies: U2, U3.
Files:
- src/agentkit/server/routes/chat.py (modify: add _execute_plan_exec_ws handler; branch on ExecutionMode.PLAN_EXEC).
- tests/unit/test_chat_plan_exec_ws.py (new).
Approach:
- New helper _execute_plan_exec_ws(websocket, agent, routing, messages, ...):
  - Read server_config.plan_exec (may be {} if not configured → use default_policy()).
  - Build PhasePolicy from config (apply overrides).
  - Construct ReActEngine(..., phase_policy=policy).
  - Register AdvancePhaseTool bound to this engine.
  - Call engine.execute_stream(...), which reuses the existing streaming path.
  - Emit phase_changed events through the WebSocket (frontend can render phase indicator).
- chat.py:1084 changes from "if execution_mode not in (REACT, SKILL_REACT): warn + fall back" to:

      if routing.execution_mode == ExecutionMode.PLAN_EXEC:
          await _execute_plan_exec_ws(websocket, agent, routing, ...)
          return
      if routing.execution_mode not in (ExecutionMode.REACT, ExecutionMode.SKILL_REACT):
          # existing warning for REWOO/REFLEXION/TEAM_COLLAB
          ...

- REST send_message path: explicitly raise HTTPException(501, "PLAN_EXEC via REST not yet supported; use WebSocket"). Wave 3 does NOT wire REST (KTD4).
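The branching described above, including the plan_exec.enabled=False opt-out, can be sketched with stubbed handlers. Everything below is illustrative: dispatch_ws, the send callback, the enum values, and the string return values are hypothetical stand-ins for the real chat.py handlers.

```python
import asyncio
from enum import Enum
from typing import Any, Awaitable, Callable

Send = Callable[[dict[str, Any]], Awaitable[None]]


class ExecutionMode(Enum):  # hypothetical subset of the real enum
    REACT = "react"
    SKILL_REACT = "skill_react"
    PLAN_EXEC = "plan_exec"
    REWOO = "rewoo"


async def _execute_plan_exec_ws(send: Send, messages: list) -> str:
    return "plan_exec"  # stub: the real handler would build the policy and engine


async def _execute_react_ws(send: Send, messages: list) -> str:
    return "react"  # stub for the existing REACT path


async def dispatch_ws(
    mode: ExecutionMode, send: Send, messages: list, plan_exec_enabled: bool = True
) -> str:
    if mode is ExecutionMode.PLAN_EXEC:
        if plan_exec_enabled:
            return await _execute_plan_exec_ws(send, messages)
        # Opt-out: fall through to REACT without the "not yet supported" warning.
    elif mode not in (ExecutionMode.REACT, ExecutionMode.SKILL_REACT):
        await send({
            "type": "warning",
            "message": f"{mode.value} not yet supported, falling back to REACT",
        })
    return await _execute_react_ws(send, messages)
```

Checking PLAN_EXEC before the generic "unsupported mode" branch keeps the existing warning path for the other modes untouched, which is what the characterization tests are meant to pin down.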
Execution note: characterization-first — test that existing REWOO/REFLEXION/TEAM_COLLAB modes still fall back to REACT with the warning (no regression). Then add PLAN_EXEC wiring.
Patterns to follow:
- src/agentkit/server/routes/chat.py existing WebSocket handler structure (lines 1082-1100).
- src/agentkit/server/_fallback_chain.py (Wave 2 U3) for the "construct engine per-request with config" pattern.
Test scenarios (covers end-to-end):
- Characterization:
  - ExecutionMode.REWOO via WebSocket → still falls back to REACT with warning (existing behavior unchanged).
  - ExecutionMode.REFLEXION → same.
  - ExecutionMode.TEAM_COLLAB → same.
- Happy paths:
  - ExecutionMode.PLAN_EXEC via WebSocket → _execute_plan_exec_ws invoked; ReActEngine constructed with phase_policy; AdvancePhaseTool registered.
  - Planning phase: LLM emits search tool call → executed; tool result streamed.
  - LLM emits advance_phase → phase_changed event sent to WebSocket client; subsequent write_file call now allowed.
- Edge cases:
  - plan_exec config absent → default_policy() used; behavior matches KTD5 whitelist.
  - plan_exec.enabled=False → falls back to REACT (opt-out).
  - Phase violation: LLM calls write_file in Planning → structured error returned; loop continues; phase_violation event emitted.
- Error paths:
  - REST send_message with PLAN_EXEC → 501 error.
  - Phase policy construction fails (bad config) → 500 error with message.
- Integration scenarios:
  - Existing fallback chain (Wave 2 U3) NOT applied to PLAN_EXEC: phase policy and fallback chain are mutually exclusive (KTD5 from Wave 2 plan: chain only wraps REACT/SKILL_REACT at REST). Document this in a chat.py comment.
Verification:
- python3 -m pytest tests/unit/test_chat_plan_exec_ws.py -q passes.
- ruff check src/agentkit/server/routes/chat.py clean.
- Manual test: agentkit chat with @skill:plan_exec_demo skill config → WebSocket stream includes phase_changed events.
Risks & Dependencies
Risks
- ReAct core modification risk (high): U3 modifies ReActEngine._execute_loop. Mitigation: characterization-first tests (U3 Execution note); phase_policy=None default preserves all existing behavior; full test_react_engine.py regression.
- Symbol extraction accuracy (medium): Regex extractor may miss edge cases (decorated functions, nested generics, multi-line signatures). Mitigation: fall back to "no symbols found → read full file" gracefully; never raise on extraction failure.
- PLAN_EXEC phase deadlock (medium): LLM may never call advance_phase, leaving the agent stuck in Planning. Mitigation: opt-in auto_advance_after_steps config (off by default per KTD6; e.g. 5 when enabled); timeout via existing max_steps.
- Tool name drift (low): Phase whitelist references tool names (write_file, search, etc.) that may be renamed in future. Mitigation: whitelist is config-driven; a rename only requires a config update.
Dependencies
- Wave 2 PR #5 (feat/agent-wave2-medium-coupling) should be merged first: Wave 3 builds on the ServerConfig.from_dict extension pattern and the _fallback_chain.py integration shape established there. If PR #5 is still open, Wave 3 branches from feat/agent-wave2-medium-coupling rather than main.
- No external library dependencies (KTD1).
System-Wide Impact
- Agents using PLAN_EXEC mode: gain phase enforcement. Existing REACT/SKILL_REACT/DIRECT_CHAT agents: zero change (phase_policy defaults to None).
- Tool registry: gains two new tools (read_file, advance_phase). Frontend tool list display may need updating to show the new icons; out of scope for Wave 3 (frontend follows up).
- agentkit.yaml: gains plan_exec: section (commented by default). Existing configs unaffected.
- WebSocket clients: gain phase_changed event type. Existing clients ignore unknown event types (verified in Wave 2; phase_rollback_* events follow the same pattern).
Sources & Research
- Origin brainstorm: docs/brainstorms/2026-06-29-advanced-agent-gap-optimization-requirements.md (Wave 3 section, KTD6/KTD7).
- Wave 1 plan: docs/plans/2026-06-29-002-feat-agent-wave1-quick-wins-plan.md (PR #4 merged).
- Wave 2 plan: docs/plans/2026-06-29-003-feat-agent-wave2-medium-coupling-plan.md (PR #5 open).
- Trae Work architecture research (cited in brainstorm): SOLO four-stage state machine pattern.
- Qoder architecture research (cited in brainstorm): Spec→Coding→Verify closed loop.
- Codebase: src/agentkit/core/react.py:148 reserves read_file/write_file tool names in _DEFAULT_CORE_TOOLS. Wave 3 U1 delivers the missing read_file implementation.
- Codebase: src/agentkit/server/routes/chat.py:1084 documents that PLAN_EXEC is "not yet supported". Wave 3 U4 closes this gap.
Deferred to Implementation
- Exact regex patterns for non-Python symbol extraction (U1) — design above gives the shape; implementer finalizes patterns based on real-world test fixtures.
- bash_command_filter regex precision (U2): defaults block rm/mv/>/>>; implementer may add more based on test scenarios.
- phase_changed event payload shape (U3/U4): minimal viable shape is {"phase": "building", "previous": "planning"}; frontend rendering concerns are out of scope.
- Whether AdvancePhaseTool accepts a target_phase argument for skipping phases (e.g., Planning → Verification): default no (sequential only); add if test scenarios reveal a need.