feat(agent): Wave 3 strategic coupling (G5/G6) #6

Merged
fischer merged 7 commits from feat/agent-wave3-strategic into main 2026-06-30 09:17:20 +08:00
Owner

feat(agent): Wave 3 strategic coupling (G5/G6)

Summary

Implements the PLAN_EXEC execution mode with a 4-phase state machine (PLANNING → BUILDING → VERIFICATION → DELIVERY) and per-phase tool whitelists with a bash command filter. Also adds SymbolExtractor + ReadFileTool with symbol slicing (G5).

This is Wave 3 of the advanced-agent gap optimization, building on top of Wave 2 (PR #5).

Implementation Units

U1 — G5: SymbolExtractor + ReadFileTool (50885fb)

  • SymbolExtractor protocol with AstSymbolExtractor (tree-sitter-style AST) and RegexSymbolExtractor (fallback)
  • ReadFileTool gains symbol parameter for slicing specific functions/classes
  • 35+ tests covering AST extraction, regex fallback, symbol lookup, edge cases

U2 — G6: PhaseState + PhasePolicy (abf758f)

  • PhaseState enum: PLANNING → BUILDING → VERIFICATION → DELIVERY
  • PhasePolicy dataclass with per-phase tool whitelist + bash command filter
  • default_policy() implements R24 whitelist (KTD5)
  • policy_from_config() builds policy from plan_exec config section
  • ServerConfig gains plan_exec field
  • 37 tests

U3 — G6: AdvancePhaseTool + ReActEngine phase enforcement (c0a44b4)

  • ReActEngine gains phase_policy param, current_phase property, advance_phase() method
  • _check_phase_permission() enforces whitelist + bash filter before tool dispatch
  • _maybe_auto_advance() opt-in safety net (configurable step threshold)
  • AdvancePhaseTool — LLM's manual escape hatch to signal phase completion
  • 27 tests

U4 — G6: PLAN_EXEC wiring at WebSocket chat path (7869cad)

  • PLAN_EXEC mode wired at chat.py WebSocket handler (KTD4: WebSocket only; REST raises 501)
  • phase_changed event emitted on phase transitions
  • Builds PhasePolicy from server_config.plan_exec with error handling
  • 9 tests covering REST 501, characterization, happy path, edge cases, error path, phase events

Code Review Fixes (ce-code-review)

P1: bash/shell tool name mismatch

PhasePolicy whitelist used "bash" but ShellTool registers as "shell". The bash_command_filter was dead code — never matched the real tool name. Fixed in phase.py whitelist, react.py filter check, agentkit.yaml config, and all tests.

P1: AdvancePhaseTool missing import

AdvancePhaseTool was in __all__ but never imported in tools/__init__.py. from agentkit.tools import AdvancePhaseTool raised ImportError. Added the import.

P2: Error message truncation

chat.py phase policy error echoed verbatim to WS client. Truncated to 200 chars to match nearby error paths and avoid leaking config internals.

P2: policy_from_config refactor

Replaced 3× full-field PhasePolicy reconstruction with dataclasses.replace() so new fields are not silently dropped in future reconstructions.

Simplification (ce-simplify-code)

Removed over-engineered _previous_value static method in AdvancePhaseTool that did index math on a hardcoded phase list. Instead, capture the previous phase before the transition — clearer intent, fewer lines, same behavior.

Test Results

  • 139 Wave 3 tests pass (U1-U4 + review fixes)
  • 72 regression tests pass (chat routes, react engine)
  • 1 pre-existing failure in test_react_compression.py (confirmed unrelated to Wave 3)
  • Ruff check + format clean

Plan

docs/plans/2026-06-29-004-feat-agent-wave3-strategic-plan.md

Depends on

  • #5 (Wave 2: G4/G7/G9 medium coupling)
# feat(agent): Wave 3 strategic coupling (G5/G6) ## Summary Implements the PLAN_EXEC execution mode with a 4-phase state machine (PLANNING → BUILDING → VERIFICATION → DELIVERY) and per-phase tool whitelists with a bash command filter. Also adds SymbolExtractor + ReadFileTool with symbol slicing (G5). This is Wave 3 of the advanced-agent gap optimization, building on top of Wave 2 (PR #5). ## Implementation Units ### U1 — G5: SymbolExtractor + ReadFileTool (50885fb) - `SymbolExtractor` protocol with `AstSymbolExtractor` (tree-sitter-style AST) and `RegexSymbolExtractor` (fallback) - `ReadFileTool` gains `symbol` parameter for slicing specific functions/classes - 35+ tests covering AST extraction, regex fallback, symbol lookup, edge cases ### U2 — G6: PhaseState + PhasePolicy (abf758f) - `PhaseState` enum: PLANNING → BUILDING → VERIFICATION → DELIVERY - `PhasePolicy` dataclass with per-phase tool whitelist + bash command filter - `default_policy()` implements R24 whitelist (KTD5) - `policy_from_config()` builds policy from `plan_exec` config section - `ServerConfig` gains `plan_exec` field - 37 tests ### U3 — G6: AdvancePhaseTool + ReActEngine phase enforcement (c0a44b4) - `ReActEngine` gains `phase_policy` param, `current_phase` property, `advance_phase()` method - `_check_phase_permission()` enforces whitelist + bash filter before tool dispatch - `_maybe_auto_advance()` opt-in safety net (configurable step threshold) - `AdvancePhaseTool` — LLM's manual escape hatch to signal phase completion - 27 tests ### U4 — G6: PLAN_EXEC wiring at WebSocket chat path (7869cad) - PLAN_EXEC mode wired at `chat.py` WebSocket handler (KTD4: WebSocket only; REST raises 501) - `phase_changed` event emitted on phase transitions - Builds `PhasePolicy` from `server_config.plan_exec` with error handling - 9 tests covering REST 501, characterization, happy path, edge cases, error path, phase events ## Code Review Fixes (ce-code-review) ### P1: bash/shell tool name mismatch `PhasePolicy` whitelist used `"bash"` but `ShellTool` registers as `"shell"`. The `bash_command_filter` was dead code — never matched the real tool name. Fixed in `phase.py` whitelist, `react.py` filter check, `agentkit.yaml` config, and all tests. ### P1: AdvancePhaseTool missing import `AdvancePhaseTool` was in `__all__` but never imported in `tools/__init__.py`. `from agentkit.tools import AdvancePhaseTool` raised `ImportError`. Added the import. ### P2: Error message truncation `chat.py` phase policy error echoed verbatim to WS client. Truncated to 200 chars to match nearby error paths and avoid leaking config internals. ### P2: policy_from_config refactor Replaced 3× full-field PhasePolicy reconstruction with `dataclasses.replace()` so new fields are not silently dropped in future reconstructions. ## Simplification (ce-simplify-code) Removed over-engineered `_previous_value` static method in `AdvancePhaseTool` that did index math on a hardcoded phase list. Instead, capture the previous phase before the transition — clearer intent, fewer lines, same behavior. ## Test Results - 139 Wave 3 tests pass (U1-U4 + review fixes) - 72 regression tests pass (chat routes, react engine) - 1 pre-existing failure in `test_react_compression.py` (confirmed unrelated to Wave 3) - Ruff check + format clean ## Plan [docs/plans/2026-06-29-004-feat-agent-wave3-strategic-plan.md](docs/plans/2026-06-29-004-feat-agent-wave3-strategic-plan.md) ## Depends on - #5 (Wave 2: G4/G7/G9 medium coupling)
fischer added 7 commits 2026-06-30 05:09:10 +08:00
50885fbc62 feat(U1): G5 SymbolExtractor + ReadFileTool with symbol slicing
- SymbolExtractor protocol + SymbolSpan dataclass
- AstSymbolExtractor for Python (stdlib ast, no tree-sitter dep — KTD1)
- RegexSymbolExtractor for TS/JS/Go/Rust/Java (language-aware regex)
- ReadFileTool with path/symbol/start_line/end_line params
- symbol=None returns full file (characterization baseline matching _FakeTool)
- symbol='foo' returns first matching symbol's line range
- symbol not found returns available_symbols list (soft error)
- Unsupported extension returns full file with note
- Manual start_line/end_line overrides symbol
- 66 unit tests covering R22/R23 + characterization + edge cases
abf758fa9c feat(U2): G6 PhaseState + PhasePolicy + ServerConfig.plan_exec
- PhaseState enum (PLANNING/BUILDING/VERIFICATION/DELIVERY) with next_of/from_string
- PhasePolicy dataclass with whitelist + bash_command_filter + auto_advance_after_steps
- default_policy() factory — KTD5 whitelist matching R24 (Planning: search/read_file;
  Building: write_file; Delivery: wildcard)
- bash_command_filter blocks rm/mv/cp/>/>> in PLANNING/VERIFICATION phases
- policy_from_config() parses plan_exec YAML section (R26) with override merge
- ServerConfig.plan_exec field + from_dict parsing (extends Wave 1/2 pattern)
- agentkit.yaml gains commented plan_exec section (opt-in)
- 37 unit tests covering PhaseState, default_policy, is_tool_allowed,
  bash filter, config parsing, and ServerConfig integration
c0a44b413b feat(U3): G6 AdvancePhaseTool + ReActEngine phase enforcement
- AdvancePhaseTool calls engine.advance_phase(), returns new phase or error
- ReActEngine.__init__ accepts phase_policy param (None = no enforcement, backward compat)
- _current_phase + _steps_in_phase fields track state machine
- advance_phase() transitions PLANNING → BUILDING → VERIFICATION → DELIVERY
- _check_phase_permission() returns structured error dict if tool blocked
- _execute_tool checks phase before dispatch (advance_phase name bypasses)
- Auto-advance safety net via _maybe_auto_advance() + auto_advance_after_steps
- Phase reset in reset() method
- 27 unit tests covering characterization, permission, transitions, auto-advance, tool integration
7869caddc7 feat(U4): G6 PLAN_EXEC wiring at chat WebSocket path
- PLAN_EXEC branch builds PhasePolicy from ServerConfig.plan_exec
- Empty config → default_policy(); enabled=False → falls back to REACT
- Bad config → error event sent, returns early (no engine constructed)
- ReActEngine created with phase_policy; AdvancePhaseTool registered
- phase_changed events emitted on phase transitions (PLAN_EXEC only)
- REST send_message with execution_mode=plan_exec → HTTP 501 (KTD4)
- REWOO/REFLEXION/TEAM_COLLAB still fall back to REACT (no regression)
- 9 unit tests covering REST 501, characterization, happy path, edge cases, error path, phase_changed events
ce6eb004a0 refactor(advance_phase): simplify previous_phase capture
Remove over-engineered _previous_value static method that did index
math on a hardcoded phase list. Instead, capture the previous phase
before the transition — clearer intent, fewer lines, same behavior.

ce-simplify-code step of LFG pipeline.
92fc38de7e fix(review): Wave 3 code review fixes
P1: bash/shell tool name mismatch. PhasePolicy whitelist used "bash" but
ShellTool registers as "shell". The bash_command_filter was dead code
(never matched the real tool name). Fixed in phase.py whitelist,
react.py filter check, agentkit.yaml config, and all tests.

P1: AdvancePhaseTool missing import in tools/__init__.py. Was in
__all__ but never imported. Added the import.

P2: chat.py phase policy error message echoed verbatim to WS client.
Truncated to 200 chars to match nearby error paths and avoid leaking
config internals.

P2: policy_from_config rebuilt PhasePolicy 3x via full-field copy.
Replaced with dataclasses.replace() so new PhasePolicy fields are not
silently dropped in future reconstructions.

ce-code-review (mode:agent) step of LFG pipeline.
fischer changed target branch from feat/agent-wave2-medium-coupling to main 2026-06-30 09:10:41 +08:00
fischer force-pushed feat/agent-wave3-strategic from 92fc38de7e to f50d3485ea 2026-06-30 09:14:11 +08:00 Compare
fischer merged commit 2b8a7d8909 into main 2026-06-30 09:17:20 +08:00
Sign in to join this conversation.
No reviewers
No Label
No Milestone
No project
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: fischer/fischer-agentkit#6
No description provided.