Commit Graph

1 Commits

Author SHA1 Message Date
chiguyong 91a61f9b49 feat(evolution): auto-trigger + quality gate + actor marking (U6, R5/R6)
U6 of the complex task quality loop plan.

R5 (auto evolution trigger + quality gate):
- EvolutionConfig (Pydantic v2): success_sample_rate=0.1, min_confidence=0.5,
  min_examples=3, observe_only=True, cross_workspace_sharing=False
- Success path gated by success_sample_rate; failure path always runs (100%)
- Observe-only mode records reflections without feeding optimizer (RV14:
  avoids noise-driven prompt degradation during initial rollout)
- PromptOptimizer.can_optimize() consumption gate: sample count >= min_examples
  AND mean quality >= min_confidence
- PitfallDetector confidence threshold: low-confidence warnings marked
  observe-only; confidence = failure_rate * min(1.0, total/3) linear ramp
  (ponytail: upgrade to Wilson interval)

R6 (actor marking + cross-workspace sharing):
- All evolution artifacts (EvolutionLogEntry, Module, PitfallWarning) carry
  actor field; defaults to result.agent_name
- can_share_artifact(): same-workspace always allowed; cross-workspace requires
  explicit opt-in via EvolutionConfig.cross_workspace_sharing=True

KTD-8: gave_up_after_reflections treated as failure path (triggers 100%
evolution) even when stream wrapper marks status as COMPLETED. Detection via
output_data.trace_outcome or error_message substring (ponytail: heuristic;
upgrade path is a dedicated TaskResult.trace_outcome field).

Backward compat: all gates conditional on auto_evolution_config is not None;
existing EvolutionMixin usage without config preserves prior behavior.

Tests: tests/unit/test_evolution_auto_trigger.py (37 tests) covers R5/R6
scenarios - sample rate gate, observe-only, consumption gate, pitfall
confidence, actor marking, cross-workspace sharing, gave_up_after_reflections,
error handling, fire-and-forget, backpressure cap, AE3 happy path.
2026-07-03 13:54:37 +08:00