Commit Graph

20 Commits

Author SHA1 Message Date
chiguyong 47ee2449df refactor(experts): split TeamOrchestrator god class into 7 mixins (U2)
- Split 2085-line orchestrator.py into main class (592 lines) + 7 responsibility-focused mixins: PhaseExecutor, DebateRunner, ReviewGate, DivergenceDetector, RollbackHandler, Synthesizer, InterventionHandler.

- Mixin pattern preserves self access to shared state (_experts/_workspace/_broadcast_event); method bodies moved verbatim to minimize regression risk. Each mixin declares TYPE_CHECKING Protocol for shared state.

- Split _execute_execution_phase (~290 lines) into _prepare_phase_context/_run_agent_steps/_finalize_phase (each <=100 lines).

- All mixins <=400 lines, main class <=600 lines. [DEGRADED] prefix annotations preserved in ReviewGateMixin.

- 60 team_orchestrator tests pass (behavior unchanged), 469 experts tests pass, ruff clean.
2026-06-30 16:47:20 +08:00
chiguyong a3cecd4b50 fix(review): apply P0/P2 findings from dual-agent review
- Dockerfile: split ENTRYPOINT/CMD to align with docker-compose serve
- test_termbase: guard jieba import with pytest.importorskip
- orchestrator: mark silent review-degradation with [DEGRADED] prefix
- chat.py: accurate ExecutionMode log message
- agentkit.yaml: document OTel exporter config
- skill_routing: replace 12 Any with object/typed (AGENTS.md compliance)
- AssistantText.vue: add aria-live/role for a11y
2026-06-30 14:27:46 +08:00
Fischer a2dcde01b8 feat(agent): Wave 2 medium coupling (G4/G7/G9) (#5)
Deploy to Production / deploy (push) Waiting to run Details
Test / backend-test (push) Waiting to run Details
Test / frontend-unit (push) Waiting to run Details
Test / api-e2e (push) Waiting to run Details
Test / frontend-e2e (push) Waiting to run Details
2026-06-30 09:09:33 +08:00
chiguyong bbbf9cd40a feat(bitable): add bitable companion service with full P0-P2 fixes
Bitable is a multi-dimensional table companion service that runs alongside
the main AgentKit server. It provides structured data storage with formula
fields, views, and ingestion pipelines.

Major components:
- Domain models (Pydantic v2): Table, Field, Record, View, RecalcTask
- SQLAlchemy 2 async ORM with independent bitable PostgreSQL schema
- Formula engine: AST parser, DAG, Kahn topological sort, safe eval
- RecalcWorker: atomic task claiming (FOR UPDATE SKIP LOCKED), topo-order
  processing, stale-threshold reaper for crash recovery
- REST API (/api/v1/bitable): tables, fields, records, views, files
- BitableTool: agent-facing tool with batch chunking (500/batch)
- CLI: agentkit bitable subcommands (create, list, import-excel, etc.)
- Frontend: Vue 3 + vxe-table grid with field management, views, filters
- Ingestion: Excel (openpyxl), database reflection, API collector

Security fixes (ce-code-review P0 + ce-debug P1):
- SQL injection prevention (field_id validation, parameterized queries)
- IDOR protection (_check_table_ownership on all table-level endpoints)
- SSRF prevention (URL scheme + private IP validation in parse_excel_url)
- OOM prevention (streaming file upload, batch delete, batch insert)
- Atomic recalc task claiming (FOR UPDATE SKIP LOCKED)
- Formula engine cache invalidation on field changes
- Composite cursor pagination for non-id sort orders
- Batch upsert (eliminates N+1 queries)
- Sync I/O offloaded to thread pool in async contexts
- Internal token auth (X-Internal-Token, hmac.compare_digest)
- PK unique index enforcement

Test coverage: 88 unit tests (95 skipped without Docker)
2026-06-25 01:09:59 +08:00
chiguyong dfd188b1a4 feat(orchestrator): add pipeline checkpoint and crash recovery (U7)
Add PipelineCheckpoint for stage-level crash recovery with Redis-first
+ memory fallback. TeamOrchestrator saves checkpoints after each phase
finalizes and supports resume(plan_id) to continue from the last
completed phase. New POST /api/v1/tasks/{id}/resume endpoint recreates
the team from saved plan and calls resume.
2026-06-24 21:04:18 +08:00
chiguyong ef84e3fd53 feat(experts): add SharedWorkspace state offloading for long-horizon runs
U4: ExpertTeam accepts redis_client, passes to SharedWorkspace. After phase
completion, full result is written to workspace and in-memory phase.result
is replaced with a 500-char summary + _ref_key. Dependency output reading
resolves offloaded content from workspace on demand, with graceful fallback
to summary on read failure.

Tests: 8 scenarios (offload creation, short content, dependency resolution,
workspace failure fallback, non-offloaded passthrough, redis_client wiring,
memory dict fallback, pipeline integration) — all pass.
2026-06-24 20:32:10 +08:00
chiguyong 717aad1303 feat(experts): add concurrency limit to TeamOrchestrator parallel phases
U2: Add asyncio.Semaphore to bound concurrent phase execution and debate
argument generation. Default limit=3, configurable via max_concurrent_phases.
Prevents LLM rate-limit spikes when many phases run in the same layer.

Tests: 5 scenarios (happy path, 5-phase edge case, serial mode, failure
release, debate integration) — all pass.
2026-06-24 20:23:30 +08:00
chiguyong 574db8458f fix(experts): PM 协同代码审查全量修复
P0: 跨阶段契约状态同步 — _notify_collaborators 更新接收方契约状态为 received
P0: 4 个 PM 事件加入 _VALID_TEAM_EVENT_TYPES 白名单

P1: 验收 fail-open 改标注降级原因
P1: 返工失败抛 RuntimeError 而非返回 dict
P1: 验收 prompt injection 防护 — 专家输出用 XML 标签包裹
P1: 契约字段校验 _EXPERT_NAME_RE
P1: bool("false") 修复 — 显式比较避免字符串真值陷阱
P1: _parse_risk_flags(None) 防御

P2: _notify_collaborators 移到验收通过后
P2: SharedWorkspace 写入移到验收通过后
P2: 验收贪婪正则修复
P2: 风险标记数量上限 MAX_RISK_FLAGS=10
P2: 返工 feedback 截断
P2: 前端会话隔离 — 切换会话时清除/恢复 collaborationState
P2: 前端契约状态更新 — collaboration_notice 时标记 delivered
P2: CLI 死代码标注 + 异常改 debug 日志
P2: 模块级 _RISK_FLAG_RE 预编译
2026-06-24 18:56:27 +08:00
chiguyong 5487cca199 feat(experts): U4 专家风险标记 + risk_flagged 事件
- orchestrator 新增 _parse_risk_flags 静态方法,正则解析 [RISK: ...] 标记
- _execute_execution_phase 在协作通知后、验收前解析风险标记
- 风险标记通过 risk_flagged 事件广播,供前端/CLI 渲染
- 无风险标记时行为不变,向后兼容
- 新增 TestRiskFlagging 7 个测试(单/多/无/格式错误/事件发出/内容/兼容)
2026-06-24 14:17:58 +08:00
chiguyong 62fcbc0feb feat(experts): U3 Lead 验收环节 + 返工机制
- PlanPhase 添加 rework_count 和 review_feedback 字段
- 添加 _review_phase_output 方法,Lead 用 LLM 验收阶段输出
- _execute_execution_phase 重构为返工循环(MAX_REWORKS=2)
- 验收通过/返工/失败三种路径,发出 review_result 事件
- LLM 不可用时优雅降级直接通过
- 6 个新测试,全套 449 passed 无回归
2026-06-24 14:09:18 +08:00
chiguyong c46cf06f6d feat(experts): U2 协作契约执行 — 专家可见 + 主动通知
- _execute_execution_phase 按协作契约读取相关专家输出(可见性)
- 添加 _notify_collaborators 方法,完成后通知相关专家(可协助)
- 发出 collaboration_notice 事件,契约状态更新为 delivered
- 7 个新测试,全套 443 passed 无回归
2026-06-24 13:54:38 +08:00
chiguyong f219c5f016 feat(experts): U1 协作契约数据模型 + Lead 生成契约
- PlanPhase 添加 collaboration_contracts 字段(CollaborationContract dataclass)
- 修改 _decompose_task prompt,要求 Lead 分解任务时定义协作契约
- 修改 _parse_phases 解析 LLM 返回的协作契约信息
- plan_update 事件自动包含协作契约(通过 to_dict 序列化)
- 71 + 9 = 80 个新测试,全套 436 passed 无回归
2026-06-24 13:44:50 +08:00
chiguyong c831e925b6 feat(experts): U4 用户干预通道 + 手动辩论触发
建立 @team 执行期间的用户干预通道,支持 /stop、/debate <topic>、
普通文本追加上下文。

ExpertTeam (src/agentkit/experts/team.py):
- 新增 _interventions: asyncio.Queue (maxsize=64) 干预队列
- add_user_intervention(msg): 广播 + 入队
- consume_user_interventions(): 排空并返回待处理干预
- broadcast_user_message 现在同时入队干预队列

TeamOrchestrator (src/agentkit/experts/orchestrator.py):
- 新增 _user_context: list[str] 累积普通文本干预
- 新增 _process_interventions(lead, plan) 在每层执行前调用:
  * /stop → 终止执行,广播 plan_update(stopped_by_user)
  * /debate <topic> → 动态插入 DEBATE phase(受 MAX_DEBATES 限制)
  * 普通文本 → 累积到 _user_context
- _synthesize_results 将 _user_context 追加到 synthesis prompt

WS 路由 (src/agentkit/server/routes/chat.py):
- 模块级 _active_teams dict 跟踪每个 session 的活跃团队
- _execute_team_collab 执行前注册、finally 注销
- WS 消息循环:若 session 有活跃团队,message 路由为干预而非新任务
- 新增 team_intervention_ack 确认消息

测试:tests/unit/experts/test_team_intervention.py(20 测试),
覆盖队列基础、/stop、/debate、普通文本、混合消息、synthesis 影响。
同步更新 test_orchestrator_debate.py 的干预通道兼容性测试
(U4 已实现 consume_user_interventions)。

全部 418 experts 测试 + 325 server 测试通过。
2026-06-24 12:17:09 +08:00
chiguyong ac26d417b3 feat(experts): U3 分歧检测 + 方案评审辩论自动触发
在 TeamOrchestrator 中新增 4 个方法实现自动辩论触发:

- _maybe_add_plan_review_debate: 任务分解后可选插入方案评审 DEBATE
  phase(phases > 2 且 LLM 判断需要时),所有执行阶段依赖它
- _detect_divergence: 每层执行后用 LLM 判断已完成阶段产出是否与其他
  阶段存在分歧,偏好 false negative
- _insert_debate_phase: 动态插入 DEBATE phase 并重 wiring 依赖
  (原依赖 trigger 的 phase 现在依赖 DEBATE)
- _check_divergence_and_insert_debates: 每层完成后的协调入口,
  受 MAX_DEBATES=3 上限保护

主循环从 `for layer in layers` 改为 `while True` + 重新计算
topological_sort(),以支持动态插入 DEBATE phase 后的依赖分层。

测试:tests/unit/experts/test_divergence_detection.py(21 测试),
覆盖 happy path / 边界 / 错误路径 / 集成分层。同步修复
test_team_orchestrator.py 的 mock gateway 以适配 U3 的额外 LLM 调用。

全部 398 测试通过。
2026-06-24 11:09:53 +08:00
chiguyong fbe08cb1e2 feat(experts): add debate phase executor to TeamOrchestrator (U2)
Implement _execute_debate_phase() with Lead-facilitated structured debate:
- Lead opens with divergence point + dependency context
- Experts argue in parallel per round (asyncio.gather)
- Lead summarizes each round, then adjudicates final verdict
- Verdict produces decision (adopt/compromise/shelve/inconclusive) + conclusion
- Conclusion written to SharedWorkspace for downstream phases

Escape hatches:
- debate_config.skip=true short-circuits with template text
- MAX_DEBATE_ROUNDS=4 hard cap on rounds
- User /stop intervention ends debate early (U4-compatible via getattr fallback)
- LLM unavailable falls back to template verdict, no crash

New events: debate_started, expert_argument, debate_round_summary,
debate_resolved (plus existing phase_completed for consistency).

Phase dispatcher (_execute_phase) routes by phase_type:
EXECUTION to _execute_execution_phase, DEBATE to _execute_debate_phase.

36 new tests in test_orchestrator_debate.py covering happy path (2 rounds,
2 experts), max_rounds=1 boundary, empty participants, user stop, skip
escape hatch, LLM unavailable, SharedWorkspace integration, event
broadcasting, intervention channel compatibility, and helper methods.
All 377 expert tests pass.

Also includes planning artifacts (brainstorm requirements + implementation
plan with 6 units U1-U6).
2026-06-24 10:54:51 +08:00
chiguyong 771756814f fix(review): 修复代码审查发现的 P0/P1/P2 问题
P0 (Critical):
- orchestrator: plan_update 事件 key 从 phases 改为 plan_phases 匹配前端契约
- orchestrator: team_formed 事件 payload 从 string[] 改为 IExpertInfo[] + plan_phases:[]

P1 (High):
- orchestrator: 新增 phase_failed 事件广播 (3处: gather 失败/_execute_phase 异常/_mark_dependents_failed 级联)
- orchestrator: 新增 team_dissolved 事件广播 (3处: 正常完成/ValueError/Exception)
- orchestrator: _mark_dependents_failed 改为 async 以支持事件广播
- orchestrator: gather 结果检查增加 asyncio.CancelledError (Python 3.11+ BaseException)
- plan: PhaseStatus.RUNNING 值从 running 改为 in_progress 匹配前端联合类型
- team.ts: updatePhaseStatus 增加 plan_phases undefined 防御守卫
- chat.py: 增加 asyncio.CancelledError 处理 + team.dissolve() 移入 finally 块

P2 (Medium):
- orchestrator: _get_isolated_agent 返回类型 Any 改为 ConfigDrivenAgent
- orchestrator: _get_llm_gateway 返回类型 Any 改为 LLMGateway | None
- orchestrator: 依赖输出从 SharedWorkspace 读取改为内存 dep_phase.result (减少冗余 I/O)
- plan: PlanPhase.to_dict() result 序列化为 string 匹配前端 ITeamPlanPhase.result 类型
- types.ts: expert_step.step 类型从 number 改为 string (后端发送 phase ID)

Tests: 377 passed (experts + chat_team + expert_team)
2026-06-18 13:00:59 +08:00
chiguyong 0f8ea6e21e feat(experts):重写 TeamOrchestrator 为流水线模式 + TeamStatus.PLANNING 2026-06-18 01:39:22 +08:00
chiguyong bbedfff597 feat: hub-and-spoke experts, tiered tool injection, unified event model (U3/U7/U10) 2026-06-17 10:46:16 +08:00
chiguyong 64d62a2b60 feat: autonomous task execution - connect PlanExecEngine + TeamOrchestrator
U1: TeamOrchestrator._execute_phase real execution (Expert.agent.execute)
U2: LLM-based merge strategies (BEST/VOTE/FUSION) with fallback
U3: ReActStepExecutor replacing _LLMStepAgent for tool-enabled steps
U4: SharedWorkspace integration for cross-phase/cross-execution state
U5: GoalPlanner prompt tuning with few-shot and verb pattern matching
U6: Replan-before-fallback in TeamOrchestrator
U7: End-to-end validation tests for multi-step research tasks
U8: WebSocket progress events (step_event_callback + new event types)

Code review fixes: P0 response.strip fix, P1 competitor status check,
milestone real impl, VOTE self-bias fix, confirmation_handler wiring,
ExpertTeam public API, DRY _build_result_summaries, replan tests

Also: geo_server.py refactor (ServerConfig.from_yaml), delete llm_config.yaml
2026-06-15 12:41:32 +08:00
chiguyong 7384ecb03e feat: Expert Team Mode — plan-execute collaboration with conversation UI
Implements B+C hybrid Expert Team Mode with ExpertConfig, CollaborationPlan,
TeamOrchestrator, ExpertTeamRouter, HandoffTransport, SharedWorkspace, and
Expert wrapper. Frontend includes ExpertTeamView, ExpertMessage,
PlanVisualization, team store, and WS event handlers.

Code review fixes: sentinel-based close, per-phase retry, name validation,
Vue component integration, teamState dedup, Redis reset, plan reassign,
event_type validation, hmac timing-safe compare, message dedup,
reactive updatePhases, O(1) phase lookup, iterative DFS, bounded Queue.

232 unit tests passing.
2026-06-14 22:20:14 +08:00