chiguyong
fa152e24ac
feat(skills): add progressive skill loading with disclosure_level=0 (U5)
...
When disclosure_level=0, system prompt only injects skill name + description
(summary mode). SkillDetailTool is injected into the tool set, allowing the
LLM to load full instructions on-demand via skill_detail(query). This reduces
context window consumption when many skills are registered.
2026-06-24 21:49:00 +08:00
chiguyong
cac9c73dd5
fix(routing): U1-U6 路由优化 + 修复方案 + 代码审查修复
...
实现 6 个修复单元(U1-U6)并应用 ce-code-review 发现的 5 项安全修复。
## U1: benchmark 超时阈值
- 按 difficulty 分级超时:easy=45s, medium=60s, hard=90s
- 替换原单一 60s 硬编码
## U2: OpenAICompatibleProvider httpx 超时
- 新增 timeout 参数(默认 120s),替换硬编码 60s
- ProviderConfig.timeout 透传到 Provider
- 新增 2 项单元测试
## U3: 激活 QualityGate skill_match 校验
- BaseAgent._build_skill_context() 构造 skill_context
- 在 base.py / tasks.py / runner.py 三处传入 QualityGate.validate()
## U4: 添加 disambiguation_keywords 字段
- IntentConfig 新增 disambiguation_keywords 字段
- 8 个 skill YAML 补充该字段
## U5: 优化 RequestPreprocessor 路由正则
- 拆分 _FACTUAL_RE 为 CN/EN 双正则(中文无空格)
- 新增 _MATH_RE / _TRANSLATION_RE 纯模式
- _TOOL_CONTEXT_RE 排除需要工具的实时查询
- 多行输入守卫 + 结尾标点支持
- 新增 21 项单元测试(共 40 项全通过)
## U6: 重新基准测试
- 真实 LLM benchmark:准确率 60% -> 93.3%
- 4/5 通过,p50=40.8s,一致性=100%
- 旧基线备份至 baseline_2026-06-17_old_arch.json
## ce-code-review 修复(5 项)
- 修复 \s 字符类匹配换行符的安全隐患
- 添加事实/数学正则的结尾标点支持
- 修复 geo_optimizer.yaml 关键词重复
- 修复 _login_with_retry 不可达 return
- 修复 real_llm_server fixture stderr_fh 资源泄漏
测试:tests/unit/chat/ 63 项全通过,ruff 检查通过。
2026-06-20 19:31:49 +08:00
chiguyong
200174c5c7
feat: SQLite persistence, verification loop, spec-driven execution
...
Phase 2 of architecture optimization (U5/U6/U9):
- U5: SqliteConversationStore with WAL mode + LRU cache (1000 convs)
Replaces in-memory ConversationStore in portal.py
Data survives server restarts (ref: Codex Thread persistence)
- U6: VerificationLoop with verify/verify_and_retry
Default commands: pytest + ruff check
ReActEngine integration via verification_enabled flag
New run_tests tool for LLM to invoke verification
- U9: SpecManager for plan-as-contract (ref: Qoder Quest Mode)
Plans persisted to .agentkit/specs/{spec_id}.yaml
API: GET/PUT /api/v1/specs, POST /api/v1/specs/{id}/confirm
PlanExecEngine emits spec_created event after plan generation
Also fixes: portal skill_name routing, app.py SessionManager guard,
test_telemetry CostAwareRouter removal, test_compression_config fixture
2026-06-17 10:45:20 +08:00
chiguyong
5374bc8501
refactor: eliminate routing layer, align with industry best practices
...
Phase 1 of architecture optimization (U1/U2/U4/U8):
- U1: Rename SimpleRouter to RequestPreprocessor, route() to preprocess()
Eliminates misleading routing concept; LLM decides autonomously
in REACT agent loop (matches Codex/Claude Code/Trae pattern)
- U2: Delete CostAwareRouter, HeuristicClassifier, SemanticRouter
(~700 lines removed). skill_routing.py: 1688 to 220 lines
- U4: PlanExecEngine defaults to ReActStepExecutor, delete _LLMStepExecutor
(pure LLM calls without tools = no execution capability)
- U8: ReActEngine defaults to ContextCompressor(keep_recent=10)
Supersedes plans 2026-06-15-002/003/004.
New plan: 2026-06-16-006-refactor-architecture-optimization-evolution-plan.md
2026-06-17 10:44:40 +08:00
chiguyong
c4257591d4
refactor(router): replace CostAwareRouter with SimpleRouter and prompt-based tool calling
2026-06-16 03:31:05 +08:00
chiguyong
e984b4c462
feat(router): optimize routing intelligence — ExecutionMode expansion, multi-candidate scoring, quality gate skill match
...
- Expand ExecutionMode enum with REWOO/REFLEXION/PLAN_EXEC
- Add _resolve_execution_mode() to respect skill.config.execution_mode
- Rewrite IntentRouter._match_keywords() for multi-candidate scoring
- Add QualityGate 5th dimension: skill_match validation with warning escalation
- Calibrate HeuristicClassifier: low-complexity signals only when no high signals
- Fix negation regex for Chinese text (avoid matching past punctuation)
- Fix backtest mode_map normalization and .env loading
- Add 61 unit tests (21 HeuristicClassifier + 14 IntentRouter + 13 QualityGate + 13 existing)
Results: execution_mode_accuracy 9.09%→36.36%, skill_routing_F1 66.67%→77.78%
2026-06-15 22:43:13 +08:00