fischer-agentkit/tests/e2e
chiguyong cac9c73dd5 fix(routing): U1-U6 routing optimization + fix plan + code-review fixes
Implements 6 fix units (U1-U6) and applies the 5 safety fixes identified by ce-code-review.

## U1: benchmark timeout thresholds
- Tiered timeouts by difficulty: easy=45s, medium=60s, hard=90s
- Replaces the previous single hard-coded 60s value
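
The tiered thresholds above could be expressed as a plain lookup table. This is a minimal sketch, not the code from the commit: `TIMEOUTS_BY_DIFFICULTY`, `DEFAULT_TIMEOUT`, and `timeout_for` are hypothetical names, and falling back to the old 60s value for an unknown difficulty is an assumption.

```python
# Hypothetical sketch of difficulty-tiered benchmark timeouts (in seconds).
# The three values come from the commit message; the names do not.
TIMEOUTS_BY_DIFFICULTY = {"easy": 45.0, "medium": 60.0, "hard": 90.0}

# Assumption: unknown difficulties fall back to the previous single value.
DEFAULT_TIMEOUT = 60.0


def timeout_for(difficulty: str) -> float:
    """Return the timeout for a benchmark case of the given difficulty."""
    return TIMEOUTS_BY_DIFFICULTY.get(difficulty.strip().lower(), DEFAULT_TIMEOUT)
```

A benchmark runner would then call `timeout_for(case_difficulty)` per case instead of reading one module-level constant.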

## U2: OpenAICompatibleProvider httpx timeout
- Add a timeout parameter (default 120s), replacing the hard-coded 60s
- ProviderConfig.timeout is passed through to the Provider
- Add 2 unit tests
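
The pass-through described above might look roughly like the following. It is a sketch under stated assumptions: the class names match the commit message, but the `base_url` field, the constructor signatures, and `build_provider` are hypothetical, and the actual HTTP client is left out so the example stays stdlib-only.

```python
from dataclasses import dataclass


@dataclass
class ProviderConfig:
    base_url: str            # hypothetical field, for illustration only
    timeout: float = 120.0   # default taken from the commit message


class OpenAICompatibleProvider:
    def __init__(self, base_url: str, timeout: float = 120.0) -> None:
        if timeout <= 0:
            raise ValueError("timeout must be positive")
        self.base_url = base_url
        # Assumption: the real provider hands this value to its httpx client,
        # e.g. httpx.AsyncClient(timeout=self.timeout), instead of a literal 60.
        self.timeout = timeout


def build_provider(config: ProviderConfig) -> OpenAICompatibleProvider:
    """Hypothetical factory showing config.timeout reaching the provider."""
    return OpenAICompatibleProvider(config.base_url, timeout=config.timeout)
```

Keeping the default in one place on the config object means a deployment can raise or lower the limit without touching provider code.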

## U3: activate QualityGate skill_match validation
- BaseAgent._build_skill_context() builds the skill_context
- skill_context is passed to QualityGate.validate() at three call sites: base.py, tasks.py, and runner.py

## U4: add disambiguation_keywords field
- Add a disambiguation_keywords field to IntentConfig
- Add the field to 8 skill YAML files

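## U5: optimize RequestPreprocessor routing regexes
- Split _FACTUAL_RE into separate CN/EN regexes (Chinese text has no spaces between words)
- Add pure-pattern regexes _MATH_RE / _TRANSLATION_RE
- _TOOL_CONTEXT_RE excludes real-time queries that require tools
- Multi-line input guard + trailing punctuation support
- Add 21 unit tests (40 total, all passing)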
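
To illustrate why the factual pattern was split, here is a minimal sketch of a CN/EN regex pair with a multi-line guard and optional trailing punctuation. The patterns, the names `FACTUAL_CN_RE` / `FACTUAL_EN_RE`, and `is_simple_factual` are illustrative assumptions; the real _FACTUAL_RE contents are not shown in the commit message.

```python
import re

# Chinese has no spaces between words, so the question prefix is followed
# directly by the topic. Trailing ASCII or full-width punctuation is optional.
FACTUAL_CN_RE = re.compile(r"(?:什么是|谁是)[\u4e00-\u9fffA-Za-z0-9]+[??。!!]*")

# English requires whitespace between words.
FACTUAL_EN_RE = re.compile(
    r"(?:what|who)[ \t]+(?:is|are|was|were)[ \t]+[\w ]+[?.!]*",
    re.IGNORECASE,
)


def is_simple_factual(text: str) -> bool:
    """Return True for a single-line factual question in Chinese or English."""
    text = text.strip()
    if "\n" in text:  # multi-line guard: never fast-path multi-line input
        return False
    return bool(FACTUAL_CN_RE.fullmatch(text) or FACTUAL_EN_RE.fullmatch(text))
```

A single shared pattern would either demand spaces that Chinese input never has or drop them for English, which is why two patterns are easier to keep correct.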

## U6: re-run benchmark
- Real LLM benchmark: accuracy 60% -> 93.3%
- 4/5 passed, p50=40.8s, consistency=100%
- Old baseline backed up to baseline_2026-06-17_old_arch.json

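## ce-code-review fixes (5 items)
- Fix the safety risk of the \s character class matching newlines
- Add trailing punctuation support to the factual/math regexes
- Fix duplicated keywords in geo_optimizer.yaml
- Fix unreachable return in _login_with_retry
- Fix stderr_fh resource leak in the real_llm_server fixture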
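
The `\s` issue in the first item can be reproduced in isolation. In Python's `re` module, `\s` matches `\n`, so a pattern meant for one-line input can silently span lines; an explicit `[ \t]` class does not. The two patterns below are hypothetical stand-ins, not the ones changed in this commit.

```python
import re

# \s includes "\n", so this accepts input that spans two lines.
LOOSE_RE = re.compile(r"translate\s+to\s+\w+", re.IGNORECASE)

# [ \t] only allows horizontal whitespace, keeping the match on one line.
STRICT_RE = re.compile(r"translate[ \t]+to[ \t]+\w+", re.IGNORECASE)

multi_line = "translate\nto French"
single_line = "translate to French"

loose_hit = LOOSE_RE.fullmatch(multi_line) is not None
strict_hit = STRICT_RE.fullmatch(multi_line) is not None
strict_single_hit = STRICT_RE.fullmatch(single_line) is not None
```

Here `loose_hit` is true while `strict_hit` is false, and `strict_single_hit` confirms that ordinary one-line input still matches.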

Tests: all 63 tests in tests/unit/chat/ pass; ruff check passes.
2026-06-20 19:31:49 +08:00
..
__init__.py feat(router): optimize routing intelligence — ExecutionMode expansion, multi-candidate scoring, quality gate skill match 2026-06-15 22:43:13 +08:00
benchmark_dataset.py feat: private board discussion mode + backtest integration + WS persistence fix 2026-06-17 23:52:53 +08:00
benchmark_generator.py feat(router): optimize routing intelligence — ExecutionMode expansion, multi-candidate scoring, quality gate skill match 2026-06-15 22:43:13 +08:00
capability_metrics.py feat(router): enable SemanticRouter + upgrade benchmark to L3/L5 2026-06-15 23:02:47 +08:00
conftest.py feat(router): enable SemanticRouter + upgrade benchmark to L3/L5 2026-06-15 23:02:47 +08:00
test_basic_api.py feat(router): optimize routing intelligence — ExecutionMode expansion, multi-candidate scoring, quality gate skill match 2026-06-15 22:43:13 +08:00
test_basic_cli.py feat(router): optimize routing intelligence — ExecutionMode expansion, multi-candidate scoring, quality gate skill match 2026-06-15 22:43:13 +08:00
test_basic_websocket.py feat(router): optimize routing intelligence — ExecutionMode expansion, multi-candidate scoring, quality gate skill match 2026-06-15 22:43:13 +08:00
test_capability_alignment.py feat(router): optimize routing intelligence — ExecutionMode expansion, multi-candidate scoring, quality gate skill match 2026-06-15 22:43:13 +08:00
test_capability_comprehensive.py refactor: standardize benchmark with industry methodology (P/R/F1, multi-run, baseline) 2026-06-17 12:01:34 +08:00
test_capability_react.py feat(router): optimize routing intelligence — ExecutionMode expansion, multi-candidate scoring, quality gate skill match 2026-06-15 22:43:13 +08:00
test_capability_routing.py feat(router): optimize routing intelligence — ExecutionMode expansion, multi-candidate scoring, quality gate skill match 2026-06-15 22:43:13 +08:00
test_capability_team.py feat(router): optimize routing intelligence — ExecutionMode expansion, multi-candidate scoring, quality gate skill match 2026-06-15 22:43:13 +08:00
test_real_llm_e2e.py fix(routing): U1-U6 routing optimization + fix plan + code-review fixes 2026-06-20 19:31:49 +08:00
test_request_preprocessor_backtest.py fix(routing): U1-U6 routing optimization + fix plan + code-review fixes 2026-06-20 19:31:49 +08:00