Commit Graph

3 Commits

Author SHA1 Message Date
chiguyong 78a7faa17b refactor: remove all emoji from agentkit
Deploy to Production / deploy (push) Waiting to run Details
Test / backend-test (push) Waiting to run Details
Test / frontend-unit (push) Waiting to run Details
Test / api-e2e (push) Waiting to run Details
Test / frontend-e2e (push) Waiting to run Details
Test / backend-test (pull_request) Has been cancelled Details
Test / frontend-unit (pull_request) Has been cancelled Details
Test / api-e2e (pull_request) Has been cancelled Details
Test / frontend-e2e (pull_request) Has been cancelled Details
Replace emoji across codebase: YAML avatars -> first char, frontend banners -> Ant Design Vue components, CLI status -> OK/FAIL/WARN labels, terminal -> [WARN]/[OK]/[PENDING], Bitable DB default -> table, App.vue font cleanup, test fixtures -> first char letters. shell.avatar type upgraded to string | Component.
2026-07-02 01:33:28 +08:00
chiguyong fa2a6dece2 feat(router): enable SemanticRouter + upgrade benchmark to L3/L5
- Enable SemanticRouter in agentkit.yaml (router.semantic.enabled: true)
- Integrate SemanticRouter into e2e backtest (_build_real_components)
- Add 8 new semantic test cases: 5 colloquial + 3 mixed-lang expressions
- Add L3 output quality evaluation framework (LLM-as-Judge, 1-5 score)
- Add L5 adaptive capability metrics (consistency rate from overfitting data)
- Add OutputQualityObservation model and evaluate_output_quality() method
- Report now includes L3 and L5 sections

Results: 52 tests pass, description_match F1=66.67%, L5 adaptive rate=100%
2026-06-15 23:02:47 +08:00
chiguyong e984b4c462 feat(router): optimize routing intelligence — ExecutionMode expansion, multi-candidate scoring, quality gate skill match
- Expand ExecutionMode enum with REWOO/REFLEXION/PLAN_EXEC
- Add _resolve_execution_mode() to respect skill.config.execution_mode
- Rewrite IntentRouter._match_keywords() for multi-candidate scoring
- Add QualityGate 5th dimension: skill_match validation with warning escalation
- Calibrate HeuristicClassifier: low-complexity signals only when no high signals
- Fix negation regex for Chinese text (avoid matching past punctuation)
- Fix backtest mode_map normalization and .env loading
- Add 61 unit tests (21 HeuristicClassifier + 14 IntentRouter + 13 QualityGate + 13 existing)

Results: execution_mode_accuracy 9.09%→36.36%, skill_routing_F1 66.67%→77.78%
2026-06-15 22:43:13 +08:00