Key improvements:
- Low-complexity queries (<0.3) now try IntentRouter keyword match
before falling back to DIRECT_CHAT, fixing 0% F1 on keyword_match
- SemanticRouter similarity_low lowered from 0.6 to 0.4
- Short text (<20 chars) uses effective_low = max(0.25, low - 0.15)
- Short text with no semantic match forces LLM classify fallback
- Added colloquial keywords to 7 skill YAMLs
- Fixed code_reviewer.yaml output_schema placement
- Fixed SemanticRouter build in e2e tests
- Fixed base_url detection for bailian-coding API keys
Results: keyword_match F1 0->60.87%, colloquial F1 0->100%, mixed_lang F1 0->100%