- Settings API: reverse-resolve env vars to preserve ${VAR} refs in yaml,
write new API keys to .env instead of agentkit.yaml, extract env_key
from existing ${VAR} reference when updating providers
- Onboarding: merge into an existing config instead of overwriting it,
use config_arg to determine output path, and merge .env instead of
overwriting it
- Unified templates: bailian-coding provider name, full model_aliases,
docker-compose with postgres, expanded .env.example
- Optional ruamel.yaml for comment/format preservation in Settings API
- clients.yaml: add _deep_resolve for ${VAR} env var references
- All CLI commands use load_config_with_dotenv() consistently
- Tests: mock find_config_path and CWD auto-discovery to avoid env leaks
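
A minimal sketch of what a recursive ${VAR} resolver such as _deep_resolve
might look like. The function name deep_resolve, the regex, and the choice
to leave unknown variables untouched are assumptions for illustration, not
the project's actual implementation.

```python
import os
import re

# Matches ${NAME} where NAME is a conventional environment variable name.
_VAR_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def deep_resolve(value, env=None):
    """Recursively replace ${VAR} references in strings, dicts and lists."""
    env = os.environ if env is None else env
    if isinstance(value, str):
        # Assumption: unknown variables are left as-is rather than raising.
        return _VAR_PATTERN.sub(lambda m: env.get(m.group(1), m.group(0)), value)
    if isinstance(value, dict):
        return {key: deep_resolve(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [deep_resolve(item, env) for item in value]
    # Numbers, booleans and None pass through unchanged.
    return value
```

Passing an explicit env mapping keeps tests independent of the real process
environment, which is the same concern the test change above addresses.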
Key improvements:
- Low-complexity queries (<0.3) now try IntentRouter keyword match
before falling back to DIRECT_CHAT, fixing 0% F1 on keyword_match
- SemanticRouter similarity_low lowered from 0.6 to 0.4
- Short text (<20 chars) uses effective_low = max(0.25, low - 0.15)
- Short text with no semantic match forces LLM classify fallback
- Added colloquial keywords to 7 skill YAMLs
- Fixed code_reviewer.yaml output_schema placement
- Fixed SemanticRouter build in e2e tests
- Fixed base_url detection for bailian-coding API keys
Results: keyword_match F1 0% -> 60.87%, colloquial F1 0% -> 100%, mixed_lang F1 0% -> 100%
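
The short-text threshold rule above can be sketched as follows. Only the
formula and the 20-character cutoff come from the notes; the function name
and signature are hypothetical.

```python
def effective_low(text: str, low: float = 0.4) -> float:
    """Return the similarity floor to use for this query.

    Short queries carry less signal, so the floor is relaxed by 0.15
    but never drops below 0.25.
    """
    if len(text) < 20:
        return max(0.25, low - 0.15)
    return low
```

With the new default of 0.4, short text lands on the 0.25 floor; with the
old value of 0.6 it would have been about 0.45.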
- Add HeuristicClassifier to replace LLM quick_classify with zero-cost
local heuristic (keyword/length/code-pattern scoring), gated by
router.classifier config (default: heuristic)
- Add parallel tool execution in ReActEngine via asyncio.gather for
multiple independent tool_calls, gated by parallel_tools param
- Add AsyncWriteQueue for non-blocking session persistence with WAL
buffer, gated by async_writes param on SessionManager
- Add httpx.Limits connection pool config to all LLM providers
- Add router config section to ServerConfig and agentkit.yaml
- All optimizations have config switches for safe rollback
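
A rough sketch of gating parallel tool execution behind a flag with
asyncio.gather. run_tool and execute_tool_calls are hypothetical stand-ins;
the real ReActEngine interface is not shown in these notes.

```python
import asyncio


async def run_tool(name: str, delay: float) -> str:
    """Hypothetical tool call that simply waits and reports completion."""
    await asyncio.sleep(delay)
    return f"{name}:done"


async def execute_tool_calls(calls, parallel_tools: bool = True):
    """Run (name, delay) tool calls concurrently or one at a time."""
    if parallel_tools and len(calls) > 1:
        # gather preserves input order regardless of which call finishes first.
        return list(await asyncio.gather(*(run_tool(n, d) for n, d in calls)))
    # Sequential path, kept as the rollback behaviour.
    return [await run_tool(n, d) for n, d in calls]
```

Both paths return results in call order, so switching the flag off should
not change what the caller sees, only how long it takes.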
Critical:
- C1: Add verifier_timeout_seconds for independent Verifier timeout
- C2: Verifier parse failure raises RuntimeError instead of looping forever
Major:
- M1: Inject previous_output into Worker retry context
- M2: Add Pydantic ge/le constraint on ReviewFeedback.score
- M3: Use Literal type for feedback_mode enum validation
- M4: Use Literal types for ReviewIssue severity and category
- M5: Merge error messages when escalation agent also fails
Tests: 8 new test cases added (19 total), all passing
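
One way an independent Verifier timeout (C1) could be enforced is with
asyncio.wait_for, sketched below. verify_with_timeout and the verifier
callable are hypothetical; only the verifier_timeout_seconds name comes from
the notes, and converting the timeout to RuntimeError is an assumption.

```python
import asyncio


async def verify_with_timeout(verifier, output, verifier_timeout_seconds: float):
    """Await verifier(output), failing fast if it exceeds its own time budget."""
    try:
        return await asyncio.wait_for(
            verifier(output), timeout=verifier_timeout_seconds
        )
    except asyncio.TimeoutError:
        # Assumption: surface the timeout as a RuntimeError so the caller
        # stops retrying instead of waiting on a stuck verifier.
        raise RuntimeError(
            f"verifier timed out after {verifier_timeout_seconds}s"
        ) from None
```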
1. InMemoryMessageBus.request(): fix param name (timeout→timeout_seconds) to match ABC
2. InMemoryMessageBus: track consumer tasks, cancel on unsubscribe
3. InMemoryMessageBus: _try_resolve_pending() in queue consumer path
4. evolve_soul(): use "default" category when patterns is empty
5. quick_classify(): use delimiter-based prompt to mitigate injection risk
6. Use asyncio.get_running_loop() instead of deprecated get_event_loop()
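
Items 2 and 6 can be illustrated together: a consumer task is created on the
running loop, remembered per topic, and cancelled on unsubscribe. TinyBus is
a hypothetical toy with one subscriber per topic, not the real
InMemoryMessageBus.

```python
import asyncio


class TinyBus:
    """Toy bus: one queue and one consumer task per topic."""

    def __init__(self):
        self._queues = {}
        self._tasks = {}

    async def subscribe(self, topic, handler):
        queue = asyncio.Queue()
        self._queues[topic] = queue
        # get_running_loop() is only valid inside a coroutine, which is
        # exactly where subscribe() runs.
        loop = asyncio.get_running_loop()
        self._tasks[topic] = loop.create_task(self._consume(queue, handler))

    async def _consume(self, queue, handler):
        while True:
            handler(await queue.get())

    async def publish(self, topic, message):
        if topic in self._queues:
            await self._queues[topic].put(message)

    async def unsubscribe(self, topic):
        self._queues.pop(topic, None)
        task = self._tasks.pop(topic, None)
        if task is not None:
            task.cancel()
            # Wait for the cancellation to finish without re-raising it.
            await asyncio.gather(task, return_exceptions=True)
```

Without the tracked task, the consumer loop would keep running after
unsubscribe and hold a reference to its queue.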
Phase B:
- U1: CostAwareRouter with 3-layer routing (rule/LLM/capability matching)
- U6: OrganizationContext with agent profiles and capability-based discovery
- U7: AlignmentGuard with constraint injection and cascade detection
Phase C:
- U8: Soul dynamic evolution with version tracking and reflection-triggered updates
- U9: Auction mechanism as optional advanced routing mode
- U10: Server integration + end-to-end integration tests
250 new tests passing across all units.
- Shell whitelist: use exact binary match instead of startswith
- Shell audit log: use deque(maxlen=10000) to cap memory
- Terminal history: use deque(maxlen) for O(1) eviction
- Path optimizer: cap _pending_paths at 50 entries per task_type
- Pitfall detector: add tips only to matching steps, not to every step
- Experience store: handle non-numeric _parse_time_window input
- Extract shared is_safe_url() to utils/security.py (DRY)
- Workflow condition evaluator: handle float() ValueError
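
A small sketch of the first two fixes in this list: exact binary matching
for the shell whitelist and a bounded deque for the audit log. The names
ALLOWED_BINARIES, audit_log and is_allowed, and the whitelist contents, are
hypothetical.

```python
import shlex
from collections import deque

ALLOWED_BINARIES = {"ls", "cat", "git"}  # hypothetical whitelist
audit_log = deque(maxlen=10000)  # oldest entries are dropped automatically


def is_allowed(command: str) -> bool:
    """Allow a command only if its first token is exactly a whitelisted binary."""
    try:
        parts = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar parse errors are treated as denied.
        parts = []
    # A startswith check would wrongly accept "lsof" because it begins with "ls".
    allowed = bool(parts) and parts[0] in ALLOWED_BINARIES
    audit_log.append((command, allowed))
    return allowed
```

In this sketch a full path such as /bin/ls is rejected too; whether paths
should be normalized first is a separate policy decision.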
- Enhanced chat CLI with adaptive mode and session management
- Added pipeline reflection and schema extensions
- Upgraded BaiduSearch and WebSearch tools with advanced capabilities
- Expanded server routes for skills and chat
- Added session store enhancements
- New chat module and pipeline reflection support
- Orchestrator accepts optional message_bus parameter; workers publish
task.progress messages via MessageBus after each subtask execution
- AgentPool accepts optional message_bus; auto-registers agents on
create and auto-unregisters on remove
- app.py initializes MessageBus from config and injects into AgentPool
- ServerConfig adds bus configuration field
- 5 new tests, all passing
- AgentMessage: message model with sender/recipient/topic/payload/correlation_id
- MessageBus Protocol: publish/subscribe/unsubscribe/request/broadcast/health_check
- InMemoryMessageBus: asyncio.Queue-based implementation for testing
- RedisMessageBus: Redis Streams (XADD/XREADGROUP) implementation with
consumer groups, message acknowledgment, and dead letter queue
- create_message_bus() factory with graceful Redis→InMemory fallback
- Request-response pattern via correlation_id + asyncio.Future
- 13 new tests, all passing
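
The request-response pattern via correlation_id and asyncio.Future could
look roughly like this. RequestReplyBus, register and resolve are
hypothetical names; the real MessageBus Protocol methods are listed above
but their signatures are not given in these notes.

```python
import asyncio
import uuid


class RequestReplyBus:
    """Toy bus that pairs each request with a Future keyed by correlation_id."""

    def __init__(self):
        self._pending = {}
        self._handlers = {}

    def register(self, topic, handler):
        self._handlers[topic] = handler

    async def request(self, topic, payload, timeout_seconds: float = 1.0):
        loop = asyncio.get_running_loop()
        correlation_id = uuid.uuid4().hex
        future = loop.create_future()
        self._pending[correlation_id] = future
        # Keep a reference so the dispatch task is not garbage collected.
        task = loop.create_task(self._dispatch(topic, payload, correlation_id))
        try:
            return await asyncio.wait_for(future, timeout=timeout_seconds)
        finally:
            # Always drop the pending entry, on success or timeout.
            self._pending.pop(correlation_id, None)
            if not task.done():
                task.cancel()

    async def _dispatch(self, topic, payload, correlation_id):
        reply = await self._handlers[topic](payload)
        self.resolve(correlation_id, reply)

    def resolve(self, correlation_id, reply):
        future = self._pending.get(correlation_id)
        if future is not None and not future.done():
            future.set_result(reply)
```

The finally block matters: without it, timed-out requests would leave
entries in the pending map indefinitely.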
- AskHumanTool: Human-in-the-Loop tool for Chat mode, pushes questions
via WebSocket callback and waits for user reply via asyncio.Future
- Token streaming: execute_stream() now uses chat_stream() instead of
chat(), yielding token-type ReActEvents for each StreamChunk
- _build_response_from_stream() static method constructs LLMResponse
from accumulated stream data
- Export AskHumanTool from tools/__init__.py
- 12 new tests (7 AskHumanTool + 5 token streaming), all passing
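
The token-streaming change can be sketched as an async generator that emits
one event per chunk and then a final event built from the accumulated text.
Chunk, fake_chat_stream and the (kind, text) event tuples are hypothetical
simplifications of StreamChunk, chat_stream() and ReActEvent.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for StreamChunk."""
    text: str


async def fake_chat_stream():
    """Hypothetical provider stream yielding three chunks."""
    for piece in ("Hel", "lo", "!"):
        await asyncio.sleep(0)
        yield Chunk(piece)


async def execute_stream(stream):
    """Yield a token event per chunk, then one final event with the full text."""
    parts = []
    async for chunk in stream:
        parts.append(chunk.text)
        yield ("token", chunk.text)
    # Mirrors the idea of building the full response from accumulated stream data.
    yield ("final", "".join(parts))
```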
Add HeadroomRetrieveTool, which lets the LLM retrieve the original
uncompressed data from the CCR cache via function calling. It is
auto-registered when HeadroomCompressor is active and available.
Add compression config to ServerConfig (following telemetry pattern),
create compressor in create_app, pass through AgentPool to
ConfigDrivenAgent, and inject into ReActEngine.execute() calls.