Adds CalendarTool implementing the Tool ABC so the ReAct engine can
create, query, update, and delete events autonomously. Resolves
event_type_name and tag_names (look up or create), sets
source="agent" to distinguish agent-created events from manual ones.
- src/agentkit/tools/calendar_tool.py — CalendarTool(Tool)
- tests/unit/tools/test_calendar_tool.py — 13 tests covering all actions
Deploy to Production / deploy (push) Waiting to runDetails
The skills tab mixed generic execution-engine templates (react/direct/
rewoo/...) with business-domain skills (monitor/geo_optimizer/...) with
no visual or data distinction. Adds a derived `category` field to the
SkillInfo/SkillDetail API models and groups the frontend display.
Backend:
- SkillInfo/SkillDetail: add category (Literal), agent_type, execution_mode,
task_mode fields
- _skill_to_info: derive category from explicit _ENGINE_TEMPLATE_NAMES set
(not name suffix — trend_agent/deai_agent are business skills despite
the _agent suffix)
- Simplify repetitive hasattr pattern with getattr
Frontend:
- ISkillInfo/ISkillDetail: add category + mode fields
- skills store: agentTemplates/businessSkills computed getters
(businessSkills is defensive: anything not explicitly engine template)
- SkillsView: group into 执行引擎 / 业务技能 sections with counts
- SkillCard: type badge (引擎/技能), category-based icon, mode display,
dark-mode-aware accent color
Tests:
- test_category_derived_from_name_suffix: verifies field exposure
- test_category_no_orphans: invariant — every skill has a valid category
- test_trend_agent_classified_as_business_skill: regression guard for
the _agent suffix misclassification bug
Code review (ce-code-review): 2 P1 + 5 P2 findings applied.
All config file writes now use the write-temp + fsync + os.replace
pattern (KTD-4) so a crash mid-write leaves the original file intact.
- Add src/agentkit/server/utils/atomic_write.py with write_text_atomic
- settings.py: _write_yaml_config and _write_env_var use atomic write
- skill_service.py: import_skill uses atomic write
- skill_service.py: update_skill_config uses atomic write + fcntl.flock
around the read-modify-write cycle to serialize concurrent updates
- Add 11 unit tests covering happy path, crash safety, concurrency, errors
The whoami route accepted rotated/old refresh tokens for cold-start
because it only checked session revocation status, not the token hash.
Now when token_type == "refresh", the route computes hash_token(token)
and compares it with the session's stored refresh_token_hash using
hmac.compare_digest (constant-time). Mismatch returns 401.
- Add SessionService.get_stored_refresh_hash(session_id) helper
- Add hash verification in whoami route (R9)
- Add TestWhoamiTokenHash with 5 integration tests
SkillService: enable/disable (persisted in skill_states table, schema
v4), import from YAML (with path traversal + name validation), reload
from file, update config. GET /skills now filters disabled skills.
KbService: list/upload/delete documents with department_id binding.
Added department_id field to KnowledgeSource + UploadedDocument.
Department visibility: (bound to user depts) ∪ (global = None).
10 new admin endpoints: skill enable/disable/import/reload/update,
KB documents CRUD, source sync/rebuild. All guarded by _require_admin.
Implemented reload stub in skill_management.py (was no-op).
54 new tests (26 unit + 28 integration). Fixed 4 pre-existing lint
errors. 357 admin tests pass, no regressions.
U1: Bump _SCHEMA_VERSION to 3, add 5 department tables (departments,
user_departments, department_skill_bindings, department_kb_bindings,
department_quotas) + 5 ORM models + helpers.
U2: DepartmentService (12 async methods: CRUD + bind/unbind skill/KB +
count_users). Mount admin_router in app.py. 36 unit + 28 integration tests.
U4: DepartmentContext FastAPI dependency (per-route, admin bypasses
filtering). filter_skills_by_department / filter_kb_sources_by_department
helpers. Applied to GET /skills and GET /kb-management/* routes.
15 integration tests for department isolation.
Also includes brainstorm + plan docs. 108 new tests, all pass.
Add create_user method to LocalAuthProvider (bcrypt hash + INSERT,
raises ValueError on duplicate username/email).
Add UserService with 9 async methods: create/list/get/update/delete
(soft)/reset_password/assign_department/remove_department/list_user_departments. reset_password revokes all sessions via SessionService.
delete_user is soft (is_active=0, row preserved).
Add 9 user endpoints to routes/admin.py: POST/GET/PATCH/DELETE users,
reset-password, assign/remove department, list departments. All
guarded by _require_admin.
Tests: 40 unit + 37 integration = 77 new tests. Full admin suite
170 tests pass, no regressions.
Adds the central business-logic layer for ``auth_sessions`` so routes,
the auth middleware, and the admin endpoints can call a single service
instead of touching the table directly.
Server
- session_service.SessionService: CRUD + lifecycle for auth_sessions.
- create() enforces the per-user cap (default 10): the oldest
active session is evicted with reason=session_cap_eviction.
- rotate() swaps a refresh token, adds the old hash to the
denylist, and raises SessionReuseDetected (revoking all sessions
for the user) when the old token is replayed.
- revoke() / revoke_by_refresh_token() / revoke_all_for_user()
with explicit reasons: user_terminated, admin_revoked,
password_changed, reuse_detected, session_cap_eviction.
- touch() bumps last_active_at (called on /auth/whoami).
- session_cache.SessionValidationCache: bounded LRU+TTL wrapper
(default 30s/1k entries) around SessionService.is_session_valid.
The middleware hits this on every request carrying a V2 sid claim;
one SQLite round-trip per 30s per session instead of per request.
- get_session_service() / get_validation_cache() module-level
singletons overridable in tests via set_session_service() /
set_validation_cache().
Tests
- tests/unit/auth/test_session_service.py: 15 cases covering
create/rotate/revoke/list/cap-eviction/reuse-detection/expired
sessions.
Refs: U3 in docs/plans/2026-06-20-002-feat-centralized-auth-token-persistence-plan.md
Adds V2 JWT claim schema that closes the kicked-out window and enables
refresh-token rotation with reuse detection.
Server
- jwt_utils.create_token_pair now takes ``session_id`` and ``remember_me``
kwargs. When ``session_id`` is provided, both tokens carry a ``sid``
claim and the access token also carries a ``jti`` claim; the refresh
token's jti is intentionally absent (rotation uses the token hash).
- New ``REFRESH_TOKEN_TTL_REMEMBER_ME = 30d`` (default 7d) selected by
the ``remember_me`` flag.
- ``verify_token`` now supports an optional ``expected_type`` filter
(e.g. ``"access"`` / ``"refresh"``); when omitted, both types pass
(used by /auth/whoami's cold-start path).
- New ``auth.denylist`` module: ``InMemoryRecentlyRevoked`` (default for
the Tauri sidecar / dev mode) and ``RedisRecentlyRevoked`` (multi-
process server). Bounded LRU with auto-expiry via ``time.monotonic()``.
Backwards-compat
- Tokens issued before U2 (no ``sid``) are still accepted by
``verify_token``; validation falls through to the legacy
``user_sessions`` table via the U10 shim (next commit).
Tests
- tests/unit/auth/test_jwt_utils.py: 12 cases — V1/V2 claim presence,
default + remember-me TTL, expected_type filter, expiry, wrong secret.
- tests/unit/auth/test_denylist.py: 6 cases — add/contains, TTL expiry,
LRU eviction, re-add refresh, clear, hash stability.
Refs: U2 in docs/plans/2026-06-20-002-feat-centralized-auth-token-persistence-plan.md
- Settings API: reverse-resolve env vars to preserve ${VAR} refs in yaml,
write new API keys to .env instead of agentkit.yaml, extract env_key
from existing ${VAR} reference when updating providers
- Onboarding: merge-update instead of overwrite when config exists,
use config_arg to determine output path, .env merge instead of overwrite
- Unified templates: bailian-coding provider name, full model_aliases,
docker-compose with postgres, expanded .env.example
- Optional ruamel.yaml for comment/format preservation in Settings API
- clients.yaml: add _deep_resolve for ${VAR} env var references
- All CLI commands use load_config_with_dotenv() consistently
- Tests: mock find_config_path and CWD auto-discovery to avoid env leaks
Key improvements:
- Low-complexity queries (<0.3) now try IntentRouter keyword match
before falling back to DIRECT_CHAT, fixing 0% F1 on keyword_match
- SemanticRouter similarity_low lowered from 0.6 to 0.4
- Short text (<20 chars) uses effective_low = max(0.25, low - 0.15)
- Short text with no semantic match forces LLM classify fallback
- Added colloquial keywords to 7 skill YAMLs
- Fixed code_reviewer.yaml output_schema placement
- Fixed SemanticRouter build in e2e tests
- Fixed base_url detection for bailian-coding API keys
Results: keyword_match F1 0->60.87%, colloquial F1 0->100%, mixed_lang F1 0->100%
- Add HeuristicClassifier to replace LLM quick_classify with zero-cost
local heuristic (keyword/length/code-pattern scoring), gated by
router.classifier config (default: heuristic)
- Add parallel tool execution in ReActEngine via asyncio.gather for
multiple independent tool_calls, gated by parallel_tools param
- Add AsyncWriteQueue for non-blocking session persistence with WAL
buffer, gated by async_writes param on SessionManager
- Add httpx.Limits connection pool config to all LLM providers
- Add router config section to ServerConfig and agentkit.yaml
- All optimizations have config switches for safe rollback
Critical:
- C1: Add verifier_timeout_seconds for independent Verifier timeout
- C2: Verifier parse failure raises RuntimeError instead of dead-loop
Major:
- M1: Inject previous_output into Worker retry context
- M2: Add Pydantic ge/le constraint on ReviewFeedback.score
- M3: Use Literal type for feedback_mode enum validation
- M4: Use Literal types for ReviewIssue severity and category
- M5: Merge error messages when escalation agent also fails
Tests: 8 new test cases added (19 total), all passing
1. InMemoryMessageBus.request(): fix param name (timeout→timeout_seconds) to match ABC
2. InMemoryMessageBus: track consumer tasks, cancel on unsubscribe
3. InMemoryMessageBus: _try_resolve_pending() in queue consumer path
4. evolve_soul(): use "default" category when patterns is empty
5. quick_classify(): use delimiter-based prompt to mitigate injection risk
6. Use asyncio.get_running_loop() instead of deprecated get_event_loop()
Phase B:
- U1: CostAwareRouter with 3-layer routing (rule/LLM/capability matching)
- U6: OrganizationContext with agent profiles and capability-based discovery
- U7: AlignmentGuard with constraint injection and cascade detection
Phase C:
- U8: Soul dynamic evolution with version tracking and reflection-triggered updates
- U9: Auction mechanism as optional advanced routing mode
- U10: Server integration + end-to-end integration tests
250 new tests passing across all units.
- Shell whitelist: use exact binary match instead of startswith
- Shell audit log: use deque(maxlen=10000) to cap memory
- Terminal history: use deque(maxlen) for O(1) eviction
- Path optimizer: cap _pending_paths at 50 entries per task_type
- Pitfall detector: only add tips to matching steps, not all
- Experience store: handle non-numeric _parse_time_window input
- Extract shared is_safe_url() to utils/security.py (DRY)
- Workflow condition evaluator: handle float() ValueError
- Enhanced chat CLI with adaptive mode and session management
- Added pipeline reflection and schema extensions
- Upgraded BaiduSearch and WebSearch tools with advanced capabilities
- Expanded server routes for skills and chat
- Added session store enhancements
- New chat module and pipeline reflection support
- Orchestrator accepts optional message_bus parameter; workers publish
task.progress messages via MessageBus after each subtask execution
- AgentPool accepts optional message_bus; auto-registers agents on
create and auto-unregisters on remove
- app.py initializes MessageBus from config and injects into AgentPool
- ServerConfig adds bus configuration field
- 5 new tests, all passing
- AgentMessage: message model with sender/recipient/topic/payload/correlation_id
- MessageBus Protocol: publish/subscribe/unsubscribe/request/broadcast/health_check
- InMemoryMessageBus: asyncio.Queue-based implementation for testing
- RedisMessageBus: Redis Streams (XADD/XREADGROUP) implementation with
consumer groups, message acknowledgment, and dead letter queue
- create_message_bus() factory with graceful Redis→InMemory fallback
- Request-response pattern via correlation_id + asyncio.Future
- 13 new tests, all passing
- AskHumanTool: Human-in-the-Loop tool for Chat mode, pushes questions
via WebSocket callback and waits for user reply via asyncio.Future
- Token streaming: execute_stream() now uses chat_stream() instead of
chat(), yielding token-type ReActEvents for each StreamChunk
- _build_response_from_stream() static method constructs LLMResponse
from accumulated stream data
- Export AskHumanTool from tools/__init__.py
- 12 new tests (7 AskHumanTool + 5 token streaming), all passing
Add HeadroomRetrieveTool that allows LLM to retrieve original
uncompressed data from CCR cache via Function Calling. Auto-registered
when HeadroomCompressor is active and available.
Add compression config to ServerConfig (following telemetry pattern),
create compressor in create_app, pass through AgentPool to
ConfigDrivenAgent, and inject into ReActEngine.execute() calls.
Add HeadroomCompressor implementing CompressionStrategy Protocol with
content-type routing (JSON→SmartCrusher, code→CodeCompressor), CCR
reversible compression cache, and graceful degradation when headroom-ai
is not installed.
Add runtime-checkable CompressionStrategy Protocol with compress(),
compress_tool_result(), and is_available() methods. Add compress_tool_result
and is_available to existing ContextCompressor. Add create_compressor()
factory function with headroom/summary provider routing and ImportError
fallback.
Add telemetry module with tracing (agent/tool/llm/pipeline_step spans),
metrics (5 histograms/counters), and setup with optional OTLP exporters.
Uses no-op pattern when opentelemetry not installed. GenAI Semantic
Conventions for LLM spans. Integrated into ReactEngine, LLMGateway,
ToolBase, and FastAPI app.
Add StepRetryPolicy with jitter-based exponential backoff, SagaOrchestrator
with LIFO compensation pattern, integrate retry_policy and compensate
fields into PipelineStage/PipelineStep schema, add GEO pipeline
compensation definitions for all 7 steps.
Add PipelineStateMemory/Redis/PG backends, PipelineStateManager with
Redis Sorted Set hot state + PostgreSQL JSONB cold persistence.
Integrated into PipelineEngine with state persistence calls at each
step transition.
Add WebCrawlTool (Crawl4AI wrapper with graceful degradation),
SchemaExtractTool (extruct-based Schema.org extraction), and
SchemaGenerateTool (JSON-LD generation with optional pydantic-schemaorg
validation). All tools work without optional dependencies.