chiguyong
0f3f0a7550
feat(U3): G8 delta_flush_interval 调速
...
- ReActEngine 新增 flush_interval_ms 构造参数(默认 0 = 逐 chunk yield 向后兼容)
- execute_stream chunk 循环用 time.monotonic 节流,累积 _flush_buffer 批量 yield
- flush_interval_ms=0 条件短路为 True 逐 chunk yield 保当前行为
- 流结束 mid-interval 最终 flush 剩余 buffer 不丢字符
- ServerConfig.streaming 配置项(flush_interval_ms)
- test_delta_flush.py 覆盖 R11/R12/R14
2026-06-29 20:49:52 +08:00
chiguyong
c4aaef05aa
feat(U2): G2 prompt cache 双块结构
...
- ReActEngine 新增 _build_system_message(stable+volatile) 双块构造
- Anthropic provider 返回 content blocks,stable 块带 cache_control
- 非 Anthropic provider 返回字符串拼接,依赖 stable 前缀命中自动前缀缓存
- execute_stream/execute 记忆注入从 system_prompt 末尾移到 volatile 层
- LLMGateway.get_provider_name_for_model 暴露 provider 检测能力
- anthropic.py _convert_messages 支持 list-type system content 透传
- ServerConfig.prompt_cache 配置项(默认 enable=True)
- ReActEngine.prompt_cache_enable 构造参数(默认 True 保当前行为)
- test_prompt_cache_layers.py 覆盖 R4-R7/R13
2026-06-29 20:47:23 +08:00
chiguyong
c66a7773b5
feat(U1): G3 工具调用 schema 校验
...
- base.py 新增 ToolValidationError(error_code/details)与 _validate_input
- safe_execute 在 execute 前用 jsonschema.validate 校验 kwargs
- input_schema=None 跳过校验保持向后兼容
- _execute_tool 优先捕获 ToolValidationError 保留 error_code
- function_tool._infer_schema 修复 VAR_KEYWORD/VAR_POSITIONAL 误入 schema
- test_tool_schema_validation.py 覆盖 R8-R10
2026-06-29 20:34:14 +08:00
chiguyong
2747bb4e64
chore(prior): malformed tool call handling, auth whitelist, dev scripts, wave1 plan
2026-06-29 20:25:03 +08:00
chiguyong
bbbf9cd40a
feat(bitable): add bitable companion service with full P0-P2 fixes
...
Bitable is a multi-dimensional table companion service that runs alongside
the main AgentKit server. It provides structured data storage with formula
fields, views, and ingestion pipelines.
Major components:
- Domain models (Pydantic v2): Table, Field, Record, View, RecalcTask
- SQLAlchemy 2 async ORM with independent bitable PostgreSQL schema
- Formula engine: AST parser, DAG, Kahn topological sort, safe eval
- RecalcWorker: atomic task claiming (FOR UPDATE SKIP LOCKED), topo-order
processing, stale-threshold reaper for crash recovery
- REST API (/api/v1/bitable): tables, fields, records, views, files
- BitableTool: agent-facing tool with batch chunking (500/batch)
- CLI: agentkit bitable subcommands (create, list, import-excel, etc.)
- Frontend: Vue 3 + vxe-table grid with field management, views, filters
- Ingestion: Excel (openpyxl), database reflection, API collector
Security fixes (ce-code-review P0 + ce-debug P1):
- SQL injection prevention (field_id validation, parameterized queries)
- IDOR protection (_check_table_ownership on all table-level endpoints)
- SSRF prevention (URL scheme + private IP validation in parse_excel_url)
- OOM prevention (streaming file upload, batch delete, batch insert)
- Atomic recalc task claiming (FOR UPDATE SKIP LOCKED)
- Formula engine cache invalidation on field changes
- Composite cursor pagination for non-id sort orders
- Batch upsert (eliminates N+1 queries)
- Sync I/O offloaded to thread pool in async contexts
- Internal token auth (X-Internal-Token, hmac.compare_digest)
- PK unique index enforcement
Test coverage: 88 unit tests (95 skipped without Docker)
2026-06-25 01:09:59 +08:00
chiguyong
567cbc9c9b
refactor: simplify code across U1-U7 (bug fix + efficiency + reuse + quality)
2026-06-24 22:35:52 +08:00
chiguyong
3dfda904d7
feat(core): add middleware pipeline architecture with onion model
...
U6: Unified middleware protocol (before/after) with MiddlewareChain
implementing onion model execution. Parallel integration (KTD1) —
middleware path controlled by presence of middleware_chain parameter,
existing ReActEngine path unchanged when None.
- New core/middleware.py: RequestContext, Middleware protocol,
MiddlewareChain (onion model: before outer→inner, after inner→outer)
- 3 example middlewares: SummarizationMiddleware (U3 headroom compression),
TokenUsageMiddleware, LoopDetectionMiddleware (request-level audit)
- ReActEngine.__init__ accepts middleware_chain parameter
- execute() branches: middleware path when chain present, existing path otherwise
- 22 tests covering ordering, error handling, state passing, backward compat
2026-06-24 20:52:15 +08:00
chiguyong
122173ec2c
feat(core): add headroom-based compression trigger
...
U3: ContextCompressor now accepts model_context_limit, headroom_threshold,
and min_tokens. should_compress() triggers when token ratio exceeds 0.8 of
model limit OR exceeds min_tokens (8000 fallback). ReActEngine._should_compress
delegates to compressor when available, checks is_available() first.
Tests: 6 scenarios (headroom trigger, min_tokens guard, small model,
unavailable compressor, delegation, fallback) — all pass.
2026-06-24 20:28:14 +08:00
chiguyong
018b342d96
feat(react): add loop detection to prevent repeated identical tool calls
...
U1: Sliding window hash detection in ReAct loop. When the same tool is
called with identical arguments >= threshold times (default 2), injects
a correction message first, then raises LoopDetectedError if the LLM
doesn't change strategy. Covers both _execute_loop and execute_stream.
2026-06-24 20:12:35 +08:00
TraeAI
d245f2e3d8
fix: UI/UX 修复 + 暗色主题 + async generator 防御
...
- App.vue: 重构 bootstrapBackend 流程,新增 retryBootstrap 重试入口
- SplashScreen.vue: 错误状态显示「重试」按钮
- system.py: /system/resources 移除 SYSTEM_CONFIG 权限依赖,避免 dev 模式 401
- react.py + gateway.py: 新增 _ensure_async_iterable helper 防御
'async for requires aiter, got coroutine'
- theme.ts: Ant Design colorTextLightSolid 映射到 --text-inverse
修复暗色主题下所有 primary 按钮白底白字
- ChatSidebar.vue: 新建对话按钮兜底深色文字
- SystemMonitorPanel.vue: 服务状态区域间距优化
- chat.ts + portal.py + sqlite_conversation_store.py: 会话标题派生修复
解决点击对话标题变成"对话"的问题
- app.py: Serve 模式自动创建 default agent
- Tauri src-tauri/: 完整 Tauri 客户端配置 (icons, capabilities, Cargo)
2026-06-20 23:35:57 +08:00
chiguyong
5374bc8501
refactor: eliminate routing layer, align with industry best practices
...
Phase 1 of architecture optimization (U1/U2/U4/U8):
- U1: Rename SimpleRouter to RequestPreprocessor, route() to preprocess()
Eliminates misleading routing concept; LLM decides autonomously
in REACT agent loop (matches Codex/Claude Code/Trae pattern)
- U2: Delete CostAwareRouter, HeuristicClassifier, SemanticRouter
(~700 lines removed). skill_routing.py: 1688 to 220 lines
- U4: PlanExecEngine defaults to ReActStepExecutor, delete _LLMStepExecutor
(pure LLM calls without tools = no execution capability)
- U8: ReActEngine defaults to ContextCompressor(keep_recent=10)
Supersedes plans 2026-06-15-002/003/004.
New plan: 2026-06-16-006-refactor-architecture-optimization-evolution-plan.md
2026-06-17 10:44:40 +08:00
chiguyong
b54213b3c6
fix(review): resolve all P0/P1/P2 findings from code review
2026-06-16 09:08:03 +08:00
chiguyong
16ac592855
feat(gateway): empty response auto-retry with fallback model chain
2026-06-16 08:07:21 +08:00
chiguyong
9caf332e9e
fix: ensure agent never returns empty result to user
2026-06-16 08:01:43 +08:00
chiguyong
c4257591d4
refactor(router): replace CostAwareRouter with SimpleRouter and prompt-based tool calling
2026-06-16 03:31:05 +08:00
chiguyong
0ccef7be5c
feat: P0 production hardening — LLM cache, semantic routing, state persistence
...
U1: LLM Cache Core (exact + semantic match, InMemory + Redis backends)
U2: Cache integration into LLMGateway with CacheConfig
U3: Semantic Router as Layer 1.5 in CostAwareRouter
U4: UsageStore persistence (Redis Hash + InMemory fallback)
U5: CascadeStateStore persistence (Redis INCR + InMemory TTL)
U6: EvolutionStore interface unification (Protocol + PostgreSQL backend)
U7: Configuration integration + E2E tests
Code review fixes:
- P0: date iteration bug (day>=28), semantic router index never built,
Redis connection leak (per-call → persistent pool)
- P1: cache degradation recovery, semantic_search degradation,
double miss counting, asyncio.Lock for PG init, LIMIT on queries,
__import__ anti-pattern → _utcnow()
- P2: InMemory TTL cleanup, embedding preservation on put(),
data TTL = max(exact_ttl, semantic_ttl)
2026-06-14 15:16:00 +08:00
chiguyong
5ef08a3b30
fix(review): comprehensive P0-P2 code review fixes
2026-06-12 22:18:25 +08:00
chiguyong
2e55aae775
fix(review): address code review findings for speed optimization
...
- P0: Rename WAL buffer to pending buffer, add crash-loss warning
- P1: Fix keyword substring false matches with word-boundary regex
- P1: Pass connection pool params in _build_llm_config
- P1: Change parallel_tools default to False (safer default)
- P1: Add classifier value validation in CostAwareRouter
- P2: Replace __import__ with proper datetime import
- P2: Add max_buffer_size enforcement in AsyncWriteQueue
2026-06-12 13:21:44 +08:00
chiguyong
a36bc3d1c1
feat: optimize chat response speed for sub-1s first token latency
...
- Add HeuristicClassifier to replace LLM quick_classify with zero-cost
local heuristic (keyword/length/code-pattern scoring), gated by
router.classifier config (default: heuristic)
- Add parallel tool execution in ReActEngine via asyncio.gather for
multiple independent tool_calls, gated by parallel_tools param
- Add AsyncWriteQueue for non-blocking session persistence with WAL
buffer, gated by async_writes param on SessionManager
- Add httpx.Limits connection pool config to all LLM providers
- Add router config section to ServerConfig and agentkit.yaml
- All optimizations have config switches for safe rollback
2026-06-12 13:15:06 +08:00
chiguyong
32c800d1e4
fix: portal routing + response speed + IME input
...
1. Portal unified routing: ws_chat now uses CostAwareRouter uniformly
(handles Layer 0/1/2), replacing direct IntentRouter calls.
Greeting/chat_mode requests skip IntentRouter LLM call entirely.
2. Response speed: greeting & simple chat now use direct LLM call
(no ReAct loop), zero-cost Layer 0 detection.
3. IME input fix: use e.isComposing (native browser property)
instead of compositionstart/end for Enter key detection.
4. Test: fix InMemoryMessageBus.request() parameter name
timeout -> timeout_seconds.
2026-06-11 21:30:25 +08:00
chiguyong
d47f279887
fix: resolve code review issues from deferred improvements
...
1. InMemoryMessageBus.request(): fix param name (timeout→timeout_seconds) to match ABC
2. InMemoryMessageBus: track consumer tasks, cancel on unsubscribe
3. InMemoryMessageBus: _try_resolve_pending() in queue consumer path
4. evolve_soul(): use "default" category when patterns is empty
5. quick_classify(): use delimiter-based prompt to mitigate injection risk
6. Use asyncio.get_running_loop() instead of deprecated get_event_loop()
2026-06-11 13:49:02 +08:00
chiguyong
7054ac02b6
feat(tools): add AskHumanTool + token streaming in ReAct execute_stream
...
- AskHumanTool: Human-in-the-Loop tool for Chat mode, pushes questions
via WebSocket callback and waits for user reply via asyncio.Future
- Token streaming: execute_stream() now uses chat_stream() instead of
chat(), yielding token-type ReActEvents for each StreamChunk
- _build_response_from_stream() static method constructs LLMResponse
from accumulated stream data
- Export AskHumanTool from tools/__init__.py
- 12 new tests (7 AskHumanTool + 5 token streaming), all passing
2026-06-07 23:40:43 +08:00
chiguyong
b34b06724d
fix(agentkit): resolve all P0/P1/P2/P3 issues from code review
2026-06-07 22:05:18 +08:00
chiguyong
fcb4fb33f3
feat(compression): U3 ReAct engine tool result compression and incremental compress
...
Extend _build_tool_result_message to accept compressor parameter for
tool output compression. Add _should_compress helper for token budget
checking. Add incremental compression within ReAct loop when
conversation exceeds threshold.
2026-06-07 18:19:53 +08:00
chiguyong
239009357a
feat(telemetry): U7 OpenTelemetry integration with zero-dependency no-op pattern
...
Add telemetry module with tracing (agent/tool/llm/pipeline_step spans),
metrics (5 histograms/counters), and setup with optional OTLP exporters.
Uses no-op pattern when opentelemetry not installed. GenAI Semantic
Conventions for LLM spans. Integrated into ReactEngine, LLMGateway,
ToolBase, and FastAPI app.
2026-06-07 17:26:21 +08:00
chiguyong
6e362a8ae7
feat(agentkit): Phase 4 enterprise production upgrade — 12 Implementation Units
...
Phase A (P0): EpisodicMemory pgvector search+EmbeddingCache, ReAct timeout+CancellationToken, evolution system fix (A/B test+LLMPromptOptimizer+StrategyTuner), AnthropicProvider native Messages API
Phase B (P1): RetryPolicy+CircuitBreaker, chat_stream fallback chain, WebSocket endpoint, SSE stream fix, Evolution+Memory API routes (7 endpoints), embedding cache+Enhanced Search per-KB degradation fix
Phase C (P2): GeminiProvider native generateContent API, Agent state lock+config hot-reload
Tests: 1301 passed, 18 skipped, 0 failed
2026-06-06 21:51:04 +08:00
chiguyong
e33dc25ad3
feat(memory): RAG pipeline optimization — 5 Implementation Units
...
U1: QueryTransformer — LLM/rule-based query rewriting + sub-query decomposition
U2: HttpRAGService enhanced_search() — rerank + compression via /bases/{kb_id}/retrieve
U3: Structured context injection — source attribution headers in RAG results
U4: RetrieveKnowledgeTool — built-in tool for mid-reasoning knowledge retrieval
U5: Configurable retrieval params + per-KB weights + CJK token estimation
Config example:
memory:
retrieval:
top_k: 5
token_budget: 2000
context_template: structured
query_transform:
enabled: true
strategy: llm
semantic:
search_mode: enhanced
use_rerank: true
kb_weights:
industry-kb-id: 1.2
enterprise-kb-id: 0.8
Tests: 1037 passed, 18 skipped, 0 failed
2026-06-06 19:27:09 +08:00
chiguyong
8620751864
fix(review): address P0+P1 findings from Tier 2 code review
...
P0: MemoryRetriever.retrieve score mutation fix
P1: Redis atomic Lua script, deprecated API fix, SQLite WAL mode,
Redis URL masking, UniqueConstraint, TraceRecorder completed flag,
EpisodicMemory recall improvement, LLMReflector sanitization,
A/B test safety, generator cleanup, ContextCompressor guards,
OpenAIEmbedder reuse, Pipeline failure handling, Metrics O(1),
Health check Redis PING, CLI skill loading, CORS config,
API key direct pass-through
Tests: 924 passed, 18 skipped, 0 failed
2026-06-06 17:57:47 +08:00
chiguyong
f858d279f3
feat(agentkit): Phase 3 upgrade - persistence, memory, evolution, observability
...
10 Implementation Units across 3 phases:
Phase A - Infrastructure:
- U1: RedisTaskStore with Redis/memory backend + factory function
- U2: TraceRecorder for execution trace recording
- U3: PersistentEvolutionStore with SQLite backend
Phase B - Core Capabilities:
- U4: MemoryRetriever integration into ReAct engine
- U5: Embedder abstraction + EpisodicMemory vector search
- U6: LLMReflector for LLM-in-the-loop reflection
- U7: SkillPipeline for multi-skill orchestration
Phase C - Enhancement:
- U8: SKILL.md format + progressive disclosure levels
- U9: ContextCompressor + prompt cache rendering
- U10: Structured logging + metrics endpoint + enhanced health check
Tests: 924 passed, 18 skipped, 0 failed
2026-06-06 17:17:45 +08:00
chiguyong
2844eeb548
feat(streaming): Phase C - LLM streaming + ReAct events + SSE endpoint
...
U8: StreamChunk protocol + OpenAI chat_stream + Gateway streaming with usage tracking
U9: ReActEvent dataclass + execute_stream() yielding thinking/tool_call/tool_result/final_answer
U10: POST /tasks/stream SSE endpoint + Client SDK stream_task()
15 new tests passing, no regression.
2026-06-06 11:54:17 +08:00
chiguyong
f87b790c0f
feat(agentkit): v2 Phase 1 - ReAct/LLM Gateway/Skill/Server + review fixes
...
535 unit + 52 integration tests passing. README added.
2026-06-05 23:32:16 +08:00