Commit Graph

21 Commits

Author SHA1 Message Date
chiguyong a27eed3714 fix(config): unify config loading chain and protect ${VAR} references
- Settings API: reverse-resolve env vars to preserve ${VAR} refs in yaml,
  write new API keys to .env instead of agentkit.yaml, extract env_key
  from existing ${VAR} reference when updating providers
- Onboarding: merge-update instead of overwrite when config exists,
  use config_arg to determine output path, .env merge instead of overwrite
- Unified templates: bailian-coding provider name, full model_aliases,
  docker-compose with postgres, expanded .env.example
- Optional ruamel.yaml for comment/format preservation in Settings API
- clients.yaml: add _deep_resolve for ${VAR} env var references
- All CLI commands use load_config_with_dotenv() consistently
- Tests: mock find_config_path and CWD auto-discovery to avoid env leaks
2026-06-16 00:26:54 +08:00
chiguyong 11e2009cb8 feat(router): improve colloquial/mixed-lang routing, fix low-complexity IntentRouter bypass
Key improvements:
- Low-complexity queries (<0.3) now try IntentRouter keyword match
  before falling back to DIRECT_CHAT, fixing 0% F1 on keyword_match
- SemanticRouter similarity_low lowered from 0.6 to 0.4
- Short text (<20 chars) uses effective_low = max(0.25, low - 0.15)
- Short text with no semantic match forces LLM classify fallback
- Added colloquial keywords to 7 skill YAMLs
- Fixed code_reviewer.yaml output_schema placement
- Fixed SemanticRouter build in e2e tests
- Fixed base_url detection for bailian-coding API keys

Results: keyword_match F1 0->60.87%, colloquial F1 0->100%, mixed_lang F1 0->100%
2026-06-15 23:54:57 +08:00
chiguyong 64d62a2b60 feat: autonomous task execution - connect PlanExecEngine + TeamOrchestrator
U1: TeamOrchestrator._execute_phase real execution (Expert.agent.execute)
U2: LLM-based merge strategies (BEST/VOTE/FUSION) with fallback
U3: ReActStepExecutor replacing _LLMStepAgent for tool-enabled steps
U4: SharedWorkspace integration for cross-phase/cross-execution state
U5: GoalPlanner prompt tuning with few-shot and verb pattern matching
U6: Replan-before-fallback in TeamOrchestrator
U7: End-to-end validation tests for multi-step research tasks
U8: WebSocket progress events (step_event_callback + new event types)

Code review fixes: P0 response.strip fix, P1 competitor status check,
milestone real impl, VOTE self-bias fix, confirmation_handler wiring,
ExpertTeam public API, DRY _build_result_summaries, replan tests

Also: geo_server.py refactor (ServerConfig.from_yaml), delete llm_config.yaml
2026-06-15 12:41:32 +08:00
chiguyong 99fe4c99f7 fix: comprehensive code review fixes + WS test stability 2026-06-15 08:17:34 +08:00
chiguyong 5ef08a3b30 fix(review): comprehensive P0-P2 code review fixes 2026-06-12 22:18:25 +08:00
chiguyong 8c365486e2 fix(pipeline): address code review findings for adversarial loop
Critical:
- C1: Add verifier_timeout_seconds for independent Verifier timeout
- C2: Verifier parse failure raises RuntimeError instead of dead-loop

Major:
- M1: Inject previous_output into Worker retry context
- M2: Add Pydantic ge/le constraint on ReviewFeedback.score
- M3: Use Literal type for feedback_mode enum validation
- M4: Use Literal types for ReviewIssue severity and category
- M5: Merge error messages when escalation agent also fails

Tests: 8 new test cases added (19 total), all passing
2026-06-12 10:02:37 +08:00
chiguyong 6731d96c65 feat(configs): add code_reviewer skill and coding_harness pipeline
- code_reviewer.yaml: Verifier Agent skill config for adversarial review
  with structured output schema for ReviewFeedback format
- coding_harness.yaml: Example pipeline with adversarial loop
  develop → test → review (Worker↔Verifier) → archive
2026-06-12 09:38:37 +08:00
chiguyong 2110c84fb6 fix: switch default model to qwen3-coder-plus for better function calling
DeepSeek-chat has limited/partial function calling support. Qwen3-coder-plus
(DashScope) has robust OpenAI-compatible function calling.

Also added tool usage instructions to system prompt and enhanced logging
to trace tool propagation through the pipeline.
2026-06-12 09:27:52 +08:00
chiguyong cc4c6fe346 fix: direct-mode agent falls through to default when task needs tools
When IntentRouter matches a direct-mode agent (no tools), but the task
content suggests tool needs (shell, search, file ops, etc.), the routing
now falls through to the default agent which has full tool access.

This fixes the issue where "帮我执行个命令" would be routed to
direct_agent and fail because direct mode doesn't support tool calling.

Also restored "你好" in direct_agent keywords since it's correctly
handled now — greetings don't need tools, direct mode is fine.
2026-06-11 15:26:19 +08:00
chiguyong 52b7d6007d fix: remove '你好' from direct_agent keywords so greetings route to default agent with tools 2026-06-11 14:49:59 +08:00
chiguyong 93bc7c4e3e fix: change all agent YAML model from hardcoded provider to 'default'
Hardcoded model names like 'openai/gpt-4o-mini' or 'anthropic/claude-sonnet'
cause 'No provider available' errors when the specific provider isn't configured.
Using 'default' lets the system pick the available provider automatically.
2026-06-11 14:19:26 +08:00
chiguyong 5b42487d8a feat(core): add ReWOO, Plan-and-Execute, Reflexion execution engines
Phase A of Multi-Agent Marketplace architecture:
- ReWOOEngine: plan-all-then-execute pattern for parallel data fetch
- PlanExecEngine: adapter wrapping GoalPlanner+PlanExecutor+PipelineReplanner
- ReflexionEngine: ReAct + Evaluate + Reflect + Retry for high-precision tasks
- SkillConfig: extend VALID_EXECUTION_MODES with rewoo/plan_exec/reflexion
- ConfigDrivenAgent: add _handle_rewoo/_handle_plan_exec/_handle_reflexion routes
- 5 professional agent YAML configs with layered model defaults
- 107 unit tests passing
2026-06-10 17:08:48 +08:00
chiguyong 7874e875af merge: integrate feat/agentkit-phase8-chat-adaptive (chat/gui commands + GUI mode)
Restores agentkit chat, agentkit gui CLI commands, onboarding wizard,
and GUI mode (AGENTKIT_GUI_MODE) with static file serving.
Resolves merge conflicts in orchestrator.py, app.py, tools/__init__.py, shell.py.
2026-06-10 07:44:06 +08:00
chiguyong b34f74f598 feat(phase6): implement end-to-end enterprise scenario validation (U15)
- Add goal-driven agent skill config and pipeline config
- Add 9 E2E integration tests covering all 7 capabilities:
  - SC1: Goal-driven SEO analysis (GoalPlanner→PlanExecutor→PlanChecker→ExperienceStore)
  - SC2: Knowledge Q&A with system operation (MultiSourceRAG)
  - SC3: Workflow with approval (WorkflowStore + approval node)
  - SC4: Self-evolution experience accumulation (ExperienceStore→PitfallDetector→PathOptimizer)
  - SC5: Parallel execution efficiency verification
  - SC6: Skill registry integration (capabilities, versions, health)
  - Cross-capability: Plan+Experience+Pitfall, Review+Experience, RAG+Workflow
- All 2472 tests passing (9 integration + 2463 unit)
2026-06-10 01:38:28 +08:00
chiguyong 31bd3b126c feat(phase8): chat adaptive enhancements, pipeline reflection, search tools upgrade
- Enhanced chat CLI with adaptive mode and session management
- Added pipeline reflection and schema extensions
- Upgraded BaiduSearch and WebSearch tools with advanced capabilities
- Expanded server routes for skills and chat
- Added session store enhancements
- New chat module and pipeline reflection support
2026-06-09 23:18:06 +08:00
chiguyong bad66445ff feat(compression): U6 GEO Pipeline compression integration tests and config
Add GEO Pipeline end-to-end compression integration tests with
MockHeadroomCompressor. Add compression configuration section to
llm_config.yaml with headroom and summary mode examples.
2026-06-07 18:20:41 +08:00
chiguyong 2e547e345a feat(geo): U4 GEO skill tool binding with BaiduSearch and E2E tests
Add BaiduSearchTool (API mode + scraping fallback), bind tools to
GEO skill YAML configs (baidu_search, web_crawl, schema_extract,
schema_generate), extend geo_full_pipeline with generate_content
and deai steps, add 36 E2E integration tests.
2026-06-07 17:25:37 +08:00
chiguyong 1390bd8d6e feat(skills): U5 GEO Pipeline orchestration with DAG execution
- GEOPipeline: YAML-driven DAG pipeline with parallel/sequential execution
- PipelineStep with input_mapping ($.input.xxx, $.steps.name.output.xxx)
- Topological sort for execution groups, SharedWorkspace integration
- geo_full_pipeline.yaml: detect→analyze→optimize→track workflow
- 10 tests passing
2026-06-06 22:34:24 +08:00
chiguyong f858d279f3 feat(agentkit): Phase 3 upgrade - persistence, memory, evolution, observability
10 Implementation Units across 3 phases:

Phase A - Infrastructure:
- U1: RedisTaskStore with Redis/memory backend + factory function
- U2: TraceRecorder for execution trace recording
- U3: PersistentEvolutionStore with SQLite backend

Phase B - Core Capabilities:
- U4: MemoryRetriever integration into ReAct engine
- U5: Embedder abstraction + EpisodicMemory vector search
- U6: LLMReflector for LLM-in-the-loop reflection
- U7: SkillPipeline for multi-skill orchestration

Phase C - Enhancement:
- U8: SKILL.md format + progressive disclosure levels
- U9: ContextCompressor + prompt cache rendering
- U10: Structured logging + metrics endpoint + enhanced health check

Tests: 924 passed, 18 skipped, 0 failed
2026-06-06 17:17:45 +08:00
chiguyong 2844eeb548 feat(streaming): Phase C - LLM streaming + ReAct events + SSE endpoint
U8: StreamChunk protocol + OpenAI chat_stream + Gateway streaming with usage tracking
U9: ReActEvent dataclass + execute_stream() yielding thinking/tool_call/tool_result/final_answer
U10: POST /tasks/stream SSE endpoint + Client SDK stream_task()

15 new tests passing, no regression.
2026-06-06 11:54:17 +08:00
chiguyong 669ca604e5 feat(configs): add GEO AgentKit Server configuration
- llm_config.yaml: DeepSeek + OpenAI-compatible providers with env var substitution
- skills/ (8 YAML): citation_detector, content_generator, deai_agent, geo_optimizer,
  monitor, schema_advisor, competitor_analyzer, trend_agent
  - Added intent fields for content_generator, competitor_analyzer, trend_agent
  - Added quality_gate fields for content_generator, deai_agent, geo_optimizer
  - Updated custom_handler paths to configs.geo_handlers
- geo_tools.py: 14 FunctionTools calling GEO Backend via HTTP
- geo_handlers.py: 3 custom handlers (citation/monitor/schema) calling /internal/ API
- geo_server.py: FastAPI factory with LLM Gateway, Tool Registry, Skill Registry
2026-06-05 23:25:14 +08:00