fischer-agentkit/AGENTS.md

# Fischer AgentKit — Project Context

## Rules

- Python >= 3.11, type hints required, `pydantic>=2.0` for all data models
- Ruff for lint + format: `ruff check src/ && ruff format src/` (target py311, line-length 100)
- Tests: `pytest` (asyncio_mode=auto), markers: `integration`, `redis`, `postgres`
- Never use the `Any` type (`any` in TypeScript): use proper Pydantic models in Python, or `unknown` in TypeScript
- API key comparison must use `hmac.compare_digest` (constant-time)
- Expert names validated with `_EXPERT_NAME_RE = re.compile(r"^[a-zA-Z0-9_-]{1,64}$")`
- HandoffTransport queues are bounded (`maxsize=1024`); close uses the sentinel pattern
- Frontend: Vue 3 + TypeScript + Ant Design Vue, Pinia stores, no `require()` calls
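
The two security-related rules above (constant-time key comparison and expert name validation) can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names `is_valid_api_key` and `is_valid_expert_name` are hypothetical, and only the regex and the use of `hmac.compare_digest` come from the rules.

```python
import hmac
import re

# Pattern copied from the rule above.
_EXPERT_NAME_RE = re.compile(r"^[a-zA-Z0-9_-]{1,64}$")


def is_valid_api_key(presented: str, expected: str) -> bool:
    """Compare keys in constant time to avoid timing side channels.

    Hypothetical helper name; only the use of hmac.compare_digest is from the rules.
    """
    # Encode to bytes: compare_digest raises TypeError on non-ASCII str input.
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))


def is_valid_expert_name(name: str) -> bool:
    """Accept 1-64 characters drawn from letters, digits, '_' and '-'."""
    # fullmatch also rejects a trailing newline, which `$` alone would tolerate.
    return _EXPERT_NAME_RE.fullmatch(name) is not None
```

Using `fullmatch` rather than `match` is a deliberate choice in this sketch: with `match`, a name such as `"abc\n"` would pass because `$` matches just before a trailing newline.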

## Tech Stack

- **Backend**: Python 3.11+, FastAPI, Uvicorn, Pydantic v2, SQLAlchemy 2 (async)
- **Frontend**: Vue 3, TypeScript, Vite 5, Ant Design Vue 4, Pinia, Vue Router 4
- **Desktop**: Tauri 2.x (Rust shell + Python sidecar)
- **Infra**: Redis (bus/cache/state), PostgreSQL + pgvector (episodic memory)
- **CLI**: Typer + Rich
- **Exact versions**: see `pyproject.toml` (Python), `package.json` (Node)

## Commands

```bash
# Backend
pip install -e ".[dev]"                # Install with dev deps
agentkit gui --port 8002               # Web GUI (frontend + API)
agentkit serve --port 8001             # API-only server
agentkit chat                          # CLI interactive chat
agentkit init                          # Generate agentkit.yaml
agentkit version / doctor / usage      # Utility commands
agentkit task submit/status/list/cancel # Task management
agentkit skill list/load/info          # Skill management
agentkit pair --name X                 # Generate API key for external system
pytest                                 # Run all tests
pytest -m "not integration"            # Unit tests only
ruff check src/ && ruff format src/    # Lint + format

# Frontend
cd src/agentkit/server/frontend
npm install                            # Install deps
npm run dev                            # Vite dev server (proxy /api -> :8000)
npm run build:frontend                 # Production build -> ../static
npm run typecheck                      # TypeScript check

# Desktop
cd src/agentkit/server/frontend
npm run tauri dev                      # Tauri dev mode
npm run tauri build                    # Tauri production build

# Docker
docker-compose up -d                   # AgentKit + Redis + PostgreSQL
```

## Architecture

### Request Flow

```
User Input -> CostAwareRouter (3-layer)
  Layer 0: RegexRules (~0ms, 0 tokens) -> DIRECT_CHAT
  Layer 1: HeuristicClassifier (~0ms) / LLM quick_classify (~500ms, ~100 tokens)
  Layer 1.5: SemanticRouter (vector similarity, optional)
  Layer 2: Capability matching / Vickrey Auction
  -> ExecutionMode: DIRECT_CHAT / REACT / SKILL_REACT / TEAM_COLLAB
```
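
The cascade above can be read as "try the cheapest layer first and stop at the first decision". The sketch below illustrates that idea only. The layer functions, the greeting regex, and the default mode are assumptions made for the example; the real `CostAwareRouter` interface is not described in this document.

```python
import re
from enum import Enum
from typing import Callable, Optional


class ExecutionMode(Enum):
    DIRECT_CHAT = "direct_chat"
    REACT = "react"
    SKILL_REACT = "skill_react"
    TEAM_COLLAB = "team_collab"


# A layer returns a mode when it can decide, or None to defer to the next layer.
Layer = Callable[[str], Optional[ExecutionMode]]

# Hypothetical Layer 0 rule: greetings never need tools.
_GREETING_RE = re.compile(r"^\s*(hi|hello|hey|thanks)\b", re.IGNORECASE)


def regex_rules(text: str) -> Optional[ExecutionMode]:
    return ExecutionMode.DIRECT_CHAT if _GREETING_RE.search(text) else None


def heuristic_classifier(text: str) -> Optional[ExecutionMode]:
    # Hypothetical heuristic: the @team prefix asks for expert collaboration.
    if text.lstrip().startswith("@team"):
        return ExecutionMode.TEAM_COLLAB
    return None


def capability_matching(text: str) -> Optional[ExecutionMode]:
    # Placeholder for the most expensive layer; in this sketch it always decides.
    return ExecutionMode.REACT


def route(text: str, layers: list[Layer]) -> ExecutionMode:
    """Try layers from cheapest to most expensive; the first decision wins."""
    for layer in layers:
        mode = layer(text)
        if mode is not None:
            return mode
    return ExecutionMode.DIRECT_CHAT  # Assumed default if every layer defers.


LAYERS: list[Layer] = [regex_rules, heuristic_classifier, capability_matching]
```

For example, `route("hello there", LAYERS)` is decided by the regex layer without touching the later, more expensive layers.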

### Agent Hierarchy

```
BaseAgent (core/base.py) — abstract, execute() is final
  +-- ConfigDrivenAgent (core/config_driven.py) — YAML-driven, 3 task modes
  +-- ReActEngine (core/react.py) — Think->Act->Observe
  +-- ReflexionAgent (core/reflexion.py) — reflection-driven
  +-- ReWOOAgent (core/rewoo.py) — plan-without-observation
  +-- StandaloneAgent (core/standalone.py) — standalone runner
```
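
"Abstract, `execute()` is final" describes a template-method design: the base class fixes the outer lifecycle and subclasses supply only the inner step. The sketch below shows the shape of that pattern. The hook names (`_before`, `_run`, `_after`) and `EchoAgent` are hypothetical, and the example is synchronous for brevity, whereas the real agents may well be async.

```python
from abc import ABC, abstractmethod
from typing import final


class BaseAgent(ABC):
    """Template-method base: execute() fixes the lifecycle, subclasses fill in _run()."""

    @final  # Signals to type checkers that subclasses must not override execute().
    def execute(self, task: str) -> str:
        # Hypothetical lifecycle hooks; the real hooks are not given in this document.
        prepared = self._before(task)
        result = self._run(prepared)
        return self._after(result)

    def _before(self, task: str) -> str:
        return task.strip()

    @abstractmethod
    def _run(self, task: str) -> str:
        """Subclass-specific reasoning loop."""

    def _after(self, result: str) -> str:
        return result


class EchoAgent(BaseAgent):
    """Toy subclass standing in for ReActEngine, ReWOOAgent, and so on."""

    def _run(self, task: str) -> str:
        return f"echo: {task}"
```

Note that `typing.final` is enforced by static checkers such as mypy or pyright, not at runtime.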

### Expert Team Mode

```
ExpertConfig (extends AgentConfig) -> Expert (wraps ConfigDrivenAgent via AgentPool)
ExpertTeam: manages experts, shared workspace, collaboration plan
TeamOrchestrator: executes plan (serial/parallel/competitive + merge)
CollaborationPlan: phases with dependencies, parallel types, merge strategies
ExpertTeamRouter: @team prefix routing, name validation, MAX_EXPERTS=10
HandoffTransport: InProcess (asyncio.Queue) + Redis Pub/Sub
```
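
The in-process transport combines two ideas named in this file: a bounded `asyncio.Queue` and a sentinel-based close. A minimal sketch of how those fit together is shown below. `InProcessTransport` and its method names are assumptions for illustration and may not match the real `HandoffTransport` API.

```python
import asyncio

_CLOSE = object()  # Sentinel: a unique marker that can never collide with a payload.


class InProcessTransport:
    """Hypothetical in-process transport built on a bounded asyncio.Queue."""

    def __init__(self, maxsize: int = 1024) -> None:
        self._queue: asyncio.Queue[object] = asyncio.Queue(maxsize=maxsize)
        self._closed = False

    async def send(self, message: object) -> None:
        if self._closed:
            raise RuntimeError("transport is closed")
        await self._queue.put(message)  # Waits when the queue is full (backpressure).

    async def close(self) -> None:
        if not self._closed:
            self._closed = True
            await self._queue.put(_CLOSE)  # Wake the consumer so it can exit cleanly.

    async def receive_all(self) -> list[object]:
        """Drain messages until the close sentinel arrives."""
        received: list[object] = []
        while True:
            item = await self._queue.get()
            if item is _CLOSE:
                return received
            received.append(item)


async def demo() -> list[object]:
    transport = InProcessTransport(maxsize=4)
    await transport.send("plan")
    await transport.send("result")
    await transport.close()
    return await transport.receive_all()
```

The bound keeps a slow consumer from letting memory grow without limit, and the sentinel lets a consumer blocked on `get()` terminate without polling a flag.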

Lifecycle: FORMING -> PLANNING -> EXECUTING -> SYNTHESIZING -> COMPLETED

On failure, the team falls back to single-agent mode (the lead, or the first active expert).
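
The lifecycle and its failure fallback can be sketched as an ordered enum plus a guarded call. This is an illustration under assumptions: `TeamState`, `next_state`, and `run_with_fallback` are hypothetical names, and the two callables stand in for the team run and the single-agent run.

```python
from enum import Enum
from typing import Callable


class TeamState(Enum):
    FORMING = "forming"
    PLANNING = "planning"
    EXECUTING = "executing"
    SYNTHESIZING = "synthesizing"
    COMPLETED = "completed"


# Enum members iterate in definition order, which matches the lifecycle above.
_ORDER = list(TeamState)


def next_state(state: TeamState) -> TeamState:
    """Advance one step; COMPLETED is terminal."""
    index = _ORDER.index(state)
    return _ORDER[min(index + 1, len(_ORDER) - 1)]


def run_with_fallback(
    team_run: Callable[[str], str],
    single_agent_run: Callable[[str], str],
    task: str,
) -> tuple[str, bool]:
    """Return (answer, used_fallback). Both callables are hypothetical stand-ins."""
    try:
        return team_run(task), False
    except Exception:
        # Team failed: hand the task to the lead (or first active) expert alone.
        return single_agent_run(task), True
```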

### Module Map

| Layer | Modules | Purpose |
|-------|---------|---------|
| API | `server/`, `cli/` | FastAPI routes + Typer CLI |
| Service | `core/`, `chat/`, `skills/`, `experts/` | Agent engine, routing, skills, expert teams |
| Data | `memory/`, `session/`, `bus/` | Persistence, sessions, messaging |
| Utility | `llm/`, `tools/`, `evolution/`, `quality/`, `mcp/` | LLM gateway, tools, self-evolution, quality, MCP |

### Key Subsystems

- **LLM Gateway** (`llm/`): 6 providers (OpenAI/Anthropic/Gemini/Doubao/Wenxin/Yuanbao), fallback, semantic cache, usage tracking
- **Memory** (`memory/`): 4-layer (SOUL/USER/MEMORY/DAILY), WorkingMemory (Redis), EpisodicMemory (PG+pgvector), SemanticMemory (HTTP RAG)
- **Evolution** (`evolution/`): Reflector, PromptOptimizer (genetic), PitfallDetector, ABTester
- **Tools** (`tools/`): 21 built-in + MCP extension, composition (SequentialChain/ParallelFanOut/DynamicSelector)
- **Pipeline** (`orchestrator/`): PipelineEngine, SagaOrchestrator, DynamicPipeline, HandoffManager
- **Bus** (`bus/`): MemoryBus (in-process), RedisBus (distributed)
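
To make the tool composition names above more concrete, here is one plausible reading of `SequentialChain` and `ParallelFanOut` as plain async functions. The semantics are assumed from the names alone. The `Tool` type alias and the toy tools `upper` and `exclaim` are hypothetical and exist only for this example.

```python
import asyncio
from typing import Awaitable, Callable

# Assumed tool shape for this sketch: an async function from text to text.
Tool = Callable[[str], Awaitable[str]]


async def sequential_chain(tools: list[Tool], value: str) -> str:
    """Feed each tool's output into the next one."""
    for tool in tools:
        value = await tool(value)
    return value


async def parallel_fan_out(tools: list[Tool], value: str) -> list[str]:
    """Run every tool on the same input concurrently; results keep input order."""
    return list(await asyncio.gather(*(tool(value) for tool in tools)))


# Hypothetical toy tools used only for the demonstration.
async def upper(text: str) -> str:
    return text.upper()


async def exclaim(text: str) -> str:
    return text + "!"
```

`asyncio.gather` returns results in the order the awaitables were passed, so the fan-out output can be matched back to its tools by index.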

### Server Routes (17 modules)

| Prefix | Module | Purpose |
|--------|--------|---------|
| `/api/v1/agents` | agents.py | Agent CRUD |
| `/api/v1/tasks` | tasks.py | Task submit/query/cancel |
| `/api/v1/skills` | skills.py | Skill register/list |
| `/api/v1/chat` | chat.py | Chat REST + WebSocket |
| `/api/v1/ws` | ws.py | WebSocket channel |
| `/api/v1/llm` | llm.py | LLM usage |
| `/api/v1/health` | health.py | Health check |
| `/api/v1/metrics` | metrics.py | Metrics |
| `/api/v1/evolution` | evolution.py + evolution_dashboard.py | Self-evolution API |
| `/api/v1/memory` | memory.py | Memory management |
| `/api/v1/portal` | portal.py | Portal |
| `/api/v1/kb` | kb_management.py | Knowledge base |
| `/api/v1/skill-mgmt` | skill_management.py | Skill management |
| `/api/v1/workflows` | workflows.py | Workflows |
| `/api/v1/terminal` | terminal.py | Terminal |
| `/api/v1/settings` | settings.py | Settings |

### WebSocket Chat Protocol

- Client -> Server: `message`, `reply`, `confirmation_reply`, `cancel`, `ping`
- Server -> Client: `connected`, `token`, `thinking`, `step`, `final_answer`, `skill_match`, `confirmation_request`, `confirmation_result`, `ask_human`, `error`, `pong`
- Expert Team events: `team_formed`, `expert_step`, `expert_result`, `plan_update`, `team_synthesis`, `team_dissolved`
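
A server-side guard for the message types listed above might look like the sketch below. The type names come from this section. The JSON envelope with a `"type"` field and the helper names `parse_client_frame` and `reply_to` are assumptions, since the wire format is not specified here.

```python
import json
from typing import Optional

CLIENT_TYPES = frozenset({"message", "reply", "confirmation_reply", "cancel", "ping"})
SERVER_TYPES = frozenset({
    "connected", "token", "thinking", "step", "final_answer", "skill_match",
    "confirmation_request", "confirmation_result", "ask_human", "error", "pong",
})
TEAM_EVENT_TYPES = frozenset({
    "team_formed", "expert_step", "expert_result",
    "plan_update", "team_synthesis", "team_dissolved",
})


def parse_client_frame(raw: str) -> dict[str, object]:
    """Decode a client frame and reject unknown message types.

    Assumes a JSON object with a "type" field; the real envelope may differ.
    """
    frame = json.loads(raw)
    kind = frame.get("type") if isinstance(frame, dict) else None
    if not isinstance(kind, str) or kind not in CLIENT_TYPES:
        raise ValueError(f"unsupported client frame: {raw!r}")
    return frame


def reply_to(frame: dict[str, object]) -> Optional[dict[str, object]]:
    """Answer keep-alive pings; other types would be dispatched elsewhere."""
    if frame["type"] == "ping":
        return {"type": "pong"}
    return None
```

Rejecting server-only types such as `token` at the parsing step keeps a misbehaving client from injecting frames that only the server should emit.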

### Frontend Pages

- `/agent/chat` — Chat with Expert Team view
- `/agent/code` — Code/workflow
- `/agent/monitor` — Evolution dashboard
- `/computer-use` — Desktop control

### Configuration Priority

CLI args > `agentkit.yaml` > env vars (`${VAR:-default}`) > `.env` > hardcoded defaults

Config search: `--config` path > `./agentkit.yaml` > `~/.agentkit/agentkit.yaml`
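
The precedence chain and the `${VAR:-default}` syntax can be sketched with two small helpers. This is an illustration, not the project's loader: `expand_env` and `resolve` are hypothetical names, `REDIS_HOST` in the usage note is a made-up variable, and the shell-style rule "unset or empty falls back to the default" is an assumption about how the syntax is interpreted.

```python
import re
from typing import Mapping

# Matches ${NAME} and ${NAME:-default}.
_VAR_RE = re.compile(r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::-(?P<default>[^}]*))?\}")


def expand_env(value: str, env: Mapping[str, str]) -> str:
    """Substitute placeholders from env; unset or empty values use the default."""

    def _sub(match: re.Match[str]) -> str:
        current = env.get(match["name"], "")
        return current if current else (match["default"] or "")

    return _VAR_RE.sub(_sub, value)


def resolve(*layers: Mapping[str, object]) -> dict[str, object]:
    """Merge layers passed from highest to lowest priority."""
    merged: dict[str, object] = {}
    for layer in reversed(layers):  # Apply lowest priority first so higher ones win.
        merged.update(layer)
    return merged
```

With these helpers, `resolve(cli_args, yaml_values, env_values, dotenv_values, defaults)` would mirror the order listed above, and `expand_env("redis://${REDIS_HOST:-localhost}:6379", os.environ)` would fill in a placeholder found inside a config value.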

## Conventions

- Skill configs: `configs/skills/*.yaml` (15 presets)
- LLM configs: `agentkit.yaml` llm section (unified with server config)
- Pipeline configs: `configs/pipelines/*.yaml`
- Expert templates: registered via `ExpertTemplateRegistry`
- All Pydantic models use `model_config = ConfigDict(...)`, not the legacy `class Config`
- Test files: `tests/unit/` and `tests/integration/`
- Frontend stores: Pinia, one per domain (chat, team, settings)
- Frontend components: `src/agentkit/server/frontend/src/components/`

## Boundaries

- Never modify `pyproject.toml` version without explicit request
- Never push to main directly — use feature branches
- Integration tests require Docker (Redis + PostgreSQL)
- Desktop builds require Rust toolchain + PyInstaller