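diff --git a/docs/brainstorms/2026-06-20-centralized-auth-token-persistence-requirements.md b/docs/brainstorms/2026-06-20-centralized-auth-token-persistence-requirements.md index dc886da..bbc037d 100644 --- a/docs/brainstorms/2026-06-20-centralized-auth-token-persistence-requirements.md +++ b/docs/brainstorms/2026-06-20-centralized-auth-token-persistence-requirements.md @@ -1,10 +1,10 @@ # Fischer AgentKit — Centralized Auth and Token Persistence (Requirements) **Date:** 2026-06-20 -**Branch:** `feat/centralized-auth-token-persistence` -**Status:** Draft -**Scope:** Server-issued JWT + secure client-side persistence + server-side session table + refresh token rotation + "Remember me" + startup state differentiation -**Out of scope:** Enterprise IdP integration (OIDC / SAML / LDAP), multi-tenancy, password strength policy, 2FA, SSO redirect +**Branch:** `feat/auth-server-token-persistence` (formerly `feat/centralized-auth-token-persistence`) +**Status:** Active (AuthProvider abstraction layer scope merged in; updated 2026-06-20) +**Scope:** Server-issued JWT + secure client-side persistence + server-side session table + refresh token rotation + "Remember me" + startup state differentiation + **AuthProvider abstraction layer (an extension point for future integration with the group IdP)** +**Out of scope:** Implementing concrete enterprise IdP adapters (OIDC / SAML / LDAP / Feishu / DingTalk / WeCom), multi-tenancy, password strength policy, 2FA, SSO redirect --- @@ -42,15 +42,19 @@ - Supports **"Remember me"** (refresh 7d / 30d) - Client-side access token **pre-refresh** (proactive refresh when <2min remain) - Distinguishes three startup states: `no_token` / `token_invalid` / `token_valid` +- **Pluggable** auth backend (AuthProvider abstraction): Local today; future OIDC / SAML / LDAP support only requires a new adapter +- Admin endpoints **decoupled from the auth backend** (all operations go through user_id), so a future IdP switch does not affect session management +- Audit log records the login source (`auth_provider` field), so sessions stay traceable when the group takes over ### 1.3 Design philosophy -> **Server authority + minimal client trust + encrypted at rest + observable and governable** +> **Server authority + minimal client trust + encrypted at rest + observable and governable + pluggable auth backend** - **Server authority**: all token validation, renewal, and revocation is decided by the server; the client cannot bypass it - **Minimal client trust**: the client stores no password, the access token is not persisted (memory only), and the refresh token goes into the Keychain - **Encrypted at rest**: the refresh token uses OS-level encryption (Tauri: macOS Keychain / Windows Credential Manager / Linux Secret Service) - **Observable and governable**: admins can view / kick any session, laying the infrastructure for centralized group management +- **Pluggable auth backend**: all user authentication logic goes through the `AuthProvider` Protocol (authenticate / get_user / sync_attributes / revoke_user); today `LocalAuthProvider` wraps SQLite + bcrypt; when a future `OidcAuthProvider` takes over, the route layer, admin API, and session table need no rewrite --- @@ 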
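-72,6 +76,9 @@ | F10 | Client proactively refreshes when the access token has <2min left | Does not rely on a 401 trigger | | F11 | Startup distinguishes `no_token` / `token_invalid` / `token_valid` | The error state shows a "Retry" button instead of clearing state outright | | F12 | Multiple Tauri / Web clients can log in to the same account simultaneously | No interference; independent sessions | +| F13 | Pluggable auth backend (AuthProvider abstraction) | Switching config `local` ↔ `oidc-stub` requires zero changes to routes / admin / session table | +| F14 | Admin endpoints decoupled from the auth backend | After a future IdP switch, admin session listing / kick features are unchanged | +| F15 | Audit log records the `auth_provider` field | Login source is traceable (local / oidc / saml) | ### 2.2 Non-Functional Goals @@ -88,7 +95,7 @@ ## 3. Non-Goals -- ❌ **Enterprise IdP integration** (OIDC / SAML / LDAP / Feishu / DingTalk / WeCom) — separate brainstorm in the next iteration +- ❌ **Implementing concrete enterprise IdP adapters** (OIDC / SAML / LDAP / Feishu / DingTalk / WeCom) — separate brainstorm in the next iteration; this iteration only reserves the AuthProvider abstraction layer - ❌ **Multi-tenancy / multi-organization isolation within the group** — the current single-tenant architecture stays as is - ❌ **Password strength policy / password expiry / password history** — separate IAM overhaul - ❌ **2FA / TOTP / WebAuthn / Passkey** — separate brainstorm @@ -155,6 +162,7 @@ | `revoked` | INTEGER (bool) | — | Whether the session was kicked | | `revoked_reason` | TEXT NULL | — | `user_terminated` / `password_changed` / `admin_revoked` / `reuse_detected` | | `previous_session_id` | TEXT NULL | — | Previous hop in the refresh rotation (for audit) | +| `auth_provider` | TEXT | — | Login source: `local` / `oidc-stub` / `saml` (future extension) | #### 5.1.2 JWT payload extension @@ -320,6 +328,189 @@ apiClient.interceptors.request.use(async (config) => { - Client version check: requests carry an `X-Client-Version` header alongside the `Authorization` header - The server supports old clients for 1 minor version (~30-day gradual rollout) +### 5.5 Pluggable auth backend: AuthProvider abstraction layer + +> **Design motivation**: Password verification currently uses the local users table + bcrypt. When the group later integrates OIDC / SAML / LDAP, the route layer, admin API, and session table should not need a rewrite. The `AuthProvider` Protocol encapsulates "where users are stored / how passwords are verified / how attributes are synced" inside the adapter. + +#### 5.5.1 AuthProvider Protocol + +```python +# auth/providers/base.py +from typing import Protocol +from ..models import User + +class AuthProvider(Protocol): + """Capabilities every auth backend must implement. + + The route layer calls only the methods below and is unaware of whether the implementation is SQLite / OIDC / LDAP. + """ + + name: str # identifies the current provider; written to session.auth_provider + + async def authenticate(self, *, username: str, password: str) -> User: + """Verify username + password and return a User object. Raises InvalidCredentials on failure.""" + ... 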
+ + async def get_user_by_id(self, user_id: int) -> User | None: + """Look up a user by id (used by admin endpoints, session validation, and whoami).""" + ... + + async def sync_user_attributes(self, user_id: int) -> None: + """Sync user attributes (department / email / job title, etc.). + + LocalAuthProvider: no-op + OidcAuthProvider: pull the latest profile from the IdP and write it back to the local users table + """ + ... + + async def revoke_user(self, user_id: int) -> None: + """Disable a user (offboarding / lockout scenarios). + + LocalAuthProvider: UPDATE users SET is_active=0 + OidcAuthProvider: call the IdP's disable API (future) + """ + ... +``` + +#### 5.5.2 Default implementation: LocalAuthProvider + +```python +# auth/providers/local.py +class LocalAuthProvider: + name = "local" + + def __init__(self, db: aiosqlite.Connection): + self._db = db + + async def authenticate(self, *, username: str, password: str) -> User: + # Wraps the existing password verification logic at routes/auth.py:201-213 + cursor = await self._db.execute( + "SELECT id, username, password_hash, is_active FROM users WHERE username = ?", + (username,), + ) + row = await cursor.fetchone() + if not row or not row["is_active"]: + raise InvalidCredentials("user not found or inactive") + if not verify_password(password, row["password_hash"]): + raise InvalidCredentials("invalid password") + return await load_user(row["id"]) + + async def get_user_by_id(self, user_id: int) -> User | None: + return await load_user(user_id) + + async def sync_user_attributes(self, user_id: int) -> None: + return # local provider: no-op + + async def revoke_user(self, user_id: int) -> None: + await self._db.execute( + "UPDATE users SET is_active = 0 WHERE id = ?", (user_id,) + ) + await self._db.commit() +``` + +#### 5.5.3 Placeholder implementation: StubOIDCProvider + +```python +# auth/providers/oidc_stub.py +class StubOIDCProvider: + """Interface placeholder for the OIDC integration. + + At this stage only the interface contract is defined; there is no actual IdP communication. When implementing it in the next iteration, + override authenticate / sync_user_attributes / revoke_user, + with zero changes to the route layer, admin API, or session table. + """ + + name = "oidc-stub" + + async def authenticate(self, *, username: str, password: str) -> User: + raise NotImplementedError( + "OIDC provider not implemented. 
" + "Use 'local' provider in agentkit.yaml: auth.provider: local" + ) + + async def get_user_by_id(self, user_id: int) -> User | None: + raise NotImplementedError + + async def sync_user_attributes(self, user_id: int) -> None: + raise NotImplementedError + + async def revoke_user(self, user_id: int) -> None: + raise NotImplementedError +``` + +#### 5.5.4 Provider switching configuration + +```yaml +# agentkit.yaml +auth: + provider: local # local | oidc-stub (future: oidc-keycloak, oidc-feishu, ...) + session: + table: auth_sessions + access_ttl_seconds: 900 + refresh_ttl_seconds: 604800 + refresh_ttl_remember_me_seconds: 2592000 + jwt: + secret_env: AGENTKIT_JWT_SECRET + algorithm: HS256 +``` + +#### 5.5.5 Route-layer dependency injection + +```python +# auth/providers/__init__.py +from ..config import get_settings +from .base import AuthProvider +from .local import LocalAuthProvider +from .oidc_stub import StubOIDCProvider + +# Async dependency: FastAPI awaits it on each request. +# functools.lru_cache cannot wrap it (it would cache the coroutine object, not the provider). +async def get_auth_provider() -> AuthProvider: + settings = get_settings() + if settings.auth.provider == "local": + db = await get_auth_db() # existing aiosqlite connection + return LocalAuthProvider(db) + elif settings.auth.provider == "oidc-stub": + return StubOIDCProvider() + else: + raise ValueError(f"unknown auth provider: {settings.auth.provider}") +``` + +```python +# routes/auth.py changes +from fastapi import Depends +from ..auth.providers import get_auth_provider, AuthProvider + +@router.post("/login") +async def login( + body: LoginRequest, + provider: AuthProvider = Depends(get_auth_provider), +) -> LoginResponse: + user = await provider.authenticate(username=body.username, password=body.password) + # ... 
the subsequent session-creation logic is unchanged +``` + +#### 5.5.6 Decoupling admin endpoints from the provider + +All admin APIs (`/admin/users/{id}/sessions`, etc.) operate on `user_id` and **never call provider.authenticate directly**. This means: + +- After a future switch to OIDC, the admin kick and session-list features stay unchanged +- LocalAuthProvider.revoke_user and OidcAuthProvider.revoke_user are implemented differently, but admin endpoints uniformly call `provider.revoke_user(user_id)` +- The audit log records `auth_provider`, so after an IdP switch it is possible to trace which sessions were created locally and which by the IdP + +#### 5.5.7 Future IdP integration checklist (reference for the next iteration) + +When implementing OIDC in the next iteration, work through this checklist: + +- [ ] `auth/providers/oidc.py` — implement OidcAuthProvider (authenticate / get_user / sync_attributes / revoke_user) +- [ ] `auth/oauth_routes.py` — `/auth/oauth/{provider}/redirect` and `/auth/oauth/{provider}/callback` endpoints +- [ ] `auth/state_cache.py` — OAuth state parameter for CSRF protection (Redis TTL 5min) +- [ ] "Local account creation" policy for a user's first login from the IdP (just-in-time (JIT) provisioning / reject / invitation-only) +- [ ] IdP-side session sync (when the IdP logs the user out, the local session is revoked too) +- [ ] Mapping of group department / job title attributes onto the local users table + +This iteration only provides placeholders for items 1-3 (interface + stub implementation); the rest is deferred to a separate brainstorm in the next iteration. + --- ## 6. Non-Functional Requirements @@ -375,8 +566,12 @@ apiClient.interceptors.request.use(async (config) => { - `src/agentkit/server/auth/session.py` — session CRUD + rotation logic - `src/agentkit/server/auth/jwt_utils.py` — extend the JWT payload (add sid / jti) - `src/agentkit/server/auth/keychain_audit.py` — refresh reuse detection -- `src/agentkit/server/routes/auth.py` — modify login/refresh/logout; add whoami/sessions/change-password -- `src/agentkit/server/routes/admin.py` (or inside auth) — admin session management +- `src/agentkit/server/auth/providers/base.py` — `AuthProvider` Protocol interface contract +- `src/agentkit/server/auth/providers/local.py` — `LocalAuthProvider` default implementation (wraps SQLite + bcrypt) +- `src/agentkit/server/auth/providers/oidc_stub.py` — `StubOIDCProvider` placeholder implementation (NotImplementedError + docs) +- `src/agentkit/server/auth/providers/__init__.py` — `get_auth_provider()` DI factory +- `src/agentkit/server/routes/auth.py` — modify login/refresh/logout; add whoami/sessions/change-password; inject the provider via `Depends(get_auth_provider)` +- `src/agentkit/server/routes/admin.py` (or inside auth) — admin session management (
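keyed by user_id, decoupled from the provider) - `migrations/versions/xxx_add_auth_sessions.py` — Alembic migration - `src/agentkit/server/auth/cache.py` — Redis cache for session metadata @@ -460,12 +655,14 @@ apiClient.interceptors.request.use(async (config) => { ### 8.1 Assumptions -- **A1**: Users currently have no IdP integration needs; centralized group management is covered by this system's built-in session management + admin endpoints +- **A1**: This iteration only reserves the AuthProvider abstraction layer (interface + Local + OIDC stub) and does not implement concrete IdP adapters; group integration requirements get a separate brainstorm in the next iteration - **A2**: The single-tenant architecture stays unchanged (`auth.db` is globally shared) - **A3**: Tauri supports desktop platforms only (macOS / Windows / Linux); mobile is not planned - **A4**: Falling back to localStorage when Keychain fails is acceptable (dev environments / Linux without a keyring daemon) - **A5**: The Web client serves only development / fallback purposes; localStorage encryption is not pursued - **A6**: Gradual-rollout window for old clients ≤ 1 minor version (about 30 days) +- **A7**: The Local AuthProvider implementation keeps bcrypt cost = 12 (already implemented); after a future IdP takeover, Local can still serve as a "local emergency account" +- **A8**: Admin API authorization goes through the existing RBAC (independent of the provider); admin role definitions stay unchanged after a future IdP switch ### 8.2 Open questions (confirm before implementation) @@ -495,13 +692,14 @@ apiClient.interceptors.request.use(async (config) => { ## 10. Out of Scope (Explicit) -- Enterprise IdP / SSO (OIDC / SAML / LDAP / Feishu / DingTalk / WeCom) — next iteration -- 2FA / TOTP / WebAuthn / Passkey — later -- Multi-tenancy — later -- Password strength policy / password expiry — later -- Login-failure lockout / sliding-window rate limiting — later security hardening -- Email / SMS notifications — requires a notification service first -- Full audit log / full-text search / export — later +- ❌ **Implementing concrete enterprise IdP / SSO adapters** (OIDC / SAML / LDAP / Feishu / DingTalk / WeCom) — separate brainstorm in the next iteration; this iteration only reserves the AuthProvider abstraction layer +- ❌ OAuth / SAML redirect flows, state cache, user attribute sync, and other IdP integration details — see the 5.5.7 checklist +- ❌ 2FA / TOTP / WebAuthn / Passkey — later +- ❌ Multi-tenancy — later +- ❌ Password strength policy / password expiry — later +- ❌ Login-failure lockout / sliding-window rate limiting — later security hardening +- ❌ Email / SMS notifications — requires a notification service first +- ❌ Full audit log / full-text search / export — later --- @@ -520,3 +718,5 @@ apiClient.interceptors.request.use(async (config) => { - `keyring` crate: https://docs.rs/keyring/latest/keyring/ - OWASP JWT security cheat sheet: https://cheatsheetseries.owasp.org/cheatsheets/JSON_Web_Token_for_Java_Cheat_Sheet.html - Auth0 Refresh Token Rotation: https://auth0.com/docs/secure/tokens/refresh-tokens/refresh-token-rotation + - OIDC Core 1.0 (reference for future integration): https://openid.net/specs/openid-connect-core-1_0.html + - OAuth 2.0 Authorization 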
Framework (reference for future integration): https://www.rfc-editor.org/rfc/rfc6749 diff --git a/docs/plans/2026-06-20-002-feat-centralized-auth-token-persistence-plan.md b/docs/plans/2026-06-20-002-feat-centralized-auth-token-persistence-plan.md index 62dacfa..9d16ca2 100644 --- a/docs/plans/2026-06-20-002-feat-centralized-auth-token-persistence-plan.md +++ b/docs/plans/2026-06-20-002-feat-centralized-auth-token-persistence-plan.md @@ -2,15 +2,17 @@ **Date:** 2026-06-20 **Status:** active -**Branch:** `feat/centralized-auth-token-persistence` +**Branch:** `feat/auth-server-token-persistence` (formerly `feat/centralized-auth-token-persistence`) **Type:** feat **Origin:** [docs/brainstorms/2026-06-20-centralized-auth-token-persistence-requirements.md](docs/brainstorms/2026-06-20-centralized-auth-token-persistence-requirements.md) +> **2026-06-20 update**: merged the AuthProvider abstraction layer scope (origin §5.5); added KTD-10, the U1 `auth_provider` field, U3/U4 changes, the U11 implementation unit, Phase 6, and AE-10/AE-11. + --- ## Summary -Replace the current minimal JWT + localStorage auth with a production-grade scheme: server-side **session table** (track every login, enable forced revocation), **Tauri OS Keychain** storage for refresh tokens (encrypted at rest), **refresh token rotation** (defense against token leakage), **pre-emptive token refresh** (no 401 storms), and a **three-state startup** (valid / invalid / error). Goal: after first login, Tauri cold-start goes directly to the main app, no login page; admin can see/force-revoke any user's active sessions; password change instantly invalidates all other devices. 
+Replace the current minimal JWT + localStorage auth with a production-grade scheme: server-side **session table** (track every login, enable forced revocation), **Tauri OS Keychain** storage for refresh tokens (encrypted at rest), **refresh token rotation** (defense against token leakage), **pre-emptive token refresh** (no 401 storms), a **three-state startup** (valid / invalid / error), and an **AuthProvider abstraction layer** that decouples routes / admin API / session table from the concrete auth backend (Local today; OIDC / SAML / LDAP tomorrow via adapter). Goal: after first login, Tauri cold-start goes directly to the main app, no login page; admin can see/force-revoke any user's active sessions; password change instantly invalidates all other devices; future enterprise IdP integration requires no rewrite of the routing or admin layer. --- @@ -24,7 +26,7 @@ The current auth flow has three structural gaps: The user's primary stated need is "after I log in once, subsequent app opens should go straight to the main app." The current code attempts this via localStorage rehydration, but two failure modes break it: (a) refresh hits `_refreshFailed` and the auth store clears itself; (b) when the access token expires mid-session and refresh fails (server restart, network blip), the store clears and the user is bounced to `/login`. We need both stronger local persistence and server-side session awareness to make this experience reliable. -The secondary stated need is **"集团统一管理"** (centralized enterprise management). Without a session table and admin endpoints, an admin cannot: see who is logged in, force-logout a lost device, or ensure that a compromised employee is immediately removed from all devices. This groundwork is also a prerequisite for the future OIDC/SAML integration brainstorm (out of scope here, but the session table is the same data model an IdP would feed). 
+The secondary stated needs are **"集团统一管理"** (centralized enterprise management) and **"和集团的账号密码对接"** (eventual IdP integration). Without a session table and admin endpoints, an admin cannot: see who is logged in, force-logout a lost device, or ensure that a compromised employee is immediately removed from all devices. The session table is the same data model an IdP would feed. To keep the future IdP integration from requiring a routing / admin rewrite, the auth backend must be **pluggable behind an `AuthProvider` Protocol** (see KTD-10 and U11). Local today, OIDC tomorrow — without touching routes or admin code. --- @@ -46,6 +48,9 @@ The secondary stated need is **"集团统一管理"** (centralized enterprise ma - Frontend "Active sessions" management UI in `SettingsView` (list current devices, kick others) - Admin UI: see any user's active sessions, kick any session - Backwards-compat for one minor version: old clients without `sid` claim still work via `user_sessions` table fallback +- **AuthProvider abstraction layer** (`auth/providers/base.py` Protocol + `LocalAuthProvider` + `StubOIDCProvider`) — routes / admin / SessionService obtain the provider via `Depends(get_auth_provider)`; switching IdP requires no route rewrite +- The `auth_provider` column on the `auth_sessions` table records the login source (`local` / `oidc-stub` / future `oidc-keycloak` / `saml` / `ldap`) +- Config switch `auth.provider: local | oidc-stub` (agentkit.yaml); adding a provider later only requires a new adapter ### Out of Scope (deferred to follow-up work) @@ -84,6 +89,9 @@ The plan must satisfy all of the following origin IDs (see [requirements doc](do - **F10** Pre-emptive refresh when access expires in <2 min - **F11** Startup distinguishes `valid` / `invalid` / `error` - **F12** Multiple Tauri / Web clients can log in the same user simultaneously (independent sessions) +- **F13** Pluggable AuthProvider (the `auth.provider` config switches local ↔ oidc-stub with zero changes to routes / admin / session table) +- **F14** Admin endpoints decoupled from the auth backend (after a future IdP switch, admin session listing / kick features are unchanged) +- **F15** Audit log records the `auth_provider` field (login source is traceable) - **N1** Token validation P99 < 5ms (Redis cache for 
session metadata) - **N5** All auth code has unit + integration tests - **N6** Backwards-compat for old clients (1 minor version) @@ -164,6 +172,20 @@ The plan must satisfy all of the following origin IDs (see [requirements doc](do **Trade-off**: Two validation paths in `get_current_user`. Mitigated by extracting the session-lookup into a helper that both paths share. +### KTD-10: AuthProvider abstraction layer (extension point for future IdP integration) + +**Decision**: Authentication logic goes through the `auth/providers/base.py:AuthProvider` Protocol (`name` / `authenticate` / `get_user_by_id` / `sync_user_attributes` / `revoke_user`), injected into the route layer via `Depends(get_auth_provider)`. The current default is `LocalAuthProvider` (wrapping SQLite + bcrypt); when a future `OidcAuthProvider` takes over, **routes / admin / session table need zero changes**. `StubOIDCProvider` serves as a placeholder (`raise NotImplementedError`) for validating the interface contract later. + +**Rationale**: The user has stated that the system will eventually integrate with the group's account credentials (OIDC / SAML / LDAP / Feishu / DingTalk / WeCom). If "where users are stored / how passwords are verified" is hard-coded into routes/admin now, a future IdP switch would force a rewrite of every route and admin endpoint. Abstracting early means a future IdP integration only adds one adapter (~300-500 lines) without touching existing routes / admin / SessionService. The `auth_sessions` table gains an `auth_provider` column that records the login source, keeping audits traceable. + +**Trade-off**: +- Cost: one extra abstraction layer (`auth/providers/base.py` Protocol) + one DI factory (`get_auth_provider`) + one StubOIDCProvider placeholder +- Benefit: future IdP integration does not rewrite the route layer or admin API; admin kick / session listing behave consistently across providers + +**Alternatives considered**: +- ❌ Reserve no extension point and build only today's LocalAuthProvider: a future IdP switch would force a rewrite of routes + admin + SessionService +- ❌ Implement OIDC directly: would stretch this iteration to 2-3x its length + --- ## High-Level Technical Design @@ -196,23 +218,26 @@ The plan must satisfy all of the following origin IDs (see [requirements doc](do │ ▼ │ │ ┌──────────────────────────────────────────────────────────┐ │ │ │ FastAPI server (Python sidecar) │ │ -│ │ ┌──────────────────┐ ┌──────────────────────┐ │ │ -│ │ │ routes/auth.py │───▶│ auth/session.py │ │ │ -│ │ │ + admin routes │ │ - create / rotate │ │ │ -│ │ └──────────────────┘ │ - revoke / kick │ │ │ -│ │ │ │ - reuse detection │ │ │ -│ │ │ └──────────────────────┘ │ │ -│ │ │ │ │ │ -│ │ │ ▼ │ │ -│ │ │ ┌──────────────────────┐ │ │ -│ │ │ │ auth/models.py │ │ │ 
-│ │ │ │ AuthSessionModel │ │ │ -│ │ │ └──────────────────────┘ │ │ -│ │ │ │ │ │ -│ │ ▼ ▼ │ │ -│ │ ┌─────────────────────────────────────────────┐ │ │ -│ │ │ auth/cache.py (Redis or in-process LRU) │ │ │ -│ │ └─────────────────────────────────────────────┘ │ │ +│ │ ┌────────────────────────┐ ┌──────────────────────┐ │ │ +│ │ │ routes/auth.py │──▶│ auth/session.py │ │ │ +│ │ │ + admin routes │ │ - create / rotate │ │ │ +│ │ │ Depends(get_auth_ │ │ - revoke / kick │ │ │ +│ │ │ provider) ─────┼──▶│ - reuse detection │ │ │ +│ │ └────────────────────────┘ └──────────────────────┘ │ │ +│ │ │ │ │ │ +│ │ ▼ ▼ │ │ +│ │ ┌────────────────────────┐ ┌──────────────────────┐ │ │ +│ │ │ auth/providers/ │ │ auth/models.py │ │ │ +│ │ │ - base.py (Protocol) │ │ AuthSessionModel │ │ │ +│ │ │ - local.py (Local) │ │ + auth_provider col │ │ │ +│ │ │ - oidc_stub.py (stub) │ └──────────────────────┘ │ │ +│ │ │ get_auth_provider() DI │ │ │ +│ │ └────────────────────────┘ │ │ +│ │ │ │ │ +│ │ ▼ │ │ +│ │ ┌─────────────────────────────────────────────┐ │ │ +│ │ │ auth/cache.py (Redis or in-process LRU) │ │ │ +│ │ └─────────────────────────────────────────────┘ │ │ │ └──────────────────────────────────────────────────────────┘ │ │ │ │ │ ▼ │ @@ -373,32 +398,112 @@ Client Server ## Implementation Units -### U1. Schema: AuthSessionModel + Alembic migration +### U1. Schema: AuthSessionModel + extended bootstrap + backfill -**Goal**: Add the `auth_sessions` table with all required fields and indexes. +**Goal**: Add the `auth_sessions` table with all required fields and indexes, AND backfill existing `user_sessions` rows on first startup. -**Requirements**: F6, N5, N6 (the table backs every session-aware endpoint). +**Requirements**: F6, F15, N5, N6 (the table backs every session-aware endpoint; backfill prevents forced re-login; `auth_provider` field enables future IdP audit traceability). **Dependencies**: None. 
**Files**: -- `src/agentkit/server/auth/models.py` — add `AuthSessionModel` (SQLAlchemy 2 typed) + extend `_SCHEMA_SQL` for direct aiosqlite init -- `migrations/versions/2026_06_20_001_add_auth_sessions.py` — Alembic migration: CREATE TABLE + 3 indexes (`(user_id, revoked, expires_at)`, `(expires_at)`, `(refresh_token_hash)`) -- `tests/unit/auth/test_models.py` — model serialization + index smoke tests +- `src/agentkit/server/auth/models.py` — add `AuthSessionModel` (SQLAlchemy 2 typed) + extend `_SCHEMA_SQL` for direct aiosqlite init + add `_SCHEMA_VERSION = 2` constant + extend `init_auth_db()` to run the backfill +- `tests/unit/auth/test_models.py` — model serialization + index smoke + backfill tests -**Approach**: +**Approach (schema)**: - Use UUID strings as PK (matches existing `users.id` style in this codebase) - `device_info` is a JSON string (reuse pattern from `UserSessionModel.device_info`) - `expires_at` is ISO-8601 string (matches `UserModel.last_login_at`) - `revoked` is INTEGER (0/1) for SQLite compatibility +- Add the new `CREATE TABLE auth_sessions` block to `_SCHEMA_SQL` (line 234-242 is the current `user_sessions` block; append after it) with these indexes: + - `idx_auth_sessions_user_id_active` on `(user_id, revoked, expires_at)` — supports the cap-count query and the list-active query + - `idx_auth_sessions_expires_at` on `(expires_at)` — supports cleanup sweeps + - `idx_auth_sessions_refresh_token_hash` on `(refresh_token_hash)` — unique + - `idx_auth_sessions_auth_provider` on `(auth_provider)` — supports future IdP "list sessions by provider" query +- **Add `auth_provider` column** (NEW per KTD-10): `TEXT NOT NULL DEFAULT 'local'` — records which provider created the session. Values: `local` (current) / `oidc-stub` (future stub) / `oidc-keycloak` / `saml` / `ldap` (future real adapters). Backfilled rows get `'local'` via the default. 
+- Bump `_SCHEMA_VERSION = 2` (currently implicit; the existing `init_auth_db` is idempotent via `CREATE TABLE IF NOT EXISTS` so version is mostly for the backfill gate) -**Test scenarios**: +**Approach (backfill) — critical, was missing from the original plan**: +The current `routes/auth.py:201-213` writes to `user_sessions` on login. After the new schema lands, the new `SessionService.create_session` writes to `auth_sessions` instead. To prevent forcing every existing user to re-login on the deploy, `init_auth_db()` runs a **one-time backfill** on startup: + +```python +async def _backfill_user_sessions(db: aiosqlite.Connection) -> int: + """One-time backfill from user_sessions to auth_sessions. + + Runs only when auth_sessions is empty AND user_sessions has rows. + Idempotent: subsequent restarts are no-ops. + """ + cursor = await db.execute("SELECT COUNT(*) FROM auth_sessions") + (count,) = await cursor.fetchone() + if count > 0: + return 0 # already backfilled + + cursor = await db.execute( + "SELECT id, user_id, refresh_token_hash, device_info, created_at, expires_at, revoked_at " + "FROM user_sessions WHERE revoked_at IS NULL" + ) + rows = await cursor.fetchall() + backfilled = 0 + for row in rows: + device_info = json.loads(row["device_info"]) if row["device_info"] else {} + # Use existing user_sessions.id as the auth_sessions.id so that + # legacy clients holding the old refresh_token_hash still match + # a row in the new table (this is what the back-compat path in + # U10 relies on). 
+ await db.execute( + "INSERT OR IGNORE INTO auth_sessions " + "(id, user_id, refresh_token_hash, device_fingerprint, device_label, " + " ip, user_agent, created_at, last_active_at, expires_at, revoked, revoked_reason) " + "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)", + ( + row["id"], # reuse legacy id for back-compat + row["user_id"], + row["refresh_token_hash"], + device_info.get("fingerprint", "unknown"), + device_info.get("label", "Unknown device"), + device_info.get("ip", ""), + device_info.get("user_agent", ""), + row["created_at"], + row["created_at"], # last_active_at defaults to created_at + row["expires_at"], + 0, # not revoked (already filtered) + None, + ), + ) + backfilled += 1 + if backfilled: + logger.info(f"Backfilled {backfilled} user_sessions rows to auth_sessions") + return backfilled +``` + +**Approach (idempotency)**: +- The `INSERT OR IGNORE` on `auth_sessions.id PK` makes the backfill safe to re-run +- The `count > 0` early-exit means after the first backfill, subsequent startups are < 1ms + +**Approach (rolled-back risk)**: +- The backfill does NOT delete `user_sessions` rows. They are kept for 1 minor version as the legacy read path. U10's Phase 5 cleanup drops the table. + +**Test scenarios** (test_models.py): - Create session, query by `sid`, find it -- Create 11 sessions for one user, count = 11 (cap check is in U4) +- Create 11 sessions for one user, count = 11 (cap check is in U3) - Query `WHERE user_id=? 
AND revoked=0 AND expires_at > now` returns active sessions - Index `(user_id, revoked, expires_at)` is present (verify via `PRAGMA index_list`) +- Index `idx_auth_sessions_auth_provider` is present +- **`auth_provider` column** tests (NEW per KTD-10): + - Default value is `'local'` when column is omitted from INSERT + - `WHERE auth_provider = 'local'` returns only local-created sessions + - `WHERE auth_provider = 'oidc-stub'` returns zero rows in current code +- **Backfill tests** (NEW): + - `init_auth_db` on a DB with `user_sessions` rows but empty `auth_sessions` → backfills all non-revoked rows + - `init_auth_db` on a DB with existing `auth_sessions` rows → does NOT re-backfill (idempotent) + - Backfilled rows have the original `user_sessions.id` as their `auth_sessions.id` + - Backfilled rows have `revoked=0` + - Backfilled rows have their `expires_at` preserved + - Backfill does NOT touch `user_sessions` rows that are already revoked (`revoked_at IS NOT NULL`) -**Verification**: `pytest tests/unit/auth/test_models.py -v` passes; migration runs cleanly on a test DB. +**Verification**: `pytest tests/unit/auth/test_models.py -v` passes; `init_auth_db` runs cleanly on a copy of prod DB with the existing `user_sessions` table; backfill log line appears exactly once per fresh DB. + +**Note on Alembic**: This codebase does **not** use Alembic. There is no `alembic.ini`, no `migrations/` directory, and no `alembic` dependency in `pyproject.toml`. The auth DB schema is managed via the `_SCHEMA_SQL` constant + `init_auth_db()` pattern (see `auth/models.py:202-333`). This U1 unit aligns with that pattern; the original plan's Alembic reference was incorrect. --- @@ -437,7 +542,7 @@ Client Server **Goal**: Centralize all session operations behind a `SessionService` class so routes don't duplicate the logic. -**Requirements**: F5, F6, F8, F9, F11 (rotation, recording, kick, password change, three-state validation). 
+**Requirements**: F5, F6, F8, F9, F11, F13, F15 (rotation, recording, kick, password change, three-state validation, provider-pluggability, audit field). **Dependencies**: U1 (model), U2 (denylist). @@ -447,10 +552,11 @@ Client Server - `tests/unit/auth/test_session.py` — full service test suite **Approach (SessionService methods)**: -- `async create_session(user_id, device_fingerprint, device_label, ip, user_agent, remember_me: bool) -> AuthSessionModel` +- `async create_session(user_id, device_fingerprint, device_label, ip, user_agent, remember_me: bool, auth_provider: str = "local") -> AuthSessionModel` - **Cap check first**: count active sessions for user; if ≥10, mark oldest non-current as `revoked` with `revoked_reason='session_cap_eviction'` - Generate new `sid` (uuid4), `jti` (uuid4) - Compute `expires_at` based on `remember_me` (30d vs 7d) + - **Store `auth_provider` from caller** (U4 passes `provider.name`); enables F15 audit traceability - Insert row, return model - `async get_active_session(sid: str) -> AuthSessionModel | None` - First check `SessionCache.get(sid)`; on miss, query DB, write to cache (60s TTL) @@ -459,7 +565,7 @@ Client Server - Decode `old_refresh_token`; get `sid`; lookup session - **Reuse detection**: compare `sha256(old_refresh_token)` against `session.refresh_token_hash`. 
If different, this is a reuse → call `revoke_all_for_user(user_id, reason='reuse_detected')` + raise `TokenReuseDetected` - Also check `RecentlyRevokedTokens.contains(sha256(old_refresh_token))` — if yes, same handling - - On legitimate use: generate new `refresh_token`, update `session.refresh_token_hash` = `sha256(new)`, `session.last_active_at` = now, `session.expires_at` = now + ttl, `session.previous_session_id` = old sid (audit) + - On legitimate use: generate new `refresh_token`, update `session.refresh_token_hash` = `sha256(new)`, `session.last_active_at` = now, `session.expires_at` = now + ttl, `session.previous_session_id` = old sid (audit), `auth_provider` **preserved** (rotation doesn't change provider) - Add `sha256(old_refresh_token)` to denylist for 30s - Issue new access + refresh JWTs (call into jwt_utils) - Invalidate cache entry for this sid @@ -469,6 +575,7 @@ Client Server - Bulk update; returns count of revoked sessions - `async list_active_for_user(user_id: str) -> list[AuthSessionModel]` - `async list_all_for_admin(user_id: str) -> list[AuthSessionModel]` (admin endpoint) +- `async list_active_by_provider(auth_provider: str) -> list[AuthSessionModel]` (NEW per KTD-10) — supports future "show me all OIDC sessions" admin view **Approach (SessionCache)**: ```python @@ -485,11 +592,13 @@ class SessionCache(Protocol): - `create_session` with remember_me=True sets expires_at 30d out, else 7d - `create_session` for a user with 10 active sessions evicts the oldest non-current one - `create_session` for a user with 10 active sessions, the new login is one of them, the evicted one is the OLDEST non-new +- **`create_session` with `auth_provider='oidc-stub'`** stores that value in the row (NEW per KTD-10) - `get_active_session` returns the row when valid - `get_active_session` returns None when `revoked=True` - `get_active_session` returns None when `expires_at < now` - `get_active_session` second call within 60s hits cache (spy on DB call count) - 
`rotate_refresh` with the CURRENT token returns new pair +- `rotate_refresh` preserves the original `auth_provider` value (NEW per KTD-10) - `rotate_refresh` with a REUSED old token (different hash) → `TokenReuseDetected` raised + ALL sessions for user revoked - `rotate_refresh` with a token in the denylist → same handling - `rotate_refresh` updates `previous_session_id` to the old sid @@ -498,6 +607,7 @@ class SessionCache(Protocol): - `revoke_all_for_user` except_sid= keeps the current session - `list_active_for_user` returns only `revoked=False AND expires_at > now` - `list_all_for_admin` returns all rows including revoked (for audit) +- `list_active_by_provider('local')` returns only local sessions; `('oidc-stub')` returns empty in current code (NEW per KTD-10) **Verification**: All unit tests pass; `pytest tests/unit/auth/test_session.py -v` shows 100% line coverage of `session.py`. @@ -507,30 +617,30 @@ class SessionCache(Protocol): **Goal**: Expose all session operations as HTTP endpoints. -**Requirements**: F1, F2, F5, F6, F7, F8, F9, F10, F11. +**Requirements**: F1, F2, F5, F6, F7, F8, F9, F10, F11, F13, F14, F15. -**Dependencies**: U3 (the service). +**Dependencies**: U3 (the service), **U11 (AuthProvider abstraction layer; must land first or alongside)**. 
**Files**:
-- `src/agentkit/server/routes/auth.py`: extend `LoginRequest` with `remember_me: bool = False`; add `WhoamiResponse`, `SessionInfoResponse`; add new endpoints
-- `src/agentkit/server/routes/admin.py`: new module with admin session management endpoints (or extend existing admin module)
+- `src/agentkit/server/routes/auth.py`: extend `LoginRequest` with `remember_me: bool = False`; add `WhoamiResponse`, `SessionInfoResponse`; add new endpoints; **inject `AuthProvider` via `Depends(get_auth_provider)`** (KTD-10)
+- `src/agentkit/server/routes/admin.py`: new module with admin session management endpoints (or extend existing admin module); **call `provider.revoke_user(user_id)` instead of updating the users table directly** (KTD-10)
 - `src/agentkit/server/dependencies.py`: `get_current_user` extension to look up session via sid; back-compat fallback for old tokens
 - `src/agentkit/server/auth/password.py`: extend with `change_password(user_id, new_password)` that revokes all other sessions
-- `tests/integration/auth/test_auth_routes.py`: full endpoint suite
+- `tests/integration/auth/test_auth_routes.py`: full endpoint suite; **add provider mock injection tests** (KTD-10)
 - `tests/integration/auth/test_admin_routes.py`: admin endpoints

 **Approach (new endpoints)**:

 | Method | Path | Body / Query | Auth | Behavior |
 |--------|------|--------------|------|----------|
-| POST | `/auth/login` | `{username, password, remember_me?}` | none | bcrypt verify → `SessionService.create_session` → return `TokenResponse` |
+| POST | `/auth/login` | `{username, password, remember_me?}` | none | **`provider.authenticate(username, password)`** → `SessionService.create_session(auth_provider=provider.name)` → return `TokenResponse` |
 | POST | `/auth/refresh` | `{refresh_token}` | refresh | `SessionService.rotate_refresh` → return new `TokenResponse`; on `TokenReuseDetected` → 401 `{error: "token_reuse_detected"}` |
 | POST | `/auth/logout` | `{refresh_token}` | access (optional) | `revoke_session(sid,
reason='user_terminated')` |
-| GET | `/auth/whoami` | - | access OR refresh | Returns `{user, session: {sid, device_label, ip, created_at, last_active_at, expires_at}}`. Accepts refresh token to support cold-start where access is gone. |
-| GET | `/auth/sessions` | - | access | List current user's active sessions |
+| GET | `/auth/whoami` | - | access OR refresh | Returns `{user, session: {sid, device_label, ip, auth_provider, created_at, last_active_at, expires_at}}`. Accepts refresh token to support cold-start where access is gone. |
+| GET | `/auth/sessions` | - | access | List current user's active sessions (each annotated with `auth_provider`) |
 | DELETE | `/auth/sessions/{sid}` | - | access | Revoke that session (if owned by current user) |
 | POST | `/auth/logout-others` | - | access | Revoke all sessions except current |
-| POST | `/auth/change-password` | `{old_password, new_password}` | access | `verify_password(old)` → `hash_password(new)` → update user → `revoke_all_for_user(except_sid=current)` |
+| POST | `/auth/change-password` | `{old_password, new_password}` | access | `provider.authenticate` verifies the old password → `revoke_all_for_user(except_sid=current)` invalidates the other sessions (KTD-10: consistent behavior across providers) |

 **Approach (admin endpoints)**:

@@ -539,37 +649,164 @@ class SessionCache(Protocol):
 | GET | `/admin/users/{user_id}/sessions` | admin | List all that user's sessions (incl revoked) |
 | DELETE | `/admin/users/{user_id}/sessions/{sid}` | admin | Force-revoke any session |

-**Approach (`get_current_user` back-compat)**:
+**Approach (`/auth/whoami` middleware bypass, critical fix)**:
+
+The current `AuthMiddleware._verify_jwt` (in `src/agentkit/server/auth/middleware.py:80-91`) only accepts `type=access` tokens and 401s on `type=refresh`. The cold-start sequence sends a refresh token (because the access token is gone).
To make this work without weakening auth, `/auth/whoami` is added to `AuthMiddleware.WHITELIST_PATHS` and the route does its own auth:
+
 ```python
-async def get_current_user(token: str = Depends(oauth2_scheme)) -> User:
-    payload = verify_token(token, expected_type="access")
+# In auth/middleware.py:
+WHITELIST_PATHS = (
+    "/api/v1/health",
+    "/api/v1/auth/login",
+    "/api/v1/auth/refresh",
+    "/api/v1/auth/logout",
+    "/api/v1/auth/whoami",  # NEW: route does its own auth
+    "/docs",
+    "/openapi.json",
+    "/redoc",
+)
+```
+
+The `/auth/whoami` route accepts **either** an access token (normal call) **or** a refresh token (cold-start), and the auth check happens inside the route via `verify_token` + session lookup:
+
+```python
+@router.get("/whoami")
+async def whoami(request: Request) -> WhoamiResponse:
+    """Returns the current user + session metadata.
+
+    Accepts either type=access (normal) or type=refresh (cold-start).
+    On 401 from this endpoint, the client treats it as 'invalid' state
+    (NOT 'error' state) so the router redirects to /login.
+    """
+    auth_header = request.headers.get("Authorization", "")
+    if not auth_header.startswith("Bearer "):
+        raise HTTPException(401, "missing bearer token")
+    token = auth_header[7:]
+
+    try:
+        payload = verify_token(token, expected_type=None)  # accept both types
+    except jwt.ExpiredSignatureError:
+        raise HTTPException(401, "token expired")
+    except jwt.InvalidTokenError:
+        raise HTTPException(401, "invalid token")
+
     sid = payload.get("sid")
     if sid:
+        # New-style: validate session in DB
         session = await session_service.get_active_session(sid)
         if not session:
             raise HTTPException(401, "session revoked or expired")
-        # attach session to request.state for downstream use
-        return await load_user(session.user_id)
-    # Legacy path: JWT without sid → still valid if signature + exp ok
-    logger.debug("Legacy JWT without sid; using exp-only validation")
-    return await load_user(payload["sub"])
+        user = await load_user(session.user_id)
+        if user is None:  # load_user returns None for missing or inactive users
+            raise HTTPException(401, "user not found or inactive")
+        # Issue a fresh access token so the client doesn't need a separate /refresh
+        new_access = create_access_token(user_id=user.id, session_id=session.id)
+        return WhoamiResponse(
+            user=user_to_response(user),
+            access_token=new_access,
+            session=session_to_response(session),
+        )
+    else:
+        # Legacy token without sid: back-compat path (U10)
+        user = await load_user(payload["sub"])
+        if not user or not user.is_active:
+            raise HTTPException(401, "user not found or inactive")
+        new_access = create_access_token(user_id=user.id, session_id=None)  # legacy
+        return WhoamiResponse(
+            user=user_to_response(user),
+            access_token=new_access,
+            session=None,  # no session metadata for legacy
+        )
 ```

-**Approach (`change_password`)**:
+**Approach (defined phantom functions)**:
+
+The plan's pseudo-code references several functions that don't exist yet.
Define them explicitly:
+
 ```python
+# In auth/dependencies.py: NEW dependency for current session
+async def get_current_session(request: Request) -> AuthSession:
+    """Return the active session for the current request.
+
+    Reads request.state.session (set by get_current_user middleware/dependency).
+    Raises 401 if no session (legacy tokens) or session is revoked.
+    """
+    session = getattr(request.state, "session", None)
+    if session is None:
+        raise HTTPException(401, "no active session (legacy token)")
+    return session
+
+# In auth/dependencies.py: keep existing get_current_user but extend it
+async def get_current_user(request: Request) -> User:
+    """Return the current authenticated user.
+
+    Strategy:
+    - If request.state.current_user is already set (by AuthMiddleware for
+      type=access tokens), return it.
+    - Otherwise, this is called from a path that bypassed middleware
+      (e.g. /auth/whoami). The route must have set request.state.user
+      via its own auth check.
+    - Legacy tokens (no sid) only set current_user, not session.
+    """
+    user = getattr(request.state, "current_user", None)
+    if user is None:
+        user = getattr(request.state, "user", None)  # set by whoami route
+    if user is None:
+        raise HTTPException(401, "not authenticated")
+    return user
+
+# In auth/users.py: NEW helper
+async def load_user(user_id: str) -> User | None:
+    """Load a user by id. Returns None if not found or inactive."""
+    async with aiosqlite.connect(str(DEFAULT_AUTH_DB_PATH)) as db:
+        cursor = await db.execute(
+            "SELECT * FROM users WHERE id = ? AND is_active = 1", (user_id,)
+        )
+        row = await cursor.fetchone()
+        return user_row_to_dict(row) if row else None
+```
+
+**Approach (`get_current_user` back-compat with sid validation)**:
+
+The new `get_current_user` is called by routes after `AuthMiddleware` has run. The middleware sets `request.state.current_user` (a dict with `id`, `username`, `role`, etc.) for `type=access` tokens.
With the new sid-bearing tokens, the middleware is extended to also set `request.state.session`:
+
+```python
+# In auth/middleware.py: extend _verify_jwt to also load session
+def _verify_jwt(self, token: str) -> dict[str, Any] | None:
+    # ... existing signature/expiry check ...
+    sid = payload.get("sid")
+    if sid:
+        # Synchronous check is not possible (DB call). Defer to a
+        # per-route dependency. Middleware only checks signature + expiry
+        # for new tokens; the session-revoked check happens in the
+        # get_current_user dependency.
+        pass
+    return payload
+```
+
+The session-revoked check is then done lazily in `get_current_session`, which calls `SessionService.get_active_session(sid)`. This is one extra DB-or-cache call per request, mitigated by the 60s Redis cache (KTD-6).
+
+**Approach (`change_password`)**:
+
+```python
+@router.post("/change-password")
 async def change_password(
     payload: ChangePasswordRequest,
-    current: User = Depends(get_current_user),
+    user: User = Depends(get_current_user),
     session: AuthSession = Depends(get_current_session),
 ):
-    if not verify_password(payload.old_password, current.password_hash):
+    if not verify_password(payload.old_password, user.password_hash):
         raise HTTPException(400, "old password incorrect")
     new_hash = hash_password(payload.new_password)
-    await db.execute("UPDATE users SET password_hash=?, updated_at=? WHERE id=?", ...)
-    await session_service.revoke_all_for_user(
-        current.id, except_sid=session.id, reason="password_changed"
+    async with aiosqlite.connect(str(DEFAULT_AUTH_DB_PATH)) as db:
+        await db.execute(
+            "UPDATE users SET password_hash=?, updated_at=?
WHERE id=?",
+            (new_hash, _now_iso(), user.id),
+        )
+        await db.commit()
+    revoked_count = await session_service.revoke_all_for_user(
+        user.id, except_sid=session.id, reason="password_changed"
     )
-    return {"ok": True}
+    logger.info(f"Password changed for user {user.id}; revoked {revoked_count} other sessions")
+    return {"ok": True, "revoked_sessions": revoked_count}
 ```

 **Test scenarios** (test_auth_routes.py):
@@ -1052,6 +1289,148 @@ async function handleSubmit() {

 ---

+### U11. AuthProvider abstraction layer (extension point for future IdP integration)
+
+**Goal**: Encapsulate "where users are stored / how passwords are verified / how attributes are synced" behind a pluggable `AuthProvider` adapter. This iteration implements `LocalAuthProvider` (wrapping SQLite + bcrypt) and also ships a placeholder `StubOIDCProvider` (`raise NotImplementedError`) as an interface-contract reference for the future OIDC implementation. The route layer, admin API, and SessionService obtain the provider through `Depends(get_auth_provider)`, so **switching IdP later requires no route changes**.
+
+**Requirements**: F13, F14, F15.
+
+**Dependencies**: None (referenced by U1/U3/U4; can land in parallel with, before, or after any of U1-U4; recommended early in Phase 1 because the U1 schema needs the `auth_provider` column).
+
+**Files**:
+- `src/agentkit/server/auth/providers/__init__.py`: new package; exports `AuthProvider`, the `get_auth_provider()` factory, `LocalAuthProvider`, `StubOIDCProvider`
+- `src/agentkit/server/auth/providers/base.py`: `AuthProvider` Protocol (`name: str` plus four async methods: `authenticate` / `get_user_by_id` / `sync_user_attributes` / `revoke_user`)
+- `src/agentkit/server/auth/providers/local.py`: `LocalAuthProvider` implementation wrapping the existing `auth/password.py` logic (bcrypt verification + users table lookup)
+- `src/agentkit/server/auth/providers/oidc_stub.py`: placeholder `StubOIDCProvider`; every method raises `NotImplementedError`, and the docstring points to the checklist for the next iteration's OIDC implementation
+- `src/agentkit/server/config.py`: extend `AuthConfig` with `provider: Literal["local", "oidc-stub"] = "local"` (or add a new `auth.provider` field)
+- `tests/unit/auth/providers/test_base.py`: Protocol static type checks (`runtime_checkable` Protocol validation) plus mock provider cases
+- `tests/unit/auth/providers/test_local.py`: full unit test suite for `LocalAuthProvider` (reusing the `auth/password.py` test scenarios)
+- `tests/unit/auth/providers/test_oidc_stub.py`: `StubOIDCProvider`
unit tests asserting that calling any method raises `NotImplementedError`
+
+**Approach (AuthProvider Protocol)**:
+
+```python
+# auth/providers/base.py
+from typing import Protocol, runtime_checkable
+from ..models import User
+
+@runtime_checkable
+class AuthProvider(Protocol):
+    """Capabilities every auth backend must implement.
+
+    The route layer calls only the methods below and is unaware of whether
+    the backend is SQLite / OIDC / LDAP. Adding an IdP later only requires
+    a new adapter that implements this Protocol.
+    """
+
+    name: str  # identifies the current provider; written to session.auth_provider
+
+    async def authenticate(self, *, username: str, password: str) -> User:
+        """Verify username + password and return a User. Raises InvalidCredentials on failure."""
+        ...
+
+    async def get_user_by_id(self, user_id: int) -> User | None:
+        """Look up a user by id (used by admin endpoints, session validation, and whoami)."""
+        ...
+
+    async def sync_user_attributes(self, user_id: int) -> None:
+        """Sync user attributes (department / email / title, etc.). LocalAuthProvider: no-op; OidcAuthProvider: pull the latest profile from the IdP and write it back locally."""
+        ...
+
+    async def revoke_user(self, user_id: int) -> None:
+        """Disable a user (offboarding / lockout). LocalAuthProvider: UPDATE users SET is_active=0; OidcAuthProvider: call the IdP's disable API (future)."""
+        ...
+```
+
+**Approach (LocalAuthProvider)**: Move the password verification logic at `routes/auth.py:201-213` (SQLite SELECT + bcrypt verification + load_user) into `LocalAuthProvider.authenticate`. The route layer no longer calls `verify_password` / `load_user` directly; everything goes through the provider. `revoke_user` runs `UPDATE users SET is_active=0` (admin endpoints all call this instead of writing to the DB directly).
+
+**Approach (StubOIDCProvider)**: Every method raises `NotImplementedError`; the docstring states:
+> Not implemented yet. When OIDC is integrated in the next iteration, rewriting this class is enough; routes, admin, and the Session table need no changes. With `auth.provider: oidc-stub` configured, the first provider call raises NotImplementedError right away (by design: this prevents an unfinished feature from being enabled by accident).
+
+**Approach (DI factory)**:
+
+```python
+# auth/providers/__init__.py
+from functools import lru_cache
+from ...config import get_settings
+from .base import AuthProvider
+from .local import LocalAuthProvider
+from .oidc_stub import StubOIDCProvider
+
+@lru_cache
+def get_auth_provider() -> AuthProvider:
+    settings = get_settings()
+    provider_name = settings.auth.provider
+    if provider_name == "local":
+        db = get_auth_db()  # existing aiosqlite connection (needs refactoring into a module-level singleton)
+        return LocalAuthProvider(db)
+    elif provider_name == "oidc-stub":
+        return StubOIDCProvider()
+    else:
+        raise ValueError(f"unknown auth provider: {provider_name}")
+```
+
+**Approach (config extension)**:
+
+```yaml
+# agentkit.yaml
+auth:
+  provider: local  # local | oidc-stub (future: oidc-keycloak, oidc-feishu, ...)
+  session:
+    table: auth_sessions
+    access_ttl_seconds: 900
+    refresh_ttl_seconds: 604800
+    refresh_ttl_remember_me_seconds: 2592000
+  jwt:
+    secret_env: AGENTKIT_JWT_SECRET
+    algorithm: HS256
+```
+
+**Test scenarios** (test_base.py + test_local.py + test_oidc_stub.py):
+- `LocalAuthProvider` with valid username+password returns User
+- `LocalAuthProvider` with wrong password raises `InvalidCredentials`
+- `LocalAuthProvider` with unknown username raises `InvalidCredentials`
+- `LocalAuthProvider` with inactive user (`is_active=0`) raises `InvalidCredentials`
+- `LocalAuthProvider.get_user_by_id` returns the user or None
+- `LocalAuthProvider.sync_user_attributes` is a no-op (returns None)
+- `LocalAuthProvider.revoke_user` sets `is_active=0` and subsequent `authenticate` fails
+- `LocalAuthProvider.name == "local"`
+- `StubOIDCProvider.authenticate` raises `NotImplementedError` with helpful message
+- `StubOIDCProvider.get_user_by_id` raises `NotImplementedError`
+- `StubOIDCProvider.sync_user_attributes` raises `NotImplementedError`
+- `StubOIDCProvider.revoke_user` raises `NotImplementedError`
+- `StubOIDCProvider.name == "oidc-stub"`
+- `get_auth_provider()` with `auth.provider=local` returns `LocalAuthProvider` instance
+- `get_auth_provider()` with `auth.provider=oidc-stub` returns `StubOIDCProvider` instance
+- `get_auth_provider()` with `auth.provider=unknown` raises `ValueError`
+- `get_auth_provider()` is memoized (lru_cache; second call returns same instance)
+- `runtime_checkable(AuthProvider)`: both Local and Stub pass `isinstance(prov, AuthProvider)` check
+- Protocol violation: a class missing `authenticate` method does NOT pass `isinstance` check (negative test)
+
+**Patterns to follow**:
+- Protocol + runtime_checkable pattern (Python typing best practice)
+- DI factory + lru_cache singleton (consistent with the existing `get_settings`)
+- The `InvalidCredentials` error type goes in `auth/providers/exceptions.py` (new file)
+
+**Verification**:
+- `pytest tests/unit/auth/providers/ -v` all pass
+- `mypy
src/agentkit/server/auth/providers/` reports no errors
+- Start the dev server with `auth.provider: oidc-stub` → the first `/auth/login` returns 501 NotImplementedError (confirms the stub is active)
+- Start the dev server with `auth.provider: local` → the existing login flow runs and is confirmed unbroken
+- After the admin kick feature calls `provider.revoke_user(user_id)`, the user's next `authenticate` fails (cross-check of LocalAuthProvider.revoke_user behavior)
+
+**Future IdP integration checklist** (reference for the next iteration):
+
+- [ ] `auth/providers/oidc.py`: implement `OidcAuthProvider` (authenticate / get_user / sync_attributes / revoke_user)
+- [ ] `auth/oauth_routes.py`: the `/auth/oauth/{provider}/redirect` and `/auth/oauth/{provider}/callback` endpoints
+- [ ] `auth/state_cache.py`: OAuth state parameter for CSRF protection (Redis TTL 5min)
+- [ ] Local account creation policy for a user's first login from the IdP (just-in-time provisioning / reject / invite-only)
+- [ ] Session sync with the IdP (revoke the local session when the user logs out at the IdP)
+- [ ] Map corporate department / title attributes to the local users table
+
+This iteration delivers only the Protocol + Local implementation + Stub placeholder + DI factory + placeholders (interface definitions) for items 1-3 above; the rest goes to a separate brainstorm in the next iteration.
+
+---
+
 ## System-Wide Impact

 | Stakeholder | Impact | Mitigation |
@@ -1059,10 +1438,11 @@ async function handleSubmit() {
 | End users (Tauri) | First login → no more login prompts for 7d (30d if "remember me"). | Pre-emptive refresh + Keychain storage prevent the failure modes that broke the existing flow. |
 | End users (Web) | Same as Tauri but refresh in localStorage (degraded security). | Document the trade-off; Keychain is Tauri-only. |
 | Admins | New capability: see active sessions, kick any user. | UI in admin pages; surface clearly in the Users view. |
-| Developers (auth code) | New session module, denylist, cache. | U3 is the single source of truth — routes don't duplicate logic. |
+| Developers (auth code) | New session module, denylist, cache, **AuthProvider abstraction layer**. | U3 is the single source of truth: routes don't duplicate logic. U11 is the single source of truth for the auth backend: routes don't import password.py directly.
|
+| **Future corporate IdP integration team** | Switching to OIDC / SAML / LDAP only adds an adapter; routes / admin are not rewritten | U11 Protocol + LocalAuthProvider are already in place; in the next iteration `auth/providers/oidc.py` just implements the Protocol |
 | Existing in-flight clients | Unaffected during 30-day window. | U10 shim. |
 | Server load | +1 cache lookup per request (cached 60s). | Redis-backed cache makes this sub-ms. |
-| DB schema | New `auth_sessions` table; existing `user_sessions` deprecated. | Alembic migration; keep `user_sessions` reads working for one version. |
+| DB schema | New `auth_sessions` table (includes the `auth_provider` column); existing `user_sessions` deprecated. | Alembic migration; keep `user_sessions` reads working for one version. |

 ---

@@ -1077,6 +1457,9 @@ async function handleSubmit() {
 | Session cap eviction surprises users (they didn't expect to be kicked) | Low | Low (visible at next login) | Make the cap (10) generous; document it; do not log evicted users out silently. |
 | Test mocks diverge from real `keyring` behavior | Medium | Medium (CI passes, manual fails) | Use `keyring::mock` feature in CI; document that real-platform testing is manual. |
 | JWT secret rotation in dev mode invalidates all sessions | Low | High (Tauri dev loops) | Document the behavior; provide `agentkit doctor` to check. |
+| **Leftover routes that call `verify_password` / modify the users table directly when switching AuthProvider** (KTD-10) | Medium | Medium (must be cleaned up when switching IdP) | Once U11 lands, all routes are required to go through `Depends(get_auth_provider)`; the code review template gains a checklist item: "routes must not call password/auth functions directly" |
+| **`lru_cache` singleton conflicts with test isolation** (U11) | Low | Low (flaky tests) | `get_auth_provider` exposes a `cache_clear()` helper; `conftest.py` clears the cache before and after each test fixture |
+| **Residual `LocalAuthProvider` dependencies after a future IdP takeover** | Low | Low (fine to keep during migration) | The U11 checklist states explicitly: Local remains available as a "local break-glass account"; after OIDC takes over, Local is not deleted and only the routes' default provider changes |

 ### External Dependencies

@@ -1139,6 +1522,28 @@ This plan has natural phasing based on dependency order.
Each phase lands as a s
 - Update `X-Client-Version` floor
 - ~1 day of work
+### Phase 6: AuthProvider abstraction layer (U11 + related changes)
+
+> **Phase added 2026-06-20** (merges the AuthProvider scope)
+
+- `auth/providers/base.py`: `AuthProvider` Protocol + `runtime_checkable`
+- `auth/providers/local.py`: `LocalAuthProvider` (wraps the existing password verification logic at `routes/auth.py:201-213`)
+- `auth/providers/oidc_stub.py`: `StubOIDCProvider` (`raise NotImplementedError` placeholder)
+- `auth/providers/__init__.py`: `get_auth_provider()` DI factory (`lru_cache` singleton)
+- `config.py`: new `auth.provider: local | oidc-stub` setting
+- U1 schema adds the `auth_provider` column (folded into Phase 1 U1)
+- U3 SessionService `create_session` accepts an `auth_provider` parameter (folded into Phase 1 U3)
+- U4 routes inject the provider via `Depends(get_auth_provider)`; admin endpoints call `provider.revoke_user(user_id)` instead of updating the users table directly (folded into Phase 2 U4)
+- ~1.5 days of work (can land in parallel with early Phase 1)
+
+**Rollout gate**:
+- `pytest tests/unit/auth/providers/ -v` all pass
+- Start the dev server with `auth.provider: oidc-stub` → the first `/auth/login` returns 501 NotImplementedError
+- Start the dev server with `auth.provider: local` → the existing login flow is unaffected
+- The admin kick feature calling `provider.revoke_user(user_id)` behaves the same as the original direct DB UPDATE
+
+**Entry point for future IdP integration**: the next iteration's OIDC integration only needs to add `auth/providers/oidc.py` + `auth/oauth_routes.py` (see the U11 checklist); routes / admin / the Session table need no changes.
+
 ---

 ## Open Questions
@@ -1150,6 +1555,8 @@ These are deferred to implementation and tracked here for visibility:
 3. **Q3**: Should the cap-eviction trigger a server-side notification (e.g. an `audit_event`)? Plan defaults to writing a row to a future `auth_audit_log` table; for now, just the `revoked_reason='session_cap_eviction'` field is enough.
 4. **Q4**: Should `change_password` rate-limit (e.g. 5 attempts per hour)? Out of scope here but worth a follow-up security brainstorm.
 5. **Q5**: macOS Tauri builds need code-signing for Keychain access. The dev binary is unsigned → Keychain prompts "always allow". Plan documents this; production builds must be signed.
+6.
**Q6 (added 2026-06-20)**: How do the AuthProvider abstraction layer and the existing password verification logic at `routes/auth.py:201-213` coexist? Planned approach: in step one of U11, `LocalAuthProvider` fully replicates the existing logic (behavior-equivalent); in step two, the U4 routes refactor switches over in one go; when U11 lands, a "behavior equivalence" test suite confirms behavior is identical before and after the switch
+7. **Q7 (added 2026-06-20)**: How is the `lru_cache` singleton of `get_auth_provider()` isolated in the test environment? Planned approach: export a `cache_clear()` helper; `conftest.py` calls `get_auth_provider.cache_clear()` before and after each test fixture; `dependency_overrides` is not introduced (to avoid polluting FastAPI app state)

 ---

@@ -1255,4 +1662,30 @@ The following end-to-end flows must work after this plan lands. Each is testable
 2. Log in (gets a legacy JWT without sid)
 3. Make API calls
 4. **Expected**: server validates the legacy JWT via the back-compat path; user is not affected
 5. Server log shows DEBUG: "Legacy JWT without sid; using exp-only validation"
+
+### AE-10: AuthProvider switch (local → oidc-stub, verifying the interface contract) (Covers F13, F14)
+
+> **Added 2026-06-20** (KTD-10 / U11 verification)
+
+1. Set `auth.provider: local` in `agentkit.yaml` and start the dev server
+2. Call `POST /auth/login` with the existing admin account
+3. **Expected**: 200 OK with a TokenResponse; in the DB, `auth_sessions.auth_provider='local'`
+4. Change the config to `auth.provider: oidc-stub` and restart the dev server
+5. Call `POST /auth/login` with the same account
+6. **Expected**: 501 Not Implemented (StubOIDCProvider raises NotImplementedError)
+7. Verify that the admin endpoint `/admin/users/{id}/sessions` still lists the session created in step 3 (including the `auth_provider='local'` field)
+8. **Expected**: the admin session list is unaffected by the provider switch (the core promise of KTD-10)
+9. Call `isinstance(provider_instance, AuthProvider)` to verify that both Local and Stub pass the Protocol check
+10. **Expected**: both return `True` (the `runtime_checkable` Protocol behaves correctly)
+
+### AE-11: Writing the `auth_provider` audit field (historical + newly created rows) (Covers F15)
+
+1. After completing AE-1 steps 1-2, call `GET /auth/sessions` to list all active sessions of the current user
+2. **Expected**: every session includes an `auth_provider: "local"` field (rows backfilled from `user_sessions` are also `'local'`, because the backfill uses the default value)
+3. As admin, call `GET /admin/users/{id}/sessions` to view sessions across users
+4. **Expected**: every session carries the `auth_provider` field, and admin can filter by provider (only local exists today; once OIDC is integrated, oidc-* values will distinguish them)
+5. `SessionService.list_active_by_provider('local')` returns all local sessions
+6. **Expected**: count = the total seen in step 2
+7. `SessionService.list_active_by_provider('oidc-stub')` returns an empty list in the current implementation
+8. **Expected**: count = 0 (shows the field exists but has no data; values appear only after OIDC is integrated)
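
The Protocol check from AE-10 (steps 9 and 10) and the memoized factory from U11 can be tried in isolation. The following is a minimal sketch and not the plan's implementation: the Protocol is trimmed to two members, and `FakeLocalProvider`, `FakeStubProvider`, `MissingAuthenticate`, and `build_provider` are hypothetical stand-ins that touch no database and no settings object.

```python
from functools import lru_cache
from typing import Protocol, runtime_checkable


@runtime_checkable
class AuthProvider(Protocol):
    """Trimmed copy of the U11 Protocol (two members only, for brevity)."""

    name: str

    async def authenticate(self, *, username: str, password: str) -> dict: ...


class FakeLocalProvider:
    """Hypothetical stand-in for LocalAuthProvider; no database involved."""

    name = "local"

    async def authenticate(self, *, username: str, password: str) -> dict:
        return {"id": "u1", "username": username}


class FakeStubProvider:
    """Hypothetical stand-in for StubOIDCProvider."""

    name = "oidc-stub"

    async def authenticate(self, *, username: str, password: str) -> dict:
        raise NotImplementedError("OIDC provider is not implemented yet")


class MissingAuthenticate:
    """Negative case: has `name` but no `authenticate` method."""

    name = "broken"


def build_provider(provider_name: str) -> AuthProvider:
    """Hypothetical helper: map a config value to a provider instance."""
    if provider_name == "local":
        return FakeLocalProvider()
    if provider_name == "oidc-stub":
        return FakeStubProvider()
    raise ValueError(f"unknown auth provider: {provider_name}")


@lru_cache
def get_auth_provider() -> AuthProvider:
    # Assumption: the plan reads settings.auth.provider here; this sketch
    # hard-codes "local" to stay self-contained.
    return build_provider("local")
```

Two properties are worth keeping in mind when writing the real tests. `isinstance` against a `runtime_checkable` Protocol only checks that the members exist; it does not compare signatures, so `mypy` remains the check for parameter and return types. And because `lru_cache` keeps the first instance alive, a fixture that changes the provider setting has to call `get_auth_provider.cache_clear()` before the next lookup, which is the isolation approach Q7 describes.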