Fischer AgentKit — Centralized Auth & Token Persistence (Plan)
Date: 2026-06-20
Status: active
Branch: feat/auth-server-token-persistence (formerly feat/centralized-auth-token-persistence)
Type: feat
Origin: docs/brainstorms/2026-06-20-centralized-auth-token-persistence-requirements.md
2026-06-20 update: merged the AuthProvider abstraction-layer scope (origin §5.5); added KTD-10, the U1 auth_provider field, the U3/U4 changes, implementation unit U11, Phase 6, and AE-10/AE-11.
Summary
Replace the current minimal JWT + localStorage auth with a production-grade scheme: server-side session table (track every login, enable forced revocation), Tauri OS Keychain storage for refresh tokens (encrypted at rest), refresh token rotation (defense against token leakage), pre-emptive token refresh (no 401 storms), a three-state startup (valid / invalid / error), and an AuthProvider abstraction layer that decouples routes / admin API / session table from the concrete auth backend (Local today; OIDC / SAML / LDAP tomorrow via adapter). Goal: after first login, Tauri cold-start goes directly to the main app, no login page; admin can see/force-revoke any user's active sessions; password change instantly invalidates all other devices; future enterprise IdP integration requires no rewrite of the routing or admin layer.
Problem Frame
The current auth flow has three structural gaps:
- Token at rest in plaintext: access_token, refresh_token, and user are stored unencrypted in WebView localStorage (~/Library/WebKit/.../LocalStorage/ on macOS). Any process with file access can read them.
- No revocation surface: the user_sessions table only stores a refresh-token hash with revoked_at. There is no device fingerprint, no IP, no "kick this session" admin endpoint, no "change password → kick everywhere" flow. Sessions outlive the user's intent.
- No rotation: the same refresh token can be used for the full 7-day window. If leaked, the attacker has a week of access with no detection.
The user's primary stated need is "after I log in once, subsequent app opens should go straight to the main app." The current code attempts this via localStorage rehydration, but two failure modes break it: (a) refresh hits _refreshFailed and the auth store clears itself; (b) when the access token expires mid-session and refresh fails (server restart, network blip), the store clears and the user is bounced to /login. We need both stronger local persistence and server-side session awareness to make this experience reliable.
The secondary stated needs are centralized, group-wide management and eventual integration with the group's account/password system (an enterprise IdP). Without a session table and admin endpoints, an admin cannot: see who is logged in, force-logout a lost device, or ensure that a compromised employee is immediately removed from all devices. The session table is the same data model an IdP would feed. To keep the future IdP integration from requiring a routing / admin rewrite, the auth backend must be pluggable behind an AuthProvider Protocol (see KTD-10 and U11). Local today, OIDC tomorrow, without touching routes or admin code.
Scope Boundaries
In Scope
- New auth_sessions SQLAlchemy model + table, created through the existing _SCHEMA_SQL + init_auth_db() bootstrap (this codebase has no Alembic migrations; see the note in U1)
- JWT payload extended with sid (session id) and jti (token id); session validation on every request
- Refresh token rotation on every /auth/refresh call; old token enters a 30s short-window denylist
- Refresh-token reuse detection → revoke all sessions for that user (defense against token theft)
- New endpoints: GET /auth/whoami, GET /auth/sessions, DELETE /auth/sessions/{id}, POST /auth/logout-others, POST /auth/change-password, GET /admin/users/{id}/sessions, DELETE /admin/users/{id}/sessions/{sid}
- Active session cap = 10 per user; login that would exceed the cap evicts the oldest non-current session
- "Remember me" login option: refresh TTL = 30 days (vs default 7 days)
- Tauri Rust commands: store_refresh_token / load_refresh_token / clear_refresh_token using the keyring crate (macOS Keychain / Windows Credential Manager / Linux Secret Service)
- Frontend tauri-auth.ts adapter with localStorage fallback when Keychain is unavailable
- Frontend auth-store: 3-state startup (valid / invalid / error), pre-emptive refresh when access expires in <2 min, no localStorage write of access token
- Frontend "Remember me" checkbox on LoginView
- Frontend "Active sessions" management UI in SettingsView (list current devices, kick others)
- Admin UI: see any user's active sessions, kick any session
- Backwards-compat for one minor version: old clients without sid claim still work via user_sessions table fallback
- AuthProvider abstraction layer (auth/providers/base.py Protocol + LocalAuthProvider + StubOIDCProvider): routes / admin / SessionService obtain the provider via Depends(get_auth_provider); switching IdP does not rewrite routes
- auth_sessions table: auth_provider column records the login source (local / oidc-stub / future oidc-keycloak / saml / ldap)
- Config switch auth.provider: local | oidc-stub (agentkit.yaml); adding a future provider only requires a new adapter
Out of Scope (deferred to follow-up work)
- Enterprise IdP / SSO (OIDC / SAML / LDAP / Feishu / DingTalk / WeCom): separate brainstorm
- 2FA / TOTP / WebAuthn / Passkey — separate brainstorm
- Multi-tenant / org isolation — separate brainstorm
- Password strength policy / password expiry / password history — separate IAM brainstorm
- Login failure lockout / sliding-window rate-limit — separate security brainstorm
- Email / SMS notifications for reuse detection — requires notification service
- Full audit log search / export — separate observability brainstorm
- Per-session device "trust" flag (e.g. "this Mac is trusted for 90 days") — defer until IdP work
Resolved Decisions (locked in from the brainstorm)
| # | Question | Decision |
|---|---|---|
| 1 | Remember me TTL | 30 days (vs default 7 days) |
| 2 | Active session cap | 10 per user, evict oldest non-current on overflow |
| 3 | Tauri Keychain unavailable behavior | Silently fall back to localStorage, log warning |
Requirements (carried from origin)
The plan must satisfy all of the following origin IDs (see requirements doc):
- F1 First login → cold-start app goes directly to main UI, never shows login
- F2 "Remember me" toggle: 7d / 30d refresh TTL
- F3 Tauri: refresh token stored in OS Keychain, never on localStorage disk
- F4 Web: refresh token in localStorage (degraded security, accepted)
- F5 Refresh token rotation: every /auth/refresh invalidates the old token
- F6 Server: every login recorded with device/IP/time
- F7 Admin: see any user's active sessions
- F8 Admin / self: kick any session
- F9 Password change: kick all other sessions
- F10 Pre-emptive refresh when access expires in <2 min
- F11 Startup distinguishes valid / invalid / error
- F12 Multiple Tauri / Web clients can log in the same user simultaneously (independent sessions)
- F13 AuthProvider is pluggable (the auth.provider config switches local ↔ oidc-stub with zero changes to routes / admin / session table)
- F14 Admin endpoints are decoupled from the auth backend (after a future IdP switch, admin session listing and kicking work unchanged)
- F15 Audit log records the auth_provider field (login source is traceable)
- N1 Token validation P99 < 5ms (Redis cache for session metadata)
- N5 All auth code has unit + integration tests
- N6 Backwards-compat for old clients (1 minor version)
Key Technical Decisions
KTD-1: auth_sessions table (new) vs extending user_sessions (existing)
Decision: Create a new auth_sessions table; deprecate user_sessions over 1 minor version.
Rationale: user_sessions only stores refresh_token_hash + revoked_at (3 fields). The new design needs device_fingerprint, device_label, ip, user_agent, last_active_at, expires_at, revoked_reason, previous_session_id. Adding 7 columns to an existing table breaks its existing semantics (the table is also referenced in production hardening tests). Clean break with migration is safer than schema bloat.
Trade-off: Two-table coexistence during the deprecation window. Mitigated by: keep user_sessions reads working for clients without sid claim (N6).
KTD-2: Session validation on every request (not just refresh)
Decision: get_current_user dependency reads sid from JWT, queries auth_sessions table (with Redis cache, 60s TTL) to confirm revoked=False and expires_at > now.
Rationale: Without this, a kicked-out user keeps their access token for up to 15 min (access TTL). With it, the kicked session is dead on the next request. The cost is +1 DB/cache lookup per request; cache makes this sub-ms.
Trade-off: A cache miss adds ~5ms to that request; with the 60s cache TTL, the actual DB query rate is roughly one per minute per active session.
KTD-3: Refresh token rotation + 30s denylist
Decision: Every successful /auth/refresh issues a new refresh token. The old token's hash is added to an in-memory + Redis denylist for 30 seconds. If the old token is reused within that window → TokenReuseDetected → revoke ALL sessions for that user.
Rationale: Industry standard (Auth0, Okta, AWS). Closes the window where an attacker who captured the old token can still use it after the legitimate user has refreshed.
Trade-off: The 30s window is a small UX cost (concurrent refresh calls from the same client during retry) but acceptable; legitimate retries complete in <1s and don't hit the window.
KTD-4: Tauri Keychain via keyring crate (not tauri-plugin-stronghold)
Decision: Use the keyring crate directly. It provides unified API across macOS Keychain, Windows Credential Manager, and Linux Secret Service with a single dependency.
Rationale: tauri-plugin-stronghold is a Tauri-team plugin but the v2 ecosystem is still maturing and the docs lag. keyring is a widely used Rust crate for OS credential storage. Smaller surface, fewer moving parts.
Trade-off: We write 3 small Tauri commands (store_refresh_token / load_refresh_token / clear_refresh_token) instead of using a plugin's auto-generated bindings. ~50 lines of Rust.
KTD-5: Access token in memory only (not persisted)
Decision: Access token lives only in the auth store's reactive ref<string | null>. Never written to localStorage or Keychain.
Rationale: Access tokens are short-lived (15 min). The cost of losing one (re-auth) is low; the security cost of persisting them (broader attack surface) is high. Refresh token is the only thing that needs durable storage.
Trade-off: App reload requires one refresh round-trip to get a new access token. Mitigated by the pre-emptive refresh + 3-state startup: by the time the app needs to call an API, the access token is already fresh.
KTD-6: Redis cache for session metadata (not just in-memory)
Decision: Use Redis (when available) to cache auth_sessions rows by sid. Fallback to in-process LRU (size=1024) when Redis is unavailable.
Rationale: The Tauri sidecar may run without Redis (zero-config dev mode). In-process LRU gives the same hit rate for single-process deployments. When Redis IS available (server deployment, multi-instance), it's the right cross-process answer.
Trade-off: Two code paths. Mitigated by a small SessionCache interface with two impls.
KTD-7: Session cap eviction strategy = LRU (oldest non-current)
Decision: When login would create the 11th session for a user, the oldest non-current session is revoked (with revoked_reason='session_cap_eviction') before the new one is created.
Rationale: LRU is intuitive ("the device I haven't used in a month should be the first to go"). Kicking "current" is wrong because the user is actively logging in.
Trade-off: None meaningful. Cap=10 is generous; the eviction is invisible to all but the user on the kicked device.
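The eviction rule can be sketched as a pure function over an in-memory list. This is a minimal illustration, not the SessionService code: the Session dataclass and evict_for_new_login are hypothetical names, and it assumes "oldest" means the earliest created_at (if strict LRU is intended, last_active_at would be the sort key instead).

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone

SESSION_CAP = 10  # active-session cap per user, taken from the plan


@dataclass
class Session:
    """Hypothetical in-memory stand-in for an auth_sessions row."""
    sid: str
    created_at: datetime
    revoked: bool = False
    revoked_reason: str | None = None


def evict_for_new_login(
    sessions: list[Session], current_sid: str | None = None
) -> Session | None:
    """Revoke the oldest non-current active session when the cap is reached.

    Returns the evicted session, or None when no eviction was needed.
    """
    active = [s for s in sessions if not s.revoked]
    if len(active) < SESSION_CAP:
        return None  # room for one more session
    candidates = [s for s in active if s.sid != current_sid]
    if not candidates:
        return None
    # Assumption: "oldest" is ordered by created_at.
    victim = min(candidates, key=lambda s: s.created_at)
    victim.revoked = True
    victim.revoked_reason = "session_cap_eviction"
    return victim
```

Calling this before inserting the new row keeps the active count at or below the cap without ever touching the session that is logging in.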
KTD-8: Pre-emptive refresh in api/base.ts interceptor (not in Pinia getter)
Decision: A request interceptor in BaseApiClient checks shouldRefresh() (access exp <2 min) BEFORE sending, and awaits silentRefresh() if needed.
Rationale: An interceptor guarantees the check runs for every request. A Pinia getter would only fire on accessToken access, which is not all requests (e.g. background fetches that don't read the getter).
Trade-off: One async function call before each request when expiring; negligible.
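The check itself is small. The plan places it in the TypeScript interceptor in api/base.ts; the sketch below only illustrates the decision logic in Python, for consistency with the other code in this document. The function name mirrors shouldRefresh(), and treating a missing token as "refresh now" is an assumption.

```python
from __future__ import annotations

import time

REFRESH_THRESHOLD_SECONDS = 120  # "<2 min" from the plan


def should_refresh(access_exp: float | None, now: float | None = None) -> bool:
    """Return True when the access token is absent or expires within the threshold.

    access_exp is the JWT exp claim (seconds since the epoch). The now
    parameter exists only to make the function testable.
    """
    if access_exp is None:
        return True  # assumption: no in-memory access token means refresh first
    current = time.time() if now is None else now
    return access_exp - current < REFRESH_THRESHOLD_SECONDS
```

The interceptor would await the silent refresh whenever this returns True, then attach the fresh access token to the outgoing request.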
KTD-9: Backwards-compat shim for old clients
Decision: dependencies.py:get_current_user accepts JWTs with or without sid claim. Missing sid → fall back to user_sessions.refresh_token_hash validation. This path is logged and gated to one minor version.
Rationale: Avoids breaking in-flight clients. Lets us roll out gradually.
Trade-off: Two validation paths in get_current_user. Mitigated by extracting the session-lookup into a helper that both paths share.
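KTD-10: AuthProvider abstraction layer (extension point for future IdP integration)
Decision: Authentication logic goes through the auth/providers/base.py:AuthProvider Protocol (name / authenticate / get_user_by_id / sync_user_attributes / revoke_user), injected at the route layer via Depends(get_auth_provider). The current default is LocalAuthProvider (wrapping SQLite + bcrypt); when a future OidcAuthProvider takes over, routes, admin, and the session table need zero changes. StubOIDCProvider is a placeholder (raises NotImplementedError) used to validate the interface contract ahead of the real integration.
Rationale: The user explicitly stated that the system will eventually integrate with the group's account/password system (OIDC / SAML / LDAP / Feishu / DingTalk / WeCom). If "where users are stored" and "how passwords are verified" are hard-coded into routes/admin now, switching to an IdP later means rewriting the whole route layer plus the admin endpoints. Abstracting up front means a future IdP integration only needs one new adapter (~300-500 lines) and does not touch the existing routes / admin / SessionService. The auth_sessions table gains an auth_provider column that records the login source, so audits are traceable.
Trade-off:
- Cost: one extra abstraction layer (auth/providers/base.py Protocol) + one DI factory (get_auth_provider) + one StubOIDCProvider placeholder
- Benefit: future IdP integration does not rewrite the route layer or admin API; admin kick / session listing behave the same across providers
Alternatives considered:
- ❌ No extension point, only today's LocalAuthProvider: switching to an IdP later requires rewriting routes + admin + SessionService
- ❌ Implement OIDC directly: would stretch this iteration to 2-3x its length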
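A minimal sketch of the provider contract follows. The five member names come from KTD-10; the signatures, the AuthUser shape, the in-memory stand-in for LocalAuthProvider, and the string-keyed get_auth_provider factory are all assumptions made for illustration (the real local provider would verify bcrypt hashes against the auth DB, and the real factory would be a FastAPI dependency).

```python
from __future__ import annotations

import asyncio
import hmac
from dataclasses import dataclass
from typing import Protocol


@dataclass
class AuthUser:
    """Hypothetical user shape; the real model is not specified in this plan."""
    id: str
    username: str


class AuthProvider(Protocol):
    """Member names follow KTD-10; the signatures are assumptions."""
    name: str

    async def authenticate(self, username: str, password: str) -> AuthUser | None: ...
    async def get_user_by_id(self, user_id: str) -> AuthUser | None: ...
    async def sync_user_attributes(self, user: AuthUser) -> AuthUser: ...
    async def revoke_user(self, user_id: str) -> None: ...


class InMemoryLocalProvider:
    """Stand-in for LocalAuthProvider (which would check bcrypt hashes)."""
    name = "local"

    def __init__(self, users: dict[str, tuple[AuthUser, str]]) -> None:
        self._users = users  # username -> (user, demo password)

    async def authenticate(self, username: str, password: str) -> AuthUser | None:
        entry = self._users.get(username)
        if entry is None:
            return None
        user, expected = entry
        ok = hmac.compare_digest(password.encode(), expected.encode())
        return user if ok else None

    async def get_user_by_id(self, user_id: str) -> AuthUser | None:
        return next((u for u, _ in self._users.values() if u.id == user_id), None)

    async def sync_user_attributes(self, user: AuthUser) -> AuthUser:
        return user  # nothing to sync for a local account

    async def revoke_user(self, user_id: str) -> None:
        return None  # session revocation would be delegated elsewhere


class StubOIDCProvider:
    """Placeholder: every operation raises until a real adapter exists."""
    name = "oidc-stub"

    async def authenticate(self, username: str, password: str) -> AuthUser | None:
        raise NotImplementedError("OIDC is not implemented in this iteration")

    async def get_user_by_id(self, user_id: str) -> AuthUser | None:
        raise NotImplementedError("OIDC is not implemented in this iteration")

    async def sync_user_attributes(self, user: AuthUser) -> AuthUser:
        raise NotImplementedError("OIDC is not implemented in this iteration")

    async def revoke_user(self, user_id: str) -> None:
        raise NotImplementedError("OIDC is not implemented in this iteration")


_DEMO_USERS = {"alice": (AuthUser(id="u1", username="alice"), "s3cret")}


def get_auth_provider(kind: str) -> AuthProvider:
    """Pick a provider from the auth.provider config value."""
    if kind == "local":
        return InMemoryLocalProvider(_DEMO_USERS)
    if kind == "oidc-stub":
        return StubOIDCProvider()
    raise ValueError(f"unknown auth.provider: {kind}")
```

Because routes only see the Protocol, swapping the config value changes which class answers authenticate() without any change on the calling side.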
High-Level Technical Design
Component Map
┌─────────────────────────────────────────────────────────────────────┐
│ Tauri Desktop (macOS / Windows / Linux) │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ WebView (Vue 3 frontend) │ │
│ │ ┌────────────┐ ┌──────────────┐ ┌────────────────┐ │ │
│ │ │ auth store │──│ api/base.ts │──│ Pinia + Router │ │ │
│ │ │ (memory) │ │ interceptor │ │ │ │ │
│ │ └────────────┘ └──────────────┘ └────────────────┘ │ │
│ │ │ │ │ │
│ │ │ │ silentRefresh │ │
│ │ │ ▼ │ │
│ │ │ ┌──────────────────┐ │ │
│ │ │ │ tauri-auth.ts │ invoke() │ │
│ │ │ └──────────────────┘ │ │ │
│ │ │ localStorage fallback ▼ │ │
│ │ │ ┌──────────────────────┐ │ │
│ │ └─────────────────▶│ src-tauri/src/auth.rs│ │ │
│ │ │ keyring::Entry │ │ │
│ │ └──────────────────────┘ │ │
│ │ │ │ │
│ └───────────────────────────────────────┼────────────────┘ │
│ │ HTTP │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ FastAPI server (Python sidecar) │ │
│ │ ┌────────────────────────┐ ┌──────────────────────┐ │ │
│ │ │ routes/auth.py │──▶│ auth/session.py │ │ │
│ │ │ + admin routes │ │ - create / rotate │ │ │
│ │ │ Depends(get_auth_ │ │ - revoke / kick │ │ │
│ │ │ provider) ─────┼──▶│ - reuse detection │ │ │
│ │ └────────────────────────┘ └──────────────────────┘ │ │
│ │ │ │ │ │
│ │ ▼ ▼ │ │
│ │ ┌────────────────────────┐ ┌──────────────────────┐ │ │
│ │ │ auth/providers/ │ │ auth/models.py │ │ │
│ │ │ - base.py (Protocol) │ │ AuthSessionModel │ │ │
│ │ │ - local.py (Local) │ │ + auth_provider col │ │ │
│ │ │ - oidc_stub.py (stub) │ └──────────────────────┘ │ │
│ │ │ get_auth_provider() DI │ │ │
│ │ └────────────────────────┘ │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌─────────────────────────────────────────────┐ │ │
│ │ │ auth/cache.py (Redis or in-process LRU) │ │ │
│ │ └─────────────────────────────────────────────┘ │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────┐ │
│ │ data/auth.db (SQLite) │ │
│ │ + auth_sessions table │ │
│ └──────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
State Machine — Client Auth
┌──────────────┐
app start │ │ valid refresh
────────────▶│ STARTUP │────────────────────▶ READY
│ │ │ │
└──────────────┘ │ │
│ │ │ │ │
invalid ────┘ │ └──── network err │ │
▼ │ │
┌──────────────┐ │ │
│ ERROR │ retry ──────┐ │ │
             │  "Refresh"   │             │         │   │
└──────────────┘ │ │ │
▼ │ │ │
┌──────────────┐ │ │ │
│ INVALID │ retry ─────┤ │ │
             │  "Re-login"  │            │         │   │
└──────────────┘ │ │ │
│ │ │
┌─────────────────────────┘ │ │
│ │ │
│ 401 in flight │ │
│ ◀──────────────────────────────────┘ │
│ │
▼ │
┌──────────────┐ │
│ silentRefresh│ │
└──────────────┘ │
│ │
ok ◀──┴──▶ fail → back to STARTUP / INVALID
State Machine — Server Session
login
│
▼
┌────────────┐
│ CREATED │ sid in JWT
│ active │
└────────────┘
│ │ │
refresh ok ─────┘ │ └──── logout → REVOKED (user)
(rotated) │
└── admin / password change / reuse detected
→ REVOKED (system)
Sequence — Cold Start (Tauri)
Window opens
│
▼
App.vue mounted
│
▼
bootstrapBackend()
│ start_backend (sidecar)
│ health check
│
▼
authStore.startupCheck()
│
├── 1. tauriAuthStorage.getRefreshToken()
│ Keychain (Tauri) → localStorage (Web fallback)
│
├── 2. GET /api/v1/auth/whoami (Authorization: Bearer <refresh>)
│ (the access token is gone, so we attach the refresh token;
│ the server uses a separate "whoami" code path that accepts
│ either type)
│
├── 3. response handling
│ 200 → { access_token, user } → state = VALID → /agent
  │      401  → state = INVALID → /login (with "Session expired")
  │      network err → state = ERROR → /login (with "Unable to connect")
│
▼
Router beforeEach
│ state = VALID → next()
│ state != VALID → next('/login')
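The response handling in step 3 amounts to a three-way classification. The sketch below shows that mapping in Python for illustration only (the real logic lives in the frontend auth store); StartupState, classify_startup, and route_for are hypothetical names, and treating any status other than 200 or 401 as ERROR is an assumption the sequence above does not spell out.

```python
from __future__ import annotations

from enum import Enum


class StartupState(Enum):
    VALID = "valid"
    INVALID = "invalid"
    ERROR = "error"


def classify_startup(status: int | None) -> StartupState:
    """Map the whoami outcome to a startup state.

    status is the HTTP status code, or None when no response arrived
    (network error, backend not reachable).
    """
    if status == 200:
        return StartupState.VALID
    if status == 401:
        return StartupState.INVALID
    # Assumption: timeouts, 5xx, and anything unexpected are transient errors.
    return StartupState.ERROR


def route_for(state: StartupState) -> str:
    """Target route, matching the router guard in the sequence above."""
    return "/agent" if state is StartupState.VALID else "/login"
```

Keeping INVALID and ERROR distinct lets the login page show a different message for an expired session than for an unreachable server.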
Data Model — auth_sessions Table
erDiagram
auth_sessions {
TEXT id PK "uuid"
TEXT user_id FK
TEXT refresh_token_hash
TEXT device_fingerprint
TEXT device_label
TEXT ip
TEXT user_agent
TEXT created_at
TEXT last_active_at
TEXT expires_at
INTEGER revoked
TEXT revoked_reason
    TEXT previous_session_id
    TEXT auth_provider
}
users {
TEXT id PK
TEXT username
TEXT password_hash
...
}
auth_sessions }o--|| users : "user_id"
Sequence — Refresh Token Rotation + Reuse Detection
Client Server
│ │
│ POST /auth/refresh │
│ { refresh_token: "old" } │
│ ───────────────────────────▶ │
│ │ decode old → sid
│ │ lookup auth_sessions[sid]
│ │ hash(old) == session.refresh_token_hash? NO
│ │ → denylist check: hash(old) in denylist?
│ │ YES → REUSE DETECTED
│ │ → revoke ALL sessions for this user
│ │ → audit log "reuse_detected"
│ │ ← 401 { error: "token_reuse_detected" }
│ client clears state, │
│ routes to /login │
│ │
│ -- legit refresh -- │
│ POST /auth/refresh │
│ { refresh_token: "valid" } │
│ ───────────────────────────▶ │
│ │ hash(valid) == session.refresh_token_hash? YES
│ │ rotate: session.refresh_token_hash = hash(new)
│ │ add hash(old) to denylist (30s)
│ │ issue new access + new refresh
│ │ ← 200 { access_token, refresh_token }
│ store new refresh in │
│ Keychain, access in memory │
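The two branches above can be exercised with a small in-memory model. This is a sketch under stated assumptions, not the planned SessionService: RotationDemo is a hypothetical class, refresh tokens are opaque strings of the form "sid.random" instead of JWTs (so the sid can be recovered without a JWT library), and both stores are plain dicts.

```python
from __future__ import annotations

import hashlib
import hmac
import secrets
import time
from dataclasses import dataclass


class TokenReuseDetected(Exception):
    """Raised when a superseded refresh token is presented again."""


@dataclass
class Session:
    sid: str
    user_id: str
    refresh_token_hash: str
    revoked: bool = False
    revoked_reason: str | None = None


def sha256(token: str) -> str:
    return hashlib.sha256(token.encode()).hexdigest()


class RotationDemo:
    DENYLIST_TTL = 30.0  # seconds, from KTD-3

    def __init__(self) -> None:
        self.sessions: dict[str, Session] = {}
        self.denylist: dict[str, float] = {}  # token hash -> expiry time

    def login(self, user_id: str) -> str:
        sid = secrets.token_hex(8)
        token = f"{sid}.{secrets.token_urlsafe(16)}"  # assumption: opaque token
        self.sessions[sid] = Session(sid, user_id, sha256(token))
        return token

    def revoke_all_for_user(self, user_id: str, reason: str) -> int:
        count = 0
        for s in self.sessions.values():
            if s.user_id == user_id and not s.revoked:
                s.revoked, s.revoked_reason = True, reason
                count += 1
        return count

    def rotate_refresh(self, old_token: str, now: float | None = None) -> str:
        now = time.monotonic() if now is None else now
        sid = old_token.split(".", 1)[0]
        session = self.sessions.get(sid)
        if session is None or session.revoked:
            raise PermissionError("unknown or revoked session")
        old_hash = sha256(old_token)
        in_denylist = self.denylist.get(old_hash, 0.0) > now
        is_current = hmac.compare_digest(old_hash, session.refresh_token_hash)
        if in_denylist or not is_current:
            # A superseded token came back: treat it as theft.
            self.revoke_all_for_user(session.user_id, "reuse_detected")
            raise TokenReuseDetected(sid)
        # Legitimate refresh: rotate the hash, denylist the old token.
        new_token = f"{sid}.{secrets.token_urlsafe(16)}"
        session.refresh_token_hash = sha256(new_token)
        self.denylist[old_hash] = now + self.DENYLIST_TTL
        return new_token
```

Presenting the first token a second time after a successful rotation revokes every session of that user, which is the behaviour the reuse branch of the diagram describes.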
Implementation Units
U1. Schema: AuthSessionModel + extended bootstrap + backfill
Goal: Add the auth_sessions table with all required fields and indexes, AND backfill existing user_sessions rows on first startup.
Requirements: F6, F15, N5, N6 (the table backs every session-aware endpoint; backfill prevents forced re-login; auth_provider field enables future IdP audit traceability).
Dependencies: None.
Files:
- src/agentkit/server/auth/models.py: add AuthSessionModel (SQLAlchemy 2 typed) + extend _SCHEMA_SQL for direct aiosqlite init + add _SCHEMA_VERSION = 2 constant + extend init_auth_db() to run the backfill
- tests/unit/auth/test_models.py: model serialization + index smoke + backfill tests
Approach (schema):
- Use UUID strings as PK (matches existing users.id style in this codebase)
- device_info is a JSON string (reuse pattern from UserSessionModel.device_info)
- expires_at is ISO-8601 string (matches UserModel.last_login_at)
- revoked is INTEGER (0/1) for SQLite compatibility
- Add the new CREATE TABLE auth_sessions block to _SCHEMA_SQL (line 234-242 is the current user_sessions block; append after it) with these indexes:
  - idx_auth_sessions_user_id_active on (user_id, revoked, expires_at): supports the cap-count query and the list-active query
  - idx_auth_sessions_expires_at on (expires_at): supports cleanup sweeps
  - idx_auth_sessions_refresh_token_hash on (refresh_token_hash): unique
  - idx_auth_sessions_auth_provider on (auth_provider): supports future IdP "list sessions by provider" query
- Add auth_provider column (NEW per KTD-10): TEXT NOT NULL DEFAULT 'local'. It records which provider created the session. Values: local (current) / oidc-stub (future stub) / oidc-keycloak / saml / ldap (future real adapters). Backfilled rows get 'local' via the default.
- Bump _SCHEMA_VERSION = 2 (currently implicit; the existing init_auth_db is idempotent via CREATE TABLE IF NOT EXISTS, so the version is mostly for the backfill gate)
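As a rough sketch of what the new schema block could look like, the snippet below builds the table and indexes in an in-memory SQLite database using the stdlib sqlite3 module. Column names follow the data model and index names follow the list above; the NOT NULL choices other than auth_provider, the omitted foreign key, and the helper names init_demo_db and index_names are assumptions.

```python
import sqlite3

# Assumption: nullability of most columns is not specified in the plan.
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS auth_sessions (
    id TEXT PRIMARY KEY,
    user_id TEXT NOT NULL,
    refresh_token_hash TEXT NOT NULL,
    device_fingerprint TEXT,
    device_label TEXT,
    ip TEXT,
    user_agent TEXT,
    created_at TEXT NOT NULL,
    last_active_at TEXT,
    expires_at TEXT NOT NULL,
    revoked INTEGER NOT NULL DEFAULT 0,
    revoked_reason TEXT,
    previous_session_id TEXT,
    auth_provider TEXT NOT NULL DEFAULT 'local'
);
CREATE INDEX IF NOT EXISTS idx_auth_sessions_user_id_active
    ON auth_sessions (user_id, revoked, expires_at);
CREATE INDEX IF NOT EXISTS idx_auth_sessions_expires_at
    ON auth_sessions (expires_at);
CREATE UNIQUE INDEX IF NOT EXISTS idx_auth_sessions_refresh_token_hash
    ON auth_sessions (refresh_token_hash);
CREATE INDEX IF NOT EXISTS idx_auth_sessions_auth_provider
    ON auth_sessions (auth_provider);
"""


def init_demo_db() -> sqlite3.Connection:
    """Create the table and indexes in a throwaway in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA_SQL)
    return conn


def index_names(conn: sqlite3.Connection) -> set[str]:
    """Index names on auth_sessions, read via PRAGMA index_list."""
    rows = conn.execute("PRAGMA index_list('auth_sessions')").fetchall()
    return {row[1] for row in rows}  # column 1 of index_list is the name
```

Every statement uses IF NOT EXISTS, so running the script on each startup is a no-op after the first run, in line with the existing init_auth_db pattern.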
Approach (backfill) — critical, was missing from the original plan:
The current routes/auth.py:201-213 writes to user_sessions on login. After the new schema lands, the new SessionService.create_session writes to auth_sessions instead. To prevent forcing every existing user to re-login on the deploy, init_auth_db() runs a one-time backfill on startup:
import json
import logging

import aiosqlite

logger = logging.getLogger(__name__)


async def _backfill_user_sessions(db: aiosqlite.Connection) -> int:
    """One-time backfill from user_sessions to auth_sessions.

    Runs only when auth_sessions is empty AND user_sessions has rows.
    Idempotent: subsequent restarts are no-ops.
    Expects db.row_factory = aiosqlite.Row, because rows are read by column name.
    """
    cursor = await db.execute("SELECT COUNT(*) FROM auth_sessions")
    (count,) = await cursor.fetchone()
    if count > 0:
        return 0  # already backfilled

    cursor = await db.execute(
        "SELECT id, user_id, refresh_token_hash, device_info, created_at, expires_at, revoked_at "
        "FROM user_sessions WHERE revoked_at IS NULL"
    )
    rows = await cursor.fetchall()
    backfilled = 0
    for row in rows:
        device_info = json.loads(row["device_info"]) if row["device_info"] else {}
        # Use existing user_sessions.id as the auth_sessions.id so that
        # legacy clients holding the old refresh_token_hash still match
        # a row in the new table (this is what the back-compat path in
        # U10 relies on).
        await db.execute(
            "INSERT OR IGNORE INTO auth_sessions "
            "(id, user_id, refresh_token_hash, device_fingerprint, device_label, "
            " ip, user_agent, created_at, last_active_at, expires_at, revoked, revoked_reason) "
            "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
            (
                row["id"],  # reuse legacy id for back-compat
                row["user_id"],
                row["refresh_token_hash"],
                device_info.get("fingerprint", "unknown"),
                device_info.get("label", "Unknown device"),
                device_info.get("ip", ""),
                device_info.get("user_agent", ""),
                row["created_at"],
                row["created_at"],  # last_active_at defaults to created_at
                row["expires_at"],
                0,  # not revoked (already filtered)
                None,
            ),
        )
        backfilled += 1
    if backfilled:
        logger.info("Backfilled %d user_sessions rows to auth_sessions", backfilled)
    return backfilled
Approach (idempotency):
- The INSERT OR IGNORE on the auth_sessions.id PK makes the backfill safe to re-run
- The count > 0 early-exit means that after the first backfill, subsequent startups take < 1ms
Approach (rollback risk):
- The backfill does NOT delete user_sessions rows. They are kept for 1 minor version as the legacy read path. U10's Phase 5 cleanup drops the table.
Test scenarios (test_models.py):
- Create session, query by sid, find it
- Create 11 sessions for one user, count = 11 (cap check is in U3)
- Query WHERE user_id=? AND revoked=0 AND expires_at > now returns active sessions
- Index (user_id, revoked, expires_at) is present (verify via PRAGMA index_list)
- Index idx_auth_sessions_auth_provider is present
- auth_provider column tests (NEW per KTD-10):
  - Default value is 'local' when column is omitted from INSERT
  - WHERE auth_provider = 'local' returns only local-created sessions
  - WHERE auth_provider = 'oidc-stub' returns zero rows in current code
- Backfill tests (NEW):
  - init_auth_db on a DB with user_sessions rows but empty auth_sessions → backfills all non-revoked rows
  - init_auth_db on a DB with existing auth_sessions rows → does NOT re-backfill (idempotent)
  - Backfilled rows have the original user_sessions.id as their auth_sessions.id
  - Backfilled rows have revoked=0
  - Backfilled rows have their expires_at preserved
  - Backfill does NOT touch user_sessions rows that are already revoked (revoked_at IS NOT NULL)
Verification: pytest tests/unit/auth/test_models.py -v passes; init_auth_db runs cleanly on a copy of prod DB with the existing user_sessions table; backfill log line appears exactly once per fresh DB.
Note on Alembic: This codebase does not use Alembic. There is no alembic.ini, no migrations/ directory, and no alembic dependency in pyproject.toml. The auth DB schema is managed via the _SCHEMA_SQL constant + init_auth_db() pattern (see auth/models.py:202-333). This U1 unit aligns with that pattern; the original plan's Alembic reference was incorrect.
U2. JWT utils: sid + jti claims, dual decode path
Goal: Add sid and jti to issued JWTs; teach verify_token to read both old and new claim shapes.
Requirements: F5, F12, N6 (rotation + multi-client + backwards compat).
Dependencies: U1 (the sid references a row in auth_sessions).
Files:
- src/agentkit/server/auth/jwt_utils.py: create_token_pair(...) now takes session_id: str; verify_token(...) returns decoded payload including sid + jti; back-compat: missing sid is logged at DEBUG and accepted (caller decides what to do)
- src/agentkit/server/auth/denylist.py (new module): RecentlyRevokedTokens class backed by in-memory OrderedDict + Redis pub/sub for cross-process; add(token_hash, ttl=30), contains(token_hash) -> bool
- tests/unit/auth/test_jwt_utils.py: extend existing tests with round-trip with sid, decode legacy token, decode tampered token
Approach:
- create_token_pair(user_id, session_id, ttl_pair): access payload is {sub, sid, jti, type, exp, iat}; refresh payload is the same minus jti (refresh tokens are long-lived; jti would be regenerated on every rotation, which is wasteful)
- verify_token(token, expected_type): return full payload dict; legacy payload (no sid) is preserved as-is, callers branch on 'sid' in payload
- RecentlyRevokedTokens: single-process OrderedDict keyed by SHA-256 hash, max 10k entries; contains is O(1); add evicts oldest if at capacity
- Redis adapter: SADD + EXPIRE; SISMEMBER for check; the in-process impl is the fallback when Redis is unavailable
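The in-process half of the denylist might look roughly like the following. It is a single-process sketch only (no Redis, no locking); the optional now parameter is an addition for testability and is not part of the planned add(token_hash, ttl=30) / contains(token_hash) interface.

```python
from __future__ import annotations

import time
from collections import OrderedDict


class RecentlyRevokedTokens:
    """In-memory denylist of token hashes with a per-entry TTL."""

    def __init__(self, max_entries: int = 10_000) -> None:
        self._entries: OrderedDict[str, float] = OrderedDict()  # hash -> expiry
        self._max_entries = max_entries

    def add(self, token_hash: str, ttl: float = 30.0, now: float | None = None) -> None:
        now = time.monotonic() if now is None else now
        self._entries.pop(token_hash, None)  # re-adding moves the hash to the end
        while len(self._entries) >= self._max_entries:
            self._entries.popitem(last=False)  # evict the oldest entry
        self._entries[token_hash] = now + ttl

    def contains(self, token_hash: str, now: float | None = None) -> bool:
        now = time.monotonic() if now is None else now
        expires_at = self._entries.get(token_hash)
        if expires_at is None:
            return False
        if expires_at <= now:
            del self._entries[token_hash]  # drop expired entries lazily
            return False
        return True
```

Both operations are O(1) in the common case, and the capacity bound keeps memory flat even if many refreshes happen inside one 30-second window.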
Test scenarios:
create_token_pair(...)produces tokens withsidandjti(access only)verify_tokenon a token withoutsidreturns the payload unchanged (caller must handle)verify_tokenon an expired token raisesExpiredSignatureErrorRecentlyRevokedTokens.add(hash, ttl)+contains(hash)returns True within 30s, False afterRecentlyRevokedTokenswith 10001 entries evicts the oldest (capacity test)- Redis adapter mock:
SADD+SISMEMBER+EXPIREcalled with correct args
Verification: pytest tests/unit/auth/test_jwt_utils.py -v passes; manual curl round-trip works against a running dev server.
U3. Session service: CRUD + rotation + reuse detection
Goal: Centralize all session operations behind a SessionService class so routes don't duplicate the logic.
Requirements: F5, F6, F8, F9, F11, F13, F15 (rotation, recording, kick, password change, three-state validation, provider-pluggability, audit field).
Dependencies: U1 (model), U2 (denylist).
Files:
- src/agentkit/server/auth/session.py (new module): SessionService class
- src/agentkit/server/auth/cache.py (new module): SessionCache interface + RedisSessionCache + InProcessLRUSessionCache impls
- tests/unit/auth/test_session.py: full service test suite
Approach (SessionService methods):
- async create_session(user_id, device_fingerprint, device_label, ip, user_agent, remember_me: bool, auth_provider: str = "local") -> AuthSessionModel
  - Cap check first: count active sessions for user; if ≥10, mark oldest non-current as revoked with revoked_reason='session_cap_eviction'
  - Generate new sid (uuid4), jti (uuid4)
  - Compute expires_at based on remember_me (30d vs 7d)
  - Store auth_provider from caller (U4 passes provider.name); enables F15 audit traceability
  - Insert row, return model
- async get_active_session(sid: str) -> AuthSessionModel | None
  - First check SessionCache.get(sid); on miss, query DB, write to cache (60s TTL)
  - Return None if revoked=True or expires_at < now
- async rotate_refresh(old_refresh_token: str) -> tuple[AuthSessionModel, TokenPair]
  - Decode old_refresh_token; get sid; lookup session
  - Reuse detection: compare sha256(old_refresh_token) against session.refresh_token_hash. If different, this is a reuse → call revoke_all_for_user(user_id, except_sid=None, reason='reuse_detected') + raise TokenReuseDetected
  - Also check RecentlyRevokedTokens.contains(sha256(old_refresh_token)); if yes, same handling
  - On legitimate use: generate new refresh_token, update session.refresh_token_hash = sha256(new), session.last_active_at = now, session.expires_at = now + ttl, session.previous_session_id = old sid (audit), auth_provider preserved (rotation doesn't change provider)
  - Add sha256(old_refresh_token) to denylist for 30s
  - Issue new access + refresh JWTs (call into jwt_utils)
  - Invalidate cache entry for this sid
- async revoke_session(sid: str, reason: str) -> None
  - Mark revoked=True, revoked_reason=reason; invalidate cache
- async revoke_all_for_user(user_id: str, except_sid: str | None, reason: str) -> int
  - Bulk update; returns count of revoked sessions
- async list_active_for_user(user_id: str) -> list[AuthSessionModel]
- async list_all_for_admin(user_id: str) -> list[AuthSessionModel] (admin endpoint)
- async list_active_by_provider(auth_provider: str) -> list[AuthSessionModel] (NEW per KTD-10): supports future "show me all OIDC sessions" admin view
Approach (SessionCache):
from typing import Protocol

class SessionCache(Protocol):
    async def get(self, sid: str) -> AuthSessionModel | None: ...
    async def set(self, sid: str, session: AuthSessionModel, ttl: int = 60) -> None: ...
    async def invalidate(self, sid: str) -> None: ...
- InProcessLRUSessionCache: OrderedDict[sid, (session, expires_at)]; cap=1024; lazy eviction on get
- RedisSessionCache: GET / SETEX / DEL; pickle the model for storage
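A minimal sketch of the in-process implementation is shown below, assuming the three-method interface above. The session value is typed as Any to keep the example free of the SQLAlchemy model, and the injectable clock argument is an addition made here for testability.

```python
from __future__ import annotations

import asyncio
import time
from collections import OrderedDict
from typing import Any, Callable


class InProcessLRUSessionCache:
    """LRU cache with a per-entry TTL and lazy eviction on get."""

    def __init__(
        self, capacity: int = 1024, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self._capacity = capacity
        self._clock = clock
        self._items: OrderedDict[str, tuple[Any, float]] = OrderedDict()

    async def get(self, sid: str) -> Any | None:
        item = self._items.get(sid)
        if item is None:
            return None
        session, expires_at = item
        if expires_at <= self._clock():
            del self._items[sid]  # expired: evict lazily and report a miss
            return None
        self._items.move_to_end(sid)  # mark as most recently used
        return session

    async def set(self, sid: str, session: Any, ttl: int = 60) -> None:
        self._items[sid] = (session, self._clock() + ttl)
        self._items.move_to_end(sid)
        while len(self._items) > self._capacity:
            self._items.popitem(last=False)  # drop the least recently used

    async def invalidate(self, sid: str) -> None:
        self._items.pop(sid, None)
```

The methods are async only to match the SessionCache Protocol, so the same call sites work whether the Redis-backed or the in-process implementation is injected.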
Test scenarios (test_session.py):
- create_session inserts a row with all fields populated
- create_session with remember_me=True sets expires_at 30d out, else 7d
- create_session for a user with 10 active sessions evicts the oldest non-current one
- create_session for a user with 10 active sessions, the new login is one of them, the evicted one is the OLDEST non-new
- create_session with auth_provider='oidc-stub' stores that value in the row (NEW per KTD-10)
- get_active_session returns the row when valid
- get_active_session returns None when revoked=True
- get_active_session returns None when expires_at < now
- get_active_session second call within 60s hits cache (spy on DB call count)
- rotate_refresh with the CURRENT token returns new pair
- rotate_refresh preserves the original auth_provider value (NEW per KTD-10)
- rotate_refresh with a REUSED old token (different hash) → TokenReuseDetected raised + ALL sessions for user revoked
- rotate_refresh with a token in the denylist → same handling
- rotate_refresh updates previous_session_id to the old sid
- revoke_session sets revoked=True, revoked_reason, invalidates cache
- revoke_all_for_user with except_sid=None revokes everything
- revoke_all_for_user with except_sid set to the current sid keeps the current session
- list_active_for_user returns only revoked=False AND expires_at > now
- list_all_for_admin returns all rows including revoked (for audit)
- list_active_by_provider('local') returns only local sessions; ('oidc-stub') returns empty in current code (NEW per KTD-10)
Verification: All unit tests pass; pytest tests/unit/auth/test_session.py -v shows 100% line coverage of session.py.
U4. Routes: new auth + admin endpoints
Goal: Expose all session operations as HTTP endpoints.
Requirements: F1, F2, F5, F6, F7, F8, F9, F10, F11, F13, F14, F15.
Dependencies: U3 (the service), U11 (AuthProvider abstraction layer; must land first or alongside).
Files:
- src/agentkit/server/routes/auth.py: extend LoginRequest with remember_me: bool = False; add WhoamiResponse, SessionInfoResponse; add new endpoints; inject AuthProvider via Depends(get_auth_provider) (KTD-10)
- src/agentkit/server/routes/admin.py (new module): admin session management endpoints (or extend existing admin module); calls provider.revoke_user(user_id) instead of modifying the users table directly (KTD-10)
- src/agentkit/server/dependencies.py: get_current_user extension to look up session via sid; back-compat fallback for old tokens
- src/agentkit/server/auth/password.py: extend with change_password(user_id, new_password) that revokes all other sessions
- tests/integration/auth/test_auth_routes.py: full endpoint suite; adds provider-mock injection tests (KTD-10)
- tests/integration/auth/test_admin_routes.py: admin endpoints
Approach (new endpoints):
| Method | Path | Body / Query | Auth | Behavior |
|---|---|---|---|---|
| POST | /auth/login | {username, password, remember_me?} | none | provider.authenticate(username, password) → SessionService.create_session(auth_provider=provider.name) → return TokenResponse |
| POST | /auth/refresh | {refresh_token} | refresh | SessionService.rotate_refresh → return new TokenResponse; on TokenReuseDetected → 401 {error: "token_reuse_detected"} |
| POST | /auth/logout | {refresh_token} | access (optional) | revoke_session(sid, reason='user_terminated') |
| GET | /auth/whoami | - | access OR refresh | Returns {user, session: {sid, device_label, ip, auth_provider, created_at, last_active_at, expires_at}}. Accepts refresh token to support cold-start where access is gone. |
| GET | /auth/sessions | - | access | List current user's active sessions (each annotated with auth_provider) |
| DELETE | /auth/sessions/{sid} | - | access | Revoke that session (if owned by current user) |
| POST | /auth/logout-others | - | access | Revoke all sessions except current |
| POST | /auth/change-password | {old_password, new_password} | access | provider.authenticate verifies the old password → provider.revoke_user(user_id) invalidates the other sessions (KTD-10: behavior is consistent across providers) |
Approach (admin endpoints):
| Method | Path | Auth | Behavior |
|---|---|---|---|
| GET | /admin/users/{user_id}/sessions | admin | List all that user's sessions (incl revoked) |
| DELETE | /admin/users/{user_id}/sessions/{sid} | admin | Force-revoke any session |
Approach (/auth/whoami middleware bypass — critical fix):
The current AuthMiddleware._verify_jwt (in src/agentkit/server/auth/middleware.py:80-91) only accepts type=access tokens and 401s on type=refresh. The cold-start sequence sends a refresh token (because the access token is gone). To make this work without weakening auth, /auth/whoami is added to AuthMiddleware.WHITELIST_PATHS and the route does its own auth:
# In auth/middleware.py:
WHITELIST_PATHS = (
    "/api/v1/health",
    "/api/v1/auth/login",
    "/api/v1/auth/refresh",
    "/api/v1/auth/logout",
    "/api/v1/auth/whoami",  # NEW: route does its own auth
    "/docs",
    "/openapi.json",
    "/redoc",
)
The /auth/whoami route accepts either an access token (normal call) or a refresh token (cold-start), and the auth check happens inside the route via verify_token + session lookup:
@router.get("/whoami")
async def whoami(request: Request) -> WhoamiResponse:
    """Returns the current user + session metadata.

    Accepts either type=access (normal) or type=refresh (cold-start).
    On 401 from this endpoint, the client treats it as 'invalid' state
    (NOT 'error' state) so the router redirects to /login.
    """
    auth_header = request.headers.get("Authorization", "")
    if not auth_header.startswith("Bearer "):
        raise HTTPException(401, "missing bearer token")
    token = auth_header[7:]
    try:
        payload = verify_token(token, expected_type=None)  # accept both types
    except jwt.ExpiredSignatureError:
        raise HTTPException(401, "token expired") from None
    except jwt.InvalidTokenError:
        raise HTTPException(401, "invalid token") from None
    sid = payload.get("sid")
    if sid:
        # New-style: validate session in DB
        session = await session_service.get_active_session(sid)
        if not session:
            raise HTTPException(401, "session revoked or expired")
        user = await load_user(session.user_id)
        if not user:
            # load_user returns None for missing or inactive users
            raise HTTPException(401, "user not found or inactive")
        # Issue a fresh access token so the client doesn't need a separate /refresh
        new_access = create_access_token(user_id=user.id, session_id=session.id)
        return WhoamiResponse(
            user=user_to_response(user),
            access_token=new_access,
            session=session_to_response(session),
        )
    else:
        # Legacy token without sid: back-compat path (U10)
        user = await load_user(payload["sub"])
        if not user or not user.is_active:
            raise HTTPException(401, "user not found or inactive")
        new_access = create_access_token(user_id=user.id, session_id=None)  # legacy
        return WhoamiResponse(
            user=user_to_response(user),
            access_token=new_access,
            session=None,  # no session metadata for legacy
        )
Approach (defined phantom functions):
The plan's pseudo-code references several functions that don't exist yet. Define them explicitly:
# In auth/dependencies.py: NEW dependency for current session
async def get_current_session(request: Request) -> AuthSession:
    """Return the active session for the current request.

    Reads request.state.session (set by get_current_user middleware/dependency).
    Raises 401 if no session (legacy tokens) or session is revoked.
    """
    session = getattr(request.state, "session", None)
    if session is None:
        raise HTTPException(401, "no active session (legacy token)")
    return session


# In auth/dependencies.py: keep existing get_current_user but extend it
async def get_current_user(request: Request) -> User:
    """Return the current authenticated user.

    Strategy:
    - If request.state.current_user is already set (by AuthMiddleware for
      type=access tokens), return it.
    - Otherwise, this is called from a path that bypassed middleware
      (e.g. /auth/whoami). The route must have set request.state.user
      via its own auth check.
    - Legacy tokens (no sid) only set current_user, not session.
    """
    user = getattr(request.state, "current_user", None)
    if user is None:
        user = getattr(request.state, "user", None)  # set by whoami route
    if user is None:
        raise HTTPException(401, "not authenticated")
    return user


# In auth/users.py: NEW helper
async def load_user(user_id: str) -> User | None:
    """Load a user by id. Returns None if not found or inactive."""
    async with aiosqlite.connect(str(DEFAULT_AUTH_DB_PATH)) as db:
        cursor = await db.execute(
            "SELECT * FROM users WHERE id = ? AND is_active = 1", (user_id,)
        )
        row = await cursor.fetchone()
        return user_row_to_dict(row) if row else None
Approach (get_current_user back-compat with sid validation):
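The new get_current_user is called by routes after AuthMiddleware has run. The middleware sets request.state.current_user (a dict with id, username, role, etc.) for type=access tokens. For the new sid-bearing tokens the middleware still checks only signature and expiry, because _verify_jwt is synchronous and cannot make the DB call that a session lookup needs; the session check is deferred to a per-route dependency: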
# In auth/middleware.py: extend _verify_jwt for sid-bearing tokens
def _verify_jwt(self, token: str) -> dict[str, Any] | None:
    # ... existing signature/expiry check ...
    sid = payload.get("sid")
    if sid:
        # Synchronous check is not possible (DB call). Defer to a
        # per-route dependency. Middleware only checks signature + expiry
        # for new tokens; the session-revoked check happens in the
        # get_current_session dependency.
        pass
    return payload
The session-revoked check is then done lazily in get_current_session, which calls SessionService.get_active_session(sid). This is one extra DB-or-cache call per request, mitigated by the 60s Redis cache (KTD-6).
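A minimal sketch of that lazy check, assuming a plain in-process TTL cache in place of the Redis cache from KTD-6. SessionCache, its loader callback, and the injected clock are hypothetical names used only for illustration.

```python
from __future__ import annotations

import time
from typing import Callable, Optional


class SessionCache:
    """TTL cache in front of a session lookup (hypothetical; the plan uses Redis)."""

    def __init__(
        self,
        loader: Callable[[str], Optional[dict]],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader      # stands in for the DB query
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so tests can move time
        self._entries: dict[str, tuple[float, Optional[dict]]] = {}

    def get_active_session(self, sid: str) -> Optional[dict]:
        now = self._clock()
        hit = self._entries.get(sid)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]          # served from cache, no DB call
        session = self._loader(sid)
        if session is not None and session.get("revoked"):
            session = None         # revoked rows are treated as absent
        self._entries[sid] = (now, session)
        return session

    def invalidate(self, sid: str) -> None:
        """Called on revocation so the change is visible immediately."""
        self._entries.pop(sid, None)
```

Without the invalidate call, a revoked session could stay usable for up to the TTL, which is why revoke_session is specified to invalidate the cache.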
Approach (change_password):
@router.post("/change-password")
async def change_password(
    payload: ChangePasswordRequest,
    user: User = Depends(get_current_user),
    session: AuthSession = Depends(get_current_session),
    provider: AuthProvider = Depends(get_auth_provider),  # KTD-10: no direct verify_password
):
    # Validate the old password through the provider, as the endpoint table specifies
    try:
        await provider.authenticate(username=user.username, password=payload.old_password)
    except InvalidCredentials:
        raise HTTPException(400, "old password incorrect") from None
    new_hash = hash_password(payload.new_password)
    async with aiosqlite.connect(str(DEFAULT_AUTH_DB_PATH)) as db:
        await db.execute(
            "UPDATE users SET password_hash=?, updated_at=? WHERE id=?",
            (new_hash, _now_iso(), user.id),
        )
        await db.commit()
    # Keep the current session; every other device is signed out
    revoked_count = await session_service.revoke_all_for_user(
        user.id, except_sid=session.id, reason="password_changed"
    )
    logger.info("Password changed for user %s; revoked %s other sessions", user.id, revoked_count)
    return {"ok": True, "revoked_sessions": revoked_count}
Test scenarios (test_auth_routes.py):
- Happy path:
  - POST /auth/login with valid creds → 200, returns token pair + user
  - POST /auth/login with remember_me=true → refresh token exp 30d
  - POST /auth/login with remember_me=false → refresh token exp 7d
  - POST /auth/refresh with current token → 200, new pair (different from old)
  - GET /auth/whoami with access token → 200, returns user + session metadata
  - GET /auth/whoami with refresh token (cold-start case) → 200
  - GET /auth/sessions → list of current user's active sessions
  - DELETE /auth/sessions/{sid} for own session → 200, that session now revoked
  - POST /auth/logout-others → 200, all other sessions revoked
  - POST /auth/change-password with correct old → 200, other sessions revoked
- Error paths:
  - POST /auth/login with wrong password → 401 (constant-time)
  - POST /auth/login with unknown user → 401 (constant-time)
  - POST /auth/login with inactive user → 403
  - POST /auth/refresh with reused old token → 401 {error: "token_reuse_detected"}
  - POST /auth/refresh with denylisted token → 401
  - POST /auth/refresh with tampered token → 401
  - GET /auth/whoami with no Authorization header → 401
  - GET /auth/whoami with expired access token → 401
  - DELETE /auth/sessions/{sid} for someone else's session → 403
  - POST /auth/change-password with wrong old password → 400
  - POST /auth/change-password with weak new password (if validation added) → 422
- Integration:
  - Login from client A, login from client B (different IPs / fingerprints) → both have independent sessions
  - Login as user from 11 different fingerprints → 11th login evicts the 1st (oldest non-current)
  - Change password → other devices get 401 on next request → bounced to /login
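The "11th login evicts the 1st" scenario depends on the eviction rule picking the oldest session that is not the one just created. A minimal sketch, assuming "oldest" means earliest created_at; SessionRow, enforce_session_cap, and the "session_cap" reason string are illustrative names, not part of the plan.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

MAX_ACTIVE_SESSIONS = 10  # the cap used throughout the plan


@dataclass
class SessionRow:
    sid: str
    created_at: datetime
    revoked: bool = False
    revoked_reason: str | None = None


def enforce_session_cap(
    sessions: list[SessionRow], new_sid: str, cap: int = MAX_ACTIVE_SESSIONS
) -> list[str]:
    """Revoke the oldest active sessions, never the new one, until `cap` remain."""
    active = [s for s in sessions if not s.revoked]
    # Candidates are ordered oldest first; the new login is excluded.
    candidates = sorted(
        (s for s in active if s.sid != new_sid), key=lambda s: s.created_at
    )
    excess = max(len(active) - cap, 0)
    evicted: list[str] = []
    for s in candidates[:excess]:
        s.revoked = True
        s.revoked_reason = "session_cap"  # hypothetical reason value
        evicted.append(s.sid)
    return evicted


if __name__ == "__main__":
    base = datetime(2026, 6, 20, tzinfo=timezone.utc)
    rows = [SessionRow(f"s{i}", base + timedelta(minutes=i)) for i in range(11)]
    print(enforce_session_cap(rows, new_sid="s10"))
```

Running the function a second time on the same rows evicts nothing, since exactly ten active sessions remain.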
Test scenarios (test_admin_routes.py):
- GET /admin/users/{id}/sessions as admin → returns all sessions (active + revoked)
- GET /admin/users/{id}/sessions as non-admin → 403
- DELETE /admin/users/{id}/sessions/{sid} as admin → that session revoked
- DELETE /admin/users/{id}/sessions/{sid} as non-admin → 403
Verification: All integration tests pass; pytest tests/integration/auth/ -v shows green.
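The "401 (constant-time)" login cases require that an unknown username costs about as much work as a wrong password. One common way to get there is to verify against a dummy hash when the user does not exist. The sketch below uses PBKDF2 from the standard library as a stand-in for bcrypt so that it stays self-contained; check_login and the users mapping are hypothetical.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def hash_password(password: str, salt: bytes) -> bytes:
    # Stand-in for bcrypt: a deliberately slow, salted hash.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)


# Hashed once at import time; used when the username is unknown.
_DUMMY_SALT = os.urandom(16)
_DUMMY_HASH = hash_password("dummy-password", _DUMMY_SALT)


def check_login(users: dict, username: str, password: str) -> bool:
    """users maps username -> (salt, password_hash)."""
    record = users.get(username)
    salt, expected = record if record else (_DUMMY_SALT, _DUMMY_HASH)
    # The hash is always computed, so both failure modes do the same work.
    candidate = hash_password(password, salt)
    matches = hmac.compare_digest(candidate, expected)
    return record is not None and matches
```

Both failure modes return the same result to the caller, so the route can map either one to the same 401 response.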
U5. Tauri: keyring integration + commands
Goal: Add three Tauri commands to read/write/clear the refresh token in OS Keychain.
Requirements: F3.
Dependencies: None on the auth side; only depends on Tauri Cargo config.
Files:
- src/agentkit/server/frontend/src-tauri/Cargo.toml: add keyring = { version = "3", features = ["apple-native", "windows-native", "linux-native"] } (keep the feature list explicit: in keyring 3.x the platform credential stores are opt-in features)
- src/agentkit/server/frontend/src-tauri/src/auth.rs: new module with 3 #[tauri::command] functions
- src/agentkit/server/frontend/src-tauri/src/lib.rs: register the commands in tauri::Builder::default().invoke_handler(...)
- src/agentkit/server/frontend/src-tauri/capabilities/default.json: add the 3 commands to the permissions allowlist
- tests/unit-tauri/test_keyring.rs: Rust unit tests using keyring::mock
Approach (auth.rs):
const SERVICE: &str = "com.fischer.agentkit";
const USERNAME: &str = "refresh_token";

#[tauri::command]
pub async fn store_refresh_token(token: String) -> Result<(), String> {
    let entry = keyring::Entry::new(SERVICE, USERNAME)
        .map_err(|e| format!("keychain init failed: {e}"))?;
    entry.set_password(&token)
        .map_err(|e| format!("keychain write failed: {e}"))
}

#[tauri::command]
pub async fn load_refresh_token() -> Result<Option<String>, String> {
    let entry = keyring::Entry::new(SERVICE, USERNAME)
        .map_err(|e| format!("keychain init failed: {e}"))?;
    match entry.get_password() {
        Ok(t) => Ok(Some(t)),
        Err(keyring::Error::NoEntry) => Ok(None),
        Err(e) => Err(format!("keychain read failed: {e}")),
    }
}

#[tauri::command]
pub async fn clear_refresh_token() -> Result<(), String> {
    let entry = keyring::Entry::new(SERVICE, USERNAME)
        .map_err(|e| format!("keychain init failed: {e}"))?;
    match entry.delete_credential() {
        Ok(()) => Ok(()),
        Err(keyring::Error::NoEntry) => Ok(()),
        Err(e) => Err(format!("keychain delete failed: {e}")),
    }
}
Approach (Cargo.toml):
- Add the keyring 3 dependency under [dependencies], with the platform feature list given under Files
- macOS: requires the binary to be signed (Keychain access); for unsigned dev builds, fallback to keyring::mock via feature flag (not needed in this plan; document in README instead)
Approach (capabilities/default.json):
- Add 3 entries to the permissions array:
  - "core:default:allow-store-refresh-token"
  - "core:default:allow-load-refresh-token"
  - "core:default:allow-clear-refresh-token"
Test scenarios (test_keyring.rs):
- store_refresh_token("abc") then load_refresh_token() returns Some("abc")
- clear_refresh_token() then load_refresh_token() returns None
- load_refresh_token() on a fresh keyring returns None (not error)
- Use keyring::mock for CI tests; real platform tests are manual on macOS dev machine
Verification: cargo test --manifest-path src/agentkit/server/frontend/src-tauri/Cargo.toml passes; manual smoke: launch Tauri dev, log in, check macOS Keychain Access.app for the entry.
U6. Frontend: tauri-auth.ts adapter
Goal: Abstract Keychain (Tauri) / localStorage (Web) access behind a single async API.
Requirements: F3, F4.
Dependencies: U5 (the Rust commands must exist for invoke() to work).
Files:
- src/agentkit/server/frontend/src/api/tauri-auth.ts: new module
- tests/unit/api/tauri-auth.test.ts: unit tests with mocked invoke
Approach:
const SERVICE = 'agentkit.refresh_token' // localStorage key for the fallback path

function isTauri(): boolean {
  return typeof window !== 'undefined' && '__TAURI_INTERNALS__' in window
}

export const tauriAuthStorage = {
  async setRefreshToken(token: string): Promise<void> {
    if (isTauri()) {
      try {
        const { invoke } = await import('@tauri-apps/api/core')
        await invoke('store_refresh_token', { token })
        return
      } catch (e) {
        console.warn('[auth] Keychain write failed, falling back to localStorage', e)
      }
    }
    localStorage.setItem(SERVICE, token)
  },

  async getRefreshToken(): Promise<string | null> {
    if (isTauri()) {
      try {
        const { invoke } = await import('@tauri-apps/api/core')
        const stored = await invoke<string | null>('load_refresh_token')
        // An empty Keychain is not final: an earlier write may have
        // fallen back to localStorage, so keep looking there.
        if (stored !== null) return stored
      } catch (e) {
        console.warn('[auth] Keychain read failed, falling back to localStorage', e)
      }
    }
    return localStorage.getItem(SERVICE)
  },

  async clearRefreshToken(): Promise<void> {
    if (isTauri()) {
      try {
        const { invoke } = await import('@tauri-apps/api/core')
        await invoke('clear_refresh_token')
      } catch (e) {
        console.warn('[auth] Keychain clear failed, falling back to localStorage', e)
      }
    }
    // Always clear the fallback copy as well
    localStorage.removeItem(SERVICE)
  },
}
Test scenarios (tauri-auth.test.ts):
- isTauri() returns true when __TAURI_INTERNALS__ is in window
- setRefreshToken in Tauri mode calls invoke('store_refresh_token', { token })
- setRefreshToken in Tauri mode falls back to localStorage when invoke throws
- setRefreshToken in Web mode (no Tauri) writes to localStorage directly
- getRefreshToken in Tauri mode returns the value from invoke('load_refresh_token')
- getRefreshToken in Tauri mode falls back to localStorage when invoke throws
- clearRefreshToken in Tauri mode calls invoke('clear_refresh_token')
- clearRefreshToken in Web mode removes from localStorage
Verification: npm run test:unit -- tauri-auth.test.ts passes; manual test: launch Tauri, log in, verify entry in macOS Keychain.
U7. Frontend: auth store refactor (3-state startup, pre-emptive refresh)
Goal: Rewrite stores/auth.ts to support the new flow.
Requirements: F1, F10, F11, F12.
Dependencies: U6 (adapter), U4 (server endpoints).
Files:
- src/agentkit/server/frontend/src/stores/auth.ts: major refactor
- src/agentkit/server/frontend/src/api/auth.ts: add whoami(), login(rememberMe), changePassword(), listSessions(), revokeSession(), logoutOthers()
- tests/unit/stores/auth.test.ts: extend existing test file
Approach (new auth store shape):
type AuthStartupState = 'valid' | 'invalid' | 'error' | 'pending'

// HTTP status of a failed API call, if the error carries one
function errorStatus(err: unknown): number | undefined {
  return (err as { status?: number } | null)?.status
}

export const useAuthStore = defineStore('auth', () => {
  // --- State ---
  const accessToken = ref<string | null>(null) // memory only, never persisted
  const user = ref<IAuthUser | null>(readStoredUser()) // localStorage cache for avatar/role
  const startupState = ref<AuthStartupState>('pending')
  const isLoading = ref(false)
  const error = ref<string | null>(null)

  // --- Getters ---
  const isAuthenticated = computed(() => !!accessToken.value && !!user.value)
  const accessTokenExp = computed<number | null>(() => decodeJwtExp(accessToken.value))
  const shouldRefresh = computed(() => {
    if (!accessTokenExp.value) return false
    return accessTokenExp.value * 1000 - Date.now() < 2 * 60 * 1000 // < 2 min
  })

  // --- Mutators ---
  function _setAccess(token: string, nextUser: IAuthUser): void {
    accessToken.value = token
    user.value = nextUser
    // user goes to localStorage (safe: no secret)
    localStorage.setItem(USER_KEY, JSON.stringify(nextUser))
    // refresh token goes to Keychain (Tauri) or localStorage (Web)
    // (called separately by login/refresh)
  }

  async function _persistTokenPair(pair: ITokenPair): Promise<void> {
    accessToken.value = pair.access_token
    user.value = pair.user
    writeStoredUser(pair.user)
    await tauriAuthStorage.setRefreshToken(pair.refresh_token)
  }

  function _clear(): void {
    accessToken.value = null
    // do NOT clear user from localStorage (UI shows cached avatar/role)
    // do NOT call tauriAuthStorage.clear here; caller decides
  }

  // --- Actions ---
  async function login(username: string, password: string, rememberMe = false): Promise<void> {
    const pair = await authApi.login(username, password, rememberMe)
    await _persistTokenPair(pair)
    startupState.value = 'valid'
  }

  async function startupCheck(): Promise<AuthStartupState> {
    const refresh = await tauriAuthStorage.getRefreshToken()
    if (!refresh) {
      startupState.value = 'invalid' // not an error, just no token
      return startupState.value
    }
    try {
      const result = await authApi.whoami(refresh)
      // whoami returns { user, access_token, session }
      accessToken.value = result.access_token
      user.value = result.user
      writeStoredUser(result.user)
      startupState.value = 'valid'
    } catch (err) {
      if (errorStatus(err) === 401) {
        await tauriAuthStorage.clearRefreshToken()
        startupState.value = 'invalid'
      } else {
        startupState.value = 'error' // network or server issue; keep the token
      }
    }
    return startupState.value
  }

  async function silentRefresh(): Promise<void> {
    const refresh = await tauriAuthStorage.getRefreshToken()
    if (!refresh) {
      _clear()
      throw new Error('no refresh token')
    }
    try {
      const pair = await authApi.refresh(refresh)
      await _persistTokenPair(pair)
    } catch (err) {
      if (errorStatus(err) === 401) {
        // reuse detected or all sessions revoked
        await tauriAuthStorage.clearRefreshToken()
      }
      _clear()
      throw err
    }
  }

  async function logout(): Promise<void> {
    const refresh = await tauriAuthStorage.getRefreshToken()
    if (refresh) {
      try { await authApi.logout(refresh) } catch { /* server may be down */ }
    }
    await tauriAuthStorage.clearRefreshToken()
    _clear()
    user.value = null // explicit: logged out means no cached user
  }

  function logoutLocal(): void {
    _clear()
    user.value = null
  }

  return { /* state, getters, actions */ }
})
Approach (api/auth.ts additions):
async login(username: string, password: string, rememberMe = false): Promise<ITokenPair> {
  return this.request('/auth/login', {
    method: 'POST',
    body: JSON.stringify({ username, password, remember_me: rememberMe }),
  })
}

// session is null for legacy tokens without a sid
async whoami(refreshToken?: string): Promise<{ user: IAuthUser; access_token: string; session: SessionInfo | null }> {
  // whoami accepts either an access token (normal call) or a refresh token (cold start)
  // The base client's auth header injection handles access; for the cold-start case
  // we need a special path that uses the refresh token instead.
  return this.requestWithAuth('/auth/whoami', refreshToken)
}

async listSessions(): Promise<SessionInfo[]> { ... }
async revokeSession(sid: string): Promise<void> { ... }
async logoutOthers(): Promise<void> { ... }
async changePassword(oldPassword: string, newPassword: string): Promise<void> { ... }
Approach (api/base.ts interceptor):
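this.client.interceptors.request.use(async (config) => {
  const auth = useAuthStore()
  // Skip the refresh call itself (assuming authApi.refresh goes through this
  // client); otherwise silentRefresh would re-enter this interceptor.
  const isRefreshCall = config.url?.includes('/auth/refresh') ?? false
  if (auth.shouldRefresh && auth.accessToken && !isRefreshCall) {
    try {
      await auth.silentRefresh()
    } catch {
      // silent refresh failed; let the request go through and 401 will trigger route
    }
  }
  if (auth.accessToken) {
    config.headers.Authorization = `Bearer ${auth.accessToken}`
  }
  return config
})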
Test scenarios (auth.test.ts):
- login(...) calls authApi.login with remember_me param
- login(...) persists refresh token via tauriAuthStorage.setRefreshToken
- startupCheck() with no refresh token → state='invalid'
- startupCheck() with valid refresh → state='valid', user populated
- startupCheck() with 401 from whoami → state='invalid', refresh token cleared
- startupCheck() with network error → state='error', refresh token retained
- silentRefresh() succeeds → new access in memory, new refresh in Keychain
- silentRefresh() on 401 reuse → all state cleared, refresh token cleared
- shouldRefresh is true when access expires in <2 min
- shouldRefresh is false when access expires in >2 min or no access
- logout() calls authApi.logout then clears Keychain + state
- logout() doesn't fail when server is down (best-effort)
- Access token is NEVER written to localStorage (spy on localStorage.setItem)
Verification: npm run test:unit -- auth.test.ts passes; manual e2e via npm run tauri dev.
U8. Frontend: LoginView "Remember me" + Settings sessions UI
Goal: User-facing changes to the login page and a new "Active Sessions" panel in settings.
Requirements: F2, F7, F8 (user-side).
Dependencies: U7 (store + api), U4 (endpoints).
Files:
- src/agentkit/server/frontend/src/views/LoginView.vue: add "Remember me" checkbox; pass to store.login
- src/agentkit/server/frontend/src/views/SettingsView.vue: new section "Active sessions" (or new route /settings/sessions)
- src/agentkit/server/frontend/src/components/settings/ActiveSessionsPanel.vue: new component
- src/agentkit/server/frontend/src/components/settings/ChangePasswordPanel.vue: new component
- src/agentkit/server/frontend/src/router/index.ts: add /settings/sessions and /settings/security routes
- tests/unit/views/LoginView.test.ts: checkbox behavior
- tests/unit/components/ActiveSessionsPanel.test.ts
Approach (LoginView additions):
<!-- Label reads: "Remember me (no login needed for 30 days)" -->
<a-checkbox v-model:checked="form.rememberMe">
  记住我(30 天内免登录)
</a-checkbox>

async function handleSubmit() {
  await authStore.login(form.username, form.password, form.rememberMe)
  router.replace(redirectTarget())
}
Approach (ActiveSessionsPanel.vue):
- On mount: call authApi.listSessions(), render table (Device / Last active / Created / [Revoke] button)
- "Current session" row has a badge; revoke button is disabled for the current row
- "Revoke" calls authApi.revokeSession(sid) and removes the row
- "Revoke all others" button at the top → calls authApi.logoutOthers() and reloads
Approach (ChangePasswordPanel.vue):
- 3 fields: old password, new password, confirm new password
- Submit: authApi.changePassword(old, new)
- On success: show success message; note "其他设备将自动登出" ("other devices will be signed out automatically")
Test scenarios:
- LoginView renders the checkbox; submitting with it checked passes rememberMe=true to store
- ActiveSessionsPanel renders a row per session from the API response
- ActiveSessionsPanel "Revoke" button calls authApi.revokeSession(sid) and removes the row optimistically
- ActiveSessionsPanel "Revoke all others" calls authApi.logoutOthers() and reloads the list
- ActiveSessionsPanel disables Revoke on the current session row
- ChangePasswordPanel shows field-level validation errors (mismatched passwords)
- ChangePasswordPanel on success shows toast and clears the form
Verification: npm run test:unit -- LoginView ActiveSessionsPanel ChangePasswordPanel passes; Playwright e2e for the full settings flow.
U9. Admin UI: user sessions management
Goal: Admins can see and revoke any user's active sessions.
Requirements: F7, F8 (admin-side).
Dependencies: U7, U4 (admin endpoints exist), U8 (reuses ActiveSessionsPanel layout).
Files:
- src/agentkit/server/frontend/src/views/admin/UsersView.vue (or UserDetailView.vue): add "Sessions" tab
- src/agentkit/server/frontend/src/components/admin/UserSessionsPanel.vue: admin variant
- src/agentkit/server/frontend/src/api/admin.ts: new file
- tests/unit/components/UserSessionsPanel.test.ts
Approach:
- Reuse ActiveSessionsPanel styling; pass an adminMode prop that adds:
  - Show username in the table header
  - Allow revoke of any session including current
  - Show revoked sessions with strikethrough
- API: adminApi.listUserSessions(userId), adminApi.revokeUserSession(userId, sid)
Test scenarios:
- Admin can see all sessions for a user (active + revoked)
- Admin can revoke any session
- Non-admin attempting to call adminApi endpoints gets a clear 403 error in the UI
Verification: npm run test:unit -- UserSessionsPanel passes; manual e2e with admin login.
U10. Backwards-compat + rollout shim
Goal: Existing in-flight clients (without sid claim) keep working for one minor version.
Requirements: N6.
Dependencies: U4 (the back-compat path in get_current_user).
Files:
- src/agentkit/server/dependencies.py: get_current_user accepts both with-sid and without-sid JWTs; logs a DEBUG for legacy
- src/agentkit/server/auth/jwt_utils.py: create_token_pair has a legacy_mode=True flag for the migration window; tokens issued during migration carry sid but the validator still accepts old ones
- docs/migrations/2026-06-20-client-version-rollout.md: new doc explaining the rollout window (server logs a warning when a legacy JWT is accepted)
Approach:
- Add an X-Client-Version header to all requests (set in api/base.ts)
- Server middleware reads this header; if version < 0.5.0, it issues a legacy JWT (no sid) so that client doesn't get a 401 it can't handle
- New clients always get a sid-bearing JWT
- After one minor version (~30 days), remove the legacy path in a separate change
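The version gate above should compare versions numerically, since a plain string comparison would rank "0.10.0" below "0.5.0". A minimal sketch of the check follows; parse_version and wants_legacy_token are hypothetical helper names, and treating a missing or malformed header as legacy is an assumption (clients that predate the header cannot send it).

```python
from __future__ import annotations

LEGACY_CUTOFF = (0, 5, 0)  # clients below this version get a JWT without sid


def parse_version(header: str | None) -> tuple[int, ...] | None:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints; None if unusable."""
    if not header:
        return None
    try:
        parts = tuple(int(part) for part in header.strip().split("."))
    except ValueError:
        return None
    # Pad short versions such as "0.5" so they compare as "0.5.0".
    return parts + (0,) * (3 - len(parts))


def wants_legacy_token(header: str | None) -> bool:
    version = parse_version(header)
    if version is None:
        # Assumption: no usable header means an old client.
        return True
    return version < LEGACY_CUTOFF


if __name__ == "__main__":
    for value in ("0.4.9", "0.5.0", "0.10.0", None):
        print(value, wants_legacy_token(value))
```

Tuple comparison is element-wise, so (0, 10, 0) correctly sorts above (0, 5, 0).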
Test scenarios:
- get_current_user with a sid-bearing JWT loads the session, validates it, returns the user
- get_current_user with a JWT without sid (legacy) accepts it as long as signature + exp are valid
- get_current_user with a sid-bearing JWT where the session is revoked → 401
- get_current_user with a sid-bearing JWT where the session doesn't exist → 401
- Legacy middleware path issues tokens without sid for clients with X-Client-Version < 0.5.0
Verification: Backwards-compat test using a hand-crafted legacy JWT; new client flow continues to work; manual test with the previous-version frontend.
U11. AuthProvider abstraction layer (extension point for future IdP integration)
Goal: Encapsulate "where users are stored / how passwords are verified / how attributes are synced" behind a pluggable AuthProvider adapter. This iteration implements LocalAuthProvider (wrapping SQLite + bcrypt) and also ships StubOIDCProvider as a placeholder (raises NotImplementedError) that serves as the interface-contract reference for a future OIDC implementation. The route layer, admin API, and SessionService obtain the provider via Depends(get_auth_provider), so switching IdP later requires no route changes.
Requirements: F13, F14, F15.
Dependencies: None (referenced by U1/U3/U4; can land in parallel with, before, or after any stage of U1-U4; recommended early in Phase 1 because the U1 schema needs the auth_provider field).
Files:
- src/agentkit/server/auth/providers/__init__.py: new package; exports AuthProvider, the get_auth_provider() factory, LocalAuthProvider, StubOIDCProvider
- src/agentkit/server/auth/providers/base.py: AuthProvider Protocol (name: str plus 4 async methods: authenticate / get_user_by_id / sync_user_attributes / revoke_user)
- src/agentkit/server/auth/providers/local.py: LocalAuthProvider implementation, wrapping the existing auth/password.py logic (bcrypt check + users table lookup)
- src/agentkit/server/auth/providers/oidc_stub.py: StubOIDCProvider placeholder; every method raises NotImplementedError, and the docstring points to the checklist for the next-iteration OIDC implementation
- src/agentkit/server/config.py: extend AuthConfig with provider: Literal["local", "oidc-stub"] = "local" (or add a new auth.provider field)
- tests/unit/auth/providers/test_base.py: Protocol static type checks (runtime_checkable Protocol verification) + mock provider cases
- tests/unit/auth/providers/test_local.py: full LocalAuthProvider unit tests (reusing the auth/password.py test scenarios)
- tests/unit/auth/providers/test_oidc_stub.py: unit tests asserting that any StubOIDCProvider method call raises NotImplementedError
Approach (AuthProvider Protocol):
# auth/providers/base.py
from typing import Protocol, runtime_checkable

from ..models import User


@runtime_checkable
class AuthProvider(Protocol):
    """Capabilities every auth backend must implement.

    The route layer calls only the methods below and does not know whether
    the implementation is SQLite / OIDC / LDAP. Adding an IdP later only
    requires a new adapter that implements this Protocol.
    """

    name: str  # identifies the current provider; written to session.auth_provider

    async def authenticate(self, *, username: str, password: str) -> User:
        """Verify username + password and return the User. Raises InvalidCredentials on failure."""
        ...

    async def get_user_by_id(self, user_id: int) -> User | None:
        """Look up a user by id (used by admin endpoints, session validation, and whoami)."""
        ...

    async def sync_user_attributes(self, user_id: int) -> None:
        """Sync user attributes (department / email / title, etc.). LocalAuthProvider: no-op; OidcAuthProvider: pull the latest profile from the IdP and write it back locally."""
        ...

    async def revoke_user(self, user_id: int) -> None:
        """Disable a user (offboarding / lockout). LocalAuthProvider: UPDATE users SET is_active=0; OidcAuthProvider: call the IdP's disable API (future)."""
        ...
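Approach (LocalAuthProvider): Move the password check in routes/auth.py:201-213 (SQLite SELECT + bcrypt check + load_user) into LocalAuthProvider.authenticate. The route layer no longer calls verify_password / load_user directly; everything goes through the provider. revoke_user runs UPDATE users SET is_active=0 (admin endpoints all call this instead of writing to the DB directly).
Approach (StubOIDCProvider): Every method raises NotImplementedError; the docstring states:
Not implemented yet. For the next-iteration OIDC integration, rewrite this class only; routes / admin / the session table need no changes. With auth.provider: oidc-stub configured, the first provider call fails with NotImplementedError (by design: this prevents enabling an unfinished feature by accident).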
Approach (DI factory):
# auth/providers/__init__.py
from functools import lru_cache

from ...config import get_settings
from .base import AuthProvider
from .local import LocalAuthProvider
from .oidc_stub import StubOIDCProvider


@lru_cache
def get_auth_provider() -> AuthProvider:
    settings = get_settings()
    provider_name = settings.auth.provider
    if provider_name == "local":
        # get_auth_db: the existing aiosqlite connection (needs to become a
        # module-level singleton); its import is not shown here
        db = get_auth_db()
        return LocalAuthProvider(db)
    elif provider_name == "oidc-stub":
        return StubOIDCProvider()
    else:
        raise ValueError(f"unknown auth provider: {provider_name}")
Approach (config extension):
# agentkit.yaml
auth:
  provider: local  # local | oidc-stub (future: oidc-keycloak, oidc-feishu, ...)
  session:
    table: auth_sessions
    access_ttl_seconds: 900
    refresh_ttl_seconds: 604800
    refresh_ttl_remember_me_seconds: 2592000
  jwt:
    secret_env: AGENTKIT_JWT_SECRET
    algorithm: HS256
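The three TTL values in the config correspond to 15 minutes, 7 days, and 30 days. A short sketch of how the refresh expiry could be derived from them; refresh_expires_at is a hypothetical helper, and the constants simply mirror the YAML keys.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the agentkit.yaml fragment
ACCESS_TTL_SECONDS = 900
REFRESH_TTL_SECONDS = 604_800
REFRESH_TTL_REMEMBER_ME_SECONDS = 2_592_000


def refresh_expires_at(now: datetime, remember_me: bool) -> datetime:
    """Pick the refresh TTL based on the remember_me flag from /auth/login."""
    ttl = REFRESH_TTL_REMEMBER_ME_SECONDS if remember_me else REFRESH_TTL_SECONDS
    return now + timedelta(seconds=ttl)


if __name__ == "__main__":
    now = datetime(2026, 6, 20, tzinfo=timezone.utc)
    print(refresh_expires_at(now, remember_me=False))
    print(refresh_expires_at(now, remember_me=True))
```

Keeping the values in seconds matches the config, while the timedelta equalities make the intended durations easy to assert in tests.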
Test scenarios (test_base.py + test_local.py + test_oidc_stub.py):
- LocalAuthProvider with valid username+password returns User
- LocalAuthProvider with wrong password raises InvalidCredentials
- LocalAuthProvider with unknown username raises InvalidCredentials
- LocalAuthProvider with inactive user (is_active=0) raises InvalidCredentials
- LocalAuthProvider.get_user_by_id returns the user or None
- LocalAuthProvider.sync_user_attributes is a no-op (returns None)
- LocalAuthProvider.revoke_user sets is_active=0 and subsequent authenticate fails
- LocalAuthProvider.name == "local"
- StubOIDCProvider.authenticate raises NotImplementedError with helpful message
- StubOIDCProvider.get_user_by_id raises NotImplementedError
- StubOIDCProvider.sync_user_attributes raises NotImplementedError
- StubOIDCProvider.revoke_user raises NotImplementedError
- StubOIDCProvider.name == "oidc-stub"
- get_auth_provider() with auth.provider=local returns LocalAuthProvider instance
- get_auth_provider() with auth.provider=oidc-stub returns StubOIDCProvider instance
- get_auth_provider() with auth.provider=unknown raises ValueError
- get_auth_provider() is memoized (lru_cache; second call returns same instance)
- runtime_checkable(AuthProvider): both Local and Stub pass isinstance(prov, AuthProvider) check
- Protocol violation: a class missing authenticate method does NOT pass isinstance check (negative test)
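The isinstance scenarios rely on how runtime_checkable Protocols behave: the check confirms that the required attributes exist, not that their signatures match. The sketch below uses a simplified copy of the Protocol (returning dict instead of User) to show a passing stub, a failing class, and the signature blind spot; StubProvider, MissingAuthenticate, and WrongSignature are illustrative classes only.

```python
import asyncio
from typing import Optional, Protocol, runtime_checkable


@runtime_checkable
class AuthProvider(Protocol):
    name: str

    async def authenticate(self, *, username: str, password: str) -> dict: ...
    async def get_user_by_id(self, user_id: int) -> Optional[dict]: ...
    async def sync_user_attributes(self, user_id: int) -> None: ...
    async def revoke_user(self, user_id: int) -> None: ...


class StubProvider:
    """Structurally complete, functionally unimplemented."""

    name = "oidc-stub"

    async def authenticate(self, *, username: str, password: str) -> dict:
        raise NotImplementedError("OIDC provider is not implemented yet")

    async def get_user_by_id(self, user_id: int) -> Optional[dict]:
        raise NotImplementedError("OIDC provider is not implemented yet")

    async def sync_user_attributes(self, user_id: int) -> None:
        raise NotImplementedError("OIDC provider is not implemented yet")

    async def revoke_user(self, user_id: int) -> None:
        raise NotImplementedError("OIDC provider is not implemented yet")


class MissingAuthenticate:
    """Lacks authenticate, so the runtime check rejects it."""

    name = "broken"

    async def get_user_by_id(self, user_id: int) -> Optional[dict]:
        return None

    async def sync_user_attributes(self, user_id: int) -> None:
        return None

    async def revoke_user(self, user_id: int) -> None:
        return None


class WrongSignature(StubProvider):
    # Synchronous and missing parameters, yet isinstance still accepts it:
    # the runtime check looks only at attribute presence.
    def authenticate(self):  # type: ignore[override]
        return {}


if __name__ == "__main__":
    print(isinstance(StubProvider(), AuthProvider))
    print(isinstance(MissingAuthenticate(), AuthProvider))
    print(isinstance(WrongSignature(), AuthProvider))
```

Because of the WrongSignature case, the isinstance tests complement the mypy run listed under Verification and do not replace it.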
Patterns to follow:
- Protocol + runtime_checkable pattern (Python typing best practice)
- DI factory + lru_cache singleton (consistent with the existing get_settings)
- The InvalidCredentials error type goes in auth/providers/exceptions.py (new file)
Verification:
- pytest tests/unit/auth/providers/ -v passes in full
- mypy src/agentkit/server/auth/providers/ reports no errors
- Start the dev server with auth.provider: oidc-stub → the first /auth/login returns 501 NotImplementedError (confirms the stub is active)
- Start the dev server with auth.provider: local → the existing login flow runs, confirming nothing is broken
- After the admin kick feature calls provider.revoke_user(user_id), the user's next authenticate fails (cross-check of LocalAuthProvider.revoke_user behavior)
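Future IdP integration checklist (reference for the next iteration):
1. auth/providers/oidc.py: implement OidcAuthProvider (authenticate / get_user / sync_attributes / revoke_user)
2. auth/oauth_routes.py: /auth/oauth/{provider}/redirect and /auth/oauth/{provider}/callback endpoints
3. auth/state_cache.py: OAuth state parameter for CSRF protection (Redis TTL 5min)
4. Local account creation policy when a user first logs in from the IdP (just-in-time creation / reject / invite-only)
5. Session sync with the IdP (an IdP logout also revokes the local session)
6. Mapping of corporate department / title attributes to the local users table
This iteration covers only the Protocol, the Local implementation, the Stub placeholder, the DI factory, and placeholders (interface definitions) for items 1-3 above; everything else goes to a separate brainstorm in the next iteration.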
System-Wide Impact
| Stakeholder | Impact | Mitigation |
|---|---|---|
| End users (Tauri) | First login → no more login prompts for 7d (30d if "remember me"). | Pre-emptive refresh + Keychain storage prevent the failure modes that broke the existing flow. |
| End users (Web) | Same as Tauri but refresh in localStorage (degraded security). | Document the trade-off; Keychain is Tauri-only. |
| Admins | New capability: see active sessions, kick any user. | UI in admin pages; surface clearly in the Users view. |
| Developers (auth code) | New session module, denylist, cache, AuthProvider abstraction layer. | U3 is the single source of truth for session logic, so routes don't duplicate it. U11 is the single entry point to the auth backend, so routes don't import password.py directly. |
| Future corporate IdP integration team | Switching to OIDC / SAML / LDAP only adds an adapter; no rewrite of routes / admin. | The U11 Protocol + LocalAuthProvider are already in place; in the next iteration auth/providers/oidc.py implements the Protocol directly. |
| Existing in-flight clients | Unaffected during 30-day window. | U10 shim. |
| Server load | +1 cache lookup per request (cached 60s). | Redis-backed cache makes this sub-ms. |
| DB schema | New auth_sessions table (including the auth_provider column); existing user_sessions deprecated. | Alembic migration; keep user_sessions reads working for one version. |
Risks & Dependencies
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| `keyring` crate compatibility issues on Linux without gnome-keyring / kwallet | Medium | Low (Tauri dev) | Document `apt install gnome-keyring` in README; fallback to localStorage as per KTD-confirmed decision. |
| Tauri WebView localStorage might be cleared on Tauri upgrade | Low | Medium (forces re-login) | Refresh token is in Keychain, not localStorage, so this is no longer a re-login trigger. Only the cached user (avatar) is lost. |
| Refresh token rotation causes concurrent-request races | Medium | Medium (false-positive reuse detection) | The 30s denylist window catches the case; legitimate retries complete in <1s. Add a metric for reuse detection so we can spot flapping. |
| Migration corrupts existing refresh tokens | Low | High (users locked out) | Test migration on a copy of prod DB; preserve user_sessions reads for back-compat. |
| Session cap eviction surprises users (they didn't expect to be kicked) | Low | Low (visible at next login) | Make the cap (10) generous; document it; do not log evicted users out silently. |
| Test mocks diverge from real `keyring` behavior | Medium | Medium (CI passes, manual fails) | Use `keyring::mock` feature in CI; document that real-platform testing is manual. |
| JWT secret rotation in dev mode invalidates all sessions | Low | High (Tauri dev loops) | Document the behavior; provide agentkit doctor to check. |
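| Leftover routes that call `verify_password` or modify the users table directly when the AuthProvider is switched (KTD-10) | Medium | Medium (must be cleaned up when switching IdP) | Once U11 lands, require every route to go through `Depends(get_auth_provider)`; add a code-review checklist item: "routes must not call password/auth functions directly". |
| `lru_cache` singleton conflicts with test isolation (U11) | Low | Low (flaky tests) | `get_auth_provider` exposes a `cache_clear()` helper; `conftest.py` clears the cache before and after each test fixture. |
| Residual dependencies on `LocalAuthProvider` when a future IdP takes over | Low | Low (fine to keep during migration) | The U11 checklist states explicitly: Local stays available as a "local emergency account"; after OIDC takes over, Local is not deleted, only the routes' default provider changes. |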
External Dependencies
| Dependency | Version | Required For |
|---|---|---|
| `keyring` (Rust crate) | 3.x | Tauri Keychain integration (U5) |
| `pyjwt` (Python) | already in use | JWT signing/verification (U2) |
| `aiosqlite` (Python) | already in use | DB layer (U1, U3) |
| `alembic` (Python) | already in use | Migrations (U1) |
| `redis` (Python) | already in use | Session cache (U3), optional; in-process fallback |
| `@tauri-apps/api` (TS) | 2.x | Tauri command invocation (U6) |
Phased Delivery
This plan has natural phasing based on dependency order. Each phase lands as a single PR.
Phase 1: Backend foundation (U1, U2, U3)
- `auth_sessions` table + migration
- JWT `sid`/`jti` claims
- SessionService with rotation + reuse detection
- Redis/in-process cache
- ~3-4 days of work, no frontend changes
Rollout gate: Deploy to dev. All existing clients continue to work (legacy JWT path). New login creates auth_sessions rows; old user_sessions rows are no longer written.
Phase 2: New endpoints (U4, U10)
- All new auth + admin endpoints
- Backwards-compat shim
- Admin endpoint tests
- ~2 days of work, frontend still on old flow
Rollout gate: Deploy to dev. New endpoints are available; old /auth/login and /auth/refresh still work (with legacy tokens).
Phase 3: Tauri Keychain (U5, U6)
- Rust commands + Cargo dep
- Frontend tauri-auth adapter
- ~1-2 days of work
Rollout gate: Build a new Tauri release. Verify on macOS (Keychain Access.app shows the entry). Linux without keyring daemon → manual test fallback.
Phase 4: Frontend refactor (U7, U8, U9)
- Auth store rewrite (3-state, pre-emptive refresh, no access in localStorage)
- LoginView "Remember me"
- Active Sessions panel in Settings
- Admin user sessions panel
- ~3-4 days of work
Rollout gate: Frontend rebuild. End-to-end manual test on Tauri (macOS) + Web. Run Playwright suite.
Phase 5: Cleanup (after one minor version, ~30 days)
- Remove the legacy JWT back-compat path
- Drop the `user_sessions` table
- Update `X-Client-Version` floor
- ~1 day of work
Phase 6: AuthProvider abstraction layer (U11 + related changes)
Phase added 2026-06-20 (merges the AuthProvider scope)
- `auth/providers/base.py`: `AuthProvider` Protocol + `runtime_checkable`
- `auth/providers/local.py`: `LocalAuthProvider` (wraps the existing password check logic in `routes/auth.py:201-213`)
- `auth/providers/oidc_stub.py`: `StubOIDCProvider` (`raise NotImplementedError` placeholder)
- `auth/providers/__init__.py`: `get_auth_provider()` DI factory (`lru_cache` singleton)
- `config.py`: new `auth.provider: local | oidc-stub` setting
- U1 schema adds the `auth_provider` column (merged into Phase 1 U1)
- U3 SessionService `create_session` accepts an `auth_provider` parameter (merged into Phase 1 U3)
- U4 routes inject `Depends(get_auth_provider)`; admin endpoints call `provider.revoke_user(user_id)` instead of modifying the users table directly (merged into Phase 2 U4)
- ~1.5 days of work (can land in parallel with early Phase 1)
Rollout gate:
- `pytest tests/unit/auth/providers/ -v` passes in full
- Start the dev server with `auth.provider: oidc-stub` → the first `/auth/login` returns 501 NotImplementedError
- Start the dev server with `auth.provider: local` → the existing login flow is unaffected
- The admin kick feature calling `provider.revoke_user(user_id)` behaves the same as the previous direct DB UPDATE
Future IdP integration entry point: the next iteration's OIDC integration only needs to add `auth/providers/oidc.py` + `auth/oauth_routes.py` (see the U11 checklist); routes, admin, and the session table need no changes.
Open Questions
These are deferred to implementation and tracked here for visibility:
- Q1: Should "Active Sessions" be a tab in Settings or a separate route (`/settings/sessions`)? Plan defaults to a Settings tab; revisit if UX testing suggests otherwise.
- Q2: Should the admin UI show `revoked_reason` for kicked sessions? Plan defaults to YES (audit value); revisit if it adds too much visual noise.
- Q3: Should the cap-eviction trigger a server-side notification (e.g. an `audit_event`)? Plan defaults to writing a row to a future `auth_audit_log` table; for now, just the `revoked_reason='session_cap_eviction'` field is enough.
- Q4: Should `change_password` rate-limit (e.g. 5 attempts per hour)? Out of scope here but worth a follow-up security brainstorm.
- Q5: macOS Tauri builds need code-signing for Keychain access. The dev binary is unsigned → Keychain prompts "always allow". Plan documents this; production builds must be signed.
- Q6 (added 2026-06-20): How does the AuthProvider abstraction layer coexist with the existing password check logic in `routes/auth.py:201-213`? Planned approach: in U11's first step, `LocalAuthProvider` fully replicates the existing logic (behavior-equivalent); in the second step, the U4 route rework switches over in one go. When U11 lands, a "behavior equivalence" test suite confirms that behavior is identical before and after the switch.
- Q7 (added 2026-06-20): How is the `lru_cache` singleton behind `get_auth_provider()` isolated in the test environment? Planned approach: export a `cache_clear()` helper; `conftest.py` calls `get_auth_provider.cache_clear()` before and after each test fixture; do not introduce `dependency_overrides` (avoids polluting FastAPI app state).
Sources & Research
Codebase references
- src/agentkit/server/auth/models.py — current `UserSessionModel` + aiosqlite bootstrap pattern
- src/agentkit/server/auth/jwt_utils.py — current JWT issuance
- src/agentkit/server/routes/auth.py — current login/refresh/logout/me
- src/agentkit/server/auth/password.py — bcrypt cost=12
- src/agentkit/server/auth/dependencies.py — `require_authenticated`
- src/agentkit/server/app.py:928 — router registration
- src/agentkit/server/frontend/src/stores/auth.ts — current Pinia store
- src/agentkit/server/frontend/src/router/index.ts:166-189 — route guard
- src/agentkit/server/frontend/src/views/LoginView.vue — login page
- src/agentkit/server/frontend/src/api/auth.ts — frontend auth API client
- src/agentkit/server/frontend/src/api/base.ts — base API client + interceptor
- src/agentkit/server/frontend/src-tauri/Cargo.toml — current Tauri deps
- src/agentkit/server/frontend/src-tauri/src/lib.rs — current Tauri command registration
External references
- OWASP JWT Security Cheat Sheet — refresh token rotation, denylist patterns
- Auth0 Refresh Token Rotation docs (https://auth0.com/docs/secure/tokens/refresh-tokens/refresh-token-rotation)
- `keyring` crate v3 docs (https://docs.rs/keyring/latest/keyring/) — cross-platform credential storage
- Tauri 2.x Capabilities system — command allowlisting (https://v2.tauri.app/security/capabilities/)
Institutional learnings
- Project context: AGENTS.md + .trae/rules/project_rules.md — security and async generator safety rules apply
- Existing tests: `tests/unit/auth/` + `tests/integration/auth/` — patterns to follow for new test files
- The current `_refreshFailed` sticky flag in stores/auth.ts:112 is the root cause of the "logged out for no reason" UX — the rewrite in U7 eliminates it by always re-trying the refresh before giving up
Acceptance Examples (for the executor / reviewer)
The following end-to-end flows must work after this plan lands. Each is testable in Playwright or manual e2e.
AE-1: First login → cold start → main app (Covers F1, F3, F10, F11)
- Launch Tauri (clean state, no Keychain entry)
- Log in with valid credentials → land on `/agent`
- Close Tauri window
- Re-launch Tauri (cold start)
- Expected: brief splash, then `/agent`. No login page seen. Keychain Access.app shows an entry for `com.fischer.agentkit / refresh_token`.
AE-2: Token expiry mid-session → silent refresh (Covers F10)
- Log in; access token exp 15 min
- Wait 13 minutes (or manually expire the token in DB)
- Make an API call (e.g. fetch conversations)
- Expected: request succeeds (silent refresh happened before the call); no 401 surfaced to the user.
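The decision behind the silent refresh in AE-2 can be sketched as a single predicate. The real check lives in the frontend auth store; Python is used here only to show the logic, and the 180-second threshold is an assumption chosen so that a 15-minute token is refreshed at the 13-minute mark.

```python
import time
from typing import Optional

# Assumed threshold: refresh once 3 minutes or less remain on the access token.
REFRESH_SKEW_SECONDS = 180


def needs_refresh(
    exp: float,
    now: Optional[float] = None,
    skew: float = REFRESH_SKEW_SECONDS,
) -> bool:
    """Return True when the access token should be refreshed before the next call.

    exp and now are Unix timestamps in seconds; now defaults to the current time.
    """
    current = time.time() if now is None else now
    return exp - current <= skew
```

Checking this before each API call, rather than reacting to a 401, is what keeps the expiry invisible to the user.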
AE-3: Refresh token reuse → all sessions revoked (Covers F5, F9)
- Log in from Tauri (session A)
- Log in from Web (session B)
- Copy A's refresh token from Keychain
- Wait for A to refresh once legitimately (A's old refresh is now in the 30s denylist, and A has a new refresh)
- Try to use the copied old refresh token
- Expected: 401 with `error: "token_reuse_detected"`. A's session is revoked. B's session is also revoked. Both clients get bounced to /login.
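The rotation and reuse-detection flow exercised by AE-3 could be modeled in memory as below. This is a simplified sketch, not the SessionService itself: the class and exception names are hypothetical, the revoked-reason string is assumed, and the 30s denylist window is deliberately left out because its exact semantics are defined elsewhere in the plan.

```python
import hashlib
import secrets
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


class TokenReuseDetected(Exception):
    """A refresh token that was already rotated out was presented again."""


class InvalidRefreshToken(Exception):
    """The token is unknown or belongs to a revoked session."""


def _digest(token: str) -> str:
    # Only hashes are stored, never the raw refresh token.
    return hashlib.sha256(token.encode()).hexdigest()


@dataclass
class Session:
    sid: str
    user_id: str
    current_hash: str
    revoked_reason: Optional[str] = None


class SessionStore:
    def __init__(self) -> None:
        self.sessions: Dict[str, Session] = {}
        self._spent: Dict[str, str] = {}  # hash of a rotated-out token -> sid

    def login(self, user_id: str) -> Tuple[str, str]:
        sid, token = secrets.token_hex(8), secrets.token_urlsafe(32)
        self.sessions[sid] = Session(sid, user_id, _digest(token))
        return sid, token

    def refresh(self, token: str) -> str:
        digest = _digest(token)
        if digest in self._spent:
            # Reuse: revoke every active session of the owning user.
            owner = self.sessions[self._spent[digest]].user_id
            for session in self.sessions.values():
                if session.user_id == owner and session.revoked_reason is None:
                    session.revoked_reason = "token_reuse_detected"
            raise TokenReuseDetected(owner)
        for session in self.sessions.values():
            if session.current_hash == digest and session.revoked_reason is None:
                new_token = secrets.token_urlsafe(32)
                self._spent[digest] = session.sid  # the old token is now spent
                session.current_hash = _digest(new_token)
                return new_token
        raise InvalidRefreshToken()
```

Because the reuse branch revokes by user rather than by session, presenting A's copied token also takes down B, matching the expected outcome above.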
AE-4: Password change → other device kicked (Covers F9)
- Log in from Tauri (session A) and Web (session B) as the same user
- From A, change password
- From B, make any API call
- Expected: B gets 401 → bounced to /login. A continues to work.
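The "kick everything except the session that changed the password" rule in AE-4 reduces to a small filter, sketched here over an in-memory list. The `Session` shape and the `password_changed` reason string are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Session:
    sid: str
    user_id: str
    revoked_reason: Optional[str] = None


def revoke_other_sessions(
    sessions: List[Session],
    user_id: str,
    keep_sid: str,
    reason: str = "password_changed",  # assumed reason string
) -> List[str]:
    """Revoke every active session of user_id except keep_sid; return revoked sids."""
    revoked = []
    for session in sessions:
        if (
            session.user_id == user_id
            and session.sid != keep_sid
            and session.revoked_reason is None
        ):
            session.revoked_reason = reason
            revoked.append(session.sid)
    return revoked
```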
AE-5: Admin kicks a session (Covers F7, F8)
- User logs in from Tauri and Web
- Admin opens the Users view, selects the user, opens the Sessions tab
- Admin clicks "Revoke" on the Tauri session
- Expected: Tauri client's next API call returns 401 → bounced to /login. Web session is unaffected.
AE-6: Remember me toggle (Covers F2)
- Log in with "Remember me" UNCHECKED
- Expected: refresh token exp is 7 days
- Log out, log in with "Remember me" CHECKED
- Expected: refresh token exp is 30 days
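The lifetime selection checked by AE-6 amounts to one branch. A minimal sketch, with `refresh_expiry` as a hypothetical helper name:

```python
from datetime import datetime, timedelta, timezone

DEFAULT_REFRESH_TTL = timedelta(days=7)    # "Remember me" unchecked
REMEMBER_REFRESH_TTL = timedelta(days=30)  # "Remember me" checked


def refresh_expiry(remember_me: bool, issued_at: datetime) -> datetime:
    """Return the refresh token expiry for a login issued at issued_at."""
    ttl = REMEMBER_REFRESH_TTL if remember_me else DEFAULT_REFRESH_TTL
    return issued_at + ttl
```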
AE-7: Session cap eviction (Covers F12 + the cap)
- Log in 10 times from 10 different simulated clients (use curl with different User-Agent headers)
- Expected: 10 sessions exist, all active
- Log in an 11th time
- Expected: the oldest non-current session is revoked (visible in DB with `revoked_reason='session_cap_eviction'`); the 10 active sessions are now the 2nd-10th plus the new 11th
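The eviction rule in AE-7 can be sketched over an in-memory list as follows. `login_with_cap` and the integer `created_at` are simplifications for illustration; the real logic would run inside the SessionService against the `auth_sessions` table.

```python
from dataclasses import dataclass
from typing import List, Optional

SESSION_CAP = 10


@dataclass
class Session:
    sid: str
    created_at: int  # simplified timestamp; larger means newer
    revoked_reason: Optional[str] = None


def login_with_cap(
    sessions: List[Session], new: Session, cap: int = SESSION_CAP
) -> List[str]:
    """Add a new session, evicting the oldest active ones so the cap still holds."""
    evicted = []
    active = sorted(
        (s for s in sessions if s.revoked_reason is None),
        key=lambda s: s.created_at,
    )
    # Make room before inserting, so the new session is never the one evicted.
    while len(active) >= cap:
        oldest = active.pop(0)
        oldest.revoked_reason = "session_cap_eviction"
        evicted.append(oldest.sid)
    sessions.append(new)
    return evicted
```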
AE-8: Web fallback to localStorage (Covers F4)
- Open the app in a browser (not Tauri)
- Log in
- Expected: `localStorage.getItem('agentkit.refresh_token')` returns the token. DevTools shows the value.
- (Note: this is the documented degraded security model for Web clients)
AE-9: Old client still works during migration (Covers N6)
- Build a previous-version frontend
- Log in (gets a legacy JWT without sid)
- Make API calls
- Expected: server validates the legacy JWT via the back-compat path; user is not affected
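The back-compat dispatch that AE-9 relies on might look like the sketch below. It assumes signature verification has already happened and works on the decoded claims; `validate_claims`, `TokenRejected`, the logger name, and the `is_session_active` callback are hypothetical.

```python
import logging
from typing import Callable, Mapping

logger = logging.getLogger("agentkit.auth")  # logger name is an assumption


class TokenRejected(Exception):
    """The token is expired or its session is no longer active."""


def validate_claims(
    claims: Mapping[str, object],
    now: float,
    is_session_active: Callable[[str], bool],
) -> str:
    """Return "legacy" or "session" depending on which validation path accepted the token."""
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        raise TokenRejected("expired")
    sid = claims.get("sid")
    if sid is None:
        # Pre-migration token: no session row exists, so only exp can be checked.
        logger.debug("Legacy JWT without sid; using exp-only validation")
        return "legacy"
    if not is_session_active(str(sid)):
        raise TokenRejected("session_revoked")
    return "session"
```

The legacy branch cannot be revoked server-side, which is why the plan limits it to the migration window and removes it in Phase 5.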
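AE-10: AuthProvider switch (local → oidc-stub, verifying the interface contract) (Covers F13, F14)
Added 2026-06-20 (KTD-10 / U11 verification)
- Set `auth.provider: local` in `agentkit.yaml` and start the dev server
- Call `POST /auth/login` with the existing admin account
- Expected: 200 OK with a TokenResponse; the DB row has `auth_sessions.auth_provider='local'`
- Change the config to `auth.provider: oidc-stub` and restart the dev server
- Call `POST /auth/login` with the same account
- Expected: 501 Not Implemented (StubOIDCProvider raises NotImplementedError)
- Verify that the admin endpoint `/admin/users/{id}/sessions` still lists the session created by the successful `local` login above (including the `auth_provider='local'` field)
- Expected: the admin session listing is unaffected by the provider switch (the core promise of KTD-10)
- Call `isinstance(provider_instance, AuthProvider)` to verify that both Local and Stub pass the Protocol check
- Expected: both return `True` (the `runtime_checkable` Protocol behaves correctly)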
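AE-11: Audit field auth_provider is written (historical + new rows) (Covers F15)
- After completing AE-1 steps 1-2, call `GET /auth/sessions` to list all active sessions of the current user
- Expected: each session includes an `auth_provider: "local"` field (rows backfilled from `user_sessions` are also `'local'`, because the backfill uses the default value)
- Admin calls `GET /admin/users/{id}/sessions` to view sessions across users
- Expected: every session carries the `auth_provider` field and the admin can filter by provider (only local exists today; once OIDC is integrated, oidc-* values will distinguish them)
- `SessionService.list_active_by_provider('local')` returns all local sessions
- Expected: count = the total seen in step 2
- `SessionService.list_active_by_provider('oidc-stub')` returns an empty list under the current implementation
- Expected: count = 0 (proves the field exists but holds no data yet; values appear only after OIDC is integrated)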
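The provider filter and the backfill default checked in AE-11 can be illustrated with the standard-library `sqlite3` module. The table layout below is a reduced, assumed subset of `auth_sessions` (only `sid`, `user_id`, `auth_provider`, `revoked_at`), and `list_active_by_provider` is written as a free function rather than a SessionService method.

```python
import sqlite3
from typing import List


def make_demo_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE auth_sessions (
            sid TEXT PRIMARY KEY,
            user_id TEXT NOT NULL,
            auth_provider TEXT NOT NULL DEFAULT 'local',
            revoked_at TEXT
        )
        """
    )
    # Backfilled row: auth_provider is omitted, so the column default applies.
    conn.execute("INSERT INTO auth_sessions (sid, user_id) VALUES ('s1', 'u1')")
    # New login: the provider is written explicitly.
    conn.execute(
        "INSERT INTO auth_sessions (sid, user_id, auth_provider) "
        "VALUES ('s2', 'u1', 'local')"
    )
    # Revoked session: must not be listed as active.
    conn.execute(
        "INSERT INTO auth_sessions (sid, user_id, revoked_at) "
        "VALUES ('s3', 'u2', '2026-06-20T00:00:00Z')"
    )
    return conn


def list_active_by_provider(conn: sqlite3.Connection, provider: str) -> List[str]:
    """Return the sids of non-revoked sessions created through the given provider."""
    rows = conn.execute(
        "SELECT sid FROM auth_sessions "
        "WHERE auth_provider = ? AND revoked_at IS NULL ORDER BY sid",
        (provider,),
    )
    return [row[0] for row in rows]
```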
- AE-9 (continued): the server log shows DEBUG: "Legacy JWT without sid; using exp-only validation"