Fischer AgentKit — Centralized Auth & Token Persistence (Plan)

Date: 2026-06-20 Status: active Branch: feat/auth-server-token-persistence (formerly feat/centralized-auth-token-persistence) Type: feat Origin: docs/brainstorms/2026-06-20-centralized-auth-token-persistence-requirements.md

2026-06-20 update: merged the AuthProvider abstraction-layer scope (origin §5.5); added KTD-10, the U1 auth_provider field, the U3/U4 changes, implementation unit U11, Phase 6, and AE-10/AE-11.


Summary

Replace the current minimal JWT + localStorage auth with a production-grade scheme: server-side session table (track every login, enable forced revocation), Tauri OS Keychain storage for refresh tokens (encrypted at rest), refresh token rotation (defense against token leakage), pre-emptive token refresh (no 401 storms), a three-state startup (valid / invalid / error), and an AuthProvider abstraction layer that decouples routes / admin API / session table from the concrete auth backend (Local today; OIDC / SAML / LDAP tomorrow via adapter). Goal: after first login, Tauri cold-start goes directly to the main app with no login page; admins can see and force-revoke any user's active sessions; a password change instantly invalidates all other devices; future enterprise IdP integration requires no rewrite of the routing or admin layer.


Problem Frame

The current auth flow has three structural gaps:

  1. Token at rest in plaintext: access_token, refresh_token, and user are stored unencrypted in WebView localStorage (~/Library/WebKit/.../LocalStorage/ on macOS). Any process with file access can read them.
  2. No revocation surface: the user_sessions table only stores a refresh-token hash with revoked_at. There is no device fingerprint, no IP, no "kick this session" admin endpoint, no "change password → kick everywhere" flow. Sessions outlive the user's intent.
  3. No rotation: the same refresh token can be used for the full 7-day window. If leaked, the attacker has a week of access with no detection.

The user's primary stated need is "after I log in once, subsequent app opens should go straight to the main app." The current code attempts this via localStorage rehydration, but two failure modes break it: (a) refresh hits _refreshFailed and the auth store clears itself; (b) when the access token expires mid-session and refresh fails (server restart, network blip), the store clears and the user is bounced to /login. We need both stronger local persistence and server-side session awareness to make this experience reliable.

The secondary stated needs are centralized, group-wide management and eventual integration with the group's account credentials (IdP integration). Without a session table and admin endpoints, an admin cannot see who is logged in, force-logout a lost device, or ensure that a compromised employee is immediately removed from all devices. The session table is the same data model an IdP would feed. To keep the future IdP integration from requiring a routing / admin rewrite, the auth backend must be pluggable behind an AuthProvider Protocol (see KTD-10 and U11): Local today, OIDC tomorrow, without touching routes or admin code.


Scope Boundaries

In Scope

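  • New auth_sessions SQLAlchemy model + table, created through the existing _SCHEMA_SQL + init_auth_db() pattern (this codebase does not use Alembic; see the note in U1)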
  • JWT payload extended with sid (session id) and jti (token id); session validation on every request
  • Refresh token rotation on every /auth/refresh call; old token enters a 30s short-window denylist
  • Refresh-token reuse detection → revoke all sessions for that user (defense against token theft)
  • New endpoints: GET /auth/whoami, GET /auth/sessions, DELETE /auth/sessions/{id}, POST /auth/logout-others, POST /auth/change-password, GET /admin/users/{id}/sessions, DELETE /admin/users/{id}/sessions/{sid}
  • Active session cap = 10 per user; login that would exceed the cap evicts the oldest non-current session
  • "Remember me" login option: refresh TTL = 30 days (vs default 7 days)
  • Tauri Rust commands: store_refresh_token / load_refresh_token / clear_refresh_token using the keyring crate (macOS Keychain / Windows Credential Manager / Linux Secret Service)
  • Frontend tauri-auth.ts adapter with localStorage fallback when Keychain is unavailable
  • Frontend auth-store: 3-state startup (valid / invalid / error), pre-emptive refresh when access expires in <2 min, no localStorage write of access token
  • Frontend "Remember me" checkbox on LoginView
  • Frontend "Active sessions" management UI in SettingsView (list current devices, kick others)
  • Admin UI: see any user's active sessions, kick any session
  • Backwards-compat for one minor version: old clients without sid claim still work via user_sessions table fallback
  • AuthProvider abstraction layer (auth/providers/base.py Protocol + LocalAuthProvider + StubOIDCProvider): routes / admin / SessionService obtain the provider via Depends(get_auth_provider), so switching IdP does not require rewriting routes
  • auth_sessions.auth_provider column records the login source (local / oidc-stub / future oidc-keycloak / saml / ldap)
  • Config switch auth.provider: local | oidc-stub (in agentkit.yaml); adding a new provider in the future only requires a new adapter

Out of Scope (deferred to follow-up work)

  • Enterprise IdP / SSO (OIDC / SAML / LDAP / Feishu / DingTalk / WeCom): separate brainstorm
  • 2FA / TOTP / WebAuthn / Passkey — separate brainstorm
  • Multi-tenant / org isolation — separate brainstorm
  • Password strength policy / password expiry / password history — separate IAM brainstorm
  • Login failure lockout / sliding-window rate-limit — separate security brainstorm
  • Email / SMS notifications for reuse detection — requires notification service
  • Full audit log search / export — separate observability brainstorm
  • Per-session device "trust" flag (e.g. "this Mac is trusted for 90 days") — defer until IdP work

Resolved Decisions (locked in from the brainstorm)

| # | Question | Decision |
|---|----------|----------|
| 1 | Remember me TTL | 30 days (vs default 7 days) |
| 2 | Active session cap | 10 per user, evict oldest non-current on overflow |
| 3 | Tauri Keychain unavailable behavior | Silently fall back to localStorage, log warning |

Requirements (carried from origin)

The plan must satisfy all of the following origin IDs (see requirements doc):

  • F1 First login → cold-start app goes directly to main UI, never shows login
  • F2 "Remember me" toggle: 7d / 30d refresh TTL
  • F3 Tauri: refresh token stored in OS Keychain, never on localStorage disk
  • F4 Web: refresh token in localStorage (degraded security, accepted)
  • F5 Refresh token rotation: every /auth/refresh invalidates the old token
  • F6 Server: every login recorded with device/IP/time
  • F7 Admin: see any user's active sessions
  • F8 Admin / self: kick any session
  • F9 Password change: kick all other sessions
  • F10 Pre-emptive refresh when access expires in <2 min
  • F11 Startup distinguishes valid / invalid / error
  • F12 Multiple Tauri / Web clients can log in the same user simultaneously (independent sessions)
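  • F13 AuthProvider is pluggable (the auth.provider config switches local ↔ oidc-stub with zero changes to routes / admin / session table)
  • F14 Admin endpoints are decoupled from the auth backend (after a future IdP switch, the admin session list and kick features keep working unchanged)
  • F15 Audit log records the auth_provider field (the login source is traceable)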
  • N1 Token validation P99 < 5ms (Redis cache for session metadata)
  • N5 All auth code has unit + integration tests
  • N6 Backwards-compat for old clients (1 minor version)

Key Technical Decisions

KTD-1: auth_sessions table (new) vs extending user_sessions (existing)

Decision: Create a new auth_sessions table; deprecate user_sessions over 1 minor version.

Rationale: user_sessions is built around refresh_token_hash + revoked_at and carries no first-class device, IP, or revocation-reason data. The new design needs device_fingerprint, device_label, ip, user_agent, last_active_at, expires_at, revoked_reason, and previous_session_id as queryable columns. Bolting these onto the existing table would change its semantics (the table is also referenced in production hardening tests). A clean break with a backfill is safer than schema bloat.

Trade-off: Two-table coexistence during the deprecation window. Mitigated by: keep user_sessions reads working for clients without sid claim (N6).

KTD-2: Session validation on every request (not just refresh)

Decision: get_current_user dependency reads sid from JWT, queries auth_sessions table (with Redis cache, 60s TTL) to confirm revoked=False and expires_at > now.

Rationale: Without this, a kicked-out user keeps their access token for up to 15 min (access TTL). With it, the kicked session is dead on the next request. The cost is +1 DB/cache lookup per request; cache makes this sub-ms.

Trade-off: A cache miss adds ~5ms to that request; with the 60s cache the actual DB query rate is ~1/min per active session.
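The per-request check described in KTD-2 can be sketched roughly as below. This is a minimal synchronous illustration, not the project's code: `SessionRow`, `validate_session`, and the `load_from_db` callable are hypothetical names, and a plain dict stands in for the Redis / LRU cache.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional

CACHE_TTL = timedelta(seconds=60)  # KTD-2: session metadata is cached for 60s


@dataclass
class SessionRow:
    """Hypothetical stand-in for one auth_sessions row."""
    sid: str
    revoked: bool
    expires_at: datetime


def validate_session(
    sid: str,
    now: datetime,
    cache: dict[str, tuple[SessionRow, datetime]],
    load_from_db: Callable[[str], Optional[SessionRow]],
) -> Optional[SessionRow]:
    """Return the session if it is still active, otherwise None."""
    cached = cache.get(sid)
    if cached is not None and now - cached[1] < CACHE_TTL:
        row = cached[0]  # cache hit: no DB access
    else:
        row = load_from_db(sid)  # cache miss or stale entry: one DB lookup
        if row is None:
            return None
        cache[sid] = (row, now)
    if row.revoked or row.expires_at <= now:
        return None
    return row
```

With a cache like this, a revocation is only visible on the next request if the revoking code path also drops the cached entry; otherwise the stale row can survive for up to the cache TTL.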

KTD-3: Refresh token rotation + 30s denylist

Decision: Every successful /auth/refresh issues a new refresh token. The old token's hash is added to an in-memory + Redis denylist for 30 seconds. If the old token is reused within that window → TokenReuseDetected → revoke ALL sessions for that user.

Rationale: Industry standard (Auth0, Okta, AWS). Closes the window where an attacker who captured the old token can still use it after the legitimate user has refreshed.

Trade-off: The 30s window is a small UX cost (concurrent refresh calls from the same client during retry) but acceptable; legitimate retries complete in <1s and don't hit the window.

KTD-4: Tauri Keychain via keyring crate (not tauri-plugin-stronghold)

Decision: Use the keyring crate directly. It provides unified API across macOS Keychain, Windows Credential Manager, and Linux Secret Service with a single dependency.

Rationale: tauri-plugin-stronghold is a Tauri-team plugin but the v2 ecosystem is still maturing and the docs lag. keyring is a widely used Rust crate for OS credential storage with a single cross-platform API. Smaller surface, fewer moving parts.

Trade-off: We write 3 small Tauri commands (store_refresh_token / load_refresh_token / clear_refresh_token) instead of using a plugin's auto-generated bindings. ~50 lines of Rust.

KTD-5: Access token in memory only (not persisted)

Decision: Access token lives only in the auth store's reactive ref<string | null>. Never written to localStorage or Keychain.

Rationale: Access tokens are short-lived (15 min). The cost of losing one (re-auth) is low; the security cost of persisting them (broader attack surface) is high. Refresh token is the only thing that needs durable storage.

Trade-off: App reload requires one refresh round-trip to get a new access token. Mitigated by the pre-emptive refresh + 3-state startup: by the time the app needs to call an API, the access token is already fresh.

KTD-6: Redis cache for session metadata (not just in-memory)

Decision: Use Redis (when available) to cache auth_sessions rows by sid. Fallback to in-process LRU (size=1024) when Redis is unavailable.

Rationale: The Tauri sidecar may run without Redis (zero-config dev mode). In-process LRU gives the same hit rate for single-process deployments. When Redis IS available (server deployment, multi-instance), it's the right cross-process answer.

Trade-off: Two code paths. Mitigated by a small SessionCache interface with two impls.
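As an illustration of the in-process fallback, a bounded LRU map can be built on `OrderedDict`. This is a sketch under assumptions: the class name `InProcessLRU` and its method names are hypothetical, the real cache is described as async, and the 60s TTL from KTD-2 is left out to keep the eviction logic visible.

```python
from collections import OrderedDict
from typing import Generic, Optional, TypeVar

V = TypeVar("V")


class InProcessLRU(Generic[V]):
    """Bounded map that evicts the least recently used entry when full."""

    def __init__(self, max_size: int = 1024) -> None:
        self._max_size = max_size
        self._items: "OrderedDict[str, V]" = OrderedDict()

    def get(self, key: str) -> Optional[V]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key: str, value: V) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self._max_size:
            self._items.popitem(last=False)  # drop the least recently used

    def invalidate(self, key: str) -> None:
        self._items.pop(key, None)  # no error if the key is absent
```

A Redis-backed implementation would expose the same three operations, which is what keeps the two code paths behind one small interface.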

KTD-7: Session cap eviction strategy = LRU (oldest non-current)

Decision: When login would create the 11th session for a user, the oldest non-current session is revoked (with revoked_reason='session_cap_eviction') before the new one is created.

Rationale: LRU is intuitive ("the device I haven't used in a month should be the first to go"). Kicking "current" is wrong because the user is actively logging in.

Trade-off: None meaningful. Cap=10 is generous; the eviction is invisible to all but the user on the kicked device.
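The eviction choice can be sketched as a pure function. Two assumptions are made here and are not confirmed by the plan: "oldest" is read as "least recently active" (ordering by last_active_at rather than created_at), and the names `ActiveSession` and `pick_evictions` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional

SESSION_CAP = 10  # active sessions allowed per user


@dataclass
class ActiveSession:
    sid: str
    last_active_at: datetime


def pick_evictions(
    active: List[ActiveSession],
    current_sid: Optional[str],
    cap: int = SESSION_CAP,
) -> List[str]:
    """Return the sids to revoke so that one more session fits under the cap."""
    overflow = len(active) + 1 - cap
    if overflow <= 0:
        return []
    # Never evict the session the user is currently acting from.
    candidates = sorted(
        (s for s in active if s.sid != current_sid),
        key=lambda s: s.last_active_at,
    )
    return [s.sid for s in candidates[:overflow]]
```

Each returned sid would then be revoked with revoked_reason='session_cap_eviction' before the new row is inserted.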

KTD-8: Pre-emptive refresh in api/base.ts interceptor (not in Pinia getter)

Decision: A request interceptor in BaseApiClient checks shouldRefresh() (access exp <2 min) BEFORE sending, and awaits silentRefresh() if needed.

Rationale: An interceptor guarantees the check runs for every request. A Pinia getter would only fire on accessToken access, which is not all requests (e.g. background fetches that don't read the getter).

Trade-off: One async function call before each request when expiring; negligible.
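The timing rule behind shouldRefresh() is language-neutral; the real check lives in the TypeScript interceptor, so the Python below is only a sketch of the rule. The helper names `read_exp` and `should_refresh` are hypothetical, and treating a missing or unreadable token as "refresh now" is an assumption.

```python
import base64
import json
import time
from typing import Optional

REFRESH_THRESHOLD_S = 120  # refresh when the access token expires in < 2 min


def read_exp(jwt_token: str) -> Optional[int]:
    """Read the exp claim WITHOUT verifying the signature (client-side hint only)."""
    try:
        payload_b64 = jwt_token.split(".")[1]
        padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore base64 padding
        payload = json.loads(base64.urlsafe_b64decode(padded))
        return int(payload["exp"])
    except (IndexError, KeyError, ValueError, TypeError):
        return None


def should_refresh(access_token: Optional[str], now: Optional[float] = None) -> bool:
    """True when the token is missing, unreadable, or within the refresh threshold."""
    if access_token is None:
        return True
    exp = read_exp(access_token)
    if exp is None:
        return True
    current = time.time() if now is None else now
    return exp - current < REFRESH_THRESHOLD_S
```

Because the server still verifies every token, skipping signature verification on the client only affects when a refresh is attempted, not whether a request is authorized.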

KTD-9: Backwards-compat shim for old clients

Decision: dependencies.py:get_current_user accepts JWTs with or without sid claim. Missing sid → fall back to user_sessions.refresh_token_hash validation. This path is logged and gated to one minor version.

Rationale: Avoids breaking in-flight clients. Lets us roll out gradually.

Trade-off: Two validation paths in get_current_user. Mitigated by extracting the session-lookup into a helper that both paths share.

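KTD-10: AuthProvider abstraction layer (extension point for future IdP integration)

Decision: Authentication logic goes through the auth/providers/base.py:AuthProvider Protocol (name / authenticate / get_user_by_id / sync_user_attributes / revoke_user), injected into the route layer via Depends(get_auth_provider). The current default is LocalAuthProvider (wrapping SQLite + bcrypt); when a future OidcAuthProvider takes over, routes / admin / session table need zero changes. StubOIDCProvider is a placeholder (raises NotImplementedError) used to validate the interface contract ahead of time.

Rationale: The user explicitly stated that the system will need to integrate with the group's account credentials in the future (OIDC / SAML / LDAP / Feishu / DingTalk / WeCom). If "where users are stored / how passwords are verified" is hard-coded into routes/admin now, a future IdP switch would require rewriting the whole route layer plus the admin endpoints. Abstracting early means a future IdP integration only adds one adapter (~300-500 lines) and does not touch the existing routes / admin / SessionService. The auth_sessions table gains an auth_provider column that records the login source, so audits are traceable.

Trade-off:

  • Cost: one extra abstraction layer (auth/providers/base.py Protocol) + one DI factory (get_auth_provider) + one StubOIDCProvider placeholder
  • Benefit: a future IdP integration does not rewrite the route layer or the admin API; admin kick / session list behave the same across providers

Alternatives considered:

  • No extension point, build only today's LocalAuthProvider: a future IdP switch would have to rewrite routes + admin + SessionService
  • Implement OIDC directly: would stretch this iteration by 2-3x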
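A rough shape for the provider abstraction is sketched below. Only the five member names come from KTD-10; the signatures, the `AuthUser` record, the PBKDF2 hash (a standard-library stand-in for bcrypt), the in-memory user store, and the meaning given to `revoke_user` are all assumptions made for illustration. The real `get_auth_provider` is described as a FastAPI dependency reading agentkit.yaml, whereas here it simply takes the configured name.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class AuthUser:
    """Hypothetical minimal user record returned by a provider."""
    id: str
    username: str


class AuthProvider(Protocol):
    """Member names follow KTD-10; the signatures are assumptions."""
    name: str

    async def authenticate(self, username: str, password: str) -> Optional[AuthUser]: ...
    async def get_user_by_id(self, user_id: str) -> Optional[AuthUser]: ...
    async def sync_user_attributes(self, user_id: str) -> None: ...
    async def revoke_user(self, user_id: str) -> None: ...


def _hash(password: str, salt: bytes) -> bytes:
    # Stand-in for bcrypt so the sketch needs only the standard library.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


class LocalAuthProvider:
    name = "local"

    def __init__(self) -> None:
        self._users: dict[str, tuple[AuthUser, bytes, bytes]] = {}
        self._disabled: set[str] = set()

    def add_user(self, user: AuthUser, password: str) -> None:
        salt = os.urandom(16)
        self._users[user.username] = (user, salt, _hash(password, salt))

    async def authenticate(self, username: str, password: str) -> Optional[AuthUser]:
        entry = self._users.get(username)
        if entry is None or entry[0].id in self._disabled:
            return None
        user, salt, digest = entry
        return user if hmac.compare_digest(digest, _hash(password, salt)) else None

    async def get_user_by_id(self, user_id: str) -> Optional[AuthUser]:
        return next((u for u, _, _ in self._users.values() if u.id == user_id), None)

    async def sync_user_attributes(self, user_id: str) -> None:
        return None  # nothing to sync for local accounts

    async def revoke_user(self, user_id: str) -> None:
        self._disabled.add(user_id)  # assumed meaning: block further logins


class StubOIDCProvider:
    name = "oidc-stub"

    async def authenticate(self, username: str, password: str) -> Optional[AuthUser]:
        raise NotImplementedError("OIDC provider is a placeholder")

    async def get_user_by_id(self, user_id: str) -> Optional[AuthUser]:
        raise NotImplementedError("OIDC provider is a placeholder")

    async def sync_user_attributes(self, user_id: str) -> None:
        raise NotImplementedError("OIDC provider is a placeholder")

    async def revoke_user(self, user_id: str) -> None:
        raise NotImplementedError("OIDC provider is a placeholder")


def get_auth_provider(configured: str = "local") -> AuthProvider:
    """Map the auth.provider config value to a provider instance."""
    factories = {"local": LocalAuthProvider, "oidc-stub": StubOIDCProvider}
    if configured not in factories:
        raise ValueError(f"unknown auth.provider: {configured!r}")
    return factories[configured]()
```

Routes written against `AuthProvider` only see the five members, so adding a real OIDC adapter would amount to registering one more entry in the factory.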

High-Level Technical Design

Component Map

┌─────────────────────────────────────────────────────────────────────┐
│  Tauri Desktop (macOS / Windows / Linux)                            │
│  ┌──────────────────────────────────────────────────────────┐       │
│  │  WebView (Vue 3 frontend)                                │       │
│  │  ┌────────────┐  ┌──────────────┐  ┌────────────────┐   │       │
│  │  │ auth store │──│ api/base.ts  │──│ Pinia + Router │   │       │
│  │  │ (memory)   │  │ interceptor  │  │                │   │       │
│  │  └────────────┘  └──────────────┘  └────────────────┘   │       │
│  │         │                │                              │       │
│  │         │                │ silentRefresh                │       │
│  │         │                ▼                              │       │
│  │         │       ┌──────────────────┐                    │       │
│  │         │       │ tauri-auth.ts    │  invoke()          │       │
│  │         │       └──────────────────┘    │               │       │
│  │         │ localStorage fallback         ▼               │       │
│  │         │                  ┌──────────────────────┐    │       │
│  │         └─────────────────▶│ src-tauri/src/auth.rs│    │       │
│  │                            │ keyring::Entry       │    │       │
│  │                            └──────────────────────┘    │       │
│  │                                       │                │       │
│  └───────────────────────────────────────┼────────────────┘       │
│                                          │ HTTP                    │
│                                          ▼                         │
│  ┌──────────────────────────────────────────────────────────┐       │
│  │  FastAPI server (Python sidecar)                         │       │
│  │  ┌────────────────────────┐   ┌──────────────────────┐        │       │
│  │  │ routes/auth.py         │──▶│ auth/session.py      │        │       │
│  │  │ + admin routes         │   │ - create / rotate    │        │       │
│  │  │ Depends(get_auth_      │   │ - revoke / kick      │        │       │
│  │  │         provider) ─────┼──▶│ - reuse detection    │        │       │
│  │  └────────────────────────┘   └──────────────────────┘        │       │
│  │         │                                │                     │       │
│  │         ▼                                ▼                     │       │
│  │  ┌────────────────────────┐   ┌──────────────────────┐        │       │
│  │  │ auth/providers/        │   │ auth/models.py       │        │       │
│  │  │ - base.py (Protocol)   │   │ AuthSessionModel     │        │       │
│  │  │ - local.py (Local)     │   │ + auth_provider col  │        │       │
│  │  │ - oidc_stub.py (stub)  │   └──────────────────────┘        │       │
│  │  │ get_auth_provider() DI │                                   │       │
│  │  └────────────────────────┘                                   │       │
│  │         │                                                      │       │
│  │         ▼                                                      │       │
│  │  ┌─────────────────────────────────────────────┐                │       │
│  │  │  auth/cache.py (Redis or in-process LRU)    │                │       │
│  │  └─────────────────────────────────────────────┘                │       │
│  └──────────────────────────────────────────────────────────┘       │
│                                          │                         │
│                                          ▼                         │
│                          ┌──────────────────────────┐              │
│                          │  data/auth.db (SQLite)   │              │
│                          │  + auth_sessions table   │              │
│                          └──────────────────────────┘              │
└─────────────────────────────────────────────────────────────────────┘

State Machine — Client Auth

                ┌──────────────┐
   app start    │              │   valid refresh
   ────────────▶│  STARTUP     │────────────────────▶ READY
                │              │                      │ │
                └──────────────┘                      │ │
                     │  │  │                          │ │
        invalid  ────┘  │  └──── network err          │ │
                          ▼                            │ │
                ┌──────────────┐                       │ │
                │   ERROR      │ retry ──────┐         │ │
                │   "Refresh"  │            │         │ │
                └──────────────┘            │         │ │
                          ▼                │         │ │
                ┌──────────────┐            │         │ │
                │  INVALID     │ retry ─────┤         │ │
                │  "Re-login"  │            │         │ │
                └──────────────┘            │         │ │
                                            │         │ │
                  ┌─────────────────────────┘         │ │
                  │                                   │ │
                  │ 401 in flight                      │ │
                  │ ◀──────────────────────────────────┘ │
                  │                                     │
                  ▼                                     │
            ┌──────────────┐                            │
            │ silentRefresh│                            │
            └──────────────┘                            │
                  │                                     │
            ok ◀──┴──▶ fail → back to STARTUP / INVALID

State Machine — Server Session

                          login
                           │
                           ▼
                    ┌────────────┐
                    │  CREATED   │ sid in JWT
                    │  active    │
                    └────────────┘
                       │  │  │
        refresh ok ─────┘  │  └──── logout → REVOKED (user)
        (rotated)          │
                           └── admin / password change / reuse detected
                                → REVOKED (system)

Sequence — Cold Start (Tauri)

Window opens
    │
    ▼
App.vue mounted
    │
    ▼
bootstrapBackend()
    │  start_backend (sidecar)
    │  health check
    │
    ▼
authStore.startupCheck()
    │
    ├── 1. tauriAuthStorage.getRefreshToken()
    │       Keychain (Tauri) → localStorage (Web fallback)
    │
    ├── 2. GET /api/v1/auth/whoami  (Authorization: Bearer <refresh>)
    │       (the access token is gone, so we attach the refresh token;
    │        the server uses a separate "whoami" code path that accepts
    │        either type)
    │
    ├── 3. response handling
    │       200 → { access_token, user } → state = VALID → /agent
    │       401 → state = INVALID → /login (with "Session expired")
    │       network err → state = ERROR → /login (with "Unable to connect")
    │
    ▼
Router beforeEach
    │  state = VALID → next()
    │  state != VALID → next('/login')
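The three-way outcome of step 3 above can be written as a small classifier. The real logic belongs to the frontend auth store, so this Python version only illustrates the decision table. The names `StartupState` and `classify_startup` are hypothetical, and two branches are assumptions because the sequence does not cover them: no stored refresh token is treated as INVALID, and any status other than 200 / 401 is treated as ERROR.

```python
from enum import Enum
from typing import Optional


class StartupState(Enum):
    VALID = "valid"
    INVALID = "invalid"
    ERROR = "error"


def classify_startup(has_refresh_token: bool, status_code: Optional[int]) -> StartupState:
    """Classify the whoami result; status_code is None when no HTTP response arrived."""
    if not has_refresh_token:
        return StartupState.INVALID  # assumption: nothing stored means "log in"
    if status_code == 200:
        return StartupState.VALID
    if status_code == 401:
        return StartupState.INVALID  # the server rejected the session
    return StartupState.ERROR  # network failure or unexpected status


def should_clear_stored_token(state: StartupState) -> bool:
    """Only a definitive rejection should wipe the stored refresh token."""
    return state is StartupState.INVALID
```

Keeping ERROR separate from INVALID is what lets a server restart or network blip be retried without discarding a refresh token that is still good.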

Data Model — auth_sessions Table

erDiagram
    auth_sessions {
        TEXT id PK "uuid"
        TEXT user_id FK
        TEXT refresh_token_hash
        TEXT device_fingerprint
        TEXT device_label
        TEXT ip
        TEXT user_agent
        TEXT created_at
        TEXT last_active_at
        TEXT expires_at
        INTEGER revoked
        TEXT revoked_reason
        TEXT previous_session_id
        TEXT auth_provider "default local"
    }
    users {
        TEXT id PK
        TEXT username
        TEXT password_hash
        ...
    }
    auth_sessions }o--|| users : "user_id"

Sequence — Refresh Token Rotation + Reuse Detection

Client                         Server
  │                              │
  │  POST /auth/refresh          │
  │  { refresh_token: "old" }    │
  │ ───────────────────────────▶ │
  │                              │  decode old → sid
  │                              │  lookup auth_sessions[sid]
  │                              │  hash(old) == session.refresh_token_hash? NO
  │                              │  → denylist check: hash(old) in denylist?
  │                              │     YES → REUSE DETECTED
  │                              │  → revoke ALL sessions for this user
  │                              │  → audit log "reuse_detected"
  │                              │  ← 401 { error: "token_reuse_detected" }
  │  client clears state,        │
  │  routes to /login            │
  │                              │
  │  -- legit refresh --         │
  │  POST /auth/refresh          │
  │  { refresh_token: "valid" }  │
  │ ───────────────────────────▶ │
  │                              │  hash(valid) == session.refresh_token_hash? YES
  │                              │  rotate: session.refresh_token_hash = hash(new)
  │                              │  add hash(old) to denylist (30s)
  │                              │  issue new access + new refresh
  │                              │ ← 200 { access_token, refresh_token }
  │  store new refresh in        │
  │  Keychain, access in memory  │
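The rotation and reuse-detection flow above can be exercised with an in-memory sketch. Several simplifications are assumed: opaque `sid.random` strings stand in for refresh JWTs, a plain set stands in for the 30s denylist (no expiry), and the names `RotationSketch`, `Session`, and `sha256_hex` are hypothetical.

```python
import hashlib
import secrets
from dataclasses import dataclass


class TokenReuseDetected(Exception):
    """Raised when a superseded refresh token is presented again."""


def sha256_hex(token: str) -> str:
    return hashlib.sha256(token.encode()).hexdigest()


@dataclass
class Session:
    sid: str
    user_id: str
    refresh_token_hash: str
    revoked: bool = False


class RotationSketch:
    def __init__(self) -> None:
        self.sessions: dict[str, Session] = {}
        self.denylist: set[str] = set()  # the 30s expiry is omitted here

    def _new_token(self, sid: str) -> str:
        # "sid.random" stands in for a signed refresh JWT carrying a sid claim.
        return f"{sid}.{secrets.token_urlsafe(32)}"

    def login(self, sid: str, user_id: str) -> str:
        token = self._new_token(sid)
        self.sessions[sid] = Session(sid, user_id, sha256_hex(token))
        return token

    def refresh(self, presented: str) -> str:
        sid = presented.split(".", 1)[0]  # stands in for decoding the sid claim
        session = self.sessions.get(sid)
        if session is None or session.revoked:
            raise PermissionError("invalid_session")
        presented_hash = sha256_hex(presented)
        if presented_hash != session.refresh_token_hash or presented_hash in self.denylist:
            # Reuse: revoke every session of this user, not just this one.
            for other in self.sessions.values():
                if other.user_id == session.user_id:
                    other.revoked = True
            raise TokenReuseDetected(sid)
        new_token = self._new_token(sid)
        self.denylist.add(presented_hash)  # old token is now poisoned
        session.refresh_token_hash = sha256_hex(new_token)
        return new_token
```

Only hashes are stored, so a leaked sessions table or denylist does not by itself yield a usable refresh token.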

Implementation Units

U1. Schema: AuthSessionModel + extended bootstrap + backfill

Goal: Add the auth_sessions table with all required fields and indexes, AND backfill existing user_sessions rows on first startup.

Requirements: F6, F15, N5, N6 (the table backs every session-aware endpoint; backfill prevents forced re-login; auth_provider field enables future IdP audit traceability).

Dependencies: None.

Files:

  • src/agentkit/server/auth/models.py — add AuthSessionModel (SQLAlchemy 2 typed) + extend _SCHEMA_SQL for direct aiosqlite init + add _SCHEMA_VERSION = 2 constant + extend init_auth_db() to run the backfill
  • tests/unit/auth/test_models.py — model serialization + index smoke + backfill tests

Approach (schema):

  • Use UUID strings as PK (matches existing users.id style in this codebase)
  • device_info is a JSON string (reuse pattern from UserSessionModel.device_info)
  • expires_at is ISO-8601 string (matches UserModel.last_login_at)
  • revoked is INTEGER (0/1) for SQLite compatibility
  • Add the new CREATE TABLE auth_sessions block to _SCHEMA_SQL (line 234-242 is the current user_sessions block; append after it) with these indexes:
    • idx_auth_sessions_user_id_active on (user_id, revoked, expires_at) — supports the cap-count query and the list-active query
    • idx_auth_sessions_expires_at on (expires_at) — supports cleanup sweeps
    • idx_auth_sessions_refresh_token_hash on (refresh_token_hash) — unique
    • idx_auth_sessions_auth_provider on (auth_provider) — supports future IdP "list sessions by provider" query
  • Add auth_provider column (NEW per KTD-10): TEXT NOT NULL DEFAULT 'local' — records which provider created the session. Values: local (current) / oidc-stub (future stub) / oidc-keycloak / saml / ldap (future real adapters). Backfilled rows get 'local' via the default.
  • Bump _SCHEMA_VERSION = 2 (currently implicit; the existing init_auth_db is idempotent via CREATE TABLE IF NOT EXISTS so version is mostly for the backfill gate)
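The schema bullets above can be tried out against an in-memory SQLite database. This is a sketch, not the project's _SCHEMA_SQL: column names and index names are taken from this plan, but the NOT NULL choices are assumptions, and the foreign key to users is omitted so the snippet runs without a users table.

```python
import sqlite3

AUTH_SESSIONS_SQL = """
CREATE TABLE IF NOT EXISTS auth_sessions (
    id                  TEXT PRIMARY KEY,
    user_id             TEXT NOT NULL,
    refresh_token_hash  TEXT NOT NULL,
    device_fingerprint  TEXT,
    device_label        TEXT,
    ip                  TEXT,
    user_agent          TEXT,
    created_at          TEXT NOT NULL,
    last_active_at      TEXT NOT NULL,
    expires_at          TEXT NOT NULL,
    revoked             INTEGER NOT NULL DEFAULT 0,
    revoked_reason      TEXT,
    previous_session_id TEXT,
    auth_provider       TEXT NOT NULL DEFAULT 'local'
);
CREATE INDEX IF NOT EXISTS idx_auth_sessions_user_id_active
    ON auth_sessions (user_id, revoked, expires_at);
CREATE INDEX IF NOT EXISTS idx_auth_sessions_expires_at
    ON auth_sessions (expires_at);
CREATE UNIQUE INDEX IF NOT EXISTS idx_auth_sessions_refresh_token_hash
    ON auth_sessions (refresh_token_hash);
CREATE INDEX IF NOT EXISTS idx_auth_sessions_auth_provider
    ON auth_sessions (auth_provider);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the table and indexes; safe to call repeatedly."""
    conn.executescript(AUTH_SESSIONS_SQL)


def index_names(conn: sqlite3.Connection) -> set[str]:
    """Return index names via PRAGMA index_list (the name is column 1)."""
    return {row[1] for row in conn.execute("PRAGMA index_list('auth_sessions')")}
```

Because every statement uses IF NOT EXISTS, running the script on each startup matches the idempotent init pattern the plan describes.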

Approach (backfill) — critical, was missing from the original plan: The current routes/auth.py:201-213 writes to user_sessions on login. After the new schema lands, the new SessionService.create_session writes to auth_sessions instead. To prevent forcing every existing user to re-login on the deploy, init_auth_db() runs a one-time backfill on startup:

import json
import logging

import aiosqlite

logger = logging.getLogger(__name__)


async def _backfill_user_sessions(db: aiosqlite.Connection) -> int:
    """One-time backfill from user_sessions to auth_sessions.

    Runs only when auth_sessions is empty AND user_sessions has rows.
    Idempotent: subsequent restarts are no-ops.
    Committing is left to the caller (assumed to be init_auth_db()).
    """
    cursor = await db.execute("SELECT COUNT(*) FROM auth_sessions")
    (count,) = await cursor.fetchone()
    if count > 0:
        return 0  # already backfilled

    cursor = await db.execute(
        "SELECT id, user_id, refresh_token_hash, device_info, created_at, expires_at "
        "FROM user_sessions WHERE revoked_at IS NULL"
    )
    rows = await cursor.fetchall()
    backfilled = 0
    # Unpack positionally: plain tuple rows do not support row["column"]
    # access unless db.row_factory has been set to aiosqlite.Row.
    for sid, user_id, token_hash, raw_device_info, created_at, expires_at in rows:
        device_info = json.loads(raw_device_info) if raw_device_info else {}
        # Use existing user_sessions.id as the auth_sessions.id so that
        # legacy clients holding the old refresh_token_hash still match
        # a row in the new table (this is what the back-compat path in
        # U10 relies on). auth_provider is omitted and takes its
        # column default of 'local'.
        result = await db.execute(
            "INSERT OR IGNORE INTO auth_sessions "
            "(id, user_id, refresh_token_hash, device_fingerprint, device_label, "
            " ip, user_agent, created_at, last_active_at, expires_at, revoked, revoked_reason) "
            "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
            (
                sid,  # reuse legacy id for back-compat
                user_id,
                token_hash,
                device_info.get("fingerprint", "unknown"),
                device_info.get("label", "Unknown device"),
                device_info.get("ip", ""),
                device_info.get("user_agent", ""),
                created_at,
                created_at,  # last_active_at defaults to created_at
                expires_at,
                0,  # not revoked (already filtered)
                None,
            ),
        )
        backfilled += result.rowcount  # 0 when the row was ignored as a duplicate
    if backfilled:
        logger.info("Backfilled %d user_sessions rows to auth_sessions", backfilled)
    return backfilled

Approach (idempotency):

  • The INSERT OR IGNORE on auth_sessions.id PK makes the backfill safe to re-run
  • The count > 0 early-exit means after the first backfill, subsequent startups are < 1ms

Approach (rollback risk):

  • The backfill does NOT delete user_sessions rows. They are kept for 1 minor version as the legacy read path. U10's Phase 5 cleanup drops the table.

Test scenarios (test_models.py):

  • Create session, query by sid, find it
  • Create 11 sessions for one user, count = 11 (cap check is in U3)
  • Query WHERE user_id=? AND revoked=0 AND expires_at > now returns active sessions
  • Index (user_id, revoked, expires_at) is present (verify via PRAGMA index_list)
  • Index idx_auth_sessions_auth_provider is present
  • auth_provider column tests (NEW per KTD-10):
    • Default value is 'local' when column is omitted from INSERT
    • WHERE auth_provider = 'local' returns only local-created sessions
    • WHERE auth_provider = 'oidc-stub' returns zero rows in current code
  • Backfill tests (NEW):
    • init_auth_db on a DB with user_sessions rows but empty auth_sessions → backfills all non-revoked rows
    • init_auth_db on a DB with existing auth_sessions rows → does NOT re-backfill (idempotent)
    • Backfilled rows have the original user_sessions.id as their auth_sessions.id
    • Backfilled rows have revoked=0
    • Backfilled rows have their expires_at preserved
    • Backfill does NOT touch user_sessions rows that are already revoked (revoked_at IS NOT NULL)

Verification: pytest tests/unit/auth/test_models.py -v passes; init_auth_db runs cleanly on a copy of prod DB with the existing user_sessions table; backfill log line appears exactly once per fresh DB.

Note on Alembic: This codebase does not use Alembic. There is no alembic.ini, no migrations/ directory, and no alembic dependency in pyproject.toml. The auth DB schema is managed via the _SCHEMA_SQL constant + init_auth_db() pattern (see auth/models.py:202-333). This U1 unit aligns with that pattern; the original plan's Alembic reference was incorrect.


U2. JWT utils: sid + jti claims, dual decode path

Goal: Add sid and jti to issued JWTs; teach verify_token to read both old and new claim shapes.

Requirements: F5, F12, N6 (rotation + multi-client + backwards compat).

Dependencies: U1 (the sid references a row in auth_sessions).

Files:

  • src/agentkit/server/auth/jwt_utils.py: create_token_pair(...) now takes session_id: str; verify_token(...) returns decoded payload including sid + jti; back-compat: missing sid is logged at DEBUG and accepted (caller decides what to do)
  • src/agentkit/server/auth/denylist.py — new module: RecentlyRevokedTokens class backed by in-memory OrderedDict + Redis pub/sub for cross-process; add(token_hash, ttl=30), contains(token_hash) -> bool
  • tests/unit/auth/test_jwt_utils.py — extend existing tests: round-trip with sid, decode legacy token, decode tampered token

Approach:

  • create_token_pair(user_id, session_id, ttl_pair): access payload is {sub, sid, jti, type, exp, iat}; refresh payload is the same minus jti (refresh tokens are long-lived; jti would be regenerated on every rotation, which is wasteful)
  • verify_token(token, expected_type) — return full payload dict; legacy payload (no sid) is preserved as-is, callers branch on 'sid' in payload
  • RecentlyRevokedTokens — single-process OrderedDict keyed by SHA-256 hash, max 10k entries; contains is O(1); add evicts oldest if at capacity
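  • Redis adapter: SADD + EXPIRE; SISMEMBER for check. Note that EXPIRE applies to the whole set key rather than to individual members, so a strict per-token 30s TTL needs either one key per hash or time-bucketed sets. The in-process impl is the fallback when Redis is unavailable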
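The in-process half of the denylist could look roughly like this. It is a sketch under assumptions: per-entry expiry is tracked with a monotonic clock, the clock is injectable so the 30s window can be tested without sleeping, and the constructor parameters are hypothetical rather than the project's actual signature.

```python
import time
from collections import OrderedDict
from typing import Callable


class RecentlyRevokedTokens:
    """Bounded in-process denylist keyed by token hash, with per-entry TTL."""

    def __init__(
        self,
        max_entries: int = 10_000,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_entries = max_entries
        self._clock = clock
        self._expiry: "OrderedDict[str, float]" = OrderedDict()

    def add(self, token_hash: str, ttl: float = 30.0) -> None:
        self._expiry[token_hash] = self._clock() + ttl
        self._expiry.move_to_end(token_hash)  # newest entries live at the end
        while len(self._expiry) > self._max_entries:
            self._expiry.popitem(last=False)  # evict the oldest entry

    def contains(self, token_hash: str) -> bool:
        expires = self._expiry.get(token_hash)
        if expires is None:
            return False
        if self._clock() >= expires:
            del self._expiry[token_hash]  # lazily drop expired entries
            return False
        return True
```

Expired entries are removed lazily on lookup, so memory stays bounded by max_entries even when hashes are added and never checked again.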

Test scenarios:

  • create_token_pair(...) produces tokens with sid and jti (access only)
  • verify_token on a token without sid returns the payload unchanged (caller must handle)
  • verify_token on an expired token raises ExpiredSignatureError
  • RecentlyRevokedTokens.add(hash, ttl) + contains(hash) returns True within 30s, False after
  • RecentlyRevokedTokens with 10001 entries evicts the oldest (capacity test)
  • Redis adapter mock: SADD + SISMEMBER + EXPIRE called with correct args

Verification: pytest tests/unit/auth/test_jwt_utils.py -v passes; manual curl round-trip works against a running dev server.


U3. Session service: CRUD + rotation + reuse detection

Goal: Centralize all session operations behind a SessionService class so routes don't duplicate the logic.

Requirements: F5, F6, F8, F9, F11, F13, F15 (rotation, recording, kick, password change, three-state validation, provider-pluggability, audit field).

Dependencies: U1 (model), U2 (denylist).

Files:

  • src/agentkit/server/auth/session.py — new module: SessionService class
  • src/agentkit/server/auth/cache.py — new module: SessionCache interface + RedisSessionCache + InProcessLRUSessionCache impls
  • tests/unit/auth/test_session.py — full service test suite

Approach (SessionService methods):

  • async create_session(user_id, device_fingerprint, device_label, ip, user_agent, remember_me: bool, auth_provider: str = "local") -> AuthSessionModel
    • Cap check first: count active sessions for user; if ≥10, mark oldest non-current as revoked with revoked_reason='session_cap_eviction'
    • Generate new sid (uuid4), jti (uuid4)
    • Compute expires_at based on remember_me (30d vs 7d)
    • Store auth_provider from caller (U4 passes provider.name); enables F15 audit traceability
    • Insert row, return model
  • async get_active_session(sid: str) -> AuthSessionModel | None
    • First check SessionCache.get(sid); on miss, query DB, write to cache (60s TTL)
    • Return None if revoked=True or expires_at < now
  • async rotate_refresh(old_refresh_token: str) -> tuple[AuthSessionModel, TokenPair]
    • Decode old_refresh_token; get sid; lookup session
    • Reuse detection: compare sha256(old_refresh_token) against session.refresh_token_hash. If different, this is a reuse → call revoke_all_for_user(user_id, reason='reuse_detected') + raise TokenReuseDetected
    • Also check RecentlyRevokedTokens.contains(sha256(old_refresh_token)) — if yes, same handling
    • On legitimate use: generate new refresh_token, update session.refresh_token_hash = sha256(new), session.last_active_at = now, session.expires_at = now + ttl, session.previous_session_id = old sid (audit), auth_provider preserved (rotation doesn't change provider)
    • Add sha256(old_refresh_token) to denylist for 30s
    • Issue new access + refresh JWTs (call into jwt_utils)
    • Invalidate cache entry for this sid
  • async revoke_session(sid: str, reason: str) -> None
    • Mark revoked=True, revoked_reason=reason; invalidate cache
  • async revoke_all_for_user(user_id: str, *, reason: str, except_sid: str | None = None) -> int
    • Bulk update; returns count of revoked sessions
  • async list_active_for_user(user_id: str) -> list[AuthSessionModel]
  • async list_all_for_admin(user_id: str) -> list[AuthSessionModel] (admin endpoint)
  • async list_active_by_provider(auth_provider: str) -> list[AuthSessionModel] (NEW per KTD-10) — supports future "show me all OIDC sessions" admin view

Approach (SessionCache):

class SessionCache(Protocol):
    async def get(self, sid: str) -> AuthSessionModel | None: ...
    async def set(self, sid: str, session: AuthSessionModel, ttl: int = 60) -> None: ...
    async def invalidate(self, sid: str) -> None: ...
  • InProcessLRUSessionCache: OrderedDict[sid, (session, expires_at)]; cap=1024; lazy eviction on get
  • RedisSessionCache: GET / SETEX / DEL; pickle the model for storage
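A minimal sketch of the in-process variant, assuming the same three-method interface as the SessionCache protocol above. The injectable `clock` and the use of `Any` in place of AuthSessionModel are assumptions made here to keep the example self-contained.

```python
from __future__ import annotations

import time
from collections import OrderedDict
from typing import Any


class InProcessLRUSessionCache:
    """LRU cache with per-entry TTL; stale entries are dropped lazily on get."""

    def __init__(self, cap: int = 1024, clock=time.monotonic) -> None:
        self._items: OrderedDict[str, tuple[Any, float]] = OrderedDict()
        self._cap = cap
        self._clock = clock  # assumption: injectable for deterministic tests

    async def get(self, sid: str) -> Any | None:
        item = self._items.get(sid)
        if item is None:
            return None
        session, expires_at = item
        if expires_at <= self._clock():
            del self._items[sid]  # lazy eviction of a stale entry
            return None
        self._items.move_to_end(sid)  # mark as most recently used
        return session

    async def set(self, sid: str, session: Any, ttl: int = 60) -> None:
        self._items[sid] = (session, self._clock() + ttl)
        self._items.move_to_end(sid)
        while len(self._items) > self._cap:
            self._items.popitem(last=False)  # drop the least recently used entry

    async def invalidate(self, sid: str) -> None:
        self._items.pop(sid, None)
```

Because the methods never await anything, the cache needs no lock under a single event loop; a multi-worker deployment would still need the Redis variant for cross-process invalidation.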

Test scenarios (test_session.py):

  • create_session inserts a row with all fields populated
  • create_session with remember_me=True sets expires_at 30d out, else 7d
  • create_session for a user with 10 active sessions evicts the oldest non-current one
  • create_session for a user with 10 active sessions, the new login is one of them, the evicted one is the OLDEST non-new
  • create_session with auth_provider='oidc-stub' stores that value in the row (NEW per KTD-10)
  • get_active_session returns the row when valid
  • get_active_session returns None when revoked=True
  • get_active_session returns None when expires_at < now
  • get_active_session second call within 60s hits cache (spy on DB call count)
  • rotate_refresh with the CURRENT token returns new pair
  • rotate_refresh preserves the original auth_provider value (NEW per KTD-10)
  • rotate_refresh with a REUSED old token (different hash) → TokenReuseDetected raised + ALL sessions for user revoked
  • rotate_refresh with a token in the denylist → same handling
  • rotate_refresh updates previous_session_id to the old sid
  • revoke_session sets revoked=True, revoked_reason, invalidates cache
  • revoke_all_for_user except_sid=None revokes everything
  • revoke_all_for_user with except_sid set to the current sid keeps the current session
  • list_active_for_user returns only revoked=False AND expires_at > now
  • list_all_for_admin returns all rows including revoked (for audit)
  • list_active_by_provider('local') returns only local sessions; ('oidc-stub') returns empty in current code (NEW per KTD-10)
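The expiry and cap-eviction scenarios above can be exercised against small pure helpers before the database layer exists. This is a sketch under assumptions: `ActiveSession`, `session_ttl`, and `pick_evictions` are hypothetical names, and "oldest" is taken to mean the earliest created_at, which the plan does not spell out.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta

SESSION_CAP = 10  # from the plan: at most 10 active sessions per user


@dataclass
class ActiveSession:  # hypothetical stand-in for an active AuthSessionModel row
    sid: str
    created_at: datetime


def session_ttl(remember_me: bool) -> timedelta:
    """30 days with remember_me, 7 days otherwise."""
    return timedelta(days=30 if remember_me else 7)


def pick_evictions(active: list[ActiveSession], cap: int = SESSION_CAP) -> list[str]:
    """Return the sids to revoke so that one new session fits under the cap."""
    overflow = len(active) - cap + 1
    if overflow <= 0:
        return []
    oldest_first = sorted(active, key=lambda s: s.created_at)
    return [s.sid for s in oldest_first[:overflow]]
```

The new login is never in the `active` list passed in, so it can never be chosen for eviction.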

Verification: All unit tests pass under pytest tests/unit/auth/test_session.py -v, and a coverage run (e.g. with pytest-cov) shows 100% line coverage of session.py.


U4. Routes: new auth + admin endpoints

Goal: Expose all session operations as HTTP endpoints.

Requirements: F1, F2, F5, F6, F7, F8, F9, F10, F11, F13, F14, F15.

Dependencies: U3 (the service), U11 (AuthProvider abstraction layer; must land first or alongside).

Files:

  • src/agentkit/server/routes/auth.py — extend LoginRequest with remember_me: bool = False; add WhoamiResponse, SessionInfoResponse; add new endpoints; inject AuthProvider via Depends(get_auth_provider) (KTD-10)
  • src/agentkit/server/routes/admin.py — new module: admin session management endpoints (or extend existing admin module); calls provider.revoke_user(user_id) instead of modifying the users table directly (KTD-10)
  • src/agentkit/server/dependencies.py: get_current_user extension to look up session via sid; back-compat fallback for old tokens
  • src/agentkit/server/auth/password.py — extend with change_password(user_id, new_password) that revokes all other sessions
  • tests/integration/auth/test_auth_routes.py — full endpoint suite; add provider-mock injection tests (KTD-10)
  • tests/integration/auth/test_admin_routes.py — admin endpoints

Approach (new endpoints):

| Method | Path | Body / Query | Auth | Behavior |
|---|---|---|---|---|
| POST | /auth/login | {username, password, remember_me?} | none | provider.authenticate(username, password) → SessionService.create_session(auth_provider=provider.name) → return TokenResponse |
| POST | /auth/refresh | {refresh_token} | refresh | SessionService.rotate_refresh → return new TokenResponse; on TokenReuseDetected → 401 {error: "token_reuse_detected"} |
| POST | /auth/logout | {refresh_token} | access (optional) | revoke_session(sid, reason='user_terminated') |
| GET | /auth/whoami | - | access OR refresh | Returns {user, session: {sid, device_label, ip, auth_provider, created_at, last_active_at, expires_at}}. Accepts refresh token to support cold-start where access is gone. |
| GET | /auth/sessions | - | access | List current user's active sessions (each annotated with auth_provider) |
| DELETE | /auth/sessions/{sid} | - | access | Revoke that session (if owned by current user) |
| POST | /auth/logout-others | - | access | Revoke all sessions except current |
| POST | /auth/change-password | {old_password, new_password} | access | provider.authenticate verifies the old password → provider.revoke_user(user_id) invalidates the other sessions (KTD-10: behavior is consistent across providers) |

Approach (admin endpoints):

| Method | Path | Auth | Behavior |
|---|---|---|---|
| GET | /admin/users/{user_id}/sessions | admin | List all that user's sessions (incl revoked) |
| DELETE | /admin/users/{user_id}/sessions/{sid} | admin | Force-revoke any session |

Approach (/auth/whoami middleware bypass — critical fix):

The current AuthMiddleware._verify_jwt (in src/agentkit/server/auth/middleware.py:80-91) only accepts type=access tokens and 401s on type=refresh. The cold-start sequence sends a refresh token (because the access token is gone). To make this work without weakening auth, /auth/whoami is added to AuthMiddleware.WHITELIST_PATHS and the route does its own auth:

# In auth/middleware.py:
WHITELIST_PATHS = (
    "/api/v1/health",
    "/api/v1/auth/login",
    "/api/v1/auth/refresh",
    "/api/v1/auth/logout",
    "/api/v1/auth/whoami",  # NEW: route does its own auth
    "/docs",
    "/openapi.json",
    "/redoc",
)

The /auth/whoami route accepts either an access token (normal call) or a refresh token (cold-start), and the auth check happens inside the route via verify_token + session lookup:

@router.get("/whoami")
async def whoami(request: Request) -> WhoamiResponse:
    """Returns the current user + session metadata.

    Accepts either type=access (normal) or type=refresh (cold-start).
    On 401 from this endpoint, the client treats it as 'invalid' state
    (NOT 'error' state) so the router redirects to /login.
    """
    auth_header = request.headers.get("Authorization", "")
    if not auth_header.startswith("Bearer "):
        raise HTTPException(401, "missing bearer token")
    token = auth_header[7:]

    try:
        payload = verify_token(token, expected_type=None)  # accept both types
    except jwt.ExpiredSignatureError:
        raise HTTPException(401, "token expired")
    except jwt.InvalidTokenError:
        raise HTTPException(401, "invalid token")

    sid = payload.get("sid")
    if sid:
        # New-style: validate session in DB
        session = await session_service.get_active_session(sid)
        if not session:
            raise HTTPException(401, "session revoked or expired")
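        user = await load_user(session.user_id)
        if user is None:
            # load_user returns None for missing or inactive users
            raise HTTPException(401, "user not found or inactive")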
        # Issue a fresh access token so the client doesn't need a separate /refresh
        new_access = create_access_token(user_id=user.id, session_id=session.id)
        return WhoamiResponse(
            user=user_to_response(user),
            access_token=new_access,
            session=session_to_response(session),
        )
    else:
        # Legacy token without sid — back-compat path (U10)
        user = await load_user(payload["sub"])
        if not user or not user.is_active:
            raise HTTPException(401, "user not found or inactive")
        new_access = create_access_token(user_id=user.id, session_id=None)  # legacy
        return WhoamiResponse(
            user=user_to_response(user),
            access_token=new_access,
            session=None,  # no session metadata for legacy
        )

Approach (defined phantom functions):

The plan's pseudo-code references several functions that don't exist yet. Define them explicitly:

# In auth/dependencies.py — NEW dependency for current session
async def get_current_session(request: Request) -> AuthSession:
    """Return the active session for the current request.

    Reads request.state.session (set by get_current_user middleware/dependency).
    Raises 401 if no session (legacy tokens) or session is revoked.
    """
    session = getattr(request.state, "session", None)
    if session is None:
        raise HTTPException(401, "no active session (legacy token)")
    return session

# In auth/dependencies.py — keep existing get_current_user but extend it
async def get_current_user(request: Request) -> User:
    """Return the current authenticated user.

    Strategy:
    - If request.state.current_user is already set (by AuthMiddleware for
      type=access tokens), return it.
    - Otherwise, this is called from a path that bypassed middleware
      (e.g. /auth/whoami). The route must have set request.state.user
      via its own auth check.
    - Legacy tokens (no sid) only set current_user, not session.
    """
    user = getattr(request.state, "current_user", None)
    if user is None:
        user = getattr(request.state, "user", None)  # set by whoami route
    if user is None:
        raise HTTPException(401, "not authenticated")
    return user

# In auth/users.py — NEW helper
async def load_user(user_id: str) -> User | None:
    """Load a user by id. Returns None if not found or inactive."""
    async with aiosqlite.connect(str(DEFAULT_AUTH_DB_PATH)) as db:
        cursor = await db.execute(
            "SELECT * FROM users WHERE id = ? AND is_active = 1", (user_id,)
        )
        row = await cursor.fetchone()
        return user_row_to_dict(row) if row else None

Approach (get_current_user back-compat with sid validation):

The new get_current_user is called by routes after AuthMiddleware has run. The middleware sets request.state.current_user (a dict with id, username, role, etc.) for type=access tokens. For the new sid-bearing tokens, _verify_jwt is synchronous and cannot query the session table, so the middleware keeps checking only signature and expiry and leaves the session lookup to a per-route dependency:

# In auth/middleware.py — extend _verify_jwt to also load session
def _verify_jwt(self, token: str) -> dict[str, Any] | None:
    # ... existing signature/expiry check ...
    sid = payload.get("sid")
    if sid:
        # Synchronous check is not possible (DB call). Defer to a
        # per-route dependency. Middleware only checks signature + expiry
        # for new tokens; the session-revoked check happens in the
        # get_current_session dependency.
        pass
    return payload

The session-revoked check is then done lazily in get_current_session, which calls SessionService.get_active_session(sid). This is one extra DB-or-cache call per request, mitigated by the 60s Redis cache (KTD-6).
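The lazy check can be sketched without any web framework, which also makes the legacy branch explicit. Everything here is an assumption for illustration: `resolve_session` is a hypothetical helper, `Unauthorized` stands in for HTTPException(401, ...), and the lookup callable stands in for SessionService.get_active_session.

```python
from __future__ import annotations

from typing import Any, Awaitable, Callable


class Unauthorized(Exception):
    """Stand-in for HTTPException(401, ...) so the sketch has no web dependency."""


async def resolve_session(
    payload: dict[str, Any],
    get_active_session: Callable[[str], Awaitable[Any | None]],
) -> Any | None:
    """Resolve the session for an already signature-checked JWT payload."""
    sid = payload.get("sid")
    if sid is None:
        # Legacy token without sid: nothing to look up, the caller decides
        # whether a session is required for this route.
        return None
    session = await get_active_session(sid)
    if session is None:
        # Covers revoked, expired, and unknown sessions alike.
        raise Unauthorized("session revoked or expired")
    return session
```

A dependency such as get_current_session could wrap this helper and translate `Unauthorized` (or a `None` result, for routes that require a session) into a 401 response.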

Approach (change_password):

@router.post("/change-password")
async def change_password(
    payload: ChangePasswordRequest,
    user: User = Depends(get_current_user),
    session: AuthSession = Depends(get_current_session),
):
    if not verify_password(payload.old_password, user.password_hash):
        raise HTTPException(400, "old password incorrect")
    new_hash = hash_password(payload.new_password)
    async with aiosqlite.connect(str(DEFAULT_AUTH_DB_PATH)) as db:
        await db.execute(
            "UPDATE users SET password_hash=?, updated_at=? WHERE id=?",
            (new_hash, _now_iso(), user.id),
        )
        await db.commit()
    revoked_count = await session_service.revoke_all_for_user(
        user.id, except_sid=session.id, reason="password_changed"
    )
    logger.info(f"Password changed for user {user.id}; revoked {revoked_count} other sessions")
    return {"ok": True, "revoked_sessions": revoked_count}

Test scenarios (test_auth_routes.py):

  • Happy path:
    • POST /auth/login with valid creds → 200, returns token pair + user
    • POST /auth/login with remember_me=true → refresh token exp 30d
    • POST /auth/login with remember_me=false → refresh token exp 7d
    • POST /auth/refresh with current token → 200, new pair (different from old)
    • GET /auth/whoami with access token → 200, returns user + session metadata
    • GET /auth/whoami with refresh token (cold-start case) → 200
    • GET /auth/sessions → list of current user's active sessions
    • DELETE /auth/sessions/{sid} for own session → 200, that session now revoked
    • POST /auth/logout-others → 200, all other sessions revoked
    • POST /auth/change-password with correct old → 200, other sessions revoked
  • Error paths:
    • POST /auth/login with wrong password → 401 (constant-time)
    • POST /auth/login with unknown user → 401 (constant-time)
    • POST /auth/login with inactive user → 403
    • POST /auth/refresh with reused old token → 401 {error: "token_reuse_detected"}
    • POST /auth/refresh with denylisted token → 401
    • POST /auth/refresh with tampered token → 401
    • GET /auth/whoami with no Authorization header → 401
    • GET /auth/whoami with expired access token → 401
    • DELETE /auth/sessions/{sid} for someone else's session → 403
    • POST /auth/change-password with wrong old password → 400
    • POST /auth/change-password with weak new password (if validation added) → 422
  • Integration:
    • Login from client A, login from client B (different IPs / fingerprints) → both have independent sessions
    • Login as user from 11 different fingerprints → 11th login evicts the 1st (oldest non-current)
    • Change password → other devices get 401 on next request → bounced to /login

Test scenarios (test_admin_routes.py):

  • GET /admin/users/{id}/sessions as admin → returns all sessions (active + revoked)
  • GET /admin/users/{id}/sessions as non-admin → 403
  • DELETE /admin/users/{id}/sessions/{sid} as admin → that session revoked
  • DELETE /admin/users/{id}/sessions/{sid} as non-admin → 403

Verification: All integration tests pass; pytest tests/integration/auth/ -v shows green.


U5. Tauri: keyring integration + commands

Goal: Add three Tauri commands to read/write/clear the refresh token in OS Keychain.

Requirements: F3.

Dependencies: None on the auth side; only depends on Tauri Cargo config.

Files:

  • src/agentkit/server/frontend/src-tauri/Cargo.toml — add keyring = { version = "3", features = ["apple-native", "windows-native", "linux-native"] } (keyring v3 enables no platform credential store by default, so the platform features must be listed explicitly)
  • src/agentkit/server/frontend/src-tauri/src/auth.rs — new module with 3 #[tauri::command] functions
  • src/agentkit/server/frontend/src-tauri/src/lib.rs — register the commands in tauri::Builder::default().invoke_handler(...)
  • src/agentkit/server/frontend/src-tauri/capabilities/default.json — add the 3 commands to the permissions allowlist
  • tests/unit-tauri/test_keyring.rs — Rust unit tests using the keyring::mock credential store

Approach (auth.rs):

const SERVICE: &str = "com.fischer.agentkit";
const USERNAME: &str = "refresh_token";

#[tauri::command]
pub async fn store_refresh_token(token: String) -> Result<(), String> {
    let entry = keyring::Entry::new(SERVICE, USERNAME)
        .map_err(|e| format!("keychain init failed: {e}"))?;
    entry.set_password(&token)
        .map_err(|e| format!("keychain write failed: {e}"))
}

#[tauri::command]
pub async fn load_refresh_token() -> Result<Option<String>, String> {
    let entry = keyring::Entry::new(SERVICE, USERNAME)
        .map_err(|e| format!("keychain init failed: {e}"))?;
    match entry.get_password() {
        Ok(t) => Ok(Some(t)),
        Err(keyring::Error::NoEntry) => Ok(None),
        Err(e) => Err(format!("keychain read failed: {e}")),
    }
}

#[tauri::command]
pub async fn clear_refresh_token() -> Result<(), String> {
    let entry = keyring::Entry::new(SERVICE, USERNAME)
        .map_err(|e| format!("keychain init failed: {e}"))?;
    match entry.delete_credential() {
        Ok(()) => Ok(()),
        Err(keyring::Error::NoEntry) => Ok(()),
        Err(e) => Err(format!("keychain delete failed: {e}")),
    }
}

Approach (Cargo.toml):

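  • Add keyring under [dependencies] with the platform features listed under Files; a bare keyring = "3" pulls in no platform credential store
  • macOS: requires the binary to be signed (Keychain access); for unsigned dev builds, fall back to the keyring::mock store behind a feature flag (not needed in this plan; document in README instead)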

Approach (capabilities/default.json):

  • Add 3 entries to the permissions array:
    • "core:default:allow-store-refresh-token"
    • "core:default:allow-load-refresh-token"
    • "core:default:allow-clear-refresh-token"

Test scenarios (test_keyring.rs):

  • store_refresh_token("abc") then load_refresh_token() returns Some("abc")
  • clear_refresh_token() then load_refresh_token() returns None
  • load_refresh_token() on a fresh keyring returns None (not error)
  • Use the keyring::mock credential store for CI tests; real platform tests are manual on a macOS dev machine

Verification: cargo test --manifest-path src/agentkit/server/frontend/src-tauri/Cargo.toml passes; manual smoke: launch Tauri dev, log in, check macOS Keychain Access.app for the entry.


U6. Frontend: tauri-auth.ts adapter

Goal: Abstract Keychain (Tauri) / localStorage (Web) access behind a single async API.

Requirements: F3, F4.

Dependencies: U5 (the Rust commands must exist for invoke() to work).

Files:

  • src/agentkit/server/frontend/src/api/tauri-auth.ts — new module
  • tests/unit/api/tauri-auth.test.ts — unit tests with mocked invoke

Approach:

const SERVICE = 'agentkit.refresh_token'

function isTauri(): boolean {
  return typeof window !== 'undefined' && '__TAURI_INTERNALS__' in window
}

export const tauriAuthStorage = {
  async setRefreshToken(token: string): Promise<void> {
    if (isTauri()) {
      try {
        const { invoke } = await import('@tauri-apps/api/core')
        await invoke('store_refresh_token', { token })
        return
      } catch (e) {
        console.warn('[auth] Keychain write failed, falling back to localStorage', e)
      }
    }
    localStorage.setItem(SERVICE, token)
  },

  async getRefreshToken(): Promise<string | null> {
    if (isTauri()) {
      try {
        const { invoke } = await import('@tauri-apps/api/core')
        return await invoke<string | null>('load_refresh_token')
      } catch (e) {
        console.warn('[auth] Keychain read failed, falling back to localStorage', e)
      }
    }
    return localStorage.getItem(SERVICE)
  },

  async clearRefreshToken(): Promise<void> {
    if (isTauri()) {
      try {
        const { invoke } = await import('@tauri-apps/api/core')
        await invoke('clear_refresh_token')
      } catch (e) {
        console.warn('[auth] Keychain clear failed, falling back to localStorage', e)
      }
    }
    localStorage.removeItem(SERVICE)
  },
}

Test scenarios (tauri-auth.test.ts):

  • isTauri() returns true when __TAURI_INTERNALS__ is in window
  • setRefreshToken in Tauri mode calls invoke('store_refresh_token', { token })
  • setRefreshToken in Tauri mode falls back to localStorage when invoke throws
  • setRefreshToken in Web mode (no Tauri) writes to localStorage directly
  • getRefreshToken in Tauri mode returns the value from invoke('load_refresh_token')
  • getRefreshToken in Tauri mode falls back to localStorage when invoke throws
  • clearRefreshToken in Tauri mode calls invoke('clear_refresh_token')
  • clearRefreshToken in Web mode removes from localStorage

Verification: npm run test:unit -- tauri-auth.test.ts passes; manual test: launch Tauri, log in, verify entry in macOS Keychain.


U7. Frontend: auth store refactor (3-state startup, pre-emptive refresh)

Goal: Rewrite stores/auth.ts to support the new flow.

Requirements: F1, F10, F11, F12.

Dependencies: U6 (adapter), U4 (server endpoints).

Files:

  • src/agentkit/server/frontend/src/stores/auth.ts — major refactor
  • src/agentkit/server/frontend/src/api/auth.ts — add whoami(), login(rememberMe), changePassword(), listSessions(), revokeSession(), logoutOthers()
  • tests/unit/stores/auth.test.ts — extend existing test file

Approach (new auth store shape):

type AuthStartupState = 'valid' | 'invalid' | 'error' | 'pending'

export const useAuthStore = defineStore('auth', () => {
  // --- State ---
  const accessToken = ref<string | null>(null)  // memory only, never persisted
  const user = ref<IAuthUser | null>(readStoredUser())  // localStorage cache for avatar/role
  const startupState = ref<AuthStartupState>('pending')
  const isLoading = ref(false)
  const error = ref<string | null>(null)

  // --- Getters ---
  const isAuthenticated = computed(() => !!accessToken.value && !!user.value)
  const accessTokenExp = computed<number | null>(() => decodeJwtExp(accessToken.value))
  const shouldRefresh = computed(() => {
    if (!accessTokenExp.value) return false
    return accessTokenExp.value * 1000 - Date.now() < 2 * 60 * 1000  // < 2 min
  })

  // --- Mutators ---
  function _setAccess(token: string, user: IAuthUser): void {
    accessToken.value = token
    // user goes to localStorage (safe — no secret)
    localStorage.setItem(USER_KEY, JSON.stringify(user))
    // refresh token goes to Keychain (Tauri) or localStorage (Web)
    // (called separately by login/refresh)
  }

  async function _persistTokenPair(pair: ITokenPair): Promise<void> {
    accessToken.value = pair.access_token
    user.value = pair.user
    writeStoredUser(pair.user)
    await tauriAuthStorage.setRefreshToken(pair.refresh_token)
  }

  function _clear(): void {
    accessToken.value = null
    // do NOT clear user from localStorage (UI shows cached avatar/role)
    // do NOT call tauriAuthStorage.clear here; caller decides
  }

  // --- Actions ---
  async function login(username: string, password: string, rememberMe = false): Promise<void> {
    const pair = await authApi.login(username, password, rememberMe)
    await _persistTokenPair(pair)
    startupState.value = 'valid'
  }

  async function startupCheck(): Promise<AuthStartupState> {
    const refresh = await tauriAuthStorage.getRefreshToken()
    if (!refresh) {
      startupState.value = 'invalid'  // not an error — just no token
      return startupState.value
    }
    try {
      const result = await authApi.whoami(refresh)
      // whoami returns { user, access_token, session }
      accessToken.value = result.access_token
      user.value = result.user
      writeStoredUser(result.user)
      startupState.value = 'valid'
    } catch (err) {
      if ((err as { status?: number }).status === 401) {
        await tauriAuthStorage.clearRefreshToken()
        startupState.value = 'invalid'
      } else {
        startupState.value = 'error'  // network or server issue
      }
    }
    return startupState.value
  }

  async function silentRefresh(): Promise<void> {
    const refresh = await tauriAuthStorage.getRefreshToken()
    if (!refresh) {
      _clear()
      throw new Error('no refresh token')
    }
    try {
      const pair = await authApi.refresh(refresh)
      await _persistTokenPair(pair)
    } catch (err) {
      if ((err as { status?: number }).status === 401) {
        // reuse detected or all sessions revoked
        await tauriAuthStorage.clearRefreshToken()
      }
      _clear()
      throw err
    }
  }

  async function logout(): Promise<void> {
    const refresh = await tauriAuthStorage.getRefreshToken()
    if (refresh) {
      try { await authApi.logout(refresh) } catch { /* server may be down */ }
    }
    await tauriAuthStorage.clearRefreshToken()
    _clear()
    user.value = null  // explicit: logged out means no cached user
  }

  function logoutLocal(): void {
    _clear()
    user.value = null
  }

  return { /* state, getters, actions */ }
})

Approach (api/auth.ts additions):

async login(username: string, password: string, rememberMe = false): Promise<ITokenPair> {
  return this.request('/auth/login', {
    method: 'POST',
    body: JSON.stringify({ username, password, remember_me: rememberMe }),
  })
}

async whoami(refreshToken?: string): Promise<{ user: IAuthUser; access_token: string; session: SessionInfo }> {
  // whoami accepts either an access token (normal call) or a refresh token (cold start)
  // The base client's auth header injection handles access; for the cold-start case
  // we need a special path that uses the refresh token instead.
  return this.requestWithAuth('/auth/whoami', refreshToken)
}

async listSessions(): Promise<SessionInfo[]> { ... }
async revokeSession(sid: string): Promise<void> { ... }
async changePassword(oldPassword: string, newPassword: string): Promise<void> { ... }

Approach (api/base.ts interceptor):

this.client.interceptors.request.use(async (config) => {
  const auth = useAuthStore()
  if (auth.shouldRefresh && auth.accessToken) {
    try {
      await auth.silentRefresh()
    } catch {
      // silent refresh failed; let the request go through and 401 will trigger route
    }
  }
  if (auth.accessToken) {
    config.headers.Authorization = `Bearer ${auth.accessToken}`
  }
  return config
})

Test scenarios (auth.test.ts):

  • login(...) calls authApi.login with remember_me param
  • login(...) persists refresh token via tauriAuthStorage.setRefreshToken
  • startupCheck() with no refresh token → state='invalid'
  • startupCheck() with valid refresh → state='valid', user populated
  • startupCheck() with 401 from whoami → state='invalid', refresh token cleared
  • startupCheck() with network error → state='error', refresh token retained
  • silentRefresh() succeeds → new access in memory, new refresh in Keychain
  • silentRefresh() on 401 reuse → all state cleared, refresh token cleared
  • shouldRefresh is true when access expires in <2 min
  • shouldRefresh is false when access expires in >2 min or no access
  • logout() calls authApi.logout then clears Keychain + state
  • logout() doesn't fail when server is down (best-effort)
  • Access token is NEVER written to localStorage (spy on localStorage.setItem)

Verification: npm run test:unit -- auth.test.ts passes; manual e2e via npm run tauri dev.


U8. Frontend: LoginView "Remember me" + Settings sessions UI

Goal: User-facing changes to the login page and a new "Active Sessions" panel in settings.

Requirements: F2, F7, F8 (user-side).

Dependencies: U7 (store + api), U4 (endpoints).

Files:

  • src/agentkit/server/frontend/src/views/LoginView.vue — add "Remember me" checkbox; pass to store.login
  • src/agentkit/server/frontend/src/views/SettingsView.vue — new section "Active sessions" (or new route /settings/sessions)
  • src/agentkit/server/frontend/src/components/settings/ActiveSessionsPanel.vue — new component
  • src/agentkit/server/frontend/src/components/settings/ChangePasswordPanel.vue — new component
  • src/agentkit/server/frontend/src/router/index.ts — add /settings/sessions and /settings/security routes
  • tests/unit/views/LoginView.test.ts — checkbox behavior
  • tests/unit/components/ActiveSessionsPanel.test.ts

Approach (LoginView additions):

<a-checkbox v-model:checked="form.rememberMe">
  记住我30 天内免登录
</a-checkbox>
async function handleSubmit() {
  await authStore.login(form.username, form.password, form.rememberMe)
  router.replace(redirectTarget())
}

Approach (ActiveSessionsPanel.vue):

  • On mount: call authApi.listSessions(), render table (Device / Last active / Created / [Revoke] button)
  • "Current session" row has a badge; revoke button is disabled for the current row
  • "Revoke" calls authApi.revokeSession(sid) and removes the row
  • "Revoke all others" button at the top → calls authApi.logoutOthers() and reloads

Approach (ChangePasswordPanel.vue):

  • 3 fields: old password, new password, confirm new password
  • Submit: authApi.changePassword(old, new)
  • On success: show success message; note "其他设备将自动登出" ("other devices will be signed out automatically")

Test scenarios:

  • LoginView renders the checkbox; submitting with it checked passes rememberMe=true to store
  • ActiveSessionsPanel renders a row per session from the API response
  • ActiveSessionsPanel "Revoke" button calls authApi.revokeSession(sid) and removes the row optimistically
  • ActiveSessionsPanel "Revoke all others" calls authApi.logoutOthers() and reloads the list
  • ActiveSessionsPanel disables Revoke on the current session row
  • ChangePasswordPanel shows field-level validation errors (mismatched passwords)
  • ChangePasswordPanel on success shows toast and clears the form

Verification: npm run test:unit -- LoginView ActiveSessionsPanel ChangePasswordPanel passes; Playwright e2e for the full settings flow.


U9. Admin UI: user sessions management

Goal: Admins can see and revoke any user's active sessions.

Requirements: F7, F8 (admin-side).

Dependencies: U7, U4 (admin endpoints exist), U8 (reuses ActiveSessionsPanel layout).

Files:

  • src/agentkit/server/frontend/src/views/admin/UsersView.vue (or UserDetailView.vue) — add "Sessions" tab
  • src/agentkit/server/frontend/src/components/admin/UserSessionsPanel.vue — admin variant
  • src/agentkit/server/frontend/src/api/admin.ts — new file
  • tests/unit/components/UserSessionsPanel.test.ts

Approach:

  • Reuse ActiveSessionsPanel styling; pass an adminMode prop that adds:
    • Show username in the table header
    • Allow revoke of any session including current
    • Show revoked sessions with strikethrough
  • API: adminApi.listUserSessions(userId), adminApi.revokeUserSession(userId, sid)

Test scenarios:

  • Admin can see all sessions for a user (active + revoked)
  • Admin can revoke any session
  • Non-admin attempting to call adminApi endpoints gets a clear 403 error in the UI

Verification: npm run test:unit -- UserSessionsPanel passes; manual e2e with admin login.


U10. Backwards-compat + rollout shim

Goal: Existing in-flight clients (without sid claim) keep working for one minor version.

Requirements: N6.

Dependencies: U4 (the back-compat path in get_current_user).

Files:

  • src/agentkit/server/dependencies.py: get_current_user accepts both with-sid and without-sid JWTs; logs a DEBUG for legacy
  • src/agentkit/server/auth/jwt_utils.py: create_token_pair has a legacy_mode=True flag for the migration window; tokens issued during migration carry sid but the validator still accepts old ones
  • docs/migrations/2026-06-20-client-version-rollout.md — new doc explaining the rollout window (server logs a warning when a legacy JWT is accepted)

Approach:

  • Add an X-Client-Version header to all requests (set in api/base.ts)
  • Server middleware reads this header; if version < 0.5.0, it issues a legacy JWT (no sid) so that client doesn't get a 401 it can't handle
  • New clients always get a sid-bearing JWT
  • After one minor version (~30 days), remove the legacy path in a separate change
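The `version < 0.5.0` check needs a numeric comparison, because a plain string comparison would rank "0.10.0" below "0.5.0". A minimal sketch follows; `parse_version` and `wants_legacy_token` are hypothetical helpers, and treating a missing or malformed header as legacy is an assumption (old clients predate the header), not something the plan states.

```python
from __future__ import annotations

LEGACY_CUTOFF = (0, 5, 0)  # from the plan: clients below 0.5.0 get legacy tokens


def parse_version(raw: str | None) -> tuple[int, int, int] | None:
    """Parse 'MAJOR.MINOR.PATCH' (ignoring any -prerelease or +build suffix)."""
    if not raw:
        return None
    core = raw.strip().split("-", 1)[0].split("+", 1)[0]
    parts = core.split(".")
    if len(parts) != 3 or not all(p.isascii() and p.isdigit() for p in parts):
        return None
    major, minor, patch = (int(p) for p in parts)
    return major, minor, patch


def wants_legacy_token(header_value: str | None) -> bool:
    """Decide whether the X-Client-Version header calls for a legacy JWT."""
    version = parse_version(header_value)
    # Assumption: clients that send no header (or garbage) are treated as legacy.
    return version is None or version < LEGACY_CUTOFF
```

Tuple comparison orders versions component by component, so 0.10.0 is correctly treated as newer than 0.5.0.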

Test scenarios:

  • get_current_user with a sid-bearing JWT loads the session, validates it, returns the user
  • get_current_user with a JWT without sid (legacy) accepts it as long as signature + exp are valid
  • get_current_user with a sid-bearing JWT where the session is revoked → 401
  • get_current_user with a sid-bearing JWT where the session doesn't exist → 401
  • Legacy middleware path issues tokens without sid for clients with X-Client-Version < 0.5.0
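
The four get_current_user scenarios reduce to one branch on whether the sid claim is present. A minimal sketch under stated assumptions: the claims are already signature- and exp-verified, the user id travels in a sub claim, and sessions live in an in-memory dict instead of the auth_sessions table. The names resolve_user_id, Session, and Unauthorized are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Mapping


class Unauthorized(Exception):
    """Stands in for an HTTP 401 response."""


@dataclass
class Session:
    sid: str
    user_id: int
    revoked: bool = False


def resolve_user_id(claims: Mapping[str, object], sessions: Dict[str, Session]) -> int:
    """Return the user id for already-verified claims, or raise Unauthorized."""
    user_id = int(str(claims["sub"]))  # assumption: 'sub' carries the user id
    sid = claims.get("sid")
    if sid is None:
        # Legacy path: no session row to check, signature + exp were enough.
        return user_id
    session = sessions.get(str(sid))
    if session is None or session.revoked:
        # Covers both "session doesn't exist" and "session is revoked".
        raise Unauthorized("session missing or revoked")
    return user_id
```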

Verification: Backwards-compat test using a hand-crafted legacy JWT; new client flow continues to work; manual test with the previous-version frontend.


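U11. AuthProvider abstraction layer (extension point for future IdP integration)

Goal: Encapsulate "where users are stored, how passwords are verified, and how attributes are synced" behind a pluggable AuthProvider adapter. This iteration implements LocalAuthProvider (wrapping SQLite + bcrypt) and ships StubOIDCProvider as a placeholder (raises NotImplementedError) that serves as the interface-contract reference for the future OIDC implementation. The route layer, admin API, and SessionService obtain the provider via Depends(get_auth_provider), so a future switch to an IdP requires no route changes.

Requirements: F13, F14, F15.

Dependencies: None (referenced by U1/U3/U4; can land in parallel with, before, or after any stage of U1-U4; landing it early in Phase 1 is recommended because the U1 schema needs the auth_provider field).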

Files:

  • src/agentkit/server/auth/providers/__init__.py: new package; exports AuthProvider, the get_auth_provider() factory, LocalAuthProvider, and StubOIDCProvider
  • src/agentkit/server/auth/providers/base.py: AuthProvider Protocol (name: str plus four async methods: authenticate / get_user_by_id / sync_user_attributes / revoke_user)
  • src/agentkit/server/auth/providers/local.py: LocalAuthProvider implementation; wraps the existing auth/password.py logic (bcrypt verification + users table lookup)
  • src/agentkit/server/auth/providers/oidc_stub.py: StubOIDCProvider placeholder; every method raises NotImplementedError, and the docstring points to the checklist for the next iteration's OIDC implementation
  • src/agentkit/server/config.py: extend AuthConfig with provider: Literal["local", "oidc-stub"] = "local" (or add a new auth.provider field)
  • tests/unit/auth/providers/test_base.py: Protocol static type checks (runtime_checkable Protocol verification) + mock provider cases
  • tests/unit/auth/providers/test_local.py: full unit tests for LocalAuthProvider (reusing the auth/password.py test scenarios)
  • tests/unit/auth/providers/test_oidc_stub.py: unit tests asserting that calling any StubOIDCProvider method raises NotImplementedError

Approach (AuthProvider Protocol):

# auth/providers/base.py
from typing import Protocol, runtime_checkable
from ..models import User

@runtime_checkable
class AuthProvider(Protocol):
    """Capabilities that every auth backend must implement.

    The route layer calls only the methods below and does not know whether
    the concrete implementation is SQLite / OIDC / LDAP.
    Adding an IdP later only requires a new adapter that implements this Protocol.
    """

    name: str  # identifies the current provider; written to session.auth_provider

    async def authenticate(self, *, username: str, password: str) -> User:
        """Verify username + password and return a User. Raises InvalidCredentials on failure."""
        ...

    async def get_user_by_id(self, user_id: int) -> User | None:
        """Look up a user by id (used by admin endpoints, session validation, and whoami)."""
        ...

    async def sync_user_attributes(self, user_id: int) -> None:
        """Sync user attributes (department / email / title, etc.). LocalAuthProvider: no-op; OidcAuthProvider: pull the latest profile from the IdP and write it back locally."""
        ...

    async def revoke_user(self, user_id: int) -> None:
        """Disable a user (offboarding / lockout). LocalAuthProvider: UPDATE users SET is_active=0; OidcAuthProvider: call the IdP's disable API (future)."""
        ...

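Approach (LocalAuthProvider): Move the password verification logic at routes/auth.py:201-213 (SQLite SELECT + bcrypt verification + load_user) into LocalAuthProvider.authenticate. The route layer no longer calls verify_password / load_user directly; everything goes through the provider. revoke_user issues UPDATE users SET is_active=0; admin endpoints call this method instead of writing to the DB directly.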
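
The behavior LocalAuthProvider is expected to preserve can be illustrated with an in-memory analogue. This is a sketch, not the planned implementation: it swaps SQLite for a dict and bcrypt for the standard library's PBKDF2 so that it runs without dependencies, and the class name InMemoryLocalAuthProvider and the add_user helper are hypothetical.

```python
import asyncio
import hashlib
import hmac
import os
from dataclasses import dataclass
from typing import Dict, Optional


class InvalidCredentials(Exception):
    """Raised for an unknown user, a wrong password, or an inactive account."""


@dataclass
class User:
    id: int
    username: str
    password_hash: bytes
    salt: bytes
    is_active: bool = True


def hash_password(password: str, salt: bytes) -> bytes:
    # Stand-in for bcrypt so the sketch needs only the standard library.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 10_000)


class InMemoryLocalAuthProvider:
    """Hypothetical in-memory analogue of LocalAuthProvider (no SQLite)."""

    name = "local"

    def __init__(self) -> None:
        self._users: Dict[str, User] = {}

    def add_user(self, user_id: int, username: str, password: str) -> None:
        salt = os.urandom(16)
        self._users[username] = User(user_id, username, hash_password(password, salt), salt)

    async def authenticate(self, *, username: str, password: str) -> User:
        user = self._users.get(username)
        # One exception type for every failure mode, matching the test scenarios.
        if user is None or not user.is_active:
            raise InvalidCredentials()
        if not hmac.compare_digest(hash_password(password, user.salt), user.password_hash):
            raise InvalidCredentials()
        return user

    async def get_user_by_id(self, user_id: int) -> Optional[User]:
        return next((u for u in self._users.values() if u.id == user_id), None)

    async def sync_user_attributes(self, user_id: int) -> None:
        return None  # no-op for the local backend

    async def revoke_user(self, user_id: int) -> None:
        user = await self.get_user_by_id(user_id)
        if user is not None:
            user.is_active = False  # mirrors UPDATE users SET is_active=0
```

Because revoke_user flips the same flag that authenticate checks, an admin kick and a later login attempt stay consistent without either side touching the users table directly.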

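Approach (StubOIDCProvider): Every method raises NotImplementedError. The docstring states:

Not implemented yet. For the next iteration's OIDC integration, rewrite this class only; routes / admin / the session table need no changes. With auth.provider: oidc-stub configured, the server fails with NotImplementedError on the first provider call (for example the first /auth/login). This is by design, to avoid accidentally enabling an unfinished feature.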

Approach (DI factory):

# auth/providers/__init__.py
from functools import lru_cache
from ...config import get_settings
from .base import AuthProvider
from .local import LocalAuthProvider
from .oidc_stub import StubOIDCProvider

@lru_cache
def get_auth_provider() -> AuthProvider:
    settings = get_settings()
    provider_name = settings.auth.provider
    if provider_name == "local":
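        # get_auth_db: the existing aiosqlite connection accessor. It must be refactored
        # into a module-level singleton and imported here (import not shown).
        db = get_auth_db()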
        return LocalAuthProvider(db)
    elif provider_name == "oidc-stub":
        return StubOIDCProvider()
    else:
        raise ValueError(f"unknown auth provider: {provider_name}")

Approach (config extension):

# agentkit.yaml
auth:
  provider: local  # local | oidc-stub  (future: oidc-keycloak, oidc-feishu, ...)
  session:
    table: auth_sessions
    access_ttl_seconds: 900
    refresh_ttl_seconds: 604800
    refresh_ttl_remember_me_seconds: 2592000
  jwt:
    secret_env: AGENTKIT_JWT_SECRET
    algorithm: HS256
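
The TTL values in the config above correspond to 15 minutes, 7 days, and 30 days. A small sketch that makes the arithmetic and the provider validation explicit; SessionTtl, refresh_ttl, and validate_provider are hypothetical names, not the plan's AuthConfig.

```python
from dataclasses import dataclass

ALLOWED_PROVIDERS = ("local", "oidc-stub")


@dataclass(frozen=True)
class SessionTtl:
    # Defaults copied from the agentkit.yaml example.
    access_ttl_seconds: int = 900
    refresh_ttl_seconds: int = 604_800
    refresh_ttl_remember_me_seconds: int = 2_592_000

    def refresh_ttl(self, remember_me: bool) -> int:
        """Pick the refresh lifetime based on the "Remember me" checkbox."""
        if remember_me:
            return self.refresh_ttl_remember_me_seconds
        return self.refresh_ttl_seconds


def validate_provider(name: str) -> str:
    """Reject provider names the factory would not recognize."""
    if name not in ALLOWED_PROVIDERS:
        raise ValueError(f"unknown auth provider: {name}")
    return name
```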

Test scenarios (test_base.py + test_local.py + test_oidc_stub.py):

  • LocalAuthProvider with valid username+password returns User
  • LocalAuthProvider with wrong password raises InvalidCredentials
  • LocalAuthProvider with unknown username raises InvalidCredentials
  • LocalAuthProvider with inactive user (is_active=0) raises InvalidCredentials
  • LocalAuthProvider.get_user_by_id returns the user or None
  • LocalAuthProvider.sync_user_attributes is a no-op (returns None)
  • LocalAuthProvider.revoke_user sets is_active=0 and subsequent authenticate fails
  • LocalAuthProvider.name == "local"
  • StubOIDCProvider.authenticate raises NotImplementedError with helpful message
  • StubOIDCProvider.get_user_by_id raises NotImplementedError
  • StubOIDCProvider.sync_user_attributes raises NotImplementedError
  • StubOIDCProvider.revoke_user raises NotImplementedError
  • StubOIDCProvider.name == "oidc-stub"
  • get_auth_provider() with auth.provider=local returns LocalAuthProvider instance
  • get_auth_provider() with auth.provider=oidc-stub returns StubOIDCProvider instance
  • get_auth_provider() with auth.provider=unknown raises ValueError
  • get_auth_provider() is memoized (lru_cache; second call returns same instance)
  • runtime_checkable(AuthProvider): both Local and Stub pass isinstance(prov, AuthProvider) check
  • Protocol violation: a class missing authenticate method does NOT pass isinstance check (negative test)
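
The last two scenarios rely on how runtime_checkable behaves: isinstance checks only that the named members exist, not their signatures or whether they are async. A minimal sketch with a reduced, hypothetical protocol (AuthProviderLike) to show both the negative case and this limitation.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class AuthProviderLike(Protocol):
    """Reduced stand-in for AuthProvider with one attribute and two methods."""

    name: str

    async def authenticate(self, *, username: str, password: str) -> object: ...

    async def revoke_user(self, user_id: int) -> None: ...


class Complete:
    name = "complete"

    async def authenticate(self, *, username: str, password: str) -> object:
        return object()

    async def revoke_user(self, user_id: int) -> None:
        return None


class MissingAuthenticate:
    name = "broken"

    async def revoke_user(self, user_id: int) -> None:
        return None


class WrongSignature:
    name = "wrong"

    def authenticate(self) -> None:  # not async, wrong parameters
        return None

    def revoke_user(self) -> None:
        return None


# Presence of members is checked; signatures are not.
checks = {
    "complete": isinstance(Complete(), AuthProviderLike),
    "missing": isinstance(MissingAuthenticate(), AuthProviderLike),
    "wrong_signature": isinstance(WrongSignature(), AuthProviderLike),
}
```

Since WrongSignature also passes the isinstance check, the mypy run listed under Verification is what actually enforces the method signatures.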

Patterns to follow:

  • Protocol + runtime_checkable pattern (Python typing best practice)
  • DI factory + lru_cache singleton (consistent with the existing get_settings)
  • The InvalidCredentials error type goes in auth/providers/exceptions.py (new file)

Verification:

  • pytest tests/unit/auth/providers/ -v passes in full
  • mypy src/agentkit/server/auth/providers/ reports no errors
  • Start the dev server with auth.provider: oidc-stub → the first /auth/login returns 501 NotImplementedError (confirms the stub is active)
  • Start the dev server with auth.provider: local → the existing login flow runs, confirming nothing is broken
  • The admin kick feature calls provider.revoke_user(user_id) → the user's next authenticate fails (cross-checks LocalAuthProvider.revoke_user behavior)

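Future IdP integration checklist (reference for the next iteration):

  • auth/providers/oidc.py: implement OidcAuthProvider (authenticate / get_user_by_id / sync_user_attributes / revoke_user)
  • auth/oauth_routes.py: the /auth/oauth/{provider}/redirect and /auth/oauth/{provider}/callback endpoints
  • auth/state_cache.py: OAuth state parameter for CSRF protection (Redis, TTL 5 min)
  • Local-account creation policy for a user's first login via the IdP (just-in-time provisioning / reject / invite-only)
  • Session sync with the IdP (when the IdP logs a user out, the local session is revoked too)
  • Mapping of corporate-group department / title attributes to the local users table

This iteration covers only the Protocol + Local implementation + Stub placeholder + DI factory + placeholders (interface definitions) for items 1-3 above; the rest goes to a separate brainstorm in the next iteration.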


System-Wide Impact

| Stakeholder | Impact | Mitigation |
|---|---|---|
| End users (Tauri) | First login → no more login prompts for 7d (30d if "remember me"). | Pre-emptive refresh + Keychain storage prevent the failure modes that broke the existing flow. |
| End users (Web) | Same as Tauri but refresh in localStorage (degraded security). | Document the trade-off; Keychain is Tauri-only. |
| Admins | New capability: see active sessions, kick any user. | UI in admin pages; surface clearly in the Users view. |
| Developers (auth code) | New session module, denylist, cache, AuthProvider abstraction layer. | U3 is the single source of truth; routes don't duplicate logic. U11 is the single source of auth backend; routes don't import password.py directly. |
| Future corporate IdP integration team | Switching to OIDC / SAML / LDAP only adds an adapter; routes / admin are not rewritten. | U11 Protocol + LocalAuthProvider are in place; in the next iteration auth/providers/oidc.py simply implements the Protocol. |
| Existing in-flight clients | Unaffected during 30-day window. | U10 shim. |
| Server load | +1 cache lookup per request (cached 60s). | Redis-backed cache makes this sub-ms. |
| DB schema | New auth_sessions table (including the auth_provider field); existing user_sessions deprecated. | Alembic migration; keep user_sessions reads working for one version. |

Risks & Dependencies

| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| keyring crate compatibility issues on Linux without gnome-keyring / kwallet | Medium | Low (Tauri dev) | Document apt install gnome-keyring in README; fallback to localStorage as per KTD-confirmed decision. |
| Tauri WebView localStorage might be cleared on Tauri upgrade | Low | Medium (forces re-login) | Refresh token is in Keychain, not localStorage, so this is no longer a re-login trigger. Only the cached user (avatar) is lost. |
| Refresh token rotation causes concurrent-request races | Medium | Medium (false-positive reuse detection) | The 30s denylist window catches the case; legitimate retries complete in <1s. Add a metric for reuse detection so we can spot flapping. |
| Migration corrupts existing refresh tokens | Low | High (users locked out) | Test migration on a copy of prod DB; preserve user_sessions reads for back-compat. |
| Session cap eviction surprises users (they didn't expect to be kicked) | Low | Low (visible at next login) | Make the cap (10) generous; document it; do not log evicted users out silently. |
| Test mocks diverge from real keyring behavior | Medium | Medium (CI passes, manual fails) | Use keyring::mock feature in CI; document that real-platform testing is manual. |
| JWT secret rotation in dev mode invalidates all sessions | Low | High (Tauri dev loops) | Document the behavior; provide agentkit doctor to check. |
| Leftover routes that call verify_password or modify the users table directly when switching AuthProvider (KTD-10) | Medium | Medium (must be cleaned up when switching to an IdP) | Once U11 lands, require all routes to go through Depends(get_auth_provider); add a code-review template checklist item: "routes must not call password/auth functions directly". |
| lru_cache singleton conflicts with test isolation (U11) | Low | Low (flaky tests) | get_auth_provider exposes a cache_clear() helper; conftest.py clears the cache before and after each test fixture. |
| Residual dependencies on LocalAuthProvider after a future IdP takes over | Low | Low (fine to keep during migration) | Listed explicitly in the U11 checklist: Local remains usable as a "local break-glass account"; after OIDC takes over, Local is not deleted, only the routes' default provider changes. |

External Dependencies

| Dependency | Version | Required For |
|---|---|---|
| keyring (Rust crate) | 3.x | Tauri Keychain integration (U5) |
| pyjwt (Python) | already in use | JWT signing/verification (U2) |
| aiosqlite (Python) | already in use | DB layer (U1, U3) |
| alembic (Python) | already in use | Migrations (U1) |
| redis (Python) | already in use | Session cache (U3); optional, with in-process fallback |
| @tauri-apps/api (TS) | 2.x | Tauri command invocation (U6) |

Phased Delivery

This plan has natural phasing based on dependency order. Each phase lands as a single PR.

Phase 1: Backend foundation (U1, U2, U3)

  • auth_sessions table + migration
  • JWT sid/jti claims
  • SessionService with rotation + reuse detection
  • Redis/in-process cache
  • ~3-4 days of work, no frontend changes

Rollout gate: Deploy to dev. All existing clients continue to work (legacy JWT path). New login creates auth_sessions rows; old user_sessions rows are no longer written.

Phase 2: New endpoints (U4, U10)

  • All new auth + admin endpoints
  • Backwards-compat shim
  • Admin endpoint tests
  • ~2 days of work, frontend still on old flow

Rollout gate: Deploy to dev. New endpoints are available; old /auth/login and /auth/refresh still work (with legacy tokens).

Phase 3: Tauri Keychain (U5, U6)

  • Rust commands + Cargo dep
  • Frontend tauri-auth adapter
  • ~1-2 days of work

Rollout gate: Build a new Tauri release. Verify on macOS (Keychain Access.app shows the entry). Linux without keyring daemon → manual test fallback.

Phase 4: Frontend refactor (U7, U8, U9)

  • Auth store rewrite (3-state, pre-emptive refresh, no access in localStorage)
  • LoginView "Remember me"
  • Active Sessions panel in Settings
  • Admin user sessions panel
  • ~3-4 days of work

Rollout gate: Frontend rebuild. End-to-end manual test on Tauri (macOS) + Web. Run Playwright suite.

Phase 5: Cleanup (after one minor version, ~30 days)

  • Remove the legacy JWT back-compat path
  • Drop the user_sessions table
  • Update X-Client-Version floor
  • ~1 day of work

Phase 6: AuthProvider abstraction layer (U11 + related changes)

Phase added 2026-06-20 (merges the AuthProvider scope)

  • auth/providers/base.py: AuthProvider Protocol + runtime_checkable
  • auth/providers/local.py: LocalAuthProvider (wraps the existing password verification logic at routes/auth.py:201-213)
  • auth/providers/oidc_stub.py: StubOIDCProvider (placeholder that raises NotImplementedError)
  • auth/providers/__init__.py: get_auth_provider() DI factory (lru_cache singleton)
  • config.py: add the auth.provider: local | oidc-stub setting
  • U1 schema adds the auth_provider field (folded into Phase 1, U1)
  • U3 SessionService create_session accepts an auth_provider parameter (folded into Phase 1, U3)
  • U4 routes inject Depends(get_auth_provider); admin endpoints call provider.revoke_user(user_id) instead of updating the users table directly (folded into Phase 2, U4)
  • ~1.5 days of work (can land in parallel with early Phase 1)

Rollout gate:

  • pytest tests/unit/auth/providers/ -v passes in full
  • Start the dev server with auth.provider: oidc-stub → the first /auth/login returns 501 NotImplementedError
  • Start the dev server with auth.provider: local → the existing login flow is unaffected
  • The admin kick feature calling provider.revoke_user(user_id) behaves identically to the original direct DB UPDATE

Future IdP integration entry point: the next iteration's OIDC integration only needs to add auth/providers/oidc.py + auth/oauth_routes.py (see the U11 checklist); routes / admin / the session table need no changes.


Open Questions

These are deferred to implementation and tracked here for visibility:

  1. Q1: Should "Active Sessions" be a tab in Settings or a separate route (/settings/sessions)? Plan defaults to a Settings tab; revisit if UX testing suggests otherwise.
  2. Q2: Should the admin UI show revoked_reason for kicked sessions? Plan defaults to YES (audit value); revisit if it adds too much visual noise.
  3. Q3: Should the cap-eviction trigger a server-side notification (e.g. an audit_event)? Plan defaults to writing a row to a future auth_audit_log table; for now, just the revoked_reason='session_cap_eviction' field is enough.
  4. Q4: Should change_password rate-limit (e.g. 5 attempts per hour)? Out of scope here but worth a follow-up security brainstorm.
  5. Q5: macOS Tauri builds need code-signing for Keychain access. The dev binary is unsigned → Keychain prompts "always allow". Plan documents this; production builds must be signed.
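  6. Q6 (added 2026-06-20): How does the AuthProvider abstraction layer coexist with the existing password verification logic at routes/auth.py:201-213? Planned approach: as the first step of U11, LocalAuthProvider replicates the existing logic in full (behavior-equivalent); as the second step, the U4 route changes switch over in one go. When U11 lands, it ships a "behavior equivalence" test suite confirming that behavior is identical before and after the switch.
  7. Q7 (added 2026-06-20): How is the lru_cache singleton behind get_auth_provider() isolated in the test environment? Planned approach: export a cache_clear() helper; conftest.py calls get_auth_provider.cache_clear() before and after each test fixture; dependency_overrides is not introduced (to avoid polluting FastAPI app state).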

Sources & Research

Codebase references

External references

Institutional learnings

  • Project context: AGENTS.md + .trae/rules/project_rules.md — security and async generator safety rules apply
  • Existing tests: tests/unit/auth/ + tests/integration/auth/ — patterns to follow for new test files
  • The current _refreshFailed sticky flag in stores/auth.ts:112 is the root cause of the "logged out for no reason" UX — the rewrite in U7 eliminates it by always re-trying the refresh before giving up

Acceptance Examples (for the executor / reviewer)

The following end-to-end flows must work after this plan lands. Each is testable in Playwright or manual e2e.

AE-1: First login → cold start → main app (Covers F1, F3, F10, F11)

  1. Launch Tauri (clean state, no Keychain entry)
  2. Login with valid credentials → land on /agent
  3. Close Tauri window
  4. Re-launch Tauri (cold start)
  5. Expected: brief splash, then /agent. No login page seen. Keychain Access.app shows an entry for com.fischer.agentkit / refresh_token.

AE-2: Token expiry mid-session → silent refresh (Covers F10)

  1. Log in; access token exp 15 min
  2. Wait 13 minutes (or manually expire the token in DB)
  3. Make an API call (e.g. fetch conversations)
  4. Expected: request succeeds (silent refresh happened before the call); no 401 surfaced to the user.
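
The silent refresh in AE-2 comes down to a check made before each request. A sketch of that check, written in Python for consistency with the rest of the plan even though the real logic belongs to the frontend auth store; the 120-second window is an assumption (the scenario only implies that a call at minute 13 of 15 must trigger a refresh), and needs_refresh is a hypothetical name.

```python
import time
from typing import Optional

# Assumption: refresh once 120 s or less remain on the access token.
REFRESH_SKEW_SECONDS = 120


def needs_refresh(exp: float, now: Optional[float] = None,
                  skew: float = REFRESH_SKEW_SECONDS) -> bool:
    """True when the token is expired or will expire within `skew` seconds."""
    current = time.time() if now is None else now
    return exp - current <= skew
```

With a token issued at t=0 and exp=900, a request at t=780 (13 minutes) refreshes first, while a request at t=600 goes out with the existing token.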

AE-3: Refresh token reuse → all sessions revoked (Covers F5, F9)

  1. Log in from Tauri (session A)
  2. Log in from Web (session B)
  3. Copy A's refresh token from Keychain
  4. Wait for A to refresh once legitimately (A's old refresh is now in the 30s denylist, and A has a new refresh)
  5. Try to use the copied old refresh token
  6. Expected: 401 with error: "token_reuse_detected". A's session is revoked. B's session is also revoked. Both clients get bounced to /login.
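
The rotation and reuse-detection behavior in AE-3 can be modeled with two lookup tables. This sketch is an in-memory illustration, not the SessionService design: it keeps rotated-out token hashes indefinitely and leaves out the 30s denylist window, and all names (RotatingSessions, TokenReuseDetected, InvalidToken) are hypothetical.

```python
import hashlib
import secrets
from dataclasses import dataclass, field
from typing import Dict, Tuple


class TokenReuseDetected(Exception):
    """Maps to a 401 with error 'token_reuse_detected'."""


class InvalidToken(Exception):
    """Unknown token, or a token whose session was already revoked."""


def _digest(token: str) -> str:
    return hashlib.sha256(token.encode()).hexdigest()


@dataclass
class RotatingSessions:
    # hash of the current refresh token -> (user_id, sid)
    current: Dict[str, Tuple[int, str]] = field(default_factory=dict)
    # hash of a rotated-out refresh token -> user_id
    rotated_out: Dict[str, int] = field(default_factory=dict)

    def login(self, user_id: int, sid: str) -> str:
        token = secrets.token_urlsafe(32)
        self.current[_digest(token)] = (user_id, sid)
        return token

    def revoke_all(self, user_id: int) -> None:
        self.current = {h: v for h, v in self.current.items() if v[0] != user_id}

    def refresh(self, token: str) -> str:
        h = _digest(token)
        if h in self.rotated_out:
            # A rotated-out token came back: treat it as leaked and revoke
            # every session of that user, not only the one it belonged to.
            self.revoke_all(self.rotated_out[h])
            raise TokenReuseDetected()
        if h not in self.current:
            raise InvalidToken()
        user_id, sid = self.current.pop(h)
        self.rotated_out[h] = user_id
        return self.login(user_id, sid)
```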

AE-4: Password change → other device kicked (Covers F9)

  1. Log in from Tauri (session A) and Web (session B) as the same user
  2. From A, change password
  3. From B, make any API call
  4. Expected: B gets 401 → bounced to /login. A continues to work.

AE-5: Admin kicks a session (Covers F7, F8)

  1. User logs in from Tauri and Web
  2. Admin opens the Users view, selects the user, opens the Sessions tab
  3. Admin clicks "Revoke" on the Tauri session
  4. Expected: Tauri client's next API call returns 401 → bounced to /login. Web session is unaffected.

AE-6: Remember me toggle (Covers F2)

  1. Log in with "Remember me" UNCHECKED
  2. Expected: refresh token exp is 7 days
  3. Log out, log in with "Remember me" CHECKED
  4. Expected: refresh token exp is 30 days

AE-7: Session cap eviction (Covers F12 + the cap)

  1. Log in 10 times from 10 different simulated clients (use curl with different User-Agent headers)
  2. Expected: 10 sessions exist, all active
  3. Log in an 11th time
  4. Expected: the oldest non-current session is revoked (visible in DB with revoked_reason='session_cap_eviction'); the 10 active sessions are now the 2nd-10th + the new 11th
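
The eviction rule in AE-7 can be sketched as follows, assuming sessions are ordered by creation time and that eviction happens at login before the new row is added. Session and create_session here are hypothetical in-memory stand-ins for the auth_sessions table and SessionService.

```python
from dataclasses import dataclass
from typing import List, Optional

SESSION_CAP = 10  # cap value taken from the plan


@dataclass
class Session:
    sid: str
    created_at: int
    revoked_reason: Optional[str] = None


def create_session(sessions: List[Session], sid: str, now: int,
                   cap: int = SESSION_CAP) -> Session:
    """Append a new session, evicting the oldest active ones to stay within `cap`."""
    active = sorted(
        (s for s in sessions if s.revoked_reason is None),
        key=lambda s: s.created_at,
    )
    # Leave room for the session that is about to be created.
    overflow = len(active) - (cap - 1)
    for victim in active[:max(overflow, 0)]:
        victim.revoked_reason = "session_cap_eviction"
    new_session = Session(sid, now)
    sessions.append(new_session)
    return new_session
```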

AE-8: Web fallback to localStorage (Covers F4)

  1. Open the app in a browser (not Tauri)
  2. Log in
  3. Expected: localStorage.getItem('agentkit.refresh_token') returns the token. DevTools shows the value.
  4. (Note: this is the documented degraded security model for Web clients)

AE-9: Old client still works during migration (Covers N6)

  1. Build a previous-version frontend
  2. Log in (gets a legacy JWT without sid)
  3. Make API calls
  4. Expected: server validates the legacy JWT via the back-compat path; user is not affected

AE-10: AuthProvider switch (local → oidc-stub, verifying the interface contract) (Covers F13, F14)

Added 2026-06-20 (KTD-10 / U11 verification)

  1. Set auth.provider: local in agentkit.yaml and start the dev server
  2. POST /auth/login with the existing admin account
  3. Expected: 200 OK, returns TokenResponse; auth_sessions.auth_provider='local' in the DB
  4. Change the setting to auth.provider: oidc-stub and restart the dev server
  5. POST /auth/login with the same account
  6. Expected: 501 Not Implemented (StubOIDCProvider raises NotImplementedError)
  7. Verify that the admin endpoint /admin/users/{id}/sessions still lists the session created in step 3 (with auth_provider='local')
  8. Expected: the admin session list is unaffected by the provider switch (the core promise of KTD-10)
  9. Check isinstance(provider_instance, AuthProvider) for both Local and Stub
  10. Expected: both return True (runtime_checkable Protocol behaves correctly)

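AE-11: Audit field auth_provider is written (historical + new rows) (Covers F15)

  1. After AE-1 steps 1-2, call GET /auth/sessions to list all active sessions of the current user
  2. Expected: every session includes auth_provider: "local" (rows backfilled from user_sessions are also 'local', because the backfill uses the default value)
  3. As admin, call GET /admin/users/{id}/sessions to view sessions across users
  4. Expected: every session carries the auth_provider field, and admin can filter by provider (only local exists today; once OIDC lands, oidc-* values will distinguish sessions)
  5. SessionService.list_active_by_provider('local') returns all local sessions
  6. Expected: count = the total seen in step 2
  7. SessionService.list_active_by_provider('oidc-stub') returns an empty list under the current implementation
  8. Expected: count = 0 (proves the field exists but has no data; values appear only after OIDC lands)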
  9. (Applies to the legacy-client flow in AE-9) Server log shows DEBUG: "Legacy JWT without sid; using exp-only validation"