refactor: follow-up tech debt cleanup (except Exception + Any remediation) #9

Merged
fischer merged 3 commits from refactor/followup-tech-debt into main 2026-07-01 03:03:02 +08:00

Follow-up Tech Debt Cleanup: except Exception + Any Remediation

Summary

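Following up on the Known Residuals from PR #8 (U1-U5 main debt cleanup), this PR addresses three categories of remaining systemic tech debt:

  1. Narrowing except Exception in server/routes/ (23 files, 131 sites): broad catches outside framework boundaries are replaced with specific exception tuples; framework boundaries keep the broad catch and gain an asyncio.CancelledError guard
  2. Removing Any from llm/ + memory/ + client/ (31 files, 172 sites): Any is replaced with a TypeAlias, object, or a TYPE_CHECKING Protocol
  3. Removing Any from orchestrator/ + bitable/ (19 files, 112 sites): same strategy, plus the concrete RecalcTask type and Coroutine[object, object, object] in place of Coroutine[Any, Any, Any]

In total, 415 sites are addressed across 73 files.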

Key Design Decisions

except Exception classification strategy (PR #1)

Exception tuples are chosen per call scenario rather than applying one tuple everywhere:

  • WebSocket ops → (ConnectionError, RuntimeError, asyncio.TimeoutError)
  • DB/Store → (ConnectionError, OSError, asyncio.TimeoutError, ValueError, KeyError, RuntimeError)
  • EventQueue → (asyncio.QueueFull, RuntimeError, ConnectionError)
  • Config → (ValueError, TypeError, KeyError)
  • Cleanup → (RuntimeError, asyncio.TimeoutError, ConnectionError)
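As a rough illustration of the classification above, the tuples can be kept as named constants and used at the call sites. This is only a sketch: the constant names (WS_ERRORS, CONFIG_ERRORS) and the load_port helper are hypothetical and do not come from the PR; only the tuple contents are taken from the list.

```python
import asyncio

# Tuple contents mirror the categories above; the constant names are illustrative.
WS_ERRORS = (ConnectionError, RuntimeError, asyncio.TimeoutError)
CONFIG_ERRORS = (ValueError, TypeError, KeyError)


def load_port(raw: dict[str, str], default: int = 8000) -> int:
    """Hypothetical config helper: fall back to a default on expected errors only."""
    try:
        # KeyError if the key is missing, ValueError if the value is not numeric.
        return int(raw["port"])
    except CONFIG_ERRORS:
        return default
```

Anything outside the tuple (for example an AttributeError caused by a genuine bug) is no longer swallowed and reaches the caller.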

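Framework boundaries retained: framework entry points such as auth_flow in auth.py and the WebSocket main loop in ws.py still need an except Exception fallback to keep the process from crashing, but they gain an except asyncio.CancelledError: raise guard so that cancellation propagates correctly.

2 noqa sites retained: the DB resilience boundary in skills.py (the business requirement is to catch every exception so that skill loading never blocks).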
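The framework-boundary pattern might look roughly like the sketch below. The run_boundary function and its handler parameter are hypothetical stand-ins for the real entry points. Note that since Python 3.8 asyncio.CancelledError derives from BaseException, so except Exception would not catch it there anyway; the explicit guard mainly documents intent and stays correct if the ordering of handlers changes.

```python
import asyncio
import logging
from collections.abc import Awaitable, Callable

logger = logging.getLogger(__name__)


async def run_boundary(handler: Callable[[], Awaitable[None]]) -> bool:
    """Hypothetical framework entry point.

    Logs unexpected errors instead of crashing, but never swallows cancellation.
    """
    try:
        await handler()
        return True
    except asyncio.CancelledError:
        raise  # let task cancellation propagate to the event loop
    except Exception:  # broad on purpose: last line of defence at the boundary
        logger.exception("unhandled error at framework boundary")
        return False
```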

Any remediation strategy (PR #2 + #3)

Replacement types are chosen in priority order:

  1. TypeAlias (preferred, for dicts with clear semantics): MetadataValue/MetadataDict/RAGSearchResult/SkillConfigDict, etc.
  2. object (the strictest type that accepts any value): dict[str, Any] → dict[str, object]; **kwargs: Any → **kwargs: object; list[Any] → list[object]; return type Any → object
  3. TYPE_CHECKING Protocol (for external dependency objects): _RedisLike/_RedisPipelineLike/_LLMGatewayLike/_QuotaServiceLike/_RAGServiceLike/_GraphServiceLike/_PlanLike, etc.
  4. TYPE_CHECKING import + string annotations (to avoid circular imports)
  5. Coroutine[Any, Any, Any] → Coroutine[object, object, object]
  6. Concrete type import (e.g. RecalcTask) in place of Any

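Key decision: avoid recursive TypeAlias

The first draft of PR #3 defined a recursive TypeAlias in bitable/models.py, FieldValue = str | int | ... | list["FieldValue"] | dict[str, "FieldValue"], but Pydantic v2 failed to build a schema for this recursive named alias (RecursionError: maximum recursion depth exceeded). The code now uses dict[str, object], with a ponytail: comment describing the ceiling and the upgrade path.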

Testing Notes

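| Module | Test result |
|------|---------|
| server/routes/ | Matches the main baseline (all 110 failures are baseline issues, not regressions from this PR) |
| llm/ + memory/ + client/ | 253 passed (21 litellm failures are caused by a package missing from the environment, matching the baseline) |
| orchestrator/ | 154 passed |
| bitable/ | 91 passed, 116 skipped |
| Total | 498 passed, 0 regressions |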

ruff: All checks passed (ruff check src/agentkit/llm/ src/agentkit/memory/ src/agentkit/client/ src/agentkit/orchestrator/ src/agentkit/bitable/)

Verification method: each failure was compared against the baseline using git stash && git checkout main to confirm that it is not a regression.
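The baseline comparison amounts to a set difference between the failures seen on the branch and those already present on main. A minimal sketch, assuming the failing test IDs have been collected into sets by hand or from test-runner output (the function name and the test IDs are hypothetical):

```python
def find_regressions(baseline_failures: set[str], branch_failures: set[str]) -> set[str]:
    """Return the tests that fail on the branch but did not already fail on main."""
    return branch_failures - baseline_failures


# Hypothetical test IDs, for illustration only.
baseline = {"tests/test_a.py::test_one", "tests/test_b.py::test_two"}
branch = {"tests/test_a.py::test_one", "tests/test_b.py::test_two"}
regressions = find_regressions(baseline, branch)
```

An empty result corresponds to the "0 regressions" claim: every failure on the branch is also a failure on main.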

Commits

  1. aa6367f - refactor(server/routes): classify except Exception in 23 route files (131 sites)
  2. 34a89c4 - refactor(llm+memory+client): remove Any from type signatures (172 sites)
  3. 7b1b198 - refactor(orchestrator+bitable): remove Any from type signatures (112 sites)

Post-Deploy Monitoring & Validation

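No additional operational monitoring is required beyond the 24h check described here. This PR is a type-annotation and exception-classification refactor and is not intended to change runtime behavior:

  • Type-annotation changes are not expected to affect runtime behavior (Python does not enforce annotations at runtime)
  • After narrowing except Exception, exceptions that used to be swallowed now propagate upward. If a caller has a higher-level fallback, behavior is unchanged; if not, previously hidden bugs may surface (an intended improvement, but one to watch)
  • Monitoring signal: for 24h after deployment, watch the 500 error rate of endpoints under server/routes/. If an endpoint's 500 rate rises, the narrowing was probably too strict and missed a runtime exception type; re-check that endpoint's except tuple
  • Rollback trigger: if an endpoint's 500 error rate rises above 2x its baseline, roll that file back to except Exception and add the missing exception types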
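The rollback trigger can be expressed as a simple threshold check. This is a sketch only: the function name is hypothetical, and treating any 500s on an endpoint with a zero baseline as a trigger is an assumption that the PR does not state.

```python
def should_roll_back(baseline_rate: float, current_rate: float, factor: float = 2.0) -> bool:
    """Return True when an endpoint's 500 rate exceeds `factor` times its baseline."""
    if baseline_rate <= 0.0:
        # Assumption: on a previously clean endpoint, any 500s count as a trigger.
        return current_rate > 0.0
    return current_rate > factor * baseline_rate
```

For example, with a baseline 500 rate of 1%, a post-deploy rate of 2.5% would trigger a rollback of that route file, while 1.5% would not.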

Known Residuals

None. This PR completes all three follow-up items listed under Known Residuals in PR #8. Remaining except Exception and Any usages, if any, belong to other subsystems and can be handled in later PRs.


fischer added 3 commits 2026-07-01 02:45:34 +08:00
aa6367ff9f refactor(server/routes): classify except Exception in 23 route files
Narrow 131 except Exception to specific exception types across all
server/routes/ modules. Framework boundaries (main execute paths,
WebSocket top-level) retain except Exception with asyncio.CancelledError
guard.

Categories:
- WebSocket ops: (ConnectionError, RuntimeError, asyncio.TimeoutError)
- DB/Store ops: (ConnectionError, OSError, asyncio.TimeoutError, ValueError, KeyError, RuntimeError)
- EventQueue: (asyncio.QueueFull, RuntimeError, ConnectionError)
- Config construction: (ValueError, TypeError, KeyError)
- Cleanup/dissolve: (RuntimeError, asyncio.TimeoutError, ConnectionError)
- HTTP handlers: business-specific exceptions
- Framework boundaries: retain except Exception + CancelledError guard

Stats: 101 narrowed, 31 framework boundary retained, 2 noqa (DB resilience)

Follow-up to PR #8 (U1-U5 systematic tech debt cleanup).
34a89c4873 refactor(llm+memory+client): remove Any from type signatures
Eliminate 172 Any usages across llm/, memory/, client/ via:
- TypeAlias (MetadataValue, MetadataDict, RAGSearchResult, etc.)
- object for arbitrary dict/value types
- TYPE_CHECKING Protocol for Redis/Quota/RAG/Graph services
- TYPE_CHECKING import + string annotations for forward refs
- Remove unused Any imports (18 F401 fixed)

Tests: 253 passed (llm 21 failures are pre-existing litellm env issue)
ruff: All checks passed
Test / backend-test (pull_request) Has been cancelled
Test / frontend-unit (pull_request) Has been cancelled
Test / api-e2e (pull_request) Has been cancelled
Test / frontend-e2e (pull_request) Has been cancelled
7b1b198058 refactor(orchestrator+bitable): remove Any from type signatures
Eliminate 112 Any usages across orchestrator/ (62) and bitable/ (50) via:
- TYPE_CHECKING Protocol for Redis/LLMGateway/Plan/Dispatcher/StateManager
- object for arbitrary dict/list/value types (Pydantic v2 serializes fine)
- RecalcTask concrete import (replacing Any in recalc_worker.py)
- Coroutine[object, object, object] for async generic
- Remove unused Any imports (F401 cleanup)

Note: Avoided recursive TypeAlias (FieldValue) because Pydantic v2 cannot
build schemas for recursive named aliases (RecursionError).

Tests: 245 passed (bitable 91 + orchestrator 154), 0 regressions
ruff: All checks passed
fischer merged commit 838a05772e into main 2026-07-01 03:03:02 +08:00
Reference: fischer/fischer-agentkit#9