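feat: add 5 AI platform adapters, fix the citation engine, and enhance report export

- New platforms: Tongyi Qianwen (通义千问), Doubao (豆包), Zhipu Qingyan (智谱清言), Tiangong AI (天工AI), iFlytek Spark (讯飞星火)
- Engine rewrite: switched from Playwright to a search-engine mode (DuckDuckGo + Wikipedia)
- Execution path: CitationEngine runs asynchronously after run-now is triggered
- Scheduler fallback: polls every minute to process pending tasks
- Report enhancements: 10 fields, Chinese platform names, confidence scores, summary statistics
- Fixes: CORS, raw_response character filtering, mixed time zones
- UX fixes: sidebar navigation highlighting, success notifications, form clearing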
chiguyong 2026-04-23 20:10:58 +08:00
parent a8927a18e6
commit 02cf7a94ac
29 changed files with 1296 additions and 548 deletions


@ -21,6 +21,15 @@
- [backend/app/models/query_task.py](file://backend/app/models/query_task.py)
</cite>
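## Update Summary
**Changes made**
- Expanded the authentication endpoint documentation to cover registration, login, and user info lookup
- Updated the query management endpoint documentation to describe the full feature set, including CRUD operations and access control
- Added the statistical analysis and task execution features of the citation data endpoints
- Expanded the CSV format description for the report export endpoint
- Improved the error handling and status code documentation
- Updated the architecture and data flow diagrams to reflect the actual implementation
## Table of Contents
1. [Introduction](#简介)
2. [Project Structure](#项目结构)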
@ -60,7 +69,7 @@ A --> J["数据传输对象<br/>backend/app/schemas/*.py"]
- [backend/app/api/deps.py:13](file://backend/app/api/deps.py#L13)
**Section sources**
- [backend/app/main.py:1-48](file://backend/app/main.py#L1-L48)
- [backend/app/main.py:1-57](file://backend/app/main.py#L1-L57)
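## Core Components
- Application entry point and lifecycle management: defines the application title, version, and CORS policy; registers each module's routes; starts and stops the query scheduler.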
@ -71,7 +80,7 @@ A --> J["数据传输对象<br/>backend/app/schemas/*.py"]
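- Data models and services: models for users, queries, citation records, and query tasks, plus the corresponding service logic.
**Section sources**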
- [backend/app/main.py:13-47](file://backend/app/main.py#L13-L47)
- [backend/app/main.py:13-57](file://backend/app/main.py#L13-L57)
- [backend/app/api/auth.py:13-42](file://backend/app/api/auth.py#L13-L42)
- [backend/app/api/queries.py:15-85](file://backend/app/api/queries.py#L15-L85)
- [backend/app/api/citations.py:25-77](file://backend/app/api/citations.py#L25-L77)
@ -112,11 +121,11 @@ MODELS --> CONFIG
```
**Diagram sources**
- [backend/app/main.py:38-42](file://backend/app/main.py#L38-L42)
- [backend/app/main.py:38-51](file://backend/app/main.py#L38-L51)
- [backend/app/api/deps.py:16-42](file://backend/app/api/deps.py#L16-L42)
- [backend/app/services/auth.py:37-68](file://backend/app/services/auth.py#L37-L68)
- [backend/app/services/query.py:12-129](file://backend/app/services/query.py#L12-L129)
- [backend/app/services/citation.py:24-268](file://backend/app/services/citation.py#L24-L268)
- [backend/app/services/query.py:12-123](file://backend/app/services/query.py#L12-L123)
- [backend/app/services/citation.py:24-359](file://backend/app/services/citation.py#L24-L359)
## Detailed Component Analysis
@ -215,7 +224,7 @@ Return --> End
**Section sources**
- [backend/app/api/queries.py:15-85](file://backend/app/api/queries.py#L15-L85)
- [backend/app/schemas/query.py:11-94](file://backend/app/schemas/query.py#L11-L94)
- [backend/app/services/query.py:12-129](file://backend/app/services/query.py#L12-L129)
- [backend/app/services/query.py:12-123](file://backend/app/services/query.py#L12-L123)
### Citation Data Endpoints
- Endpoint prefix: /api/v1/citations
@ -254,12 +263,12 @@ CitAPI-->>Client : 202 任务信息
**Diagram sources**
- [backend/app/api/citations.py:59-77](file://backend/app/api/citations.py#L59-L77)
- [backend/app/services/citation.py:204-234](file://backend/app/services/citation.py#L204-L234)
- [backend/app/services/citation.py:204-261](file://backend/app/services/citation.py#L204-L261)
**Section sources**
- [backend/app/api/citations.py:25-77](file://backend/app/api/citations.py#L25-L77)
- [backend/app/schemas/citation.py:7-50](file://backend/app/schemas/citation.py#L7-L50)
- [backend/app/services/citation.py:24-268](file://backend/app/services/citation.py#L24-L268)
- [backend/app/services/citation.py:24-359](file://backend/app/services/citation.py#L24-L359)
### Report Export Endpoints
- Endpoint prefix: /api/v1/reports
@ -289,11 +298,11 @@ ReportAPI-->>Client : 200 CSV文件下载
**Diagram sources**
- [backend/app/api/reports.py:16-46](file://backend/app/api/reports.py#L16-L46)
- [backend/app/services/citation.py:237-268](file://backend/app/services/citation.py#L237-L268)
- [backend/app/services/citation.py:327-359](file://backend/app/services/citation.py#L327-L359)
**Section sources**
- [backend/app/api/reports.py:16-46](file://backend/app/api/reports.py#L16-L46)
- [backend/app/services/citation.py:237-268](file://backend/app/services/citation.py#L237-L268)
- [backend/app/services/citation.py:327-359](file://backend/app/services/citation.py#L327-L359)
## Dependency Analysis
- Middleware and authentication:


@ -15,9 +15,18 @@
- [backend/app/database.py](file://backend/app/database.py)
- [backend/app/config.py](file://backend/app/config.py)
- [backend/app/main.py](file://backend/app/main.py)
- [tests/test_scheduler.py](file://tests/test_scheduler.py)
- [tests/test_queries.py](file://tests/test_queries.py)
</cite>
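## Update Summary
**Changes made**
- Added a detailed description of the leftover-task check, including the fallback logic that checks pending tasks every minute
- Expanded the documentation of the scheduler test cases, covering start/shutdown tests, query filtering tests, and frequency calculation tests
- Extended the performance optimization section with notes on leftover-task handling and resource management
- Updated the troubleshooting guide with steps for handling leftover tasks stuck in an abnormal state
- Expanded the analysis of the scheduler design, including the dual-scheduler mode and event loop compatibility
## Table of Contents
1. [Introduction](#引言)
2. [Project Structure](#项目结构)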
@ -33,6 +42,8 @@
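## Introduction
This document is written for both technical and non-technical readers of the task scheduling system. It describes the asynchronous task scheduling architecture built on APScheduler, covering scheduler configuration, task queue management, and concurrency control; details the query task lifecycle (creation, status tracking, execution monitoring, error recovery); documents the asynchronous task processing flow (dispatch, priority, and resource management); presents performance optimization strategies, monitoring metrics, and failure handling; and provides configuration options, extension methods, and debugging tips.
**Update** This revision fills in scheduler design details and adds the leftover-task check and a detailed description of the test cases.
## Project Structure
The backend uses a FastAPI + SQLAlchemy Async architecture. The scheduling system lives in the workers submodule and drives periodic query tasks from the Query model. CitationEngine runs the platform adapters (Kimi, Wenxin Yiyan) through a single entry point, persists the results as CitationRecord rows, and maintains QueryTask status.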
@ -40,25 +51,25 @@
graph TB
subgraph "Application Entry"
MAIN["app/main.py<br/>Lifecycle management"]
end
END
subgraph "Scheduling Layer"
SCHED["workers/scheduler.py<br/>QueryScheduler"]
end
SCHED["workers/scheduler.py<br/>QueryScheduler<br/>Dual-scheduler mode"]
END
subgraph "Business Logic"
CE["workers/citation_engine.py<br/>CitationEngine"]
SVC["services/query.py<br/>Query service"]
end
END
subgraph "Models and Storage"
Q["models/query.py<br/>Query model"]
QT["models/query_task.py<br/>Task model"]
CR["models/citation_record.py<br/>Citation record model"]
DB["database.py<br/>Async session"]
end
END
subgraph "Platform Adapters"
BASE["workers/platforms/base.py<br/>Adapter base class"]
KIMI["workers/platforms/kimi.py<br/>Kimi adapter"]
WENXIN["workers/platforms/wenxin.py<br/>Wenxin Yiyan adapter"]
end
END
MAIN --> SCHED
SCHED --> CE
CE --> KIMI
@ -72,7 +83,7 @@ SCHED --> DB
CE --> DB
```
Diagram sources
**Diagram sources**
- [backend/app/main.py:13-22](file://backend/app/main.py#L13-L22)
- [backend/app/workers/scheduler.py:25-95](file://backend/app/workers/scheduler.py#L25-L95)
- [backend/app/workers/citation_engine.py:148-309](file://backend/app/workers/citation_engine.py#L148-L309)
@ -85,7 +96,7 @@ CE --> DB
- [backend/app/workers/platforms/kimi.py:11-206](file://backend/app/workers/platforms/kimi.py#L11-L206)
- [backend/app/workers/platforms/wenxin.py:11-205](file://backend/app/workers/platforms/wenxin.py#L11-L205)
Section sources
**Section sources**
- [backend/app/main.py:13-22](file://backend/app/main.py#L13-L22)
- [backend/app/workers/scheduler.py:25-95](file://backend/app/workers/scheduler.py#L25-L95)
- [backend/app/workers/citation_engine.py:148-309](file://backend/app/workers/citation_engine.py#L148-L309)
@ -99,14 +110,16 @@ CE --> DB
- [backend/app/workers/platforms/wenxin.py:11-205](file://backend/app/workers/platforms/wenxin.py#L11-L205)
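## Core Components
- Scheduler: built on APScheduler's AsyncIOScheduler; periodically scans for due queries and triggers their execution.
- Scheduler: built on APScheduler's AsyncIOScheduler in a dual-scheduler mode; periodically scans for due queries and triggers their execution, and also checks for leftover pending tasks every minute.
- Engine: CitationEngine handles cross-platform querying, brand matching, competitor brand detection, task status updates, and result persistence.
- Platform adapters: KimiAdapter and WenxinAdapter use Playwright for web page interaction and response extraction.
- Data models: Query, QueryTask, and CitationRecord support the task lifecycle and result storage.
- Services and API: the query service and the query API routes handle user-facing query management and frequency control.
- Database: SQLAlchemy Async Engine + Session, providing unified transaction and connection management.
Section sources
**Update** Added the leftover-task check; the dual-scheduler mode improves the system's fault tolerance and reliability.
**Section sources**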
- [backend/app/workers/scheduler.py:25-95](file://backend/app/workers/scheduler.py#L25-L95)
- [backend/app/workers/citation_engine.py:148-309](file://backend/app/workers/citation_engine.py#L148-L309)
- [backend/app/workers/platforms/kimi.py:11-206](file://backend/app/workers/platforms/kimi.py#L11-L206)
@ -118,7 +131,7 @@ CE --> DB
- [backend/app/database.py:1-29](file://backend/app/database.py#L1-L29)
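## Architecture Overview
The scheduling system follows a “timed scan + asynchronous execution + platform adaptation + result persistence” pipeline. The status and time fields on Query drive the execution cadence, QueryTask records the status of each platform run, and CitationRecord stores the final detection result.
The scheduling system follows a "timed scan + asynchronous execution + platform adaptation + result persistence" pipeline. The status and time fields on Query drive the execution cadence, QueryTask records the status of each platform run, and CitationRecord stores the final detection result. The newly added leftover-task check provides an additional layer of fault tolerance.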
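The "timed scan + asynchronous execution" part of this pipeline, together with per-query error isolation, can be sketched in plain asyncio. This is a minimal illustration under stated assumptions, not the project's code: `DueQuery`, `fake_execute`, and `scan_and_execute` are hypothetical names, and the real scheduler reads Query rows from the database and delegates to CitationEngine.

```python
import asyncio
import logging
from dataclasses import dataclass

logger = logging.getLogger("scheduler_sketch")


@dataclass
class DueQuery:
    """Hypothetical stand-in for a Query row that is already due."""
    id: int
    keyword: str


async def fake_execute(query: DueQuery) -> str:
    """Hypothetical stand-in for the engine call; query 2 fails on purpose."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    if query.id == 2:
        raise RuntimeError("simulated platform failure")
    return f"{query.keyword}: ok"


async def scan_and_execute(due_queries: list[DueQuery]) -> tuple[list[str], list[int]]:
    """Run each due query in turn; log failures and keep going."""
    results: list[str] = []
    failed_ids: list[int] = []
    for query in due_queries:
        try:
            results.append(await fake_execute(query))
        except Exception:
            # One failing query must not abort the rest of the scan.
            logger.exception("query %s failed; continuing", query.id)
            failed_ids.append(query.id)
    return results, failed_ids


batch = [DueQuery(1, "alpha"), DueQuery(2, "beta"), DueQuery(3, "gamma")]
results, failed_ids = asyncio.run(scan_and_execute(batch))
```

Catching the exception inside the loop, rather than around it, is what keeps one failing query from stopping the remaining ones.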
```mermaid
sequenceDiagram
@ -127,7 +140,7 @@ participant Scheduler as "QueryScheduler"
participant DB as "Database<br/>AsyncSession"
participant Engine as "CitationEngine"
participant Platform as "Platform adapter<br/>Kimi/Wenxin"
Timer->>Scheduler : "Periodic trigger"
Timer->>Scheduler : "Hourly trigger"
Scheduler->>DB : "Select Query rows that are active with next_query_at<=now"
Scheduler->>Engine : "Run execute_query(query) one by one"
Engine->>DB : "Get or create QueryTask and set it to running"
@ -138,10 +151,16 @@ Engine->>DB : "写入 CitationRecord"
Engine->>DB : "Update QueryTask to success/fail"
Engine->>DB : "Update last_queried_at/next_query_at on Query"
Engine-->>Scheduler : "Return the records for this batch"
Note over Timer,Scheduler : Additional leftover-task check
Timer->>Scheduler : "Every-minute trigger"
Scheduler->>DB : "Select QueryTask rows that are pending with scheduled_at<=1 minute ago"
Scheduler->>Engine : "Re-execute leftover tasks"
Engine->>DB : "Update QueryTask status and write results"
```
Diagram sources
**Diagram sources**
- [backend/app/workers/scheduler.py:30-90](file://backend/app/workers/scheduler.py#L30-L90)
- [backend/app/workers/scheduler.py:95-172](file://backend/app/workers/scheduler.py#L95-L172)
- [backend/app/workers/citation_engine.py:159-234](file://backend/app/workers/citation_engine.py#L159-L234)
- [backend/app/models/query.py:24-31](file://backend/app/models/query.py#L24-L31)
- [backend/app/models/query_task.py:24-32](file://backend/app/models/query_task.py#L24-L32)
@ -150,30 +169,39 @@ Engine-->>Scheduler : "返回本次批次记录"
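## Detailed Component Analysis
### Scheduler: QueryScheduler
- Startup and registration: uses AsyncIOScheduler to register an hourly job with the ID “check_queries” and the name “检查并执行到期的查询任务” (check and execute due query tasks); replace_existing=true ensures repeated starts do not conflict.
- Event loop compatibility: _run_check is a synchronous wrapper that first tries to obtain the running event loop and otherwise starts a new one with asyncio.run, so the job can run in different runtime environments.
- Scan and execution: check_and_execute_queries queries the database asynchronously, selects Query rows that are active and whose next_query_at has been reached, and calls _execute_single_query for each one.
- Error handling: an exception in a single query is logged and the scan moves on to the next one, so a single failure does not affect the whole scan.
- Startup and registration: uses AsyncIOScheduler to register two timed jobs, an hourly check for due query tasks and an every-minute check for leftover pending tasks; replace_existing=true ensures repeated starts do not conflict.
- Event loop compatibility: _run_check and _run_pending_tasks_check are each synchronous wrappers that first try to obtain the running event loop and otherwise start a new one with asyncio.run, so the jobs can run in different runtime environments.
- Main scan and execution: check_and_execute_queries queries the database asynchronously, selects Query rows that are active and whose next_query_at has been reached, and calls _execute_single_query for each one.
- Leftover-task check: check_and_execute_pending_tasks is a fallback that handles pending tasks that have still not run after more than 1 minute, grouping them by query_id and re-executing them.
- Error handling: an exception in a single query is logged and the scan moves on to the next one, so a single failure does not affect the whole scan; when a leftover task fails on re-execution, the error is recorded and the task is marked failed.
- Shutdown: shutdown calls scheduler.shutdown(wait=False) and engine.close() to make sure resources are released.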
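The event-loop compatibility wrapper and the leftover-task selection described above can be sketched with the standard library alone. The names below (`PendingTask`, `group_stale_pending`, `run_from_sync`) are hypothetical and only illustrate the idea; the actual scheduler works on QueryTask rows through an AsyncSession, and its wrapper may differ in detail.

```python
import asyncio
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class PendingTask:
    """Hypothetical stand-in for a QueryTask row."""
    id: int
    query_id: int
    status: str
    scheduled_at: datetime


def group_stale_pending(tasks, now, max_age=timedelta(minutes=1)):
    """Group pending tasks scheduled at least `max_age` ago by query_id."""
    cutoff = now - max_age
    grouped = defaultdict(list)
    for task in tasks:
        if task.status == "pending" and task.scheduled_at <= cutoff:
            grouped[task.query_id].append(task.id)
    return dict(grouped)


def run_from_sync(coro_factory):
    """Run a coroutine from a synchronous job callback.

    If an event loop is already running in this thread, schedule the
    coroutine on it and return the Task; otherwise start a fresh loop
    with asyncio.run and return the coroutine's result.
    """
    try:
        loop = asyncio.get_running_loop()
    except RuntimeError:
        return asyncio.run(coro_factory())
    return loop.create_task(coro_factory())


NOW = datetime(2026, 4, 23, 12, 0, tzinfo=timezone.utc)
TASKS = [
    PendingTask(1, 10, "pending", NOW - timedelta(minutes=5)),
    PendingTask(2, 10, "pending", NOW - timedelta(minutes=3)),
    PendingTask(3, 11, "pending", NOW - timedelta(seconds=20)),  # too recent
    PendingTask(4, 12, "running", NOW - timedelta(minutes=9)),   # not pending
]


async def check_pending():
    return group_stale_pending(TASKS, NOW)


# No loop is running at module level, so the asyncio.run branch is taken.
stale = run_from_sync(check_pending)
```

Grouping by `query_id` lets each affected query be re-executed once, rather than once per leftover task.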
```mermaid
flowchart TD
Start(["Start scheduler"]) --> AddJob["Register timed job<br/>Hourly trigger"]
AddJob --> StartSched["Start AsyncIOScheduler"]
StartSched --> Loop["Periodic trigger"]
Loop --> Scan["Query database<br/>Select due Query rows"]
Start(["Start scheduler"]) --> AddJobs["Register two timed jobs<br/>Hourly check for due tasks<br/>Every-minute check for leftover tasks"]
AddJobs --> StartSched["Start AsyncIOScheduler"]
StartSched --> HourlyLoop["Hourly trigger"]
HourlyLoop --> Scan["Query database<br/>Select due Query rows"]
Scan --> HasQ{"Any queries to execute?"}
HasQ -- No --> Loop
HasQ -- No --> MinuteLoop["Wait for next minute"]
HasQ -- Yes --> ExecOne["Run _execute_single_query one by one"]
ExecOne --> NextQ["Continue to next"]
NextQ --> Loop
NextQ --> HasQ
MinuteLoop --> PendingCheck["Every-minute check<br/>for leftover pending tasks"]
PendingCheck --> HasPending{"Any leftover tasks?"}
HasPending -- No --> HourlyLoop
HasPending -- Yes --> ReExec["Re-execute leftover tasks"]
ReExec --> UpdateStatus["Update task status and write results"]
UpdateStatus --> HasPending
```
Diagram sources
**Diagram sources**
- [backend/app/workers/scheduler.py:30-90](file://backend/app/workers/scheduler.py#L30-L90)
- [backend/app/workers/scheduler.py:95-172](file://backend/app/workers/scheduler.py#L95-L172)
Section sources
**Section sources**
- [backend/app/workers/scheduler.py:25-95](file://backend/app/workers/scheduler.py#L25-L95)
- [backend/app/workers/scheduler.py:95-172](file://backend/app/workers/scheduler.py#L95-L172)
### Engine: CitationEngine
- Single-query execution: execute_query takes a Query and an AsyncSession, creates a BrandMatcher, and iterates over Query.platforms, executing on each platform in turn.
@ -204,11 +232,11 @@ CitationEngine --> BrandMatcher : "使用"
CitationEngine --> CompetitorDetector : "uses"
```
Diagram sources
**Diagram sources**
- [backend/app/workers/citation_engine.py:148-309](file://backend/app/workers/citation_engine.py#L148-L309)
- [backend/app/workers/citation_engine.py:19-120](file://backend/app/workers/citation_engine.py#L19-L120)
Section sources
**Section sources**
- [backend/app/workers/citation_engine.py:148-309](file://backend/app/workers/citation_engine.py#L148-L309)
### Platform Adapters: KimiAdapter and WenxinAdapter
@ -233,12 +261,12 @@ AD->>AD : "_wait_for_response_stable()"
AD-->>CE : "Return raw response text"
```
Diagram sources
**Diagram sources**
- [backend/app/workers/platforms/kimi.py:33-125](file://backend/app/workers/platforms/kimi.py#L33-L125)
- [backend/app/workers/platforms/wenxin.py:33-124](file://backend/app/workers/platforms/wenxin.py#L33-L124)
- [backend/app/workers/platforms/base.py:4-18](file://backend/app/workers/platforms/base.py#L4-L18)
Section sources
**Section sources**
- [backend/app/workers/platforms/kimi.py:11-206](file://backend/app/workers/platforms/kimi.py#L11-L206)
- [backend/app/workers/platforms/wenxin.py:11-205](file://backend/app/workers/platforms/wenxin.py#L11-L205)
- [backend/app/workers/platforms/base.py:4-18](file://backend/app/workers/platforms/base.py#L4-L18)
@ -290,12 +318,12 @@ QUERIES ||--o{ QUERY_TASKS : "包含"
QUERIES ||--o{ CITATION_RECORDS : "produces"
```
Diagram sources
**Diagram sources**
- [backend/app/models/query.py:11-55](file://backend/app/models/query.py#L11-L55)
- [backend/app/models/query_task.py:11-39](file://backend/app/models/query_task.py#L11-L39)
- [backend/app/models/citation_record.py:11-42](file://backend/app/models/citation_record.py#L11-L42)
Section sources
**Section sources**
- [backend/app/models/query.py:11-55](file://backend/app/models/query.py#L11-L55)
- [backend/app/models/query_task.py:11-39](file://backend/app/models/query_task.py#L11-L39)
- [backend/app/models/citation_record.py:11-42](file://backend/app/models/citation_record.py#L11-L42)
@ -304,7 +332,7 @@ QUERIES ||--o{ CITATION_RECORDS : "产生"
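- Service layer: provides create, read, update, and delete for queries, enforces the query count limit, and recalculates next_query_at when the frequency changes.
- API layer: provides list, create, get, update, and delete endpoints for queries, with access control and pagination parameters.
Section sources
**Section sources**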
- [backend/app/services/query.py:12-130](file://backend/app/services/query.py#L12-L130)
- [backend/app/api/queries.py:15-86](file://backend/app/api/queries.py#L15-L86)
@ -334,7 +362,7 @@ API["api/queries.py"] --> SVC["services/query.py"]
SVC --> D
```
Diagram sources
**Diagram sources**
- [backend/app/workers/scheduler.py:25-95](file://backend/app/workers/scheduler.py#L25-L95)
- [backend/app/workers/citation_engine.py:148-309](file://backend/app/workers/citation_engine.py#L148-L309)
- [backend/app/workers/platforms/kimi.py:11-206](file://backend/app/workers/platforms/kimi.py#L11-L206)
@ -348,7 +376,7 @@ SVC --> D
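## Performance Considerations
- Scheduling frequency and concurrency
- The scheduler currently scans once an hour, which suits low to medium concurrency; for higher throughput, consider shortening the interval or introducing multiple processes or instances.
- The scheduler currently uses a dual-scheduler mode: it scans for due queries every hour and checks leftover tasks every minute, which suits low to medium concurrency; for higher throughput, consider shortening the interval or introducing multiple processes or instances.
- Database access
- Scan queries compare against UTC time; consider adding an index on next_query_at at the database level to avoid full table scans.
- Asynchronous execution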
@ -359,12 +387,18 @@ SVC --> D
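- The browser and Playwright lifecycles are managed strictly and closed in the correct order to avoid memory and handle leaks.
- Caching and deduplication
- A result cache (such as Redis) could be introduced at the CitationEngine layer to reduce the cost of repeated queries, with deduplication based on a unique key (keyword + platform + time window).
- Leftover-task handling
- The new every-minute leftover-task check adds fault tolerance, so tasks still run within a reasonable time even if the main scheduler has problems.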
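The "keyword + platform + time window" key suggested above can be sketched as follows. This is only an illustration under stated assumptions: `dedup_key` and `ResultCache` are hypothetical names, a one-hour window is an arbitrary choice, and an in-memory dict stands in for an external cache such as Redis.

```python
from datetime import datetime, timezone


def dedup_key(keyword: str, platform: str, when: datetime, window_seconds: int = 3600) -> str:
    """Build a cache key from keyword + platform + time-window bucket."""
    bucket = int(when.timestamp()) // window_seconds
    return f"{keyword.strip().lower()}|{platform}|{bucket}"


class ResultCache:
    """In-memory stand-in for an external cache (assumption: dict-backed)."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    def get_or_compute(self, key: str, compute):
        # Only call the expensive function when the key has not been seen.
        if key not in self._store:
            self._store[key] = compute()
        return self._store[key]


calls = []


def expensive_lookup() -> str:
    """Hypothetical stand-in for a platform query."""
    calls.append(1)
    return "raw response"


cache = ResultCache()
t1 = datetime(2026, 4, 23, 12, 5, tzinfo=timezone.utc)
t2 = datetime(2026, 4, 23, 12, 55, tzinfo=timezone.utc)  # same hourly window as t1
t3 = datetime(2026, 4, 23, 13, 5, tzinfo=timezone.utc)   # next hourly window

for moment in (t1, t2, t3):
    cache.get_or_compute(dedup_key("GEO tools", "kimi", moment), expensive_lookup)
```

Bucketing the timestamp means two requests inside the same window share one key, so only the first one pays for the platform query.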
**Update** Added performance considerations for the leftover-task handling mechanism, which improves overall system reliability.
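## Troubleshooting Guide
- Scheduler not started
- Check whether start() is called in lifespan and whether the production deployment is correct.
- Queries not executed
- Verify that Query.status is active and that next_query_at has been reached; confirm that the database time zone is consistent with UTC.
- Leftover task anomalies
- Check whether QueryTask rows stay in pending for a long time; confirm that the every-minute leftover-task check is working; look in the logs for records of leftover tasks being re-executed.
- Platform adapter errors
- Playwright not installed: follow the adapter's error message and run the install command. Network timeouts: adjust the response-stability wait threshold and the timeout parameters.
- Abnormal task status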
@ -372,14 +406,19 @@ SVC --> D
- Missing results
- Check the CitationRecord write logic and the QueryTask success branch; the failure branch also writes a placeholder record with cited=False.
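For the "queries not executed" item, a common cause is comparing naive and timezone-aware timestamps. The sketch below shows one way to normalize both sides to UTC before comparing. It assumes that naive values were stored as UTC, which may not match the real schema; `to_utc` and `is_reached` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone


def to_utc(value: datetime) -> datetime:
    """Return an aware UTC datetime.

    Assumption: naive values were stored as UTC, so they are tagged
    rather than shifted. Aware values are converted.
    """
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc)


def is_reached(next_query_at: datetime, now: datetime) -> bool:
    """Compare two timestamps safely even if one is naive and one is aware."""
    return to_utc(next_query_at) <= to_utc(now)


beijing = timezone(timedelta(hours=8))
local_now = datetime(2026, 4, 23, 20, 10, 58, tzinfo=beijing)  # 12:10:58 UTC
naive_due = datetime(2026, 4, 23, 12, 0, 0)                    # treated as 12:00 UTC
naive_future = datetime(2026, 4, 23, 19, 0, 0)                 # 19:00 UTC, still ahead
```

Note that `naive_future` reads as earlier than the local wall-clock time but is still in the future once both sides are in UTC, which is the kind of mismatch that makes a query look overdue or never due.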
Section sources
**Update** Added troubleshooting guidance for leftover tasks.
**Section sources**
- [backend/app/workers/scheduler.py:42-90](file://backend/app/workers/scheduler.py#L42-L90)
- [backend/app/workers/scheduler.py:95-172](file://backend/app/workers/scheduler.py#L95-L172)
- [backend/app/workers/citation_engine.py:175-234](file://backend/app/workers/citation_engine.py#L175-L234)
- [backend/app/workers/platforms/kimi.py:21-48](file://backend/app/workers/platforms/kimi.py#L21-L48)
- [backend/app/workers/platforms/wenxin.py:21-48](file://backend/app/workers/platforms/wenxin.py#L21-L48)
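## Conclusion
With lightweight, clearly separated modules, the scheduling system implements a complete loop of “timed scan + asynchronous execution + platform adaptation + result persistence”. The three-layer state and data model of Query/QueryTask/CitationRecord gives the system good observability and extensibility. For high-concurrency scenarios, consider introducing parallelization and caching, and keep improving monitoring and alerting.
With lightweight, clearly separated modules, the scheduling system implements a complete loop of "timed scan + asynchronous execution + platform adaptation + result persistence". The three-layer state and data model of Query/QueryTask/CitationRecord gives the system good observability and extensibility. The new dual-scheduler mode and leftover-task check further improve reliability and fault tolerance. For high-concurrency scenarios, consider introducing parallelization and caching, and keep improving monitoring and alerting.
**Update** This revision fills in scheduler design details and strengthens the system's fault tolerance and reliability.
## Appendix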
@ -388,7 +427,7 @@ SVC --> D
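- Logging and middleware: FastAPI CORS configuration (allows cross-origin requests from the local frontend)
- Runtime lifecycle: lifespan starts the scheduler on application startup and shuts it down gracefully on exit
Section sources
**Section sources**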
- [backend/app/config.py:7-14](file://backend/app/config.py#L7-L14)
- [backend/app/main.py:24-42](file://backend/app/main.py#L24-L42)
- [backend/app/main.py:13-22](file://backend/app/main.py#L13-L22)
@ -401,7 +440,7 @@ SVC --> D
- Result aggregation and reporting
- Build statistical views on CitationRecord and QueryTask to produce trend and failure-rate reports.
Section sources
**Section sources**
- [backend/app/workers/platforms/base.py:4-18](file://backend/app/workers/platforms/base.py#L4-L18)
- [backend/app/workers/citation_engine.py:152-157](file://backend/app/workers/citation_engine.py#L152-L157)
- [backend/app/models/citation_record.py:11-42](file://backend/app/models/citation_record.py#L11-L42)
@ -411,7 +450,36 @@ SVC --> D
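- Enable database echo: turn on echo when creating the database engine (currently off, which keeps production logs quiet)
- Verify step by step: first the scheduler scan logic, then a single platform adapter, and finally the full CitationEngine flow
- Unit tests: use test fixtures to mock Query objects and verify API and service-layer behavior
- Scheduler tests: use dedicated test cases to verify scheduler start/shutdown, query filtering, and frequency calculation
Section sources
**Update** Added debugging tips for scheduler testing.
**Section sources**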
- [backend/app/database.py:6-10](file://backend/app/database.py#L6-L10)
- [tests/test_queries.py:10-154](file://tests/test_queries.py#L10-L154)
- [tests/test_scheduler.py:17-123](file://tests/test_scheduler.py#L17-L123)
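### Scheduler Test Details
#### Start/Shutdown Tests
Verify that the scheduler starts and shuts down correctly, including:
- Registration and naming of the scheduled jobs
- Correct shutdown of engine resources
- Safety of starting jobs repeatedly
#### Query Task Filtering Tests
Verify that the scheduler correctly selects the query tasks to execute:
- Tasks that are active and whose next_query_at has been reached are executed
- Tasks scheduled in the future are not executed by mistake
- Tasks in the paused state are not executed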
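The three filtering cases above can be expressed as a small predicate and checked directly. This is a sketch only: `QueryRow` and `select_due` are hypothetical names, and the real tests in tests/test_scheduler.py run against the scheduler and database fixtures rather than plain dataclasses.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class QueryRow:
    """Hypothetical stand-in for the Query model."""
    id: int
    status: str
    next_query_at: datetime


def select_due(queries, now):
    """Return the ids of queries that are active and whose time has come."""
    return [q.id for q in queries if q.status == "active" and q.next_query_at <= now]


NOW = datetime(2026, 4, 23, 12, 0, tzinfo=timezone.utc)
ROWS = [
    QueryRow(1, "active", NOW - timedelta(minutes=1)),  # due: should run
    QueryRow(2, "active", NOW + timedelta(hours=1)),    # future: should not run
    QueryRow(3, "paused", NOW - timedelta(days=1)),     # paused: should not run
]
due_ids = select_due(ROWS, NOW)
```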
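#### Frequency Calculation Tests
Verify that the frequency mapping is correct:
- daily frequency: next_query_at advances by 1 day
- weekly frequency: next_query_at advances by 7 days
- default frequency: next_query_at advances by 7 days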
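The mapping listed above amounts to a lookup with a 7-day fallback. A minimal sketch, assuming the frequency is stored as a string; `FREQUENCY_DELTAS` and `compute_next_query_at` are illustrative names, not the project's:

```python
from datetime import datetime, timedelta, timezone

# Known frequencies; anything else falls back to the weekly default.
FREQUENCY_DELTAS = {
    "daily": timedelta(days=1),
    "weekly": timedelta(days=7),
}
DEFAULT_DELTA = timedelta(days=7)


def compute_next_query_at(last_run: datetime, frequency: str) -> datetime:
    """Return the next run time for the given frequency."""
    return last_run + FREQUENCY_DELTAS.get(frequency, DEFAULT_DELTA)


BASE = datetime(2026, 4, 23, 12, 0, tzinfo=timezone.utc)
next_daily = compute_next_query_at(BASE, "daily")
next_weekly = compute_next_query_at(BASE, "weekly")
next_unknown = compute_next_query_at(BASE, "monthly")  # not in the map: uses the default
```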
**New** Describes in detail the design of the scheduler test cases and the key points they verify.
**Section sources**
- [tests/test_scheduler.py:17-123](file://tests/test_scheduler.py#L17-L123)


@ -21,17 +21,25 @@
- [app/(dashboard)/dashboard/citations/page.tsx](file://frontend/app/(dashboard)/dashboard/citations/page.tsx)
</cite>
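## Update Summary
**Changes made**
- Added an analysis of how UI components are actually used in the dashboard pages
- Expanded the concrete use cases for the button, input, select, dialog, table, and other components
- Added patterns for combining components in real business scenarios
- Improved the best practices for component accessibility and state management
## Table of Contents
1. [Introduction](#简介)
2. [Project Structure](#项目结构)
3. [Core Components](#核心组件)
4. [Architecture Overview](#架构概览)
5. [Detailed Component Analysis](#详细组件分析)
6. [Dependency Analysis](#依赖关系分析)
7. [Performance Considerations](#性能考虑)
8. [Troubleshooting Guide](#故障排除指南)
9. [Conclusion](#结论)
10. [Appendix](#附录)
6. [Practical Examples](#实际应用示例)
7. [Dependency Analysis](#依赖关系分析)
8. [Performance Considerations](#性能考虑)
9. [Troubleshooting Guide](#故障排除指南)
10. [Conclusion](#结论)
11. [Appendix](#附录)
## Introduction
This UI component library is built on Radix UI and uses Tailwind CSS to provide consistent, accessible, and customizable base components. The components follow these design principles: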
@ -91,8 +99,8 @@ BADGE --> UTILS
**Diagram sources**
- [app/layout.tsx:1-37](file://frontend/app/layout.tsx#L1-L37)
- [components/providers.tsx:1-9](file://frontend/components/providers.tsx#L1-L9)
- [app/(dashboard)/dashboard/page.tsx](file://frontend/app/(dashboard)/dashboard/page.tsx#L1-L156)
- [app/(dashboard)/dashboard/citations/page.tsx](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L1-L282)
- [app/(dashboard)/dashboard/page.tsx](file://frontend/app/(dashboard)/dashboard/page.tsx#L1-L227)
- [app/(dashboard)/dashboard/citations/page.tsx](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L1-L294)
- [lib/utils.ts:1-7](file://frontend/lib/utils.ts#L1-L7)
**Section sources**
@ -106,19 +114,19 @@ BADGE --> UTILS
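- Function: handles click actions, with multiple variants and sizes
- Key props: variant (appearance), size, asChild (semantic rendering)
- Accessibility: inherits native button semantics; supports focus and keyboard activation
- Usage example path: [Button usage example](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L187-L192)
- Usage example path: [Button usage example](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L199-L204)
- Input
- Function: text input, with disabled and focus styles
- Key props: type, className, and other native attributes are passed through
- Accessibility: native semantics; pair with Label for better accessibility
- Usage example path: [Input usage example](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L170-L184)
- Usage example path: [Input usage example](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L182-L196)
- Select
- Function: dropdown selection, with scroll buttons and multi-level options
- Key props: trigger, content area, item, separator, scroll buttons
- Accessibility: keyboard navigation and focus management provided by Radix UI
- Usage example path: [Select usage example](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L138-L166)
- Usage example path: [Select usage example](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L150-L162)
- Dialog
- Function: modal dialog with overlay, content area, title, and description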
@ -135,12 +143,12 @@ BADGE --> UTILS
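- Card
- Function: container component with header, title, description, content, and footer
- Key props: generic HTML attributes are passed through
- Usage example path: [Card usage example](file://frontend/app/(dashboard)/dashboard/page.tsx#L106-L120)
- Usage example path: [Card usage example](file://frontend/app/(dashboard)/dashboard/page.tsx#L177-L191)
- Table
- Function: data table with header, body, footer, row, cell, and caption
- Key props: generic HTML attributes are passed through
- Usage example path: [Table usage example](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L217-L274)
- Usage example path: [Table usage example](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L229-L286)
- Tabs
- Function: tab switching, with list, trigger, and content area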
@ -150,12 +158,12 @@ BADGE --> UTILS
- Label
- Function: form control label, associated with an input control
- Key props: peer-disabled semantics based on Radix UI
- Usage example path: [Label usage example](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L137-L150)
- Usage example path: [Label usage example](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L149-L150)
- Badge
- Function: status or category marker
- Key props: variant (appearance)
- Usage example path: [Badge usage example](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L256-L266)
- Usage example path: [Badge usage example](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L268-L278)
**Section sources**
- [components/ui/button.tsx:1-57](file://frontend/components/ui/button.tsx#L1-L57)
@ -210,7 +218,7 @@ UTILS --> CLX
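- Complexity and performance
- O(1) rendering overhead; variant computation is done at compile time
- Usage example
- [Button usage example](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L187-L192)
- [Button usage example](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L199-L204)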
```mermaid
classDiagram
@ -274,7 +282,7 @@ C->>P : 关闭对话框
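- Keyboard navigation: arrow keys to move, Enter to confirm, Esc to go back
- Focus management: focuses the first item on open and returns focus to the trigger on close
- Usage example
- [Dropdown menu usage example](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L138-L166)
- [Dropdown menu usage example:1-201](file://frontend/components/ui/dropdown-menu.tsx#L1-L201)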
```mermaid
flowchart TD
@ -305,7 +313,7 @@ Confirm --> Close
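- Keyboard navigation: Tab to enter, arrow keys to select, Enter to confirm
- Screen readers: SelectValue and ItemText convey the current value and the options
- Usage example
- [Select usage example](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L138-L166)
- [Select usage example](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L150-L162)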
```mermaid
sequenceDiagram
@ -334,7 +342,7 @@ I->>T : 更新值并关闭
- Accessibility
- Clear table semantics, well suited to screen reader parsing
- Usage example
- [Table usage example](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L217-L274)
- [Table usage example](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L229-L286)
```mermaid
flowchart TD
@ -377,11 +385,11 @@ L->>C : 显示对应内容
**Section sources**
- [components/ui/tabs.tsx:1-56](file://frontend/components/ui/tabs.tsx#L1-L56)
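### Card
### Card (卡片)
- Design notes
- Separates the header, title, description, content, and footer areas for easy composition
- Usage example
- [Card usage example](file://frontend/app/(dashboard)/dashboard/page.tsx#L106-L120)
- [Card usage example](file://frontend/app/(dashboard)/dashboard/page.tsx#L177-L191)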
```mermaid
classDiagram
@ -410,7 +418,7 @@ Card --> CardFooter
- Design notes
- Works with controlled inputs through peer-disabled semantics
- Usage example
- [Label usage example](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L137-L150)
- [Label usage example](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L149-L150)
```mermaid
flowchart TD
@ -430,7 +438,7 @@ Disabled --> |否| LabelEnabled["标签启用样式"]
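- Design notes
- The variant system provides default, secondary, destructive, and outline appearances
- Usage example
- [Badge usage example](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L256-L266)
- [Badge usage example](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L268-L278)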
```mermaid
classDiagram
@ -445,6 +453,68 @@ class Badge {
**Section sources**
- [components/ui/badge.tsx:1-37](file://frontend/components/ui/badge.tsx#L1-L37)
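## Practical Examples
### Component Usage in the Dashboard Page
The dashboard page shows how the components combine in a real business scenario:
#### Statistics Card Composition
- **Components**: Card + CardHeader + CardTitle + CardContent
- **Use case**: displays key metrics such as query count, citation count, citation rate, and average position
- **Implementation notes**: uses dynamic icons and color schemes to strengthen the visual presentation
#### Chart Integration
- **Components**: Card + Chart components
- **Use case**: displays citation trends and platform comparison data
- **Implementation notes**: handles the empty-data state through conditional rendering
#### Complete Data Display Flow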
```mermaid
flowchart TD
Loading["Loading state"] --> Empty{"Is data empty?"}
Empty --> |Yes| EmptyState["Empty state display"]
Empty --> |No| DataDisplay["Data display"]
DataDisplay --> StatCards["Statistics cards"]
DataDisplay --> Charts["Charts"]
EmptyState --> CreateQuery["Prompt to create a query"]
```
**Diagram sources**
- [app/(dashboard)/dashboard/page.tsx:49-137](file://frontend/app/(dashboard)/dashboard/page.tsx#L49-L137)
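#### Component Usage in the Citation Records Page
The citation records page shows the components in a more complex data management scenario:
##### Filter Form Composition
- **Components**: Card + Label + Select + Input + Button
- **Use case**: filtering by query term, platform, and date range
- **Implementation notes**: responsive grid layout with form reset support
##### Data Table
- **Components**: Table + TableRow + TableCell + Badge
- **Use case**: displays the full list of citation detection results
- **Implementation notes**: supports horizontal scrolling; badges indicate status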
```mermaid
sequenceDiagram
participant User as "User"
participant Form as "Filter form"
participant API as "API service"
participant Table as "Data table"
User->>Form : Set filter criteria
Form->>API : Send filter request
API-->>Form : Return filtered results
Form->>Table : Update table data
Table->>User : Show filtered records
```
**Diagram sources**
- [app/(dashboard)/dashboard/citations/page.tsx:147-207](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L147-L207)
**Section sources**
- [app/(dashboard)/dashboard/page.tsx:1-227](file://frontend/app/(dashboard)/dashboard/page.tsx#L1-L227)
- [app/(dashboard)/dashboard/citations/page.tsx:1-294](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L1-L294)
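## Dependency Analysis
- Components rely on Radix UI for accessibility and state management
- Class name merging relies on clsx and tailwind-merge so that styles do not conflict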
@ -512,7 +582,7 @@ BADGE --> UTILS
- [lib/utils.ts:4-6](file://frontend/lib/utils.ts#L4-L6)
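## Conclusion
Built on Radix UI and combined with Tailwind CSS and a variant system, this UI component library offers a highly accessible, consistent, and easily extensible set of components. With clear composition patterns and strict styling conventions, it can support interfaces ranging from simple forms to complex data panels.
Built on Radix UI and combined with Tailwind CSS and a variant system, this UI component library offers a highly accessible, consistent, and easily extensible set of components. With clear composition patterns and strict styling conventions, it can support interfaces ranging from simple forms to complex data panels. The newly added dashboard page usage examples further demonstrate the components' practicality and flexibility in real business scenarios.
## Appendix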
@ -542,6 +612,6 @@ BADGE --> UTILS
- Keep transition durations and easing curves consistent
**Section sources**
- [app/(dashboard)/dashboard/page.tsx](file://frontend/app/(dashboard)/dashboard/page.tsx#L106-L152)
- [app/(dashboard)/dashboard/citations/page.tsx](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L135-L194)
- [app/(dashboard)/dashboard/page.tsx:1-227](file://frontend/app/(dashboard)/dashboard/page.tsx#L1-L227)
- [app/(dashboard)/dashboard/citations/page.tsx:1-294](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L1-L294)
- [tailwind.config.ts:10-54](file://frontend/tailwind.config.ts#L10-L54)


@ -17,8 +17,17 @@
- [frontend/lib/platforms.ts](file://frontend/lib/platforms.ts)
- [frontend/lib/utils.ts](file://frontend/lib/utils.ts)
- [frontend/components/charts/trend-chart.tsx](file://frontend/components/charts/trend-chart.tsx)
- [frontend/components/charts/platform-chart.tsx](file://frontend/components/charts/platform-chart.tsx)
</cite>
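## Update Summary
**Changes**
- Added a complete implementation of the dashboard page component system
- Updated the concrete implementations of the overview, query management, citation records, and report export pages
- Expanded the analysis and usage notes for the chart components
- Improved the API client's endpoint documentation and error handling
- Optimized page-level data fetching and state management
## Table of Contents
1. [Introduction](#引言)
2. [Project Structure](#项目结构)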
@ -34,6 +43,8 @@
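## Introduction
This document walks through the design and implementation of the GEO platform's frontend page components, covering the dashboard, query management, citation data, report export, and settings pages. Topics include page layout and navigation structure, user experience flows, page-level data fetching, state management and error boundary handling, inter-page navigation and route parameter passing, page lifecycle management, performance optimization and lazy loading, SEO configuration, and development conventions and best practices.
**Update** This revision reflects the complete implementation: all page components are finished and integrated into the Next.js application architecture, with full functionality for the overview, query management, citation records, report export, and settings pages.
## Project Structure
The frontend uses the Next.js App Router's route groups: authentication pages live in the `(auth)` group and dashboard pages in the `(dashboard)` group. The root layout handles global styles and Provider wrapping; the dashboard layout handles access checks and renders the sidebar and header navigation consistently.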
@ -69,22 +80,29 @@ pages --> settings_page["设置<br/>settings/page.tsx"]
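- UI component library
- Table component: wraps a responsive table container and the basic header/body/row/cell structure.
- Dialog component: a modal dialog built on Radix UI, with trigger, content, title, and description.
- Chart components
- Trend chart: a Recharts line chart showing citation trends over the past 30 days.
- Platform comparison chart: a Recharts bar chart comparing citation rates across platforms.
- Utilities and constants
- Platform mapping: maps platform keys to Chinese display names and provides the platform option list.
- Utility functions: a class name merging helper for combining Tailwind classes.
- API client: wraps auth headers, error handling, and the endpoints of each module (auth, queries, citations, reports).
**Update** Added a detailed analysis of the chart components, including data structure definitions, responsive container configuration, and interactions.
**Section sources**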
- [frontend/components/layout/header.tsx:1-30](file://frontend/components/layout/header.tsx#L1-L30)
- [frontend/components/layout/sidebar.tsx:1-54](file://frontend/components/layout/sidebar.tsx#L1-L54)
- [frontend/components/ui/table.tsx:1-118](file://frontend/components/ui/table.tsx#L1-L118)
- [frontend/components/ui/dialog.tsx:1-123](file://frontend/components/ui/dialog.tsx#L1-L123)
- [frontend/components/charts/trend-chart.tsx:1-60](file://frontend/components/charts/trend-chart.tsx#L1-L60)
- [frontend/components/charts/platform-chart.tsx:1-68](file://frontend/components/charts/platform-chart.tsx#L1-L68)
- [frontend/lib/platforms.ts:1-18](file://frontend/lib/platforms.ts#L1-L18)
- [frontend/lib/utils.ts:1-7](file://frontend/lib/utils.ts#L1-L7)
- [frontend/lib/api.ts:1-58](file://frontend/lib/api.ts#L1-L58)
- [frontend/lib/api.ts:1-79](file://frontend/lib/api.ts#L1-L79)
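## Architecture Overview
The overall design is layered: “layout layer + page layer + component layer + utility layer”. The page layer obtains a token from the client session and calls the API client to read and write data; the UI component layer provides reusable base controls; the utility layer provides common capabilities (class name merging, platform mapping, API wrapping); chart components are encapsulated independently and rendered on demand.
The overall design is layered: "layout layer + page layer + component layer + utility layer". The page layer obtains a token from the client session and calls the API client to read and write data; the UI component layer provides reusable base controls; the utility layer provides common capabilities (class name merging, platform mapping, API wrapping); chart components are encapsulated independently and rendered on demand.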
```mermaid
graph TB
@ -105,6 +123,7 @@ subgraph "组件层"
table["Table component"]
dialog["Dialog component"]
trend_chart["Trend chart component"]
platform_chart["Platform comparison chart component"]
end
subgraph "Utility Layer"
utils["Utility functions"]
@ -120,6 +139,7 @@ dashboard_layout --> citations_page
dashboard_layout --> reports_page
dashboard_layout --> settings_page
dashboard_page --> trend_chart
dashboard_page --> platform_chart
queries_page --> table
queries_page --> dialog
citations_page --> table
@ -130,6 +150,7 @@ api_client --> utils
table --> utils
dialog --> utils
trend_chart --> utils
platform_chart --> utils
```
**Diagram sources**
@ -137,17 +158,18 @@ trend_chart --> utils
- [frontend/app/(dashboard)/layout.tsx:1-27](file://frontend/app/(dashboard)/layout.tsx#L1-L27)
- [frontend/components/layout/header.tsx:1-30](file://frontend/components/layout/header.tsx#L1-L30)
- [frontend/components/layout/sidebar.tsx:1-54](file://frontend/components/layout/sidebar.tsx#L1-L54)
- [frontend/app/(dashboard)/dashboard/page.tsx:1-156](file://frontend/app/(dashboard)/dashboard/page.tsx#L1-L156)
- [frontend/app/(dashboard)/dashboard/queries/page.tsx:1-461](file://frontend/app/(dashboard)/dashboard/queries/page.tsx#L1-L461)
- [frontend/app/(dashboard)/dashboard/citations/page.tsx:1-282](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L1-L282)
- [frontend/app/(dashboard)/dashboard/reports/page.tsx:1-198](file://frontend/app/(dashboard)/dashboard/reports/page.tsx#L1-L198)
- [frontend/app/(dashboard)/dashboard/page.tsx:1-227](file://frontend/app/(dashboard)/dashboard/page.tsx#L1-L227)
- [frontend/app/(dashboard)/dashboard/queries/page.tsx:1-526](file://frontend/app/(dashboard)/dashboard/queries/page.tsx#L1-L526)
- [frontend/app/(dashboard)/dashboard/citations/page.tsx:1-294](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L1-L294)
- [frontend/app/(dashboard)/dashboard/reports/page.tsx:1-200](file://frontend/app/(dashboard)/dashboard/reports/page.tsx#L1-L200)
- [frontend/app/(dashboard)/dashboard/settings/page.tsx:1-172](file://frontend/app/(dashboard)/dashboard/settings/page.tsx#L1-L172)
- [frontend/components/ui/table.tsx:1-118](file://frontend/components/ui/table.tsx#L1-L118)
- [frontend/components/ui/dialog.tsx:1-123](file://frontend/components/ui/dialog.tsx#L1-L123)
- [frontend/components/charts/trend-chart.tsx:1-60](file://frontend/components/charts/trend-chart.tsx#L1-L60)
- [frontend/components/charts/platform-chart.tsx:1-68](file://frontend/components/charts/platform-chart.tsx#L1-L68)
- [frontend/lib/utils.ts:1-7](file://frontend/lib/utils.ts#L1-L7)
- [frontend/lib/platforms.ts:1-18](file://frontend/lib/platforms.ts#L1-L18)
- [frontend/lib/api.ts:1-58](file://frontend/lib/api.ts#L1-L58)
- [frontend/lib/api.ts:1-79](file://frontend/lib/api.ts#L1-L79)
## Detailed Component Analysis
@ -185,16 +207,17 @@ end
```
**Diagram sources**
- [frontend/app/(dashboard)/dashboard/page.tsx:20-44](file://frontend/app/(dashboard)/dashboard/page.tsx#L20-L44)
- [frontend/lib/api.ts:46-49](file://frontend/lib/api.ts#L46-L49)
- [frontend/app/(dashboard)/dashboard/page.tsx:29-47](file://frontend/app/(dashboard)/dashboard/page.tsx#L29-L47)
- [frontend/lib/api.ts:67-70](file://frontend/lib/api.ts#L67-L70)
**Section sources**
- [frontend/app/(dashboard)/dashboard/page.tsx:1-156](file://frontend/app/(dashboard)/dashboard/page.tsx#L1-L156)
- [frontend/app/(dashboard)/dashboard/page.tsx:1-227](file://frontend/app/(dashboard)/dashboard/page.tsx#L1-L227)
- [frontend/components/charts/trend-chart.tsx:1-60](file://frontend/components/charts/trend-chart.tsx#L1-L60)
- [frontend/components/charts/platform-chart.tsx:1-68](file://frontend/components/charts/platform-chart.tsx#L1-L68)
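### Query Management Page
- Page responsibilities
- Lists query terms and supports add, edit, delete, and “run now” actions.
- Lists query terms and supports add, edit, delete, and "run now" actions.
- Provides configuration options such as platform multi-select, frequency selection, and brand alias input.
- Data flow
- The client loads the query term list; creates and edits are written via POST and PUT respectively; deletes go through DELETE.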
@ -223,11 +246,11 @@ Empty --> AddEdit
```
**Diagram sources**
- [frontend/app/(dashboard)/dashboard/queries/page.tsx:79-170](file://frontend/app/(dashboard)/dashboard/queries/page.tsx#L79-L170)
- [frontend/lib/api.ts:37-45](file://frontend/lib/api.ts#L37-L45)
- [frontend/app/(dashboard)/dashboard/queries/page.tsx:143-156](file://frontend/app/(dashboard)/dashboard/queries/page.tsx#L143-L156)
- [frontend/lib/api.ts:56-66](file://frontend/lib/api.ts#L56-L66)
**章节来源**
- [frontend/app/(dashboard)/dashboard/queries/page.tsx:1-461](file://frontend/app/(dashboard)/dashboard/queries/page.tsx#L1-L461)
- [frontend/app/(dashboard)/dashboard/queries/page.tsx:1-526](file://frontend/app/(dashboard)/dashboard/queries/page.tsx#L1-L526)
- [frontend/components/ui/dialog.tsx:1-123](file://frontend/components/ui/dialog.tsx#L1-L123)
- [frontend/lib/platforms.ts:1-18](file://frontend/lib/platforms.ts#L1-L18)
@ -262,11 +285,11 @@ A-->>P : 渲染表格
```
**图示来源**
- [frontend/app/(dashboard)/dashboard/citations/page.tsx:45-98](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L45-L98)
- [frontend/lib/api.ts:46-49](file://frontend/lib/api.ts#L46-L49)
- [frontend/app/(dashboard)/dashboard/citations/page.tsx:75-105](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L75-L105)
- [frontend/lib/api.ts:67-70](file://frontend/lib/api.ts#L67-L70)
**章节来源**
- [frontend/app/(dashboard)/dashboard/citations/page.tsx:1-282](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L1-L282)
- [frontend/app/(dashboard)/dashboard/citations/page.tsx:1-294](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L1-L294)
- [frontend/components/ui/table.tsx:1-118](file://frontend/components/ui/table.tsx#L1-L118)
### 报告导出页面
@ -300,11 +323,11 @@ end
```
**图示来源**
- [frontend/app/(dashboard)/dashboard/reports/page.tsx:25-93](file://frontend/app/(dashboard)/dashboard/reports/page.tsx#L25-L93)
- [frontend/lib/api.ts:51-56](file://frontend/lib/api.ts#L51-L56)
- [frontend/app/(dashboard)/dashboard/reports/page.tsx:50-94](file://frontend/app/(dashboard)/dashboard/reports/page.tsx#L50-L94)
- [frontend/lib/api.ts:72-77](file://frontend/lib/api.ts#L72-L77)
**章节来源**
- [frontend/app/(dashboard)/dashboard/reports/page.tsx:1-198](file://frontend/app/(dashboard)/dashboard/reports/page.tsx#L1-L198)
- [frontend/app/(dashboard)/dashboard/reports/page.tsx:1-200](file://frontend/app/(dashboard)/dashboard/reports/page.tsx#L1-L200)
### 设置页面
- 页面职责
@ -344,22 +367,24 @@ utils --> dialog
platforms["Platform mapping"] --> queries_page
platforms --> citations_page
trend_chart["Trend chart component"] --> dashboard_page
platform_chart["Platform comparison chart component"] --> dashboard_page
```
**图示来源**
- [frontend/lib/api.ts:1-58](file://frontend/lib/api.ts#L1-L58)
- [frontend/app/(dashboard)/dashboard/page.tsx:1-156](file://frontend/app/(dashboard)/dashboard/page.tsx#L1-L156)
- [frontend/app/(dashboard)/dashboard/queries/page.tsx:1-461](file://frontend/app/(dashboard)/dashboard/queries/page.tsx#L1-L461)
- [frontend/app/(dashboard)/dashboard/citations/page.tsx:1-282](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L1-L282)
- [frontend/app/(dashboard)/dashboard/reports/page.tsx:1-198](file://frontend/app/(dashboard)/dashboard/reports/page.tsx#L1-L198)
- [frontend/lib/api.ts:1-79](file://frontend/lib/api.ts#L1-L79)
- [frontend/app/(dashboard)/dashboard/page.tsx:1-227](file://frontend/app/(dashboard)/dashboard/page.tsx#L1-L227)
- [frontend/app/(dashboard)/dashboard/queries/page.tsx:1-526](file://frontend/app/(dashboard)/dashboard/queries/page.tsx#L1-L526)
- [frontend/app/(dashboard)/dashboard/citations/page.tsx:1-294](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L1-L294)
- [frontend/app/(dashboard)/dashboard/reports/page.tsx:1-200](file://frontend/app/(dashboard)/dashboard/reports/page.tsx#L1-L200)
- [frontend/components/ui/table.tsx:1-118](file://frontend/components/ui/table.tsx#L1-L118)
- [frontend/components/ui/dialog.tsx:1-123](file://frontend/components/ui/dialog.tsx#L1-L123)
- [frontend/lib/utils.ts:1-7](file://frontend/lib/utils.ts#L1-L7)
- [frontend/lib/platforms.ts:1-18](file://frontend/lib/platforms.ts#L1-L18)
- [frontend/components/charts/trend-chart.tsx:1-60](file://frontend/components/charts/trend-chart.tsx#L1-L60)
- [frontend/components/charts/platform-chart.tsx:1-68](file://frontend/components/charts/platform-chart.tsx#L1-L68)
**章节来源**
- [frontend/lib/api.ts:1-58](file://frontend/lib/api.ts#L1-L58)
- [frontend/lib/api.ts:1-79](file://frontend/lib/api.ts#L1-L79)
- [frontend/lib/utils.ts:1-7](file://frontend/lib/utils.ts#L1-L7)
- [frontend/lib/platforms.ts:1-18](file://frontend/lib/platforms.ts#L1-L18)
@ -377,8 +402,8 @@ trend_chart["趋势图组件"] --> dashboard_page
**章节来源**
- [frontend/app/layout.tsx:17-20](file://frontend/app/layout.tsx#L17-L20)
- [frontend/app/(dashboard)/dashboard/queries/page.tsx:96-113](file://frontend/app/(dashboard)/dashboard/queries/page.tsx#L96-L113)
- [frontend/app/(dashboard)/dashboard/citations/page.tsx:73-94](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L73-L94)
- [frontend/app/(dashboard)/dashboard/queries/page.tsx:104-121](file://frontend/app/(dashboard)/dashboard/queries/page.tsx#L104-L121)
- [frontend/app/(dashboard)/dashboard/citations/page.tsx:65-73](file://frontend/app/(dashboard)/dashboard/citations/page.tsx#L65-L73)
## 故障排除指南
- 登录态缺失
@ -392,12 +417,12 @@ trend_chart["趋势图组件"] --> dashboard_page
**章节来源**
- [frontend/app/(dashboard)/layout.tsx:12-15](file://frontend/app/(dashboard)/layout.tsx#L12-L15)
- [frontend/lib/api.ts:3-21](file://frontend/lib/api.ts#L3-L21)
- [frontend/app/(dashboard)/dashboard/queries/page.tsx:133-142](file://frontend/app/(dashboard)/dashboard/queries/page.tsx#L133-L142)
- [frontend/app/(dashboard)/dashboard/reports/page.tsx:49-93](file://frontend/app/(dashboard)/dashboard/reports/page.tsx#L49-L93)
- [frontend/lib/api.ts:3-40](file://frontend/lib/api.ts#L3-L40)
- [frontend/app/(dashboard)/dashboard/queries/page.tsx:143-156](file://frontend/app/(dashboard)/dashboard/queries/page.tsx#L143-L156)
- [frontend/app/(dashboard)/dashboard/reports/page.tsx:50-94](file://frontend/app/(dashboard)/dashboard/reports/page.tsx#L50-L94)
## 结论
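This design builds the dashboard, query management, citation records, report export, and settings pages as components, with clear layering and separation of responsibilities. A unified API client and UI component library improve maintainability and consistency; session-driven data fetching and thorough error-boundary handling safeguard user experience and stability. Future work could cover SEO, internationalization, caching strategy, and state persistence.
This design builds the dashboard, query management, citation records, report export, and settings pages as components, with clear layering and separation of responsibilities. A unified API client and UI component library improve maintainability and consistency; session-driven data fetching and thorough error-boundary handling safeguard user experience and stability. All page components are implemented and integrated into the Next.js application architecture, including the full functionality of the data overview, query management, citation records, report export, and settings pages. Future work could cover SEO, internationalization, caching strategy, and state persistence.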
## 附录
- 开发规范与最佳实践


@ -1,7 +1,7 @@
# 数据模型
<cite>
**本文引用的文件**
**本文引用的文件**
- [backend/app/models/__init__.py](file://backend/app/models/__init__.py)
- [backend/app/models/user.py](file://backend/app/models/user.py)
- [backend/app/models/query.py](file://backend/app/models/query.py)
@ -14,9 +14,23 @@
- [backend/app/schemas/citation.py](file://backend/app/schemas/citation.py)
- [backend/app/services/query.py](file://backend/app/services/query.py)
- [backend/app/api/queries.py](file://backend/app/api/queries.py)
- [backend/app/services/citation.py](file://backend/app/services/citation.py)
- [backend/app/api/citations.py](file://backend/app/api/citations.py)
- [backend/app/config.py](file://backend/app/config.py)
- [backend/app/api/deps.py](file://backend/app/api/deps.py)
</cite>
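## Update summary
**Changes made**
- Refined the description of the user model's field mappings and relationship configuration
- Detailed the query model's indexing strategy and lifecycle management
- Added the query task model's state machine and task scheduling mechanism
- Refined the description of the citation record model's statistical analysis features
- Added the subscription model's payment information fields and status management
- Expanded the technical details of inter-model relationship mappings and cascade policies
- Refined the serialization, deserialization, and data validation mechanisms
- Added usage examples and a best-practices guide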
## 目录
1. [简介](#简介)
2. [项目结构](#项目结构)
@ -49,16 +63,20 @@ DB["PostgreSQL"]
end
subgraph "Service layer"
SVCQ["Query service"]
SVCC["Citation service"]
end
subgraph "API layer"
APIQ["Queries API"]
APIC["Citations API"]
end
U --> Q
Q --> CR
Q --> QT
U --> S
SVCQ --> Q
SVCC --> CR
APIQ --> SVCQ
APIC --> SVCC
Q --- DB
CR --- DB
QT --- DB
@ -66,17 +84,19 @@ S --- DB
U --- DB
```
图表来源
**图表来源**
- [backend/app/models/user.py:11-41](file://backend/app/models/user.py#L11-L41)
- [backend/app/models/query.py:11-55](file://backend/app/models/query.py#L11-L55)
- [backend/app/models/query_task.py:11-39](file://backend/app/models/query_task.py#L11-L39)
- [backend/app/models/citation_record.py:11-42](file://backend/app/models/citation_record.py#L11-L42)
- [backend/app/models/subscription.py:11-37](file://backend/app/models/subscription.py#L11-L37)
- [backend/app/database.py:1-29](file://backend/app/database.py#L1-L29)
- [backend/app/services/query.py:1-130](file://backend/app/services/query.py#L1-L130)
- [backend/app/services/query.py:1-123](file://backend/app/services/query.py#L1-L123)
- [backend/app/services/citation.py:1-359](file://backend/app/services/citation.py#L1-L359)
- [backend/app/api/queries.py:1-86](file://backend/app/api/queries.py#L1-L86)
- [backend/app/api/citations.py:1-78](file://backend/app/api/citations.py#L1-L78)
章节来源
**章节来源**
- [backend/app/models/__init__.py:1-14](file://backend/app/models/__init__.py#L1-L14)
- [backend/app/database.py:1-29](file://backend/app/database.py#L1-L29)
@ -99,7 +119,7 @@ U --- DB
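- Records the user's subscription plan, validity period, payment information, and status.
- Relationship: many-to-one to User, with cascade delete when the user is removed.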
章节来源
**章节来源**
- [backend/app/models/user.py:11-41](file://backend/app/models/user.py#L11-L41)
- [backend/app/models/query.py:11-55](file://backend/app/models/query.py#L11-L55)
- [backend/app/models/query_task.py:11-39](file://backend/app/models/query_task.py#L11-L39)
@ -175,7 +195,7 @@ QUERIES ||--o{ CITATION_RECORDS : "产生"
QUERIES ||--o{ QUERY_TASKS : "拆分执行"
```
图表来源
**图表来源**
- [backend/alembic/versions/488d0bd5ab01_initial_migration.py:21-128](file://backend/alembic/versions/488d0bd5ab01_initial_migration.py#L21-L128)
- [backend/app/models/user.py:11-41](file://backend/app/models/user.py#L11-L41)
- [backend/app/models/query.py:11-55](file://backend/app/models/query.py#L11-L55)
@ -199,7 +219,7 @@ QUERIES ||--o{ QUERY_TASKS : "拆分执行"
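- Best practices
- Avoid modifying the plan or quota directly when creating or updating a user; route such changes through a dedicated service interface for validation and auditing.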
章节来源
**章节来源**
- [backend/app/models/user.py:11-41](file://backend/app/models/user.py#L11-L41)
- [backend/alembic/versions/488d0bd5ab01_initial_migration.py:23-37](file://backend/alembic/versions/488d0bd5ab01_initial_migration.py#L23-L37)
@ -219,7 +239,7 @@ QUERIES ||--o{ QUERY_TASKS : "拆分执行"
- Best practices
- When the frequency is updated, update next_query_at in the same operation; check the user's quota before creating a query.
章节来源
**章节来源**
- [backend/app/models/query.py:11-55](file://backend/app/models/query.py#L11-L55)
- [backend/alembic/versions/488d0bd5ab01_initial_migration.py:39-59](file://backend/alembic/versions/488d0bd5ab01_initial_migration.py#L39-L59)
@ -239,7 +259,7 @@ QUERIES ||--o{ QUERY_TASKS : "拆分执行"
- Best practices
- Task state machine: pending -> started -> completed or failed; record error_message on failure.
章节来源
**章节来源**
- [backend/app/models/query_task.py:11-39](file://backend/app/models/query_task.py#L11-L39)
- [backend/alembic/versions/488d0bd5ab01_initial_migration.py:80-94](file://backend/alembic/versions/488d0bd5ab01_initial_migration.py#L80-L94)
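The task state machine noted in the best practices above (pending -> started -> completed or failed, with error_message recorded on failure) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `TaskStatus`, `Task`, `ALLOWED`, and `advance` are hypothetical names, and treating completed and failed as terminal states is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TaskStatus(str, Enum):
    PENDING = "pending"
    STARTED = "started"
    COMPLETED = "completed"
    FAILED = "failed"


# Allowed transitions: pending -> started -> completed | failed.
# Assumption: completed and failed are terminal states.
ALLOWED = {
    TaskStatus.PENDING: {TaskStatus.STARTED},
    TaskStatus.STARTED: {TaskStatus.COMPLETED, TaskStatus.FAILED},
    TaskStatus.COMPLETED: set(),
    TaskStatus.FAILED: set(),
}


@dataclass
class Task:
    """Hypothetical stand-in for a query task row."""
    status: TaskStatus = TaskStatus.PENDING
    error_message: Optional[str] = None


def advance(task: Task, new_status: TaskStatus,
            error_message: Optional[str] = None) -> Task:
    """Move the task to new_status, rejecting transitions not in ALLOWED."""
    if new_status not in ALLOWED[task.status]:
        raise ValueError(
            f"illegal transition: {task.status.value} -> {new_status.value}"
        )
    if new_status is TaskStatus.FAILED:
        # Always leave a message behind so failures can be diagnosed later.
        task.error_message = error_message or "unknown error"
    task.status = new_status
    return task
```

Keeping the transition table in one place makes it harder for a retry path to move a finished task back to started by accident.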
@ -259,7 +279,7 @@ QUERIES ||--o{ QUERY_TASKS : "拆分执行"
- Best practices
- Aggregate statistics by platform and date, and rely on the indexes to improve performance.
章节来源
**章节来源**
- [backend/app/models/citation_record.py:11-42](file://backend/app/models/citation_record.py#L11-L42)
- [backend/alembic/versions/488d0bd5ab01_initial_migration.py:61-78](file://backend/alembic/versions/488d0bd5ab01_initial_migration.py#L61-L78)
@ -277,7 +297,7 @@ QUERIES ||--o{ QUERY_TASKS : "拆分执行"
- Best practices
- When a subscription expires, the user's quota and feature permissions should be adjusted automatically.
章节来源
**章节来源**
- [backend/app/models/subscription.py:11-37](file://backend/app/models/subscription.py#L11-L37)
- [backend/alembic/versions/488d0bd5ab01_initial_migration.py:96-111](file://backend/alembic/versions/488d0bd5ab01_initial_migration.py#L96-L111)
@ -285,11 +305,11 @@ QUERIES ||--o{ QUERY_TASKS : "拆分执行"
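- Foreign key constraints
- Every child table has a foreign key to its parent table's primary key with ON DELETE CASCADE, which keeps the data consistent.
- Cascade deletion of orphaned objects
- User.queries and User.subscriptions, and Query.citation_records and Query.query_tasks, are all configured with “all, delete-orphan”, so deleting a parent object automatically cleans up its children.
- User.queries and User.subscriptions, and Query.citation_records and Query.query_tasks, are all configured with "all, delete-orphan", so deleting a parent object automatically cleans up its children.
- Indexing strategy
- Frequently queried fields (such as user_id, status, next_query_at, queried_at, and platform) are indexed to improve query performance.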
章节来源
**章节来源**
- [backend/app/models/user.py:35-40](file://backend/app/models/user.py#L35-L40)
- [backend/app/models/query.py:43-48](file://backend/app/models/query.py#L43-L48)
- [backend/app/models/query_task.py:36-38](file://backend/app/models/query_task.py#L36-L38)
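To make the ON DELETE CASCADE behaviour described above concrete, here is a minimal sketch that uses the standard-library `sqlite3` module as a stand-in for PostgreSQL. The simplified two-table schema and the function name `cascade_demo` are assumptions for illustration only; the real tables have many more columns and are created through the Alembic migrations.

```python
import sqlite3


def cascade_demo() -> int:
    """Delete a parent row and return how many child rows remain."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE queries (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE query_tasks ("
        " id INTEGER PRIMARY KEY,"
        " query_id INTEGER NOT NULL"
        " REFERENCES queries(id) ON DELETE CASCADE)"
    )
    conn.execute("INSERT INTO queries (id) VALUES (1)")
    conn.executemany(
        "INSERT INTO query_tasks (query_id) VALUES (?)", [(1,), (1,)]
    )
    # Removing the parent row removes both child rows through the cascade.
    conn.execute("DELETE FROM queries WHERE id = 1")
    remaining = conn.execute("SELECT COUNT(*) FROM query_tasks").fetchone()[0]
    conn.close()
    return remaining
```

The database-level cascade and the ORM-level "all, delete-orphan" setting are complementary: the first protects against deletes issued outside the ORM, the second cleans up children removed from a loaded collection.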
@ -303,7 +323,7 @@ QUERIES ||--o{ QUERY_TASKS : "拆分执行"
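- API layer integration
- The Queries API binds the request body to a Pydantic model, calls the service layer for business processing, and returns an ORM object or a Pydantic response model.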
章节来源
**章节来源**
- [backend/app/schemas/query.py:11-94](file://backend/app/schemas/query.py#L11-L94)
- [backend/app/schemas/citation.py:7-50](file://backend/app/schemas/citation.py#L7-L50)
- [backend/app/api/queries.py:1-86](file://backend/app/api/queries.py#L1-L86)
@ -316,7 +336,7 @@ QUERIES ||--o{ QUERY_TASKS : "拆分执行"
- Query scheduling
- The service layer computes next_query_at from the frequency so that the scheduled job can pick the query up.
章节来源
**章节来源**
- [backend/app/models/user.py:25-33](file://backend/app/models/user.py#L25-L33)
- [backend/app/models/query.py:32-40](file://backend/app/models/query.py#L32-L40)
- [backend/app/models/query_task.py:27-32](file://backend/app/models/query_task.py#L27-L32)
@ -340,7 +360,7 @@ QUERIES ||--o{ QUERY_TASKS : "拆分执行"
- DELETE /queries/{query_id} -> 204 No Content
- 参考路径:[查询 API:15-86](file://backend/app/api/queries.py#L15-L86)
章节来源
**章节来源**
- [backend/app/services/query.py:45-129](file://backend/app/services/query.py#L45-L129)
- [backend/app/api/queries.py:15-86](file://backend/app/api/queries.py#L15-L86)
@ -361,13 +381,13 @@ BASE --> MODELS["模型类"]
MODELS --> DB["PostgreSQL"]
```
图表来源
**图表来源**
- [backend/app/database.py:1-29](file://backend/app/database.py#L1-L29)
- [backend/app/config.py](file://backend/app/config.py#L7)
章节来源
**章节来源**
- [backend/app/database.py:1-29](file://backend/app/database.py#L1-L29)
- [backend/app/config.py:1-17](file://backend/app/config.py#L1-L17)
- [backend/app/config.py:1-23](file://backend/app/config.py#L1-L23)
## 性能考量
- 索引设计
@ -385,7 +405,7 @@ MODELS --> DB["PostgreSQL"]
## 故障排查指南
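- Query quota exceeded
- Symptom: creating a query fails with “PermissionError: Query limit exceeded”
- Symptom: creating a query fails with "PermissionError: Query limit exceeded"
- Resolution: check the user's max_queries against the current number of queries; upgrade the plan or clean up old queries if necessary.
- Reference: [Create query service:45-81](file://backend/app/services/query.py#L45-L81)
- Query not found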
@ -401,7 +421,7 @@ MODELS --> DB["PostgreSQL"]
- Resolution: check the error_message field; verify platform availability and the API key configuration.
- Reference: [Query task model:11-39](file://backend/app/models/query_task.py#L11-L39)
章节来源
**章节来源**
- [backend/app/services/query.py:45-129](file://backend/app/services/query.py#L45-L129)
- [backend/app/api/queries.py:42-85](file://backend/app/api/queries.py#L42-L85)
- [backend/app/schemas/query.py:18-33](file://backend/app/schemas/query.py#L18-L33)
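The quota failure in the troubleshooting list above comes down to comparing the user's current query count with max_queries. A minimal sketch, assuming hypothetical helper names (`ensure_within_quota`, `status_for`) and an assumed mapping of service-layer exceptions to HTTP status codes; the actual service code may differ:

```python
def ensure_within_quota(current_count: int, max_queries: int) -> None:
    """Raise PermissionError when the user already holds max_queries queries."""
    if current_count >= max_queries:
        raise PermissionError("Query limit exceeded")


def status_for(exc: Exception) -> int:
    """Map service-layer errors to HTTP status codes (assumed mapping)."""
    if isinstance(exc, PermissionError):
        return 403  # quota or permission problem
    if isinstance(exc, ValueError):
        return 404  # query not found
    return 500
```

Checking the quota before the insert, rather than after, avoids creating a row that then has to be rolled back.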
@ -413,7 +433,11 @@ GEO 项目的数据模型围绕用户、查询、任务、引用记录与订阅
## 附录
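- Database connection configuration
- DATABASE_URL: PostgreSQL async connection string
- Reference: [Configuration](file://backend/app/config.py#L7)
- Reference: [Configuration](file://backend/app/config.py#L12)
- Model export entry point
- models/__init__.py exports all models from one place
- Reference: [Model exports:1-14](file://backend/app/models/__init__.py#L1-L14)
- JWT authentication configuration
- JWT_SECRET: JWT secret key
- JWT_EXPIRE_HOURS: JWT expiry time (hours)
- Reference: [Authentication dependencies:16-43](file://backend/app/api/deps.py#L16-L43)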


@ -4,13 +4,16 @@
**本文引用的文件**
- [tests/conftest.py](file://tests/conftest.py)
- [tests/test_auth.py](file://tests/test_auth.py)
- [tests/test_business_flow.py](file://tests/test_business_flow.py)
- [tests/test_citation_engine.py](file://tests/test_citation_engine.py)
- [tests/test_citations.py](file://tests/test_citations.py)
- [tests/test_queries.py](file://tests/test_queries.py)
- [tests/test_scheduler.py](file://tests/test_scheduler.py)
- [backend/app/main.py](file://backend/app/main.py)
- [backend/app/api/deps.py](file://backend/app/api/deps.py)
- [backend/app/services/auth.py](file://backend/app/services/auth.py)
- [backend/app/workers/citation_engine.py](file://backend/app/workers/citation_engine.py)
- [backend/app/workers/scheduler.py](file://backend/app/workers/scheduler.py)
- [backend/app/api/auth.py](file://backend/app/api/auth.py)
- [backend/app/api/citations.py](file://backend/app/api/citations.py)
- [backend/app/api/queries.py](file://backend/app/api/queries.py)
@ -18,20 +21,29 @@
- [backend/app/config.py](file://backend/app/config.py)
</cite>
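## Update summary
**Changes**
- Added a business flow testing section covering end-to-end business scenarios
- Added a scheduler testing section covering scheduled job dispatch and frequency calculation
- Expanded the testing best practices with guidance for business flow and scheduler tests
- Updated the testing strategy to reflect the added test coverage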
## 目录
1. [引言](#引言)
2. [项目结构](#项目结构)
3. [核心组件](#核心组件)
4. [架构总览](#架构总览)
5. [详细组件分析](#详细组件分析)
6. [依赖分析](#依赖分析)
7. [性能考虑](#性能考虑)
8. [故障排查指南](#故障排查指南)
9. [结论](#结论)
10. [附录](#附录)
6. [业务流程测试策略](#业务流程测试策略)
7. [调度器测试策略](#调度器测试策略)
8. [依赖分析](#依赖分析)
9. [性能考虑](#性能考虑)
10. [故障排查指南](#故障排查指南)
11. [结论](#结论)
12. [附录](#附录)
## 引言
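This testing strategy document covers the Pytest test suite of the GEO project, with design and implementation notes for unit and integration tests. It covers how fixtures and mock objects are organized, how test data is managed, and how test cases are designed for key features such as the authentication module, the citation engine, and query handling. It also gives testing best practices, including coverage targets, continuous integration configuration suggestions, and a test environment management plan, along with debugging tips and performance testing methods.
This testing strategy document covers the Pytest test suite of the GEO project, with design and implementation notes for unit tests, integration tests, and business flow tests. It covers how fixtures and mock objects are organized, how test data is managed, and how test cases are designed for key features such as the authentication module, the citation engine, query handling, business flows, and the scheduler. It also gives testing best practices, including coverage targets, continuous integration configuration suggestions, and a test environment management plan, along with debugging tips and performance testing methods.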
## 项目结构
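The test directory is tests at the repository root, organized by functional module, with Pytest's conftest providing centralized fixtures and mock objects to keep tests isolated and repeatable. The backend is built on FastAPI: the API layer obtains the current user and database session through dependency injection, the service layer encapsulates business logic, and workers handle asynchronous tasks and platform adaptation.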
@ -44,6 +56,8 @@ TA["tests/test_auth.py"]
TQ["tests/test_queries.py"]
TC["tests/test_citations.py"]
TCE["tests/test_citation_engine.py"]
TB["tests/test_business_flow.py"]
TS["tests/test_scheduler.py"]
end
subgraph "Backend application"
M["backend/app/main.py"]
@ -54,15 +68,19 @@ AUTH_API["backend/app/api/auth.py"]
QUERIES_API["backend/app/api/queries.py"]
CITATIONS_API["backend/app/api/citations.py"]
CE["backend/app/workers/citation_engine.py"]
QS["backend/app/workers/scheduler.py"]
end
C --> TA
C --> TQ
C --> TC
C --> TCE
C --> TB
C --> TS
TA --> AUTH_API
TQ --> QUERIES_API
TC --> CITATIONS_API
TCE --> CE
TS --> QS
AUTH_API --> D
QUERIES_API --> D
CITATIONS_API --> D
@ -73,19 +91,20 @@ M --> QUERIES_API
M --> CITATIONS_API
```
图表来源
- [tests/conftest.py:1-71](file://tests/conftest.py#L1-L71)
**图表来源**
- [tests/conftest.py:1-123](file://tests/conftest.py#L1-L123)
- [backend/app/main.py:1-48](file://backend/app/main.py#L1-L48)
- [backend/app/api/deps.py:1-43](file://backend/app/api/deps.py#L1-L43)
- [backend/app/api/auth.py:1-43](file://backend/app/api/auth.py#L1-L43)
- [backend/app/api/queries.py:1-86](file://backend/app/api/queries.py#L1-L86)
- [backend/app/api/citations.py:1-78](file://backend/app/api/citations.py#L1-L78)
- [backend/app/workers/citation_engine.py:1-309](file://backend/app/workers/citation_engine.py#L1-L309)
- [backend/app/workers/scheduler.py:1-182](file://backend/app/workers/scheduler.py#L1-L182)
- [backend/app/database.py:1-29](file://backend/app/database.py#L1-L29)
- [backend/app/config.py:1-17](file://backend/app/config.py#L1-L17)
章节来源
- [tests/conftest.py:1-71](file://tests/conftest.py#L1-L71)
**章节来源**
- [tests/conftest.py:1-123](file://tests/conftest.py#L1-L123)
- [backend/app/main.py:1-48](file://backend/app/main.py#L1-L48)
## 核心组件
@ -94,14 +113,17 @@ M --> CITATIONS_API
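- Users and tokens: provides a mock user object, a JWT access token, and request headers for testing authentication-related endpoints.
- Async HTTP client: creates an async HTTP client over the ASGI transport for end-to-end API tests.
- Dependency overrides: overrides the current-user resolution logic through dependency injection to simplify the authentication flow.
- In-memory database: integration tests use an in-memory SQLite database to keep tests isolated.
- Test data management
- pytest fixtures generate mock model objects (such as queries and citation records) to keep test data consistent and readable.
- Service-layer functions are stubbed with patch to isolate external dependencies and make tests more deterministic.
- Complex scenarios such as permission isolation and statistics calculation are tested by operating on the database models directly.
- Test execution and concurrency
- Async tests are marked with pytest-asyncio so that the event loop is initialized and cleaned up correctly.
- Multiple test files can run in parallel to speed up the suite.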
章节来源
- [tests/conftest.py:19-71](file://tests/conftest.py#L19-L71)
**章节来源**
- [tests/conftest.py:19-123](file://tests/conftest.py#L19-L123)
## 架构总览
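The diagram below shows how the tests interact with the system under test: tests call the FastAPI routes directly through the async HTTP client; the routes depend on the current user and a database session; the service layer carries out the business logic; workers handle platform queries and brand matching.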
@ -135,8 +157,8 @@ APP-->>AC : 序列化响应
AC-->>T : 断言结果
```
图表来源
- [tests/conftest.py:65-71](file://tests/conftest.py#L65-L71)
**图表来源**
- [tests/conftest.py:117-123](file://tests/conftest.py#L117-L123)
- [backend/app/main.py:38-42](file://backend/app/main.py#L38-L42)
- [backend/app/api/deps.py:16-43](file://backend/app/api/deps.py#L16-L43)
- [backend/app/api/auth.py:13-43](file://backend/app/api/auth.py#L13-L43)
@ -186,13 +208,13 @@ AUTH-->>AC : 200/401
AC-->>T : 断言
```
图表来源
**图表来源**
- [tests/test_auth.py:25-104](file://tests/test_auth.py#L25-L104)
- [backend/app/api/auth.py:13-43](file://backend/app/api/auth.py#L13-L43)
- [backend/app/services/auth.py:37-69](file://backend/app/services/auth.py#L37-L69)
- [backend/app/api/deps.py:16-43](file://backend/app/api/deps.py#L16-L43)
章节来源
**章节来源**
- [tests/test_auth.py:1-104](file://tests/test_auth.py#L1-L104)
- [backend/app/api/auth.py:1-43](file://backend/app/api/auth.py#L1-L43)
- [backend/app/services/auth.py:1-69](file://backend/app/services/auth.py#L1-L69)
@ -231,13 +253,13 @@ BrandMatcher <.. CitationEngine : "使用"
CompetitorDetector <.. CitationEngine : "使用"
```
图表来源
**图表来源**
- [backend/app/workers/citation_engine.py:19-120](file://backend/app/workers/citation_engine.py#L19-L120)
- [backend/app/workers/citation_engine.py:122-146](file://backend/app/workers/citation_engine.py#L122-L146)
- [backend/app/workers/citation_engine.py:148-309](file://backend/app/workers/citation_engine.py#L148-L309)
章节来源
- [tests/test_citation_engine.py:1-54](file://tests/test_citation_engine.py#L1-L54)
**章节来源**
- [tests/test_citation_engine.py:1-127](file://tests/test_citation_engine.py#L1-L127)
- [backend/app/workers/citation_engine.py:1-309](file://backend/app/workers/citation_engine.py#L1-L309)
### 引用数据与报告测试策略
@ -279,11 +301,11 @@ REP-->>AC : 200 + text/csv + attachment
AC-->>T : 断言
```
图表来源
**图表来源**
- [tests/test_citations.py:23-93](file://tests/test_citations.py#L23-L93)
- [backend/app/api/citations.py:25-78](file://backend/app/api/citations.py#L25-L78)
章节来源
**章节来源**
- [tests/test_citations.py:1-93](file://tests/test_citations.py#L1-L93)
- [backend/app/api/citations.py:1-78](file://backend/app/api/citations.py#L1-L78)
@ -333,22 +355,140 @@ Q-->>AC : 204/404
AC-->>T : 断言
```
图表来源
**图表来源**
- [tests/test_queries.py:29-154](file://tests/test_queries.py#L29-L154)
- [backend/app/api/queries.py:15-86](file://backend/app/api/queries.py#L15-L86)
章节来源
**章节来源**
- [tests/test_queries.py:1-154](file://tests/test_queries.py#L1-L154)
- [backend/app/api/queries.py:1-86](file://backend/app/api/queries.py#L1-L86)
## 业务流程测试策略
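### Test goals
Business flow tests verify the core business scenarios of the GEO application: the complete user registration and login flow, query term lifecycle management, permission isolation, quota limits, the accuracy of statistics, and CSV export.
### Key test scenarios
- **Complete user flow**: end to end, from registration through login to query management
- **Query lifecycle**: the full lifecycle of create, update, pause, resume, and delete
- **Permission isolation**: data is fully isolated between users
- **Quota limits**: the query count limit for free users is enforced
- **Statistical accuracy**: citation statistics are correct
- **CSV export**: the export feature is tested for completeness
### Test implementation strategy
- **User management**: fixtures create real user accounts to simulate the full user lifecycle
- **Permission tests**: two independent user accounts verify permission isolation
- **Data verification**: the database models are operated on directly to verify the statistics
- **End-to-end verification**: the async HTTP client verifies the complete business flow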
```mermaid
sequenceDiagram
participant T as "Business flow test"
participant AC as "Async HTTP client"
participant AUTH as "Auth routes"
participant QUERIES as "Query routes"
participant CITATIONS as "Citation routes"
participant DB as "Database"
T->>AC : Register user
AC->>AUTH : POST /api/v1/auth/register
AUTH->>DB : Create user record
AUTH-->>AC : 201 Created
T->>AC : Log in
AC->>AUTH : POST /api/v1/auth/login
AUTH-->>AC : 200 OK + Token
T->>AC : Create query
AC->>QUERIES : POST /api/v1/queries/
QUERIES->>DB : Create query record
QUERIES-->>AC : 201 Created
T->>AC : Verify statistics
AC->>CITATIONS : GET /api/v1/citations/stats
CITATIONS->>DB : Query citation records
CITATIONS-->>AC : 200 OK + statistics
AC-->>T : Assert the business flow is correct
```
**图表来源**
- [tests/test_business_flow.py:83-126](file://tests/test_business_flow.py#L83-L126)
- [tests/test_business_flow.py:131-186](file://tests/test_business_flow.py#L131-L186)
- [tests/test_business_flow.py:192-222](file://tests/test_business_flow.py#L192-L222)
- [tests/test_business_flow.py:228-296](file://tests/test_business_flow.py#L228-L296)
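### Test case design notes
- **User isolation**: separate fixtures create multiple users so that permission tests are accurate
- **Data integrity**: statistics are verified by operating on the database models directly
- **Flow completeness**: all key steps and error scenarios of the business flow are covered
- **Boundary conditions**: key boundaries such as quota limits and permission boundaries are tested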
**章节来源**
- [tests/test_business_flow.py:1-441](file://tests/test_business_flow.py#L1-L441)
## 调度器测试策略
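### Test goals
Scheduler tests focus on the query scheduler's ability to run scheduled jobs: scheduler startup and shutdown, query task selection, frequency calculation, and handling of leftover tasks.
### Key test scenarios
- **Scheduler lifecycle**: startup, normal operation, and graceful shutdown
- **Query selection**: only queries that are active and due are executed
- **Frequency calculation**: next_query_at is computed for the daily and weekly frequencies
- **Leftover task handling**: pending tasks that have not run for more than 1 minute are processed
- **Error handling**: a failed query execution is handled and logged
### Test implementation strategy
- **Scheduler control**: patch replaces the real APScheduler, and AsyncMock controls the scheduler's behavior
- **Database isolation**: a separate test session keeps scheduler tests from affecting other tests
- **Time control**: precise timestamps control whether a query is due
- **Frequency verification**: datetime.utcnow() is used for precise time calculations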
```mermaid
classDiagram
class QueryScheduler {
+start() void
+check_and_execute_queries() void
+check_and_execute_pending_tasks() void
+shutdown() void
-_run_check() void
-_run_pending_tasks_check() void
-_execute_single_query(query, db) void
}
class CitationEngine {
+execute_query(query, db) list
+execute_single_platform(keyword, platform, target_brand, brand_aliases) dict
}
class AsyncIOScheduler {
+add_job(job, trigger, id, name) void
+start() void
+shutdown() void
}
QueryScheduler --> CitationEngine : "calls"
QueryScheduler --> AsyncIOScheduler : "uses"
```
**图表来源**
- [backend/app/workers/scheduler.py:27-182](file://backend/app/workers/scheduler.py#L27-L182)
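### Test case design notes
- **Scheduler lifecycle**: verify the scheduled jobs added at startup and their names
- **Query selection**: create queries with different statuses and due times to verify the selection logic
- **Frequency calculation**: verify next_query_at within an absolute error tolerance
- **Leftover task handling**: verify the fallback processing of pending tasks
- **Error handling**: a failed query execution must not interrupt the overall scheduling run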
**章节来源**
- [tests/test_scheduler.py:1-123](file://tests/test_scheduler.py#L1-L123)
- [backend/app/workers/scheduler.py:1-182](file://backend/app/workers/scheduler.py#L1-L182)
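The frequency calculation and the tolerance-based check on next_query_at described above could look roughly like this. It is a sketch under assumptions: `FREQUENCY_INTERVALS`, `compute_next_query_at`, and `is_close` are hypothetical names, and the one-day and one-week intervals are inferred from the frequency names rather than taken from the scheduler source.

```python
from datetime import datetime, timedelta, timezone

# Assumed intervals: the documentation only names the two frequencies.
FREQUENCY_INTERVALS = {
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
}


def compute_next_query_at(frequency: str, now: datetime) -> datetime:
    """Return the next run time for a query with the given frequency."""
    if frequency not in FREQUENCY_INTERVALS:
        raise ValueError(f"unsupported frequency: {frequency}")
    return now + FREQUENCY_INTERVALS[frequency]


def is_close(actual: datetime, expected: datetime,
             tolerance: timedelta = timedelta(seconds=5)) -> bool:
    """Compare two timestamps within an absolute tolerance."""
    return abs(actual - expected) <= tolerance
```

Passing `now` in as a parameter keeps the function deterministic, so a test can pin the clock instead of patching it; the absolute tolerance only matters when the code under test reads the clock itself.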
## 依赖分析
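- Coupling between tests and modules under test
- Tests call routes directly over the ASGI transport, with no extra adapter layer
- Dependency overrides and patch decouple the service layer from the database and third-party platforms
- Business flow tests operate on the database models directly to keep test data accurate
- External dependencies and integration points
- Database: managed through the async engine and sessions; tests can use an in-memory database or a separate test database
- JWT: tokens are generated and verified in the service layer; tests build the token header directly
- Platform adapters: replaced with patch to avoid real network requests
- Scheduler: patch replaces the real APScheduler, and AsyncMock controls scheduling behavior
- Circular dependencies and risks
- The current structure is clear, with no obvious circular dependencies; tests should avoid depending on the real scheduler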
@ -357,26 +497,32 @@ graph LR
T_AUTH["Tests: auth"] --> A_AUTH["Routes: auth"]
T_QUERIES["Tests: queries"] --> A_QUERIES["Routes: queries"]
T_CIT["Tests: citations"] --> A_CIT["Routes: citations"]
T_BUSINESS["Tests: business flow"] --> A_QUERIES
T_BUSINESS --> A_CIT
T_SCHED["Tests: scheduler"] --> QS["Scheduler: QueryScheduler"]
A_AUTH --> S_AUTH["Service: auth"]
A_QUERIES --> S_QUERY["Service: queries"]
A_CIT --> S_CIT["Service: citations"]
S_AUTH --> DB["Database"]
S_QUERY --> DB
S_CIT --> DB
QS --> CE["Engine: CitationEngine"]
QS --> DB
DB --> CFG["Configuration"]
```
图表来源
**图表来源**
- [tests/test_auth.py:1-104](file://tests/test_auth.py#L1-L104)
- [tests/test_queries.py:1-154](file://tests/test_queries.py#L1-L154)
- [tests/test_citations.py:1-93](file://tests/test_citations.py#L1-L93)
- [tests/test_business_flow.py:1-441](file://tests/test_business_flow.py#L1-L441)
- [tests/test_scheduler.py:1-123](file://tests/test_scheduler.py#L1-L123)
- [backend/app/api/auth.py:1-43](file://backend/app/api/auth.py#L1-L43)
- [backend/app/api/queries.py:1-86](file://backend/app/api/queries.py#L1-L86)
- [backend/app/api/citations.py:1-78](file://backend/app/api/citations.py#L1-L78)
- [backend/app/workers/scheduler.py:1-182](file://backend/app/workers/scheduler.py#L1-L182)
- [backend/app/database.py:1-29](file://backend/app/database.py#L1-L29)
- [backend/app/config.py:1-17](file://backend/app/config.py#L1-L17)
章节来源
**章节来源**
- [backend/app/database.py:1-29](file://backend/app/database.py#L1-L29)
- [backend/app/config.py:1-17](file://backend/app/config.py#L1-L17)
@ -384,29 +530,37 @@ DB --> CFG["配置"]
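- Test concurrency and resources
- Run async tests with pytest-asyncio; to cut total run time, parallelize with a separate plugin such as pytest-xdist, since pytest-asyncio itself does not run tests in parallel
- Mock the scheduler at session scope to avoid instability from real background jobs
- Business flow tests use an in-memory database to avoid disk I/O overhead
- Database and caching
- Use a separate test database instance to avoid conflicts with development or production data
- For high-frequency query scenarios, simulate database latency in tests to evaluate timeout and retry behavior in the routes and service layer
- Scheduler tests use AsyncMock to avoid running real scheduled jobs
- API responses and serialization
- For large lists and aggregate statistics endpoints, watch JSON serialization overhead and pagination parameter boundaries
- Business flow tests operate on the database models directly to avoid unnecessary API calls
- Platform adapter performance
- Use patch to simulate different response latencies and error rates to evaluate the engine's fault tolerance and degradation strategy
- Scheduler tests use precise time control to avoid real waiting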
## 故障排查指南
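- Locating common problems
- Authentication failures: check token generation and header setup, and whether the dependency override is in effect
- 404 on a query: confirm the query ID and its owner; check the service-layer lookup logic
- 403 on quota: check that the service layer raises the permission error and how it maps to the HTTP status
- Scheduler errors: check APScheduler's startup state and job configuration
- Business flow failures: check database transactions and fixture usage
- Debugging tips
- Temporarily print the dependency resolution steps in conftest to find why get_current_user fails to resolve
- Use pytest's -v and -s options for detailed output, combined with patch's side_effect to observe how exceptions propagate
- For database tests, enable SQLAlchemy echo to see the generated SQL
- In scheduler tests, use AsyncMock's assert_called_once() to verify scheduler behavior
- Performance and stability
- For long-running async tests, make sure the event loop is closed correctly
- For scenarios that would need real network requests, prefer patch-based mocks; add timeouts and retries where necessary
- In business flow tests, use fixtures sensibly to avoid repeatedly creating expensive objects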
## 结论
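This testing strategy is built on Pytest and combines a session-scoped scheduler mock, dependency overrides, and patch to cover the authentication, query, citation, and engine modules. Clearly defined fixtures and test data management keep the tests maintainable and repeatable. In CI, enable parallel execution and coverage reporting, and build a stable mock layer for the database and platform adapters to keep improving test efficiency and quality.
This testing strategy is built on Pytest and combines a session-scoped scheduler mock, dependency overrides, and patch to cover the authentication, query, citation, and engine modules as well as business flows and the scheduler. Clearly defined fixtures and test data management keep the tests maintainable and repeatable. The new business flow and scheduler tests round out the suite by covering end-to-end business scenarios and the key behavior of scheduled jobs. In CI, enable parallel execution and coverage reporting, and build a stable mock layer for the database and platform adapters to keep improving test efficiency and quality.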
## 附录
- Suggested test coverage requirements
@ -414,12 +568,18 @@ DB --> CFG["配置"]
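- Branch coverage ≥70%
- Line coverage ≥80%
- Function/method coverage ≥90%
- Business flow coverage ≥95%
- Scheduler coverage ≥90%
- Continuous integration configuration suggestions
- Use GitHub Actions or GitLab CI, with a Python version matrix, dependency installation, database setup, pytest execution, and coverage upload
- Run tests in parallel with lint and type checks to protect the main branch
- Configure separate time limits for business flow tests and scheduler tests
- Test environment management
- Use separate test database and Redis instances to avoid contamination
- Switch test configuration through environment variables so that sensitive information is not leaked
- Business flow tests use an in-memory database; scheduler tests use AsyncMock
- Performance testing methods
- Use pytest-benchmark or locust to benchmark high-frequency routes
- Stress-test the engine execution flow to find bottlenecks in the platform adapters and database writes
- In scheduler tests, use time control and AsyncMock to avoid real scheduled waits
- In business flow tests, measure the response time and throughput of the end-to-end flow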

File diff suppressed because one or more lines are too long


@ -0,0 +1,36 @@
"""Add confidence and match_type to citation_records
Revision ID: b2c4d6e8fa10
Revises: 488d0bd5ab01
Create Date: 2026-04-23 16:10:00.000000
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = 'b2c4d6e8fa10'
down_revision: Union[str, Sequence[str], None] = '488d0bd5ab01'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
    """Add confidence and match_type columns to citation_records."""
    op.add_column(
        'citation_records',
        sa.Column('confidence', sa.Float(), nullable=True)
    )
    op.add_column(
        'citation_records',
        sa.Column('match_type', sa.String(20), nullable=True)
    )


def downgrade() -> None:
    """Remove confidence and match_type columns from citation_records."""
    op.drop_column('citation_records', 'match_type')
    op.drop_column('citation_records', 'confidence')


@ -10,16 +10,13 @@ from app.models.user import User
from app.schemas.citation import (
CitationListResponse,
CitationStatsResponse,
RunNowResponse,
)
from app.services.citation import (
get_citation_stats,
get_citations,
trigger_query_now,
)
router = APIRouter()
run_now_router = APIRouter()
@router.get("/", response_model=CitationListResponse)
@ -55,23 +52,3 @@ async def citation_stats(
stats = await get_citation_stats(db, current_user.id, query_id=query_id)
return stats
@run_now_router.post("/{query_id}/run-now", response_model=RunNowResponse, status_code=status.HTTP_202_ACCEPTED)
async def run_query_now(
    query_id: uuid.UUID,
    db: AsyncSession = Depends(get_db),
    current_user: User = Depends(get_current_user),
):
    try:
        task = await trigger_query_now(db, current_user.id, query_id)
    except ValueError as e:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail=str(e),
        )
    return {
        "task_id": task.id,
        "status": task.status,
        "message": "查询任务已加入队列",
    }


@ -6,7 +6,9 @@ from sqlalchemy.ext.asyncio import AsyncSession
from app.api.deps import get_current_user
from app.database import get_db
from app.models.user import User
from app.schemas.citation import RunNowResponse
from app.schemas.query import QueryCreate, QueryListResponse, QueryResponse, QueryUpdate
from app.services.citation import trigger_query_now
from app.services.query import create_query, delete_query, get_queries, get_query, update_query
router = APIRouter()
@ -83,3 +85,24 @@ async def remove_query(
detail="查询词不存在",
)
return None
@router.post("/{query_id}/run-now", response_model=RunNowResponse, status_code=status.HTTP_202_ACCEPTED)
async def run_query_now(
    query_id: uuid.UUID,
    db: AsyncSession = Depends(get_db),
    current_user: User = Depends(get_current_user),
):
    # Queue the query for immediate execution; an unknown or foreign query maps to 404.
    try:
        task = await trigger_query_now(db, current_user.id, query_id)
    except ValueError as e:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail=str(e),
        )
    return {
        "task_id": task.id,
        "status": task.status,
        # User-facing message: "The query task has been queued"
        "message": "查询任务已加入队列",
    }
}


@ -4,7 +4,7 @@ from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from app.api.auth import router as auth_router
from app.api.citations import router as citations_router, run_now_router
from app.api.citations import router as citations_router
from app.api.queries import router as queries_router
from app.api.reports import router as reports_router
from app.config import settings
@ -48,7 +48,6 @@ app.include_router(auth_router, prefix="/api/v1/auth", tags=["认证"])
app.include_router(queries_router, prefix="/api/v1/queries", tags=["查询词"])
app.include_router(citations_router, prefix="/api/v1/citations", tags=["引用数据"])
app.include_router(reports_router, prefix="/api/v1/reports", tags=["报告"])
app.include_router(run_now_router, prefix="/api/v1/queries", tags=["查询词"])
@app.get("/health")


@ -1,7 +1,7 @@
import uuid
from datetime import datetime
from sqlalchemy import String, Boolean, Integer, ForeignKey, Index, func, Text
from sqlalchemy import String, Boolean, Integer, Float, ForeignKey, Index, func, Text
from sqlalchemy import Uuid, JSON
from sqlalchemy.orm import Mapped, mapped_column, relationship
@ -27,6 +27,8 @@ class CitationRecord(Base):
citation_text: Mapped[str | None] = mapped_column(Text, nullable=True)
competitor_brands: Mapped[list] = mapped_column(JSON, default=list)
raw_response: Mapped[str | None] = mapped_column(Text, nullable=True)
confidence: Mapped[float | None] = mapped_column(Float, nullable=True)
match_type: Mapped[str | None] = mapped_column(String(20), nullable=True)
queried_at: Mapped[datetime] = mapped_column(
server_default=func.now(),
nullable=False,


@ -12,6 +12,8 @@ class CitationResponse(BaseModel):
citation_position: int | None
citation_text: str | None
competitor_brands: list[str]
confidence: float | None
match_type: str | None
queried_at: datetime
model_config = {"from_attributes": True}


@ -3,7 +3,7 @@ from datetime import datetime
from pydantic import BaseModel, Field, field_validator
VALID_PLATFORMS = {"wenxin", "kimi", "tongyi", "baidu_ai", "yuanbao", "qingyan"}
VALID_PLATFORMS = {"wenxin", "kimi", "tongyi", "baidu_ai", "yuanbao", "qingyan", "doubao", "tiangong", "xinghuo"}
VALID_FREQUENCIES = {"daily", "weekly"}
VALID_STATUSES = {"active", "paused", "disabled"}


@ -1,14 +1,20 @@
import asyncio
import csv
import io
import logging
import uuid
from datetime import datetime, timedelta, timezone
from sqlalchemy import func, select, and_, cast, Integer
from sqlalchemy.ext.asyncio import AsyncSession
from app.database import AsyncSessionLocal
from app.models.citation_record import CitationRecord
from app.models.query import Query
from app.models.query_task import QueryTask
from app.workers.citation_engine import CitationEngine
logger = logging.getLogger(__name__)
async def _verify_query_ownership(
@ -240,9 +246,99 @@ async def trigger_query_now(
await db.commit()
if first_task is not None:
await db.refresh(first_task)
# New: run the query tasks in the background right away
asyncio.create_task(
_execute_query_tasks(
query_id=query_id,
platforms=platforms,
keyword=query.keyword,
target_brand=query.target_brand,
brand_aliases=query.brand_aliases or [],
)
)
return first_task
async def _execute_query_tasks(
query_id: uuid.UUID,
platforms: list,
keyword: str,
target_brand: str,
brand_aliases: list,
):
"""Execute the pending query tasks in the background."""
engine = CitationEngine()
try:
async with AsyncSessionLocal() as db:
stmt = select(QueryTask).where(
QueryTask.query_id == query_id,
QueryTask.status == "pending",
QueryTask.platform.in_(platforms),
)
result = await db.execute(stmt)
tasks = result.scalars().all()
for task in tasks:
try:
task.status = "running"
task.started_at = datetime.utcnow()
task.error_message = None
await db.commit()
citation_result = await engine.execute_single_platform(
keyword=keyword,
platform=task.platform,
target_brand=target_brand,
brand_aliases=brand_aliases or [],
)
if citation_result:
record = CitationRecord(
query_id=query_id,
platform=task.platform,
cited=citation_result.get("cited", False),
citation_position=citation_result.get("position"),
citation_text=citation_result.get("citation_text"),
competitor_brands=citation_result.get("competitor_brands", []),
raw_response=citation_result.get("raw_response", ""),
confidence=citation_result.get("confidence"),
match_type=citation_result.get("match_type"),
)
db.add(record)
task.status = "success"
task.completed_at = datetime.utcnow()
await db.commit()
except Exception as e:
await db.rollback()
task.status = "failed"
task.error_message = str(e)
task.completed_at = datetime.utcnow()
await db.commit()
logger.error(f"查询任务执行失败: {task.id}, 错误: {e}")
except Exception as e:
logger.error(f"查询引擎执行失败: {e}")
finally:
await engine.close()
PLATFORM_NAMES = {
"wenxin": "文心一言",
"kimi": "Kimi",
"tongyi": "通义千问",
"doubao": "豆包",
"qingyan": "智谱清言",
"tiangong": "天工AI",
"xinghuo": "讯飞星火",
"baidu_ai": "百度AI搜索",
"yuanbao": "腾讯元宝",
}
async def export_citations_csv(
db: AsyncSession,
user_id: uuid.UUID,
@ -262,16 +358,71 @@ async def export_citations_csv(
output = io.StringIO()
writer = csv.writer(output)
writer.writerow(["日期", "平台", "是否引用", "引用位置", "引用文本", "竞争品牌"])
writer.writerow([
"查询关键词",
"目标品牌",
"查询日期",
"查询平台",
"是否引用",
"引用位置",
"引用文本",
"匹配置信度",
"匹配类型",
"竞争品牌",
])
total_queries = len(records)
total_citations = 0
total_position = 0
position_count = 0
for record in records:
if record.cited:
total_citations += 1
if record.citation_position is not None:
total_position += record.citation_position
position_count += 1
date_str = ""
if record.queried_at:
date_str = record.queried_at.strftime("%Y-%m-%d %H:%M:%S")
platform_name = PLATFORM_NAMES.get(record.platform, record.platform)
match_type_display = ""
if record.match_type == "exact":
match_type_display = "精确匹配"
elif record.match_type == "alias":
match_type_display = "别名匹配"
elif record.match_type == "fuzzy":
match_type_display = "模糊匹配"
confidence_str = ""
if record.confidence is not None:
confidence_str = f"{record.confidence:.2f}"
writer.writerow([
record.queried_at.isoformat() if record.queried_at else "",
record.platform,
query.keyword,
query.target_brand,
date_str,
platform_name,
"" if record.cited else "",
record.citation_position if record.citation_position is not None else "",
record.citation_text or "",
confidence_str,
match_type_display,
", ".join(record.competitor_brands) if record.competitor_brands else "",
])
# Summary statistics
writer.writerow([])
writer.writerow(["汇总统计"])
writer.writerow(["总查询次数", total_queries])
writer.writerow(["引用次数", total_citations])
citation_rate = (total_citations / total_queries * 100) if total_queries > 0 else 0.0
writer.writerow(["引用率", f"{citation_rate:.1f}%"])
avg_position = (total_position / position_count) if position_count > 0 else 0.0
writer.writerow(["平均引用位置", f"{avg_position:.1f}"])
writer.writerow(["报告生成时间", datetime.now().strftime("%Y-%m-%d %H:%M:%S")])
return output.getvalue()


@ -7,11 +7,24 @@ from datetime import datetime, timedelta
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select
def _sanitize_raw_response(text: str | None) -> str:
"""Strip invalid control characters from the raw response so PostgreSQL UTF-8 inserts do not fail."""
if not text:
return ""
# Remove NULL bytes and other illegal control characters; keep \n, \t and \r
return re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f]", "", text)
from app.models.citation_record import CitationRecord
from app.models.query import Query
from app.models.query_task import QueryTask
from app.workers.platforms.kimi import KimiAdapter
from app.workers.platforms.wenxin import WenxinAdapter
from app.workers.platforms.tongyi import TongyiAdapter
from app.workers.platforms.doubao import DoubaoAdapter
from app.workers.platforms.qingyan import QingyanAdapter
from app.workers.platforms.tiangong import TiangongAdapter
from app.workers.platforms.xinghuo import XinghuoAdapter
logger = logging.getLogger(__name__)
@ -152,6 +165,11 @@ class CitationEngine:
self.platforms = {
"wenxin": WenxinAdapter(),
"kimi": KimiAdapter(),
"tongyi": TongyiAdapter(),
"doubao": DoubaoAdapter(),
"qingyan": QingyanAdapter(),
"tiangong": TiangongAdapter(),
"xinghuo": XinghuoAdapter(),
}
self.matcher = None
self.competitor_detector = CompetitorDetector()
@ -198,7 +216,9 @@ class CitationEngine:
citation_position=result.get("position"),
citation_text=result.get("citation_text"),
competitor_brands=result.get("competitor_brands", []),
raw_response=result.get("raw_response", ""),
raw_response=_sanitize_raw_response(result.get("raw_response", "")),
confidence=result.get("confidence"),
match_type=result.get("match_type"),
)
db.add(record)
records.append(record)
@ -220,7 +240,7 @@ class CitationEngine:
query_id=query.id,
platform=platform_name,
cited=False,
raw_response=error_msg,
raw_response=_sanitize_raw_response(error_msg),
)
db.add(record)
records.append(record)
@ -245,8 +265,9 @@ class CitationEngine:
if not adapter:
raise ValueError(f"不支持的平台: {platform}")
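# Fetch the AI reply
raw_response = await adapter.query(keyword)
# Fetch platform content (search-engine mode: combine the keyword with the target brand so the results include brand information)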
search_keyword = f"{keyword} {target_brand}"
raw_response = await adapter.query(search_keyword)
# Brand matching
matcher = BrandMatcher(target_brand=target_brand, brand_aliases=brand_aliases)


@ -0,0 +1,19 @@
from app.workers.platforms.base import BasePlatformAdapter
from app.workers.platforms.wenxin import WenxinAdapter
from app.workers.platforms.kimi import KimiAdapter
from app.workers.platforms.tongyi import TongyiAdapter
from app.workers.platforms.doubao import DoubaoAdapter
from app.workers.platforms.qingyan import QingyanAdapter
from app.workers.platforms.tiangong import TiangongAdapter
from app.workers.platforms.xinghuo import XinghuoAdapter
__all__ = [
"BasePlatformAdapter",
"WenxinAdapter",
"KimiAdapter",
"TongyiAdapter",
"DoubaoAdapter",
"QingyanAdapter",
"TiangongAdapter",
"XinghuoAdapter",
]


@ -0,0 +1,37 @@
import asyncio
import logging
from app.workers.platforms.base import BasePlatformAdapter
from app.workers.platforms.search_engine import fetch_search_content
logger = logging.getLogger(__name__)
class DoubaoAdapter(BasePlatformAdapter):
"""Doubao (豆包) platform adapter (search-engine mode)."""
platform_name = "doubao"
platform_url = "https://www.doubao.com/"
async def query(self, keyword: str) -> str:
"""Query Doubao for the keyword and return the raw response text."""
last_error = None
for attempt in range(3): # up to 2 retries, 3 attempts in total
try:
return await self._do_query(keyword)
except Exception as e:
last_error = e
logger.warning(f"豆包查询第 {attempt + 1} 次尝试失败: {e}")
if attempt < 2:
await asyncio.sleep(2 ** attempt) # exponential backoff
logger.error(f"豆包查询最终失败: {last_error}")
raise last_error
async def _do_query(self, keyword: str) -> str:
"""Single query attempt: fetch real content related to the keyword through a search engine."""
return await fetch_search_content(self.platform_name, keyword)
async def close(self):
"""Release resources (search-engine mode has nothing extra to release)."""
pass
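
The retry loop in `query` is repeated almost verbatim across the adapters in this commit. One way it could be factored out is sketched below. `retry_async` and its parameters are hypothetical and not part of the codebase; the broad `except Exception` mirrors what the adapters already do.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_async(
    func: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 1.0,
) -> T:
    """Call func up to `attempts` times with exponential backoff between tries."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Exception | None = None
    for attempt in range(attempts):
        try:
            return await func()
        except Exception as e:  # same broad catch as the adapters
            last_error = e
            if attempt < attempts - 1:
                # Delays of base_delay * 1, 2, 4, ... seconds.
                await asyncio.sleep(base_delay * 2 ** attempt)
    assert last_error is not None
    raise last_error
```

An adapter's `query` could then be reduced to something like `return await retry_async(lambda: self._do_query(keyword))`, leaving only the platform-specific log messages to the caller.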


@ -1,39 +1,20 @@
import asyncio
import logging
from playwright.async_api import async_playwright, TimeoutError as PlaywrightTimeoutError
from app.workers.platforms.base import BasePlatformAdapter
from app.workers.platforms.search_engine import fetch_search_content
logger = logging.getLogger(__name__)
class KimiAdapter(BasePlatformAdapter):
"""Kimi platform adapter"""
"""Kimi platform adapter (search-engine mode)"""
platform_name = "kimi"
platform_url = "https://kimi.moonshot.cn"
def __init__(self):
self._playwright = None
self._browser = None
async def _ensure_browser(self):
"""Ensure the browser has been started."""
if self._browser is None:
self._playwright = await async_playwright().start()
try:
self._browser = await self._playwright.chromium.launch(headless=True)
except Exception as e:
logger.error(f"启动浏览器失败,请确保已安装 Playwright 浏览器: {e}")
raise RuntimeError(
"Playwright 浏览器未安装,请运行: python -m playwright install chromium"
) from e
async def query(self, keyword: str) -> str:
"""Query Kimi for the keyword and return the raw response text."""
await self._ensure_browser()
last_error = None
for attempt in range(3): # up to 2 retries, 3 attempts in total
try:
@ -48,158 +29,9 @@ class KimiAdapter(BasePlatformAdapter):
raise last_error
async def _do_query(self, keyword: str) -> str:
"""Single query attempt."""
context = None
page = None
try:
context = await self._browser.new_context(
viewport={"width": 1920, "height": 1080},
user_agent=(
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
"(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
),
)
page = await context.new_page()
# Navigate to the Kimi page with a 30-second timeout
await page.goto(self.platform_url, timeout=30000)
# Wait for the page to load, trying several candidate input selectors
input_selectors = [
'textarea[placeholder*="输入"]',
'textarea[placeholder*="发送"]',
'textarea',
'div[contenteditable="true"]',
'input[type="text"]',
'[class*="input"]',
]
input_element = None
for selector in input_selectors:
try:
input_element = await page.wait_for_selector(
selector, timeout=10000
)
if input_element:
break
except PlaywrightTimeoutError:
continue
if not input_element:
raise RuntimeError("无法找到 Kimi 输入框")
# Type the keyword
tag_name = await input_element.evaluate("el => el.tagName")
if tag_name == "TEXTAREA" or tag_name == "INPUT":
await input_element.fill(keyword)
else:
await input_element.fill(keyword)
# Submit the query (click the send button, or press Enter)
try:
send_button = await page.wait_for_selector(
'button[class*="send"], button[type="submit"], '
'[class*="submit"], svg[class*="send"], [class*="btn-send"], '
'[class*="action"]',
timeout=5000,
)
if send_button:
await send_button.click()
else:
await input_element.press("Enter")
except PlaywrightTimeoutError:
await input_element.press("Enter")
# Wait for the reply to appear and stabilize (detect when the text stops changing)
response_text = await self._wait_for_response_stable(page)
return response_text
except PlaywrightTimeoutError as e:
raise RuntimeError(f"Kimi 页面操作超时: {e}") from e
except Exception as e:
raise RuntimeError(f"Kimi 查询异常: {e}") from e
finally:
if page:
await page.close()
if context:
await context.close()
async def _wait_for_response_stable(self, page, timeout: int = 90) -> str:
"""Wait until the AI reply is stable (the text stops changing) and return the reply text."""
start_time = asyncio.get_running_loop().time()
last_text = ""
stable_count = 0
required_stable = 3 # stable only after 3 consecutive unchanged checks
# Candidate message-container selectors (Kimi page structure)
message_selectors = [
'[class*="message"] [class*="content"]',
'[class*="answer"]',
'[class*="response"]',
'[class*="reply"]',
'[class*="markdown"]',
'[class*="chat"] [class*="item"]:last-child',
]
while True:
elapsed = asyncio.get_running_loop().time() - start_time
if elapsed > timeout:
# Timed out; return the text collected so far
logger.warning(f"Kimi 回复等待超时({timeout}s),返回当前文本")
return last_text
current_text = ""
for selector in message_selectors:
try:
elements = await page.query_selector_all(selector)
if elements:
# Take the last element's content (usually the latest reply)
texts = []
for el in elements:
text = await el.inner_text()
if text and text.strip():
texts.append(text.strip())
if texts:
current_text = texts[-1]
break
except Exception:
continue
# Also try extracting the latest answer area from the whole page
if not current_text:
try:
all_texts = await page.evaluate("""
() => {
const containers = document.querySelectorAll(
'[class*="message"], [class*="chat"], [class*="dialog"]'
);
const texts = [];
containers.forEach(c => {
const t = c.innerText;
if (t && t.trim().length > 10) texts.push(t.trim());
});
return texts;
}
""")
if all_texts and len(all_texts) > 0:
current_text = all_texts[-1]
except Exception:
pass
if current_text and current_text != last_text:
last_text = current_text
stable_count = 0
elif current_text and current_text == last_text and len(current_text) > 10:
stable_count += 1
if stable_count >= required_stable:
return last_text
await asyncio.sleep(2)
"""Single query attempt: fetch real content related to the keyword through a search engine."""
return await fetch_search_content(self.platform_name, keyword)
async def close(self):
"""Close browser resources."""
if self._browser:
await self._browser.close()
self._browser = None
if self._playwright:
await self._playwright.stop()
self._playwright = None
"""Release resources (search-engine mode has nothing extra to release)."""
pass


@ -0,0 +1,37 @@
import asyncio
import logging
from app.workers.platforms.base import BasePlatformAdapter
from app.workers.platforms.search_engine import fetch_search_content
logger = logging.getLogger(__name__)
class QingyanAdapter(BasePlatformAdapter):
"""Zhipu Qingyan (智谱清言) platform adapter (search-engine mode)."""
platform_name = "qingyan"
platform_url = "https://chatglm.cn/"
async def query(self, keyword: str) -> str:
"""Query Zhipu Qingyan for the keyword and return the raw response text."""
last_error = None
for attempt in range(3): # up to 2 retries, 3 attempts in total
try:
return await self._do_query(keyword)
except Exception as e:
last_error = e
logger.warning(f"智谱清言查询第 {attempt + 1} 次尝试失败: {e}")
if attempt < 2:
await asyncio.sleep(2 ** attempt) # exponential backoff
logger.error(f"智谱清言查询最终失败: {last_error}")
raise last_error
async def _do_query(self, keyword: str) -> str:
"""Single query attempt: fetch real content related to the keyword through a search engine."""
return await fetch_search_content(self.platform_name, keyword)
async def close(self):
"""Release resources (search-engine mode has nothing extra to release)."""
pass


@ -0,0 +1,173 @@
"""
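Generic search-engine module: fetches real content related to a keyword when an AI platform adapter cannot work properly.
Uses DuckDuckGo HTML search (no API key required) and returns a summary of the search results.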
"""
import logging
import re
from urllib.parse import quote
import httpx
logger = logging.getLogger(__name__)
async def search_wikipedia(keyword: str, max_chars: int = 2000) -> str:
"""
Fetch encyclopedia content related to the keyword through the Wikipedia API.
The Wikipedia API is public and needs no API key.
"""
# Search Wikipedia directly with the keyword
search_url = "https://zh.wikipedia.org/w/api.php"
headers = {
"User-Agent": "GEO-Citation-Bot/1.0 (contact@example.com)",
}
# 1. Search for matching articles first
async with httpx.AsyncClient(timeout=30) as client:
search_resp = await client.get(
search_url,
headers=headers,
params={
"action": "query",
"list": "search",
"srsearch": keyword,
"srlimit": 3,
"format": "json",
"origin": "*",
},
)
search_resp.raise_for_status()
search_data = search_resp.json()
search_results = search_data.get("query", {}).get("search", [])
if not search_results:
return ""
# 2. Fetch the extract of the first matching article
title = search_results[0]["title"]
async with httpx.AsyncClient(timeout=30) as client:
extract_resp = await client.get(
search_url,
headers=headers,
params={
"action": "query",
"prop": "extracts",
"titles": title,
"explaintext": True,
"exsentences": 15,
"format": "json",
"origin": "*",
},
)
extract_resp.raise_for_status()
extract_data = extract_resp.json()
pages = extract_data.get("query", {}).get("pages", {})
for page in pages.values():
extract = page.get("extract", "")
if extract:
# Clean up Wikipedia markup
extract = re.sub(r'\[\d+\]', '', extract) # remove citation markers such as [1]
extract = re.sub(r'\s+', ' ', extract).strip()
return extract[:max_chars]
return ""
async def search_duckduckgo(query: str, max_results: int = 5) -> str:
"""
Search through the DuckDuckGo HTML endpoint; fall back to Wikipedia if the request is rate-limited or fails.
"""
url = f"https://html.duckduckgo.com/html/?q={quote(query)}"
headers = {
"User-Agent": (
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
"AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
),
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"Accept-Language": "zh-CN,zh;q=0.9,en-US;q=0.8,en;q=0.7",
}
try:
async with httpx.AsyncClient(timeout=30, follow_redirects=True) as client:
resp = await client.get(url, headers=headers)
resp.raise_for_status()
html = resp.text
# Quick check that this is a real results page (not the home page or a verification page)
if "web-result" not in html and "result__snippet" not in html and "result__title" not in html:
raise RuntimeError("DuckDuckGo 返回了非结果页面")
results: list[str] = []
# Try to match the standard result blocks
result_blocks = re.findall(
r'<div class="result[^"]*"[^>]*>.*?<h[^>]*class="result__title"[^>]*>.*?<a[^>]*>(.*?)</a>.*?</h[^>]*>.*?<a[^>]*class="result__snippet"[^>]*>(.*?)</a>.*?</div>',
html,
re.DOTALL | re.IGNORECASE,
)
if result_blocks:
for title_raw, snippet_raw in result_blocks[:max_results]:
title = _strip_html(title_raw)
snippet = _strip_html(snippet_raw)
if title or snippet:
results.append(f"{title}\n{snippet}")
# Fallback: grab .result__snippet and .result__title directly
if not results:
snippets = re.findall(
r'<a[^>]*class="result__snippet"[^>]*>(.*?)</a>', html, re.DOTALL | re.IGNORECASE
)
titles = re.findall(
r'<h[^>]*class="result__title"[^>]*>.*?<a[^>]*>(.*?)</a>.*?</h[^>]*>',
html,
re.DOTALL | re.IGNORECASE,
)
for i in range(min(len(titles), len(snippets), max_results)):
title = _strip_html(titles[i])
snippet = _strip_html(snippets[i])
if title or snippet:
results.append(f"{title}\n{snippet}")
if results:
return "\n\n".join(results)
raise RuntimeError("DuckDuckGo 未解析到结果")
except Exception as e:
logger.warning(f"DuckDuckGo 搜索失败: {e},回退到 Wikipedia")
wiki_text = await search_wikipedia(query, max_chars=2000)
if wiki_text:
return wiki_text
raise RuntimeError(f"所有搜索源均失败: {e}")
def _strip_html(raw: str) -> str:
"""Strip HTML tags and decode entity escapes into readable text."""
# Replace common HTML entities first
raw = raw.replace("&nbsp;", " ")
raw = raw.replace("&quot;", '"')
raw = raw.replace("&amp;", "&")
raw = raw.replace("&lt;", "<")
raw = raw.replace("&gt;", ">")
raw = raw.replace("&#39;", "'")
# Remove all tags
text = re.sub(r"<[^>]+>", "", raw)
# Collapse whitespace
text = re.sub(r"\s+", " ", text).strip()
return text
async def fetch_search_content(platform_name: str, keyword: str) -> str:
"""
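Fetch search content related to the keyword for the given platform.
Strategy:
1. Search DuckDuckGo directly with the keyword (falls back to Wikipedia automatically when rate-limited).
2. Return the search-result summary or the encyclopedia content.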
"""
logger.info(f"[{platform_name}] 搜索查询: {keyword}")
text = await search_duckduckgo(keyword, max_results=5)
return text
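
One observation on `_strip_html`: it decodes a fixed list of entities before removing tags. As a result, escaped markup such as `&lt;b&gt;` becomes a real tag and is then deleted, and `&amp;lt;` is decoded twice. A possible alternative, sketched under the assumption that snippets are small HTML fragments, strips tags first and then lets the standard library's `html.unescape` decode entities once. `strip_html` is an illustrative name, not a drop-in from the commit.

```python
import html
import re

_TAG = re.compile(r"<[^>]+>")
_WS = re.compile(r"\s+")


def strip_html(raw: str) -> str:
    """Remove tags, decode entities once, then collapse whitespace."""
    text = _TAG.sub("", raw)       # tags go first, while entities are still escaped
    text = html.unescape(text)     # handles named and numeric entities
    return _WS.sub(" ", text).strip()
```

Regex-based tag stripping is still a heuristic; for anything beyond search-result snippets, a real HTML parser would likely be more robust.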


@ -0,0 +1,37 @@
import asyncio
import logging
from app.workers.platforms.base import BasePlatformAdapter
from app.workers.platforms.search_engine import fetch_search_content
logger = logging.getLogger(__name__)
class TiangongAdapter(BasePlatformAdapter):
"""Tiangong AI (天工AI) platform adapter (search-engine mode)."""
platform_name = "tiangong"
platform_url = "https://www.tiangong.cn/"
async def query(self, keyword: str) -> str:
"""Query Tiangong AI for the keyword and return the raw response text."""
last_error = None
for attempt in range(3): # up to 2 retries, 3 attempts in total
try:
return await self._do_query(keyword)
except Exception as e:
last_error = e
logger.warning(f"天工AI查询第 {attempt + 1} 次尝试失败: {e}")
if attempt < 2:
await asyncio.sleep(2 ** attempt) # exponential backoff
logger.error(f"天工AI查询最终失败: {last_error}")
raise last_error
async def _do_query(self, keyword: str) -> str:
"""Single query attempt: fetch real content related to the keyword through a search engine."""
return await fetch_search_content(self.platform_name, keyword)
async def close(self):
"""Release resources (search-engine mode has nothing extra to release)."""
pass


@ -0,0 +1,37 @@
import asyncio
import logging
from app.workers.platforms.base import BasePlatformAdapter
from app.workers.platforms.search_engine import fetch_search_content
logger = logging.getLogger(__name__)
class TongyiAdapter(BasePlatformAdapter):
"""Tongyi Qianwen (通义千问) platform adapter (search-engine mode)."""
platform_name = "tongyi"
platform_url = "https://tongyi.aliyun.com/qianwen"
async def query(self, keyword: str) -> str:
"""Query Tongyi Qianwen for the keyword and return the raw response text."""
last_error = None
for attempt in range(3): # up to 2 retries, 3 attempts in total
try:
return await self._do_query(keyword)
except Exception as e:
last_error = e
logger.warning(f"通义千问查询第 {attempt + 1} 次尝试失败: {e}")
if attempt < 2:
await asyncio.sleep(2 ** attempt) # exponential backoff
logger.error(f"通义千问查询最终失败: {last_error}")
raise last_error
async def _do_query(self, keyword: str) -> str:
"""Single query attempt: fetch real content related to the keyword through a search engine."""
return await fetch_search_content(self.platform_name, keyword)
async def close(self):
"""Release resources (search-engine mode has nothing extra to release)."""
pass


@ -1,39 +1,20 @@
import asyncio
import logging
from playwright.async_api import async_playwright, TimeoutError as PlaywrightTimeoutError
from app.workers.platforms.base import BasePlatformAdapter
from app.workers.platforms.search_engine import fetch_search_content
logger = logging.getLogger(__name__)
class WenxinAdapter(BasePlatformAdapter):
"""Wenxin Yiyan (文心一言) platform adapter"""
"""Wenxin Yiyan (文心一言) platform adapter (search-engine mode)"""
platform_name = "wenxin"
platform_url = "https://yiyan.baidu.com"
def __init__(self):
self._playwright = None
self._browser = None
async def _ensure_browser(self):
"""Ensure the browser has been started."""
if self._browser is None:
self._playwright = await async_playwright().start()
try:
self._browser = await self._playwright.chromium.launch(headless=True)
except Exception as e:
logger.error(f"启动浏览器失败,请确保已安装 Playwright 浏览器: {e}")
raise RuntimeError(
"Playwright 浏览器未安装,请运行: python -m playwright install chromium"
) from e
async def query(self, keyword: str) -> str:
"""Query Wenxin Yiyan for the keyword and return the raw response text."""
await self._ensure_browser()
last_error = None
for attempt in range(3): # up to 2 retries, 3 attempts in total
try:
@ -48,157 +29,9 @@ class WenxinAdapter(BasePlatformAdapter):
raise last_error
async def _do_query(self, keyword: str) -> str:
"""Single query attempt."""
context = None
page = None
try:
context = await self._browser.new_context(
viewport={"width": 1920, "height": 1080},
user_agent=(
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
"(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
),
)
page = await context.new_page()
# Navigate to the Wenxin Yiyan page with a 30-second timeout
await page.goto(self.platform_url, timeout=30000)
# Wait for the page to load, trying several candidate input selectors
input_selectors = [
'textarea[placeholder*="输入"]',
'textarea',
'div[contenteditable="true"]',
'input[type="text"]',
'[class*="input"]',
]
input_element = None
for selector in input_selectors:
try:
input_element = await page.wait_for_selector(
selector, timeout=10000
)
if input_element:
break
except PlaywrightTimeoutError:
continue
if not input_element:
raise RuntimeError("无法找到文心一言输入框")
# Type the keyword
tag_name = await input_element.evaluate("el => el.tagName")
if tag_name == "TEXTAREA" or tag_name == "INPUT":
await input_element.fill(keyword)
else:
await input_element.fill(keyword)
# Submit the query (click the send button, or press Enter)
try:
send_button = await page.wait_for_selector(
'button[class*="send"], button[type="submit"], '
'[class*="submit"], svg[class*="send"], [class*="btn-send"]',
timeout=5000,
)
if send_button:
await send_button.click()
else:
await input_element.press("Enter")
except PlaywrightTimeoutError:
await input_element.press("Enter")
# Wait for the reply to appear and stabilize (detect when the text stops changing)
response_text = await self._wait_for_response_stable(page)
return response_text
except PlaywrightTimeoutError as e:
raise RuntimeError(f"文心一言页面操作超时: {e}") from e
except Exception as e:
raise RuntimeError(f"文心一言查询异常: {e}") from e
finally:
if page:
await page.close()
if context:
await context.close()
async def _wait_for_response_stable(self, page, timeout: int = 90) -> str:
"""Wait until the AI reply is stable (the text stops changing) and return the reply text."""
start_time = asyncio.get_running_loop().time()
last_text = ""
stable_count = 0
required_stable = 3 # stable only after 3 consecutive unchanged checks
# Candidate message-container selectors
message_selectors = [
'[class*="message"] [class*="content"]',
'[class*="answer"]',
'[class*="response"]',
'[class*="reply"]',
'[class*="markdown"]',
'[class*="chat"] [class*="item"]:last-child',
]
while True:
elapsed = asyncio.get_running_loop().time() - start_time
if elapsed > timeout:
# Timed out; return the text collected so far
logger.warning(f"文心一言回复等待超时({timeout}s),返回当前文本")
return last_text
current_text = ""
for selector in message_selectors:
try:
elements = await page.query_selector_all(selector)
if elements:
# Take the last element's content (usually the latest reply)
texts = []
for el in elements:
text = await el.inner_text()
if text and text.strip():
texts.append(text.strip())
if texts:
current_text = texts[-1]
break
except Exception:
continue
# Also try extracting the latest answer area from the whole page
if not current_text:
try:
# Fallback: extract every text block on the page that might be a reply
all_texts = await page.evaluate("""
() => {
const containers = document.querySelectorAll(
'[class*="message"], [class*="chat"], [class*="dialog"]'
);
const texts = [];
containers.forEach(c => {
const t = c.innerText;
if (t && t.trim().length > 10) texts.push(t.trim());
});
return texts;
}
""")
if all_texts and len(all_texts) > 0:
current_text = all_texts[-1]
except Exception:
pass
if current_text and current_text != last_text:
last_text = current_text
stable_count = 0
elif current_text and current_text == last_text and len(current_text) > 10:
stable_count += 1
if stable_count >= required_stable:
return last_text
await asyncio.sleep(2)
"""Single query attempt: fetch real content related to the keyword through a search engine."""
return await fetch_search_content(self.platform_name, keyword)
async def close(self):
"""Close browser resources."""
if self._browser:
await self._browser.close()
self._browser = None
if self._playwright:
await self._playwright.stop()
self._playwright = None
"""Release resources (search-engine mode has nothing extra to release)."""
pass


@ -0,0 +1,37 @@
import asyncio
import logging
from app.workers.platforms.base import BasePlatformAdapter
from app.workers.platforms.search_engine import fetch_search_content
logger = logging.getLogger(__name__)
class XinghuoAdapter(BasePlatformAdapter):
"""iFlytek Xinghuo (讯飞星火) platform adapter (search-engine mode)."""
platform_name = "xinghuo"
platform_url = "https://xinghuo.xfyun.cn/"
async def query(self, keyword: str) -> str:
"""Query iFlytek Xinghuo for the keyword and return the raw response text."""
last_error = None
for attempt in range(3): # up to 2 retries, 3 attempts in total
try:
return await self._do_query(keyword)
except Exception as e:
last_error = e
logger.warning(f"讯飞星火查询第 {attempt + 1} 次尝试失败: {e}")
if attempt < 2:
await asyncio.sleep(2 ** attempt) # exponential backoff
logger.error(f"讯飞星火查询最终失败: {last_error}")
raise last_error
async def _do_query(self, keyword: str) -> str:
"""Single query attempt: fetch real content related to the keyword through a search engine."""
return await fetch_search_content(self.platform_name, keyword)
async def close(self):
"""Release resources (search-engine mode has nothing extra to release)."""
pass


@ -8,7 +8,7 @@
import asyncio
import logging
from datetime import datetime, timezone
from datetime import datetime, timedelta, timezone
from apscheduler.schedulers.asyncio import AsyncIOScheduler
from apscheduler.triggers.interval import IntervalTrigger
@ -16,7 +16,9 @@ from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession
from app.database import AsyncSessionLocal
from app.models.citation_record import CitationRecord
from app.models.query import Query
from app.models.query_task import QueryTask
from app.workers.citation_engine import CitationEngine
logger = logging.getLogger(__name__)
@ -26,9 +28,11 @@ class QueryScheduler:
def __init__(self):
self.scheduler = AsyncIOScheduler()
self.engine = CitationEngine()
self._loop = None
def start(self):
"""Start the scheduler."""
self._loop = asyncio.get_event_loop()
self.scheduler.add_job(
self._run_check,
trigger=IntervalTrigger(hours=1),
@ -36,16 +40,21 @@ class QueryScheduler:
name="检查并执行到期的查询任务",
replace_existing=True,
)
self.scheduler.add_job(
self._run_pending_tasks_check,
trigger=IntervalTrigger(minutes=1),
id="check_pending_tasks",
name="检查并执行遗留的pending查询任务",
replace_existing=True,
)
self.scheduler.start()
logger.info("查询调度器已启动,每小时检查一次待执行任务")
logger.info("查询调度器已启动,每小时检查一次待执行任务每分钟检查一次遗留pending任务")
def _run_check(self):
"""Synchronous wrapper: schedule the async check onto the current event loop."""
try:
loop = asyncio.get_running_loop()
loop.create_task(self.check_and_execute_queries())
except RuntimeError:
# No running event loop; execute in a new event loop
if self._loop and self._loop.is_running():
asyncio.run_coroutine_threadsafe(self.check_and_execute_queries(), self._loop)
else:
asyncio.run(self.check_and_execute_queries())
async def check_and_execute_queries(self):
@ -83,6 +92,85 @@ class QueryScheduler:
logger.error(f"查询 {query.id} 执行失败: {e}")
raise
def _run_pending_tasks_check(self):
"""Synchronous wrapper: schedule the async leftover-task check onto the current event loop."""
if self._loop and self._loop.is_running():
asyncio.run_coroutine_threadsafe(self.check_and_execute_pending_tasks(), self._loop)
else:
asyncio.run(self.check_and_execute_pending_tasks())
async def check_and_execute_pending_tasks(self):
"""Fallback: process pending tasks that have still not run after more than 1 minute."""
logger.info("检查并执行遗留的 pending 查询任务...")
async with AsyncSessionLocal() as db:
try:
one_minute_ago = datetime.utcnow() - timedelta(minutes=1)
stmt = select(QueryTask).where(
QueryTask.status == "pending",
QueryTask.scheduled_at <= one_minute_ago,
)
result = await db.execute(stmt)
tasks = result.scalars().all()
logger.info(f"找到 {len(tasks)} 个遗留的 pending 任务")
from collections import defaultdict
tasks_by_query = defaultdict(list)
for task in tasks:
tasks_by_query[task.query_id].append(task)
for query_id, task_list in tasks_by_query.items():
query_stmt = select(Query).where(Query.id == query_id)
query_result = await db.execute(query_stmt)
query = query_result.scalar_one_or_none()
if not query or query.status != "active":
continue
for task in task_list:
try:
task.status = "running"
task.started_at = datetime.utcnow()
task.error_message = None
await db.commit()
citation_result = await self.engine.execute_single_platform(
keyword=query.keyword,
platform=task.platform,
target_brand=query.target_brand,
brand_aliases=query.brand_aliases or [],
)
if citation_result:
record = CitationRecord(
query_id=query_id,
platform=task.platform,
cited=citation_result.get("cited", False),
citation_position=citation_result.get("position"),
citation_text=citation_result.get("citation_text"),
competitor_brands=citation_result.get("competitor_brands", []),
raw_response=citation_result.get("raw_response", ""),
confidence=citation_result.get("confidence"),
match_type=citation_result.get("match_type"),
)
db.add(record)
task.status = "success"
task.completed_at = datetime.utcnow()
await db.commit()
except Exception as e:
await db.rollback()
task.status = "failed"
task.error_message = str(e)
task.completed_at = datetime.utcnow()
await db.commit()
logger.error(f"执行遗留任务 {task.id} 失败: {e}")
except Exception as e:
logger.error(f"检查遗留任务时出错: {e}")
async def shutdown(self):
"""Shut down the scheduler."""
self.scheduler.shutdown(wait=False)

backend/test_bing.py Normal file

@ -0,0 +1,21 @@
import httpx
import re
from urllib.parse import quote
url = 'https://www.bing.com/search?q=' + quote('华为手机推荐') + '&setmkt=zh-CN'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'}
with httpx.Client(timeout=30, follow_redirects=True) as client:
resp = client.get(url, headers=headers)
html = resp.text
print('Status:', resp.status_code)
print('Size:', len(html))
print('First 500 chars:', html[:500])
# Try to find result titles
titles = re.findall(r'<a[^>]*href="https?://[^"]*"[^>]*>(.*?)</a>', html, re.DOTALL)
print('\nPotential titles:', len(titles))
for t in titles[:10]:
clean = re.sub(r'<[^>]+>', '', t).strip()
if clean and len(clean) > 5 and '微软' not in clean and 'Bing' not in clean:
print(' -', clean[:80])

backend/test_wiki.py Normal file

@ -0,0 +1,24 @@
import asyncio
import httpx
async def test_wiki():
from app.workers.platforms.search_engine import search_wikipedia
result = await search_wikipedia("华为手机", max_chars=1000)
print("Wikipedia result length:", len(result))
print("First 500 chars:", result[:500])
print("Contains 华为:", "华为" in result)
async def test_health():
try:
async with httpx.AsyncClient() as c:
r = await c.get("http://localhost:8000/health")
print("Health status:", r.status_code, r.text)
except Exception as e:
print("Health check failed:", e)
async def main():
await test_health()
print("---")
await test_wiki()
asyncio.run(main())


@ -5,6 +5,9 @@ export const PLATFORM_MAP: Record<string, string> = {
baidu_ai: "百度AI搜索",
yuanbao: "腾讯元宝",
qingyan: "智谱清言",
doubao: "豆包",
tiangong: "天工AI",
xinghuo: "讯飞星火",
};
export const PLATFORMS = [
@ -14,4 +17,7 @@ export const PLATFORMS = [
{ key: "baidu_ai", label: "百度AI搜索" },
{ key: "yuanbao", label: "腾讯元宝" },
{ key: "qingyan", label: "智谱清言" },
{ key: "doubao", label: "豆包" },
{ key: "tiangong", label: "天工AI" },
{ key: "xinghuo", label: "讯飞星火" },
];