@@ -0,0 +1,644 @@
+# Sample Center External API Documentation
+
+## 1. Overview
+
+### 1.1 Purpose
+This document defines the API specification that the Sample Center exposes to external systems (such as the collection system) for integration.
+
+### 1.2 Basic Information
+| Item | Description |
+|------|------|
+| Base URL | `https://{host}/api/v1` |
+| Data format | JSON |
+| Character encoding | UTF-8 |
+| Authentication | Bearer Token |
+
+---
+
+## 2. Authentication and Authorization
+
+### 2.1 Token Design
+
+The API uses an **API Key + Secret signing mechanism**. External systems obtain an access token as follows:
+
+#### 2.1.1 Credential Allocation
+After registering with the Sample Center, each integrating system is assigned a pair of credentials:
+- `app_id`: unique application identifier
+- `app_secret`: application secret (shown only once at initialization; store it securely)
+
+#### 2.1.2 Obtaining an Access Token
+
+**POST /api/v1/auth/token**
+
+| Item | Description |
+|------|------|
+| Content-Type | application/json |
+
+**Request parameters:**
+
+| Parameter | Type | Required | Description |
+|--------|------|------|------|
+| app_id | string | Yes | Application identifier |
+| app_secret | string | Yes | Application secret |
+
+**Request example:**
+```json
+{
+  "app_id": "collect_system_001",
+  "app_secret": "sk_xxxxxxxxxxxxxxxx"
+}
+```
+
+**Response parameters:**
+
+| Parameter | Type | Description |
+|--------|------|------|
+| access_token | string | Access token |
+| expires_in | integer | Lifetime in seconds; default 7200 |
+| token_type | string | Token type; always "Bearer" |
+
+**Response example:**
+```json
+{
+  "code": "000000",
+  "message": "success",
+  "data": {
+    "access_token": "eyJhbGciOiJIUzI1NiIs...",
+    "expires_in": 7200,
+    "token_type": "Bearer"
+  }
+}
+```
+
+#### 2.1.3 Using the Token
+Every business endpoint requires the token in the HTTP headers:
+```
+Authorization: Bearer {access_token}
+X-App-Id: {app_id}
+```
+
+#### 2.1.4 Token Management Rules
+- A token is valid for 2 hours; after it expires, a new one must be requested
+- Multiple tokens may coexist; obtaining a new token does not invalidate existing ones
+- A single app_id may hold at most 3 valid tokens at a time
+- A revoked token becomes invalid immediately
+
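As a rough client-side illustration of sections 2.1.2 and 2.1.3, the sketch below builds the token request and the headers that business endpoints expect, using only the Python standard library. The host `samples.example.com` and the helper names `build_token_request` and `auth_headers` are assumptions made for this example; they are not part of the specification.

```python
import json
import urllib.request


def build_token_request(host: str, app_id: str, app_secret: str) -> urllib.request.Request:
    """Build (but do not send) the POST /api/v1/auth/token request."""
    body = json.dumps({"app_id": app_id, "app_secret": app_secret}).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{host}/api/v1/auth/token",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def auth_headers(access_token: str, app_id: str) -> dict:
    """Headers carried on every business request, per section 2.1.3."""
    return {"Authorization": f"Bearer {access_token}", "X-App-Id": app_id}


# Hypothetical host and the placeholder credentials from the request example.
req = build_token_request("samples.example.com", "collect_system_001", "sk_xxxxxxxxxxxxxxxx")
# Sending is left to the caller, e.g. urllib.request.urlopen(req, timeout=10).
```

Because a token lasts 7200 seconds and an app_id may hold at most 3 valid tokens, a client would normally cache the token and request a new one only shortly before expiry.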
+---
+
+## 3. Unified Response Format
+
+All endpoints return the same JSON structure:
+
+```json
+{
+  "code": "000000",
+  "message": "success",
+  "data": { ... },
+  "request_id": "req_xxxxxxxxxxxxxxxx"
+}
+```
+
+| Parameter | Type | Description |
+|--------|------|------|
+| code | string | Status code; "000000" means success, any other value means failure |
+| message | string | Human-readable message |
+| data | object/array/null | Business data |
+| request_id | string | Request tracing ID |
+
+### 3.1 Error Codes
+
+| Error code | Description |
+|--------|------|
+| 000000 | Success |
+| 000400 | Invalid request parameters |
+| 000401 | Authentication failed (token invalid or expired) |
+| 000403 | Insufficient permissions |
+| 000404 | Resource not found |
+| 000429 | Rate limit exceeded |
+| 000500 | Internal server error |
+| 001001 | Knowledge base does not exist |
+| 001002 | Knowledge base is not enabled |
+| 002001 | Batch import parameter validation failed |
+| 002002 | Batch import task does not exist |
+| 002003 | Batch import task data format error |
+
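A client will usually want one place that checks `code` before touching `data`. The following is a minimal sketch of such a helper; `ApiError` and `unwrap` are hypothetical names introduced here, not part of the API.

```python
import json


class ApiError(Exception):
    """Raised when the envelope's code is not "000000" (hypothetical helper type)."""

    def __init__(self, code, message, request_id=None):
        super().__init__(f"{code}: {message} (request_id={request_id})")
        self.code = code
        self.message = message
        self.request_id = request_id


def unwrap(raw: str):
    """Return the `data` field of a successful response, or raise ApiError."""
    envelope = json.loads(raw)
    if envelope.get("code") != "000000":
        raise ApiError(envelope.get("code"), envelope.get("message"), envelope.get("request_id"))
    return envelope.get("data")


ok = unwrap('{"code": "000000", "message": "success", "data": {"total": 10}, "request_id": "req_1"}')
```

Keeping `request_id` on the exception makes it easy to quote the tracing ID when reporting a failure to the Sample Center.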
+---
+
+## 4. Endpoint Details
+
+### 4.1 Knowledge Base List Query
+
+**GET /api/v1/knowledge-bases**
+
+Returns the enabled knowledge bases that the current application is authorized to access.
+
+**Request parameters (Query):**
+
+| Parameter | Type | Required | Description |
+|--------|------|------|------|
+| page | integer | No | Page number; default 1 |
+| page_size | integer | No | Items per page; default 20, maximum 100 |
+
+**Response example:**
+
+```json
+{
+  "code": "000000",
+  "message": "success",
+  "data": {
+    "total": 10,
+    "page": 1,
+    "page_size": 20,
+    "items": [
+      {
+        "id": "5138215886300893584",
+        "name": "Construction Engineering Knowledge Base",
+        "parent_table": "kb_parent_table_001",
+        "child_table": "kb_child_table_001",
+        "document_count": 1520,
+        "status": 1,
+        "created_at": "2026-05-10T10:30:00Z",
+        "created_by": "admin",
+        "metadata_schema": [
+          {
+            "field_name_cn": "文档编号",
+            "field_name_en": "doc_number",
+            "field_type": "string",
+            "description": "Unique document number"
+          },
+          {
+            "field_name_cn": "发布日期",
+            "field_name_en": "publish_date",
+            "field_type": "date",
+            "description": "Document publication date"
+          }
+        ]
+      }
+    ]
+  },
+  "request_id": "req_xxxxxxxxxxxxxxxx"
+}
+```
+
+**List item fields:**
+
+| Parameter | Type | Description |
+|--------|------|------|
+| id | string | Knowledge base ID |
+| name | string | Knowledge base name |
+| parent_table | string | Parent collection table name |
+| child_table | string | Child collection table name |
+| document_count | integer | Number of documents |
+| status | integer | Status: 1 = enabled, 0 = disabled |
+| created_at | string | Creation time, ISO 8601 format |
+| created_by | string | Creator |
+| metadata_schema | array | List of metadata dictionary definitions |
+
+**metadata_schema fields:**
+
+| Parameter | Type | Description |
+|--------|------|------|
+| field_name_cn | string | Field name in Chinese |
+| field_name_en | string | Field name in English |
+| field_type | string | Field type: string/integer/float/date/boolean/array |
+| description | string | Field description |
+
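The `page`, `page_size`, and `total` fields are enough to walk the full list. The sketch below shows one way to do that; the actual HTTP call is injected as `fetch_page`, and both `iter_knowledge_bases` and `fake_fetch` are hypothetical names used only for illustration.

```python
from typing import Callable, Iterator


def iter_knowledge_bases(fetch_page: Callable[[int, int], dict], page_size: int = 20) -> Iterator[dict]:
    """Yield every list item by walking pages until `total` items have been seen."""
    page_size = max(1, min(page_size, 100))  # the documented maximum is 100
    page, seen = 1, 0
    while True:
        data = fetch_page(page, page_size)  # expected to return the `data` object
        items = data.get("items", [])
        yield from items
        seen += len(items)
        if not items or seen >= data.get("total", 0):
            break
        page += 1


def fake_fetch(page: int, page_size: int) -> dict:
    """Stand-in for GET /api/v1/knowledge-bases, serving five fake items."""
    all_items = [{"id": str(i)} for i in range(5)]
    start = (page - 1) * page_size
    return {"total": len(all_items), "page": page, "page_size": page_size,
            "items": all_items[start:start + page_size]}


ids = [kb["id"] for kb in iter_knowledge_bases(fake_fetch, page_size=2)]
```

Stopping on an empty page as well as on `total` guards against looping forever if the total changes while paging.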
+---
+
+### 4.2 Knowledge Base Detail Query
+
+**GET /api/v1/knowledge-bases/{id}**
+
+Returns the details of the specified knowledge base, including its metadata dictionary definitions.
+
+**Path parameters:**
+
+| Parameter | Type | Required | Description |
+|--------|------|------|------|
+| id | string | Yes | Knowledge base ID |
+
+**Response example:**
+
+```json
+{
+  "code": "000000",
+  "message": "success",
+  "data": {
+    "id": "5138215886300893584",
+    "name": "Construction Engineering Knowledge Base",
+    "description": "Regulatory documents related to construction engineering",
+    "parent_table": "kb_parent_table_001",
+    "child_table": "kb_child_table_001",
+    "document_count": 1520,
+    "status": 1,
+    "created_at": "2026-05-10T10:30:00Z",
+    "created_by": "admin",
+    "updated_at": "2026-05-15T14:20:00Z",
+    "metadata_schema": [
+      {
+        "field_name_cn": "文档编号",
+        "field_name_en": "doc_number",
+        "field_type": "string",
+        "description": "Unique document number"
+      },
+      {
+        "field_name_cn": "发布日期",
+        "field_name_en": "publish_date",
+        "field_type": "date",
+        "description": "Document publication date"
+      },
+      {
+        "field_name_cn": "文档来源",
+        "field_name_en": "source",
+        "field_type": "string",
+        "description": "Organization the document originates from"
+      }
+    ]
+  },
+  "request_id": "req_xxxxxxxxxxxxxxxx"
+}
+```
+
+---
+
+### 4.3 Knowledge Base Batch Import
+
+**POST /api/v1/knowledge-bases/{kb_id}/batch-import**
+
+Submits a batch import task. The system vectorizes the segments and stores them in the vector database.
+
+**Path parameters:**
+
+| Parameter | Type | Required | Description |
+|--------|------|------|------|
+| kb_id | string | Yes | Knowledge base ID |
+
+**Request body:**
+
+```json
+{
+  "task_no": "IMP202605170001",
+  "callback_url": "https://collect-system.example.com/api/callback/import-result",
+  "parents": [
+    {
+      "index": 0,
+      "parent_id": "5138215886300893584",
+      "hierarchy": "Regulations on the Quality Management of Construction Projects",
+      "text": "Article 1: In order to strengthen the management of construction project quality...",
+      "metadata": {
+        "doc_number": "State Council Order No. 279",
+        "publish_date": "2000-01-30"
+      },
+      "doc_id": "doc_001",
+      "tag_list": ["Regulation", "Quality management"],
+      "permission": {
+        "visible_roles": ["role_admin", "role_engineer"],
+        "visible_users": ["user_001"]
+      }
+    }
+  ],
+  "children": [
+    {
+      "index": 0,
+      "parent_id": "5138215886300893584",
+      "hierarchy": "Chapter 1 General Provisions",
+      "text": "Article 1: In order to strengthen the management of construction project quality and ensure construction project quality...",
+      "metadata": {
+        "doc_number": "State Council Order No. 279",
+        "section": "Chapter 1"
+      },
+      "doc_id": "doc_001",
+      "tag_list": ["Regulation", "General provisions"],
+      "permission": {
+        "visible_roles": ["role_admin"]
+      }
+    }
+  ]
+}
+```
+
+**Request parameters:**
+
+| Parameter | Type | Required | Description |
+|--------|------|------|------|
+| task_no | string | Yes | Import task number, generated by the caller; used to record the state of this import task and to query its progress later |
+| callback_url | string | No | Callback URL; when the task finishes, the Sample Center pushes the result to this address. If omitted, no callback is sent |
+| parents | array | Yes | List of parent segments |
+| children | array | No | List of child segments |
+
+**parents/children item fields:**
+
+| Parameter | Type | Required | Description |
+|--------|------|------|------|
+| index | integer | Yes | Chunk index |
+| parent_id | string | Yes | Parent segment ID; child segments reference their parent through this field |
+| hierarchy | string | No | Chapter or section information |
+| text | string | Yes | Segment text |
+| metadata | object | No | Segment metadata, as key-value pairs |
+| doc_id | string | No | Document ID |
+| tag_list | array | No | List of tags |
+| permission | object | No | Permission settings |
+
+**permission fields:**
+
+| Parameter | Type | Description |
+|--------|------|------|
+| visible_roles | array | IDs of the roles allowed to view the segment |
+| visible_users | array | IDs of the users allowed to view the segment |
+
+**Response example:**
+
+```json
+{
+  "code": "000000",
+  "message": "success",
+  "data": {
+    "task_id": "task_20260517xxxxxxxx",
+    "status": "pending"
+  },
+  "request_id": "req_xxxxxxxxxxxxxxxx"
+}
+```
+
+| Parameter | Type | Description |
+|--------|------|------|
+| task_id | string | Task ID |
+| status | string | Task status: pending = awaiting processing |
+
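Checking the required fields before submitting can avoid a round trip that ends in error 002001. The sketch below validates only what the tables above mark as required; treating an empty `parents` list as invalid is an assumption of this example, and `validate_batch_import` is a hypothetical helper name.

```python
REQUIRED_SEGMENT_FIELDS = {"index": int, "parent_id": str, "text": str}


def validate_batch_import(body: dict) -> list:
    """Return a list of problems; an empty list means the body passes these checks."""
    problems = []
    if not isinstance(body.get("task_no"), str) or not body["task_no"]:
        problems.append("task_no is required")
    parents = body.get("parents")
    if not isinstance(parents, list) or not parents:
        # Assumption: an empty parents list is rejected; the spec only says "required".
        problems.append("parents is required and must be a non-empty list")
        parents = []
    children = body.get("children") or []  # children is optional
    for group_name, group in (("parents", parents), ("children", children)):
        for pos, segment in enumerate(group):
            for field, expected in REQUIRED_SEGMENT_FIELDS.items():
                if not isinstance(segment.get(field), expected):
                    problems.append(f"{group_name}[{pos}].{field} is required ({expected.__name__})")
    return problems


good = {
    "task_no": "IMP202605170001",
    "parents": [{"index": 0, "parent_id": "5138215886300893584", "text": "Article 1 ..."}],
    "children": [{"index": 0, "parent_id": "5138215886300893584", "text": "Article 1 ..."}],
}
bad = {"task_no": "IMP202605170002",
       "parents": [{"index": 0, "parent_id": "5138215886300893584", "text": "Article 1 ..."}],
       "children": [{"index": 0, "parent_id": "5138215886300893584"}]}
```

Here `validate_batch_import(good)` returns an empty list, while `bad` is reported for its missing `text`. The server remains the authority; this check only catches obvious omissions early.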
+---
+
+### 4.4 Batch Import Task Query
+
+**GET /api/v1/knowledge-bases/batch-import/{task_id}**
+
+Returns the processing status and result of a batch import task.
+
+**Path parameters:**
+
+| Parameter | Type | Required | Description |
+|--------|------|------|------|
+| task_id | string | Yes | Import task ID |
+
+**Response examples:**
+
+**In progress:**
+```json
+{
+  "code": "000000",
+  "message": "success",
+  "data": {
+    "task_id": "task_20260517xxxxxxxx",
+    "task_no": "IMP202605170001",
+    "status": "processing",
+    "progress": {
+      "total": 100,
+      "processed": 45,
+      "succeeded": 43,
+      "failed": 2
+    },
+    "created_at": "2026-05-17T10:00:00Z",
+    "updated_at": "2026-05-17T10:01:30Z"
+  },
+  "request_id": "req_xxxxxxxxxxxxxxxx"
+}
+```
+
+**Completed:**
+```json
+{
+  "code": "000000",
+  "message": "success",
+  "data": {
+    "task_id": "task_20260517xxxxxxxx",
+    "task_no": "IMP202605170001",
+    "status": "completed",
+    "progress": {
+      "total": 100,
+      "processed": 100,
+      "succeeded": 98,
+      "failed": 2
+    },
+    "created_at": "2026-05-17T10:00:00Z",
+    "completed_at": "2026-05-17T10:05:00Z",
+    "failures": [
+      {
+        "index": 12,
+        "parent_id": "5138215886300893584",
+        "error": "Text content is empty; import skipped"
+      },
+      {
+        "index": 56,
+        "parent_id": "5138215886300893584",
+        "error": "Vectorization model call timed out"
+      }
+    ]
+  },
+  "request_id": "req_xxxxxxxxxxxxxxxx"
+}
+```
+
+**Failed:**
+```json
+{
+  "code": "000000",
+  "message": "success",
+  "data": {
+    "task_id": "task_20260517xxxxxxxx",
+    "task_no": "IMP202605170001",
+    "status": "failed",
+    "error": "Vector database connection error",
+    "created_at": "2026-05-17T10:00:00Z",
+    "completed_at": "2026-05-17T10:00:05Z"
+  },
+  "request_id": "req_xxxxxxxxxxxxxxxx"
+}
+```
+
+**Field descriptions:**
+
+| Parameter | Type | Description |
+|--------|------|------|
+| task_id | string | Task ID, generated by the system |
+| task_no | string | Import task number, supplied by the caller |
+| status | string | Task status: pending = awaiting processing, processing = in progress, completed = finished, failed = failed |
+| progress | object | Progress information |
+| progress.total | integer | Total number of items |
+| progress.processed | integer | Number of items processed |
+| progress.succeeded | integer | Number of items that succeeded |
+| progress.failed | integer | Number of items that failed |
+| failures | array | List of failure details |
+| error | string | Overall failure reason (returned only when status=failed) |
+| created_at | string | Task creation time |
+| updated_at | string | Time of the last status update |
+| completed_at | string | Task completion time |
+
+---
+
+## 5. Task State Transitions
+
+```
+pending → processing → completed
+                     → failed
+```
+
+| Status | Description |
+|------|------|
+| pending | Task accepted and queued for processing |
+| processing | Task is being processed |
+| completed | Task finished (some items may have failed) |
+| failed | Task failed as a whole; no data was imported |
+
+---
+
+## 6. Polling and Callback Mechanisms
+
+### 6.1 Polling
+After submitting a task, the external system polls its status via **GET /api/v1/knowledge-bases/batch-import/{task_id}**.
+
+**Suggested polling strategy:**
+- Start with a 2-second interval
+- After each poll, multiply the interval by 1.5, up to a maximum of 30 seconds
+- Stop polling once the task reaches a terminal state (completed/failed)
+
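The suggested strategy translates into a small backoff loop. The sketch below is one possible client-side reading of it: the status lookup and the sleep function are injected so the loop can be exercised without a network, and the names `poll_intervals`, `wait_for_task`, and the `max_polls` safety limit are assumptions of this example.

```python
import itertools
from typing import Callable, Iterator

TERMINAL_STATES = {"completed", "failed"}


def poll_intervals(initial: float = 2.0, factor: float = 1.5, cap: float = 30.0) -> Iterator[float]:
    """Yield 2, 3, 4.5, ... seconds, never exceeding the cap."""
    delay = initial
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def wait_for_task(get_status: Callable[[], str], sleep: Callable[[float], None], max_polls: int = 50) -> str:
    """Poll until the task reaches a terminal state, sleeping between polls."""
    for delay in itertools.islice(poll_intervals(), max_polls):
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        sleep(delay)
    raise TimeoutError("task did not reach a terminal state")


first_eight = list(itertools.islice(poll_intervals(), 8))
# first_eight == [2.0, 3.0, 4.5, 6.75, 10.125, 15.1875, 22.78125, 30.0]
```

In real use `get_status` would call the task query endpoint and `sleep` would be `time.sleep`. With these intervals a client stays well under the 120 requests/minute limit on task queries.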
+### 6.2 Callback (Optional)
+
+If an external system cannot poll, it can pass `callback_url` in the request body when submitting the batch import task. When the task finishes, the Sample Center pushes the result to that address.
+
+**Callback request:** POST {callback_url}
+
+```json
+{
+  "task_id": "task_20260517xxxxxxxx",
+  "task_no": "IMP202605170001",
+  "status": "completed",
+  "kb_id": "5138215886300893584",
+  "progress": {
+    "total": 100,
+    "processed": 100,
+    "succeeded": 98,
+    "failed": 2
+  },
+  "completed_at": "2026-05-17T10:05:00Z"
+}
+```
+
+If a callback fails, it is retried at most 3 times, at intervals of 10 s / 30 s / 60 s.
+
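The retry rule can be read as one initial attempt followed by up to three retries. The sketch below illustrates that reading on the sending side; it is an interpretation, not the Sample Center's actual implementation. What counts as a failed callback (timeout, non-2xx status) is not specified, so the example leaves that decision to an injected `send` function; `deliver_callback` is a hypothetical name.

```python
from typing import Callable

RETRY_DELAYS = (10, 30, 60)  # seconds to wait before retry 1, 2 and 3


def deliver_callback(send: Callable[[], bool], sleep: Callable[[float], None]) -> bool:
    """Try once, then retry after each delay; return True as soon as send() succeeds."""
    if send():
        return True
    for delay in RETRY_DELAYS:
        sleep(delay)
        if send():
            return True
    return False


# Simulated receiver that fails twice and then accepts the callback.
outcomes = iter([False, False, True])
waits = []
delivered = deliver_callback(lambda: next(outcomes), waits.append)
```

In this simulation the callback is delivered on the third attempt after waiting 10 s and 30 s. Since retries mean a receiver may see the same result more than once, deduplicating on `task_id` is a sensible precaution.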
+---
+
+## 7. Rate Limits and Quotas
+
+| Endpoint | Limit |
+|------|----------|
+| Obtain token | 10 requests/minute per app_id |
+| Knowledge base list | 60 requests/minute per app_id |
+| Knowledge base detail | 60 requests/minute per app_id |
+| Batch import submission | 20 requests/minute per app_id |
+| Task query | 120 requests/minute per app_id |
+
+Requests over the limit receive HTTP 429, with rate-limit details in the response headers:
+```
+X-RateLimit-Limit: 60
+X-RateLimit-Remaining: 0
+X-RateLimit-Reset: 1715932800
+```
+
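A client that receives HTTP 429 can use these headers to decide how long to pause. The sketch below assumes that `X-RateLimit-Reset` is a Unix timestamp in seconds, which the example value suggests but the specification does not state; `seconds_until_reset` is a hypothetical helper.

```python
import time
from typing import Optional


def seconds_until_reset(headers: dict, now: Optional[float] = None) -> float:
    """Seconds to wait before retrying, assuming X-RateLimit-Reset is Unix epoch seconds."""
    now = time.time() if now is None else now
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)


# 30 seconds before the reset time shown in the header example.
wait = seconds_until_reset({"X-RateLimit-Reset": "1715932800"}, now=1715932770)
```

If the header is missing or already in the past, the helper returns 0 and the caller can fall back to its normal backoff.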
+---
+
+## 8. Endpoint Summary
+
+| No. | Method | Path | Description |
+|------|------|------|------|
+| 1 | POST | /api/v1/auth/token | Obtain an access token |
+| 2 | GET | /api/v1/knowledge-bases | Knowledge base list query |
+| 3 | GET | /api/v1/knowledge-bases/{id} | Knowledge base detail query |
+| 4 | POST | /api/v1/knowledge-bases/{kb_id}/batch-import | Batch import submission |
+| 5 | GET | /api/v1/knowledge-bases/batch-import/{task_id} | Import task query |
+
+---
+
+## 9. Asynchronous Task Table Design
+
+### 9.1 Table Structure
+
+**Table name: `t_samp_task_management`**
+
+The Sample Center's asynchronous task table, which records the lifecycle of every asynchronous task.
+
+| Column | Type | Description |
+|--------|------|------|
+| task_id | varchar(50) | Task ID, generated by the system; primary key |
+| task_no | varchar(50) | Task number, supplied by the requester for external tracking |
+| task_type | varchar(50) | Task type, e.g. bi (batch import) |
+| task_params | text | Task parameters; the original request parameters stored as JSON |
+| task_source | varchar(50) | Task source (enumerated), e.g. col (collection system) |
+| callback_url | varchar(300) | Task callback URL |
+| status | varchar(20) | Task status (enumerated) |
+| callback_status | varchar(20) | Callback status (enumerated) |
+| error_message | varchar(3000) | Task error message |
+| completed_time | datetime | Task completion time |
+| created_time | datetime | Task creation time |
+| updated_time | datetime | Task update time |
+
+### 9.2 Task Status Enumeration
+
+| Value | Description |
+|--------|------|
+| pending | Task accepted and queued for processing |
+| processing | Task is being processed |
+| completed | Task finished (some items may have failed) |
+| failed | Task failed as a whole; no data was imported |
+
+### 9.3 Callback Status Enumeration
+
+| Value | Description |
+|--------|------|
+| pending | Awaiting processing |
+| processing | In progress |
+| success | Callback succeeded |
+| failed | Callback failed |
+
+### 9.4 Task Processing Logic
+
+1. **Task intake**: on receiving a batch import request, first save the task to the task table with status `pending`
+2. **Asynchronous processing**: start an asynchronous job to import the knowledge segments, and update the task status to `processing`
+3. **Task completion**: when processing finishes, update the task status to `completed` and record the completion time
+4. **Task failure**: when processing fails, update the task status to `failed` and write the error to `error_message`
+5. **Callback notification**: after processing finishes, if `callback_url` is not empty, send the callback request and update `callback_status` according to the result
+6. **Status query**: the import task query endpoint reads `status` and `callback_status` directly from the task table
+
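Steps 1 to 4 amount to a small state machine over the `status` column. The sketch below models it in memory to make the allowed transitions explicit; `TaskRecord` and `transition` are hypothetical names, and recording `completed_time` for failed tasks follows the `completed_at` field shown in the failed response example rather than an explicit rule.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

ALLOWED_TRANSITIONS = {
    "pending": {"processing"},
    "processing": {"completed", "failed"},
    "completed": set(),  # terminal
    "failed": set(),     # terminal
}


@dataclass
class TaskRecord:
    """In-memory stand-in for one row of t_samp_task_management."""
    task_id: str
    task_no: str
    status: str = "pending"
    error_message: Optional[str] = None
    completed_time: Optional[datetime] = None


def transition(task: TaskRecord, new_status: str, error_message: Optional[str] = None) -> TaskRecord:
    """Apply a status change, rejecting anything the state diagram does not allow."""
    if new_status not in ALLOWED_TRANSITIONS[task.status]:
        raise ValueError(f"illegal transition {task.status} -> {new_status}")
    task.status = new_status
    if new_status in ("completed", "failed"):
        task.completed_time = datetime.now(timezone.utc)
    if new_status == "failed":
        task.error_message = error_message
    return task


task = TaskRecord(task_id="task_demo", task_no="IMP202605170001")
transition(task, "processing")
transition(task, "completed")
```

Rejecting illegal transitions in one place keeps a late or duplicated worker update from moving a finished task back to `processing`.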
+### 9.5 Task Type Extension
+
+The task table supports multiple task types through the `task_type` field. Currently supported:
+
+| task_type | Description |
+|-----------|------|
+| bi | Batch import task |
+
+Other task types (for example batch deletion or data synchronization) can be added later as needed.
+
+---
+
+## 10. Code Structure
+
+### 10.1 Token Authentication Module
+
+| File path | Description |
+|----------|------|
+| `src/app/api/v1/token_api_view.py` | Route definitions for the token endpoint; handles `/api/v1/auth/token` requests |
+| `src/app/service/api/token_api_service.py` | Token business logic: credential validation, token issuance, and expiry management |
+
+### 10.2 Knowledge Base API Module
+
+| File path | Description |
+|----------|------|
+| `src/app/api/v1/knowledge_base_api_view.py` | Route definitions for the knowledge base endpoints: list query, detail query, batch import, and task query |
+| `src/app/service/api/knowledge_base_api_service.py` | Knowledge base business logic: knowledge base queries, batch import task creation, task status queries, and so on |
+
+### 10.3 Task Management Module
+
+| File path | Description |
+|----------|------|
+| `src/app/service/sample_task_management_service.py` | Asynchronous task management service: task creation, status updates, callback execution, and progress tracking |
+