|
|
@@ -0,0 +1,533 @@
|
|
|
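+# Annotation Platform External API Documentation
+
+## Overview
+
+This document describes the API endpoints that the annotation platform exposes to external systems (such as the Sample Center). Through these endpoints, an external system can:
+
+1. Create an annotation project and import task data
+2. Query a project's annotation progress
+3. Export completed annotation data
+
+## Authentication
+
+Every API request must carry an administrator token in the HTTP header for authentication: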
|
|
|
+
|
|
|
+```
|
|
|
+Authorization: Bearer <admin_token>
|
|
|
+```
|
|
|
+
|
|
|
+**Note**: The token is issued by the annotation platform administrator and is long-lived.
|
|
|
+
|
|
|
+## Base URL
|
|
|
+
|
|
|
+```
|
|
|
+Production:  https://your-domain.com/api/external
+Development: http://localhost:8003/api/external
|
|
|
+```
|
|
|
+
|
|
|
+## Endpoints
+
+### 1. Project Initialization
+
+Creates a new annotation project and imports its task data in bulk.
+
+**Request**
|
|
|
+
|
|
|
+```
|
|
|
+POST /api/external/projects/init
|
|
|
+```
|
|
|
+
|
|
|
+**Request Headers**
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| Authorization | string | Yes | Bearer Token |
+| Content-Type | string | Yes | application/json |
+
+**Request Body**
|
|
|
+
|
|
|
+```json
|
|
|
+{
|
|
|
+ "name": "图像分类项目-批次001",
|
|
|
+ "description": "对商品图片进行分类标注",
|
|
|
+ "task_type": "image_classification",
|
|
|
+ "data": [
|
|
|
+ {
|
|
|
+ "id": "ext_001",
|
|
|
+ "content": "https://example.com/images/product1.jpg",
|
|
|
+ "metadata": {
|
|
|
+ "batch": "001",
|
|
|
+ "source": "商品库"
|
|
|
+ }
|
|
|
+ },
|
|
|
+ {
|
|
|
+ "id": "ext_002",
|
|
|
+ "content": "https://example.com/images/product2.jpg",
|
|
|
+ "metadata": {
|
|
|
+ "batch": "001",
|
|
|
+ "source": "商品库"
|
|
|
+ }
|
|
|
+ }
|
|
|
+ ],
|
|
|
+ "external_id": "sample_center_proj_001"
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
+**Request Parameters**
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| name | string | Yes | Project name |
+| description | string | No | Project description |
+| task_type | string | Yes | Task type; see the supported types below |
+| data | array | Yes | List of task data items |
+| data[].id | string | No | ID of the item in the external system, used for correlation |
+| data[].content | string | Yes | Item content (text or image URL) |
+| data[].metadata | object | No | Additional metadata |
+| external_id | string | No | Project ID in the external system, used for correlated lookups |
+
+**Note**: Labels are set by the annotation platform administrator during project configuration; the Sample Center does not need to supply them.
+
+**Supported Task Types**
+
+| task_type | Description | data[].content format |
+|-----------|-------------|-----------------------|
+| text_classification | Text classification | Text content |
+| image_classification | Image classification | Image URL |
+| object_detection | Object detection | Image URL |
+| ner | Named entity recognition | Text content |
+
+**Response**
|
|
|
+
|
|
|
+```json
|
|
|
+{
|
|
|
+ "project_id": "proj_abc123def456",
|
|
|
+ "project_name": "图像分类项目-批次001",
|
|
|
+ "task_count": 2,
|
|
|
+ "status": "draft",
|
|
|
+ "created_at": "2026-02-03T10:30:00Z",
|
|
|
+ "config": "<View>...</View>",
|
|
|
+ "external_id": "sample_center_proj_001"
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
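+**Response Parameters**
+
+| Parameter | Type | Description |
+|-----------|------|-------------|
+| project_id | string | **Annotation platform project ID; the Sample Center must store it for later progress queries and data export** |
+| project_name | string | Project name |
+| task_count | int | Number of tasks created |
+| status | string | Project status (draft = waiting for administrator configuration) |
+| created_at | string | Creation time (ISO 8601) |
+| config | string | The XML configuration template actually in use |
+| external_id | string | External ID passed in by the Sample Center, if any |
+
+**Important**: `project_id` is the unique identifier generated by the annotation platform. Both the progress query endpoint and the export endpoint take it as a path parameter.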
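
As a minimal sketch of assembling the request body described above, the helper below checks `task_type` against the four documented values and verifies that every item carries a `content`. The name `build_init_payload`, the client-side checks, and the rule that `data` must be non-empty are assumptions made for illustration; the platform applies its own validation.

```python
from typing import Any, Dict, List, Optional

# Task types listed in the "Supported Task Types" table.
SUPPORTED_TASK_TYPES = {
    "text_classification",
    "image_classification",
    "object_detection",
    "ner",
}


def build_init_payload(
    name: str,
    task_type: str,
    data: List[Dict[str, Any]],
    description: Optional[str] = None,
    external_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble a body for POST /projects/init (hypothetical helper)."""
    if not name:
        raise ValueError("name is required")
    if task_type not in SUPPORTED_TASK_TYPES:
        raise ValueError(f"unsupported task_type: {task_type!r}")
    # Assumption: an empty data list is treated as a client-side error.
    if not data:
        raise ValueError("data must contain at least one item")
    for index, item in enumerate(data):
        if not item.get("content"):
            raise ValueError(f"data[{index}].content is required")

    payload: Dict[str, Any] = {"name": name, "task_type": task_type, "data": data}
    # Optional fields are only included when the caller provides them.
    if description is not None:
        payload["description"] = description
    if external_id is not None:
        payload["external_id"] = external_id
    return payload
```

The returned dictionary can be passed as the `json=` argument of an HTTP client call.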
|
|
|
+
|
|
|
+---
|
|
|
+
|
|
|
+### 2. Project Progress Query
+
+Returns the annotation progress of the given project, including per-annotator completion figures.
+
+**Request**
|
|
|
+
|
|
|
+```
|
|
|
+GET /api/external/projects/{project_id}/progress
|
|
|
+```
|
|
|
+
|
|
|
+**Path Parameters**
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| project_id | string | Yes | Project ID |
+
+**Response**
|
|
|
+
|
|
|
+```json
|
|
|
+{
|
|
|
+ "project_id": "proj_abc123def456",
|
|
|
+ "project_name": "图像分类项目-批次001",
|
|
|
+ "status": "in_progress",
|
|
|
+ "total_tasks": 100,
|
|
|
+ "completed_tasks": 45,
|
|
|
+ "in_progress_tasks": 30,
|
|
|
+ "pending_tasks": 25,
|
|
|
+ "completion_percentage": 45.0,
|
|
|
+ "annotators": [
|
|
|
+ {
|
|
|
+ "user_id": "user_001",
|
|
|
+ "username": "annotator1",
|
|
|
+ "assigned_count": 50,
|
|
|
+ "completed_count": 25,
|
|
|
+ "in_progress_count": 15,
|
|
|
+ "completion_rate": 50.0
|
|
|
+ },
|
|
|
+ {
|
|
|
+ "user_id": "user_002",
|
|
|
+ "username": "annotator2",
|
|
|
+ "assigned_count": 50,
|
|
|
+ "completed_count": 20,
|
|
|
+ "in_progress_count": 15,
|
|
|
+ "completion_rate": 40.0
|
|
|
+ }
|
|
|
+ ],
|
|
|
+ "last_updated": "2026-02-03T15:30:00Z"
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
+**Response Parameters**
+
+| Parameter | Type | Description |
+|-----------|------|-------------|
+| project_id | string | Project ID |
+| project_name | string | Project name |
+| status | string | Project status |
+| total_tasks | int | Total number of tasks |
+| completed_tasks | int | Number of completed tasks |
+| in_progress_tasks | int | Number of tasks in progress |
+| pending_tasks | int | Number of pending tasks |
+| completion_percentage | float | Completion percentage (0-100) |
+| annotators | array | List of annotators |
+| annotators[].user_id | string | User ID |
+| annotators[].username | string | Username |
+| annotators[].assigned_count | int | Number of tasks assigned |
+| annotators[].completed_count | int | Number of tasks completed |
+| annotators[].in_progress_count | int | Number of tasks in progress |
+| annotators[].completion_rate | float | Individual completion rate (0-100) |
+| last_updated | string | Time of the last update |
+
+**Project Statuses**
+
+| status | Description |
+|--------|-------------|
+| draft | Draft; waiting for administrator configuration |
+| configuring | Being configured |
+| ready | Ready; waiting for tasks to be distributed |
+| in_progress | In progress; annotators are working |
+| completed | Completed |
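
Because a project moves through these statuses asynchronously, a caller typically polls the progress endpoint. The sketch below shows one way to do that; `wait_for_completion`, its parameters, and the default interval are hypothetical, and the fetch step is passed in as a callable so the loop stays independent of any HTTP library.

```python
import time
from typing import Any, Callable, Dict


def wait_for_completion(
    fetch_progress: Callable[[], Dict[str, Any]],
    interval_seconds: float = 60.0,
    max_attempts: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the project reports status "completed" (hypothetical helper).

    fetch_progress must return the parsed JSON body of the progress endpoint.
    """
    for attempt in range(max_attempts):
        progress = fetch_progress()
        if progress["status"] == "completed":
            return progress
        # No need to wait after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"project not completed after {max_attempts} attempts")
```

With `requests`, `fetch_progress` could be a small function that issues the GET request for the project and returns `response.json()`.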
|
|
|
+
|
|
|
+---
|
|
|
+
|
|
|
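+### 3. Data Export
+
+Exports a project's annotation data in one of several formats. The result is delivered as a file, which is retrieved through the `file_url` in the response.
+
+**Request**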
|
|
|
+
|
|
|
+```
|
|
|
+POST /api/external/projects/{project_id}/export
|
|
|
+```
|
|
|
+
|
|
|
+**Path Parameters**
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| project_id | string | Yes | Annotation platform project ID (the `project_id` returned when the project was created) |
+
+**Request Body**
|
|
|
+
|
|
|
+```json
|
|
|
+{
|
|
|
+ "format": "sharegpt",
|
|
|
+ "completed_only": true,
|
|
|
+ "callback_url": "https://sample-center.example.com/api/callback/export"
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
+**Request Parameters**
+
+| Parameter | Type | Required | Default | Description |
+|-----------|------|----------|---------|-------------|
+| format | string | No | json | Export format; see the supported formats below |
+| completed_only | bool | No | true | Whether to export only completed tasks |
+| callback_url | string | No | null | Callback URL; the Sample Center is notified there once the export finishes |
+
+**Supported Export Formats**
+
+| format | Description | Typical use |
+|--------|-------------|-------------|
+| json | Generic JSON format | General data exchange |
+| csv | CSV tabular format | Processing in Excel |
+| sharegpt | ShareGPT conversation format | Training conversational models |
+| yolo | YOLO object detection format | Training YOLO models |
+| coco | COCO dataset format | Training object detection/segmentation models |
+| alpaca | Alpaca instruction-tuning format | LLM instruction tuning |
+
+**Response**
|
|
|
+
|
|
|
+```json
|
|
|
+{
|
|
|
+ "project_id": "proj_abc123def456",
|
|
|
+ "format": "sharegpt",
|
|
|
+ "total_exported": 45,
|
|
|
+ "file_url": "/api/exports/export_789/download",
|
|
|
+ "file_name": "export_proj_abc123_sharegpt_20260203.json",
|
|
|
+ "file_size": 1024000,
|
|
|
+ "expires_at": "2026-02-10T10:30:00Z"
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
+**Response Parameters**
+
+| Parameter | Type | Description |
+|-----------|------|-------------|
+| project_id | string | Project ID |
+| format | string | Export format |
+| total_exported | int | Number of tasks exported |
+| file_url | string | File download URL (a path relative to the platform host in the example above) |
+| file_name | string | File name |
+| file_size | int | File size in bytes |
+| expires_at | string | Expiry time of the download link |
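
The `file_url` in the response example is a relative path, while the callback example carries an absolute URL. A minimal sketch that handles both with the standard library follows; `PLATFORM_BASE` and `resolve_download_url` are hypothetical names, and the base address should be replaced with the real deployment.

```python
from urllib.parse import urljoin

# Hypothetical platform address; substitute the real deployment.
PLATFORM_BASE = "http://localhost:8003/api/external"


def resolve_download_url(file_url: str, base: str = PLATFORM_BASE) -> str:
    """Return an absolute download URL for a relative or absolute file_url."""
    # urljoin keeps an absolute URL unchanged and resolves a rooted path
    # ("/api/...") against the scheme and host of the base.
    return urljoin(base, file_url)
```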
|
|
|
+
|
|
|
+**Callback Notification**
+
+If `callback_url` is provided, the system sends a POST request with the following body to that URL once the export finishes:
|
|
|
+
|
|
|
+```json
|
|
|
+{
|
|
|
+ "project_id": "proj_abc123def456",
|
|
|
+ "export_id": "export_789",
|
|
|
+ "status": "completed",
|
|
|
+ "format": "sharegpt",
|
|
|
+ "total_exported": 45,
|
|
|
+ "file_url": "https://annotation-platform.example.com/api/exports/export_789/download",
|
|
|
+ "file_name": "export_proj_abc123_sharegpt_20260203.json",
|
|
|
+ "file_size": 1024000,
|
|
|
+ "error_message": null
|
|
|
+}
|
|
|
+```
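
On the receiving side, the Sample Center needs to check the callback body before downloading anything. The sketch below is one possible shape for that check. `parse_export_callback` and `ExportFailed` are hypothetical names, and treating every status other than `completed` as a failure is an assumption, since only `completed` appears in the example above.

```python
import json
from typing import Any, Dict


class ExportFailed(Exception):
    """Raised when a callback reports a non-completed export (hypothetical)."""


def parse_export_callback(raw_body: bytes) -> Dict[str, Any]:
    """Validate a callback body and return the fields needed for download."""
    payload = json.loads(raw_body)
    # Assumption: only "completed" signals success; other values are undocumented.
    if payload.get("status") != "completed":
        raise ExportFailed(payload.get("error_message") or "export did not complete")
    return {
        "project_id": payload["project_id"],
        "export_id": payload["export_id"],
        "file_url": payload["file_url"],
        "file_name": payload["file_name"],
    }
```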
|
|
|
+
|
|
|
+**Export Format Examples**
+
+**ShareGPT format**:
|
|
|
+```json
|
|
|
+[
|
|
|
+ {
|
|
|
+ "conversations": [
|
|
|
+ {"from": "human", "value": "请对这段文本进行分类"},
|
|
|
+ {"from": "gpt", "value": "这是一段正面评价"}
|
|
|
+ ]
|
|
|
+ }
|
|
|
+]
|
|
|
+```
|
|
|
+
|
|
|
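+**YOLO format** (one line per box: `class_id x_center y_center width height`, with coordinates normalized to 0-1):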
|
|
|
+```
|
|
|
+# image1.txt
|
|
|
+0 0.5 0.5 0.2 0.3
|
|
|
+1 0.3 0.4 0.1 0.2
|
|
|
+```
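
To use such a label file with pixel coordinates, each line has to be scaled by the image size. The sketch below assumes the export follows the standard YOLO convention (center-based box, values normalized by image width and height); `yolo_line_to_pixel_box` is a hypothetical helper, not part of the platform.

```python
from typing import Tuple


def yolo_line_to_pixel_box(
    line: str, image_width: int, image_height: int
) -> Tuple[int, float, float, float, float]:
    """Convert one YOLO label line to (class_id, x_min, y_min, x_max, y_max) in pixels."""
    class_id, x_center, y_center, width, height = line.split()
    box_width = float(width) * image_width
    box_height = float(height) * image_height
    # YOLO stores the box center; shift by half the size to get the top-left corner.
    x_min = float(x_center) * image_width - box_width / 2
    y_min = float(y_center) * image_height - box_height / 2
    return int(class_id), x_min, y_min, x_min + box_width, y_min + box_height
```

For a 640x480 image, the first line of the example (`0 0.5 0.5 0.2 0.3`) corresponds to a class-0 box of roughly 128x144 pixels centered in the image.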
|
|
|
+
|
|
|
+**COCO format**:
|
|
|
+```json
|
|
|
+{
|
|
|
+ "images": [...],
|
|
|
+ "annotations": [...],
|
|
|
+ "categories": [...]
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
+**Alpaca format**:
|
|
|
+```json
|
|
|
+[
|
|
|
+ {
|
|
|
+ "instruction": "对以下文本进行情感分类",
|
|
|
+ "input": "这个产品非常好用!",
|
|
|
+ "output": "正面"
|
|
|
+ }
|
|
|
+]
|
|
|
+```
|
|
|
+
|
|
|
+---
|
|
|
+
|
|
|
+## Error Responses
+
+When an error occurs, every endpoint returns an error response in the same format:
|
|
|
+
|
|
|
+```json
|
|
|
+{
|
|
|
+ "error_code": "PROJECT_NOT_FOUND",
|
|
|
+ "message": "项目不存在",
|
|
|
+ "details": {
|
|
|
+ "project_id": "proj_invalid"
|
|
|
+ }
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
+**Error Codes**
+
+| error_code | HTTP Status | Description |
+|------------|-------------|-------------|
+| INVALID_TOKEN | 401 | Token is invalid or expired |
+| PERMISSION_DENIED | 403 | Insufficient permissions; administrator privileges required |
+| PROJECT_NOT_FOUND | 404 | Project does not exist |
+| INVALID_REQUEST | 400 | Invalid request parameters |
+| INVALID_TASK_TYPE | 400 | Unsupported task type |
+| EXPORT_FAILED | 500 | Export failed |
+| INTERNAL_ERROR | 500 | Internal server error |
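
Since every error shares the `error_code` / `message` / `details` shape, a client can turn it into a single exception type. The following is a sketch only: `ApiError` and `raise_for_api_error` are hypothetical names, and the fallback values for missing fields are assumptions.

```python
from typing import Any, Dict, Optional


class ApiError(Exception):
    """Client-side representation of the unified error response (hypothetical)."""

    def __init__(
        self,
        http_status: int,
        error_code: str,
        message: str,
        details: Optional[Dict[str, Any]] = None,
    ) -> None:
        super().__init__(f"[{http_status}] {error_code}: {message}")
        self.http_status = http_status
        self.error_code = error_code
        self.message = message
        self.details = details or {}


def raise_for_api_error(http_status: int, body: Dict[str, Any]) -> None:
    """Raise ApiError for 4xx/5xx responses; do nothing otherwise."""
    if http_status < 400:
        return
    raise ApiError(
        http_status,
        # Assumption: fall back to placeholders if the body is not in the documented shape.
        body.get("error_code", "UNKNOWN"),
        body.get("message", ""),
        body.get("details"),
    )
```

Callers can then branch on `exc.error_code`, for example retrying on `INTERNAL_ERROR` but not on `INVALID_TASK_TYPE`.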
|
|
|
+
|
|
|
+---
|
|
|
+
|
|
|
+## Usage Examples
+
+### Python Example
|
|
|
+
|
|
|
+```python
+from urllib.parse import urljoin
+
+import requests
+
+BASE_URL = "http://localhost:8003/api/external"
+ADMIN_TOKEN = "your_admin_token_here"
+
+headers = {
+    "Authorization": f"Bearer {ADMIN_TOKEN}",
+    "Content-Type": "application/json"
+}
+
+# 1. Create a project
+init_data = {
+    "name": "文本分类项目",
+    "description": "对用户评论进行情感分类",
+    "task_type": "text_classification",
+    "data": [
+        {"id": "1", "content": "这个产品非常好用!"},
+        {"id": "2", "content": "质量太差了,不推荐"},
+        {"id": "3", "content": "一般般,没什么特别的"}
+    ],
+    "external_id": "sample_center_001"
+}
+
+response = requests.post(
+    f"{BASE_URL}/projects/init",
+    json=init_data,
+    headers=headers
+)
+response.raise_for_status()  # fail fast on 4xx/5xx instead of parsing an error body
+project = response.json()
+project_id = project["project_id"]
+print(f"Project created: {project_id}")
+
+# 2. Query progress
+response = requests.get(
+    f"{BASE_URL}/projects/{project_id}/progress",
+    headers=headers
+)
+response.raise_for_status()
+progress = response.json()
+print(f"Completion: {progress['completion_percentage']}%")
+
+# 3. Export data (ShareGPT format)
+export_data = {
+    "format": "sharegpt",
+    "completed_only": True,
+    "callback_url": "https://sample-center.example.com/api/callback"
+}
+
+response = requests.post(
+    f"{BASE_URL}/projects/{project_id}/export",
+    json=export_data,
+    headers=headers
+)
+response.raise_for_status()
+export_result = response.json()
+print(f"Tasks exported: {export_result['total_exported']}")
+print(f"Download link: {export_result['file_url']}")
+
+# Download the exported file. file_url may be a relative path,
+# so resolve it against the host in BASE_URL rather than hard-coding it.
+download_url = urljoin(BASE_URL, export_result["file_url"])
+file_response = requests.get(download_url, headers=headers)
+file_response.raise_for_status()
+with open(export_result["file_name"], "wb") as f:
+    f.write(file_response.content)
+print(f"File saved: {export_result['file_name']}")
+```
|
|
|
+
|
|
|
+### cURL Examples
|
|
|
+
|
|
|
+```bash
+# 1. Create a project
+curl -X POST "http://localhost:8003/api/external/projects/init" \
+  -H "Authorization: Bearer your_admin_token" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "name": "图像分类项目",
+    "task_type": "image_classification",
+    "data": [
+      {"id": "1", "content": "https://example.com/img1.jpg"},
+      {"id": "2", "content": "https://example.com/img2.jpg"}
+    ],
+    "external_id": "sample_center_proj_001"
+  }'
+
+# 2. Query progress
+curl -X GET "http://localhost:8003/api/external/projects/proj_xxx/progress" \
+  -H "Authorization: Bearer your_admin_token"
+
+# 3. Export data (YOLO format)
+curl -X POST "http://localhost:8003/api/external/projects/proj_xxx/export" \
+  -H "Authorization: Bearer your_admin_token" \
+  -H "Content-Type: application/json" \
+  -d '{"format": "yolo", "completed_only": true}'
+
+# 4. Download the exported file
+curl -X GET "http://localhost:8003/api/exports/export_xxx/download" \
+  -H "Authorization: Bearer your_admin_token" \
+  -o exported_data.zip
+```
|
|
|
+
|
|
|
+---
|
|
|
+
|
|
|
+## Workflow
+
+```
+Sample Center                     Annotation Platform
+     |                                 |
+     | 1. POST /projects/init          |
+     |-------------------------------->|
+     | returns project_id, status=draft|
+     |<--------------------------------|
+     |                                 |
+     |                                 | Administrator configures the project
+     |                                 | (labels, XML, etc.)
+     |                                 |
+     |                                 | Administrator distributes tasks
+     |                                 | (selects annotators)
+     |                                 |
+     | 2. GET /projects/{id}/progress  |
+     |-------------------------------->|
+     | returns progress information    |
+     |<--------------------------------|
+     |                                 |
+     | (poll for progress...)          |
+     |                                 |
+     | 3. POST /projects/{id}/export   |
+     |-------------------------------->|
+     | returns export file info        |
+     |<--------------------------------|
+     |                                 |
+```
|
|
|
+
|
|
|
+---
|
|
|
+
|
|
|
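+## Notes
+
+1. **Project status**: A project created by an external system starts in the `draft` status. It enters `in_progress` only after the annotation platform administrator has finished configuration (setting labels, etc.) and distributed the tasks.
+
+2. **Label configuration**: Labels are set by the annotation platform administrator during project configuration. The Sample Center only supplies task data and does not define labels.
+
+3. **Data format**:
+   - Text tasks (text_classification, ner): the `content` field holds the text
+   - Image tasks (image_classification, object_detection): the `content` field holds an image URL
+
+4. **External ID correlation**: Provide `external_id` when creating a project so that it can be looked up from the Sample Center side later.
+
+5. **Export timing**: Export data once the project status is `completed` or `completion_percentage` has reached the expected value.
+
+6. **Token security**: Keep the administrator token safe and never expose it in client-side code.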
|
|
|
+
|
|
|
+---
|
|
|
+
|
|
|
+## Version History
+
+| Version | Date | Description |
+|---------|------|-------------|
+| 1.0.0 | 2026-02-03 | Initial version |
|