
Modify SSO and Sample Center integration

kinglee 1 week ago
parent
commit
b4b030d78b

+ 10 - 2
.env

@@ -18,7 +18,15 @@ JWT_SECRET_KEY=jwt-secret-change-me-to-random-string
 SSO_BASE_URL=http://192.168.92.61:8200
 SSO_CLIENT_ID=3PNhrQRejIutH9SKIHJDsWnxiEDpnrzh
 SSO_CLIENT_SECRET=4e5S2HKp0iHKwVLmrQu25TV3e0n507nkROObHCq6kbfV4LtdAG4X9K5vYlwnlH0V
-SSO_REDIRECT_URI=http://localhost:5000/auth/callback
-SSO_FRONTEND_URL=http://localhost:5000
+SSO_REDIRECT_URI=http://192.168.92.150:5000/auth/callback
+SSO_FRONTEND_URL=http://192.168.92.150:5000
 SSO_SCOPE=email
 SSO_LOGOUT_REDIRECT_URL=http://192.168.92.61:9200/login
+
+# Application name (shown in the page title, sidebar, and similar places)
+APP_NAME=路桥采集平台
+
+# Sample Center configuration
+SAMPLE_CENTER_BASE_URL=http://192.168.92.61
+SAMPLE_CENTER_APP_ID=3PNhrQRejIutH9SKIHJDsWnxiEDpnrzh
+SAMPLE_CENTER_APP_SECRET=4e5S2HKp0iHKwVLmrQu25TV3e0n507nkROObHCq6kbfV4LtdAG4X9K5vYlwnlH0V

+ 177 - 0
DEPLOY.md

@@ -0,0 +1,177 @@
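+# Docker Deployment Guide
+
+## Requirements
+
+- Docker >= 20.10
+- Docker Compose >= 2.0
+- An external PostgreSQL database (production)
+
+## 1. Configuration Files
+
+| File | Purpose |
+|------|---------|
+| `Dockerfile` | Builds the application image from `python:3.12-slim`; dependencies are managed with `uv` |
+| `docker-compose.yml` | **Production** configuration; connects to an external PostgreSQL instance and runs database migrations automatically on container start |
+| `docker-compose.dev.yml` | **Development** override; adds a local PostgreSQL container and code hot reloading |
+| `.env` | Environment variables (database, SSO, secrets, etc.); **not baked into the image** |
+| `.dockerignore` | Build exclusions (`venv/`, `__pycache__/`, `.env`, etc.) |
+
+## 2. Environment Configuration
+
+Before building, make sure the `.env` file in the project root is configured. Use `.env.example` as a reference: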
+
+```ini
+# PostgreSQL database settings (required)
+DB_HOST=47.109.147.74
+DB_PORT=5432
+DB_USER=maas_collect
+DB_PASSWORD=your_db_password
+DB_NAME=maas_collect
+
+# Flask secret key (replace with a random string in production)
+SECRET_KEY=change-me-to-a-random-secret
+
+# JWT secret key (replace with a random string in production)
+JWT_SECRET_KEY=jwt-secret-change-me-to-random-string
+
+# SSO (single sign-on) configuration
+SSO_BASE_URL=http://192.168.92.61:8200
+SSO_CLIENT_ID=your_client_id
+SSO_CLIENT_SECRET=your_client_secret
+SSO_REDIRECT_URI=http://localhost:5000/auth/callback
+SSO_FRONTEND_URL=http://localhost:5000
+SSO_SCOPE=email
+SSO_LOGOUT_REDIRECT_URL=http://192.168.92.61:9200/login
+
+# Sample Center configuration
+SAMPLE_CENTER_BASE_URL=http://192.168.92.61
+SAMPLE_CENTER_APP_ID=your-app-id
+SAMPLE_CENTER_APP_SECRET=your-app-secret
+```
+
+> **Note**: For production deployments, `DB_HOST`, `DB_USER`, `DB_PASSWORD`, and `DB_NAME` are required; `docker-compose.yml` provides no fallback defaults.
+
+## 3. Production Deployment
+
+### 3.1 Build and Start
+
+```bash
+docker compose up -d --build
+```
+
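+Startup sequence:
+1. Docker builds the image (system dependencies + Python dependencies + Playwright)
+2. Environment variables are read from `.env` and injected into the container
+3. On container start, `entrypoint.sh` automatically runs `migrate_db.py` (synchronizes the database schema)
+4. The Flask application starts and listens on `0.0.0.0:5000`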
+
+### 3.2 Verify the Service
+
+```bash
+# Check container status
+docker compose ps
+
+# Tail application logs
+docker compose logs -f web
+
+# Health check
+curl -f http://localhost:5000
+```
+
+### 3.3 Common Operations
+
+```bash
+# Stop the services
+docker compose down
+
+# Stop and remove data volumes (deletes local PostgreSQL data; use with caution in production)
+docker compose down -v
+
+# Rebuild and restart (after a code update)
+docker compose up -d --build
+
+# Restart the service only (without rebuilding the image)
+docker compose restart web
+
+# Run a command inside the container
+docker compose exec web uv run python -c "from app import create_app; print(create_app().config['SQLALCHEMY_DATABASE_URI'])"
+
+# Show database migration status
+docker compose exec web uv run flask db heads
+```
+
+## 4. Development Mode
+
+```bash
+docker compose -f docker-compose.yml -f docker-compose.dev.yml up -d --build
+```
+
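+Differences between production and development mode:
+
+| Aspect | Production | Development |
+|------|----------|----------|
+| Database | External PostgreSQL (configured in `.env`) | Local `postgres:16-alpine` container |
+| Code hot reloading | No | Yes (mounts `.` at `/app`) |
+| FLASK_DEBUG | false | true |
+| Health check | Enabled (30s interval) | Disabled |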
+
+## 5. Architecture
+
+### 5.1 Container Startup Flow
+
+```
+docker compose up
+  → Dockerfile builds the image
+  → entrypoint.sh runs
+    → migrate_db.py (automatically adds/updates database columns)
+    → run.py (starts the Flask application)
+```
+
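+### 5.2 Database Migration
+
+`entrypoint.sh` runs `migrate_db.py` on every container start. The script is idempotent (it checks whether a column exists before adding it), so repeated runs do not cause errors.
+
+New columns:
+- `sso_sub`: SSO user identifier (unique index)
+- `real_name`: real name
+- `roles`: role list (JSON text)
+- `email` / `phone` / `avatar_url`: user profile information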
+
+### 5.3 SSO Login Flow
+
+```
+User clicks "SSO login"
+  → Frontend: GET /auth/sso/authorize?redirect=true
+  → 302 redirect to SSO_BASE_URL/oauth/authorize
+  → After the user authorizes, SSO calls back /auth/callback?code=xxx
+  → Frontend: POST /api/oauth/exchange-code {"code": "xxx"}
+  → Backend:
+    1. Exchanges the code with SSO for an access_token
+    2. Fetches user info with the token
+    3. Syncs the user to the local database
+    4. Issues JWTs (access + refresh)
+  → Returns token + user info to the frontend
+```
+
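+### 5.4 Sample Center Integration
+
+The Sample Center module (`/knowledge`) provides knowledge base management and batch import:
+
+```
+User visits /knowledge
+  → Opens the page from the "Sample Center" sidebar entry
+  → Tab 1: knowledge base list (GET /api/v1/knowledge/bases)
+  → Tab 2: knowledge base details + batch import button
+  → Tab 3: import task list (auto refresh + manual refresh)
+```
+
+**Backend flow:**
+1. Submit an import → `POST /api/v1/knowledge/bases/{kb_id}/batch-import`
+2. A local `KnowledgeImportTask` record is created
+3. A background poller checks for due tasks every 5 seconds and polls the Sample Center with exponential backoff
+4. The Sample Center pushes a callback → `POST /api/v1/callback/knowledge-import`
+
+**Token management:**
+- `SampleCenterClient` caches the access_token internally
+- The token is refreshed automatically 5 minutes before expiry
+- Thread-safe (threading.Lock)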
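The token management notes at the end of DEPLOY.md describe a cache that refreshes shortly before expiry and is guarded by a lock. `SampleCenterClient` itself is not part of this diff, so the following is only a minimal sketch of that pattern. The class name `TokenCache`, the `fetch` callable, and the injectable `clock` are assumptions made for illustration, not the project's actual API.

```python
import threading
import time

REFRESH_MARGIN_SECONDS = 5 * 60  # refresh 5 minutes before expiry, as DEPLOY.md describes


class TokenCache:
    """Hypothetical thread-safe access-token cache (not the real SampleCenterClient)."""

    def __init__(self, fetch, clock=time.time):
        # fetch() is assumed to return a (token, lifetime_in_seconds) tuple.
        self._fetch = fetch
        self._clock = clock
        self._lock = threading.Lock()
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return a cached token, refreshing it once it is inside the refresh margin."""
        with self._lock:
            near_expiry = self._clock() >= self._expires_at - REFRESH_MARGIN_SECONDS
            if self._token is None or near_expiry:
                token, lifetime = self._fetch()
                self._token = token
                self._expires_at = self._clock() + lifetime
            return self._token
```

Holding the lock across the fetch keeps concurrent callers from issuing duplicate token requests, at the cost of blocking them while one refresh is in flight.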

+ 11 - 17
Dockerfile

@@ -1,32 +1,26 @@
-FROM python:3.12-slim AS base
+FROM python:3.12-slim
 
-# Set the working directory
 WORKDIR /app
 
-# Install system dependencies (Playwright dependencies)
 RUN apt-get update && apt-get install -y --no-install-recommends \
     build-essential \
     curl \
     libpq-dev \
     && rm -rf /var/lib/apt/lists/*
 
-# Install uv
-RUN pip install --no-cache-dir uv
+COPY pyproject.toml uv.lock ./
 
-# Copy dependency files
-COPY pyproject.toml ./
+# Configure pip and uv to use the Aliyun mirror and the system Python
+ENV PIP_INDEX_URL=https://mirrors.aliyun.com/pypi/simple/
+ENV UV_INDEX_URL=https://mirrors.aliyun.com/pypi/simple/
+ENV UV_PYTHON_PREFERENCE=system
 
-# Install Python dependencies with uv (excluding dev)
-RUN uv sync --frozen --no-dev
+RUN pip install --no-cache-dir uv && \
+    uv sync --frozen --no-dev
 
-# Install the Playwright browser
-RUN uv run playwright install chromium --with-deps
-
-# Copy application code
 COPY . .
 
-# Expose the port
-EXPOSE 5000
+RUN chmod +x entrypoint.sh
+ENTRYPOINT ["./entrypoint.sh"]
 
-# Start command (production mode)
-CMD ["uv", "run", "python", "run.py"]
+EXPOSE 5000

+ 17 - 3
app/__init__.py

@@ -1,11 +1,13 @@
 from flask import Flask
 from flask_sqlalchemy import SQLAlchemy
 from flask_login import LoginManager
+from flask_migrate import Migrate
 from .config import Config
 
 db = SQLAlchemy()
 login_manager = LoginManager()
 login_manager.login_view = 'main.login'
+migrate = Migrate()
 
 def create_app(config_class=Config):
     app = Flask(__name__)
@@ -13,18 +15,30 @@ def create_app(config_class=Config):
 
     db.init_app(app)
     login_manager.init_app(app)
+    migrate.init_app(app, db)
 
-    from .routes import main_routes, ai_routes, source_routes, collection_routes, data_routes, deep_routes
+    from .routes import main_routes, ai_routes, source_routes, collection_routes, data_routes, deep_routes, sso_routes, knowledge_routes
     app.register_blueprint(main_routes.bp)
     app.register_blueprint(ai_routes.bp)
     app.register_blueprint(source_routes.bp)
     app.register_blueprint(collection_routes.bp)
     app.register_blueprint(data_routes.bp)
     app.register_blueprint(deep_routes.bp)
-    
+    app.register_blueprint(sso_routes.bp)
+    app.register_blueprint(knowledge_routes.bp)
+
     from . import models
-    
+
     with app.app_context():
         db.create_all()
 
+    @app.context_processor
+    def inject_app_name():
+        return {'app_name': app.config.get('APP_NAME', '路桥采集平台')}
+
+    from .knowledge_poller import KnowledgePoller
+    poller = KnowledgePoller(app)
+    poller.start()
+    app.config['_KNOWLEDGE_POLLER'] = poller
+
     return app

+ 20 - 0
app/config.py

@@ -19,7 +19,27 @@ def _build_database_uri():
     return os.environ.get('DATABASE_URL', '')
 
 class Config:
+    APP_NAME = os.environ.get('APP_NAME', '路桥采集平台')
     SECRET_KEY = os.environ.get('SECRET_KEY') or 'you-will-never-guess'
     SQLALCHEMY_DATABASE_URI = _build_database_uri() or \
         'sqlite:///' + os.path.join(basedir, 'app.db')
     SQLALCHEMY_TRACK_MODIFICATIONS = False
+
+    # JWT settings (used to issue local tokens)
+    JWT_SECRET_KEY = os.environ.get('JWT_SECRET_KEY') or 'jwt-secret-change-me'
+    JWT_ACCESS_TOKEN_EXPIRES = 1200  # 20 minutes
+    JWT_REFRESH_TOKEN_EXPIRES = 86400  # 24 hours
+
+    # SSO (single sign-on) configuration
+    SSO_BASE_URL = os.environ.get('SSO_BASE_URL', 'http://192.168.92.61:8200')
+    SSO_CLIENT_ID = os.environ.get('SSO_CLIENT_ID', '')
+    SSO_CLIENT_SECRET = os.environ.get('SSO_CLIENT_SECRET', '')
+    SSO_REDIRECT_URI = os.environ.get('SSO_REDIRECT_URI', 'http://localhost:5000/auth/callback')
+    SSO_FRONTEND_URL = os.environ.get('SSO_FRONTEND_URL', 'http://localhost:5000')
+    SSO_SCOPE = os.environ.get('SSO_SCOPE', 'email')
+    SSO_LOGOUT_REDIRECT_URL = os.environ.get('SSO_LOGOUT_REDIRECT_URL', 'http://192.168.92.61:9200/login')
+
+    # Sample Center configuration
+    SAMPLE_CENTER_BASE_URL = os.environ.get('SAMPLE_CENTER_BASE_URL', 'http://192.168.92.61')
+    SAMPLE_CENTER_APP_ID = os.environ.get('SAMPLE_CENTER_APP_ID', '')
+    SAMPLE_CENTER_APP_SECRET = os.environ.get('SAMPLE_CENTER_APP_SECRET', '')
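`Config` falls back to fixed strings for `SECRET_KEY` and `JWT_SECRET_KEY` when the environment does not set them, and DEPLOY.md asks for random values in production. A small startup check along the following lines could flag leftover placeholders. This is a sketch under assumptions: `find_placeholder_secrets` and `generate_secret` are hypothetical helpers that do not exist in the repository, and the placeholder set is simply the values visible in this diff.

```python
import secrets

# Placeholder values that appear in this diff (config.py fallbacks and the DEPLOY.md sample).
PLACEHOLDER_SECRETS = {
    "you-will-never-guess",
    "jwt-secret-change-me",
    "jwt-secret-change-me-to-random-string",
    "change-me-to-a-random-secret",
}


def find_placeholder_secrets(config, keys=("SECRET_KEY", "JWT_SECRET_KEY")):
    """Return the names of secret keys that are empty or still set to a known placeholder."""
    return [k for k in keys if not config.get(k) or config.get(k) in PLACEHOLDER_SECRETS]


def generate_secret(num_bytes=32):
    """Generate a URL-safe random string that could serve as a signing key."""
    return secrets.token_urlsafe(num_bytes)
```

One possible use is to call `find_placeholder_secrets(app.config)` inside `create_app` and log a warning, or refuse to start, when the list is non-empty and `FLASK_DEBUG` is off.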

+ 113 - 0
app/knowledge_poller.py

@@ -0,0 +1,113 @@
+import json
+import logging
+import threading
+from datetime import datetime, timedelta
+from app.sample_center_client import SampleCenterClient, SampleCenterError
+
+logger = logging.getLogger(__name__)
+
+MAX_POLL_COUNT = 20
+POLL_INTERVAL_INIT = 2
+POLL_INTERVAL_MAX = 30
+POLL_MULTIPLIER = 1.5
+
+
+class KnowledgePoller:
+    """Background polling thread that periodically checks the status of pending/processing import tasks."""
+
+    def __init__(self, app):
+        self.app = app
+        self._thread = None
+        self._stop_event = threading.Event()
+
+    def start(self):
+        if self._thread and self._thread.is_alive():
+            return
+        self._stop_event.clear()
+        self._thread = threading.Thread(target=self._run, name="knowledge-poller", daemon=True)
+        self._thread.start()
+        logger.info("Knowledge poller started")
+
+    def stop(self):
+        self._stop_event.set()
+        if self._thread:
+            self._thread.join(timeout=10)
+        logger.info("Knowledge poller stopped")
+
+    def _run(self):
+        while not self._stop_event.is_set():
+            try:
+                self._poll_due_tasks()
+            except Exception:
+                logger.exception("Poller error")
+            self._stop_event.wait(5)
+
+    def _poll_due_tasks(self):
+        from app.models import KnowledgeImportTask
+
+        with self.app.app_context():
+            now = datetime.utcnow()
+            tasks = KnowledgeImportTask.query.filter(
+                KnowledgeImportTask.status.in_(['pending', 'processing']),
+                KnowledgeImportTask.next_poll_at <= now,
+                KnowledgeImportTask.poll_count < MAX_POLL_COUNT,
+            ).all()
+
+            for task in tasks:
+                self._poll_single_task(task)
+
+    def _poll_single_task(self, task):
+        from app import db
+
+        cfg = self.app.config
+        client = SampleCenterClient(
+            base_url=cfg['SAMPLE_CENTER_BASE_URL'],
+            app_id=cfg['SAMPLE_CENTER_APP_ID'],
+            app_secret=cfg['SAMPLE_CENTER_APP_SECRET'],
+        )
+
+        task.poll_count += 1
+        task.last_poll_at = datetime.utcnow()
+
+        try:
+            result = client.get_import_task(task.sample_task_id)
+            sc_data = result.get('data', {})
+            sc_status = sc_data.get('status', '')
+
+            task.status_detail = json.dumps(sc_data, ensure_ascii=False)
+
+            progress = sc_data.get('progress')
+            if progress:
+                task.progress = json.dumps(progress, ensure_ascii=False)
+
+            if sc_status in ('completed',):
+                task.status = 'success'
+                task.next_poll_at = None
+            elif sc_status == 'failed':
+                task.status = 'failed'
+                task.error_message = sc_data.get('error', '')
+                task.next_poll_at = None
+            else:
+                task.status = 'processing'
+                interval = min(
+                    POLL_INTERVAL_INIT * (POLL_MULTIPLIER ** (task.poll_count - 1)),
+                    POLL_INTERVAL_MAX,
+                )
+                task.next_poll_at = datetime.utcnow() + timedelta(seconds=interval)
+
+            db.session.commit()
+            logger.info(f"Polled task {task.task_no}: status={task.status}")
+
+        except SampleCenterError as e:
+            task.error_message = str(e)
+            interval = min(
+                POLL_INTERVAL_INIT * (POLL_MULTIPLIER ** (task.poll_count - 1)),
+                POLL_INTERVAL_MAX,
+            )
+            task.next_poll_at = datetime.utcnow() + timedelta(seconds=interval)
+            db.session.commit()
+            logger.warning(f"Poll error for {task.task_no}: {e}")
+
+        except Exception:
+            db.session.rollback()
+            logger.exception(f"Unexpected poll error for {task.task_no}")
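To make the backoff in `_poll_single_task` concrete, the sketch below reproduces its interval formula with the constants from this file and lists the resulting delays. `next_poll_interval` and `schedule` are illustrative helper names, not functions in the repository.

```python
POLL_INTERVAL_INIT = 2
POLL_INTERVAL_MAX = 30
POLL_MULTIPLIER = 1.5


def next_poll_interval(poll_count):
    """Delay in seconds scheduled after the poll_count-th poll (1-based)."""
    return min(POLL_INTERVAL_INIT * (POLL_MULTIPLIER ** (poll_count - 1)), POLL_INTERVAL_MAX)


def schedule(max_polls):
    """List the delays scheduled after polls 1..max_polls."""
    return [next_poll_interval(n) for n in range(1, max_polls + 1)]


if __name__ == "__main__":
    for n, delay in enumerate(schedule(10), start=1):
        print(f"after poll {n}: wait {delay:.2f}s")
```

With these constants the delay starts at 2 seconds and reaches the 30-second cap from the eighth poll onward. Because the poller thread only wakes every 5 seconds, the effective delay is the computed interval rounded up to the next wake-up.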

+ 30 - 1
app/models.py

@@ -6,7 +6,15 @@ from datetime import datetime
 class User(UserMixin, db.Model):
     id = db.Column(db.Integer, primary_key=True)
     username = db.Column(db.String(64), index=True, unique=True)
-    password_hash = db.Column(db.String(256))
+    password_hash = db.Column(db.String(512))
+
+    # SSO-related fields
+    sso_sub = db.Column(db.String(256), unique=True, nullable=True)
+    real_name = db.Column(db.String(100), nullable=True)
+    roles = db.Column(db.Text, nullable=True)  # JSON string, e.g. ["super_admin", "sam_sys_admin"]
+    email = db.Column(db.String(120), nullable=True)
+    phone = db.Column(db.String(30), nullable=True)
+    avatar_url = db.Column(db.String(500), nullable=True)
 
     def set_password(self, password):
         self.password_hash = generate_password_hash(password)
@@ -119,3 +127,24 @@ class AIMessage(db.Model):
     created_at = db.Column(db.DateTime, default=datetime.utcnow)
 
 CollectionTask = SpiderTask
+
+class KnowledgeImportTask(db.Model):
+    """Local record of a Sample Center batch import task."""
+    __tablename__ = 'knowledge_import_task'
+
+    id = db.Column(db.Integer, primary_key=True)
+    task_no = db.Column(db.String(64), unique=True, nullable=False, index=True)
+    sample_task_id = db.Column(db.String(64), nullable=True, index=True)
+    kb_id = db.Column(db.String(64), nullable=False, index=True)
+    kb_name = db.Column(db.String(200), nullable=True)
+    callback_url = db.Column(db.String(500), nullable=True)
+    status = db.Column(db.String(20), default='pending', index=True)  # pending/processing/success/failed
+    status_detail = db.Column(db.Text, nullable=True)  # JSON
+    progress = db.Column(db.Text, nullable=True)  # JSON: {total, processed, succeeded, failed}
+    poll_count = db.Column(db.Integer, default=0)
+    last_poll_at = db.Column(db.DateTime, nullable=True)
+    next_poll_at = db.Column(db.DateTime, nullable=True, index=True)
+    error_message = db.Column(db.Text, nullable=True)
+    created_by = db.Column(db.Integer, db.ForeignKey('user.id'), nullable=True)
+    created_at = db.Column(db.DateTime, default=datetime.utcnow)
+    updated_at = db.Column(db.DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)

+ 2 - 2
app/routes/__pycache__/main_routes.cpython-311.pyc

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:884ccb26db2b9167e509649ddf51c9be5d6a977acd50ab4187a35968633f46b7
-size 16260
+oid sha256:c6fc0ce1ffdcbd909e4702ee9fbb25dbba52804782bd9aff6b5b927f61667e3d
+size 16449

+ 294 - 0
app/routes/knowledge_routes.py

@@ -0,0 +1,294 @@
+import json
+import logging
+import secrets
+import traceback
+from datetime import datetime, timedelta
+from functools import wraps
+from flask import Blueprint, render_template, request, jsonify, current_app, url_for
+from flask_login import current_user, login_required
+from app import db
+from app.models import KnowledgeImportTask
+from app.sample_center_client import SampleCenterClient, SampleCenterError
+
+logger = logging.getLogger(__name__)
+
+bp = Blueprint('knowledge', __name__)
+
+
+def _get_sc_client():
+    cfg = current_app.config
+    return SampleCenterClient(
+        base_url=cfg['SAMPLE_CENTER_BASE_URL'],
+        app_id=cfg['SAMPLE_CENTER_APP_ID'],
+        app_secret=cfg['SAMPLE_CENTER_APP_SECRET'],
+    )
+
+
+def jwt_required(f):
+    """JWT verification decorator (reuses the sso_routes logic)."""
+    @wraps(f)
+    def decorated(*args, **kwargs):
+        auth_header = request.headers.get('Authorization', '')
+        if not auth_header.startswith('Bearer '):
+            return jsonify({'code': '401001', 'message': 'No valid Bearer token provided'}), 401
+        token = auth_header[7:]
+        try:
+            import jwt as pyjwt
+            payload = pyjwt.decode(token, current_app.config['JWT_SECRET_KEY'], algorithms=['HS256'])
+            from app.models import User
+            user = User.query.get(int(payload['sub']))
+            if not user:
+                return jsonify({'code': '401002', 'message': 'User not found'}), 401
+            request.current_user = user
+            request.jwt_payload = payload
+        except Exception:
+            return jsonify({'code': '401004', 'message': 'Invalid token'}), 401
+        return f(*args, **kwargs)
+    return decorated
+
+
+# ==================== Page routes ====================
+
+@bp.route('/knowledge')
+@login_required
+def knowledge_page():
+    return render_template('knowledge_management.html')
+
+
+# ==================== API proxy routes ====================
+
+@bp.route('/api/v1/knowledge/bases')
+@jwt_required
+def list_knowledge_bases():
+    page = request.args.get('page', 1, type=int)
+    page_size = request.args.get('page_size', 20, type=int)
+    try:
+        cfg = current_app.config
+        logger.info(f"=== Sample Center config === base_url={cfg.get('SAMPLE_CENTER_BASE_URL')}, app_id={cfg.get('SAMPLE_CENTER_APP_ID')}")
+        client = _get_sc_client()
+        result = client.list_knowledge_bases(page=page, page_size=page_size)
+        return jsonify(result)
+    except SampleCenterError as e:
+        logger.error(f"Sample Center call failed: {e}")
+        logger.error(traceback.format_exc())
+        return jsonify({'code': '500001', 'message': str(e), 'detail': traceback.format_exc()}), 500
+    except Exception as e:
+        logger.error(f"Unexpected error: {e}")
+        logger.error(traceback.format_exc())
+        return jsonify({'code': '500002', 'message': str(e), 'detail': traceback.format_exc()}), 500
+
+
+@bp.route('/api/v1/knowledge/bases/<kb_id>')
+@jwt_required
+def get_knowledge_base(kb_id):
+    try:
+        result = _get_sc_client().get_knowledge_base(kb_id)
+        return jsonify(result)
+    except SampleCenterError as e:
+        logger.error(f"Failed to fetch knowledge base details: {e}")
+        logger.error(traceback.format_exc())
+        return jsonify({'code': '500001', 'message': str(e), 'detail': traceback.format_exc()}), 500
+    except Exception as e:
+        logger.error(f"Unexpected error: {e}")
+        logger.error(traceback.format_exc())
+        return jsonify({'code': '500002', 'message': str(e), 'detail': traceback.format_exc()}), 500
+
+
+@bp.route('/api/v1/knowledge/bases/<kb_id>/batch-import', methods=['POST'])
+@jwt_required
+def submit_batch_import(kb_id):
+    data = request.json or {}
+    task_no = data.get('task_no')
+    parents = data.get('parents', [])
+    children = data.get('children')
+
+    if not task_no:
+        return jsonify({'code': '100001', 'message': 'Missing task_no'}), 400
+    if not parents:
+        return jsonify({'code': '100002', 'message': 'parents must not be empty'}), 400
+
+    callback_url = url_for('knowledge.knowledge_import_callback', _external=True)
+
+    try:
+        client = _get_sc_client()
+        logger.info(f"Submitting batch import: kb_id={kb_id}, task_no={task_no}")
+        result = client.batch_import(
+            kb_id=kb_id,
+            task_no=task_no,
+            parents=parents,
+            children=children,
+            callback_url=callback_url,
+        )
+        sc_data = result.get('data', {})
+        sample_task_id = sc_data.get('task_id', '')
+
+        kb_name = data.get('kb_name', '')
+
+        task = KnowledgeImportTask(
+            task_no=task_no,
+            sample_task_id=sample_task_id,
+            kb_id=kb_id,
+            kb_name=kb_name,
+            callback_url=callback_url,
+            status='pending',
+            next_poll_at=datetime.utcnow() + timedelta(seconds=2),
+            created_by=request.current_user.id,
+        )
+        db.session.add(task)
+        db.session.commit()
+
+        return jsonify({
+            'code': '000000',
+            'message': 'Import task submitted',
+            'data': {
+                'task_no': task_no,
+                'sample_task_id': sample_task_id,
+                'status': 'pending',
+            },
+        })
+    except SampleCenterError as e:
+        db.session.rollback()
+        logger.error(f"Batch import failed: {e}")
+        logger.error(traceback.format_exc())
+        return jsonify({'code': '500001', 'message': str(e), 'detail': traceback.format_exc()}), 500
+    except Exception as e:
+        db.session.rollback()
+        logger.error(f"Unexpected error: {e}")
+        logger.error(traceback.format_exc())
+        return jsonify({'code': '500002', 'message': str(e), 'detail': traceback.format_exc()}), 500
+
+
+@bp.route('/api/v1/knowledge/import-tasks')
+@jwt_required
+def list_import_tasks():
+    page = request.args.get('page', 1, type=int)
+    page_size = request.args.get('page_size', 20, type=int)
+    status = request.args.get('status', '')
+    kb_id = request.args.get('kb_id', '')
+
+    query = KnowledgeImportTask.query
+    if status:
+        query = query.filter_by(status=status)
+    if kb_id:
+        query = query.filter_by(kb_id=kb_id)
+
+    query = query.order_by(KnowledgeImportTask.created_at.desc())
+    total = query.count()
+    tasks = query.offset((page - 1) * page_size).limit(page_size).all()
+
+    items = []
+    for t in tasks:
+        items.append({
+            'id': t.id,
+            'task_no': t.task_no,
+            'sample_task_id': t.sample_task_id,
+            'kb_id': t.kb_id,
+            'kb_name': t.kb_name,
+            'status': t.status,
+            'progress': json.loads(t.progress) if t.progress else None,
+            'poll_count': t.poll_count,
+            'error_message': t.error_message,
+            'created_at': t.created_at.isoformat() if t.created_at else '',
+            'updated_at': t.updated_at.isoformat() if t.updated_at else '',
+        })
+
+    return jsonify({
+        'code': '000000',
+        'data': {
+            'total': total,
+            'page': page,
+            'page_size': page_size,
+            'items': items,
+        },
+    })
+
+
+@bp.route('/api/v1/knowledge/import-tasks/<task_no>')
+@jwt_required
+def get_import_task(task_no):
+    task = KnowledgeImportTask.query.filter_by(task_no=task_no).first()
+    if not task:
+        return jsonify({'code': '100001', 'message': 'Task not found'}), 404
+
+    # If the task is not in a terminal state, query the Sample Center synchronously
+    if task.status in ('pending', 'processing') and task.sample_task_id:
+        try:
+            client = _get_sc_client()
+            result = client.get_import_task(task.sample_task_id)
+            sc_data = result.get('data', {})
+            sc_status = sc_data.get('status', '')
+
+            task.status_detail = json.dumps(sc_data, ensure_ascii=False)
+
+            progress = sc_data.get('progress')
+            if progress:
+                task.progress = json.dumps(progress, ensure_ascii=False)
+
+            if sc_status in ('completed',):
+                task.status = 'success'
+                task.next_poll_at = None
+            elif sc_status == 'failed':
+                task.status = 'failed'
+                task.error_message = sc_data.get('error', '')
+                task.next_poll_at = None
+            elif sc_status == 'processing':
+                task.status = 'processing'
+
+            task.last_poll_at = datetime.utcnow()
+            db.session.commit()
+        except SampleCenterError as e:
+            logger.warning(f"Failed to query Sample Center task status: {e}")
+        except Exception as e:
+            logger.error(f"Error while fetching task details: {e}")
+            logger.error(traceback.format_exc())
+
+    return jsonify({
+        'code': '000000',
+        'data': {
+            'id': task.id,
+            'task_no': task.task_no,
+            'sample_task_id': task.sample_task_id,
+            'kb_id': task.kb_id,
+            'kb_name': task.kb_name,
+            'status': task.status,
+            'progress': json.loads(task.progress) if task.progress else None,
+            'status_detail': json.loads(task.status_detail) if task.status_detail else None,
+            'error_message': task.error_message,
+            'poll_count': task.poll_count,
+            'created_at': task.created_at.isoformat() if task.created_at else '',
+            'updated_at': task.updated_at.isoformat() if task.updated_at else '',
+        },
+    })
+
+
+@bp.route('/api/v1/callback/knowledge-import', methods=['POST'])
+def knowledge_import_callback():
+    """Receives import task callbacks from the Sample Center."""
+    data = request.json or {}
+    task_id = data.get('task_id')
+    status = data.get('status')
+
+    current_app.logger.info(f"Callback received: task_id={task_id}, status={status}")
+
+    task = KnowledgeImportTask.query.filter_by(sample_task_id=task_id).first()
+    if not task:
+        return jsonify({'code': '100001', 'message': 'Task not found'}), 404
+
+    task.status_detail = json.dumps(data, ensure_ascii=False)
+
+    if status in ('completed',):
+        task.status = 'success'
+    elif status == 'failed':
+        task.status = 'failed'
+        task.error_message = data.get('error', '')
+    else:
+        task.status = status
+
+    progress = data.get('progress')
+    if progress:
+        task.progress = json.dumps(progress, ensure_ascii=False)
+
+    task.next_poll_at = None
+    db.session.commit()
+
+    return jsonify({'code': '000000', 'message': 'Callback received'})
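The translation from Sample Center status to local task status appears three times in this commit (the poller, `get_import_task`, and the callback), each with a slightly different fallback for unrecognized values. One way to keep them consistent would be a shared helper, sketched below. `map_sample_center_status` and `is_terminal` are hypothetical names, and keeping the current status for unknown values is an assumption of this sketch rather than the behavior of any one of the three call sites.

```python
TERMINAL_STATUSES = ("success", "failed")


def map_sample_center_status(sc_status, current="pending"):
    """Translate a Sample Center status into the local task status.

    'completed' maps to 'success'; 'failed' and 'processing' map to themselves;
    any other value keeps the current local status (an assumption of this sketch).
    """
    if sc_status == "completed":
        return "success"
    if sc_status in ("failed", "processing"):
        return sc_status
    return current


def is_terminal(status):
    """True once a task no longer needs polling."""
    return status in TERMINAL_STATUSES
```

A caller could then set `task.status = map_sample_center_status(sc_status, task.status)` and clear `task.next_poll_at` only when `is_terminal(task.status)` holds.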

+ 4 - 1
app/routes/main_routes.py

@@ -1,4 +1,4 @@
-from flask import Blueprint, render_template, redirect, url_for, request, jsonify, Response
+from flask import Blueprint, render_template, redirect, url_for, request, jsonify, Response, current_app
 from flask_login import current_user, login_user, logout_user, login_required
 from app.models import User, SpiderResult, AIModel, SpiderTask
 from app import db
@@ -131,6 +131,9 @@ def login():
 @bp.route('/logout')
 def logout():
     logout_user()
+    sso_logout = current_app.config.get('SSO_LOGOUT_REDIRECT_URL', '')
+    if sso_logout:
+        return redirect(sso_logout)
     return redirect(url_for('main.login'))
 
 @bp.route('/dashboard')

+ 326 - 0
app/routes/sso_routes.py

@@ -0,0 +1,326 @@
+import json
+import jwt
+import datetime
+from functools import wraps
+from flask import Blueprint, render_template, request, jsonify, current_app, redirect, url_for, session
+from flask_login import current_user, login_user, logout_user
+from app import db
+from app.models import User
+from app.sso_client import SSOClient, SSOError
+
+bp = Blueprint('sso', __name__)
+
+
+def _get_sso_client():
+    cfg = current_app.config
+    return SSOClient(
+        base_url=cfg['SSO_BASE_URL'],
+        client_id=cfg['SSO_CLIENT_ID'],
+        client_secret=cfg['SSO_CLIENT_SECRET'],
+        redirect_uri=cfg['SSO_REDIRECT_URI'],
+        scope=cfg.get('SSO_SCOPE', 'email'),
+    )
+
+
+def _generate_jwt(user, token_type='access'):
+    """Issue a local JWT."""
+    cfg = current_app.config
+    secret = cfg['JWT_SECRET_KEY']
+    now = datetime.datetime.utcnow()
+
+    payload = {
+        'sub': str(user.id),
+        'username': user.username,
+        'roles': json.loads(user.roles) if user.roles else [],
+        'type': token_type,
+        'iat': now,
+    }
+
+    if token_type == 'access':
+        payload['exp'] = now + datetime.timedelta(seconds=cfg.get('JWT_ACCESS_TOKEN_EXPIRES', 1200))
+    else:
+        payload['exp'] = now + datetime.timedelta(seconds=cfg.get('JWT_REFRESH_TOKEN_EXPIRES', 86400))
+
+    return jwt.encode(payload, secret, algorithm='HS256')
+
+
+def _decode_jwt(token, allow_expired=False):
+    """Decode a JWT."""
+    cfg = current_app.config
+    secret = cfg['JWT_SECRET_KEY']
+    options = {}
+    if allow_expired:
+        options['verify_exp'] = False
+    return jwt.decode(token, secret, algorithms=['HS256'], options=options)
+
+
+def _sync_user(sso_info):
+    """Find or create a local user from SSO user info."""
+    sso_sub = sso_info.get('sub', '')
+    username = sso_info.get('username', sso_sub)
+    real_name = sso_info.get('real_name', '')
+    email = sso_info.get('email', '')
+    phone = sso_info.get('phone', '')
+    avatar_url = sso_info.get('avatar_url', '')
+    roles_raw = sso_info.get('roles', [])
+
+    # Extract the code from each role object in the list
+    role_codes = []
+    if isinstance(roles_raw, list):
+        for r in roles_raw:
+            if isinstance(r, dict):
+                role_codes.append(r.get('code', ''))
+            elif isinstance(r, str):
+                role_codes.append(r)
+    role_codes = [r for r in role_codes if r]
+
+    # Look up the user: by sso_sub first, then by username
+    user = None
+    if sso_sub:
+        user = User.query.filter_by(sso_sub=sso_sub).first()
+
+    if not user:
+        user = User.query.filter_by(username=username).first()
+
+    if not user:
+        # Create a new user (SSO users have no password)
+        user = User(
+            username=username,
+            sso_sub=sso_sub,
+        )
+        db.session.add(user)
+
+    # Sync fields
+    user.sso_sub = sso_sub
+    user.real_name = real_name
+    user.roles = json.dumps(role_codes, ensure_ascii=False)
+    user.email = email
+    user.phone = phone
+    user.avatar_url = avatar_url
+
+    return user
+
+
+def jwt_required(f):
+    """JWT verification decorator for /api/v1/auth/* routes."""
+    @wraps(f)
+    def decorated(*args, **kwargs):
+        auth_header = request.headers.get('Authorization', '')
+        if not auth_header.startswith('Bearer '):
+            return jsonify({'code': '401001', 'message': 'No valid Bearer token provided'}), 401
+        token = auth_header[7:]
+        try:
+            payload = _decode_jwt(token)
+            user = User.query.get(int(payload['sub']))
+            if not user:
+                return jsonify({'code': '401002', 'message': 'User not found'}), 401
+            request.current_user = user
+            request.jwt_payload = payload
+        except jwt.ExpiredSignatureError:
+            return jsonify({'code': '401003', 'message': 'Token expired'}), 401
+        except jwt.InvalidTokenError:
+            return jsonify({'code': '401004', 'message': 'Invalid token'}), 401
+        return f(*args, **kwargs)
+    return decorated
+
+
+# ==================== SSO core endpoints ====================
+
+@bp.route('/api/oauth/exchange-code', methods=['POST'])
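+def exchange_code():
+    """
+    SSO code exchange endpoint (core of the password-free login).
+    The frontend calls this endpoint with the code to obtain a local JWT.
+    """
+    data = request.json or {}
+    code = data.get('code')
+    if not code:
+        return jsonify({'code': '100001', 'message': 'Missing authorization code', 'data': None}), 400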
+
+    try:
+        client = _get_sso_client()
+
+        # 1. Exchange the code for an SSO access_token
+        token_resp = client.exchange_code_for_token(code)
+        sso_access_token = token_resp['access_token']
+        sso_refresh_token = token_resp.get('refresh_token', '')
+
+        # 2. Fetch user info
+        sso_info = client.get_userinfo(sso_access_token)
+
+        # 3. Sync the user to the local database
+        user = _sync_user(sso_info)
+        db.session.commit()
+
+        # 4. Log in via Flask-Login (used for template rendering)
+        login_user(user)
+
+        # 5. Issue local JWTs
+        access_token = _generate_jwt(user, token_type='access')
+        refresh_token = _generate_jwt(user, token_type='refresh')
+
+        role_codes = json.loads(user.roles) if user.roles else []
+        user_data = {
+            'id': str(user.id),
+            'username': user.username,
+            'email': user.email or '',
+            'phone': user.phone or '',
+            'real_name': user.real_name or '',
+            'is_superuser': 'super_admin' in role_codes,
+            'is_active': True,
+            'roles': role_codes,
+        }
+
+        return jsonify({
+            'code': '000000',
+            'message': 'Login successful',
+            'data': {
+                'token': access_token,
+                'refresh_token': refresh_token,
+                'user': user_data,
+            },
+        })
+
+    except SSOError as e:
+        return jsonify({'code': '400001', 'message': f'登录失败: {str(e)}', 'data': None}), 400
+    except Exception as e:
+        db.session.rollback()
+        return jsonify({'code': '500001', 'message': f'登录失败: {str(e)}', 'data': None}), 500
+
+
+@bp.route('/auth/sso/authorize')
+def sso_authorize():
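+    """
+    Return the SSO authorization URL; the frontend navigates to it when the user clicks "SSO login".
+    Pass ?redirect=true to get a direct 302 redirect instead of JSON.
+    """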
+    client = _get_sso_client()
+    authorize_url = client.get_authorize_url()
+
+    if request.args.get('redirect', '').lower() == 'true':
+        return redirect(authorize_url)
+
+    return jsonify({
+        'code': '000000',
+        'message': '获取授权URL成功',
+        'data': {'authorize_url': authorize_url},
+    })
+
+
+@bp.route('/auth/callback')
+def sso_callback():
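+    """
+    SSO callback endpoint (legacy flow, backend-driven).
+    Handles the code/error parameters returned by SSO: errors are redirected to the
+    frontend callback page, a valid code is handed to the callback template.
+    """
+    error = request.args.get('error')
+    if error:
+        from urllib.parse import quote  # local import keeps this fix self-contained
+        error_desc = request.args.get('error_description', error)
+        frontend_url = current_app.config.get('SSO_FRONTEND_URL', 'http://localhost:5000')
+        # Percent-encode the description so spaces, '&' and '#' cannot corrupt the query string
+        return redirect(f"{frontend_url}/auth/callback?error={quote(error_desc, safe='')}")
+
+    # Render the callback page; its JavaScript performs the code exchange
+    code = request.args.get('code')
+    if not code:
+        return jsonify({'code': '100001', 'message': '缺少授权码', 'data': None}), 400
+
+    return render_template('auth_callback.html', code=code)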
+
+
+# ==================== Standard auth endpoints ====================
+
+@bp.route('/api/v1/auth/login', methods=['POST'])
+def local_login():
+    """Local username/password login (fallback when SSO is unavailable)."""
+    data = request.json or {}
+    username = data.get('username')
+    password = data.get('password')
+
+    if not username or not password:
+        return jsonify({'code': '100002', 'message': '用户名和密码不能为空', 'data': None}), 400
+
+    user = User.query.filter_by(username=username).first()
+    if not user or not user.check_password(password):
+        return jsonify({'code': '100003', 'message': '用户名或密码错误', 'data': None}), 401
+
+    login_user(user)
+
+    access_token = _generate_jwt(user, token_type='access')
+    refresh_token = _generate_jwt(user, token_type='refresh')
+
+    return jsonify({
+        'code': '000000',
+        'message': '登录成功',
+        'data': {
+            'access_token': access_token,
+            'refresh_token': refresh_token,
+            'token_type': 'Bearer',
+            'expires_in': current_app.config.get('JWT_ACCESS_TOKEN_EXPIRES', 1200),
+        },
+    })
+
+
+@bp.route('/api/v1/auth/logout', methods=['POST'])
+def local_logout():
+    """Log out and return the SSO logout URL."""
+    logout_user()
+    sso_logout_url = current_app.config.get('SSO_LOGOUT_REDIRECT_URL', '')
+    return jsonify({
+        'code': '000000',
+        'message': '登出成功',
+        'data': {'sso_logout_url': sso_logout_url},
+    })
+
+
+@bp.route('/api/v1/auth/userinfo')
+@jwt_required
+def get_userinfo():
+    """Return the profile of the currently authenticated user."""
+    user = request.current_user
+    role_codes = json.loads(user.roles) if user.roles else []
+    return jsonify({
+        'code': '000000',
+        'data': {
+            'id': str(user.id),
+            'username': user.username,
+            'email': user.email or '',
+            'phone': user.phone or '',
+            'real_name': user.real_name or '',
+            'roles': role_codes,
+            'permissions': [],
+        },
+    })
+
+
+@bp.route('/api/v1/auth/refresh', methods=['POST'])
+def refresh_token():
+    """Exchange a refresh_token for a new access_token (the refresh_token is rotated as well)."""
+    data = request.json or {}
+    token = data.get('refresh_token')
+    if not token:
+        return jsonify({'code': '100004', 'message': '缺少 refresh_token', 'data': None}), 400
+
+    try:
+        payload = _decode_jwt(token)
+        if payload.get('type') != 'refresh':
+            return jsonify({'code': '400002', 'message': '无效的 refresh_token', 'data': None}), 400
+
+        user = User.query.get(int(payload['sub']))
+        if not user:
+            return jsonify({'code': '400003', 'message': '用户不存在', 'data': None}), 404
+
+        new_access_token = _generate_jwt(user, token_type='access')
+        new_refresh_token = _generate_jwt(user, token_type='refresh')
+
+        return jsonify({
+            'code': '000000',
+            'message': '刷新成功',
+            'data': {
+                'access_token': new_access_token,
+                'refresh_token': new_refresh_token,
+                'token_type': 'Bearer',
+                'expires_in': current_app.config.get('JWT_ACCESS_TOKEN_EXPIRES', 1200),
+            },
+        })
+    except jwt.InvalidTokenError:
+        return jsonify({'code': '401004', 'message': 'refresh_token 无效', 'data': None}), 401
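
Review note: the endpoints above share a `code` / `message` / `data` envelope, but the token field differs between them (`token` in `/api/oauth/exchange-code`, `access_token` in `/api/v1/auth/login` and `/api/v1/auth/refresh`), and `/api/v1/auth/userinfo` omits `message`. A minimal sketch of shared builders follows, assuming the team wants one shape everywhere. The names `api_response` and `token_payload` are hypothetical and not part of this commit; in the Flask views the result would still be wrapped in `jsonify(...)`.

```python
def api_response(code="000000", message="", data=None):
    """Build the code/message/data envelope used by the auth endpoints (hypothetical helper)."""
    return {"code": code, "message": message, "data": data}


def token_payload(access_token, refresh_token, expires_in=1200):
    """Build one token body for login, refresh and code exchange (hypothetical helper)."""
    return {
        "access_token": access_token,
        "refresh_token": refresh_token,
        "token_type": "Bearer",
        "expires_in": expires_in,  # seconds; 1200 mirrors the default used in the diff
    }


# Example: the body a successful refresh would return before jsonify()
ok = api_response(message="刷新成功", data=token_payload("access.jwt", "refresh.jwt"))
# Example: an error body, which always carries data=None
err = api_response(code="100004", message="缺少 refresh_token")
```

With builders like these the frontend can read `data.access_token` regardless of which endpoint issued it; whether to rename the existing `token` field is a compatibility decision for the callers.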

+ 133 - 0
app/sample_center_client.py

@@ -0,0 +1,133 @@
+import time
+import threading
+import logging
+import requests
+from datetime import datetime, timedelta
+
+logger = logging.getLogger(__name__)
+
+
+class SampleCenterError(Exception):
+    """Raised when a Sample Center request fails."""
+    pass
+
+
+class SampleCenterClient:
+    """Sample Center API client; wraps token management and endpoint calls."""
+
+    TOKEN_REFRESH_THRESHOLD = 300  # refresh the token 5 minutes before it expires
+
+    def __init__(self, base_url, app_id, app_secret):
+        self.base_url = base_url.rstrip("/")
+        self.app_id = app_id
+        self.app_secret = app_secret
+        self._access_token = None
+        self._token_expires_at = None
+        self._lock = threading.Lock()
+
+    def _ensure_token(self):
+        """Return a valid access_token, refreshing it under a lock when needed (thread-safe)."""
+        now = datetime.utcnow()
+        if (self._access_token and self._token_expires_at
+                and now < self._token_expires_at - timedelta(seconds=self.TOKEN_REFRESH_THRESHOLD)):
+            return self._access_token
+
+        with self._lock:
+            now = datetime.utcnow()
+            if (self._access_token and self._token_expires_at
+                    and now < self._token_expires_at - timedelta(seconds=self.TOKEN_REFRESH_THRESHOLD)):
+                return self._access_token
+            self._refresh_token()
+            return self._access_token
+
+    def _refresh_token(self):
+        """Call the Sample Center token endpoint and cache the result."""
+        url = f"{self.base_url}/api/v1/auth/token"
+        logger.info(f"请求 Token: url={url}, app_id={self.app_id}")
+        try:
+            resp = requests.post(url, json={
+                "app_id": self.app_id,
+                "app_secret": self.app_secret,
+            }, timeout=15)
+            logger.info(f"Token 响应: status={resp.status_code}")  # body omitted: it contains the access_token
+            if resp.status_code != 200:
+                raise SampleCenterError(f"Token request failed: {resp.status_code} {resp.text}")
+            result = resp.json()
+            if result.get("code") != "000000":
+                raise SampleCenterError(f"Token error: {result.get('message', result)}")
+            data = result["data"]
+            self._access_token = data["access_token"]
+            expires_in = data.get("expires_in", 7200)
+            self._token_expires_at = datetime.utcnow() + timedelta(seconds=expires_in)
+            logger.info(f"Token 获取成功: expires_in={expires_in}")
+        except requests.exceptions.RequestException as e:
+            logger.error(f"Token 请求异常: {e}")
+            raise SampleCenterError(f"Token request exception: {e}") from e
+
+    def _headers(self):
+        return {
+            "Authorization": f"Bearer {self._ensure_token()}",
+            "X-App-Id": self.app_id,
+        }
+
+    def _parse(self, resp):
+        """Parse a Sample Center response; raise on HTTP or business-level errors."""
+        if resp.status_code != 200:
+            logger.error(f"HTTP 错误: status={resp.status_code}, body={resp.text[:500]}")
+            raise SampleCenterError(f"HTTP {resp.status_code}: {resp.text}")
+        body = resp.json()
+        if body.get("code") != "000000":
+            logger.error(f"业务错误: code={body.get('code')}, message={body.get('message')}")
+            raise SampleCenterError(
+                f"SampleCenter error [{body.get('code')}]: {body.get('message', 'unknown')}"
+            )
+        return body
+
+    # -- Business endpoints --
+
+    def list_knowledge_bases(self, page=1, page_size=20):
+        """GET /api/v1/knowledge-bases -- list knowledge bases."""
+        url = f"{self.base_url}/api/v1/knowledge-bases"
+        logger.info(f"请求知识库列表: url={url}, page={page}, page_size={page_size}")
+        resp = requests.get(
+            url,
+            headers=self._headers(),
+            params={"page": page, "page_size": page_size},
+            timeout=30,
+        )
+        logger.info(f"知识库列表响应: status={resp.status_code}, body={resp.text[:500]}")
+        return self._parse(resp)
+
+    def get_knowledge_base(self, kb_id):
+        """GET /api/v1/knowledge-bases/{id} -- knowledge base details."""
+        url = f"{self.base_url}/api/v1/knowledge-bases/{kb_id}"
+        logger.info(f"请求知识库详情: url={url}")
+        resp = requests.get(url, headers=self._headers(), timeout=15)
+        return self._parse(resp)
+
+    def batch_import(self, kb_id, task_no, parents, children=None, callback_url=None):
+        """POST /api/v1/knowledge-bases/{kb_id}/batch-import -- batch import."""
+        url = f"{self.base_url}/api/v1/knowledge-bases/{kb_id}/batch-import"
+        payload = {
+            "task_no": task_no,
+            "parents": parents,
+        }
+        if children:
+            payload["children"] = children
+        if callback_url:
+            payload["callback_url"] = callback_url
+        logger.info(f"请求批量入库: url={url}, task_no={task_no}")
+        resp = requests.post(
+            url,
+            headers=self._headers(),
+            json=payload,
+            timeout=60,
+        )
+        return self._parse(resp)
+
+    def get_import_task(self, task_id):
+        """GET /api/v1/knowledge-bases/batch-import/{task_id} -- look up an import task."""
+        url = f"{self.base_url}/api/v1/knowledge-bases/batch-import/{task_id}"
+        logger.info(f"请求入库任务: url={url}")
+        resp = requests.get(url, headers=self._headers(), timeout=15)
+        return self._parse(resp)
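
Review note: `_ensure_token` uses double-checked locking, with a lock-free fast path followed by a re-check under the lock so that only one thread refreshes. The sketch below isolates that pattern so it can be unit-tested without HTTP. It is an illustration under stated assumptions, not the code in this commit: `TokenCache`, `fetch` and `clock` are hypothetical names, and it uses `time.monotonic` rather than `datetime.utcnow()` so that wall-clock adjustments cannot affect expiry.

```python
import threading
import time


class TokenCache:
    """Cache a token and refresh it shortly before expiry (hypothetical helper)."""

    def __init__(self, fetch, refresh_threshold=300, clock=time.monotonic):
        self._fetch = fetch                # callable returning (token, expires_in_seconds)
        self._threshold = refresh_threshold
        self._clock = clock                # injectable so tests can control time
        self._state = (None, 0.0)          # (token, expires_at), swapped as one tuple
        self._lock = threading.Lock()

    def _fresh_token(self):
        token, expires_at = self._state
        if token is not None and self._clock() < expires_at - self._threshold:
            return token
        return None

    def get(self):
        token = self._fresh_token()        # fast path, no lock taken
        if token is not None:
            return token
        with self._lock:
            token = self._fresh_token()    # re-check: another thread may have refreshed
            if token is None:
                token, expires_in = self._fetch()
                self._state = (token, self._clock() + expires_in)
            return token


# Usage with a fake clock and a fake fetcher
calls = []
now = [0.0]


def fake_fetch():
    calls.append(1)
    return f"token-{len(calls)}", 7200


cache = TokenCache(fake_fetch, clock=lambda: now[0])
first = cache.get()
second = cache.get()      # served from the cache
now[0] = 7000.0           # inside the 300-second refresh window
third = cache.get()       # triggers a second fetch
```

Storing the token and its expiry as a single tuple means a reader on the fast path never sees a new token paired with an old expiry, or the reverse.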

+ 60 - 0
app/sso_client.py

@@ -0,0 +1,60 @@
+import requests
+
+
+class SSOClient:
+    """HTTP client for the unified authentication platform (LQAI-middle-platform)."""
+
+    def __init__(self, base_url, client_id, client_secret, redirect_uri, scope="email"):
+        self.base_url = base_url.rstrip("/")
+        self.client_id = client_id
+        self.client_secret = client_secret
+        self.redirect_uri = redirect_uri
+        self.scope = scope
+
+    def get_authorize_url(self, state=None):
+        """Build the SSO authorization URL (GET /oauth/authorize)."""
+        params = {
+            "response_type": "code",
+            "client_id": self.client_id,
+            "redirect_uri": self.redirect_uri,
+            "scope": self.scope,
+        }
+        if state:
+            params["state"] = state
+        # Let requests percent-encode the query; joining raw values breaks on ':', '/', '&' in redirect_uri
+        return requests.Request("GET", f"{self.base_url}/oauth/authorize", params=params).prepare().url
+
+    def exchange_code_for_token(self, code):
+        """Exchange the authorization code for an SSO access_token (POST /oauth/token)."""
+        url = f"{self.base_url}/oauth/token"
+        data = {
+            "grant_type": "authorization_code",
+            "code": code,
+            "redirect_uri": self.redirect_uri,
+            "client_id": self.client_id,
+            "client_secret": self.client_secret,
+        }
+        resp = requests.post(url, data=data, timeout=15)
+        if resp.status_code != 200:
+            raise SSOError(f"Token exchange failed: {resp.status_code} {resp.text}")
+        result = resp.json()
+        if "access_token" not in result:
+            raise SSOError(f"Unexpected token response: {result}")
+        return result
+
+    def get_userinfo(self, access_token):
+        """Fetch user info with the SSO access_token (GET /oauth/userinfo)."""
+        url = f"{self.base_url}/oauth/userinfo"
+        headers = {"Authorization": f"Bearer {access_token}"}
+        resp = requests.get(url, headers=headers, timeout=15)
+        if resp.status_code != 200:
+            raise SSOError(f"UserInfo failed: {resp.status_code} {resp.text}")
+        result = resp.json()
+        if "error" in result:
+            raise SSOError(f"UserInfo error: {result.get('error_description', result['error'])}")
+        return result
+
+
+class SSOError(Exception):
+    """Raised when an interaction with the SSO platform fails."""
+    pass
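
Review note: `get_authorize_url` places `redirect_uri` and `state` in a query string, so their values need percent-encoding; otherwise a `redirect_uri` that itself contains `?` or `&` is split into extra parameters. The sketch below shows the encoding with the standard library only. `build_authorize_url` is a hypothetical standalone function mirroring the method's parameters, and the hosts are placeholders.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_authorize_url(base_url, client_id, redirect_uri, scope="email", state=None):
    """Standalone illustration of building an OAuth authorize URL (hypothetical helper)."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
    }
    if state:
        params["state"] = state
    # urlencode percent-encodes reserved characters in every value
    return f"{base_url.rstrip('/')}/oauth/authorize?{urlencode(params)}"


url = build_authorize_url(
    "http://sso.example.test",
    "demo-client",
    "http://app.example.test/auth/callback?next=/a&b=1",
    state="xyz",
)
# Parsing the URL back recovers the original values unchanged
query = parse_qs(urlsplit(url).query)
```

The round trip through `parse_qs` is a quick way to check that the authorization server would receive the same `redirect_uri` that was registered.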

+ 82 - 0
app/templates/auth_callback.html

@@ -0,0 +1,82 @@
+<!DOCTYPE html>
+<html lang="zh-CN">
+<head>
+    <meta charset="utf-8">
+    <meta content="width=device-width, initial-scale=1.0" name="viewport">
+    <title>SSO 登录回调</title>
+    <script src="{{ url_for('static', filename='js/tailwindcss.js') }}"></script>
+    <link href="{{ url_for('static', filename='fontawesome/css/all.min.css') }}" rel="stylesheet">
+    <link href="{{ url_for('static', filename='css/style.css') }}" rel="stylesheet">
+</head>
+<body class="min-h-screen flex items-center justify-center" style="background-color: #0a192f;">
+    <div class="tech-panel p-10 rounded-xl w-full max-w-md shadow-2xl border-blue-500/50 text-center">
+        <div id="loading-view">
+            <i class="fas fa-spinner fa-spin text-5xl text-cyan-400 mb-6"></i>
+            <h2 class="text-2xl font-bold text-cyan-300 mb-2">SSO 登录中</h2>
+            <p class="text-gray-400">正在验证身份,请稍候...</p>
+        </div>
+        <div id="error-view" class="hidden">
+            <i class="fas fa-exclamation-circle text-5xl text-red-400 mb-6"></i>
+            <h2 class="text-2xl font-bold text-red-300 mb-2">登录失败</h2>
+            <p id="error-message" class="text-gray-400 mb-6"></p>
+            <a href="{{ url_for('main.login') }}" class="tech-button inline-block px-6 py-3 bg-cyan-600 hover:bg-cyan-500 text-white font-bold rounded-lg transition duration-300">
+                <i class="fas fa-arrow-left mr-2"></i> 返回登录页
+            </a>
+        </div>
+        <div id="success-view" class="hidden">
+            <i class="fas fa-check-circle text-5xl text-green-400 mb-6"></i>
+            <h2 class="text-2xl font-bold text-green-300 mb-2">登录成功</h2>
+            <p class="text-gray-400">正在跳转首页...</p>
+        </div>
+    </div>
+
+    <script>
+        (async function () {
+            const code = {{ code|tojson }};
+            const loadingView = document.getElementById('loading-view');
+            const errorView = document.getElementById('error-view');
+            const successView = document.getElementById('success-view');
+            const errorMessage = document.getElementById('error-message');
+
+            if (!code) {
+                showError('缺少授权码');
+                return;
+            }
+
+            try {
+                const response = await fetch('/api/oauth/exchange-code', {
+                    method: 'POST',
+                    headers: { 'Content-Type': 'application/json' },
+                    body: JSON.stringify({ code: code })
+                });
+
+                const result = await response.json();
+
+                if (result.code === '000000') {
+                    // Persist the tokens and user profile
+                    localStorage.setItem('token', result.data.token);
+                    localStorage.setItem('refresh_token', result.data.refresh_token);
+                    localStorage.setItem('user', JSON.stringify(result.data.user));
+
+                    // Show the success view, then redirect to the dashboard
+                    loadingView.classList.add('hidden');
+                    successView.classList.remove('hidden');
+                    setTimeout(function () {
+                        window.location.href = '{{ url_for("main.dashboard") }}';
+                    }, 800);
+                } else {
+                    showError(result.message || '登录失败');
+                }
+            } catch (err) {
+                showError('网络请求失败: ' + err.message);
+            }
+
+            function showError(msg) {
+                loadingView.classList.add('hidden');
+                errorMessage.textContent = msg;
+                errorView.classList.remove('hidden');
+            }
+        })();
+    </script>
+</body>
+</html>

+ 1 - 1
app/templates/base.html

@@ -3,7 +3,7 @@
 <head>
     <meta charset="utf-8">
     <meta content="width=device-width, initial-scale=1.0" name="viewport">
-    <title>{% block title %}AI智能政企瞭望舆情数据采集与分析系统{% endblock %}</title>
+    <title>{% block title %}{{ app_name }}{% endblock %}</title>
     <!-- Tailwind CSS -->
     <script src="{{ url_for('static', filename='js/tailwindcss.js') }}"></script>
     <!-- Font Awesome -->

+ 454 - 0
app/templates/knowledge_management.html

@@ -0,0 +1,454 @@
+{% extends "base.html" %}
+
+{% block content %}
+<div class="flex h-screen overflow-hidden" id="knowledge-view">
+    <!-- Sidebar -->
+    {% include 'partials/sidebar.html' %}
+
+    <!-- Main Content -->
+    <div class="flex-1 flex flex-col min-w-0 overflow-hidden bg-gray-900/80 relative">
+        <!-- Top Header -->
+        <header class="h-16 flex items-center justify-between px-6 z-20 bg-gray-900/80 backdrop-blur-md border-b border-blue-900/30">
+            <button class="md:hidden text-gray-300 focus:outline-none" id="open-sidebar">
+                <i class="fas fa-bars text-xl"></i>
+            </button>
+            <h1 class="text-lg md:text-2xl font-bold tech-title truncate ml-2">样本中心管理</h1>
+            <div class="flex items-center space-x-4">
+                <div class="flex items-center space-x-2">
+                    <div class="w-8 h-8 rounded-full bg-blue-500 flex items-center justify-center text-white font-bold border border-cyan-400 shadow-md">A</div>
+                    <span class="hidden md:inline text-sm text-gray-300">{{ current_user.username }}</span>
+                </div>
+            </div>
+        </header>
+
+        <!-- Main Scrollable Area -->
+        <main class="flex-1 overflow-y-auto p-4 md:p-6 scroll-smooth">
+            <div class="h-full flex flex-col gap-6">
+                <!-- Tabs -->
+                <div class="bg-gray-800 rounded-lg shadow-lg">
+                    <div class="flex border-b border-gray-700 px-6 pt-4 gap-6">
+                        <button class="kb-tab active px-4 py-2 text-sm font-bold rounded-t-lg transition-colors" data-tab="list">
+                            <i class="fas fa-list mr-2"></i>知识库列表
+                        </button>
+                        <button class="kb-tab px-4 py-2 text-sm font-bold rounded-t-lg transition-colors text-gray-400 hover:text-white" data-tab="detail">
+                            <i class="fas fa-info-circle mr-2"></i>知识库详情
+                        </button>
+                        <button class="kb-tab px-4 py-2 text-sm font-bold rounded-t-lg transition-colors text-gray-400 hover:text-white" data-tab="tasks">
+                            <i class="fas fa-tasks mr-2"></i>入库任务
+                        </button>
+                    </div>
+                </div>
+
+                <!-- Tab 1: knowledge base list -->
+                <div id="tab-list" class="flex-1 bg-gray-800 p-6 rounded-b-lg shadow-lg flex flex-col min-h-0 tab-content">
+                    <div class="flex justify-between items-center mb-4">
+                        <label class="text-gray-400 text-sm font-bold">知识库列表</label>
+                        <div class="flex gap-2 items-center">
+                            <input type="text" id="kb-search" class="bg-gray-700 text-white rounded px-3 py-2 text-sm focus:outline-none focus:ring-2 focus:ring-blue-500 w-48" placeholder="搜索知识库...">
+                            <button onclick="loadKnowledgeList()" class="bg-blue-600 hover:bg-blue-700 text-white px-4 py-2 rounded text-sm transition-colors">
+                                <i class="fas fa-sync-alt"></i> 刷新
+                            </button>
+                        </div>
+                    </div>
+                    <div class="flex-1 overflow-auto custom-scrollbar">
+                        <table class="w-full text-sm text-gray-300">
+                            <thead class="text-xs text-gray-400 uppercase bg-gray-700/50 sticky top-0">
+                                <tr>
+                                    <th class="px-4 py-3 text-left">ID</th>
+                                    <th class="px-4 py-3 text-left">名称</th>
+                                    <th class="px-4 py-3 text-center">文档数</th>
+                                    <th class="px-4 py-3 text-center">状态</th>
+                                    <th class="px-4 py-3 text-center">创建时间</th>
+                                    <th class="px-4 py-3 text-center">操作</th>
+                                </tr>
+                            </thead>
+                            <tbody id="kb-list-body">
+                                <tr><td colspan="6" class="text-center text-gray-500 py-10">加载中...</td></tr>
+                            </tbody>
+                        </table>
+                    </div>
+                    <div class="flex justify-between items-center mt-4 pt-4 border-t border-gray-700">
+                        <span id="kb-total" class="text-gray-500 text-sm"></span>
+                        <div class="flex gap-2">
+                            <button onclick="changePage(-1)" id="btn-prev" class="hidden bg-gray-700 hover:bg-gray-600 text-white px-3 py-1 rounded text-sm">上一页</button>
+                            <span id="kb-page-info" class="text-gray-400 text-sm px-3 py-1"></span>
+                            <button onclick="changePage(1)" id="btn-next" class="hidden bg-gray-700 hover:bg-gray-600 text-white px-3 py-1 rounded text-sm">下一页</button>
+                        </div>
+                    </div>
+                </div>
+
+                <!-- Tab 2: knowledge base details -->
+                <div id="tab-detail" class="flex-1 bg-gray-800 p-6 rounded-b-lg shadow-lg flex flex-col min-h-0 tab-content hidden">
+                    <div id="kb-detail-content">
+                        <div class="text-center text-gray-500 py-20">
+                            <i class="fas fa-info-circle text-4xl mb-4 opacity-50"></i>
+                            <p>请先从知识库列表中选择查看详情</p>
+                        </div>
+                    </div>
+                </div>
+
+                <!-- Tab 3: import tasks -->
+                <div id="tab-tasks" class="flex-1 bg-gray-800 p-6 rounded-b-lg shadow-lg flex flex-col min-h-0 tab-content hidden">
+                    <div class="flex justify-between items-center mb-4">
+                        <label class="text-gray-400 text-sm font-bold">批量入库任务</label>
+                        <button onclick="loadImportTasks()" class="bg-blue-600 hover:bg-blue-700 text-white px-4 py-2 rounded text-sm transition-colors">
+                            <i class="fas fa-sync-alt"></i> 刷新
+                        </button>
+                    </div>
+                    <div class="flex-1 overflow-auto custom-scrollbar">
+                        <table class="w-full text-sm text-gray-300">
+                            <thead class="text-xs text-gray-400 uppercase bg-gray-700/50 sticky top-0">
+                                <tr>
+                                    <th class="px-4 py-3 text-left">任务号</th>
+                                    <th class="px-4 py-3 text-left">知识库</th>
+                                    <th class="px-4 py-3 text-center">状态</th>
+                                    <th class="px-4 py-3 text-center">进度</th>
+                                    <th class="px-4 py-3 text-center">创建时间</th>
+                                    <th class="px-4 py-3 text-center">操作</th>
+                                </tr>
+                            </thead>
+                            <tbody id="task-list-body">
+                                <tr><td colspan="6" class="text-center text-gray-500 py-10">暂无任务</td></tr>
+                            </tbody>
+                        </table>
+                    </div>
+                </div>
+            </div>
+        </main>
+    </div>
+</div>
+
+<!-- Batch import modal -->
+<div id="import-modal" class="fixed inset-0 bg-black/70 z-50 hidden flex items-center justify-center">
+    <div class="bg-gray-800 rounded-lg shadow-2xl w-full max-w-2xl mx-4 max-h-[90vh] overflow-y-auto">
+        <div class="flex justify-between items-center p-6 border-b border-gray-700">
+            <h2 class="text-xl font-bold text-white"><i class="fas fa-upload mr-2 text-blue-500"></i>批量入库</h2>
+            <button onclick="closeImportModal()" class="text-gray-400 hover:text-white"><i class="fas fa-times text-xl"></i></button>
+        </div>
+        <div class="p-6 space-y-4">
+            <div>
+                <label class="block text-gray-400 text-sm mb-1 font-bold">知识库</label>
+                <input type="text" id="import-kb-name" class="w-full bg-gray-700 text-white rounded p-3 focus:outline-none focus:ring-2 focus:ring-blue-500" readonly>
+                <input type="hidden" id="import-kb-id">
+            </div>
+            <div>
+                <label class="block text-gray-400 text-sm mb-1 font-bold">任务号 (task_no)</label>
+                <input type="text" id="import-task-no" class="w-full bg-gray-700 text-white rounded p-3 focus:outline-none focus:ring-2 focus:ring-blue-500" placeholder="自动生成或手动输入">
+            </div>
+            <div>
+                <label class="block text-gray-400 text-sm mb-1 font-bold">Parents (JSON 数组)</label>
+                <textarea id="import-parents" class="w-full bg-gray-700 text-white rounded p-3 focus:outline-none focus:ring-2 focus:ring-blue-500 font-mono text-xs" rows="8" placeholder='[{"index": 0, "parent_id": "...", "text": "..."}]'></textarea>
+            </div>
+            <div>
+                <label class="block text-gray-400 text-sm mb-1 font-bold">Children (JSON 数组,可选)</label>
+                <textarea id="import-children" class="w-full bg-gray-700 text-white rounded p-3 focus:outline-none focus:ring-2 focus:ring-blue-500 font-mono text-xs" rows="6" placeholder='[{"index": 0, "parent_id": "...", "text": "..."}]'></textarea>
+            </div>
+        </div>
+        <div class="flex justify-end gap-3 p-6 border-t border-gray-700">
+            <button onclick="closeImportModal()" class="bg-gray-600 hover:bg-gray-500 text-white px-6 py-2 rounded transition-colors">取消</button>
+            <button onclick="submitImport()" class="bg-blue-600 hover:bg-blue-700 text-white px-6 py-2 rounded transition-colors"><i class="fas fa-paper-plane mr-1"></i>提交</button>
+        </div>
+    </div>
+</div>
+
+<style>
+.kb-tab.active {
+    background-color: #1F2937;
+    color: white;
+    border-bottom: 2px solid #3B82F6;
+}
+.status-pending { color: #F59E0B; }
+.status-processing { color: #3B82F6; }
+.status-success, .status-completed { color: #10B981; }
+.status-failed { color: #EF4444; }
+.custom-scrollbar::-webkit-scrollbar { width: 6px; }
+.custom-scrollbar::-webkit-scrollbar-track { background: #1F2937; }
+.custom-scrollbar::-webkit-scrollbar-thumb { background: #4B5563; border-radius: 3px; }
+</style>
+
+<script>
+let currentPage = 1;
+let currentKbId = '';
+
+$(document).ready(function() {
+    $('#open-sidebar').click(function() {
+        $('.sidebar').toggleClass('-translate-x-full');
+    });
+    loadKnowledgeList();
+    loadImportTasks();
+
+    // Tab switching
+    $('.kb-tab').click(function() {
+        $('.kb-tab').removeClass('active').addClass('text-gray-400');
+        $(this).addClass('active').removeClass('text-gray-400');
+        $('.tab-content').addClass('hidden');
+        $('#tab-' + $(this).data('tab')).removeClass('hidden');
+    });
+
+    // Search
+    $('#kb-search').on('keyup', function(e) {
+        if (e.key === 'Enter') {
+            currentPage = 1;
+            loadKnowledgeList();
+        }
+    });
+
+    // Auto-generate task_no
+    $('#import-task-no').on('focus', function() {
+        if (!$(this).val()) {
+            $(this).val('KIT-' + new Date().toISOString().replace(/[-:T.]/g, '').slice(0, 14) + '-' + Math.random().toString(36).slice(2, 6));
+        }
+    });
+});
+
+function loadKnowledgeList() {
+    const keyword = $('#kb-search').val().trim();
+    $.ajax({
+        url: '/api/v1/knowledge/bases',
+        method: 'GET',
+        data: { page: currentPage, page_size: 20 },
+        headers: { 'Authorization': 'Bearer ' + getJwtToken() },
+        success: function(resp) {
+            renderKnowledgeList(resp.data || resp);
+        },
+        error: function(xhr) {
+            $('#kb-list-body').html('<tr><td colspan="6" class="text-center text-red-400 py-10">加载失败: ' + (xhr.responseJSON?.message || '未知错误') + '</td></tr>');
+        }
+    });
+}
+
+function renderKnowledgeList(data) {
+    const items = data.items || data;
+    const total = data.total || items.length;
+    const tbody = $('#kb-list-body');
+    tbody.empty();
+
+    if (!items || items.length === 0) {
+        tbody.html('<tr><td colspan="6" class="text-center text-gray-500 py-10">暂无知识库数据</td></tr>');
+        $('#kb-total').text('');
+        return;
+    }
+
+    items.forEach(function(item) {
+        const statusHtml = item.status === 1
+            ? '<span class="text-green-400"><i class="fas fa-check-circle"></i> 启用</span>'
+            : '<span class="text-red-400"><i class="fas fa-times-circle"></i> 禁用</span>';
+        const html = `<tr class="border-b border-gray-700 hover:bg-gray-700/30 transition-colors">
+            <td class="px-4 py-3 text-gray-400 font-mono text-xs">${item.id}</td>
+            <td class="px-4 py-3 text-white font-medium">${item.name}</td>
+            <td class="px-4 py-3 text-center">${item.document_count || 0}</td>
+            <td class="px-4 py-3 text-center">${statusHtml}</td>
+            <td class="px-4 py-3 text-center text-gray-400 text-xs">${item.created_at || ''}</td>
+            <td class="px-4 py-3 text-center">
+                <button onclick="viewKbDetail('${item.id}', '${item.name.replace(/'/g, "\\'")}')" class="text-blue-400 hover:text-blue-300 text-xs mr-3"><i class="fas fa-eye"></i> 详情</button>
+                <button onclick="openImportModal('${item.id}', '${item.name.replace(/'/g, "\\'")}')" class="text-green-400 hover:text-green-300 text-xs"><i class="fas fa-upload"></i> 入库</button>
+            </td>
+        </tr>`;
+        tbody.append(html);
+    });
+
+    const pageSize = data.page_size || 20;
+    const page = data.page || 1;
+    const totalPages = Math.ceil(total / pageSize);
+    $('#kb-total').text('共 ' + total + ' 条');
+    $('#kb-page-info').text('第 ' + page + ' / ' + totalPages + ' 页');
+    $('#btn-prev').toggleClass('hidden', page <= 1);
+    $('#btn-next').toggleClass('hidden', page >= totalPages);
+}
+
+function changePage(delta) {
+    currentPage += delta;
+    if (currentPage < 1) currentPage = 1;
+    loadKnowledgeList();
+}
+
+function viewKbDetail(kbId, kbName) {
+    // Switch to detail tab
+    $('.kb-tab').removeClass('active').addClass('text-gray-400');
+    $('.kb-tab[data-tab="detail"]').addClass('active').removeClass('text-gray-400');
+    $('.tab-content').addClass('hidden');
+    $('#tab-detail').removeClass('hidden');
+
+    $('#kb-detail-content').html('<div class="text-center text-gray-400 py-20"><i class="fas fa-circle-notch fa-spin text-3xl text-blue-500 mb-4"></i><p>加载中...</p></div>');
+
+    $.ajax({
+        url: '/api/v1/knowledge/bases/' + kbId,
+        method: 'GET',
+        headers: { 'Authorization': 'Bearer ' + getJwtToken() },
+        success: function(resp) {
+            renderKbDetail(resp.data || resp);
+        },
+        error: function(xhr) {
+            $('#kb-detail-content').html('<div class="text-center text-red-400 py-20">加载失败: ' + (xhr.responseJSON?.message || '未知错误') + '</div>');
+        }
+    });
+}
+
+function renderKbDetail(data) {
+    const schemaHtml = data.metadata_schema && data.metadata_schema.length > 0
+        ? `<div class="mt-6"><h3 class="text-white font-bold mb-3"><i class="fas fa-table mr-2 text-blue-500"></i>元数据定义</h3>
+           <table class="w-full text-sm text-gray-300">
+               <thead class="text-xs text-gray-400 uppercase bg-gray-700/50">
+                   <tr><th class="px-4 py-2 text-left">中文名</th><th class="px-4 py-2 text-left">英文名</th><th class="px-4 py-2 text-center">类型</th><th class="px-4 py-2 text-left">描述</th></tr>
+               </thead>
+               <tbody>${data.metadata_schema.map(function(f) {
+                   return '<tr class="border-b border-gray-700"><td class="px-4 py-2">' + (f.field_name_cn||'') + '</td><td class="px-4 py-2 font-mono text-xs">' + (f.field_name_en||'') + '</td><td class="px-4 py-2 text-center"><span class="bg-gray-600 px-2 py-0.5 rounded text-xs">' + (f.field_type||'') + '</span></td><td class="px-4 py-2 text-gray-400">' + (f.description||'') + '</td></tr>';
+               }).join('')}</tbody>
+           </table></div>`
+        : '';
+
+    $('#kb-detail-content').html(`
+        <div class="bg-gray-700/50 rounded-lg p-6">
+            <h2 class="text-xl font-bold text-white mb-4"><i class="fas fa-book mr-2 text-blue-500"></i>${data.name}</h2>
+            <div class="grid grid-cols-1 md:grid-cols-2 gap-4 text-sm">
+                <div><span class="text-gray-400">ID:</span> <span class="text-white font-mono">${data.id}</span></div>
+                <div><span class="text-gray-400">文档数:</span> <span class="text-white">${data.document_count || 0}</span></div>
+                <div><span class="text-gray-400">父表:</span> <span class="text-white font-mono text-xs">${data.parent_table || '-'}</span></div>
+                <div><span class="text-gray-400">子表:</span> <span class="text-white font-mono text-xs">${data.child_table || '-'}</span></div>
+                <div><span class="text-gray-400">创建人:</span> <span class="text-white">${data.created_by || '-'}</span></div>
+                <div><span class="text-gray-400">创建时间:</span> <span class="text-white">${data.created_at || '-'}</span></div>
+            </div>
+            ${data.description ? '<p class="mt-4 text-gray-300"><span class="text-gray-400">描述:</span> ' + data.description + '</p>' : ''}
+            ${schemaHtml}
+            <div class="mt-6 flex gap-3">
+                <button onclick="openImportModal('${data.id}', '${data.name.replace(/'/g, "\\'")}')" class="bg-blue-600 hover:bg-blue-700 text-white px-6 py-2 rounded transition-colors"><i class="fas fa-upload mr-1"></i>批量入库</button>
+                <button onclick="switchTab('list')" class="bg-gray-600 hover:bg-gray-500 text-white px-6 py-2 rounded transition-colors"><i class="fas fa-arrow-left mr-1"></i>返回列表</button>
+            </div>
+        </div>
+    `);
+}
+
+function switchTab(name) {
+    $('.kb-tab').removeClass('active').addClass('text-gray-400');
+    $('.kb-tab[data-tab="' + name + '"]').addClass('active').removeClass('text-gray-400');
+    $('.tab-content').addClass('hidden');
+    $('#tab-' + name).removeClass('hidden');
+}
+
+function loadImportTasks() {
+    $.ajax({
+        url: '/api/v1/knowledge/import-tasks',
+        method: 'GET',
+        data: { page: 1, page_size: 50 },
+        headers: { 'Authorization': 'Bearer ' + getJwtToken() },
+        success: function(resp) {
+            renderImportTasks(resp.data || resp);
+        },
+        error: function(xhr) {
+            $('#task-list-body').html('<tr><td colspan="6" class="text-center text-red-400 py-10">加载失败</td></tr>');
+        }
+    });
+}
+
+function renderImportTasks(data) {
+    const items = data.items || [];
+    const tbody = $('#task-list-body');
+    tbody.empty();
+
+    if (items.length === 0) {
+        tbody.html('<tr><td colspan="6" class="text-center text-gray-500 py-10">暂无入库任务</td></tr>');
+        return;
+    }
+
+    const statusMap = {
+        'pending': '<span class="status-pending"><i class="fas fa-clock"></i> 等待中</span>',
+        'processing': '<span class="status-processing"><i class="fas fa-spinner fa-spin"></i> 处理中</span>',
+        'success': '<span class="status-success"><i class="fas fa-check-circle"></i> 已完成</span>',
+        'completed': '<span class="status-success"><i class="fas fa-check-circle"></i> 已完成</span>',
+        'failed': '<span class="status-failed"><i class="fas fa-times-circle"></i> 失败</span>',
+    };
+
+    items.forEach(function(item) {
+        const progressHtml = item.progress
+            ? item.progress.processed + '/' + item.progress.total
+            : '-';
+        const html = `<tr class="border-b border-gray-700 hover:bg-gray-700/30 transition-colors">
+            <td class="px-4 py-3 font-mono text-xs text-gray-300">${item.task_no}</td>
+            <td class="px-4 py-3 text-white">${item.kb_name || item.kb_id}</td>
+            <td class="px-4 py-3 text-center">${statusMap[item.status] || item.status}</td>
+            <td class="px-4 py-3 text-center text-sm">${progressHtml}</td>
+            <td class="px-4 py-3 text-center text-gray-400 text-xs">${item.created_at || ''}</td>
+            <td class="px-4 py-3 text-center">
+                ${item.status === 'processing' || item.status === 'pending'
+                    ? '<button onclick="refreshTaskStatus(\'' + item.task_no + '\')" class="text-blue-400 hover:text-blue-300 text-xs"><i class="fas fa-sync-alt"></i> 刷新</button>'
+                    : (item.error_message ? '<span class="text-red-400 text-xs" title="' + String(item.error_message).replace(/&/g, '&amp;').replace(/"/g, '&quot;').replace(/</g, '&lt;') + '"><i class="fas fa-exclamation-triangle"></i></span>' : '')}
+            </td>
+        </tr>`;
+        tbody.append(html);
+    });
+}
+
+function refreshTaskStatus(taskNo) {
+    $.ajax({
+        url: '/api/v1/knowledge/import-tasks/' + taskNo,
+        method: 'GET',
+        headers: { 'Authorization': 'Bearer ' + getJwtToken() },
+        success: function() {
+            loadImportTasks();
+        }
+    });
+}
+
+function openImportModal(kbId, kbName) {
+    $('#import-kb-id').val(kbId);
+    $('#import-kb-name').val(kbName);
+    $('#import-task-no').val('');
+    $('#import-parents').val('');
+    $('#import-children').val('');
+    $('#import-modal').removeClass('hidden');
+}
+
+function closeImportModal() {
+    $('#import-modal').addClass('hidden');
+}
+
+function submitImport() {
+    const kbId = $('#import-kb-id').val();
+    const taskNo = $('#import-task-no').val().trim();
+    let parents, children;
+
+    if (!taskNo) { alert('请输入任务号'); return; }
+
+    try {
+        parents = JSON.parse($('#import-parents').val());
+    } catch(e) { alert('Parents JSON 格式错误: ' + e.message); return; }
+
+    const childrenVal = $('#import-children').val().trim();
+    try { children = childrenVal ? JSON.parse(childrenVal) : null; } catch(e) { alert('Children JSON 格式错误: ' + e.message); return; }
+
+    $.ajax({
+        url: '/api/v1/knowledge/bases/' + kbId + '/batch-import',
+        method: 'POST',
+        contentType: 'application/json',
+        headers: { 'Authorization': 'Bearer ' + getJwtToken() },
+        data: JSON.stringify({
+            task_no: taskNo,
+            kb_name: $('#import-kb-name').val(),
+            parents: parents,
+            children: children,
+        }),
+        success: function(resp) {
+            alert('入库任务已提交!任务号: ' + taskNo);
+            closeImportModal();
+            switchTab('tasks');
+            loadImportTasks();
+        },
+        error: function(xhr) {
+            alert('提交失败: ' + (xhr.responseJSON?.message || '未知错误'));
+        }
+    });
+}
+
+function getJwtToken() {
+    // After SSO login the token is stored in localStorage under the key 'token'; 'jwt_token' is read as a fallback
+    return localStorage.getItem('token') || localStorage.getItem('jwt_token') || '';
+}
+
+// Auto-refresh the task status (every 10 seconds)
+setInterval(function() {
+    if (!$('#tab-tasks').hasClass('hidden')) {
+        loadImportTasks();
+    }
+}, 10000);
+</script>
+{% endblock %}

+ 38 - 18
app/templates/login.html

@@ -3,25 +3,45 @@
 {% block content %}
 <div class="flex flex-1 justify-center items-center p-4" id="login-view">
     <div class="tech-panel p-10 rounded-xl w-full max-w-sm shadow-2xl border-blue-500/50">
-        <h2 class="text-3xl font-extrabold text-center mb-10 text-cyan-300">
+        <h2 class="text-3xl font-extrabold text-center mb-2 text-cyan-300">
             系统登录
         </h2>
-        <form class="space-y-6" action="{{ url_for('main.login') }}" method="POST">
-            <div class="relative">
-                <i class="fas fa-user absolute left-4 top-1/2 transform -translate-y-1/2 text-blue-400"></i>
-                <input class="w-full p-3.5 pl-11 bg-gray-800 border border-blue-700 rounded-lg focus:ring-2 focus:ring-cyan-400 focus:border-cyan-400 text-white placeholder-gray-500 transition duration-200 shadow-inner" id="username" name="username" placeholder="用户名" required="" type="text">
-            </div>
-            <div class="relative">
-                <i class="fas fa-lock absolute left-4 top-1/2 transform -translate-y-1/2 text-blue-400"></i>
-                <input class="w-full p-3.5 pl-11 bg-gray-800 border border-blue-700 rounded-lg focus:ring-2 focus:ring-cyan-400 focus:border-cyan-400 text-white placeholder-gray-500 transition duration-200 shadow-inner" id="password" name="password" placeholder="密码" required="" type="password">
-            </div>
-            <button class="tech-button w-full py-3 mt-8 bg-cyan-600 hover:bg-cyan-500 text-white font-bold rounded-lg shadow-lg transition duration-300 transform hover:scale-[1.01] border-cyan-400" type="submit">
-                <i class="fas fa-sign-in-alt mr-2"></i> 进入系统
-            </button>
-            {% if error %}
-            <p class="mt-4 text-center text-sm text-red-400">{{ error }}</p>
-            {% endif %}
-        </form>
+        <p class="text-center text-gray-500 text-sm mb-10">{{ app_name }}</p>
+
+        <!-- SSO login (primary entry point) -->
+        <a href="{{ url_for('sso.sso_authorize') }}?redirect=true"
+           class="block w-full py-3.5 mb-6 bg-cyan-600 hover:bg-cyan-500 text-white font-bold rounded-lg shadow-lg transition duration-300 transform hover:scale-[1.01] border-cyan-400 text-center">
+            <i class="fas fa-right-to-bracket mr-2"></i> SSO 统一认证登录
+        </a>
+
+        <div class="flex items-center gap-3 mb-6">
+            <div class="flex-1 border-t border-blue-900/50"></div>
+            <span class="text-xs text-gray-500">或</span>
+            <div class="flex-1 border-t border-blue-900/50"></div>
+        </div>
+
+        <!-- Local password login (fallback) -->
+        <details class="group">
+            <summary class="cursor-pointer text-sm text-gray-400 hover:text-cyan-300 transition text-center mb-4 list-none">
+                <i class="fas fa-key mr-1"></i> 使用本地账号密码登录
+            </summary>
+            <form class="space-y-6 mt-4" action="{{ url_for('main.login') }}" method="POST">
+                <div class="relative">
+                    <i class="fas fa-user absolute left-4 top-1/2 transform -translate-y-1/2 text-blue-400"></i>
+                    <input class="w-full p-3.5 pl-11 bg-gray-800 border border-blue-700 rounded-lg focus:ring-2 focus:ring-cyan-400 focus:border-cyan-400 text-white placeholder-gray-500 transition duration-200 shadow-inner" id="username" name="username" placeholder="用户名" required="" type="text">
+                </div>
+                <div class="relative">
+                    <i class="fas fa-lock absolute left-4 top-1/2 transform -translate-y-1/2 text-blue-400"></i>
+                    <input class="w-full p-3.5 pl-11 bg-gray-800 border border-blue-700 rounded-lg focus:ring-2 focus:ring-cyan-400 focus:border-cyan-400 text-white placeholder-gray-500 transition duration-200 shadow-inner" id="password" name="password" placeholder="密码" required="" type="password">
+                </div>
+                <button class="tech-button w-full py-3 bg-gray-700 hover:bg-gray-600 text-white font-bold rounded-lg shadow-lg transition duration-300 border-gray-500" type="submit">
+                    <i class="fas fa-sign-in-alt mr-2"></i> 本地登录
+                </button>
+                {% if error %}
+                <p class="mt-4 text-center text-sm text-red-400">{{ error }}</p>
+                {% endif %}
+            </form>
+        </details>
     </div>
 </div>
-{% endblock %}
+{% endblock %}

+ 7 - 1
app/templates/partials/sidebar.html

@@ -2,7 +2,7 @@
         <div class="p-6 flex items-center justify-between transition-all duration-300" id="sidebar-header">
             <span class="text-xl font-bold tech-title tracking-wider flex items-center gap-2 whitespace-nowrap overflow-hidden" id="sidebar-brand">
                 <i class="fas fa-globe-asia text-blue-500 min-w-[1.5rem] text-center"></i> 
-                <span class="sidebar-text transition-opacity duration-300">政企瞭望</span>
+                <span class="sidebar-text transition-opacity duration-300">{{ app_name }}</span>
             </span>
             <button class="md:hidden text-gray-300 hover:text-white focus:outline-none" id="close-sidebar">
                 <i class="fas fa-times"></i>
@@ -61,6 +61,12 @@
                 <span class="ml-2 font-medium sidebar-text whitespace-nowrap">AI数智大屏</span>
             </a>
         </li>
+                <li>
+                    <a class="flex items-center px-4 py-3 rounded-lg {% if request.endpoint == 'knowledge.knowledge_page' %}bg-gradient-to-r from-blue-600 to-blue-800 text-white shadow-lg{% else %}text-gray-400 hover:bg-gray-800 hover:text-white{% endif %} transition-all duration-200 group" href="{{ url_for('knowledge.knowledge_page') }}">
+                        <i class="fas fa-book w-6 text-center {% if request.endpoint == 'knowledge.knowledge_page' %}text-white{% else %}text-gray-500 group-hover:text-blue-400{% endif %} transition-colors"></i>
+                        <span class="ml-2 font-medium sidebar-text whitespace-nowrap">样本中心</span>
+                    </a>
+                </li>
             </ul>
         </nav>
         <div class="p-4 mx-3 mb-4 mt-auto rounded-xl bg-gray-800/50 backdrop-blur-sm transition-all duration-300">

+ 3 - 3
app/templates/screen.html

@@ -161,7 +161,7 @@
     <!-- Main Content -->
     <div class="flex-1 flex flex-col min-w-0 overflow-hidden">
         <div class="screen-header">
-            <h1>数智舆情大数据可视化平台</h1>
+            <h1>路桥采集数据可视化平台</h1>
             <div class="screen-time" id="clock"></div>
         </div>
 
@@ -173,7 +173,7 @@
                     <div id="chart-source" class="screen-chart"></div>
                 </div>
                 <div class="screen-panel">
-                    <div class="screen-panel-title">舆情趋势分析 (近7天)</div>
+                    <div class="screen-panel-title">采集趋势分析 (近7天)</div>
                     <div id="chart-trend" class="screen-chart"></div>
                 </div>
                 <div class="screen-panel">
@@ -212,7 +212,7 @@
                     <div id="chart-tasks" class="screen-chart"></div>
                 </div>
                 <div class="screen-panel" style="flex: 2;">
-                    <div class="screen-panel-title">实时舆情动态</div>
+                    <div class="screen-panel-title">实时采集动态</div>
                     <div class="screen-news" id="news-list"></div>
                 </div>
             </div>

+ 40 - 0
docker-compose.dev.yml

@@ -0,0 +1,40 @@
+version: '3.8'
+
+services:
+  web:
+    environment:
+      - FLASK_DEBUG=true
+      - DB_HOST=db
+      - DB_USER=${DB_USER:-liaowang}
+      - DB_PASSWORD=${DB_PASSWORD:-liaowang_secret}
+      - DB_NAME=${DB_NAME:-liaowang_db}
+    volumes:
+      # Development mode: mount the source code for hot reload
+      - .:/app
+      - /app/venv
+    depends_on:
+      db:
+        condition: service_healthy
+    # The health check is not needed in development mode
+    healthcheck:
+      disable: true
+
+  db:
+    image: postgres:16-alpine
+    container_name: liaowang-db
+    environment:
+      POSTGRES_DB: ${DB_NAME:-liaowang_db}
+      POSTGRES_USER: ${DB_USER:-liaowang}
+      POSTGRES_PASSWORD: ${DB_PASSWORD:-liaowang_secret}
+    ports:
+      - "5432:5432"
+    volumes:
+      - postgres_data:/var/lib/postgresql/data
+    healthcheck:
+      test: ["CMD-SHELL", "pg_isready -U ${DB_USER:-liaowang}"]
+      interval: 10s
+      timeout: 5s
+      retries: 5
+
+volumes:
+  postgres_data:

+ 7 - 32
docker-compose.yml

@@ -7,11 +7,11 @@ services:
     ports:
       - "5000:5000"
     environment:
-      - DB_HOST=${DB_HOST:-db}
+      - DB_HOST=${DB_HOST}
       - DB_PORT=${DB_PORT:-5432}
-      - DB_USER=${DB_USER:-liaowang}
-      - DB_PASSWORD=${DB_PASSWORD:-liaowang_secret}
-      - DB_NAME=${DB_NAME:-liaowang_db}
+      - DB_USER=${DB_USER}
+      - DB_PASSWORD=${DB_PASSWORD}
+      - DB_NAME=${DB_NAME}
       - SECRET_KEY=${SECRET_KEY:-change-me-to-a-random-secret}
       - JWT_SECRET_KEY=${JWT_SECRET_KEY:-jwt-secret-change-me-to-random-string}
       - SSO_BASE_URL=${SSO_BASE_URL:-http://192.168.92.61:8200}
@@ -21,15 +21,11 @@ services:
       - SSO_FRONTEND_URL=${SSO_FRONTEND_URL:-http://localhost:5000}
       - SSO_SCOPE=${SSO_SCOPE:-email}
       - SSO_LOGOUT_REDIRECT_URL=${SSO_LOGOUT_REDIRECT_URL:-http://192.168.92.61:9200/login}
+      - SAMPLE_CENTER_BASE_URL=${SAMPLE_CENTER_BASE_URL:-http://192.168.92.61}
+      - SAMPLE_CENTER_APP_ID=${SAMPLE_CENTER_APP_ID}
+      - SAMPLE_CENTER_APP_SECRET=${SAMPLE_CENTER_APP_SECRET}
     env_file:
       - .env
-    depends_on:
-      db:
-        condition: service_healthy
-    volumes:
-      # 开发模式:挂载代码实现热重载
-      - .:/app
-      - /app/venv  # 排除虚拟环境目录
     restart: unless-stopped
     # 健康检查
     healthcheck:
@@ -38,24 +34,3 @@ services:
       timeout: 10s
       retries: 3
       start_period: 15s
-
-  db:
-    image: postgres:16-alpine
-    container_name: liaowang-db
-    environment:
-      POSTGRES_DB: ${DB_NAME:-liaowang_db}
-      POSTGRES_USER: ${DB_USER:-liaowang}
-      POSTGRES_PASSWORD: ${DB_PASSWORD:-liaowang_secret}
-    ports:
-      - "5432:5432"
-    volumes:
-      - postgres_data:/var/lib/postgresql/data
-    restart: unless-stopped
-    healthcheck:
-      test: ["CMD-SHELL", "pg_isready -U ${DB_USER:-liaowang}"]
-      interval: 10s
-      timeout: 5s
-      retries: 5
-
-volumes:
-  postgres_data:
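
With the defaults removed from `DB_HOST`, `DB_USER`, `DB_PASSWORD`, and `DB_NAME`, Compose interpolation yields an empty string for any of them that is not defined. Below is a minimal sketch of a startup check that surfaces this early. The function name `missing_required_env` and the list `REQUIRED_VARS` are illustrative assumptions and are not part of this commit.

```python
from typing import List, Mapping

# Variables that docker-compose.yml now passes through without a default.
REQUIRED_VARS = (
    "DB_HOST",
    "DB_USER",
    "DB_PASSWORD",
    "DB_NAME",
    "SAMPLE_CENTER_APP_ID",
    "SAMPLE_CENTER_APP_SECRET",
)


def missing_required_env(env: Mapping[str, str]) -> List[str]:
    """Return the required variable names that are unset or blank in env."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


# Example with a hand-written mapping; a real caller would pass os.environ.
sample = {
    "DB_HOST": "db.example.internal",
    "DB_USER": "app",
    "DB_PASSWORD": "",
    "DB_NAME": "app",
}
print(missing_required_env(sample))
```

A caller could run this before `create_app()` and exit with a clear message when the returned list is non-empty, rather than failing later on a database connection error.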

+ 8 - 0
entrypoint.sh

@@ -0,0 +1,8 @@
+#!/bin/bash
+set -e
+
+echo "Running database migration..."
+uv run python migrate_db.py || echo "Warning: migration failed, continuing anyway"
+
+echo "Starting Flask application..."
+exec uv run python run.py

+ 48 - 0
init_db.py

@@ -0,0 +1,48 @@
+"""
+Initialize the database: create all tables, then insert the default admin account and the default spider source.
+"""
+import os
+import sys
+from dotenv import load_dotenv
+
+# Make sure .env is loaded before create_app runs
+load_dotenv(os.path.join(os.path.dirname(__file__), '.env'))
+
+from app import create_app, db
+from app.models import User, SpiderSource
+
+def init_db():
+    app = create_app()
+    with app.app_context():
+        # Create all tables
+        db.create_all()
+        print("Tables created successfully.")
+
+        # Insert the default admin user
+        if not User.query.filter_by(username='admin').first():
+            user = User(username='admin')
+            user.set_password('admin')
+            db.session.add(user)
+            print("Default admin user created: admin / admin")
+        else:
+            print("Admin user already exists.")
+
+        # Insert the default spider source
+        if not SpiderSource.query.filter_by(code_identifier='baidusearch').first():
+            source = SpiderSource(
+                name='百度搜索',
+                code_identifier='baidusearch',
+                description='百度搜索引擎爬虫',
+                type='script',
+                status='active'
+            )
+            db.session.add(source)
+            print("Default spider source '百度搜索' created.")
+        else:
+            print("Spider source '百度搜索' already exists.")
+
+        db.session.commit()
+        print("Database initialization complete.")
+
+if __name__ == '__main__':
+    init_db()
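
`init_db.py` creates the default administrator with the fixed password `admin`. Below is a minimal sketch of an alternative that reads the initial password from the environment and otherwise generates a random one. The variable name `ADMIN_INITIAL_PASSWORD` and the helper `resolve_admin_password` are hypothetical and do not appear in this commit.

```python
import secrets
from typing import Mapping, Tuple


def resolve_admin_password(env: Mapping[str, str]) -> Tuple[str, bool]:
    """Return (password, generated).

    Uses ADMIN_INITIAL_PASSWORD (a hypothetical variable name) when it is set
    to a non-blank value; otherwise generates a random URL-safe password.
    """
    configured = env.get("ADMIN_INITIAL_PASSWORD", "").strip()
    if configured:
        return configured, False
    return secrets.token_urlsafe(16), True


# Example with a hand-written mapping; a real caller would pass os.environ.
password, generated = resolve_admin_password({"ADMIN_INITIAL_PASSWORD": "s3cret-from-env"})
print(password, generated)
```

Inside `init_db()` the returned value would be passed to `user.set_password(...)`, and the script would print the password only when `generated` is true, so that an operator can record it once.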

+ 134 - 0
migrate.sql

@@ -0,0 +1,134 @@
+-- ============================================
+-- AI LiaoWang Web App database migration script
+-- Database: PostgreSQL
+-- Generated: 2026-05-18
+-- Notes: contains the full CREATE TABLE statements for all tables, including the new SSO fields
+-- ============================================
+
+-- 1. User table (User), including the new SSO fields
+CREATE TABLE IF NOT EXISTS "user" (
+    id SERIAL PRIMARY KEY,
+    username VARCHAR(64) NOT NULL,
+    password_hash VARCHAR(256),
+    sso_sub VARCHAR(256) UNIQUE,
+    real_name VARCHAR(100),
+    roles TEXT,
+    email VARCHAR(120),
+    phone VARCHAR(30),
+    avatar_url VARCHAR(500)
+);
+
+CREATE INDEX IF NOT EXISTS ix_user_username ON "user" (username);
+
+-- Insert the default admin account (password: admin)
+INSERT INTO "user" (username, password_hash, sso_sub, real_name, roles)
+SELECT 'admin',
+       'scrypt:32768:8:1$A9gQwDOkl3PLfi5f$fd967bd171fe965bbb1530b036a5003d83ee61b6463dc273479538584ce39c2e778be3e2e8d17b26b8890df31724eda49edb5cd91096ee3a7093c64a135b519e',
+       NULL, '管理员', '["super_admin"]'
+WHERE NOT EXISTS (SELECT 1 FROM "user" WHERE username = 'admin');
+
+-- 2. Spider source table (SpiderSource)
+CREATE TABLE IF NOT EXISTS spider_source (
+    id SERIAL PRIMARY KEY,
+    name VARCHAR(100) NOT NULL UNIQUE,
+    code_identifier VARCHAR(100) NOT NULL UNIQUE,
+    description TEXT,
+    status VARCHAR(20) DEFAULT 'active',
+    created_at TIMESTAMP DEFAULT NOW(),
+    type VARCHAR(20) DEFAULT 'script',
+    url VARCHAR(500),
+    method VARCHAR(10) DEFAULT 'GET',
+    headers TEXT,
+    params TEXT,
+    search_param_key VARCHAR(50) DEFAULT 'q',
+    selectors TEXT,
+    has_pagination BOOLEAN DEFAULT FALSE,
+    pagination_param VARCHAR(50),
+    pagination_step INTEGER DEFAULT 10,
+    pagination_start INTEGER DEFAULT 0
+);
+
+-- 3. Collection task table (SpiderTask)
+CREATE TABLE IF NOT EXISTS collection_task (
+    id SERIAL PRIMARY KEY,
+    keyword VARCHAR(100) NOT NULL,
+    spider_source_id INTEGER NOT NULL REFERENCES spider_source(id),
+    pages INTEGER DEFAULT 1,
+    status VARCHAR(20) DEFAULT 'pending',
+    created_at TIMESTAMP DEFAULT NOW(),
+    finished_at TIMESTAMP
+);
+
+-- 4. Collection result table (SpiderResult)
+CREATE TABLE IF NOT EXISTS spider_result (
+    id SERIAL PRIMARY KEY,
+    task_id INTEGER REFERENCES collection_task(id),
+    title VARCHAR(500),
+    abstract TEXT,
+    source VARCHAR(200),
+    cover VARCHAR(500),
+    link VARCHAR(500),
+    published_at VARCHAR(50),
+    has_deep_collection BOOLEAN DEFAULT FALSE,
+    created_at TIMESTAMP DEFAULT NOW()
+);
+
+-- 5. Deep collection table (DeepCollection)
+CREATE TABLE IF NOT EXISTS deep_collection (
+    id SERIAL PRIMARY KEY,
+    title VARCHAR(500),
+    url VARCHAR(500) NOT NULL UNIQUE,
+    content TEXT,
+    summary TEXT,
+    status VARCHAR(20) DEFAULT 'pending',
+    progress INTEGER DEFAULT 0,
+    progress_msg VARCHAR(200) DEFAULT '',
+    error_msg TEXT,
+    created_at TIMESTAMP DEFAULT NOW(),
+    updated_at TIMESTAMP DEFAULT NOW()
+);
+
+-- 6. AI model table (AIModel)
+CREATE TABLE IF NOT EXISTS ai_model (
+    id SERIAL PRIMARY KEY,
+    name VARCHAR(100) NOT NULL,
+    provider VARCHAR(50) DEFAULT 'openai',
+    api_base VARCHAR(500) NOT NULL,
+    api_key VARCHAR(500) NOT NULL,
+    model_name VARCHAR(200) NOT NULL,
+    system_prompt TEXT,
+    is_active BOOLEAN DEFAULT TRUE,
+    total_tokens BIGINT DEFAULT 0,
+    created_at TIMESTAMP DEFAULT NOW(),
+    updated_at TIMESTAMP DEFAULT NOW()
+);
+
+-- 7. Token usage log table (TokenUsageLog)
+CREATE TABLE IF NOT EXISTS token_usage_log (
+    id SERIAL PRIMARY KEY,
+    model_id INTEGER REFERENCES ai_model(id),
+    prompt_tokens INTEGER DEFAULT 0,
+    completion_tokens INTEGER DEFAULT 0,
+    total_tokens INTEGER DEFAULT 0,
+    request_type VARCHAR(50) DEFAULT 'test',
+    created_at TIMESTAMP DEFAULT NOW()
+);
+
+-- 8. AI conversation table (AIConversation)
+CREATE TABLE IF NOT EXISTS ai_conversation (
+    id SERIAL PRIMARY KEY,
+    title VARCHAR(200) DEFAULT 'New Chat',
+    user_id INTEGER REFERENCES "user"(id),
+    created_at TIMESTAMP DEFAULT NOW(),
+    updated_at TIMESTAMP DEFAULT NOW()
+);
+
+-- 9. AI message table (AIMessage)
+CREATE TABLE IF NOT EXISTS ai_message (
+    id SERIAL PRIMARY KEY,
+    conversation_id INTEGER NOT NULL REFERENCES ai_conversation(id) ON DELETE CASCADE,
+    role VARCHAR(20) NOT NULL,
+    content TEXT,
+    meta_data TEXT,
+    created_at TIMESTAMP DEFAULT NOW()
+);

+ 54 - 0
migrate_db.py

@@ -0,0 +1,54 @@
+"""
+Database migration: add the new SSO columns to the existing user table.
+"""
+import os
+from dotenv import load_dotenv
+
+load_dotenv(os.path.join(os.path.dirname(__file__), '.env'))
+
+from app import create_app, db
+
+def migrate():
+    app = create_app()
+    with app.app_context():
+        conn = db.engine.connect()
+
+        # Check for sso_sub and add it if missing
+        result = conn.execute(
+            db.text(
+                "SELECT column_name FROM information_schema.columns "
+                "WHERE table_name='user' AND column_name='sso_sub'"
+            )
+        )
+        if not result.fetchone():
+            conn.execute(db.text('ALTER TABLE "user" ADD COLUMN sso_sub VARCHAR(256)'))
+            conn.execute(db.text('ALTER TABLE "user" ADD CONSTRAINT user_sso_sub_key UNIQUE (sso_sub)'))
+            print("Added column sso_sub + unique constraint")
+        else:
+            print("sso_sub already exists")
+
+        cols = [
+            ("real_name", "VARCHAR(100)"),
+            ("roles", "TEXT"),
+            ("email", "VARCHAR(120)"),
+            ("phone", "VARCHAR(30)"),
+            ("avatar_url", "VARCHAR(500)"),
+        ]
+        for col, col_type in cols:
+            result = conn.execute(
+                db.text(
+                    f"SELECT column_name FROM information_schema.columns "
+                    f"WHERE table_name='user' AND column_name='{col}'"
+                )
+            )
+            if not result.fetchone():
+                conn.execute(db.text(f'ALTER TABLE "user" ADD COLUMN {col} {col_type}'))
+                print(f"Added column {col}")
+            else:
+                print(f"{col} already exists")
+
+        conn.commit()
+        print("Migration complete.")
+
+if __name__ == "__main__":
+    migrate()
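
The migration above queries `information_schema.columns` once per column and builds each `ALTER TABLE` statement with an f-string. Below is a minimal sketch of the same idempotent idea, split into a pure planning step that can be checked without a database. The helper `plan_add_columns` and its identifier allow-lists are illustrative assumptions; the caller would still fetch the existing column names and execute the returned statements.

```python
import re
from typing import Iterable, List, Sequence, Tuple

# Conservative allow-lists: identifiers and types cannot be bound as SQL
# parameters, so they are validated before being interpolated into the text.
_IDENTIFIER = re.compile(r"[a-z_][a-z0-9_]*")
_COLUMN_TYPE = re.compile(r"[A-Z]+(\(\d+\))?")


def plan_add_columns(
    table: str,
    existing: Iterable[str],
    wanted: Sequence[Tuple[str, str]],
) -> List[str]:
    """Return ALTER TABLE statements for the wanted columns not yet present."""
    if not _IDENTIFIER.fullmatch(table):
        raise ValueError(f"unsafe table name: {table!r}")
    present = set(existing)
    statements = []
    for column, column_type in wanted:
        if not _IDENTIFIER.fullmatch(column) or not _COLUMN_TYPE.fullmatch(column_type):
            raise ValueError(f"unsafe column definition: {column!r} {column_type!r}")
        if column not in present:
            statements.append(f'ALTER TABLE "{table}" ADD COLUMN {column} {column_type}')
    return statements


# Example: "roles" already exists, so only the other two columns are planned.
wanted = [("real_name", "VARCHAR(100)"), ("roles", "TEXT"), ("email", "VARCHAR(120)")]
for statement in plan_add_columns("user", ["id", "username", "roles"], wanted):
    print(statement)
```

Keeping the planning separate from execution means a rerun against an already migrated table yields an empty list, which matches the "already exists" branches in the script.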

+ 1 - 0
migrations/README

@@ -0,0 +1 @@
+Single-database configuration for Flask.

+ 50 - 0
migrations/alembic.ini

@@ -0,0 +1,50 @@
+# A generic, single database configuration.
+
+[alembic]
+# template used to generate migration files
+# file_template = %%(rev)s_%%(slug)s
+
+# set to 'true' to run the environment during
+# the 'revision' command, regardless of autogenerate
+# revision_environment = false
+
+
+# Logging configuration
+[loggers]
+keys = root,sqlalchemy,alembic,flask_migrate
+
+[handlers]
+keys = console
+
+[formatters]
+keys = generic
+
+[logger_root]
+level = WARN
+handlers = console
+qualname =
+
+[logger_sqlalchemy]
+level = WARN
+handlers =
+qualname = sqlalchemy.engine
+
+[logger_alembic]
+level = INFO
+handlers =
+qualname = alembic
+
+[logger_flask_migrate]
+level = INFO
+handlers =
+qualname = flask_migrate
+
+[handler_console]
+class = StreamHandler
+args = (sys.stderr,)
+level = NOTSET
+formatter = generic
+
+[formatter_generic]
+format = %(levelname)-5.5s [%(name)s] %(message)s
+datefmt = %H:%M:%S

+ 113 - 0
migrations/env.py

@@ -0,0 +1,113 @@
+import logging
+from logging.config import fileConfig
+
+from flask import current_app
+
+from alembic import context
+
+# this is the Alembic Config object, which provides
+# access to the values within the .ini file in use.
+config = context.config
+
+# Interpret the config file for Python logging.
+# This line sets up loggers basically.
+fileConfig(config.config_file_name)
+logger = logging.getLogger('alembic.env')
+
+
+def get_engine():
+    try:
+        # this works with Flask-SQLAlchemy<3 and Alchemical
+        return current_app.extensions['migrate'].db.get_engine()
+    except (TypeError, AttributeError):
+        # this works with Flask-SQLAlchemy>=3
+        return current_app.extensions['migrate'].db.engine
+
+
+def get_engine_url():
+    try:
+        return get_engine().url.render_as_string(hide_password=False).replace(
+            '%', '%%')
+    except AttributeError:
+        return str(get_engine().url).replace('%', '%%')
+
+
+# add your model's MetaData object here
+# for 'autogenerate' support
+# from myapp import mymodel
+# target_metadata = mymodel.Base.metadata
+config.set_main_option('sqlalchemy.url', get_engine_url())
+target_db = current_app.extensions['migrate'].db
+
+# other values from the config, defined by the needs of env.py,
+# can be acquired:
+# my_important_option = config.get_main_option("my_important_option")
+# ... etc.
+
+
+def get_metadata():
+    if hasattr(target_db, 'metadatas'):
+        return target_db.metadatas[None]
+    return target_db.metadata
+
+
+def run_migrations_offline():
+    """Run migrations in 'offline' mode.
+
+    This configures the context with just a URL
+    and not an Engine, though an Engine is acceptable
+    here as well.  By skipping the Engine creation
+    we don't even need a DBAPI to be available.
+
+    Calls to context.execute() here emit the given string to the
+    script output.
+
+    """
+    url = config.get_main_option("sqlalchemy.url")
+    context.configure(
+        url=url, target_metadata=get_metadata(), literal_binds=True
+    )
+
+    with context.begin_transaction():
+        context.run_migrations()
+
+
+def run_migrations_online():
+    """Run migrations in 'online' mode.
+
+    In this scenario we need to create an Engine
+    and associate a connection with the context.
+
+    """
+
+    # this callback is used to prevent an auto-migration from being generated
+    # when there are no changes to the schema
+    # reference: http://alembic.zzzcomputing.com/en/latest/cookbook.html
+    def process_revision_directives(context, revision, directives):
+        if getattr(config.cmd_opts, 'autogenerate', False):
+            script = directives[0]
+            if script.upgrade_ops.is_empty():
+                directives[:] = []
+                logger.info('No changes in schema detected.')
+
+    conf_args = current_app.extensions['migrate'].configure_args
+    if conf_args.get("process_revision_directives") is None:
+        conf_args["process_revision_directives"] = process_revision_directives
+
+    connectable = get_engine()
+
+    with connectable.connect() as connection:
+        context.configure(
+            connection=connection,
+            target_metadata=get_metadata(),
+            **conf_args
+        )
+
+        with context.begin_transaction():
+            context.run_migrations()
+
+
+if context.is_offline_mode():
+    run_migrations_offline()
+else:
+    run_migrations_online()

+ 24 - 0
migrations/script.py.mako

@@ -0,0 +1,24 @@
+"""${message}
+
+Revision ID: ${up_revision}
+Revises: ${down_revision | comma,n}
+Create Date: ${create_date}
+
+"""
+from alembic import op
+import sqlalchemy as sa
+${imports if imports else ""}
+
+# revision identifiers, used by Alembic.
+revision = ${repr(up_revision)}
+down_revision = ${repr(down_revision)}
+branch_labels = ${repr(branch_labels)}
+depends_on = ${repr(depends_on)}
+
+
+def upgrade():
+    ${upgrades if upgrades else "pass"}
+
+
+def downgrade():
+    ${downgrades if downgrades else "pass"}

+ 38 - 0
migrations/versions/e8720ded7b05_initial_baseline.py

@@ -0,0 +1,38 @@
+"""initial baseline
+
+Revision ID: e8720ded7b05
+Revises: 
+Create Date: 2026-05-18 12:19:44.299973
+
+"""
+from alembic import op
+import sqlalchemy as sa
+
+
+# revision identifiers, used by Alembic.
+revision = 'e8720ded7b05'
+down_revision = None
+branch_labels = None
+depends_on = None
+
+
+def upgrade():
+    # ### commands auto generated by Alembic - please adjust! ###
+    with op.batch_alter_table('user', schema=None) as batch_op:
+        batch_op.alter_column('password_hash',
+               existing_type=sa.VARCHAR(length=256),
+               type_=sa.String(length=512),
+               existing_nullable=True)
+
+    # ### end Alembic commands ###
+
+
+def downgrade():
+    # ### commands auto generated by Alembic - please adjust! ###
+    with op.batch_alter_table('user', schema=None) as batch_op:
+        batch_op.alter_column('password_hash',
+               existing_type=sa.String(length=512),
+               type_=sa.VARCHAR(length=256),
+               existing_nullable=True)
+
+    # ### end Alembic commands ###

+ 1 - 0
pyproject.toml

@@ -20,6 +20,7 @@ dependencies = [
     "psycopg2-binary>=2.9.10",
     "psutil>=7.2.2",
     "pydantic>=2.13.4",
+    "pyjwt>=2.10.1",
     "python-dotenv>=1.2.2",
     "requests>=2.33.1",
     "sqlalchemy>=2.0.49",

+ 107 - 0
requirements-docker-clean.txt

@@ -0,0 +1,107 @@
+aiofiles==25.1.0
+aiohappyeyeballs==2.6.1
+aiohttp==3.13.5
+aiosignal==1.4.0
+aiosqlite==0.22.1
+alembic==1.18.4
+alphashape==1.3.1
+annotated-doc==0.0.4
+annotated-types==0.7.0
+anyio==4.13.0
+attrs==26.1.0
+beautifulsoup4==4.14.3
+blinker==1.9.0
+brotli==1.2.0
+certifi==2026.4.22
+cffi==2.0.0 ; platform_python_implementation != 'PyPy'
+chardet==7.4.3
+charset-normalizer==3.4.7
+click==8.3.3
+click-log==0.4.0
+colorama==0.4.6 ; sys_platform == 'win32'
+crawl4ai==0.8.6
+cryptography==48.0.0
+cssselect==1.4.0
+distro==1.9.0
+fake-useragent==2.2.0
+fastuuid==0.14.0
+filelock==3.29.0
+flask==3.1.3
+flask-login==0.6.3
+flask-migrate==4.1.0
+flask-sqlalchemy==3.1.1
+frozenlist==1.8.0
+fsspec==2026.4.0
+greenlet==3.5.0
+h11==0.16.0
+h2==4.3.0
+hf-xet==1.5.0 ; platform_machine == 'AMD64' or platform_machine == 'aarch64' or platform_machine == 'amd64' or platform_machine == 'arm64' or platform_machine == 'x86_64'
+hpack==4.1.0
+httpcore==1.0.9
+httpx==0.28.1
+huggingface-hub==1.14.0
+humanize==4.15.0
+hyperframe==6.1.0
+idna==3.14
+importlib-metadata==9.0.0
+itsdangerous==2.2.0
+jinja2==3.1.6
+jiter==0.14.0
+joblib==1.5.3
+jsonschema==4.26.0
+jsonschema-specifications==2025.9.1
+lark==1.3.1
+lxml==5.4.0
+mako==1.3.12
+markdown-it-py==4.2.0
+markupsafe==3.0.3
+mdurl==0.1.2
+multidict==6.7.1
+networkx==3.6.1
+nltk==3.9.4
+numpy==2.4.4
+openai==2.36.0
+packaging==26.2
+patchright==1.59.1
+pillow==12.2.0
+playwright==1.59.0
+playwright-stealth==2.0.3
+propcache==0.5.2
+psutil==7.2.2
+psycopg2-binary==2.9.12
+pycparser==3.0 ; implementation_name != 'PyPy' and platform_python_implementation != 'PyPy'
+pydantic==2.13.4
+pydantic-core==2.46.4
+pyee==13.0.1
+pygments==2.20.0
+pyjwt==2.12.1
+pyopenssl==26.2.0
+python-dotenv==1.2.2
+pyyaml==6.0.3
+rank-bm25==0.2.2
+referencing==0.37.0
+regex==2026.5.9
+requests==2.33.1
+rich==15.0.0
+rpds-py==0.30.0
+rtree==1.4.1
+scipy==1.17.1
+shapely==2.1.2
+shellingham==1.5.4
+sniffio==1.3.1
+snowballstemmer==2.2.0
+soupsieve==2.8.3
+sqlalchemy==2.0.49
+tiktoken==0.12.0
+tokenizers==0.23.1
+tqdm==4.67.3
+trimesh==4.12.2
+typer==0.25.1
+typing-extensions==4.15.0
+typing-inspection==0.4.2
+unclecode-litellm==1.81.13
+urllib3==2.7.0
+werkzeug==3.1.8
+xxhash==3.7.0
+yarl==1.23.0
+zipp==3.23.1

+ 331 - 0
requirements-docker.txt

@@ -0,0 +1,331 @@
+# This file was autogenerated by uv via the following command:
+#    uv export --no-dev --no-hashes -o requirements-docker.txt
+aiofiles==25.1.0
+    # via
+    #   ai-liaowangweb-app
+    #   crawl4ai
+aiohappyeyeballs==2.6.1
+    # via aiohttp
+aiohttp==3.13.5
+    # via
+    #   ai-liaowangweb-app
+    #   crawl4ai
+    #   unclecode-litellm
+aiosignal==1.4.0
+    # via aiohttp
+aiosqlite==0.22.1
+    # via crawl4ai
+alembic==1.18.4
+    # via flask-migrate
+alphashape==1.3.1
+    # via crawl4ai
+annotated-doc==0.0.4
+    # via typer
+annotated-types==0.7.0
+    # via pydantic
+anyio==4.13.0
+    # via
+    #   crawl4ai
+    #   httpx
+    #   openai
+attrs==26.1.0
+    # via
+    #   aiohttp
+    #   jsonschema
+    #   referencing
+beautifulsoup4==4.14.3
+    # via
+    #   ai-liaowangweb-app
+    #   crawl4ai
+blinker==1.9.0
+    # via flask
+brotli==1.2.0
+    # via crawl4ai
+certifi==2026.4.22
+    # via
+    #   httpcore
+    #   httpx
+    #   requests
+cffi==2.0.0 ; platform_python_implementation != 'PyPy'
+    # via cryptography
+chardet==7.4.3
+    # via crawl4ai
+charset-normalizer==3.4.7
+    # via requests
+click==8.3.3
+    # via
+    #   alphashape
+    #   click-log
+    #   crawl4ai
+    #   flask
+    #   nltk
+    #   typer
+    #   unclecode-litellm
+click-log==0.4.0
+    # via alphashape
+colorama==0.4.6 ; sys_platform == 'win32'
+    # via
+    #   click
+    #   tqdm
+crawl4ai==0.8.6
+    # via ai-liaowangweb-app
+cryptography==48.0.0
+    # via pyopenssl
+cssselect==1.4.0
+    # via crawl4ai
+distro==1.9.0
+    # via openai
+fake-useragent==2.2.0
+    # via crawl4ai
+fastuuid==0.14.0
+    # via unclecode-litellm
+filelock==3.29.0
+    # via huggingface-hub
+flask==3.1.3
+    # via
+    #   ai-liaowangweb-app
+    #   flask-login
+    #   flask-migrate
+    #   flask-sqlalchemy
+flask-login==0.6.3
+    # via ai-liaowangweb-app
+flask-migrate==4.1.0
+    # via ai-liaowangweb-app
+flask-sqlalchemy==3.1.1
+    # via
+    #   ai-liaowangweb-app
+    #   flask-migrate
+frozenlist==1.8.0
+    # via
+    #   aiohttp
+    #   aiosignal
+fsspec==2026.4.0
+    # via huggingface-hub
+greenlet==3.5.0
+    # via
+    #   ai-liaowangweb-app
+    #   patchright
+    #   playwright
+    #   sqlalchemy
+h11==0.16.0
+    # via httpcore
+h2==4.3.0
+    # via httpx
+hf-xet==1.5.0 ; platform_machine == 'AMD64' or platform_machine == 'aarch64' or platform_machine == 'amd64' or platform_machine == 'arm64' or platform_machine == 'x86_64'
+    # via huggingface-hub
+hpack==4.1.0
+    # via h2
+httpcore==1.0.9
+    # via httpx
+httpx==0.28.1
+    # via
+    #   crawl4ai
+    #   huggingface-hub
+    #   openai
+    #   unclecode-litellm
+huggingface-hub==1.14.0
+    # via tokenizers
+humanize==4.15.0
+    # via crawl4ai
+hyperframe==6.1.0
+    # via h2
+idna==3.14
+    # via
+    #   anyio
+    #   httpx
+    #   requests
+    #   yarl
+importlib-metadata==9.0.0
+    # via unclecode-litellm
+itsdangerous==2.2.0
+    # via flask
+jinja2==3.1.6
+    # via
+    #   flask
+    #   unclecode-litellm
+jiter==0.14.0
+    # via openai
+joblib==1.5.3
+    # via nltk
+jsonschema==4.26.0
+    # via unclecode-litellm
+jsonschema-specifications==2025.9.1
+    # via jsonschema
+lark==1.3.1
+    # via crawl4ai
+lxml==5.4.0
+    # via
+    #   ai-liaowangweb-app
+    #   crawl4ai
+mako==1.3.12
+    # via alembic
+markdown-it-py==4.2.0
+    # via rich
+markupsafe==3.0.3
+    # via
+    #   flask
+    #   jinja2
+    #   mako
+    #   werkzeug
+mdurl==0.1.2
+    # via markdown-it-py
+multidict==6.7.1
+    # via
+    #   aiohttp
+    #   yarl
+networkx==3.6.1
+    # via alphashape
+nltk==3.9.4
+    # via crawl4ai
+numpy==2.4.4
+    # via
+    #   alphashape
+    #   crawl4ai
+    #   rank-bm25
+    #   scipy
+    #   shapely
+    #   trimesh
+openai==2.36.0
+    # via
+    #   ai-liaowangweb-app
+    #   unclecode-litellm
+packaging==26.2
+    # via huggingface-hub
+patchright==1.59.1
+    # via crawl4ai
+pillow==12.2.0
+    # via crawl4ai
+playwright==1.59.0
+    # via
+    #   ai-liaowangweb-app
+    #   crawl4ai
+    #   playwright-stealth
+playwright-stealth==2.0.3
+    # via crawl4ai
+propcache==0.5.2
+    # via
+    #   aiohttp
+    #   yarl
+psutil==7.2.2
+    # via
+    #   ai-liaowangweb-app
+    #   crawl4ai
+psycopg2-binary==2.9.12
+    # via ai-liaowangweb-app
+pycparser==3.0 ; implementation_name != 'PyPy' and platform_python_implementation != 'PyPy'
+    # via cffi
+pydantic==2.13.4
+    # via
+    #   ai-liaowangweb-app
+    #   crawl4ai
+    #   openai
+    #   unclecode-litellm
+pydantic-core==2.46.4
+    # via pydantic
+pyee==13.0.1
+    # via
+    #   patchright
+    #   playwright
+pygments==2.20.0
+    # via rich
+pyjwt==2.12.1
+    # via ai-liaowangweb-app
+pyopenssl==26.2.0
+    # via crawl4ai
+python-dotenv==1.2.2
+    # via
+    #   ai-liaowangweb-app
+    #   crawl4ai
+    #   unclecode-litellm
+pyyaml==6.0.3
+    # via
+    #   crawl4ai
+    #   huggingface-hub
+rank-bm25==0.2.2
+    # via crawl4ai
+referencing==0.37.0
+    # via
+    #   jsonschema
+    #   jsonschema-specifications
+regex==2026.5.9
+    # via
+    #   nltk
+    #   tiktoken
+requests==2.33.1
+    # via
+    #   ai-liaowangweb-app
+    #   crawl4ai
+    #   tiktoken
+rich==15.0.0
+    # via
+    #   crawl4ai
+    #   typer
+rpds-py==0.30.0
+    # via
+    #   jsonschema
+    #   referencing
+rtree==1.4.1
+    # via alphashape
+scipy==1.17.1
+    # via alphashape
+shapely==2.1.2
+    # via
+    #   alphashape
+    #   crawl4ai
+shellingham==1.5.4
+    # via typer
+sniffio==1.3.1
+    # via openai
+snowballstemmer==2.2.0
+    # via crawl4ai
+soupsieve==2.8.3
+    # via beautifulsoup4
+sqlalchemy==2.0.49
+    # via
+    #   ai-liaowangweb-app
+    #   alembic
+    #   flask-sqlalchemy
+tiktoken==0.12.0
+    # via unclecode-litellm
+tokenizers==0.23.1
+    # via unclecode-litellm
+tqdm==4.67.3
+    # via
+    #   huggingface-hub
+    #   nltk
+    #   openai
+trimesh==4.12.2
+    # via alphashape
+typer==0.25.1
+    # via huggingface-hub
+typing-extensions==4.15.0
+    # via
+    #   aiosignal
+    #   alembic
+    #   anyio
+    #   beautifulsoup4
+    #   huggingface-hub
+    #   openai
+    #   pydantic
+    #   pydantic-core
+    #   pyee
+    #   pyopenssl
+    #   referencing
+    #   sqlalchemy
+    #   typing-inspection
+typing-inspection==0.4.2
+    # via pydantic
+unclecode-litellm==1.81.13
+    # via crawl4ai
+urllib3==2.7.0
+    # via requests
+werkzeug==3.1.8
+    # via
+    #   flask
+    #   flask-login
+xxhash==3.7.0
+    # via crawl4ai
+yarl==1.23.0
+    # via aiohttp
+zipp==3.23.1
+    # via importlib-metadata

+ 0 - 17
requirements.txt

@@ -1,17 +0,0 @@
-Flask==3.1.2
-Flask-Login==0.6.3
-Flask-Migrate==4.1.0
-Flask-SQLAlchemy==3.1.1
-python-dotenv==1.2.1
-requests==2.32.5
-openai==2.8.1
-beautifulsoup4==4.14.3
-lxml==5.4.0
-Crawl4AI==0.7.8
-playwright==1.57.0
-psutil==7.2.1
-SQLAlchemy==2.0.44
-pydantic==2.11.9
-aiohttp==3.13.3
-greenlet==3.2.4
-aiofiles==25.1.0

+ 7 - 0
run.py

@@ -1,9 +1,16 @@
 from dotenv import load_dotenv
 import os
+import logging
 
 # Load the .env file (before create_app is imported and called)
 load_dotenv(os.path.join(os.path.dirname(__file__), '.env'))
 
+# Configure logging
+logging.basicConfig(
+    level=logging.INFO,
+    format='%(asctime)s %(levelname)s %(name)s: %(message)s',
+)
+
 from app import create_app
 
 app = create_app()
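
As a rough illustration of what the `logging.basicConfig` call added to `run.py` produces, the sketch below applies the same level (`INFO`) and the same format string to an in-memory stream rather than to the root logger. The logger name `app.sso`, the helper `build_logger`, and the messages are hypothetical and chosen only for illustration; nothing in this commit confirms them.

```python
import io
import logging

# Same format string as the one passed to logging.basicConfig in run.py.
LOG_FORMAT = '%(asctime)s %(levelname)s %(name)s: %(message)s'


def build_logger(stream, name="app.sso"):
    """Hypothetical helper: attach a handler using run.py's level and format."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)   # DEBUG records are dropped, as in run.py
    logger.handlers = [handler]     # replace any handlers from earlier runs
    logger.propagate = False        # keep the demo output out of the root logger
    return logger


buf = io.StringIO()
log = build_logger(buf)
log.debug("not emitted at INFO level")
log.info("SSO callback received")
output = buf.getvalue()
print(output, end="")
```

Each emitted line starts with a timestamp, followed by the level, the logger name, and the message, so records from different modules can be told apart by name.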

+ 11 - 0
uv.lock

@@ -22,6 +22,7 @@ dependencies = [
     { name = "psutil" },
     { name = "psycopg2-binary" },
     { name = "pydantic" },
+    { name = "pyjwt" },
     { name = "python-dotenv" },
     { name = "requests" },
     { name = "sqlalchemy" },
@@ -44,6 +45,7 @@ requires-dist = [
     { name = "psutil", specifier = ">=7.2.2" },
     { name = "psycopg2-binary", specifier = ">=2.9.10" },
     { name = "pydantic", specifier = ">=2.13.4" },
+    { name = "pyjwt", specifier = ">=2.10.1" },
     { name = "python-dotenv", specifier = ">=1.2.2" },
     { name = "requests", specifier = ">=2.33.1" },
     { name = "sqlalchemy", specifier = ">=2.0.49" },
@@ -2191,6 +2193,15 @@ wheels = [
     { url = "https://mirrors.aliyun.com/pypi/packages/f4/7e/a72dd26f3b0f4f2bf1dd8923c85f7ceb43172af56d63c7383eb62b332364/pygments-2.20.0-py3-none-any.whl", hash = "sha256:81a9e26dd42fd28a23a2d169d86d7ac03b46e2f8b59ed4698fb4785f946d0176" },
 ]
 
+[[package]]
+name = "pyjwt"
+version = "2.12.1"
+source = { registry = "https://mirrors.aliyun.com/pypi/simple/" }
+sdist = { url = "https://mirrors.aliyun.com/pypi/packages/c2/27/a3b6e5bf6ff856d2509292e95c8f57f0df7017cf5394921fc4e4ef40308a/pyjwt-2.12.1.tar.gz", hash = "sha256:c74a7a2adf861c04d002db713dd85f84beb242228e671280bf709d765b03672b" }
+wheels = [
+    { url = "https://mirrors.aliyun.com/pypi/packages/e5/7a/8dd906bd22e79e47397a61742927f6747fe93242ef86645ee9092e610244/pyjwt-2.12.1-py3-none-any.whl", hash = "sha256:28ca37c070cad8ba8cd9790cd940535d40274d22f80ab87f3ac6a713e6e8454c" },
+]
+
 [[package]]
 name = "pyopenssl"
 version = "26.2.0"