# AI 意图识别流程梳理（`shudao-aichat` + `shudao-chat-py`）

本文档专门梳理当前项目中两套“意图识别”实现：

- `shudao-aichat`
- `shudao-main/shudao-chat-py`

重点说明：

- 每套流程的入口在哪里
- 输入输出结构是什么
- Prompt 如何构造
- 模型如何调用
- JSON 如何解析与容错
- 意图识别结果如何影响后续问答 / 检索 / 报告流程
- 两套实现的能力差异与架构差异

---

## 1. 总体结论

当前项目存在两套明显不同的意图识别体系：

### `shudao-aichat`

- 更像一个“结构化前置决策器”
- 意图识别产出不仅是类别，还包含：
  - 是否专业问题
  - 路由策略
  - 是否需要离线模型
  - 原始问题
  - 主关键词
  - 扩展关键词
  - 公司名称 / 别名
  - 内部查询场景
  - 摘要回答
  - 前端展示用思考摘要
- 这套结果会直接驱动后面的报告流、检索流和在线/离线路由

### `shudao-chat-py`

- 更像一个“轻量分类器 + RAG 开关”
- 意图识别主要只回答一个问题：
  - 当前消息是不是 `greeting（问候） / faq（常见问题） / query_knowledge_base（知识库查询）`
- 主要影响：
  - 是否直接给固定回复
  - 是否触发 RAG 检索
- 不负责复杂路由，不负责内部查询边界，不输出结构化检索策略

一句话概括：

- `shudao-aichat` 的意图识别是“流程编排中枢”
- `shudao-chat-py` 的意图识别是“问答前分类开关”

---

## 2. 两套流程的入口位置

## 2.1 `shudao-aichat`

### 路由注册

- 主应用在 [main.py](file:///Users/fanhong/UGIT/shudao-aichat/app/main.py#L63-L70) 注册 `intent.router`

### 独立接口入口

- 独立意图识别接口是 [analyze_intent](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L237-L392)
- 路径是：`POST /api/v1/intent/analyze`

### 被主流程内部调用的位置

- 报告主流程在 [report.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/report.py#L489-L527) 内直接调用 `analyze_intent()`

这说明 `aichat` 的意图识别既可以单独调用，也可以作为完整流程中的一个内部节点复用。

## 2.2 `shudao-chat-py`

### 路由注册

- 主应用在 [main.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/main.py#L108-L113) 注册 `api_router`
- `chat` 路由在 [routers/__init__.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/__init__.py#L3-L21) 中挂到 `/apiv1`

### 独立意图识别接口

- 独立接口是 [intent_recognition](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/chat.py#L1982-L2072)
- 路径是：`POST /apiv1/intent_recognition`

### 被其他问答接口复用的位置

- 非流式问答 `send_deepseek_message` 在 [chat.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/chat.py#L959-L1016) 调用意图识别
- 无 DB 流式问答 `stream_chat` 在 [chat.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/chat.py#L1371-L1386) 调用意图识别
- 主流式问答 `stream_chat-with-db` 没有单独先做意图识别，而是直接做 RAG 检索，见 [chat.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/chat.py#L1612-L1668)

这说明 `chat-py` 的意图识别复用并不统一。

---

## 3. `shudao-aichat` 的意图识别流程

## 3.1 输入与输出模型

### 请求模型

- 定义在 [models.py](file:///Users/fanhong/UGIT/shudao-aichat/app/schemas/models.py#L9-L13)

字段：

- `user_question`
- `conversation_history`
- `enable_online_model`

### 响应模型

- 定义在 [models.py](file:///Users/fanhong/UGIT/shudao-aichat/app/schemas/models.py#L15-L29)

字段包括：

- `is_professional_question`
- `route`
- `need_offline_model`
- `origin_question`
- `keywords`
- `intent_description`
- `summary`
- `offline_instruction`
- `intent_scene`
- `company_name`
- `fallback_keywords`
- `company_aliases`
- `thinking_content`

这已经决定了它不是简单分类器，而是完整的结构化判定器。

## 3.2 第一步：进入接口并记录上下文

### 相关代码位置

- 接口入口：[analyze_intent](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L237-L392)
- 本步骤执行区间：[intent.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L237-L257)
- 请求模型：[models.py](file:///Users/fanhong/UGIT/shudao-aichat/app/schemas/models.py#L9-L13)

执行内容：

1. 记录开始日志
2. 读取用户问题
3. 读取对话历史
4. 读取是否启用在线模型

这里的特点是：

- 意图识别可以利用 `conversation_history`
- 路由策略会受 `enable_online_model` 影响

## 3.3 第二步：构建 Prompt

### 相关代码位置

- 通用 Prompt 加载在 [prompts.py](file:///Users/fanhong/UGIT/shudao-aichat/app/utils/prompts.py#L20-L58)
- 意图识别 Prompt 构造函数在 [get_intent_prompt](file:///Users/fanhong/UGIT/shudao-aichat/app/utils/prompts.py#L70-L80)
- 模板文件是 [intent_analysis_prompt.md](file:///Users/fanhong/UGIT/shudao-aichat/prompts/intent_analysis_prompt.md#L1-L236)

构建时会替换：

- `{user_question}`
- `{conversation_history}`
- `{enable_online_model}`

### Prompt 里编码的关键规则

模板中定义了以下内容：

1. 角色定位  
   见 [intent_analysis_prompt.md](file:///Users/fanhong/UGIT/shudao-aichat/prompts/intent_analysis_prompt.md#L1-L12)

2. 专业领域边界  
   见 [intent_analysis_prompt.md](file:///Users/fanhong/UGIT/shudao-aichat/prompts/intent_analysis_prompt.md#L13-L53)

3. 输出 JSON 结构  
   见 [intent_analysis_prompt.md](file:///Users/fanhong/UGIT/shudao-aichat/prompts/intent_analysis_prompt.md#L54-L93)

4. 专业 / 非专业判断规则  
   见 [intent_analysis_prompt.md](file:///Users/fanhong/UGIT/shudao-aichat/prompts/intent_analysis_prompt.md#L103-L143)

5. 关键词提取规则  
   见 [intent_analysis_prompt.md](file:///Users/fanhong/UGIT/shudao-aichat/prompts/intent_analysis_prompt.md#L145-L180)

6. `summary` 的组织方式  
   见 [intent_analysis_prompt.md](file:///Users/fanhong/UGIT/shudao-aichat/prompts/intent_analysis_prompt.md#L182-L216)

7. 工作流说明  
   见 [intent_analysis_prompt.md](file:///Users/fanhong/UGIT/shudao-aichat/prompts/intent_analysis_prompt.md#L228-L235)

### 这一步的实际作用

这份 Prompt 本质上已经把意图识别变成了“结构化问答规划”：

- 判定问题类型
- 给出下一跳路由
- 给出检索词
- 给出摘要与展示摘要
- 给出内部查询范围

## 3.4 第三步：代码层追加 system 约束

### 相关代码位置

- system 消息构造与追加：[intent.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L258-L272)
- system 约束生效的模型调用区间：[intent.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L273-L296)

这里在 Prompt 之外又额外加了一层 system 规则：

- “你是谁”“你好”“天气娱乐”等必须判为非专业问题
- `reasoning_summary` 只能是用户可展示的安全摘要
- 严禁输出原始思维链
- 只能输出合法 JSON

这一步是“Prompt 约束 + system 约束”的双重保险。

## 3.5 第四步：定义结构化输出 schema

### 相关代码位置

- schema 构造函数：[_build_intent_response_format](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L24-L77)
- schema 注入模型调用：[intent.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L273-L296)

这里构造了 `response_format=json_schema`，要求模型必须返回一个固定结构的 JSON。

关键字段包括：

- `is_professional_question`
- `intent_scene`
- `company_name`
- `company_aliases`
- `route`
- `need_offline_model`
- `offline_instruction`
- `origin_question`
- `keywords`
- `fallback_keywords`
- `intent_description`
- `summary`
- `reasoning_summary`

这一步的意义是：

- 把模型输出从“尽量像 JSON”升级为“尽量按 schema 输出 JSON”

## 3.6 第五步：调用离线 LLM

### 相关代码位置

- 调用发生在 [intent.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L273-L296)
- 服务实现是 [OfflineLLMService.chat](file:///Users/fanhong/UGIT/shudao-aichat/app/services/offline_llm_service.py#L30-L103)

### 离线模型服务机制

`OfflineLLMService` 的特点：

1. 支持 `response_format`
2. 支持超时
3. 支持重试
4. 当上游不支持 `json_schema` 时会降级

具体降级逻辑：

- `json_schema`
- `json_object`
- `plain`

相关代码：

- 组 payload 变体：[offline_llm_service.py](file:///Users/fanhong/UGIT/shudao-aichat/app/services/offline_llm_service.py#L147-L183)
- 判断是否应该降级：[offline_llm_service.py](file:///Users/fanhong/UGIT/shudao-aichat/app/services/offline_llm_service.py#L192-L219)

### 这一步的实际效果

即使模型上游对结构化输出支持不稳定，这层也尽量把返回值稳定在“可解析”范围内。

## 3.7 第六步：第一次失败后的严格 JSON 重试

### 相关代码位置

- 重试提示词构造：[_build_retry_messages](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L79-L103)
- 重试主循环：[intent.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L281-L319)
- JSON 失败后的异常分支：[intent.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L300-L319)

### 重试策略

第一次调用失败时，系统不会马上放弃，而是追加一组更强硬的消息：

- 第一个字符必须是 `{`
- 最后一个字符必须是 `}`
- 指定字段必须是数组
- 布尔值必须用 `true/false`
- 严禁输出 Thinking Process / 说明 / 代码块

这一步相当于“人工强约束修正回合”。

## 3.8 第七步：统一 JSON 提取与修复

### 相关代码位置

- [parse_ai_json_response](file:///Users/fanhong/UGIT/shudao-aichat/app/utils/json_parser.py#L61-L113)
- [intent.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L300-L318)

### 解析流程

1. 从模型输出中提取最可能的 JSON 段
2. 尝试直接 `json.loads`
3. 如果失败，修复常见问题
4. 再次解析
5. 还不行再尝试 Python literal 风格
6. 最终抛出 `JSONParseError`

### 关键辅助函数

- 分离思考过程和正式内容：[split_thinking_and_answer](file:///Users/fanhong/UGIT/shudao-aichat/app/utils/json_parser.py#L18-L58)
- 从输出中提取 JSON：[extract_json_from_model_output](file:///Users/fanhong/UGIT/shudao-aichat/app/utils/json_parser.py#L115-L142)
- 查找真正 JSON 起点：[json_parser.py](file:///Users/fanhong/UGIT/shudao-aichat/app/utils/json_parser.py#L178-L237)

### 这一步的意义

`aichat` 的意图识别对“模型脏输出”的容忍度比较高，已经形成统一的 JSON 容错层。

## 3.9 第八步：生成安全可展示的 thinking_content

### 相关代码位置

- 列表统一：[intent.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L105-L112)
- 裁剪长度：[intent.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L115-L127)
- 兜底展示文本：[intent.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L129-L165)
- 组合逻辑：[intent.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L168-L186)

### 处理逻辑

1. 先读取模型返回的 `reasoning_summary`
2. 将其统一变成字符串数组
3. 拼接为可展示文本
4. 若内容太短，则自动补一段“安全可展示”的问题理解摘要
5. 严格截断在配置范围内

### 这一步的设计目的

项目明确区分：

- 原始推理链：不能直接暴露
- 可展示的理解摘要：可以返回前端展示

这是 `aichat` 这套意图识别的重要特征。

## 3.10 第九步：本地兜底结果

### 相关代码位置

- 兜底函数：[_build_fallback_intent_result](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L189-L234)
- 触发位置：[intent.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L320-L327)

### 兜底逻辑

如果模型两次都没有返回可解析 JSON，则：

- 用简单规则判断是否专业问题
- 按 `enable_online_model` 推导 `route`
- 专业问题的关键词退化为原问题
- 非专业问题返回默认引导语

默认规则：

- 专业问题 + 开启在线模型 → `online_then_offline（先在线后离线）`
- 非专业问题 + 开启在线模型 → `online_only（仅在线）`
- 其他情况 → `offline_only（仅离线）`

### 这一步的意义

即使意图识别模型挂了，后面的问答主流程仍然可以继续跑下去。

## 3.11 第十步：标准化响应并返回

### 相关代码位置

- 返回值标准化与响应构造：[intent.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L328-L372)
- 响应模型：[models.py](file:///Users/fanhong/UGIT/shudao-aichat/app/schemas/models.py#L15-L29)

### 最终输出字段

这里会统一清洗：

- `keywords`
- `fallback_keywords`
- `company_aliases`

然后组装 `IntentAnalyzeResponse`

### 这里的注意点

- `is_professional_question` 缺失时会偏保守地默认为 `True`
- 说明它的设计倾向是“尽量不中断后续专业流程”

## 3.12 第十一步：被报告主流程消费

`aichat` 的意图识别真正价值体现在后续主流程里。

### 相关代码位置

- 调用意图识别：[report.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/report.py#L489-L527)
- 在线路由分支：[report.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/report.py#L577-L608)
- 非专业问题提前结束：[report.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/report.py#L610-L617)
- 内部查询检索分支：[report.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/report.py#L650-L659)

### 消费方式

1. 先发 SSE 状态 `status:intent`
2. 构造 `IntentAnalyzeRequest`
3. 调用 `analyze_intent()`
4. 将结果通过 SSE 发回前端

### 决策点

#### 1. 路由矫正

- 若未启用在线模型，但意图识别给出 `online_only（仅在线）` 或 `online_then_offline（先在线后离线）`
- 则在 [report.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/report.py#L506-L510) 中强制改为 `offline_only（仅离线）`

#### 2. 非专业问题直接终止后续流程

- 在 [report.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/report.py#L610-L617)

#### 3. `online_only（仅在线）` 直接走在线回答

- 在 [report.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/report.py#L577-L604)

#### 4. `online_then_offline（先在线后离线）` 异步启动在线回答，同时继续检索 / 报告

- 在 [report.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/report.py#L606-L608)

#### 5. `internal_query（内部查询）` 决定后续检索策略

- 在 [report.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/report.py#L650-L659)

### 小结

`aichat` 中的意图识别结果，直接决定：

- 是否继续流程
- 走哪条模型路由
- 是否做在线检索
- 是否做内部查询
- 用哪些关键词检索

---

## 4. `shudao-chat-py` 的意图识别流程

## 4.1 总体定位

和 `aichat` 相比，`chat-py` 的意图识别明显轻量很多。

它主要用于解决三个问题：

1. 这是不是问候语
2. 这是不是关于 AI 助手本身的 FAQ
3. 如果都不是，是不是走知识库查询

它不负责：

- 在线 / 离线路由
- 内部查询边界
- 公司识别
- 扩展检索词
- 安全展示型思考摘要

## 4.2 使用到的几个入口

### 独立意图识别接口

- [intent_recognition](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/chat.py#L1982-L2072)

### 非流式问答中的意图识别

- [send_deepseek_message](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/chat.py#L959-L1016)

### 无 DB 流式问答中的意图识别

- [stream_chat](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/chat.py#L1371-L1386)

### 主流式问答 `stream/chat-with-db`

- 这里并没有先走意图识别，而是直接 RAG 检索 + 生成，见 [chat.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/chat.py#L1612-L1668)

这意味着 `chat-py` 内部不同问答入口对意图识别的依赖程度不一致。

## 4.3 Prompt 配置与模板文件

### Prompt 加载器

- 通用加载器是 [prompt_loader.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/utils/prompt_loader.py#L12-L221)

### Prompt 配置

- 配置文件是 [prompt_config.yaml](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/config/prompt_config.yaml#L4-L78)

其中意图识别模板配置为：

- `intent_recognition`
- 对应文件：`prompts/yitushibie_template_lite.md`

### 实际模板

- 模板文件是 [yitushibie_template_lite.md](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/prompts/yitushibie_template_lite.md#L1-L46)

### 模板定义的分类体系

模板只定义了 3 类：

- `greeting（问候）`
- `faq（常见问题）`
- `query_knowledge_base（知识库查询）`

并要求返回 JSON：

```json
{
  "intent": "意图类别",
  "confidence": 0.9,
  "search_queries": ["用户原始问题"],
  "direct_answer": "直接回答内容或空字符串"
}
```

### 这套 Prompt 的设计风格

和 `aichat` 不同，这里完全没有：

- `route`
- `need_offline_model`
- `intent_scene`
- `company_name`
- `fallback_keywords`
- `reasoning_summary`

所以它的定位就是“前分类”。

## 4.4 第一步：调用 `qwen_service.intent_recognition`

### 相关代码位置

- [QwenService.intent_recognition](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/services/qwen_service.py#L73-L116)
- Prompt 加载器：[prompt_loader.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/utils/prompt_loader.py#L209-L221)
- Prompt 配置：[prompt_config.yaml](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/config/prompt_config.yaml#L4-L10)
- Prompt 模板：[yitushibie_template_lite.md](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/prompts/yitushibie_template_lite.md#L1-L46)

### 执行步骤

1. 通过 `load_prompt("intent_recognition", userMessage=message)` 加载意图识别 Prompt
2. 构造一条 `user` 消息
3. 调用 `self.chat(...)`
4. 使用专门的意图识别模型和专门的意图识别 API

### 特别点

QwenService 初始化时给意图识别单独配置了：

- `self.intent_api_url`
- `self.intent_model`

见 [qwen_service.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/services/qwen_service.py#L14-L29)

也就是说：

- `chat-py` 的“回答模型”与“意图识别模型”是可以分开的

## 4.5 第二步：实际 HTTP 调用模型

### 相关代码位置

- [QwenService.chat](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/services/qwen_service.py#L117-L207)
- 调用点见 [qwen_service.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/services/qwen_service.py#L82-L85)
- 会把 `model=self.intent_model`
- `api_url=self.intent_api_url`

### 特点

1. 使用单独的 Intent API 配置
2. 若是普通 Qwen3 主模型目标地址，会支持 DeepSeek 回退
3. 但意图识别调用通常不是 Qwen3 主 URL，而是独立 Intent URL，因此不会自动走主问答那套回退逻辑

### 认证头处理

- 若配置了 `settings.intent.token`，会在 [qwen_service.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/services/qwen_service.py#L139-L147) 自动带上 Authorization

## 4.6 第三步：用正则从模型输出中提取 JSON

### 相关代码位置

- [qwen_service.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/services/qwen_service.py#L85-L115)

### 处理逻辑

1. 去除 Markdown 代码块标记
2. 用正则 `\{.*\}` 匹配最外层 JSON
3. 尝试 `json.loads`
4. 兼容字段名 `intent` / `intent_type`
5. 统一设置 `result["intent_type"]`

### 这里的容错特点

相比 `aichat`：

- `chat-py` 的解析容错更轻
- 没有统一 JSON 修复器
- 没有二次严格 JSON 重试
- 没有 schema 强约束

### 失败时的默认结果

若解析失败，直接返回：

```json
{
  "intent_type": "general_chat",
  "confidence": 0.5,
  "reason": "...",
  "response": ""
}
```

其中 `general_chat` 表示“通用聊天”。

见 [qwen_service.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/services/qwen_service.py#L111-L115)

## 4.7 第四步：根据 intent_type 组装直接回复

### 相关代码位置

- [qwen_service.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/services/qwen_service.py#L100-L109)

### 处理逻辑

如果解析成功：

- `greeting（问候）`
  - 使用 `direct_answer`，若没有则填默认欢迎语
- `faq（常见问题）`
  - 使用 `direct_answer`，若没有则填默认 FAQ 引导
- 其他类型
  - `response = direct_answer or ""`

这说明 `chat-py` 中“意图识别”其实还承担了“固定回复生成器”的角色。

## 4.8 第五步：独立接口 `/intent_recognition` 的行为

### 相关代码位置

- [intent_recognition](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/chat.py#L1982-L2072)
- 请求模型：[chat.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/chat.py#L1976-L1980)
- 调用服务：[qwen_service.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/services/qwen_service.py#L73-L116)

### 请求结构

- `message`
- `save_to_db`
- `ai_conversation_id`

### 执行过程

1. 从 `request.state.user` 取用户
2. 调 `qwen_service.intent_recognition(data.message)`
3. 读取：
   - `intent_type`
   - `response_text`
4. 如果 `save_to_db=true` 且意图是 `greeting（问候） / faq（常见问题）`
   - 创建或复用 `AIConversation`
   - 写入 user 消息
   - 写入 AI 消息
   - 返回 `ai_conversation_id` 与 `ai_message_id`
5. 其他情况只返回识别结果，不落库

### 这里的核心逻辑

独立接口更多是为“问候 / FAQ 快速返回 + 可选写历史”设计的，而不是为复杂编排设计的。

## 4.9 第六步：在非流式问答中的使用方式

### 相关代码位置

- [send_deepseek_message](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/chat.py#L959-L1016)
- RAG 检索函数：[_rag_search](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/chat.py#L529-L553)
- 最终回答 Prompt 配置：[prompt_config.yaml](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/config/prompt_config.yaml#L18-L29)
- 最终回答 Prompt 模板：[final_answer_template.md](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/prompts/final_answer_template.md)
- 思考拆分入口：[split_thinking_and_answer](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/utils/thinking_summary.py#L115-L168)

### 执行流程

当 `business_type == 0` 时：

1. 先调用 `qwen_service.intent_recognition(message)`
2. 提取 `intent_type`
3. 如果意图属于：
   - `query_knowledge_base（知识库查询）`
   - `知识库查询`
   - `技术咨询`
4. 才触发 `_rag_search(message, top_k=10)`
5. 再使用 `final_answer` prompt 组织最终问答
6. 调用 `qwen_service.chat(messages)` 生成答案
7. 若响应中含 `<think>`，再调用 `summarize_thinking_content()` 生成可展示摘要

### 这里意图识别的作用

只用于控制：

- 要不要做 RAG 检索

它不参与：

- 在线 / 离线路由
- 场景识别
- 检索范围决定

## 4.10 第七步：在无 DB 流式问答中的使用方式

### 相关代码位置

- [stream_chat](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/chat.py#L1371-L1484)
- 意图识别调用点：[chat.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/chat.py#L1371-L1386)
- 流式模型调用：[QwenService.stream_chat](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/services/qwen_service.py#L208-L256)
- 思考摘要生成：[thinking_summary.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/utils/thinking_summary.py#L261-L380)

### 执行流程

1. 先做 `qwen_service.intent_recognition(message)`
2. 如果结果是知识库查询类，则执行 `_rag_search(message)`
3. 使用 `final_answer` prompt 组织消息
4. 调用 `qwen_service.stream_chat(messages)` 流式输出
5. 若输出带 `<think>`，调用 `summarize_thinking_content()` 把原始思考转成展示摘要

### 这里的关键点

和非流式版本一样，意图识别仍然只承担“RAG 开关”的角色。

## 4.11 第八步：RAG 检索本身怎么做

### 相关代码位置

- [_rag_search](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/chat.py#L529-L553)
- 非流式问答中的调用：[chat.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/chat.py#L971-L979)
- 无 DB 流式问答中的调用：[chat.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/chat.py#L1383-L1392)
- 带 DB 主聊天中的调用：[chat.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/chat.py#L1612-L1623)

### 执行逻辑

1. 读取 `settings.search.api_url`
2. 调用外部检索服务
3. 请求体：
   - `query`
   - `n_results`
4. 从返回结果提取文档内容
5. 拼成一大段 `rag_context`

### 注意点

这里的 `_rag_search()` 使用的是“用户原问题”直接检索，而不是意图识别结果中的 `search_queries`。

也就是说：

- Prompt 虽然要求模型输出 `search_queries`
- 但当前主链路并没有真正消费这个字段

这是 `chat-py` 意图识别和主流程之间一个比较明显的“设计上有、实现上未充分使用”的点。

## 4.12 第九步：思考过程摘要不是意图识别的一部分

### 相关代码位置

- [thinking_summary.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/utils/thinking_summary.py#L115-L168)
- 归一化与安全过滤在 [thinking_summary.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/utils/thinking_summary.py#L171-L259)
- 摘要生成主函数：[thinking_summary.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/utils/thinking_summary.py#L261-L380)
- 非流式问答中的调用：[chat.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/chat.py#L1001-L1015)
- 无 DB 流式问答中的调用：[chat.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/chat.py#L1442-L1477)

### 与 `aichat` 的区别

`chat-py` 的意图识别结果本身并不带 `thinking_content`。

它的“思考过程摘要”是在后续主回答阶段：

- 从主模型输出里提取 `<think>`
- 再二次总结

而 `aichat` 是在意图识别阶段就直接产出一份安全展示型摘要。

---

## 5. 两套实现的逐步对比

## 5.1 输入输出能力对比

| 维度 | `shudao-aichat` | `shudao-chat-py` |
|---|---|---|
| 输入 | 问题 + 历史 + 在线开关 | 主要是单条问题 |
| 输出 | 结构化决策对象 | 轻量分类结果 |
| 分类粒度 | 专业 / 非专业 + 内部查询 + 路由 | `greeting（问候） / faq（常见问题） / query_knowledge_base（知识库查询）` |
| 关键词 | 主关键词 + fallback 关键词 | Prompt 要求有 `search_queries`，但主流程几乎未使用 |
| 路由 | `offline_only（仅离线）` / `online_only（仅在线）` / `online_then_offline（先在线后离线）` | 无 |
| 内部查询 | 有 `intent_scene` / `company_name` / `company_aliases` | 无 |
| 展示摘要 | 意图识别阶段直接生成 `thinking_content` | 在主回答阶段再总结 `<think>` |

## 5.2 Prompt 设计对比

### `aichat`

- Prompt 更重、更像任务编排器
- 规则包含：
  - 领域边界
  - 路由规则
  - 内部查询规则
  - 关键词规则
  - 摘要与展示摘要规则

### `chat-py`

- Prompt 更轻、更像分类器
- 规则主要围绕：
  - 3 个意图类别
  - `greeting（问候） / faq（常见问题）` 的直接回答
  - `query_knowledge_base（知识库查询）` 需要检索

## 5.3 模型调用与容错对比

### `aichat`

- 使用离线模型服务
- 支持 `json_schema`
- 支持结构化输出降级
- 支持二次严格 JSON 重试
- 支持统一 JSON 提取与修复
- 支持本地兜底结果

### `chat-py`

- 使用单独的 intent 模型与 API
- 用正则做轻量 JSON 抽取
- 失败直接退回 `general_chat（通用聊天）`
- 没有统一 JSON 修复器
- 没有 schema 级强约束

## 5.4 与主流程耦合方式对比

### `aichat`

意图识别是主流程前置控制中心：

- 是否继续执行
- 走哪条路由
- 是否内部查询
- 用哪些词检索

### `chat-py`

意图识别只决定：

- 要不要做 RAG
- 是否直接返回 `greeting（问候） / faq（常见问题）`

所以两边的架构定位完全不同。

---

## 6. 两套流程的完整步骤清单

## 6.1 `shudao-aichat`

1. 路由进入 [intent.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L237-L392)
2. 读取请求模型 [models.py](file:///Users/fanhong/UGIT/shudao-aichat/app/schemas/models.py#L9-L29)
3. 从 [prompts.py](file:///Users/fanhong/UGIT/shudao-aichat/app/utils/prompts.py#L70-L80) 构建意图识别 Prompt
4. 加载模板 [intent_analysis_prompt.md](file:///Users/fanhong/UGIT/shudao-aichat/prompts/intent_analysis_prompt.md#L1-L236)
5. 追加 system 规则 [intent.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L258-L272)
6. 构造 JSON schema [intent.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L24-L77)
7. 调用离线模型 [offline_llm_service.py](file:///Users/fanhong/UGIT/shudao-aichat/app/services/offline_llm_service.py#L30-L103)
8. 若格式异常则严格重试 [intent.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L79-L103)
9. 统一 JSON 提取与修复 [json_parser.py](file:///Users/fanhong/UGIT/shudao-aichat/app/utils/json_parser.py#L61-L142)
10. 生成 `thinking_content` [intent.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L168-L186)
11. 如仍失败则本地兜底 [intent.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L189-L234)
12. 返回结构化结果 [intent.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/intent.py#L328-L372)
13. 在报告主流程中被消费 [report.py](file:///Users/fanhong/UGIT/shudao-aichat/app/api/report.py#L489-L617)

## 6.2 `shudao-chat-py`

1. 路由进入 [chat.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/chat.py#L1982-L2072) 或被问答接口内部调用
2. 通过 [prompt_loader.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/utils/prompt_loader.py#L209-L221) 加载 Prompt
3. Prompt 配置来自 [prompt_config.yaml](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/config/prompt_config.yaml#L4-L23)
4. 模板文件是 [yitushibie_template_lite.md](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/prompts/yitushibie_template_lite.md#L1-L46)
5. 调用 [QwenService.intent_recognition](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/services/qwen_service.py#L73-L116)
6. 使用专门的 intent 模型和 URL [qwen_service.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/services/qwen_service.py#L14-L29)
7. 用正则提取 JSON [qwen_service.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/services/qwen_service.py#L85-L115)
8. 将结果映射为 `intent_type` 和 `response`
9. 在非流式问答中决定是否做 `_rag_search()` [chat.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/chat.py#L959-L1016)
10. 在流式问答中决定是否做 `_rag_search()` [chat.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/chat.py#L1371-L1386)
11. 若是 `greeting（问候） / faq（常见问题）`，独立接口可选写入 DB [chat.py](file:///Users/fanhong/UGIT/shudao-main/shudao-chat-py/routers/chat.py#L2004-L2057)

---

## 7. 为什么两套流程会不同

从代码看，两套实现服务于不同阶段的架构目标：

### `shudao-chat-py`

- 更早期
- 更贴近“聊天接口先分类，再决定要不要查库”
- 所以意图识别更轻、更快、更窄

### `shudao-aichat`

- 更偏后期编排服务
- 需要控制：
  - SSE 主流程
  - 报告生成
  - 在线模型 / 离线模型
  - 内部查询边界
  - 文档检索
- 所以意图识别被扩展成“结构化路由器”

---

## 8. 当前实现的几个关键差异与注意点

## 8.1 `aichat` 的优势

- 结构化输出能力更强
- JSON 容错更完整
- 可直接驱动后续复杂流程
- 安全展示型 `thinking_content` 设计更成熟

## 8.2 `chat-py` 的优势

- 逻辑简单
- 响应快
- 接入聊天链路成本低
- `greeting（问候） / faq（常见问题）` 这类简单问题处理直接

## 8.3 `chat-py` 当前的局限

1. 分类粒度太粗
2. Prompt 中定义的 `search_queries` 没被主流程充分消费
3. 没有结构化路由能力
4. 没有统一 JSON 修复层
5. 不支持内部查询边界识别

## 8.4 `aichat` 当前的代价

1. Prompt 更重
2. 输出字段更多
3. 解析链路更复杂
4. 更依赖模型按 schema 输出

---

## 9. 最终总结

如果只看“意图识别”这个名词，两套实现看起来像是在做同一件事；但从代码职责来看，它们其实不是一个层级的能力：

### `shudao-chat-py`

更接近：

- 问句分类器
- RAG 触发器
- `greeting（问候） / faq（常见问题）` 的前置处理器

### `shudao-aichat`

更接近：

- 结构化流程路由器
- 查询场景识别器
- 检索参数生成器
- 问题摘要与展示摘要生成器
- 报告主流程的前置决策节点

如果后续只保留一套更完整的方案，从现有代码能力看，明显是 `shudao-aichat` 这套更适合作为统一的意图识别中枢。

---

## 10. 后续可继续补充的内容

若后续还要继续深挖，建议补以下几份配套分析：

1. `意图识别结果字段字典`
   - 每个字段由谁生成、在哪里消费、有什么业务语义

2. `aichat 与 chat-py 的问答主链对比`
   - 从“用户发问”到“最终回答”的完整对比链路

3. `Prompt 对比文档`
   - 把 `intent_analysis_prompt.md` 和 `yitushibie_template_lite.md` 逐段对照

4. `迁移建议`
   - 如果后续要收敛为一套意图识别能力，哪些部分应该保留、哪些部分应该删掉